Introduction to R Hands-On Basic of R

1- Basic arithmetic operations (R can be used as a calculator)

+ (addition)

- (subtraction)

* (multiplication)

/ (division)

and ^ (exponentiation).

Type directly the command below in the console:

Addition

3 + 7

Substraction

7 - 3

Multiplication

3 * 7

Exponentiation

2 ^ 3

Modulo: returns the remainder of the division of 8/3

8 %% 3

2- Basic arithmetic functions

sum(x, y) # sum total of x + y - also works with vectors (see below)
log2(x) # logarithms base 2 of x
log10(x) # logaritms base 10 of x
exp(x) # Exponential of x

3- Assigning values to variables

# This will assigning x =2

x <- 2
# or use this

x = 2
# to print the value of x you can type x or use print function

print(x)
# Multiply the value of variables with another number - variable x can be used where you'd use a number

x * 6

4 - Comparison operator

  • < (less than): returns TRUE if the left operand is less than the right operand

  • > (greater than): returns TRUE if the left operand is greater thanthe right operand

  • == (equal to)

  • != (not equal to)

  • <= (less than or equal to)

  • = greater than or equal

  • %in% is in - this is used with vectors (see below)

5 - Basic data type

  • 1- There are basic data types: numeric, character and logical.

  • Numeric object: How old are you?

  • my_age <- 28

  • Character object: What's your name?

  • my_name <- "Nicolas"

  • # logical object: Are you a data scientist?

  • (yes/no) = (TRUE/FALSE)

  • 2- Class function to see the type of your data

class(my_age)

class(my_name)

3- You can also use the functions is.numeric(), is.character(), is.logical() to check whether a variable is numeric, character or logical, respectively.

is.numeric(my_age)

is.numeric(my_age)

4- If you want to change the type of a variable to another one, use the as. functions, including as.numeric(), as.character(), as.logical(), etc.

# Convert my_age to a character variable

as.character(my_age)

6 - Vectors

Vectors combine multiple values (numeric, character or logical) in the same object. In this case, you can have numeric, character, or logical vectors.

A vector is created using the function c() (for concatenate), as follow:

# Store your friend\'s age in a numeric vector

friend_ages <- c(27, 25, 29, 26) # Create

friend_ages # Print

27 25 29 26
# Store your friend names in a character vector

my_friends <- c("Nicolas", "Thierry", "Bernard", "Jerome")

my_friends

"Nicolas", "Thierry", "Bernard", "Jerome"
# Store your friend\'s marital status in a logical vector

# Are they married? (yes/no = TRUE/FALSE)

are_married <- c(TRUE, FALSE, TRUE, TRUE)

are_married

TRUE FALSE TRUE TRUE

It’s possible to give a name to the elements of a vector using the function names().

# Vector without element names

friend_ages

27 25 29 26

# Vector with element names

names(friend_ages) <- c("Nicolas", "Thierry", "Bernard", "Jerome")

friend_ages

Nicolas Thierry Bernard Jerome

27 25 29 26
# You can also create a named vector as follows

friend_ages <- c(Nicolas = 27, Thierry = 25, Bernard = 29, Jerome = 26)

friend_ages

Introduction to dplyr

dplyr is a powerful R package designed for data manipulation and transformation tasks. Developed by Hadley Wickham and others, it provides a consistent set of functions that streamline manipulating data frames and other data structures in R.

  • Data Manipulation Verbs:

dplyr provides intuitive and easy-to-use functions, often called “data manipulation verbs” for common data manipulation tasks. These include functions like filter(), select(), arrange(), mutate(), and summarize().

  • Pipes (%>%):

dplyr integrates seamlessly with the %>% operator, also known as the pipe operator, which allows you to chain together multiple operations clearly and readably. This makes your code more concise and easier to understand.

  • Verb Functions:

Filter (): Used to subset data frame rows based on specified conditions.

Select (): This selects specific columns from a data frame.

Arrange (): Used to reorder data frame rows based on one or more variables.

Mutate (): Used to create or modify new variables based on calculations or transformations.

Summarize (): This function aggregates data and is often used in combination with functions like mean(), sum(), min(), max(), etc.

  • Grouping Operations:

dplyr supports grouped operations, allowing you to perform operations on data grouped by one or more variables. This is achieved using the group_by() function followed by other dplyr verbs.

  • Joins:

dplyr provides functions for joining data frames, including inner_join(), left_join(), right_join(), and full_join(). These functions enable you to combine data from multiple sources based on common variables.

Additional learning material

R for descriptive Statistics and Graphics

http://www.sthda.com/english/wiki/descriptive-statistics-and-graphics

Data visualization with ggplot2

https://rpubs.com/GeospatialEcologist/DataViz

The Epidemiologist R Handbook

https://epirhandbook.com/en/reports-with-r-markdown.html

Additional Learning Resources

1- This tutorial explains the installation, basics, and different types of statistical analysis, which can be found under the analysis tab

http://www.sthda.com/english/wiki/installing-r-and-rstudio-easy-r-programming

2- introduction to ggplot2 (https://www.analyticsvidhya.com/blog/2022/03/a-comprehensive-guide-on-ggplot2-in-r/ )

3- Introduction to dplyr https://datacarpentry.org/R-ecology-lesson/03-dplyr.html

3- introduction to tidyverse (https://rpubs.com/ruruu127/417821)

4- Automated data exploration process for analytic tasks and predictive modelling with DataExplorer https://cran.r-project.org/web/packages/DataExplorer/readme/README.html

R Learning Communities

  1. The R for Data Science Community: https://rfordatasci.com/

  2. R-Ladies (focused at women): https://rladies.org/


Previous submodule: