Introduction to R Hands-On Basic of R

1- Basic arithmetic operations (R can be used as a calculator)

+ (addition)

- (subtraction)

* (multiplication)

/ (division)

and ^ (exponentiation).

Type directly the command below in the console:

Addition

3 + 7

Substraction

7 - 3

Multiplication

3 * 7

Exponentiation

2 ^ 3

Modulo: returns the remainder of the division of 8/3

8 %% 3

2- Basic arithmetic functions

sum(x, y) # sum total of x + y - also works with vectors (see below)

log2(x) # logarithms base 2 of x

log10(x) # logaritms base 10 of x

exp(x) # Exponential of x

3- Assigning values to variables

# This will assigning x =2

x <- 2

# or use this

x = 2

# to print the value of x you can type x or use print function

print(x)

# Multiply the value of variables with another number - variable x can be used where you'd use a number

x * 6

4 - Comparison operator

< (less than): returns TRUE if the left operand is less than the right operand
> (greater than): returns TRUE if the left operand is greater thanthe right operand
== (equal to)
!= (not equal to)
<= (less than or equal to)
= greater than or equal
%in% is in - this is used with vectors (see below)

5 - Basic data type

1- There are basic data types: numeric, character and logical.
Numeric object: How old are you?
my_age <- 28
Character object: What's your name?
my_name <- "Nicolas"
# logical object: Are you a data scientist?
(yes/no) = (TRUE/FALSE)
2- Class function to see the type of your data

class(my_age)

class(my_name)

3- You can also use the functions is.numeric(), is.character(), is.logical() to check whether a variable is numeric, character or logical, respectively.

is.numeric(my_age)

4- If you want to change the type of a variable to another one, use the as. functions, including as.numeric(), as.character(), as.logical(), etc.

# Convert my_age to a character variable

as.character(my_age)

6 - Vectors

Vectors combine multiple values (numeric, character or logical) in the same object. In this case, you can have numeric, character, or logical vectors.

A vector is created using the function c() (for concatenate), as follow:

# Store your friend\'s age in a numeric vector

friend_ages <- c(27, 25, 29, 26) # Create

friend_ages # Print

27 25 29 26

# Store your friend names in a character vector

my_friends <- c("Nicolas", "Thierry", "Bernard", "Jerome")

my_friends

"Nicolas", "Thierry", "Bernard", "Jerome"

# Store your friend\'s marital status in a logical vector

# Are they married? (yes/no = TRUE/FALSE)

are_married <- c(TRUE, FALSE, TRUE, TRUE)

are_married

TRUE FALSE TRUE TRUE

It’s possible to give a name to the elements of a vector using the function names().

# Vector without element names

friend_ages

27 25 29 26

# Vector with element names

names(friend_ages) <- c("Nicolas", "Thierry", "Bernard", "Jerome")

friend_ages

Nicolas Thierry Bernard Jerome

27 25 29 26

# You can also create a named vector as follows

friend_ages <- c(Nicolas = 27, Thierry = 25, Bernard = 29, Jerome = 26)

friend_ages

Introduction to dplyr

dplyr is a powerful R package designed for data manipulation and transformation tasks. Developed by Hadley Wickham and others, it provides a consistent set of functions that streamline manipulating data frames and other data structures in R.

Data Manipulation Verbs:

dplyr provides intuitive and easy-to-use functions, often called “data manipulation verbs” for common data manipulation tasks. These include functions like filter(), select(), arrange(), mutate(), and summarize().

Pipes (%>%):

dplyr integrates seamlessly with the %>% operator, also known as the pipe operator, which allows you to chain together multiple operations clearly and readably. This makes your code more concise and easier to understand.

Verb Functions:

Filter (): Used to subset data frame rows based on specified conditions.

Select (): This selects specific columns from a data frame.

Arrange (): Used to reorder data frame rows based on one or more variables.

Mutate (): Used to create or modify new variables based on calculations or transformations.

Summarize (): This function aggregates data and is often used in combination with functions like mean(), sum(), min(), max(), etc.

Grouping Operations:

dplyr supports grouped operations, allowing you to perform operations on data grouped by one or more variables. This is achieved using the group_by() function followed by other dplyr verbs.

Joins:

dplyr provides functions for joining data frames, including inner_join(), left_join(), right_join(), and full_join(). These functions enable you to combine data from multiple sources based on common variables.