A data frame is one of the basic data structures in R. It is modeled on a spreadsheet: the data is tabular and organized as a collection of named columns. If you don't know how to create one yet, don't worry. This article, "How to Create DataFrame in R", walks you through the steps.
What is DataFrame in R?
A data frame in R (similar to an Excel spreadsheet or a pandas DataFrame in Python) can be created with the data.frame(…) function. A data.frame is like a matrix, except that its columns can hold different data types. Each column holds a single data type, but different columns can have different types. For example, the first column can contain integers while the second column contains logical (boolean) values.
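To make the point about column types concrete, here is a minimal sketch. The column names and values (id, active, name) are hypothetical and chosen only for illustration. stringsAsFactors = FALSE is set explicitly so that text stays character on older R versions as well.

```r
# Hypothetical example data, not taken from the article
df <- data.frame(
  id     = c(1L, 2L, 3L),         # integer column
  active = c(TRUE, FALSE, TRUE),  # logical (boolean) column
  name   = c("a", "b", "c"),      # character column
  stringsAsFactors = FALSE
)

# sapply() applies class() to every column and returns the type names
col_types <- sapply(df, class)
print(col_types)
```

Each column reports exactly one type, while the types differ from column to column.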
How to Create DataFrame in R
A data frame has a row index and a column index. Internally it is a list of equal-length vectors: each column name maps to one vector, and all columns share the same row index. When you create a data.frame from vectors, the columns take the names of the vector variables by default.
Here is the general syntax for creating a dataframe in R:
first_column <- c("value_1", "value_2", ...)
second_column <- c("value_1", "value_2", ...)
df <- data.frame(first_column, second_column)
Alternatively, you can define the columns inline to get the same data frame:
df <- data.frame(first_column = c("value_1", "value_2", ...),
                 second_column = c("value_1", "value_2", ...)
)
Example of creating a DataFrame in R
To make the steps for creating a DataFrame in R easier to follow, consider the following example.
Here is the raw data:
Cars | Quantity |
Audi | 200 |
Toyota | 700 |
Chevrolet | 250 |
Vinfast | 400 |
When writing the code to create a DataFrame in R, pay attention to the double quotes. Text values (e.g. the Cars column) must be enclosed in quotes. Numeric values should be written without quotes; otherwise R stores them as text. Below is our sample program for your reference:
Cars <- c("Audi", "Toyota", "Chevrolet", "Vinfast")
Quantity <- c(200, 700, 250, 400)
df <- data.frame(Cars, Quantity)
print(df)
And here is the returned result:
       Cars Quantity
1      Audi      200
2    Toyota      700
3 Chevrolet      250
4   Vinfast      400
You can also define the columns inline, which returns the same result:
df <- data.frame(Cars = c("Audi", "Toyota", "Chevrolet", "Vinfast"),
                 Quantity = c(200, 700, 250, 400)
)
print(df)
Output:
       Cars Quantity
1      Audi      200
2    Toyota      700
3 Chevrolet      250
4   Vinfast      400
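The note on quotes earlier in this example can be checked directly. The sketch below is illustrative only; the variable names quoted and unquoted are hypothetical. It shows that quoted numbers are stored as text and can be converted back with as.numeric().

```r
# Quoted values are stored as character strings
quoted <- data.frame(Quantity = c("200", "700"), stringsAsFactors = FALSE)
# Unquoted values are stored as numbers
unquoted <- data.frame(Quantity = c(200, 700))

quoted_class   <- class(quoted$Quantity)    # "character"
unquoted_class <- class(unquoted$Quantity)  # "numeric"
print(c(quoted_class, unquoted_class))

# Convert the text column to numbers before doing arithmetic on it
quoted$Quantity <- as.numeric(quoted$Quantity)
print(sum(quoted$Quantity))
```

Checking class() on a column is a quick way to catch values that were quoted by accident.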
Load a data.frame from a file
R provides several functions for loading spreadsheet-style files into data frames. To read a CSV file, use read.csv(…). If you pass a relative file path, first check which directory R is using as its working directory; getwd() prints it. In RStudio you can change the working directory via Session -> Set Working Directory -> Choose Directory (or Ctrl + Shift + H).
You can also create a data frame by importing data into R directly from a file. For example:
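The working directory can also be inspected and changed from code. The following is a minimal sketch that assumes a throwaway file named example.csv (a hypothetical name) written to R's temporary directory, so it does not touch your own files.

```r
# Remember the current working directory so it can be restored later
old_dir <- getwd()

# Switch to R's per-session temporary directory
setwd(tempdir())

# Create a small CSV file; the relative path resolves against the working directory
writeLines(c("Cars,Quantity", "Audi,200"), "example.csv")
found <- file.exists("example.csv")

# Read it back using the same relative path
df <- read.csv("example.csv", stringsAsFactors = FALSE)
print(df)

# Clean up and restore the original working directory
unlink("example.csv")
setwd(old_dir)
```

Because the file name is relative, read.csv() finds it only while the working directory points at the folder that contains it.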
yourdata <- read.csv("C:\\Users\\Ron\\Desktop\\Test\\yourdata.csv")
# read.csv() already returns a data.frame, so this wrapper is optional
df <- data.frame(yourdata)
print(df)
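If you do not have a CSV file at hand, the import step can be tried with a round trip through a temporary file. This is a sketch under that assumption; the names cars, csv_path, and loaded are hypothetical.

```r
# Build a small data frame and write it to a temporary CSV file
cars <- data.frame(Cars = c("Audi", "Toyota"), Quantity = c(200, 700))
csv_path <- tempfile(fileext = ".csv")
write.csv(cars, csv_path, row.names = FALSE)  # row.names = FALSE avoids an extra index column

# read.csv() returns a data.frame directly
loaded <- read.csv(csv_path, stringsAsFactors = FALSE)
print(is.data.frame(loaded))
print(loaded)

# Remove the temporary file
unlink(csv_path)
```

The is.data.frame() check confirms that the value returned by read.csv() can be used as a data frame without further conversion.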
Once you have created a DataFrame in R, you can also apply mathematical operations or statistical analysis to it.
For example, to find the maximum quantity:
Cars <- c("Audi", "Toyota", "Chevrolet", "Vinfast")
Quantity <- c(200, 700, 250, 400)
df <- data.frame(Cars, Quantity)
print(max(df$Quantity))
Output:
[1] 700
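Other base R functions work on a column in the same way. The sketch below reuses the article's car data; the variable names total, average, and top_car are hypothetical.

```r
Cars <- c("Audi", "Toyota", "Chevrolet", "Vinfast")
Quantity <- c(200, 700, 250, 400)
df <- data.frame(Cars, Quantity, stringsAsFactors = FALSE)

total   <- sum(df$Quantity)   # sum of all quantities
average <- mean(df$Quantity)  # arithmetic mean

# which.max() returns the row position of the largest value,
# which can then be used to look up the matching car
top_car <- df$Cars[which.max(df$Quantity)]

print(total)
print(average)
print(top_car)
```

Looking up the row with which.max() answers not only what the maximum is but also which car it belongs to.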
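Or, if you want to combine the columns of two data frames, you can use cbind(). Both data frames must have the same number of rows:
df1 <- data.frame(a = c(2, 3), b = c(5, 6))
# df2 is not used in the cbind() call below
df2 <- data.frame(a = c(5, 9), b = c(2, 7))
df3 <- data.frame(c = c(5, 10), d = c(1, 4))
cbind(df1, df3)
Output:
  a b  c d
1 2 5  5 1
2 3 6 10 4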
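As a complement to combining columns, rows can be stacked with rbind() when two data frames share the same column names. The sketch below assumes two-row frames with columns a and b; the names stacked, three_rows, and result are hypothetical. It also shows that cbind() fails when the row counts differ.

```r
df1 <- data.frame(a = c(2, 3), b = c(5, 6))
df2 <- data.frame(a = c(5, 9), b = c(2, 7))

# rbind() appends the rows of df2 below the rows of df1
stacked <- rbind(df1, df2)
print(stacked)

# cbind() needs matching row counts: a 3-row frame cannot be bound to a 2-row one
three_rows <- data.frame(c = c(1, 2, 3))
result <- tryCatch(
  cbind(df1, three_rows),
  error = function(e) "error"  # return a marker string instead of stopping
)
print(result)
```

Wrapping the call in tryCatch() keeps the script running, so you can see the mismatch without the session stopping on the error.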
Conclusion
This article has shown how to create a DataFrame in R from vectors and from a file, and how to run simple operations on it. If you have any questions, please leave a comment so we can assist you as soon as possible.