This tutorial shows how to join pandas DataFrames on multiple columns with merge() and how to read the result. Let's jump right in!
Dataframe Example
First, let's create two DataFrames to work with:
# Create two pandas DataFrames that share the "Roll No" and "Name" columns.
import pandas as pd

students_df = pd.DataFrame({
    "Roll No": [500, 501, 503, 504, 505, 506],
    "Name": ["Simmon", "Pinmark", "Hanson", "Putin", "Plex", "Ben"],
    "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
    "Age": [17, 18, 17, 16, 18, 16]
})
grades_df = pd.DataFrame({
    "Roll No": [501, 502, 503, 504, 505, 506],
    "Name": ["Simmon", "Pinmark", "Hanson", "Putin", "Plex", "Ben"],
    "Grades": ["A+", "B", "A", "B", "A+", "B"]
})

print("1st DataFrame:")
print(students_df, "\n")
print("2nd DataFrame:")
print(grades_df, "\n")
Running this code prints the following:
1st DataFrame:
   Roll No     Name  Gender  Age
0      500   Simmon  Female   17
1      501  Pinmark    Male   18
2      503   Hanson    Male   17
3      504    Putin  Female   16
4      505     Plex  Female   18
5      506      Ben    Male   16

2nd DataFrame:
   Roll No     Name  Grades
0      501   Simmon      A+
1      502  Pinmark       B
2      503   Hanson       A
3      504    Putin       B
4      505     Plex      A+
5      506      Ben       B
DataFrame Merge by Default in Pandas With No Key Column Needed
When merge() is called with just the two DataFrames and no key argument, it joins on every column name that the two DataFrames have in common. By default it performs an inner join, so only rows whose values match in all of those shared columns are kept.
import pandas as pd

students_df = pd.DataFrame({
    "Roll No": [500, 501, 503, 504, 505, 506],
    "Name": ["Simmon", "Pinmark", "Hanson", "Putin", "Plex", "Ben"],
    "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
    "Age": [17, 18, 17, 16, 18, 16]
})
grades_df = pd.DataFrame({
    "Roll No": [501, 502, 503, 504, 505, 506],
    "Name": ["Simmon", "Pinmark", "Hanson", "Putin", "Plex", "Ben"],
    "Grades": ["A+", "B", "A", "B", "A+", "B"]
})

# No "on" argument: merge() joins on all shared columns ("Roll No" and "Name").
merged_df = pd.merge(students_df, grades_df)

print("1st DataFrame:")
print(students_df, "\n")
print("2nd DataFrame:")
print(grades_df, "\n")
print("Merged df:")
print(merged_df)
Output:
1st DataFrame:
   Roll No     Name  Gender  Age
0      500   Simmon  Female   17
1      501  Pinmark    Male   18
2      503   Hanson    Male   17
3      504    Putin  Female   16
4      505     Plex  Female   18
5      506      Ben    Male   16

2nd DataFrame:
   Roll No     Name  Grades
0      501   Simmon      A+
1      502  Pinmark       B
2      503   Hanson       A
3      504    Putin       B
4      505     Plex      A+
5      506      Ben       B

Merged df:
   Roll No    Name  Gender  Age  Grades
0      503  Hanson    Male   17       A
1      504   Putin  Female   16       B
2      505    Plex  Female   18      A+
3      506     Ben    Male   16       B
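This assigns the result of merging students_df and grades_df to merged_df. Because Roll No and Name appear in both DataFrames, merge() uses both of them as join keys, and each appears only once in the result. Only the four rows where both Roll No and Name match are kept. Roll number 501, for example, is dropped because it belongs to Pinmark in the first DataFrame but to Simmon in the second.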
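The default behaviour can also be spelled out explicitly by passing the key columns through the on parameter. The sketch below reuses the same sample data; the variable names explicit_df, default_df, same_result, and outer_df are illustrative only and not part of the original tutorial. It also shows an outer join with indicator=True, which is one way to see which rows failed to match on both keys.

```python
import pandas as pd

students_df = pd.DataFrame({
    "Roll No": [500, 501, 503, 504, 505, 506],
    "Name": ["Simmon", "Pinmark", "Hanson", "Putin", "Plex", "Ben"],
    "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
    "Age": [17, 18, 17, 16, 18, 16],
})
grades_df = pd.DataFrame({
    "Roll No": [501, 502, 503, 504, 505, 506],
    "Name": ["Simmon", "Pinmark", "Hanson", "Putin", "Plex", "Ben"],
    "Grades": ["A+", "B", "A", "B", "A+", "B"],
})

# Name the join keys explicitly instead of relying on the default.
explicit_df = pd.merge(students_df, grades_df, on=["Roll No", "Name"], how="inner")

# The default call joins on the same shared columns, so the results match.
default_df = pd.merge(students_df, grades_df)
same_result = explicit_df.equals(default_df)
print("Explicit and default merges are identical:", same_result)

# An outer join keeps unmatched rows from both sides; the "_merge" column
# added by indicator=True reports where each row came from.
outer_df = pd.merge(
    students_df, grades_df, on=["Roll No", "Name"], how="outer", indicator=True
)
print(outer_df[["Roll No", "Name", "_merge"]])
```

Listing the keys explicitly tends to be safer in real code, since an unrelated column that happens to share a name in both DataFrames would otherwise silently become part of the join condition.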
Wrapping It Up
That covers the basics of joining pandas DataFrames on multiple columns with merge(). Hopefully this post has been helpful!