. Advertisement .
..3..
. Advertisement .
..4..
Want to learn how to use the Left Join in Pandas function? You are at the right place!
This guide will describe the importance of the Left Join and demonstrate how to use pandas DataFrames to perform it on several columns. We will go over each of them with detailed examples, as Pandas supports a variety of ways to combine DataFrames.
Left Join function (also known as Left Outer Join) gives back every row from the DataFrame on the left, no matter if a match was discovered on the DataFrame on the right. This function assigns the null value for the left record when the join expression is invalid and discards the right records.
Left Join Of DataFrame In Pandas Examples
Check out the brief examples of Left Join in Pandas below.
# pandas left join two DataFrames by Index
df3=df1.join(df2, lsuffix="_left", rsuffix="_right", how='left')
# pandas.merge() by Column
df3=pd.merge(df1,df2, how='left')
# DataFrame.merge() by Column
df3=df1.merge(df2, how='left')
# Merge DataFrames by Column
df3=pd.merge(df1,df2, on='Courses', how='left')
# When column names are different
df3=pd.merge(df1,df2, left_on='Courses', right_on='Courses', how='left')
We create a DataFrame to use for demonstrating examples for the Left Join.
# Create DataFrames
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
}
index_labels=['r1','r2','r3','r4']
df1 = pd.DataFrame(technologies,index=index_labels)
technologies2 = {
'Courses':["Spark","Java","Python","Go"],
'Discount':[2000,2300,1200,2000]
}
index_labels2=['r1','r6','r3','r5']
df2 = pd.DataFrame(technologies2,index=index_labels2)
print(df1)
print(df2)
Output:
Courses Fee Duration
r1 Spark 20000 30days
r2 PySpark 25000 40days
r3 Python 22000 35days
r4 pandas 30000 50days
Courses Discount
r1 Spark 20000
r6 Java 25000
r3 Python 22000
r5 Go 30000
Left Join In Pandas Utilizing Join()
By default, panads.DataFrame.join()
does the Left Join function on the row indices while offering a method to do a join on different join types. Additionally, this method allows several parameters. Consult the pandas join()
function for usage, syntax, and other samples.
# pandas join two DataFrames
df3=df1.join(df2, lsuffix="_left", rsuffix="_right")
print(df3)
Output:
Courses_left Fee Duration Courses_right Discount
r1 Spark 20000 30days Spark 2000.0
r2 PySpark 25000 40days NaN NaN
r3 Python 22000 35days Python 1200.0
r4 pandas 30000 50days NaN NaN
When you look at the results, you will see that indexes r4 and r2 are from the DataFrame on the right, and it includes NaN for several columns for the records that do not match.
Left Join In Pandas Utilizing Merge()
With the merge()
function, you can perform merging by columns or index, merging on several columns, and various join types. This function joins all shared columns between the two DataFrames by default and conducts one inner join. For the left join, utilize the how parameter.
# pandas.merge()
df3=pd.merge(df1,df2, how='left')
print(df3)
# DataFrame.merge()
df3=df1.merge(df2, how='left')
print(df3)
Output:
Courses Fee Duration Discount
0 Spark 20000 30days 2000.0
1 PySpark 25000 40days NaN
2 Python 22000 35days 1200.0
3 pandas 30000 50days NaN
Additionally, you can directly state the names of the columns.
# Merge DataFrames by Columns
df3=pd.merge(df1,df2, on='Courses', how='left')
What if you want to mix distinct column names from two DataFrames in Pandas? Here is an example of how to do so.
# When column names are different
df3=pd.merge(df1,df2, left_on='Courses', right_on='Courses', how='left')
print(df3)
Another thing to keep in mind is that the merge()
function also allows a variety of parameters. Consult the pandas merge()
if you want to understand usage and syntax with examples.
Complete Examples
Below are examples for you to use to better understand the methods of performing Left Join in Pandas.
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
}
index_labels=['r1','r2','r3','r4']
df1 = pd.DataFrame(technologies,index=index_labels)
technologies2 = {
'Courses':["Spark","Java","Python","Go"],
'Discount':[2000,2300,1200,2000]
}
index_labels2=['r1','r6','r3','r5']
df2 = pd.DataFrame(technologies2,index=index_labels2)
print(df1)
print(df2)
# pandas join two DataFrames
df3=df1.join(df2, lsuffix="_left", rsuffix="_right")
print(df3)
# pandas.merge()
df3=pd.merge(df1,df2, how='left')
print(df3)
# DataFrame.merge()
df3=df1.merge(df2, how='left')
print(df3)
# Merge DataFrames by Column
df3=pd.merge(df1,df2, on='Courses', how='left')
# When column names are different
df3=pd.merge(df1,df2, left_on='Courses', right_on='Courses', how='left')
The Bottom Line
A Left Join is applied to retrieve only the data from the left table. In this tutorial, we have provided different methods to use the Left Join in Pandas function, such as utilizing merge()
and join()
.
Hopefully, our included examples can make it easier for you to learn the above techniques. Continue practicing to become proficient in using this function since it is a frequently used Join type.If you want to learn more about other functions in Pandas, there is a Join on multiple columns guide that we suggest you take a look at. Along with what you have learned about joining DataFrame in this tutorial, we invite you to investigate that and use it in your upcoming project.
Leave a comment