Knowing how to convert a Pandas DataFrame column to a list is an essential skill for any data analyst or scientist working with the Python language. This tutorial will show you how to accomplish this task.
Convert A Pandas DataFrame Column To A List
Using pandas.Series.tolist()
The tolist() method belongs to the Series class, the one-dimensional labeled array in Pandas. As the name implies, it returns a Python list containing the values of the Series it is called on. Because selecting a single DataFrame column yields a Series, you can use this method to convert a column to a list.
Example: The DataFrame below contains the names, ages, and positions of four employees in a company. This snippet demonstrates how you can create a Python list from the data stored in the 'Position' column.
# import the pandas library
import pandas as pd
# dictionary storing the original data
data = {
    'Name': ["Cristian", "Tony", "Wayner", "Lucas"],
    'Age': [35, 28, 22, 38],
    'Position': ["Director", "Project manager", "Staff", "Leader"]
}
# create and print a dataframe from the dict
df = pd.DataFrame(data)
print(df, "\n")
# extract and print the column
position_list = df['Position'].tolist()
print(position_list)
Output:
       Name  Age         Position
0  Cristian   35         Director
1      Tony   28  Project manager
2    Wayner   22            Staff
3     Lucas   38           Leader
['Director', 'Project manager', 'Staff', 'Leader']
How it works: When you pass a column name to a DataFrame's indexing operator [], Pandas returns that column as a Series object, here df['Position']. Calling tolist() on this Series converts it to a list, which is assigned to the position_list variable. The final line prints the list.
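The two steps described above can also be written separately, which makes the intermediate Series visible. This is a minimal sketch that reuses the sample data from this tutorial; the variable name position_series is introduced here purely for illustration.

```python
import pandas as pd

# same sample data as in the tutorial
df = pd.DataFrame({
    "Name": ["Cristian", "Tony", "Wayner", "Lucas"],
    "Age": [35, 28, 22, 38],
    "Position": ["Director", "Project manager", "Staff", "Leader"],
})

# Step 1: indexing with a single column name returns a Series
# (position_series is a hypothetical name used only in this sketch)
position_series = df["Position"]
print(type(position_series).__name__)  # Series

# Step 2: tolist() turns the Series values into a plain Python list
position_list = position_series.tolist()
print(type(position_list).__name__)  # list
print(position_list)
```

Splitting the expression this way changes nothing about the result; it only shows that tolist() is called on a Series rather than on the DataFrame itself.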
Using list()
The built-in list() function provides another way to convert a Pandas DataFrame column to a list. Unlike tolist(), which is a method called on the Series, list() is a constructor that takes the Series as its argument. Example:
# import the pandas library
import pandas as pd
# dictionary storing the original data
data = {
    'Name': ["Cristian", "Tony", "Wayner", "Lucas"],
    'Age': [35, 28, 22, 38],
    'Position': ["Director", "Project manager", "Staff", "Leader"]
}
# create and print a dataframe from the dict
df = pd.DataFrame(data)
print(df, "\n")
# extract and print the column
position_list = list(df['Position'])
print(position_list)
Output:
       Name  Age         Position
0  Cristian   35         Director
1      Tony   28  Project manager
2    Wayner   22            Staff
3     Lucas   38           Leader
['Director', 'Project manager', 'Staff', 'Leader']
How it works: The list() constructor iterates over the df['Position'] Series and builds a list with the same items in the same order. The result is then assigned to the position_list variable.
Do tolist() and list() yield the same result?
These two approaches create equal lists when given the same Series object. You can verify this with the snippet below.
# import the pandas library
import pandas as pd
# dictionary storing the original data
data = {
    'Name': ["Cristian", "Tony", "Wayner", "Lucas"],
    'Age': [35, 28, 22, 38],
    'Position': ["Director", "Project manager", "Staff", "Leader"]
}
# create and print a dataframe from the dict
df = pd.DataFrame(data)
print(df, "\n")
# extract and print the column with tolist()
position_list1 = df['Position'].tolist()
print("List created by pandas.Series.tolist():", "\n", position_list1, "\n")
# extract and print the column with list()
position_list2 = list(df['Position'])
print("List created by list():", "\n", position_list2, "\n")
# compare two lists
if position_list1 == position_list2:
    print("Result: Two outputs are identical.")
else:
    print("Result: Two outputs are not identical.")
Output:
       Name  Age         Position
0  Cristian   35         Director
1      Tony   28  Project manager
2    Wayner   22            Staff
3     Lucas   38           Leader
List created by pandas.Series.tolist():
['Director', 'Project manager', 'Staff', 'Leader']
List created by list():
['Director', 'Project manager', 'Staff', 'Leader']
Result: Two outputs are identical.
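The comparison above uses a column of strings. As a quick additional check, the sketch below repeats it on the numeric 'Age' column from the same sample data. The variable names ages_tolist and ages_list are hypothetical and chosen only for this example.

```python
import pandas as pd

# same sample data as in the tutorial
df = pd.DataFrame({
    "Name": ["Cristian", "Tony", "Wayner", "Lucas"],
    "Age": [35, 28, 22, 38],
    "Position": ["Director", "Project manager", "Staff", "Leader"],
})

# convert the numeric column with both approaches
# (ages_tolist and ages_list are illustrative names, not from the tutorial)
ages_tolist = df["Age"].tolist()
ages_list = list(df["Age"])

# the two lists hold the same values in the same order
print(ages_tolist == ages_list)
```

Both lists compare equal with ==, so either approach should work for this numeric column as well.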
Conclusion
There are two ways to convert a Pandas DataFrame column to a list: the Series.tolist() method from the Pandas library and Python's built-in list() constructor. Both operate on the Series object you get by selecting the column. It is up to you to decide which one you want to rely on.