Excel is a widely used and popular spreadsheet software around the world. However, if you need to do advanced data processing but excel can’t do it yet. Then you might think of another programming language like Python. In this article, we will introduce you to “Method to Import an Excel File into Python using Pandas“. Let’s follow the article below.
What is Pandas?
Pandas stands for Python Data Analysis Library, known as an open source library for the Python programming language, it is used for data analysis and manipulation.
When importing pandas data into objects in python, that’s called a dataframe. The data will be stored and displayed in rows or columns and can be manipulated by python with many functions provided.
Thus, to be able to import Excel files to Python with Pandas, you will first have to have pandas installed in your machine.
If not, please use the following command:
pip install pandas pip install xlrd
How to Import an Excel File into Python using Pandas?
To import an Excel file to Python using Pandas, you would use the pandas.read_excel() function. The function syntax is as follows:
pandas.read_excel(io, sheet_name=0, header=0, names=None,….)
Read_Excel function takes the file path of the Excel workbook and returns a DataFrame object with the contents of the workbook.
And here are the steps in detail:
- Step 1: Prepare Excel data
- Step 2: Writing Python scripts
- Step 3: Import Python library
- Step 4: Working with file paths
- Step 5: Extract Excel data with Pandas.Read_Excel()
- Step 6: Run Python script to import excel file
To help you easily visualize and manipulate, here we will give specific examples.
We are having an excel data sheet as follows:
This workbook is saved to the Desktop folder, here is the file path used:
Next you will enter the data and use the read_excel function as follows:
import pandas as pd df = pd.read_excel (r'C:\Users\Kim\Desktop\Flowers.xlsx') #place "r" before the path string to address special character, such as ''. Don't forget to put the file name at the end of the path + '.xlsx' print (df)
Python will pull data from “Flowers.xlsx” into new DataFrame and print DataFrame to terminal, like below:
Product Quantity 0 rose 600 1 tulip 500 2 peony 800 3 lily 900
If during the operation, you encounter this error
ImportError: Missing optional dependency 'xlrd', install the following command:
pip install openpyxl
In addition, if you only want to import a specific column in the excel file, we will do the following:
import pandas as pd data = pd.read_excel (r'C:\Users\Kim\Desktop\Flowers.xlsx') df = pd.DataFrame(data, columns= ['Product']) print (df)
Product 0 rose 1 tulip 2 peony 3 lily
or if you want to import multiple columns, you can simply separate the column names with commas, like this:
import pandas as pd data = pd.read_excel (r'C:\Users\Kim\Desktop\Flowers.xlsx') df = pd.DataFrame(data, columns= ['Product','Quantity']) print (df)
In conclusion, we have given the specific steps for you to Import an Excel File into Python using Pandas. If you have any questions, leave us a comment below. Thanks for reading!