The to_datetime() function can convert strings to datetime in a Pandas DataFrame. Read on to find out how you can make use of it in Python.
Convert Strings To Datetime In Pandas DataFrame
to_datetime()
This function has the following syntax:
to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
Here are the most important parameters, which you can use to customize the conversion:
- arg: the object you want to convert to datetime. In addition to a DataFrame, it can be a Pandas Series, a scalar, or any dict-like or array-like object. When you pass a whole DataFrame, it must have at least three columns: “year”, “month”, and “day”.
- format: a string of strftime codes that describes how to parse the input, for example '%d-%m-%Y'. Check Python’s official documentation on strftime for the full list of codes.
- infer_datetime_format: a boolean that controls whether the function should try to work out the datetime format from the input. It only attempts this when you give it no format and set this parameter to True. Note that this parameter is deprecated as of pandas 2.0, where format inference happens by default.
- errors: controls what happens when the function meets a value that cannot be parsed into a datetime. The default, 'raise', raises an exception, while 'coerce' turns the invalid value into NaT.
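As a rough sketch of how the errors parameter changes the outcome, the snippet below uses a small made-up Series (the values are assumptions for illustration, not data from the article's CSV file) whose last entry is not a date.

```python
import pandas as pd

# Hypothetical sample data: the last entry is not a valid date.
raw = pd.Series(["2005-05-24", "2005-06-15", "not a date"])

# errors="coerce" replaces anything that cannot be parsed with NaT.
parsed = pd.to_datetime(raw, errors="coerce")
print(parsed)

# With the default errors="raise", the same input raises an exception.
try:
    pd.to_datetime(raw)
    raised = False
except ValueError:
    raised = True
print(raised)
```

Coercing is convenient for messy data, but it hides bad rows, so it is worth counting the NaT values afterwards with parsed.isna().sum().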
The function to_datetime() returns different data types depending on its input. If you give it a Series or DataFrame, the returned object will be a Series of either:
- Pandas datetime objects (datetime64),
- generic Python objects (object dtype) containing datetime.datetime, when the values cannot be stored as datetime64.
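The short sketch below illustrates how the return type follows the input type. The date strings are made up for the example.

```python
import pandas as pd

# A scalar string gives back a single Timestamp.
single = pd.to_datetime("2005-05-24")
print(type(single).__name__)

# A Series gives back a Series with a datetime64 dtype.
as_series = pd.to_datetime(pd.Series(["2005-05-24", "2005-05-25"]))
print(as_series.dtype)

# A plain list gives back a DatetimeIndex.
as_index = pd.to_datetime(["2005-05-24", "2005-05-25"])
print(type(as_index).__name__)
```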
Examples
We are going to import data from a CSV file as an example. It contains information on several movie rentals, including the rental ID, the customer ID, and the date when each movie was rented.
import pandas as pd
df = pd.read_csv('rental.csv')
We can verify the data type of each column with the property dtypes:
>>> df.dtypes
rental_id       int64
rental_date    object
customer_id     int64
dtype: object
As you can see, entries of the column ‘rental_date’ are stored with the object dtype, which Pandas uses for strings and mixed types. Converting this column into the Pandas datetime type is as simple as passing it to to_datetime() with no other parameters:
pd.to_datetime(df['rental_date'])
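Note that to_datetime() does not modify the DataFrame in place. The sketch below shows one way to keep the result by assigning it back to the column. It builds a small stand-in DataFrame with the same column names, because the contents of rental.csv are not reproduced here, so the values are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for rental.csv with the same three columns.
df = pd.DataFrame({
    "rental_id": [1, 2, 3],
    "rental_date": ["2005-05-24", "2005-05-25", "2005-06-15"],
    "customer_id": [130, 459, 408],
})

# to_datetime() returns a new Series; assign it back to keep the result.
df["rental_date"] = pd.to_datetime(df["rental_date"])
print(df.dtypes)
```

If the data comes straight from a CSV file, passing parse_dates=['rental_date'] to read_csv() is another way to get a datetime column at load time.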
The function returns a Series containing every record of the original column, but this time they are stored as datetime objects, not strings. The function has no difficulty doing the conversion because the original strings use a format it recognizes by default (ISO 8601): YYYY-MM-DD.
If you have a nearly identical DataFrame whose ‘rental_date’ column stores day-first dates such as '24-01-2005', the same statement will result in warnings.
>>> pd.to_datetime(df2['rental_date'])
<stdin>:1: UserWarning: Parsing '24-01-2005' in DD/MM/YYYY format. Provide format or specify infer_datetime_format=True for consistent parsing.
...
There are two options for getting rid of these warnings: adding the option ‘infer_datetime_format = True’ or specifying the datetime format.
pd.to_datetime(df2['rental_date'], infer_datetime_format=True)
pd.to_datetime(df2['rental_date'], format='%d-%m-%Y')
Both of these statements produce the same result: a Series of datetime objects.
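Because infer_datetime_format is deprecated in pandas 2.0 and later, the sketch below swaps in the dayfirst parameter as the alternative to an explicit format. This is a substitution rather than the approach shown above, and the date strings are made-up examples in the same DD-MM-YYYY layout.

```python
import pandas as pd

# Hypothetical day-first strings like the one quoted in the warning.
dates = pd.Series(["24-01-2005", "25-01-2005", "15-06-2005"])

# Option 1: spell out the layout with strftime codes.
by_format = pd.to_datetime(dates, format="%d-%m-%Y")

# Option 2: tell the parser that the day comes first.
by_dayfirst = pd.to_datetime(dates, dayfirst=True)

print(by_format)
print(by_dayfirst)
```

An explicit format is usually the safer choice, since it fails loudly when a value does not match instead of guessing.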
As mentioned above, you can also convert a whole DataFrame if it has columns labeled “year”, “month”, and “day”.
df = pd.DataFrame({
    'year': [2020, 2021, 2022],
    'month': [3, 6, 7],
    'day': [18, 5, 22]
})
pd.to_datetime(df)
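The same approach also accepts optional time columns. The sketch below adds “hour” and “minute” columns to the example. The extra values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical data: the required year/month/day columns
# plus optional hour and minute columns.
parts = pd.DataFrame({
    "year": [2020, 2021, 2022],
    "month": [3, 6, 7],
    "day": [18, 5, 22],
    "hour": [9, 14, 23],
    "minute": [30, 0, 45],
})

# The columns are combined row by row into one datetime Series.
combined = pd.to_datetime(parts)
print(combined)
```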
Remember that this conversion can be applied to times as well. Let’s say you have a more detailed dataset that records the full timestamp of each rental.
You can use the function to_datetime() to convert the column ‘rental_date’ to a Series of Pandas datetime objects:
pd.to_datetime(df3['rental_date'])
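Here is a minimal sketch of that conversion. Since the original data is not shown, df3 is rebuilt with two made-up timestamps. After converting, the .dt accessor gives access to the individual parts of each timestamp.

```python
import pandas as pd

# Hypothetical rental timestamps with both a date and a time of day.
df3 = pd.DataFrame({
    "rental_id": [1, 2],
    "rental_date": ["2005-05-24 22:53:30", "2005-05-24 22:54:33"],
})

df3["rental_date"] = pd.to_datetime(df3["rental_date"])

# The .dt accessor exposes the components of each timestamp.
hours = df3["rental_date"].dt.hour.tolist()
days = df3["rental_date"].dt.date.tolist()
print(hours)
print(days)
```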
Conclusion
You can convert strings to datetime in a Pandas DataFrame with the function to_datetime(). It can automatically detect common datetime formats; otherwise, you must provide the correct format.