Working with time-series data in pandas
Learn how to handle dates and time-series data using pandas.
A time series is a sequence of observations, usually recorded at regular intervals. Some examples of time-series data are:
- Daily stock price changes
- Average temperature values across weeks
- Hourly record of a patient's vitals
In this lesson, you will learn how to work with time-series data in pandas.
First, let us create a DataFrame containing time-series data:
# Importing the pandas library as pd
import pandas as pd
# Creating a pandas DataFrame using a Python Dictionary
dict_values = {
"date": ["3/10/2000 10:12:02", "3/11/2000", "3/12/2000"],
"temperature": [20.5, 22, 25],
}
df = pd.DataFrame(dict_values)
# Printing the DataFrame
print(df)
# Printing a concise summary of the DataFrame
df.info()
Next, convert the date column to the pandas datetime data type using the to_datetime() function:
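# Converting the column to a datetime column
# Note: the strings mix two layouts (with and without a time). pandas 2.0+ may raise
# a ValueError here; passing format="mixed" to to_datetime() handles that case.
df["date"] = pd.to_datetime(df["date"])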
# Printing the date column
print(df["date"])
The datetime data type is useful because the .dt accessor lets us extract individual date and time components from the column:
# Getting the day
print("Day:\n", df["date"].dt.day)
# Getting the month
print("Month:\n", df["date"].dt.month)
# Getting the year
print("Year:\n", df["date"].dt.year)
# Getting the hour
print("Hour:\n", df["date"].dt.hour)
# Getting the minute
print("Minute:\n", df["date"].dt.minute)
# Getting the second
print("Second:\n", df["date"].dt.second)
Creating a pandas datetime range
Sometimes you might need to create your own datetime values when creating a pandas DataFrame. A datetime range can be created using the date_range() function from pandas, which returns a DatetimeIndex.
# Creating a fixed frequency DatetimeIndex (daily frequency by default)
print(pd.date_range("3/10/2000", periods=10))
The frequency of the datetime values can be changed with the freq parameter:
# Note: pandas 2.2+ deprecates the aliases "Y", "H", and "S" in favor of "YE", "h", and "s"
# Creating a fixed frequency DatetimeIndex (Year-End Frequency)
print(pd.date_range("3/10/2000", periods=10, freq="Y"))
# Creating a fixed frequency DatetimeIndex (Hourly Frequency)
print(pd.date_range("3/10/2000 03:00:00", periods=10, freq="H"))
# Creating a fixed frequency DatetimeIndex (Second Frequency)
print(pd.date_range("3/10/2000", periods=10, freq="S"))
You can also specify the end date of the range together with a frequency:
# Creating a fixed frequency DatetimeIndex
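print(pd.date_range("3/10/2000", "3/12/2000", freq="D"))
Note: When you provide both a start and an end date, you can pass either periods or freq, but not both. With periods, pandas returns that many evenly spaced timestamps between the two dates: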
# Creating 10 evenly spaced timestamps between the start and end dates
print(pd.date_range("3/10/2000", "3/12/2000", periods=10))
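To tie the two halves of this lesson together, here is a minimal sketch that uses date_range() to generate the date column of a DataFrame and then reads components back with the .dt accessor. The temperature values and the day and month column names are made up for illustration, and the start date is written in ISO format (YYYY-MM-DD) to avoid any month/day ambiguity.

```python
import pandas as pd

# Generating three consecutive daily timestamps starting on March 10, 2000
dates = pd.date_range("2000-03-10", periods=3, freq="D")

# Building a DataFrame from the generated dates (temperatures are illustrative)
df = pd.DataFrame({"date": dates, "temperature": [20.5, 22.0, 25.0]})

# The column is already a datetime column, so .dt works without to_datetime()
df["day"] = df["date"].dt.day
df["month"] = df["date"].dt.month

# Printing the DataFrame with the extracted components
print(df)
```

Because date_range() already returns datetime values, no conversion step is needed before using .dt here; to_datetime() is only required when the dates start out as strings.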