The Ultimate Pandas Cheatsheet

A reference guide to working with Series and DataFrames in pandas, the Python library for data manipulation and analysis.


Creating Series and DataFrames

Action | Code Example
Import the library | import pandas as pd
Create a Series from a list | s = pd.Series([909976, 8615246, 2872086, 2273305])
Create a Series with an index and name | s = pd.Series([...], name="Population", index=["Stockholm", "London", ...])
Create a DataFrame from a nested list | df = pd.DataFrame([[909976, "Sweden"], [8615246, "United Kingdom"], ...])
Create a DataFrame from a dictionary | df = pd.DataFrame({"Population": [...], "State": [...]}, index=["Stockholm", ...])
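
The rows above elide most of the values. A minimal runnable sketch that fills them in might look like the following; the populations and city order come from this cheatsheet, while the "Italy" and "France" entries are assumptions added for illustration.

```python
import pandas as pd

# Population figures from the first Series example; the city order matches
# the index labels used later in this cheatsheet.
cities = ["Stockholm", "London", "Rome", "Paris"]
population = [909976, 8615246, 2872086, 2273305]

# A Series is a one-dimensional labeled array.
s = pd.Series(population, index=cities, name="Population")

# A DataFrame is a two-dimensional table; each dictionary key becomes a column.
# "Italy" and "France" are illustrative assumptions, not taken from the source.
df = pd.DataFrame(
    {
        "Population": population,
        "State": ["Sweden", "United Kingdom", "Italy", "France"],
    },
    index=cities,
)

print(s["London"])          # label-based lookup on the Series
print(df.loc["Stockholm"])  # one row of the DataFrame, returned as a Series
```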

Inspect Data

Action | Code Example
Get the index | s.index
Get the values (as a NumPy array) | s.values
Get the columns | df.columns
Show the first 5 rows | df.head()
Get a summary of the DataFrame | df.info()
Get the data types of each column | df.dtypes
Get descriptive statistics | s.describe()
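
As a rough illustration of what these inspection calls return, the sketch below runs them on a small DataFrame. The data is the same illustrative city table used for this example only; "Italy" and "France" are assumed values.

```python
import pandas as pd

# Small illustrative DataFrame (country names for Rome and Paris are assumed).
df = pd.DataFrame(
    {
        "Population": [909976, 8615246, 2872086, 2273305],
        "State": ["Sweden", "United Kingdom", "Italy", "France"],
    },
    index=["Stockholm", "London", "Rome", "Paris"],
)
s = df["Population"]

print(s.index)        # the row labels
print(df.columns)     # the column labels
print(df.head(2))     # head() takes an optional row count; the default is 5
df.info()             # prints the summary itself and returns None
print(df.dtypes)      # one dtype per column

# describe() on a numeric Series returns count, mean, std, min, quartiles, max.
stats = s.describe()
print(stats["max"])
```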

Selecting and Indexing Data

Action | Code Example
Select a column (returns a Series) | df["Population"] or df.Population
Select a row by label | df.loc["Stockholm"]
Select multiple rows by label | df.loc[["Paris", "Rome"]]
Select specific rows and a column | df.loc[["Paris", "Rome"], "Population"]
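
The following sketch shows what type each selection returns, again on an illustrative city table (the "Italy" and "France" values are assumptions). Note that attribute access such as df.Population only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute; bracket access always works.

```python
import pandas as pd

# Illustrative data; country names for Rome and Paris are assumed.
df = pd.DataFrame(
    {
        "Population": [909976, 8615246, 2872086, 2273305],
        "State": ["Sweden", "United Kingdom", "Italy", "France"],
    },
    index=["Stockholm", "London", "Rome", "Paris"],
)

col = df["Population"]                            # Series indexed by city
row = df.loc["Stockholm"]                         # Series indexed by column name
rows = df.loc[["Paris", "Rome"]]                  # DataFrame, in the requested order
subset = df.loc[["Paris", "Rome"], "Population"]  # Series with two entries

print(type(col).__name__, type(row).__name__)
print(type(rows).__name__, type(subset).__name__)
```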

Data Manipulation and Cleaning

Action | Code Example
Set the index of a Series or DataFrame | s.index = ["Stockholm", "London", "Rome", "Paris"]
Set the column names of a DataFrame | df.columns = ["Population", "State"]
Apply a function to a Series | df.Population.apply(lambda x: int(x.replace(",", "")))
Set a column as the index | df_pop2 = df_pop.set_index("City")
Set a hierarchical (multi-level) index | df_pop3 = df_pop.set_index(["State", "City"])
Sort by the index | df.sort_index(level=0)
Sort by column values | df.sort_values(["State", "NumericPopulation"], ascending=[False, True])
Count unique values in a Series | city_counts = df_pop.State.value_counts()
Group by an index level and aggregate | df_pop3.groupby(level="State").sum()
Group by a column and aggregate | df.groupby("State").sum()
Create a pivot table | pd.pivot_table(df, values='outdoor', index=['month'], columns=['hour'])
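
The calls above are easier to follow when chained on one dataset. The sketch below is one way to do that, assuming a df_pop table with City, State, and Population columns where the populations are strings with thousands separators. All values here, including the Birmingham row and the temperature readings, are made up for illustration.

```python
import pandas as pd

# Illustrative data; the population strings contain thousands separators,
# which is why they need cleaning before any arithmetic.
df_pop = pd.DataFrame(
    {
        "City": ["Stockholm", "London", "Rome", "Paris", "Birmingham"],
        "State": ["Sweden", "United Kingdom", "Italy", "France", "United Kingdom"],
        "Population": ["909,976", "8,615,246", "2,872,086", "2,273,305", "1,092,330"],
    }
)

# Strip the commas and convert to int, storing the result in a new column.
df_pop["NumericPopulation"] = df_pop.Population.apply(
    lambda x: int(x.replace(",", ""))
)

# Hierarchical index: State is the outer level, City the inner level.
df_pop3 = df_pop.set_index(["State", "City"]).sort_index(level=0)

# Sort by State descending, then by population ascending within each state.
df_sorted = df_pop.sort_values(
    ["State", "NumericPopulation"], ascending=[False, True]
)

# Number of rows (cities) per state.
city_counts = df_pop.State.value_counts()

# Total population per state, grouping on the "State" index level.
totals = df_pop3.groupby(level="State")["NumericPopulation"].sum()
print(totals)

# Pivot table on hypothetical temperature readings: one row per month,
# one column per hour, cells hold the mean (the default aggregation).
df_temp = pd.DataFrame(
    {
        "month": [1, 1, 2, 2],
        "hour": [0, 12, 0, 12],
        "outdoor": [-3.0, 1.0, -1.0, 4.0],
    }
)
table = pd.pivot_table(df_temp, values="outdoor", index=["month"], columns=["hour"])
print(table)
```

Selecting the numeric column before calling sum() keeps the aggregation from also being applied to the string columns.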

Time Series

Action | Code Example
Create a date range | pd.date_range("2015-1-1", periods=31, freq="D")
Convert Unix timestamps to Datetime objects | df.time = pd.to_datetime(df.time.values, unit="s")
Localize and convert timezone (on a column, via the .dt accessor) | df.time.dt.tz_localize('UTC').dt.tz_convert('Europe/Stockholm')
Select a time slice (requires a DatetimeIndex) | df2.loc["2014-1-1":"2014-1-31"]
Convert DatetimeIndex to PeriodIndex | df.to_period("M")
Downsample a time series | df1_day = df1.resample("D").mean()
Upsample with forward fill | df1.resample("5min").ffill()
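
A sketch of how these steps fit together, from raw Unix timestamps to resampled data. The readings, the column name "outdoor", and the 10-minute spacing are assumptions chosen for the example. Called directly on a Series, tz_localize acts on the index, so time zone handling for a column goes through the .dt accessor.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one reading every 10 minutes for two days,
# with the time stored as Unix timestamps in seconds.
start = 1388534400  # 2014-01-01 00:00:00 UTC
n = 2 * 24 * 6      # 288 readings
df = pd.DataFrame(
    {
        "time": start + 600 * np.arange(n),
        "outdoor": np.linspace(-5.0, 5.0, n),
    }
)

# Seconds since the epoch -> naive datetime values.
df["time"] = pd.to_datetime(df["time"], unit="s")

# Mark the values as UTC, then convert them to local time.
df["time"] = df["time"].dt.tz_localize("UTC").dt.tz_convert("Europe/Stockholm")

# With the timestamps as the index, string slicing and resampling are available.
df1 = df.set_index("time")
first_two_days = df1.loc["2014-01-01":"2014-01-02"]  # end label is inclusive

df1_day = df1.resample("D").mean()        # downsample: one mean value per local day
df1_5min = df1.resample("5min").ffill()   # upsample: repeat the last known reading

# to_period replaces each timestamp label with the period that contains it.
ts = pd.Series(np.arange(31), index=pd.date_range("2015-1-1", periods=31, freq="D"))
monthly = ts.to_period("M")
print(df1_day)
```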

Plotting

Action | Code Example
Create a plot from a Series or DataFrame | s.plot(kind='bar', title='bar')
Plot multiple columns from a DataFrame | df_temp.plot(y=["outdoor", "indoor"], ax=ax)
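
The second row uses an ax object that the table does not define. One common way to get it is from Matplotlib's plt.subplots, as in the sketch below. The data values, the figure size, and the output file name temperatures.png are assumptions for illustration, and pandas plotting requires Matplotlib to be installed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for the two plots.
s = pd.Series(
    [909976, 8615246, 2872086, 2273305],
    index=["Stockholm", "London", "Rome", "Paris"],
)
df_temp = pd.DataFrame(
    {"outdoor": [-3.0, -1.5, 0.5, 2.0], "indoor": [21.0, 21.5, 22.0, 21.8]},
    index=pd.date_range("2014-1-1", periods=4, freq="D"),
)

# One figure with two Axes; each pandas plot call draws into the Axes it is given.
fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(10, 3))
s.plot(kind="bar", title="Population", ax=ax_bar)
df_temp.plot(y=["outdoor", "indoor"], ax=ax_line)

fig.tight_layout()
fig.savefig("temperatures.png")  # hypothetical output file name
```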

Data Input / Output (I/O)

Action | Code Example
Read data from a CSV file | df_pop = pd.read_csv("european_cities.csv")
Write a DataFrame to a CSV file | df.to_csv("subset.csv")
Write to an HDF5 file store | store = pd.HDFStore('store.h5'); store["df1"] = df
Read from an HDF5 file store | df = store["df1"]
Write to a Parquet file with partitioning | df.to_parquet("data.parquet", partition_cols=["dt"])
Read from a Parquet file | df_new = pd.read_parquet("data.parquet")
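
A minimal CSV round trip, sketched with a temporary directory so it leaves no files behind. The data and the "City" index name are assumptions for the example. The HDF5 and Parquet rows need optional dependencies (PyTables for HDFStore; pyarrow or fastparquet for Parquet), and with partition_cols the given path is used as a directory rather than a single file.

```python
import os
import tempfile

import pandas as pd

# Illustrative DataFrame with a named index.
df = pd.DataFrame(
    {"Population": [909976, 8615246], "State": ["Sweden", "United Kingdom"]},
    index=pd.Index(["Stockholm", "London"], name="City"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "subset.csv")
    # to_csv writes the index as the first column by default.
    df.to_csv(path)
    # index_col restores that column as the index when reading back.
    df_back = pd.read_csv(path, index_col="City")

print(df_back)
```

Without index_col, read_csv would return "City" as an ordinary column and add a fresh integer index.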

Key Concepts Summary

Series vs DataFrame

Indexing Methods

Data Types in Pandas

Essential Import

import pandas as pd

This cheatsheet covers core pandas operations: creating and inspecting Series and DataFrames, selecting and cleaning data, working with time series, plotting, and file I/O. Practicing these methods on real datasets is a good way to build fluency with them.

Updated: January 15, 2025
Author: Danial Pahlavan
Category: Data Science & Analysis