Pandas Functions Worth Memorizing

Jake from Mito
2 min read · Jun 10, 2022


Find Unique Values in a Column

df["column_name"].unique()

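This function returns the distinct values in a column, so you can quickly see what data it contains. It does not count how often each value occurs; for the distribution, use df["column_name"].value_counts() instead.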
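As a rough sketch with made-up data (the "fruit" column is only an assumption for illustration), unique() lists the distinct values, while value_counts() is the call that shows how often each one appears:

```python
import pandas as pd

# Hypothetical example data; the column name "fruit" is an assumption.
df = pd.DataFrame({"fruit": ["apple", "banana", "apple", "cherry", "apple"]})

# Distinct values, in order of first appearance.
unique_values = df["fruit"].unique()

# Number of occurrences of each value, most frequent first.
counts = df["fruit"].value_counts()

print(unique_values)
print(counts)
```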

Web Scraping

pd.read_html("URL")

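Web scraping is a common reason to use Python. Pandas has a powerful yet little-known function that takes a URL, finds the HTML tables on that page, and returns them as a list of dataframes, so you pick the table you want by index, for example pd.read_html("URL")[0]. It needs an HTML parser such as lxml installed. See the pandas read_html documentation for the full list of options.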

Correlation Matrix

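df.corr(numeric_only=True)

Understanding the correlations between your numerical columns is a great first step in deciding what type of analysis you want to apply to the data. Pandas has a method that you can call on any dataframe to produce a correlation matrix. In recent versions of pandas (2.0 and later), a plain df.corr() raises an error if the dataframe contains non-numeric columns, so pass numeric_only=True to restrict the calculation to the numeric ones.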
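Here is a small sketch of what that looks like, using invented column names and values. It selects the numeric columns explicitly with select_dtypes before calling corr(), which is one way to keep the call working regardless of pandas version:

```python
import pandas as pd

# Hypothetical data: two numeric columns and one text column.
df = pd.DataFrame({
    "height_cm": [150, 160, 170, 180],
    "weight_kg": [50, 60, 70, 80],
    "name": ["a", "b", "c", "d"],
})

# Keep only numeric columns, then compute pairwise Pearson correlations.
corr_matrix = df.select_dtypes("number").corr()

print(corr_matrix)
```

The result is a square dataframe with one row and one column per numeric column, and 1.0 on the diagonal.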

Replace Null Values with Zeros (for example)

df.replace(np.nan, 0, inplace=True)

Getting rid of null values can be a key aspect of data cleaning and data analysis, though not all analyses require this step. There are many ways of going about this, but this simple one-line call takes all the null values and replaces them with zeros. Note that the replacement is the number 0, not the string "0"; a quoted "0" would put text into your numeric columns. You can switch out the 0 for any other value by changing the second argument. This snippet assumes NumPy has been imported as np.
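As one of those other ways, here is a minimal sketch using fillna(), with made-up data. It returns a new dataframe rather than modifying the original, and it also accepts a dictionary if you want a different fill value per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

# Replace every null with the number 0; df itself is left unchanged.
filled = df.fillna(0)

# Per-column fill values: the mean of "a" for column a, 0 for column b.
per_column = df.fillna({"a": df["a"].mean(), "b": 0})

print(filled)
print(per_column)
```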

Export your Dataframe to an Excel File

df.to_excel('dir/myDataFrame.xlsx', sheet_name='Sheet1')

Understanding how to go back and forth between Excel and Python can be tricky, but many data scientists will find themselves grappling with this workflow frequently. This function writes your dataframe to an Excel file, creating the file if it does not exist and overwriting it if it does. All you need to do is specify the file path and the sheet name as arguments in the function. Writing .xlsx files requires an Excel engine such as openpyxl to be installed. See the pandas to_excel documentation for the full list of options.

Retrieve Dataframe Information

Return the number of rows and columns in your dataframe (shape is an attribute, not a method, so it takes no parentheses):

df.shape

Get summary statistics about your dataframe:

df.describe()
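To make the difference between the two concrete, here is a short sketch on an invented dataframe. shape gives a (rows, columns) tuple, and describe() returns a dataframe of summary statistics for the numeric columns:

```python
import pandas as pd

# Hypothetical data: 4 rows, 2 numeric columns.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [10.0, 20.0, 30.0, 40.0]})

# shape is a tuple, so it can be unpacked directly.
n_rows, n_cols = df.shape

# describe() reports count, mean, std, min, quartiles, and max per column.
summary = df.describe()

print(n_rows, n_cols)
print(summary)
```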

I hope these functions are helpful :)
