Introduction to pandas

Learn about pandas, its benefits, and how to install and import it.

In this lesson, you will learn about pandas along with how to install and import it in Python. You will also learn how to check the version of the installed pandas library.

What is pandas?

According to the official documentation, pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.

The pandas library is built on top of NumPy and provides flexible data structures for manipulating numerical tables and time series. Its stated broader goal is to become the most powerful and flexible open-source data analysis and manipulation tool available in any language.

With just two primary data structures, the one-dimensional Series and the two-dimensional DataFrame, pandas can handle the majority of data used in finance, statistics, and many other fields. You will learn about these data structures in upcoming lessons.
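
As a quick preview before those lessons, here is a minimal sketch of the two structures. The tickers, quantities, and prices are made-up values used only for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
# The labels and prices below are made-up illustration values.
prices = pd.Series([101.5, 99.8, 102.3], index=["AAA", "BBB", "CCC"], name="price")

# A DataFrame is a two-dimensional labeled table.
trades = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "CCC"],
        "quantity": [10, 5, 8],
        "price": [101.5, 99.8, 102.3],
    }
)

print(prices["BBB"])          # look up a value by its label
print(trades.shape)           # (number of rows, number of columns)
print(type(trades["price"]))  # selecting a single column returns a Series
```

Note that selecting one column of a DataFrame gives back a Series, which is why the two structures work so naturally together.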

Benefits of pandas

Here is a list of some of the benefits that pandas provides:

  • It provides tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format.
  • It provides high-performance merging and joining of data sets.
  • It provides time-series functionality: date range generation and frequency conversion, moving window statistics, and date shifting and lagging. You can even create domain-specific time offsets and join time series without losing data.
  • It is highly optimized for performance, with critical code paths written in Cython or C.
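
To make a few of these benefits concrete, the sketch below reads CSV data, merges two tables, and computes a moving average over a daily time series. All of the data is invented for illustration, and an in-memory text buffer stands in for a real file on disk.

```python
import io

import pandas as pd

# Reading: parse CSV text from an in-memory buffer (stands in for a file).
csv_text = "id,name\n1,Ann\n2,Bob\n3,Cy\n"
people = pd.read_csv(io.StringIO(csv_text))

# Merging: join two tables on a shared key column.
scores = pd.DataFrame({"id": [1, 2, 4], "score": [90, 75, 60]})
merged = people.merge(scores, on="id", how="inner")  # keeps ids found in both tables

# Time series: generate a daily date range, then compute a 2-day moving average.
dates = pd.date_range("2021-01-01", periods=4, freq="D")
values = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates)
moving_avg = values.rolling(window=2).mean()
shifted = values.shift(1)  # lag every value by one day

# Writing: serialize the merged table back to CSV text.
csv_out = merged.to_csv(index=False)

print(merged)
print(moving_avg)
print(csv_out)
```

The first entry of the moving average is missing (NaN) because a 2-day window needs two observations before it can produce a value.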

Installing pandas using pip

pandas can be installed with pip, the package installer for Python. Run the following command in your command line/terminal to install pandas:

pip install pandas

To install pandas from within Jupyter Notebook/Lab, prefix the command with an exclamation mark (!), which runs it in the underlying operating system's shell.

python
!pip install pandas

This will install the latest stable version of pandas for you to import and work with.

Importing pandas in Python

Once you’ve installed pandas, you can use your favorite IDE (PyCharm, Jupyter Notebook, etc.) or the Python shell to import the library and use it.

By convention, pandas is imported under the alias pd, and you will see this in most Python code that uses the library. The following block of code illustrates how to import pandas in Python:

python
# Importing the pandas library as pd
import pandas as pd

If this line of code runs without an error, you have successfully installed and imported pandas in Python.
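
If you would rather check for pandas without triggering an import error, one possible approach is the standard library's importlib. This is a minimal sketch rather than a required step; the variable name pandas_available is just an illustrative choice.

```python
import importlib.util

# find_spec returns None when the package cannot be found,
# and it does not import the package itself.
pandas_available = importlib.util.find_spec("pandas") is not None

if pandas_available:
    print("pandas is installed and can be imported.")
else:
    print("pandas was not found; try running: pip install pandas")
```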

Next, you can check the version of the installed pandas library by printing the __version__ attribute of the pandas package.

python
# Checking the version of the pandas library
print(pd.__version__)

If your installed version is 1.2.0 or newer, you are good to go with this course!
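
The same check can be done in code instead of by eye. The sketch below compares only the major and minor parts of the version string against the course minimum; the names MIN_VERSION and meets_requirement are illustrative, and the parsing assumes the version string begins with numeric major and minor components, as pandas releases do.

```python
import pandas as pd

MIN_VERSION = (1, 2)  # minimum (major, minor) version for this course

# Keep only the major and minor components, e.g. "1.5.3" -> (1, 5).
major, minor = (int(part) for part in pd.__version__.split(".")[:2])
meets_requirement = (major, minor) >= MIN_VERSION

status = "OK" if meets_requirement else "please upgrade"
print(f"pandas {pd.__version__}: {status}")
```

If the check reports an older version, running pip install --upgrade pandas should bring it up to date.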