Pandas - Analyzing DataFrames

Dhanapriya D

Viewing Data in a DataFrame

When working with large datasets, it's essential to quickly preview the data. One of the most commonly used methods in Pandas for this purpose is the head() method.

The head() method displays the column headers along with a specified number of rows from the beginning of the DataFrame.

Program

Display the First 10 Rows

import pandas as pd

df = pd.read_csv('data.csv')

print(df.head(10))

In this example, we're using a CSV file named data.csv. You can either download this file or open it in your browser to follow along.

Note: If no number is specified, head() will return the first 5 rows by default.

Program

Display the First 5 Rows

import pandas as pd

df = pd.read_csv('data.csv')

print(df.head())

In addition to head(), Pandas also provides the tail() method to preview data from the end of the DataFrame.

The tail() method works the same way: it displays the column headers and the specified number of rows from the bottom, defaulting to the last 5 rows when no number is given.

Program

Display the Last 5 Rows

print(df.tail())
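If you do not have data.csv at hand, the behavior of head() and tail() can be sketched on a small DataFrame built in memory. The column names below mirror the dataset, but the values are invented for illustration only.

```python
import pandas as pd

# Hypothetical stand-in data: column names mirror data.csv, values are made up.
df = pd.DataFrame({
    "Duration": [60, 60, 60, 45, 45, 60, 60, 45],
    "Pulse": [110, 117, 103, 109, 117, 102, 110, 104],
})

first_three = df.head(3)  # first 3 rows (index 0, 1, 2)
last_two = df.tail(2)     # last 2 rows (index 6, 7)
default_head = df.head()  # no argument: first 5 rows

print(first_three)
print(last_two)
```

Both methods return a new DataFrame, so the result can be stored in a variable and inspected further instead of only being printed.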

Inspecting Data with info()

To gain a quick overview of your dataset’s structure, Pandas provides a built-in method called info() that displays essential metadata about the DataFrame.

Program

View Basic Information

df.info()  # info() prints its report directly and returns None, so print() is not needed

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
dtypes: float64(1), int64(3)
memory usage: 5.4 KB

Result Explained

The result tells us there are 169 rows and 4 columns:

RangeIndex: 169 entries, 0 to 168

Data columns (total 4 columns):

It also lists each column's name, its count of non-null values, and its data type:

 #   Column    Non-Null Count  Dtype
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
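The same facts that info() prints can also be read as values, which helps when you want to use them in code. The sketch below assumes a small made-up DataFrame with the same column names as the dataset; it is not the real data.csv.

```python
import pandas as pd

# Hypothetical stand-in data: same column names as data.csv, made-up values.
df = pd.DataFrame({
    "Duration": [60, 45, 30],
    "Pulse": [110, 109, 100],
    "Maxpulse": [130, 175, 120],
    "Calories": [409.1, None, 240.0],  # None is stored as NaN (missing)
})

rows, cols = df.shape         # (number of rows, number of columns)
non_null_counts = df.count()  # non-null values per column

print(rows, cols)
print(df.dtypes)              # data type of each column
print(non_null_counts)
```

Here shape corresponds to the "entries" and "total columns" lines of the report, dtypes to the Dtype column, and count() to the Non-Null Count column.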

Dealing with Null Values

Null or missing values can lead to inaccurate analysis if not handled properly. In this dataset, the "Calories" column has only 164 non-null values out of 169 rows, so 5 rows have no value. You should address this during data cleaning, an essential step in preparing your data for analysis.
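One way to find such gaps without reading the info() report by eye is to count missing values per column with isna().sum(). This is a minimal sketch on made-up data, not the real data.csv.

```python
import pandas as pd

# Hypothetical stand-in data with two missing Calories values.
df = pd.DataFrame({
    "Duration": [60, 45, 30, 60],
    "Calories": [409.1, None, 240.0, None],
})

# isna() marks each missing cell as True; sum() counts the True values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Select only the rows where Calories is missing, to see which ones need attention.
rows_with_missing = df[df["Calories"].isna()]
print(rows_with_missing)
```

Run against the dataset described above, the same count would be expected to report 5 for Calories, matching the difference between 169 rows and 164 non-null values.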
