Viewing Data in a DataFrame
When working with large datasets, it's essential to quickly preview the data. One of the most commonly used methods in Pandas for this purpose is the head() method.
The head() method displays the column headers along with a specified number of rows from the beginning of the DataFrame.
Program
Display the First 10 Rows
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head(10))
In this example, we're using a CSV file named data.csv. You can either download this file or open it in your browser to follow along.
Note: If no number is specified, head() will return the first 5 rows by default.
Program
Display the First 5 Rows
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
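To try head() without data.csv at hand, the sketch below builds a small stand-in DataFrame in memory. The column names mirror the example file, but the values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical stand-in for data.csv: 8 made-up rows, not the real file contents.
df = pd.DataFrame({
    "Duration": [60, 60, 60, 45, 45, 60, 60, 45],
    "Pulse": [110, 117, 103, 109, 117, 102, 110, 104],
})

first_three = df.head(3)   # explicit row count
default_rows = df.head()   # no argument: first 5 rows

print(first_three)
print(len(default_rows))
```

head() returns a new DataFrame rather than printing anything itself, so the result can be stored in a variable and reused.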
In addition to head(), Pandas also provides the tail() method to preview data from the end of the DataFrame.
The tail() method works similarly and displays the column headers and the specified number of rows from the bottom.
Program
Display the Last 5 Rows
print(df.tail())
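Like head(), tail() accepts an optional row count. The following is a minimal sketch using made-up data in place of data.csv.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Duration": [60, 45, 30, 60, 45, 30, 60]})

last_two = df.tail(2)      # explicit row count
default_rows = df.tail()   # no argument: last 5 rows

print(last_two)
```

The rows keep their original index labels (here 5 and 6), which makes it easy to see where in the DataFrame they came from.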
Inspecting Data with info()
To gain a quick overview of your dataset’s structure, Pandas provides a built-in method called info() that displays essential metadata about the DataFrame.
Program
View Basic Information
print(df.info())
Output
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
dtypes: float64(1), int64(3)
memory usage: 5.4 KB
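None
Note: The final None is not part of the report. info() prints its report directly and returns None, and the surrounding print() call then prints that return value. Calling df.info() on its own omits it.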
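If the report is needed as a string, for example to write it to a log, info() accepts a buf argument that redirects the text. The sketch below assumes a small made-up DataFrame rather than the real data.csv.

```python
import io
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Duration": [60, 45, 30], "Calories": [409.1, None, 340.0]})

result = df.info()         # prints the report to stdout; the return value is None

buffer = io.StringIO()
df.info(buf=buffer)        # write the report into the buffer instead of stdout
report = buffer.getvalue() # the same report, now available as a string
```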
Result Explained
The result tells us there are 169 rows and 4 columns:
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
It also lists the name of each column, along with its non-null count and data type:
 #   Column    Non-Null Count  Dtype
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
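The same facts can also be read programmatically instead of from the printed report. This is a minimal sketch on made-up data; the attributes shape, columns, and dtypes are standard DataFrame attributes.

```python
import pandas as pd

# Hypothetical stand-in data; the real data.csv has 169 rows.
df = pd.DataFrame({
    "Duration": [60, 45, 30],
    "Pulse": [110, 109, 103],
    "Calories": [409.1, None, 340.0],
})

rows, cols = df.shape          # (number of rows, number of columns)
print(rows, cols)
print(list(df.columns))        # column names
print(df.dtypes)               # data type of each column
```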
Dealing with Null Values
Null or missing values can lead to inaccurate analysis if not handled properly. In this dataset, info() reports 164 non-null values in the "Calories" column out of 169 rows, so 5 rows have no value. Address this during data cleaning, an essential step in preparing your data for analysis.
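One way to count missing values directly, rather than subtracting from the info() report, is to combine isna() with sum(). The sketch below uses made-up data with two missing Calories values.

```python
import pandas as pd

# Hypothetical data for illustration only: two Calories values are missing.
df = pd.DataFrame({
    "Duration": [60, 45, 30, 60],
    "Calories": [409.1, None, 340.0, None],
})

missing_per_column = df.isna().sum()   # number of missing values in each column
print(missing_per_column)

# Select only the rows that lack a Calories value.
rows_missing_calories = df[df["Calories"].isna()]
print(rows_missing_calories)
```

Seeing which rows are affected is usually the first step before deciding whether to drop them or fill them in.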