Pandas DataFrames

Dhanapriya D

DataFrames

In Pandas, tabular data is typically stored in two-dimensional tables known as DataFrames.

While a Series represents a single column of data, a DataFrame is the entire table, containing rows and columns, much like a spreadsheet or SQL table.

Program:

import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data)
print(df)

Output:

   calories  duration
0       420        50
1       380        40
2       390        45
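
To make the Series/DataFrame relationship concrete, here is a minimal sketch that pulls one column out of the same table. The variable name calories is only illustrative and is not part of the original program.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data)

# Selecting a single column by name gives back a Series
calories = df["calories"]

print(type(df).__name__)        # DataFrame
print(type(calories).__name__)  # Series
print(df.shape)                 # (3, 2): 3 rows, 2 columns
```

Each column of the DataFrame is a Series that shares the same row index as the table it came from.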


Locate Row

A DataFrame is structured like a table with rows and columns.

To access specific row(s), Pandas provides the loc attribute, which allows you to retrieve one or more rows by their label(s).

Program:

# Return row 0

print(df.loc[0])

Output:

calories    420
duration     50
Name: 0, dtype: int64

This program returns a Pandas Series. Passing a list of labels inside the brackets, as in df.loc[[0, 1]], returns a Pandas DataFrame instead.
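
The difference between a single label and a list of labels can be checked with a short sketch like the one below. The variable names row and rows are illustrative assumptions, not taken from the original program.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
})

# A single label returns that row as a Series
row = df.loc[0]
print(type(row).__name__)   # Series

# A list of labels returns a DataFrame, even if the list has one label
rows = df.loc[[0, 1]]
print(type(rows).__name__)  # DataFrame
print(rows)
```

Using the list form is handy when later code expects a table, since the result keeps its column headers and two-dimensional shape.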

Named Indexes

You can assign custom names to the indexes in a Series or DataFrame using the index parameter. This allows you to label each row with meaningful identifiers instead of default numeric values.

Program:

import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# Create DataFrame with custom row labels
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)

Output:

      calories  duration
day1       420        50
day2       380        40
day3       390        45
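
The same index parameter works for a Series. The following is a small sketch, assuming a hypothetical Series that holds only the calories values from the table above.

```python
import pandas as pd

# Hypothetical Series with the same custom labels as the DataFrame
calories = pd.Series([420, 380, 390], index=["day1", "day2", "day3"])

print(calories)

# Values can now be looked up by label instead of by position
print(calories["day2"])  # 380
```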

Locate Named Indexes

When a DataFrame has custom row labels, you can use the .loc[] attribute to access rows by their names.

Program:

# Refer to the row using its name
print(df.loc["day2"])

Output:

calories    380
duration     40
Name: day2, dtype: int64
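
Named labels can also be combined. The sketch below, which rebuilds the same DataFrame so it runs on its own, selects several named rows at once and then a single cell by row and column label. The variable names subset and value are illustrative.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# A list of row names returns a DataFrame with just those rows
subset = df.loc[["day1", "day3"]]
print(subset)

# A row name plus a column name returns a single value
value = df.loc["day2", "calories"]
print(value)  # 380
```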

Load Files Into a DataFrame

If your dataset is stored in a file, Pandas can easily read the file and load the data into a DataFrame for analysis and manipulation.

Suppose a file named data.csv contains:

calories,duration
420,50
380,40
390,45

You can load the above file into a DataFrame like this:

Program:

import pandas as pd
# Load CSV file into a DataFrame

df = pd.read_csv('data.csv')
print(df)

Output:
        
   calories  duration
0       420        50
1       380        40
2       390        45
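
If you want to try this without creating a file first, read_csv also accepts a file-like object. The sketch below uses io.StringIO as a stand-in for data.csv; the csv_text variable is an assumption made only so the example is self-contained.

```python
import io
import pandas as pd

# Stand-in for the contents of data.csv
csv_text = "calories,duration\n420,50\n380,40\n390,45\n"

# read_csv accepts a file path or a file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (3, 2)
print(list(df.columns))  # ['calories', 'duration']
```

The first line of the CSV is used as the column names by default, and the rows receive the default numeric index starting at 0.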

