Pandas Basic DataFrame Operations

Hello, everyone! In this series we will learn pandas module by module. This first module covers how a DataFrame is created, how it is read back from a file, and how to apply basic operations to its rows and columns. The remaining topics are discussed in later modules.

Operations on DataFrame

A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes. Data is arranged in rows and columns, and each column can hold a different type. A DataFrame has three principal components: the data, the rows, and the columns.

In [3]:

import pandas as pd
 
# Initialise the data as a dict of lists.
data = {'Name':['ram', 'sham', 'alpha', 'gamma'],
        'Age':[20, 21, 19, 18],
        'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)

    Name  Age    Address Qualification
0    ram   20      Delhi           Msc
1   sham   21     Kanpur            MA
2  alpha   19  Allahabad           MCA
3  gamma   18    Kannauj           Phd
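
The three components mentioned above (data, rows, columns) can be inspected directly on the DataFrame. The following is a small sketch that rebuilds the same sample data; the variable names row_labels, column_labels and values are only illustrative.

```python
import pandas as pd

# Same sample data as in the cell above.
data = {'Name': ['ram', 'sham', 'alpha', 'gamma'],
        'Age': [20, 21, 19, 18],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
df = pd.DataFrame(data)

row_labels = list(df.index)       # row labels: the default integer index
column_labels = list(df.columns)  # column labels, in insertion order
values = df.to_numpy()            # the underlying data as a 2-D NumPy array

print(row_labels)
print(column_labels)
print(values.shape)  # (number of rows, number of columns)
print(df.dtypes)     # each column keeps its own dtype
```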

In [6]:

# Save the DataFrame to a CSV file. The index is written too, as an unnamed first column.
df.to_csv("name.csv")
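
By default, to_csv() also writes the row index, which comes back as an "Unnamed: 0" column when the file is read again. If that column is not wanted, passing index=False is one way to avoid it. The sketch below uses a shortened two-row frame and an in-memory buffer instead of a real file, purely to stay self-contained.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['ram', 'sham'], 'Age': [20, 21]})

# With no path argument, to_csv() returns the CSV text instead of writing a file.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# Reading the first version back yields an extra "Unnamed: 0" column,
# while the second version has only the original columns.
round_trip_a = pd.read_csv(io.StringIO(with_index))
round_trip_b = pd.read_csv(io.StringIO(without_index))

print(list(round_trip_a.columns))
print(list(round_trip_b.columns))
```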

Column Selection

To select columns in a pandas DataFrame, refer to them by name: pass a single column name, or a list of column names, inside square brackets.

In [4]:

df[['Name', 'Qualification']]

Out[4]:

	Name	Qualification
0	ram	Msc
1	sham	MA
2	alpha	MCA
3	gamma	Phd
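
It is worth noting the difference between a single name and a list of names. A rough sketch, using a reduced version of the sample data: a single label gives back a Series, while a list (even with one name in it) gives back a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['ram', 'sham', 'alpha', 'gamma'],
                   'Age': [20, 21, 19, 18],
                   'Qualification': ['Msc', 'MA', 'MCA', 'Phd']})

ages = df['Age']          # single label -> pandas Series
ages_frame = df[['Age']]  # list with one label -> one-column DataFrame

print(type(ages).__name__)
print(type(ages_frame).__name__)
print(ages.mean())        # Series methods work directly on the column
```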

Row Selection

pandas provides two indexers for retrieving rows from a DataFrame. DataFrame.loc[] selects rows by index label, and DataFrame.iloc[] selects rows by integer position.

In [8]:

df = pd.read_csv("name.csv", index_col ="Name")
first = df.loc["ram"]
second = df.loc["gamma"]
print(first, "\n\n\n", second)

Unnamed: 0           0
Age                 20
Address          Delhi
Qualification      Msc
Name: ram, dtype: object


 Unnamed: 0             3
Age                   18
Address          Kannauj
Qualification        Phd
Name: gamma, dtype: object
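
loc[] accepts more than a single label. The sketch below builds the index with set_index() instead of reading name.csv (an assumption made only to keep the example self-contained) and shows a list of labels, a row/column pair, and a label slice.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['ram', 'sham', 'alpha', 'gamma'],
                   'Age': [20, 21, 19, 18],
                   'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj']})
df = df.set_index('Name')  # use the names as row labels

# A list of labels returns a DataFrame with those rows, in the order given.
two_rows = df.loc[['gamma', 'ram']]

# A row label plus a column label returns a single value.
sham_age = df.loc['sham', 'Age']

# Label slices include BOTH endpoints, unlike ordinary Python slices.
label_slice = df.loc['sham':'alpha']

print(two_rows)
print(sham_age)
print(label_slice)
```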

Indexing a DataFrame using .iloc[ ]

The .iloc[] indexer retrieves rows and columns by position. To use it, we specify the positions of the rows we want and, optionally, the positions of the columns as well. df.iloc is very similar to df.loc, but it uses only integer locations to make its selections.

In [10]:

row2 = df.iloc[2]
row2

Out[10]:

Unnamed: 0               2
Age                     19
Address          Allahabad
Qualification          MCA
Name: alpha, dtype: object
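
Besides a single row, iloc[] can take slices, negative positions, and separate row and column positions. A minimal sketch, assuming the same sample data with the names set as the index:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['ram', 'sham', 'alpha', 'gamma'],
                   'Age': [20, 21, 19, 18],
                   'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
                   'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}).set_index('Name')

first_two = df.iloc[0:2]          # position slice: the end position is excluded
last_row = df.iloc[-1]            # negative positions count from the end
cell = df.iloc[1, 0]              # row position 1, column position 0
corner = df.iloc[[0, 3], [0, 1]]  # lists of row and column positions

print(first_two)
print(last_row)
print(cell)
print(corner)
```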

Working with Missing Data

Missing data occurs when no information is provided for one or more items, or for a whole unit. It is a common problem in real-world datasets. In pandas, missing values are also referred to as NA (Not Available) values and usually appear as NaN.

Checking for missing values using isnull() and notnull()

In [24]:

import numpy as np

data = {'First': [100, 90, np.nan, 95, 89, 0, 100, np.nan],
        'Second': [30, 45, 56, np.nan, 1, 40, np.nan, 70],
        'Third': [np.nan, 40, 80, 98, np.nan, np.nan, 13, 55]}

# Create a DataFrame from a dict of lists.
# (The name `data` avoids shadowing the built-in `dict`.)
df = pd.DataFrame(data)
df

Out[24]:

	First	Second	Third
0	100.0	30.0	NaN
1	90.0	45.0	40.0
2	NaN	56.0	80.0
3	95.0	NaN	98.0
4	89.0	1.0	NaN
5	0.0	40.0	NaN
6	100.0	NaN	13.0
7	NaN	70.0	55.0

In [25]:

df.isnull()

Out[25]:

	First	Second	Third
0	False	False	True
1	False	False	False
2	True	False	False
3	False	True	False
4	False	False	True
5	False	False	True
6	False	True	False
7	True	False	False
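
The boolean table from isnull() is usually summarised rather than read cell by cell, and notnull() is simply its inverse. Here is a short sketch on the same values; the names missing_per_column, rows_with_missing and complete_rows are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First': [100, 90, np.nan, 95, 89, 0, 100, np.nan],
                   'Second': [30, 45, 56, np.nan, 1, 40, np.nan, 70],
                   'Third': [np.nan, 40, 80, 98, np.nan, np.nan, 13, 55]})

# True counts as 1, so summing the boolean table counts NaN values per column.
missing_per_column = df.isnull().sum()

# Rows that contain at least one NaN.
rows_with_missing = df[df.isnull().any(axis=1)]

# notnull() is the inverse of isnull(): keep rows where every value is present.
complete_rows = df[df.notnull().all(axis=1)]

print(missing_per_column)
print(len(rows_with_missing))
print(complete_rows)
```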

Filling missing values using fillna(), replace() and interpolate()

To fill null values in a dataset, we can use the fillna(), replace() and interpolate() functions. fillna() and replace() substitute NaN with a value that we specify.

interpolate() also fills NA values in the DataFrame, but instead of a hard-coded value it estimates each missing value from its neighbours using an interpolation technique (linear by default).

In [26]:

df.fillna(0)

Out[26]:

	First	Second	Third
0	100.0	30.0	0.0
1	90.0	45.0	40.0
2	0.0	56.0	80.0
3	95.0	0.0	98.0
4	89.0	1.0	0.0
5	0.0	40.0	0.0
6	100.0	0.0	13.0
7	0.0	70.0	55.0
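
Only fillna(0) is shown above, so here is a rough sketch of the other options on a small made-up frame (the values are chosen only so the results are easy to check by hand): replace() with a sentinel value, fillna() with each column's mean, and interpolate() with its default linear method.

```python
import numpy as np
import pandas as pd

# Small illustrative frame, not the one from the cell above.
df = pd.DataFrame({'First': [100, 90, np.nan, 95],
                   'Second': [30, np.nan, np.nan, 60]})

# replace(): swap every NaN for a chosen value (here -1).
replaced = df.replace(np.nan, -1)

# fillna() with a Series fills each column with its own value (here the mean).
mean_filled = df.fillna(df.mean())

# interpolate(): linear by default, so gaps are filled on a straight line
# between the neighbouring known values in the same column.
interpolated = df.interpolate()

print(replaced)
print(mean_filled)
print(interpolated)
```

In this sketch the gap in 'First' lies halfway between 90 and 95, and the two gaps in 'Second' are spaced evenly between 30 and 60.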

Dropping missing values using dropna()

To drop null values from a DataFrame, we use the dropna() function. Depending on its arguments, it drops the rows or the columns that contain null values; by default it drops every row with at least one null value.

In [28]:

data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, np.nan, 45, 56],
        'Third Score': [52, 40, 80, 98],
        'Fourth Score': [np.nan, np.nan, np.nan, 65]}
df = pd.DataFrame(data)
print(df)
# Drop every row that contains at least one NaN.
df.dropna()

   First Score  Second Score  Third Score  Fourth Score
0        100.0          30.0           52           NaN
1         90.0           NaN           40           NaN
2          NaN          45.0           80           NaN
3         95.0          56.0           98          65.0

Out[28]:

	First Score	Second Score	Third Score	Fourth Score
3	95.0	56.0	98	65.0
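
The default call keeps only one row here, which is often too strict. A few other ways of calling dropna() are sketched below on the same scores; the result names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First Score': [100, 90, np.nan, 95],
                   'Second Score': [30, np.nan, 45, 56],
                   'Third Score': [52, 40, 80, 98],
                   'Fourth Score': [np.nan, np.nan, np.nan, 65]})

# axis=1 drops COLUMNS that contain any NaN instead of rows.
complete_columns = df.dropna(axis=1)

# how='all' drops a row only if every value in it is NaN.
not_all_missing = df.dropna(how='all')

# thresh=3 keeps rows that have at least 3 non-missing values.
mostly_complete = df.dropna(thresh=3)

# subset limits the check to the listed columns.
has_first_score = df.dropna(subset=['First Score'])

print(complete_columns)
print(not_all_missing)
print(mostly_complete)
print(has_first_score)
```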
