Manipulating DataFrames with Python | Part 2 (Pivoting, Stacking and Melting)

Photo by Luke Chesser on Unsplash

This post continues my previous one, so please read that post first before diving into these topics. Here is the link to it.

In Part 2 of Manipulating DataFrames with Python, we’ll cover the following techniques:

  1. Pivoting DataFrames
  2. Stacking and Unstacking DataFrames
  3. Melting DataFrames

These posts are meant as a guide to getting started with DataFrames. Each topic deserves an article of its own, so I’ll give a basic understanding here and, in case you want to dig deeper, attach links to in-depth tutorials.

Pivoting DataFrames

Pivoting rearranges a dataset so that the values of one column become the new index, the values of another column become the new column headers, or both.

Let’s understand this with an example.

Let’s say we have a trials table as follows:

initial table

The pivot method takes three parameters:

index: the column whose values become the index of the new DataFrame.

columns: the column whose unique values become the new column headers.

values: the column whose entries fill the cells of the new DataFrame.

pivoted table 1
trials.pivot(index='treatment', columns='gender', values='response')

This statement produces the DataFrame shown in the picture (pivoted table 1).
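Since the tables in this post are pictures, here is a minimal runnable sketch of the same call. The data in trials is made up for illustration (it is not copied from the picture); only the column names treatment, gender and response come from the call above.

```python
import pandas as pd

# Hypothetical stand-in for the trials table (values are assumed).
trials = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "treatment": ["A", "A", "B", "B"],
    "gender": ["F", "M", "F", "M"],
    "response": [5, 3, 8, 9],
})

# One row per treatment, one column per gender, cells filled from response.
pivoted = trials.pivot(index="treatment", columns="gender", values="response")
print(pivoted)
```

With this data, the cell at row A and column F holds 5, the response of the female participant under treatment A.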

If we don’t set the values column, pivot uses every remaining column as values. The resulting table has two levels of column labels: the original column name on top and the gender values beneath it.

pivoted table 2
trials.pivot(index='treatment', columns='gender')
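A small sketch of what happens without values, again on made-up data (the trials values below are assumptions). The point to notice is that the columns become a two-level MultiIndex.

```python
import pandas as pd

# Hypothetical stand-in for the trials table (values are assumed).
trials = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "treatment": ["A", "A", "B", "B"],
    "gender": ["F", "M", "F", "M"],
    "response": [5, 3, 8, 9],
})

# No values argument: every remaining column (id, response) is pivoted.
wide = trials.pivot(index="treatment", columns="gender")
print(wide.columns.tolist())

# Selecting a top-level label gives back the single-column pivot.
response_only = wide["response"]
print(response_only)
```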

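Each combination of index and columns values must be unique. If the same combination appears more than once, pivot raises a ValueError because it cannot decide which value goes into the cell.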
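Here is a rough sketch of the duplicate problem. The dupes table is hypothetical, and pivot_table is shown only as one possible way out when aggregating the duplicates is acceptable; it is not something the rest of this post relies on.

```python
import pandas as pd

# Hypothetical data: the pair (treatment "A", gender "F") appears twice.
dupes = pd.DataFrame({
    "treatment": ["A", "A", "B"],
    "gender": ["F", "F", "M"],
    "response": [5, 7, 9],
})

try:
    dupes.pivot(index="treatment", columns="gender", values="response")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# pivot_table aggregates duplicate entries instead of failing.
averaged = dupes.pivot_table(
    index="treatment", columns="gender", values="response", aggfunc="mean"
)
print(averaged)
```

Combinations that never occur in the data, such as treatment A with gender M here, show up as NaN.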

Stacking and Unstacking DataFrames

Stacking moves column labels into the row index, which makes the DataFrame taller and narrower. Unstacking is the opposite: it moves a level of the row index back to the columns, so unstacking what you just stacked gives you the DataFrame you started with.

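Both operations work on index levels, so the columns of interest first have to be in the index:

Stacked DataFrame
trials.set_index(['treatment', 'gender'])

This sets the given column labels as the index of the DataFrame. Calling stack() on the result then moves the remaining column labels into a new innermost index level. Note that the level argument of stack refers to levels of the column labels, not to column names, so trials.stack(['treatment', 'gender']) raises an error on a table with ordinary single-level columns.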

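stack has traditionally accepted dropna=True (its default) to drop rows that contain only missing values. Newer pandas releases are changing how stack handles missing values, so check the documentation for your version.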
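A minimal sketch of stacking, assuming a made-up trials table with ordinary columns. Because stack works on column labels, the sketch first moves treatment and gender into the index with set_index and then stacks the remaining columns.

```python
import pandas as pd

# Hypothetical stand-in for the trials table (values are assumed).
trials = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "treatment": ["A", "A", "B", "B"],
    "gender": ["F", "M", "F", "M"],
    "response": [5, 3, 8, 9],
})

# Move treatment and gender into a two-level row index.
indexed = trials.set_index(["treatment", "gender"])

# stack() moves the remaining column labels (id, response) into a new
# innermost index level; the result here is a Series.
stacked = indexed.stack()
print(stacked)
```

Each original row now contributes two entries, one for id and one for response. Depending on your pandas version, this call may emit a warning about the newer stack implementation.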

To unstack the DataFrame, we pass the name of an index level as the level parameter. It defines which index level moves to the columns.

Unstacked dataFrame
trials.unstack(level='gender')

You can also pass a number to level to choose which index level to unstack: 0 means the first (outermost) level, and -1, the default, means the last (innermost) level. Refer here for details.
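To make the level parameter concrete, here is a sketch on made-up data. It assumes trials has been indexed by treatment and gender beforehand, and it keeps only the response column to keep the output small.

```python
import pandas as pd

# Hypothetical stand-in for the trials table (values are assumed).
trials = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "treatment": ["A", "A", "B", "B"],
    "gender": ["F", "M", "F", "M"],
    "response": [5, 3, 8, 9],
})
indexed = trials.set_index(["treatment", "gender"])[["response"]]

# Unstack by level name: gender moves from the index to the columns.
by_name = indexed.unstack(level="gender")

# gender is the innermost index level, so -1 selects the same level.
by_position = indexed.unstack(level=-1)

# Level 0 is the outermost level, treatment.
outer = indexed.unstack(level=0)

# Stacking the unstacked frame moves gender back into the index.
restored = by_name.stack()
print(by_name)
print(outer)
print(restored)
```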

Melting DataFrames

Melting reshapes a DataFrame from wide format to long format. By default it leaves only two columns: variable, which holds the original column names, and value, which holds the corresponding values.

Let’s take the below DataFrame “new_trials” for reference:

melting dataframes initial table
Melted DataFrame 1
import pandas as pd
pd.melt(new_trials)

This stacks all the columns on top of each other: each column name goes into variable and its entries go into value. We can also specify id_vars and other parameters to modify the melting behaviour.
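A short sketch of the default behaviour. The new_trials table below is an assumption (a wide table with one column per gender), since the real one is only shown as a picture.

```python
import pandas as pd

# Hypothetical wide table: one row per treatment, one column per gender.
new_trials = pd.DataFrame({
    "treatment": ["A", "B"],
    "F": [5, 8],
    "M": [3, 9],
})

# With no arguments, every column is melted, including treatment.
melted = pd.melt(new_trials)
print(melted)
```

Two rows and three columns turn into six rows and two columns, which is why the default call is rarely what you want when one column is really an identifier.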

Melted DataFrame 2
pd.melt(new_trials, id_vars=['treatment'])

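Columns listed in id_vars are kept as identifier variables and are not melted; only the remaining columns are. Similarly, parameters such as value_vars, var_name and value_name control which columns are melted and how the output columns are named. This is explained very nicely in this article.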
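The sketch below shows id_vars together with a few of the other melt parameters. As before, new_trials is made-up data, and the output names gender and response are just labels picked for this example.

```python
import pandas as pd

# Hypothetical wide table: one row per treatment, one column per gender.
new_trials = pd.DataFrame({
    "treatment": ["A", "B"],
    "F": [5, 8],
    "M": [3, 9],
})

# treatment stays as an identifier; F and M are melted.
# var_name and value_name rename the default variable/value columns.
melted = pd.melt(
    new_trials,
    id_vars=["treatment"],
    var_name="gender",
    value_name="response",
)
print(melted)

# value_vars restricts which columns are melted.
only_f = pd.melt(new_trials, id_vars=["treatment"], value_vars=["F"])
print(only_f)
```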

The next part of this story will be published soon and the link will be added to this article. Happy Learning till then. Stay Connected 🙂

Let’s connect on LinkedIn. You may also reach out to me via ankita2108prasad@gmail.com.


Manipulating DataFrames with Python | Part 2 (Pivoting, Stacking and Melting) was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.


This content originally appeared on Level Up Coding - Medium and was authored by Ankita Prasad




Ankita Prasad | Sciencx (2021-03-25T00:30:49+00:00) Manipulating DataFrames with Python | Part 2 (Pivoting, Stacking and Melting). Retrieved from https://www.scien.cx/2021/03/25/manipulating-dataframes-with-python-part-2-pivoting-stacking-and-melting/
