ENH: Is there a way to plot a long format data in matplotlib without pivoting the table ? #59953

Closed
3 tasks
infinity-void6 opened this issue Oct 3, 2024 · 4 comments
Labels
Needs Info (Clarification about behavior needed to assess issue), Usage Question, Visualization (plotting)

Comments

@infinity-void6

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Problem

I am currently reading Hands-On Data Analysis with Pandas by Stefanie Molin. Data comes in two formats, wide and long. The author uses pandas and matplotlib to plot wide-format data but uses the seaborn package for long-format data. I searched the web and this seems to be the convention. I also asked GPT, and I can plot long-format data without seaborn too, but it seems I have to pivot the dataset first. Is there a way around that?

Wide Data Frame Sample

date TMAX TMIN TOBS
2018-10-28 8.3 5.0 7.2
2018-10-04 22.8 11.7 11.7
2018-10-20 15.0 -0.6 10.6
2018-10-24 16.7 4.4 6.7
2018-10-23 15.6 -1.1 10.0

Long Data Frame Sample

date datatype value
2018-10-01 TMAX 21.1
2018-10-01 TMIN 8.9
2018-10-01 TOBS 13.9
2018-10-02 TMAX 23.9
2018-10-02 TMIN 13.9
2018-10-02 TOBS 17.2

Long Data Frame after pivoting

[image: long_df after pivoting, not reproduced here]

plot command for wide df
ax = wide_df.plot(x='date', y=['TMAX', 'TMIN', 'TOBS'], figsize=(15, 5), title='Temperature in NYC in October 2018')
plot command for long df after pivot
pivoted_df = long_df.pivot(index='date', columns='datatype', values='value')
and then apply a similar plot command to pivoted_df as above

plot command for long_df with seaborn
ax = sns.lineplot(data=long_df, x='date', y='value', hue='datatype')

Why isn't there a hue parameter or something similar in pandas for long-format data? My question can also be framed this way:
"Why is pandas not enough for plotting? Why do I need external packages like matplotlib and seaborn to plot pandas data structures?"

Forgive my ignorance, but I really want to know why the features available in matplotlib and seaborn can't be available in pandas.

Feature Description

Let's start with a hue feature in pandas for long-format data.

Alternative Solutions

If we want to plot using only pandas, without seaborn, we have to pivot the table first.
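
For reference, one way to draw long-format data without pivoting and without seaborn is to group by the category column and draw each group on a shared matplotlib Axes. This is only a minimal sketch, not a pandas feature: the frame is rebuilt from the long-format sample rows above, and the `Agg` backend is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless, non-interactive rendering
import matplotlib.pyplot as plt
import pandas as pd

# Rows copied from the long-format sample above
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2018-10-01"] * 3 + ["2018-10-02"] * 3),
    "datatype": ["TMAX", "TMIN", "TOBS"] * 2,
    "value": [21.1, 8.9, 13.9, 23.9, 13.9, 17.2],
})

fig, ax = plt.subplots(figsize=(15, 5))

# One line per datatype, all on the same Axes; this plays the role of hue
for name, group in long_df.groupby("datatype"):
    ax.plot(group["date"], group["value"], label=name)

ax.set_title("Temperature in NYC in October 2018")
ax.set_xlabel("date")
ax.set_ylabel("value")
ax.legend(title="datatype")
```

The loop does by hand what `hue='datatype'` does in the seaborn call, so the data never has to be reshaped.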

Additional Context

No response

@infinity-void6 infinity-void6 added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 3, 2024
@rhshadrach
Member

rhshadrach commented Oct 3, 2024

" Why is pandas not enough for plotting? Why do I need external packages like matplotlib and seaborn to plot pandas data structure?"

pandas plotting uses matplotlib by default. You can change the backend to use different packages.

https://pandas.pydata.org/docs/dev/user_guide/visualization.html#plotting-backends
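
A small sketch of what the backend option looks like in practice. The `"plotly"` name below is only an illustrative assumption; any backend has to be installed separately and must support the pandas plotting backend interface, so that line is left commented out.

```python
import pandas as pd

# Read the current plotting backend; pandas defaults to matplotlib
backend = pd.get_option("plotting.backend")
print(backend)

# Switching backends is a single option change. The name "plotly" is an
# example only and requires that package to be installed:
# pd.set_option("plotting.backend", "plotly")
```

After switching, `DataFrame.plot` calls are dispatched to the chosen package instead of matplotlib.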

Does this answer your question?

@rhshadrach rhshadrach added Closing Candidate May be closeable, needs more eyeballs Visualization plotting Usage Question Needs Info Clarification about behavior needed to assess issue and removed Needs Triage Issue that has not been reviewed by a pandas team member Enhancement labels Oct 3, 2024
@mfebrizio

My two cents as a pandas user for ~5 years:

  • Pandas isn't primarily a plotting library, so you'll very quickly run into contexts where you need matplotlib, seaborn, etc.
  • It's best to leave more complex visualization requirements to libraries dedicated to plotting, rather than adding more tertiary capabilities to pandas
  • Pivoting or reshaping the dataset is pretty straightforward, so this doesn't seem like a compelling need. Also there will almost always be some preprocessing needed before data viz.
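
To illustrate the reshaping point, here is a minimal sketch of the pivot step, using the long-format sample rows from the original post (the variable names are illustrative):

```python
import pandas as pd

# Rows copied from the long-format sample in the original post
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2018-10-01"] * 3 + ["2018-10-02"] * 3),
    "datatype": ["TMAX", "TMIN", "TOBS"] * 2,
    "value": [21.1, 8.9, 13.9, 23.9, 13.9, 17.2],
})

# One row per date, one column per datatype
wide_df = long_df.pivot(index="date", columns="datatype", values="value")
print(wide_df)
```

Calling `wide_df.plot()` on the result then draws one line per column, with the dates from the index on the x-axis.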

@johnasiano

"I tried asking gpt as well, and I can plot the long format data without seaborn too but it seems that I have to pivot the dataset."

What issue did you have with pivoting? Was ChatGPT unable to help with the pivot and/or did you think that the pivot would not be a good solution for you?

@rhshadrach
Member

It appears to me this question has been answered. Closing.

@rhshadrach rhshadrach removed the Closing Candidate May be closeable, needs more eyeballs label Dec 28, 2024
4 participants