Python/pandas
A library for handling Data frame. The name comes from “panel data”.
There are distributed libraries that accelerate pandas’ performance, such as Dask, Modin (more of a wrapper that uses Dask and Ray), and Xorbits. These libraries provide a drop-in replacement interface for basic tasks.
Tips
Position Letter
1 a
2 b
3 c
In [9]: Series(df.Letter.values,index=df.Position).to_dict()
Out[9]: {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
Tutorials
- Pandas Pivot Table Explained - Pivot table
- Intro to pandas data structures
- Modern pandas
- http://tomaugspurger.github.io/method-chaining.html
- http://tomaugspurger.github.io/modern-3-indexes.html
- http://tomaugspurger.github.io/modern-4-performance.html
- http://tomaugspurger.github.io/modern-5-tidy.html
- http://tomaugspurger.github.io/modern-6-visualization.html
- http://tomaugspurger.github.io/modern-7-timeseries.html
- Stephen Simmons - Pandas from the Inside / “Big Pandas”
- A Beginner’s Guide to Optimizing Pandas Code for Speed