Deep Analysis with Polars
Transforming and Visualizing Data for Insights
This book is a first draft, and I am actively collecting feedback to shape the final version. If you spot typos, errors in the code, or unclear explanations, please let me know. Your suggestions will help make this book more accurate, readable, and useful for others. You can reach me at:
Email: contervalconsult@gmail.com
LinkedIn: www.linkedin.com/in/jorammutenge
Datasets: Download all datasets
Preface
I first used Polars in 2022, while I was a graduate student at the University of Illinois Urbana-Champaign. Up to that point, I had spent years using pandas for data analysis, ever since college, really. So when I came across a faster and more modern library, I decided to give it a try. I was hooked on the first day and haven’t looked back since.
Polars was introduced to the public in June 2020, though version 1.0 didn’t arrive until July 2024. You could say I was an early adopter.
After graduate school, I began working as a data analyst, focusing primarily on forecast data. When I joined my company, I introduced Polars to the team, and its simplicity quickly won everyone over. Today, nearly all our analysis workflows are built in Polars.
When I first started with Polars, there weren’t many resources to learn from. I read the Polars documentation cover to cover, testing each example to see how far I could take my analysis. When the documentation didn’t have the answers, I asked the community on the Polars Discord server. But mostly, I learned by experimenting. I would think in pandas (the tool I knew best) and then try to translate that logic into Polars. Over time, that thinking flipped. Now, I think and work in Polars first.
Data professionals getting into Polars today are in a much better position. A few books have been written as the library’s popularity has grown, and more tutorials and examples are available online. Still, many of these resources are introductory or surface-level. That’s great for getting started, but not as helpful when you’re deep into analytical work. Few, if any, focus on using Polars for advanced data analysis.
This book aims to fill that gap. My goal is to provide a practical reference for solving complex analysis problems with Polars and, hopefully, to inspire new ways of uncovering insights from data using techniques you might not have seen before.
Writing Conventions Used
The typographical conventions used throughout this book are as follows:
Italic
For new terms, dataframe column names, filenames, and file extensions.
For links to web resources. These will be underlined and in blue.
Constant width
For code text, Polars methods and expressions, and any keyword in the Python programming language.
For a tip or suggestion.
For a general note.
For a warning or caution.
For additional information to enhance your knowledge.
Polars Version
This book uses Polars 1.37.1.
To ensure that your results match the output shown in the book, make sure you are running the same Polars version. Behavior, performance, and output can change between versions, so a different version may produce results that differ from the examples shown here.
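One way to check which version you have is sketched below. It is only an illustration, not code from the book: the helper names `installed_version` and `matches_expected` are hypothetical, and the check relies on Python's standard `importlib.metadata` module so that it runs even when Polars is not installed yet.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# The version stated in this book.
EXPECTED_POLARS = "1.37.1"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_expected(installed: Optional[str], expected: str = EXPECTED_POLARS) -> bool:
    """Return True only when the installed version equals the expected one."""
    return installed == expected


found = installed_version("polars")
if matches_expected(found):
    print(f"Polars {found} matches the version used in the book.")
elif found is None:
    print("Polars is not installed in this environment.")
else:
    print(f"Found Polars {found}; the book uses {EXPECTED_POLARS}.")
```

If the versions differ, pinning the package in a fresh virtual environment (for example, `pip install polars==1.37.1`) is usually the simplest way to line up with the examples. With Polars already imported as `pl`, `pl.__version__` reports the same information.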
Data in this Book
One of my biggest pet peeves is when examples in a book use randomly generated numbers to demonstrate data analysis. That’s why the majority of datasets used in this book are real-world datasets, showing how Polars can be applied to actual problems. I’ve also included sources where you can download these datasets.
More importantly, I’ve carefully selected a variety of datasets to highlight how versatile Polars is for analyzing different types of data.
Acknowledgments
Give thanks to some useful people.