I’ve said it before, and I’ll say it again: it’s important to know the data you’re working with before you start analyzing it. Below is a dataframe showing cereal brands.
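shape: (77, 4)
┌──────────────────┬────────┬──────────┬─────────┐
│ mfr              ┆ type   ┆ calories ┆ protein │
│ str              ┆ str    ┆ i64      ┆ i64     │
╞══════════════════╪════════╪══════════╪═════════╡
│ "Nabisco"        ┆ "Cold" ┆ 70       ┆ 4       │
│ "Quaker Oats"    ┆ "Cold" ┆ 120      ┆ 3       │
│ "Kellogs"        ┆ "Cold" ┆ 70       ┆ 4       │
│ "Kellogs"        ┆ "Cold" ┆ 50       ┆ 4       │
│ "Ralston Purina" ┆ "Cold" ┆ 110      ┆ 2       │
│ …                ┆ …      ┆ …        ┆ …       │
│ "General Mills"  ┆ "Cold" ┆ 110      ┆ 2       │
│ "General Mills"  ┆ "Cold" ┆ 110      ┆ 1       │
│ "Ralston Purina" ┆ "Cold" ┆ 100      ┆ 3       │
│ "General Mills"  ┆ "Cold" ┆ 100      ┆ 3       │
│ "General Mills"  ┆ "Cold" ┆ 110      ┆ 2       │
└──────────────────┴────────┴──────────┴─────────┘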
Look into the dataset
Polars dataframes have a handy method, glimpse, that lets you explore the dataset and see what values each column contains. More importantly, it prints one column per line, which makes wide dataframes with many columns easier to read. Here’s how to use it: