chartify
The Chartify package has been created by Spotify, so it seems rude not to make use of it in our project :)
There are several popular Python data visualization packages, of which one is Bokeh.
Chartify is built on top of Bokeh, simplifying the creation of certain types of charts while retaining the ability to modify the underlying Bokeh Figure object.
Take a look at the GitHub repo for more information and examples.
Let's take a look at an example which makes use of our Spotify data. Fork the repl, remembering as always to create your own .env file should you wish to re-run the code.
Things should look familiar, until we get to:
tidy = tidy_data(recs)
We'll take a look at tidy.py to see what this function is doing.
One of the most important requirements for data visualization is to ensure our data is suitably structured.
Tidy data is described here as follows:
Each variable is a column, each observation is a row, and each type of observational unit is a table.
Beyond being tidy, we may also need to apply certain transformations to our datasets, such as stacking or pivoting, so it can be used with our visualization tools of choice.
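To make the tidy-data idea concrete, here is a minimal sketch that reshapes a small "wide" table into one row per observation. The values and the names wide and tidy_example are made up for illustration, and melt() is used only as one possible way to do this; it is not necessarily what tidy.py does.

```python
import pandas as pd

# Hypothetical "wide" data: one row per track, one column per feature.
wide = pd.DataFrame(
    {
        "Track": [1, 2],
        "energy": [0.653, 0.773],
        "valence": [0.571, 0.677],
    }
)

# melt() turns the feature columns into (Feature, Value) pairs,
# so that each row holds exactly one observation.
tidy_example = wide.melt(id_vars="Track", var_name="Feature", value_name="Value")
print(tidy_example)
```

Each track now appears once per feature, which is the shape many plotting tools expect.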
set_index()
df = pd.DataFrame(tracks_data)
track_numbers = list(range(1, len(df) + 1))
df['Track'] = track_numbers
df = df.set_index('Track')
df.head(2)
| Track | album | artists | available_markets | disc_number | duration_ms | explicit | external_ids | external_urls | href | id | ... | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | track_href | analysis_url | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | {'album_type': 'ALBUM', 'artists': [{'external... | [{'external_urls': {'spotify': 'https://open.s... | [AD, AE, AL, AR, AT, AU, BA, BE, BG, BH, BO, B... | 1 | 272946 | False | {'isrc': 'GBBLK0300012'} | {'spotify': 'https://open.spotify.com/track/6S... | https://api.spotify.com/v1/tracks/6SXy02aTZU3y... | 6SXy02aTZU3ysoGUixYCz0 | ... | 0 | 0.0361 | 0.2170 | 0.000458 | 0.334 | 0.571 | 80.897 | https://api.spotify.com/v1/tracks/6SXy02aTZU3y... | https://api.spotify.com/v1/audio-analysis/6SXy... | 4 |
| 2 | {'album_type': 'ALBUM', 'artists': [{'external... | [{'external_urls': {'spotify': 'https://open.s... | [AD, AE, AL, AR, AT, AU, BA, BE, BG, BH, BO, B... | 1 | 199066 | False | {'isrc': 'GBBRL9691749'} | {'spotify': 'https://open.spotify.com/track/5b... | https://api.spotify.com/v1/tracks/5beiMMlsINDI... | 5beiMMlsINDI5fxRdF0D42 | ... | 0 | 0.0270 | 0.0123 | 0.000000 | 0.155 | 0.677 | 108.030 | https://api.spotify.com/v1/tracks/5beiMMlsINDI... | https://api.spotify.com/v1/audio-analysis/5bei... | 4 |
2 rows × 34 columns
Create a DataFrame from tracks_data (which is a list of dictionaries).
Add a Track column which, if there are six tracks, simply holds the numbers 1 to 6.
Set the Track column as the index.
.copy()
cols = ['popularity', 'danceability', 'energy', 'loudness', 'valence', 'tempo']
stats = df[cols]
stats.head(2)
| Track | popularity | danceability | energy | loudness | valence | tempo |
|---|---|---|---|---|---|---|
| 1 | 65 | 0.495 | 0.653 | -6.769 | 0.571 | 80.897 |
| 2 | 47 | 0.654 | 0.773 | -6.484 | 0.677 | 108.030 |
df3 = stats / stats.mean()
df3.head(2)
| Track | popularity | danceability | energy | loudness | valence | tempo |
|---|---|---|---|---|---|---|
| 1 | 1.164179 | 0.947368 | 0.794082 | 1.103731 | 0.783444 | 0.635385 |
| 2 | 0.841791 | 1.251675 | 0.940008 | 1.057260 | 0.928882 | 0.848494 |
Divide each value by the .mean() of its column.
This puts every metric on a comparable scale (a value of 1 means "equal to the average"), so we can compare tracks on any metric using the same chart configuration.
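The division by the column means can be seen on a tiny example. This is a sketch with made-up numbers; stats_example and relative are illustrative names, not part of the project code.

```python
import pandas as pd

# Made-up values for two features on very different scales.
stats_example = pd.DataFrame(
    {"energy": [0.5, 1.0, 1.5], "tempo": [80.0, 100.0, 120.0]},
    index=pd.Index([1, 2, 3], name="Track"),
)

# stats_example.mean() is a Series holding one mean per column. The division
# aligns on column names, so every value is divided by its own column's mean.
relative = stats_example / stats_example.mean()
print(relative)
```

After this step every column averages 1, whatever its original units. One thing to keep in mind: for a column whose mean is negative, such as loudness in the table above, a ratio above 1 means the value is further below zero than average, so a higher bar there indicates a quieter track.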
.stack()
df4 = df3.stack().to_frame().reset_index()
df4.columns = ['Track', 'Feature', 'Value']
df4.head(3)
| | Track | Feature | Value |
|---|---|---|---|
| 0 | 1 | popularity | 1.164179 |
| 1 | 1 | danceability | 0.947368 |
| 2 | 1 | energy | 0.794082 |
.stack() and its sister .unstack() can be very useful for transforming DataFrames into the appropriate format for a given chart type.
For bar plots, Chartify will want us to specify categorical_columns and numeric_column, so we need one row per data point, with the category (in our case the Feature) stored as a value in that row rather than as a column name.
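As a rough illustration of how .stack() and .unstack() relate, the sketch below round-trips a two-track table. The numbers and the names wide, stacked, long and restored are invented for the example.

```python
import pandas as pd

wide = pd.DataFrame(
    {"energy": [0.8, 0.9], "tempo": [0.6, 0.8]},
    index=pd.Index([1, 2], name="Track"),
)

# stack() moves the column labels into an inner index level,
# producing a Series indexed by (Track, feature name).
stacked = wide.stack()

# Same steps as in the text: Series -> DataFrame -> ordinary columns.
long = stacked.to_frame().reset_index()
long.columns = ["Track", "Feature", "Value"]

# unstack() reverses stack(), giving one column per feature again.
restored = stacked.unstack()

print(long)
print(restored)
```

The long frame has one row per (Track, Feature) pair, which is the layout the bar plot needs, while restored is back in the wide layout.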
feature = 'tempo'
ch = chartify.Chart(x_axis_type='categorical')
Set the feature variable to 'tempo' (we'll be able to modify this to any other value found in the Feature column of the tidy DataFrame).
Create a chartify.Chart() object with the given argument for x_axis_type.
There are various examples of Chartify code on GitHub.
ch.plot.bar(
data_frame=tidy[tidy['Feature'] == feature],
...
Call the .plot.bar() method of our chartify.Chart() object.
data_frame is a subset of the tidy DataFrame, containing only rows with the given feature.
categorical_columns='Track',
numeric_column='Value',
categorical_order_by='labels',
categorical_order_ascending=True)
categorical_columns (plotted on the x-axis, which we declared categorical earlier via x_axis_type) will be just Track.
numeric_column will therefore be represented on the y-axis.
The categorical_order... parameters determine the order of the x-axis.
Refer to the examples and documentation for more detail on these methods and parameters.
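The data_frame argument relies on boolean-mask filtering in pandas. The sketch below shows what that subset looks like on a small stand-in for the tidy DataFrame; tidy_example and its values are made up for illustration.

```python
import pandas as pd

# A small stand-in for the tidy DataFrame (values are invented).
tidy_example = pd.DataFrame(
    {
        "Track": [1, 1, 2, 2],
        "Feature": ["energy", "tempo", "energy", "tempo"],
        "Value": [0.8, 0.6, 0.9, 0.8],
    }
)

feature = "tempo"

# The comparison yields a boolean Series; indexing with it keeps
# only the rows where the condition is True.
subset = tidy_example[tidy_example["Feature"] == feature]
print(subset)
```

The result holds one row per track for the chosen feature, so each track becomes one bar.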

Our data has been plotted on a bar chart, with some helpful placeholders for various label and title attributes which we can change.
ch.set_title(f"Track comparison by {feature}")
ch.set_subtitle(None)
ch.set_source_label('Data source: Spotify')
ch.axes.set_xaxis_label('Track')
ch.axes.set_yaxis_label('Value')
chartify.Chart(blank_labels=False, layout='slide_100%', x_axis_type='categorical', y_axis_type='linear')
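Each placeholder label can be replaced with our own text (or, as with the subtitle here, cleared by passing None).
filename = f'charts/{feature}.png'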
ch.save(filename, format='png')
The .save() method allows us to save the chart in various formats.
svg, png and html can be used for the format parameter.
In the Seeder app, the svg format has been used, which scales better than png (without loss of image quality). The html format may be useful if you want to incorporate interactivity into your charts.
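If you want to switch between formats, a small helper can build the filename for you. This is only a sketch: chart_path and save_chart are hypothetical helpers, not part of Chartify, and creating the charts directory up front is a precaution that assumes it may not exist yet. The only Chartify call used is ch.save(filename, format=...), as shown above.

```python
from pathlib import Path


def chart_path(feature, fmt="png", directory="charts"):
    """Return '<directory>/<feature>.<fmt>', creating the directory if needed."""
    # The three formats named in the text above.
    if fmt not in {"png", "svg", "html"}:
        raise ValueError(f"unsupported format: {fmt}")
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    return str(folder / f"{feature}.{fmt}")


def save_chart(ch, feature, fmt="png"):
    """Save a chartify.Chart under a name derived from the feature."""
    filename = chart_path(feature, fmt)
    ch.save(filename, format=fmt)
    return filename
```

With these in place, save_chart(ch, feature, fmt='svg') would write charts/tempo.svg for the chart built earlier.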