Data Visualization with `chartify`

Chartify package

The Chartify package has been created by Spotify, so it seems rude not to make use of it in our project :)

There are several popular Python data visualization packages, one of which is Bokeh.

Chartify is built on top of Bokeh, simplifying the creation of certain types of charts while retaining the ability to modify the underlying Bokeh Figure object.

Take a look at the GitHub repo for more information and examples.

Example repl

Let's take a look at an example which makes use of our Spotify data. Fork the repl, remembering as always to create your own .env file should you wish to re-run the code.

Things should look familiar, until we get to:

tidy = tidy_data(recs)

We'll take a look at tidy.py to see what this function is doing.

Tidying data

One of the most important requirements for data visualization is ensuring that our data is suitably structured.

Tidy data is described here as follows:

Each variable is a column, each observation is a row, and each type of observational unit is a table.

Beyond being tidy, we may also need to apply certain transformations to our datasets, such as stacking or pivoting, so it can be used with our visualization tools of choice.

`set_index()`

df = pd.DataFrame(tracks_data)
track_numbers = list(range(1, len(df) + 1))
df['Track'] = track_numbers
df = df.set_index('Track')
df.head(2)

	album	artists	available_markets	disc_number	duration_ms	explicit	external_ids	external_urls	href	id	...	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo	track_href	analysis_url	time_signature
Track
1	{'album_type': 'ALBUM', 'artists': [{'external...	[{'external_urls': {'spotify': 'https://open.s...	[AD, AE, AL, AR, AT, AU, BA, BE, BG, BH, BO, B...	1	272946	False	{'isrc': 'GBBLK0300012'}	{'spotify': 'https://open.spotify.com/track/6S...	https://api.spotify.com/v1/tracks/6SXy02aTZU3y...	6SXy02aTZU3ysoGUixYCz0	...	0	0.0361	0.2170	0.000458	0.334	0.571	80.897	https://api.spotify.com/v1/tracks/6SXy02aTZU3y...	https://api.spotify.com/v1/audio-analysis/6SXy...	4
2	{'album_type': 'ALBUM', 'artists': [{'external...	[{'external_urls': {'spotify': 'https://open.s...	[AD, AE, AL, AR, AT, AU, BA, BE, BG, BH, BO, B...	1	199066	False	{'isrc': 'GBBRL9691749'}	{'spotify': 'https://open.spotify.com/track/5b...	https://api.spotify.com/v1/tracks/5beiMMlsINDI...	5beiMMlsINDI5fxRdF0D42	...	0	0.0270	0.0123	0.000000	0.155	0.677	108.030	https://api.spotify.com/v1/tracks/5beiMMlsINDI...	https://api.spotify.com/v1/audio-analysis/5bei...	4

2 rows × 34 columns

we create a DataFrame from tracks_data (which is a list of dictionaries)
we add a Track column which, if there are six tracks, is simply numbers from 1-6
we set these values as the DataFrame index
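
The steps above can be tried on a small stand-in dataset. This is only a sketch: the three dictionaries below are hypothetical and far smaller than the real Spotify response, but the indexing logic is the same.

```python
import pandas as pd

# Hypothetical stand-in for tracks_data: a list of dictionaries, one per track.
tracks_data = [
    {"name": "Track A", "tempo": 80.897},
    {"name": "Track B", "tempo": 108.030},
    {"name": "Track C", "tempo": 120.000},
]

df = pd.DataFrame(tracks_data)

# Number the tracks from 1 and use those numbers as the index.
df["Track"] = range(1, len(df) + 1)
df = df.set_index("Track")

print(df.index.tolist())  # [1, 2, 3]
print(df.loc[1, "name"])  # Track A
```

Because set_index() moves the Track column into the index, it no longer appears among the regular columns, and rows can be looked up by track number with .loc.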

`.copy()`

cols = ['popularity', 'danceability', 'energy', 'loudness', 'valence', 'tempo']    

stats = df[cols].copy()
stats.head(2)

	popularity	danceability	energy	loudness	valence	tempo
Track
1	65	0.495	0.653	-6.769	0.571	80.897
2	47	0.654	0.773	-6.484	0.677	108.030

we've created a subset of the original data, containing only the columns needed for our chart
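
A short sketch of why .copy() is worth calling on a subset like this: the copy is an independent DataFrame, so later changes to it cannot affect the original (and, in some pandas versions, you avoid a SettingWithCopyWarning). The values and the extra href column below are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"popularity": [65, 47], "energy": [0.653, 0.773], "href": ["a", "b"]},
    index=pd.Index([1, 2], name="Track"),
)

cols = ["popularity", "energy"]
stats = df[cols].copy()  # independent copy of the selected columns

# Changing the copy leaves the original DataFrame untouched.
stats.loc[1, "popularity"] = 0

print(df.loc[1, "popularity"])     # 65
print(stats.loc[1, "popularity"])  # 0
```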

Relative values

df3 = stats / stats.mean()
df3.head(2)

	popularity	danceability	energy	loudness	valence	tempo
Track
1	1.164179	0.947368	0.794082	1.103731	0.783444	0.635385
2	0.841791	1.251675	0.940008	1.057260	0.928882	0.848494

here we have divided the values in each column by the .mean() of each column
our values are now all relative rather than absolute

This makes it easier to compare tracks on any metric using the same chart configuration, since every column is now on a similar scale.
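
To see what the division is doing, here is a minimal sketch with made-up numbers that are easy to check by hand. stats.mean() returns one mean per column, and pandas lines those means up with the matching column names before dividing.

```python
import pandas as pd

# Illustrative values only, chosen so the column means are 50 and 100.
stats = pd.DataFrame(
    {"popularity": [60, 40], "tempo": [90.0, 110.0]},
    index=pd.Index([1, 2], name="Track"),
)

# Each value is divided by the mean of its own column.
relative = stats / stats.mean()

print(relative)
# popularity becomes 1.2 and 0.8; tempo becomes 0.9 and 1.1
```

A value above 1 means the track is above average for that metric. One thing to watch: for a column of negative numbers such as loudness, the ordering flips. In the output above, track 2 (-6.484) is louder than track 1 (-6.769), yet it has the smaller relative value.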

`.stack()`

df4 = df3.stack().to_frame().reset_index()
df4.columns=['Track', 'Feature', 'Value']
df4.head(3)

	Track	Feature	Value
0	1	popularity	1.164179
1	1	danceability	0.947368
2	1	energy	0.794082

.stack() and its sister .unstack() can be very useful for transforming DataFrames into the appropriate format for a given chart type.
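
As a rough illustration of how the two methods relate, the sketch below stacks a small wide DataFrame (illustrative values) and then unstacks it again to get back to where it started.

```python
import pandas as pd

wide = pd.DataFrame(
    {"popularity": [1.2, 0.8], "tempo": [0.9, 1.1]},
    index=pd.Index([1, 2], name="Track"),
)

# stack() moves the column labels into an inner index level,
# giving a Series with one row per (Track, column) pair.
long = wide.stack()
print(long)

# unstack() reverses this, moving the inner level back into columns.
restored = long.unstack()
print(restored)
```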

For bar plots, Chartify expects us to specify categorical_columns and numeric_column, so we need one row per data point, with the category (in our case the Feature) stored as a value in each row.
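
Putting the steps of this section together, tidy_data() might look something like the sketch below. This is a hypothetical reconstruction rather than the actual contents of tidy.py: it assumes the function receives a list of track dictionaries, and the sample data reuses the two rows of values shown earlier.

```python
import pandas as pd

COLS = ["popularity", "danceability", "energy", "loudness", "valence", "tempo"]


def tidy_data(tracks_data, cols=COLS):
    """Hypothetical sketch of tidy_data(), following the steps in this section."""
    # Index the tracks from 1.
    df = pd.DataFrame(tracks_data)
    df["Track"] = range(1, len(df) + 1)
    df = df.set_index("Track")

    # Keep only the chart columns, then make each value relative to its column mean.
    stats = df[list(cols)].copy()
    relative = stats / stats.mean()

    # One row per (Track, Feature) pair.
    tidy = relative.stack().to_frame().reset_index()
    tidy.columns = ["Track", "Feature", "Value"]
    return tidy


tracks_data = [
    {"name": "A", "popularity": 65, "danceability": 0.495, "energy": 0.653,
     "loudness": -6.769, "valence": 0.571, "tempo": 80.897},
    {"name": "B", "popularity": 47, "danceability": 0.654, "energy": 0.773,
     "loudness": -6.484, "valence": 0.677, "tempo": 108.030},
]

tidy = tidy_data(tracks_data)
print(tidy.head(3))
```

With only two sample tracks the means differ from those in the full dataset, so the relative values will not match the output shown earlier.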

Creating a chart

feature = 'tempo'
ch = chartify.Chart(x_axis_type='categorical')

we've set the feature variable to 'tempo' (we'll be able to modify this to any other value found in the Feature column of the tidy DataFrame)
we've instantiated a chartify.Chart() object with the given argument for x_axis_type

There are various examples of Chartify code on GitHub.

ch.plot.bar(
    data_frame=tidy[tidy['Feature'] == feature],
    ...

we will use the .plot.bar() method of our chartify.Chart() object
our data_frame is a subset of the tidy DataFrame, containing only rows with the given feature

categorical_columns='Track',
numeric_column='Value', 
categorical_order_by='labels',
categorical_order_ascending=True)

our categorical_columns will be just Track; because we set x_axis_type to 'categorical', these categories are plotted along the x_axis
the numeric_column will therefore be represented on the y_axis
the categorical_order... parameters will determine the order of the x_axis

Refer to the examples and documentation for more detail on these methods and parameters.
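
For reference, the fragments above can be assembled into a single call, roughly as sketched below. The helper names select_feature and bar_chart are made up for this illustration, and the small tidy DataFrame is sample data. chartify is imported inside bar_chart so that the filtering step can be tried even where Chartify is not installed.

```python
import pandas as pd


def select_feature(tidy, feature):
    """Return only the rows of the tidy DataFrame for one feature."""
    return tidy[tidy["Feature"] == feature]


def bar_chart(tidy, feature):
    """Build a bar chart of one feature across tracks (requires chartify)."""
    import chartify  # imported here so select_feature works without it

    ch = chartify.Chart(x_axis_type="categorical")
    ch.plot.bar(
        data_frame=select_feature(tidy, feature),
        categorical_columns="Track",
        numeric_column="Value",
        categorical_order_by="labels",
        categorical_order_ascending=True,
    )
    return ch


# Illustrative tidy data: two tracks, two features.
tidy = pd.DataFrame(
    {
        "Track": [1, 1, 2, 2],
        "Feature": ["tempo", "energy", "tempo", "energy"],
        "Value": [0.9, 0.8, 1.1, 1.2],
    }
)

print(select_feature(tidy, "tempo"))
```

The expression tidy["Feature"] == feature produces a column of True/False values, and indexing the DataFrame with it keeps only the rows marked True.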

bar-chart

Our data has been plotted on a bar chart, with some helpful placeholders for various label and title attributes which we can change.

Modifying chart attributes

ch.set_title(f"Track comparison by {feature}")
ch.set_subtitle(None)
ch.set_source_label('Data source: Spotify')
ch.axes.set_xaxis_label('Track')
ch.axes.set_yaxis_label('Value')

chartify.Chart(blank_labels=False,
               layout='slide_100%',
               x_axis_type='categorical',
               y_axis_type='linear')

we can set the various chart attributes as required (or remove them using None)
plots can be further customized; refer to the documentation

Saving charts

filename = f'charts/{feature}.png'
ch.save(filename, format='png')

the .save() method allows us to save the chart in various formats
svg, png and html can be used for the format parameter

In the Seeder app, the svg format has been used, which scales better than png (without loss of image quality). The html format may be useful if you want to incorporate interactivity into your charts.
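
If you save one chart per feature, it can help to build the filename in one place and make sure the output folder exists first. The chart_path helper below is hypothetical (it is not part of Chartify), and the sketch assumes the target folder might not exist yet; it writes into a temporary directory so that it can be run safely anywhere.

```python
import tempfile
from pathlib import Path


def chart_path(feature, fmt="png", directory="charts"):
    """Return the output path for a feature's chart, creating the folder if needed."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    return str(folder / f"{feature}.{fmt}")


with tempfile.TemporaryDirectory() as tmp:
    filename = chart_path("tempo", fmt="svg", directory=Path(tmp) / "charts")
    print(filename)
    # With a chart object built as in the earlier steps:
    # ch.save(filename, format="svg")
```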
