Bokeh: Interactive visualizations for web pages

Bokeh is an interactive visualization library that targets modern web browsers for presentation. It is good for:

  • Interactive visualization in modern browsers
  • Standalone HTML documents, or server-backed apps
  • Large, dynamic or streaming data

It is also useful for other tasks, such as plotting spatial data on maps. While Bokeh is best suited to Jupyter notebooks and to visualizations rendered with HTML and JavaScript, it can also generate output files in formats such as PNG and SVG. Bokeh can also produce polished visualizations with very few commands.

Bokeh has several submodules and generally requires quite a few imports. bokeh.io establishes where the output plot will be displayed. bokeh.plotting provides functions to create figures and glyphs for a plot. bokeh.models gives the user a way to turn Python dictionaries or Pandas DataFrames into data sources that Bokeh can display quickly. The imports relevant to our discussion are shown below. Of particular importance is the bokeh.io.output_notebook function, which lets us display Bokeh plots in the output cells of Jupyter notebooks.

import bokeh.io
import bokeh.plotting
import bokeh.models

import numpy as np
import pandas as pd
import os

bokeh.io.output_notebook()
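
Outside a notebook, a figure can instead be written to a standalone HTML file. The following is a minimal sketch with made-up data; the file name standalone_plot.html and the temporary directory are illustrative assumptions, not part of the original example.

```python
import os
import tempfile

import bokeh.io
import bokeh.plotting
import bokeh.resources

# A tiny figure with made-up data, just to have something to save.
p = bokeh.plotting.figure(title="Standalone example")
p.line(x=[1, 2, 3], y=[4, 6, 5])

# Hypothetical output location; any writable path works.
out_path = os.path.join(tempfile.mkdtemp(), "standalone_plot.html")

# save() writes the HTML file without opening a browser. BokehJS is
# referenced from the CDN rather than inlined into the file.
saved_path = bokeh.io.save(p, filename=out_path,
                           resources=bokeh.resources.CDN,
                           title="Standalone example")
```

The resulting file can be opened directly in a browser or served by any static web server.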

Creating a simple Bokeh plot

There are three things required for a Bokeh plot:

  • figure() -- Controls the canvas. Things like: figure size, title, interactive tools, toolbar location.
  • data source -- possibly from a Pandas DataFrame
  • glyphs or line types -- the data points and/or line styles

An example is shown below. First we load our CSV file into a Pandas DataFrame.

df = cat.MV_2D_200wells.read(); df.head(n=3)
      X     Y  facies_threshold_0.3  porosity  permeability  acoustic_impedance
0   565  1485                     1    0.1184         6.170               2.009
1  2585  1185                     1    0.1566         6.275               2.864
2  2065  2865                     2    0.1920        92.297               3.524

The Bokeh plotting commands are shown below. The ColumnDataSource class converts our Pandas DataFrame into a Bokeh data source for plotting. The circle method of the figure creates the glyph to display. There are others, such as line or arc. The full list is in the Bokeh documentation.

p = bokeh.plotting.figure()

data_source = bokeh.models.ColumnDataSource(df)

p.circle(x='porosity', y='permeability', source=data_source)

bokeh.io.show(p)
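
Because the cat catalog is not defined on this page, here is a self-contained sketch of the same three steps. It rebuilds a small DataFrame from the three rows printed above, and it uses p.scatter (whose default marker is a circle) in place of p.circle, on the assumption that a recent Bokeh release is installed.

```python
import pandas as pd
import bokeh.models
import bokeh.plotting

# The three rows shown in the table above, re-entered by hand.
df = pd.DataFrame({
    "porosity": [0.1184, 0.1566, 0.1920],
    "permeability": [6.170, 6.275, 92.297],
})

# 1. The canvas.
p = bokeh.plotting.figure()

# 2. The data source.
data_source = bokeh.models.ColumnDataSource(df)

# 3. The glyph: column names refer to columns of the data source.
p.scatter(x="porosity", y="permeability", source=data_source)

# bokeh.io.show(p)  # uncomment to display the plot
```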

Styling the plot

df = cat.production_by_api(api='33013014020000').read()

The previous example used only minimal styling to produce the Bokeh plot. This example shows more options, such as those used for plotting time series data, adding axis labels, and controlling the tools available in the toolbar. More visual styling options can be seen in the Bokeh documentation.

p = bokeh.plotting.figure(width=400, height=300, 
                          x_axis_type='datetime', x_axis_label='Date', 
                          y_axis_label='Oil (bbls)', tools='pan,box_zoom')

data_source = bokeh.models.ColumnDataSource(df)

p.line(x='date', y='volume_oil_formation_bbls', source=data_source)

bokeh.io.show(p)
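
The same styling options can be tried without access to the production catalog. The sketch below uses synthetic monthly volumes (an invented exponential decline, not real production data) with the same column names as the example above.

```python
import numpy as np
import pandas as pd
import bokeh.models
import bokeh.plotting

# Synthetic monthly production; these numbers are invented for illustration.
dates = pd.date_range("2019-01-01", periods=12, freq="MS")
df = pd.DataFrame({
    "date": dates,
    "volume_oil_formation_bbls": 1000.0 * np.exp(-0.1 * np.arange(12)),
})

# x_axis_type="datetime" formats the tick labels as dates;
# tools restricts the toolbar to pan and box zoom.
p = bokeh.plotting.figure(width=400, height=300,
                          x_axis_type="datetime", x_axis_label="Date",
                          y_axis_label="Oil (bbls)", tools="pan,box_zoom")

data_source = bokeh.models.ColumnDataSource(df)
p.line(x="date", y="volume_oil_formation_bbls", source=data_source)

# bokeh.io.show(p)  # uncomment to display the plot
```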

Plotting Geo data

Bokeh offers a couple of options for visualizing geographic and/or spatial data on maps. Its interactivity makes it better suited than Matplotlib to these kinds of plots, especially when the intended output is a Jupyter notebook or website.

In the example below we will plot all of Pioneer Natural Resources' (PXD) oil and gas wells in the Permian Basin on a Google map. First we read the latitude and longitude information from a CSV file into a Pandas DataFrame.

df = cat.well_columns(columns='latitude_surface_hole,longitude_surface_hole', 
                      where="parent_ticker='PXD' AND basin_name='PERMIAN'").read().dropna(); 
df.head(n=3)

Plotting data on Google Maps in Bokeh deviates somewhat from the standard workflow: instead of the standard figure function, a dedicated bokeh.models.GMapOptions class sets the map options and bokeh.plotting.gmap creates the figure. After figure creation, setting a data source and adding glyphs proceeds as usual.

The gmap function requires a Google API key as its first argument. In this example, the API key is taken from a system environment variable called 'GOOGLE_API_KEY'. Instructions for getting an API key are available in Google's Maps Platform documentation.

map_options = bokeh.models.GMapOptions(lat=np.mean(df['latitude_surface_hole'].values), 
                                       lng=np.mean(df['longitude_surface_hole'].values), 
                                       map_type="terrain", zoom=5)

p = bokeh.plotting.gmap(os.environ['GOOGLE_API_KEY'], map_options, title="Well Locations", 
                      tools='box_select,tap,pan,wheel_zoom,reset', width=600, height=400)

source = bokeh.models.ColumnDataSource(df)

p.circle(x='longitude_surface_hole', y='latitude_surface_hole', size=15, 
         fill_color="blue", fill_alpha=0.8, source=source)

bokeh.io.show(p)
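
The os.environ['GOOGLE_API_KEY'] lookup above raises a bare KeyError when the variable is not set. As a small sketch, a hypothetical helper (get_google_api_key is not part of Bokeh) can make that failure easier to diagnose:

```python
import os


def get_google_api_key(environ=os.environ, name="GOOGLE_API_KEY"):
    """Return the API key stored under `name`, or raise a descriptive error."""
    key = environ.get(name)
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before creating a gmap figure."
        )
    return key
```

The result would then be passed as the first argument to bokeh.plotting.gmap in place of the direct os.environ lookup.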

Other tile providers

In addition to Google, other map tile providers such as Carto, OpenStreetMap, WikiMedia, and ESRI can supply the map backgrounds. The following fairly complex example shows off a different tile provider along with some interactivity in a Bokeh plot. The code for this is beyond the scope of this introduction, but hopefully it gives you a few ideas of the types of things that you can do in Bokeh.

Hovering your mouse over wells in the contour plot displays tooltip information and updates the time series plot with production data. To "freeze" the production plot on one or more wells, click on them. To return to full interactivity, click anywhere on the canvas away from a well.
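
The full interactive example is not reproduced here. As a much simpler sketch of using a non-Google tile source, the code below places a few points on OpenStreetMap tiles. Tile maps use Web Mercator coordinates, so longitude and latitude are converted first. The coordinates are invented (roughly near Midland, Texas), the helper lonlat_to_web_mercator is our own, and the tile URL is an assumption about the public OpenStreetMap endpoint.

```python
import numpy as np
import bokeh.models
import bokeh.plotting

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator meters."""
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * np.log(np.tan(np.pi / 4.0 + lat / 2.0))
    return x, y


# Invented coordinates; not real well locations.
lon = [-102.10, -102.05, -101.95]
lat = [31.95, 32.00, 32.05]
x, y = lonlat_to_web_mercator(lon, lat)
source = bokeh.models.ColumnDataSource(data=dict(x=x, y=y))

# Mercator axis types label the axes in degrees instead of meters.
p = bokeh.plotting.figure(width=600, height=400, title="Well Locations",
                          x_axis_type="mercator", y_axis_type="mercator")

# Assumed tile URL; Bokeh fills in the {Z}/{X}/{Y} placeholders per tile.
tiles = bokeh.models.WMTSTileSource(
    url="https://tile.openstreetmap.org/{Z}/{X}/{Y}.png",
    attribution="(c) OpenStreetMap contributors",
)
p.add_tile(tiles)

p.scatter(x="x", y="y", size=15, fill_color="blue", fill_alpha=0.8,
          source=source)

# bokeh.io.show(p)  # uncomment to display; tiles are fetched by the browser
```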

Advanced Bokeh Features

With Bokeh, you can make sophisticated interactive visualizations with callbacks. There are two types of callbacks:

  1. JavaScript callbacks allow for transformations of the plot's data sources and other features, e.g. $x$/$y$-axis scaling, by writing JavaScript code that is executed on set interactions, e.g. clicking or hovering over a glyph. These allow for fast updating of the plot display while maintaining the "standalone" nature of the figure, i.e. plots can still be output to standalone HTML and embedded in websites backed by standard web servers. JavaScript callbacks are used to provide the interactivity in the previous example.

  2. Python callbacks allow for transformations of any and all plot features, data sources, etc. through the execution of arbitrary Python code. These types of callbacks require a Bokeh server to be running such that the Python code can be executed.
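
A minimal sketch of the first kind, a JavaScript callback, is shown below. It is not the code behind the example above; the slider, the data, and the power-law curve are all invented for illustration. The JavaScript string runs in the browser, so the plot remains a standalone document.

```python
import numpy as np
import bokeh.layouts
import bokeh.models
import bokeh.plotting

x = np.linspace(0.0, 2.0, 50)
source = bokeh.models.ColumnDataSource(data=dict(x=x, y=x))

p = bokeh.plotting.figure(width=400, height=300)
p.line(x="x", y="y", source=source)

slider = bokeh.models.Slider(start=0.5, end=3.0, value=1.0, step=0.5,
                             title="Exponent")

# JavaScript executed in the browser whenever the slider value changes.
# cb_obj is the model that triggered the callback (here, the slider).
callback = bokeh.models.CustomJS(args=dict(source=source), code="""
    const x = source.data['x'];
    const y = Array.from(x, (xi) => Math.pow(xi, cb_obj.value));
    source.data = {x: x, y: y};
""")
slider.js_on_change("value", callback)

layout = bokeh.layouts.column(slider, p)

# bokeh.io.show(layout)  # uncomment to display the slider above the plot
```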

Both types of callbacks can be used with widgets, although an easier-to-use widget toolkit built on top of Bokeh, called Panel, is recommended for sophisticated widget and dashboard creation.
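
For comparison, here is a sketch of a Python callback attached to a slider widget. The data and the function name update_frequency are invented for illustration. The callback only fires when the script is run by a Bokeh server, for example with `bokeh serve --show app.py`, where app.py is a hypothetical file containing this code.

```python
import numpy as np
import bokeh.io
import bokeh.layouts
import bokeh.models
import bokeh.plotting

x = np.linspace(0.0, 10.0, 200)
source = bokeh.models.ColumnDataSource(data=dict(x=x, y=np.sin(x)))

p = bokeh.plotting.figure(width=400, height=300)
p.line(x="x", y="y", source=source)

slider = bokeh.models.Slider(start=0.5, end=5.0, value=1.0, step=0.5,
                             title="Frequency")


def update_frequency(attr, old, new):
    """Recompute the curve in Python when the slider value changes."""
    source.data = dict(x=x, y=np.sin(new * x))


# Python callbacks must accept (attr, old, new).
slider.on_change("value", update_frequency)

# The server sends the current document to each connected browser.
bokeh.io.curdoc().add_root(bokeh.layouts.column(slider, p))
```

Because update_frequency is ordinary Python, it could equally query a database or run a model before updating the data source, which is what a JavaScript callback cannot do.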

Other Python plotting libraries

There are several other great plotting libraries for Python:

Matplotlib is the de facto standard plotting library for Python. It can create virtually any two-dimensional visualization you have ever seen, including standard plots, bar charts, box plots, contour and surface plots, etc.

HoloViews is a plotting package with a similar interface to Bokeh, but it allows you to choose either Bokeh (best for web) or Matplotlib (best for print publications) as the backend from a unified front end.

Plotly is another modern plotting library primarily targeting web-based visualizations; it offers built-in dashboarding capabilities.

Altair, the newest of the group, is based on Vega-Lite, a JavaScript visualization grammar similar to the Grammar of Graphics implementation in the R programming language.

Further Reading

Further reading on Bokeh can be found in the official documentation.