Intake is a Python package for loading, investigating, and organizing data. From the perspective of a data analyst, it has the following features:
- Loads data into containers you already use: NumPy arrays, Pandas/Dask Dataframes, etc.
- Reduces boilerplate code
- Facilitates reusable workflows
- Installs datasets as Python packages
- "Self-describing" data sources
- "Quick look" plotting
Below we'll import a built-in catalog of data sources from Intake and then list its contents.
from intake import cat
list(cat)
Each entry in the list above is a data source; most of them take parameters from the user to modify exactly what data gets loaded. The data sources pull data from remote SQL servers as well as from locally installed CSV files. Exactly where the data comes from is intentionally abstracted away from the data analyst, freeing them to focus on the workflows that use the data.
We'll later show how these data sources have methods for investigating their user parameters and descriptions, but for now let's quickly look at an example of loading some data.
Here we will load the total production histories (oil, gas, and water) for a well identified by the API number '33007000110000'. This command executes a SQL query against a database and returns a pandas DataFrame.
In these tutorials we try not to use packages that haven't been introduced yet, and we will cover pandas in detail later. However, because Intake is used to load data in essentially every tutorial, you'll have to excuse the premature use of pandas, as well as a few plotting utilities, which appear here only for the purpose of demonstrating Intake.
df = cat.production_by_api(api='33007000110000').read()
df.set_index(['date']).head()
The following code is required to reproduce the functionality we've demonstrated with a one-line Intake command. The steps include loading the required modules, reading environment variables, formatting the URL for the PostgreSQL database connection, formatting a SQL statement given an API number as an argument to the function query_api(), and finally using the pandas read_sql() function to load the data into a DataFrame object.
import os
import pandas as pd

# Database credentials are read from environment variables
username = os.environ['BAZEAN_POSTGRES_USERNAME']
password = os.environ['BAZEAN_POSTGRES_PASSWORD']

# SQLAlchemy-style connection URL for the PostgreSQL database
engine = 'postgresql://{}:{}@daytum-server.bazean.com:5432/daytum'.format(username, password)


def query_api(api):
    """
    Return a SQL statement to get the production history given an API number
    """
    return "SELECT date,volume_oil_formation_bbls," \
           "volume_gas_formation_mcf," \
           "volume_water_formation_bbls " \
           "FROM public.production_all WHERE api='{}'".format(api)

pd.read_sql(query_api(api='33007000140000'), engine, parse_dates=['date']).head(3)
Compare with
cat.production_by_api(api='33007000140000').read().head(3)
Sometimes you may have data that comes from different sources but would like to reuse a common workflow to analyze it. For example, if one set of data lives in a PostgreSQL database and another set is stored in CSV files, the data engineer can set up catalogs so that the data has the same structure, and the data analyst can use an identical workflow.
Below is a simple plotting workflow that takes an Intake catalog and an API number as arguments and formats a plot showing the oil and gas production histories for a well identified by the API number.
def my_fancy_plotting_workflow(catalog, api):
    """
    Create a production plot for the given catalog and API number; the data
    source must return a dataframe with a column labeled 'date'
    """
    dataframe = catalog.production_by_api(api=api).read()
    # Oil on the left axis, gas on a secondary axis on the right
    ax1 = dataframe.set_index(['date']).plot(y=['volume_oil_formation_bbls'])
    ax2 = dataframe.set_index(['date']).plot(y=['volume_gas_formation_mcf'], ax=ax1, secondary_y=True)
    ax1.set_xlabel('Date')
    ax1.set_ylabel('Production (BBLS)')
    ax2.set_ylabel('Production (MCF)')
    # Collect the legend handles from both axes into a single legend
    handles = []
    for ax in [ax1, ax2]:
        for h, _ in zip(*ax.get_legend_handles_labels()):
            handles.append(h)
    ax1.legend(handles, ['Oil', 'Gas (right)'])
    return ax1
Now let's use the same catalog introduced earlier with our plotting function my_fancy_plotting_workflow().
my_fancy_plotting_workflow(cat, api='33007000110000');
Now we'll play data engineer momentarily. We'll read in the production data from a couple of wells and immediately write it out to CSV files stored in a local directory named datasets. The detailed syntax of the following is not important at the moment; just understand that we are reading data from the PostgreSQL database and then storing it locally.
cat.production_by_api(api='33007000110000').read().to_csv('datasets/production_33007000110000.csv', index=False)
cat.production_by_api(api='33007000140000').read().to_csv('datasets/production_33007000140000.csv', index=False)
We can verify the files were written by listing the files in the datasets directory.
ls datasets/production*
We can inspect the contents of one of the files by looking at its first 5 lines.
!head -n 5 datasets/production_33007000110000.csv
Now we'll use Intake's open_csv() function to read one of the CSV files back in as a new data source called csv_cat. This object has a method .yaml() that produces a description of the source using YAML syntax. We can then minimally edit this output to create a local catalog that we can use for reading in CSV files with an arbitrary API number.
import intake  # so far we've only imported cat; now we need the top-level module

csv_cat = intake.open_csv('datasets/production_33007000110000.csv', csv_kwargs={'parse_dates': ['date']})
print(csv_cat.yaml())
You can inspect the changes between the output above and the edited version below. We rename the data source "csv" to "production_by_api" and, using the replacement syntax, create a user parameter api that replaces the explicit API number above.
%%file "datasets/nd_production.yml"
sources:
  production_by_api:
    args:
      csv_kwargs:
        parse_dates:
        - date
      urlpath: '{{ CATALOG_DIR }}/production_{{ api }}.csv'
    description: 'Returns production history given an API number'
    driver: intake.source.csv.CSVSource
    metadata: {}
csv_cat = intake.open_catalog("datasets/nd_production.yml")
list(csv_cat)
my_fancy_plotting_workflow(csv_cat, api='33007000140000');
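As a quick sanity check (assuming the CSV files were written to the datasets directory as above), reading from the CSV-backed catalog returns the same kind of table we got from the database earlier.
# Same call signature as the database-backed catalog, but the data now comes
# from the local CSV file written above.
csv_cat.production_by_api(api='33007000140000').read().head(3)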
As noted in the feature list above, data sources can also be installed as Python packages. For example, with conda:
conda install -c daytum data
Here we use the package manager conda to install a package named "data" from the "daytum" channel. This is extremely useful from a data engineering perspective because you can "version control" data and take advantage of conda update, etc.
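As a rough sketch of how you might confirm that an installed data package was picked up, you can look for its entries in the builtin catalog; note that attributing the production_by_state source to this particular package is an assumption made here for illustration.
import intake

# Catalog files installed by data packages are picked up by Intake's builtin
# catalog, so their sources appear alongside the ones listed earlier.
# Attributing 'production_by_state' to the "data" package is an assumption.
print('production_by_state' in list(intake.cat))
Each data source is also self-describing; its describe() method reports details such as the container type, description, and any user parameters. Here it is for the production_by_state source: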
cat.production_by_state.describe()
There is a built-in GUI that also allows for exploration of data sources. The search feature is particularly useful.
intake.gui
If the plotting package Holoviews is installed, you can create quick-look plots of the data, much like with pandas. Line, bar, scatter, and several other types of plots are built in; see the documentation for more information. The data engineer can also define custom plot specifications in the metadata of the catalog.
Below is an example of creating a scatter plot with the poro_perm data source without any arguments or customization.
import hvplot.intake
cat.poro_perm.plot.scatter()
A more useful plot is defined as my_scatter() in the catalog.
cat.poro_perm.plot.my_scatter()
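The my_scatter() specification lives in the catalog itself (in the source's metadata, as mentioned above), so the analyst does not have to spell out any plot options. As a rough sketch of the kind of call such a predefined plot might encode, a direct hvplot call could look like the following; the column names porosity and permeability are assumptions for illustration and may not match the actual poro_perm columns.
# Roughly what a catalog-defined scatter plot might encode, written as a
# direct call; the x/y column names here are hypothetical.
cat.poro_perm.plot.scatter(x='porosity', y='permeability')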
Calling the method .persist() on a data source will create a local copy in a storage format that is most suitable for the container. This is useful for large queries across networks or when working with big datasets (assuming you have the local hard drive storage capacity). Once the data has been persisted, it will load much faster in subsequent calls. This can be demonstrated by using the Jupyter notebook's %timeit magic function.
%timeit df = cat.production_by_api(api='33007000110000', persist='never').read()
To create a local copy, just call the method .persist()
cat.production_by_api(api='33007000110000').persist()
%timeit df = cat.production_by_api(api='33007000110000').read()