\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Scatter [porosity] (permeability)"
]
},
"execution_count": 19,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "1361"
}
},
"output_type": "execute_result"
}
],
"source": [
"cat.poro_perm.plot.my_scatter()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Persisting Data"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Calling `.persist()` on a data source creates a local copy in the storage format best suited to its container type. This is useful for large queries across a network or when working with big datasets (provided you have enough local disk space). Once the data has been persisted, subsequent reads load much faster. You can demonstrate this with the Jupyter `%timeit` magic function. The first timing below passes `persist='never'` so that the read skips any persisted copy."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1.42 s ± 21.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit df = cat.production_by_api(api='33007000110000', persist='never').read()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"To create a local copy, call `.persist()` on the data source:\n",
"\n",
"```python\n",
"cat.production_by_api(api='33007000110000').persist()\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": [
"remove_cell"
]
},
"source": [
"Run the cell below to persist the data, but only run it once."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": [
"remove_cell"
]
},
"outputs": [],
"source": [
"cat.production_by_api(api='33007000110000').persist();"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The slowest run took 7.65 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"53.5 ms ± 62.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit df = cat.production_by_api(api='33007000110000').read()"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"hide_input": true,
"init_cell": true,
"javascript_last_cell": true,
"jupyter": {
"source_hidden": true
},
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"application/javascript": [
"function hideElements(elements, start) {\n",
"for(var i = 0, length = elements.length; i < length;i++) {\n",
" if(i >= start) {\n",
" elements[i].style.display = \"none\";\n",
" }\n",
"}\n",
"}\n",
"var prompt_elements = document.getElementsByClassName(\"prompt\");\n",
"hideElements(prompt_elements, 0)\n"
],
"text/plain": [
"