Running notebooks#
New in version 0.0.18: execute_notebook. If you are using an older version, see PloomberClient (API Reference).
ploomber-engine allows you to run Jupyter notebooks programmatically. It is a drop-in replacement for papermill.execute_notebook with enhanced support for debugging, profiling, experiment tracking, and more.
Example#
Install dependencies:
%pip install ploomber-engine --quiet
Note: you may need to restart the kernel to use updated packages.
Download sample notebook:
%%sh
curl https://raw.githubusercontent.com/ploomber/ploomber-engine/main/examples/display.ipynb \
--output running-demo.ipynb
Run the notebook and store the executed version:
Command-line equivalent
ploomber-engine nb.ipynb output.ipynb
from ploomber_engine import execute_notebook
nb = execute_notebook("running-demo.ipynb", output_path="output.ipynb")
0%| | 0/10 [00:00<?, ?it/s]
Executing cell: 1: 0%| | 0/10 [00:00<?, ?it/s]
Executing cell: 1: 20%|█████ | 2/10 [00:00<00:00, 9.99it/s]
Executing cell: 2: 20%|█████ | 2/10 [00:00<00:00, 9.99it/s]
Executing cell: 3: 20%|█████ | 2/10 [00:00<00:00, 9.99it/s]
sending something to standard error
Executing cell: 4: 20%|█████ | 2/10 [00:00<00:00, 9.99it/s]
Executing cell: 4: 50%|████████████▌ | 5/10 [00:00<00:00, 11.50it/s]
Executing cell: 5: 50%|████████████▌ | 5/10 [00:00<00:00, 11.50it/s]
Executing cell: 6: 50%|████████████▌ | 5/10 [00:00<00:00, 11.50it/s]
Executing cell: 6: 70%|█████████████████▌ | 7/10 [00:00<00:00, 10.12it/s]
Executing cell: 7: 70%|█████████████████▌ | 7/10 [00:00<00:00, 10.12it/s]
Executing cell: 8: 70%|█████████████████▌ | 7/10 [00:00<00:00, 10.12it/s]
Executing cell: 9: 70%|█████████████████▌ | 7/10 [00:00<00:00, 10.12it/s]
Executing cell: 9: 100%|████████████████████████| 10/10 [00:00<00:00, 14.56it/s]
The function returns a notebook object (with the same contents as the file stored in output_path):
type(nb)
nbformat.notebooknode.NotebookNode
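Because the returned object follows the standard notebook format, its cells and outputs can be read like nested dictionaries. The following is a minimal sketch, not part of ploomber-engine: the helper name collect_stream_text is hypothetical, and it assumes the nbformat v4 layout, in which code cells carry an outputs list and stream outputs carry name and text fields. A hand-written dictionary stands in for an executed notebook so the snippet runs on its own.

```python
def collect_stream_text(nb, stream_name="stdout"):
    """Return the text written to one stream, one entry per output, in cell order.

    `nb` can be the object returned by execute_notebook or any plain dict
    that follows the nbformat v4 layout (an assumption of this sketch).
    """
    chunks = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue  # markdown and raw cells have no outputs
        for output in cell.get("outputs", []):
            if output["output_type"] == "stream" and output["name"] == stream_name:
                text = output["text"]
                # The notebook format allows text as a string or a list of strings
                chunks.append("".join(text) if isinstance(text, list) else text)
    return chunks


# Hand-written stand-in for an executed notebook, used only for illustration
example_nb = {
    "cells": [
        {"cell_type": "markdown", "source": "# Title", "metadata": {}},
        {
            "cell_type": "code",
            "source": "print('some message')",
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "some message\n"}
            ],
        },
    ]
}

print(collect_stream_text(example_nb))
```

With a real run, the same helper could be called as collect_stream_text(nb), where nb is the value returned by execute_notebook.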
Skip storing the output notebook:
_ = execute_notebook("running-demo.ipynb", output_path=None)
0%| | 0/10 [00:00<?, ?it/s]
Executing cell: 1: 0%| | 0/10 [00:00<?, ?it/s]
Executing cell: 2: 0%| | 0/10 [00:00<?, ?it/s]
Executing cell: 3: 0%| | 0/10 [00:00<?, ?it/s]
sending something to standard error
Executing cell: 4: 0%| | 0/10 [00:00<?, ?it/s]
Executing cell: 4: 50%|████████████▌ | 5/10 [00:00<00:00, 27.93it/s]
Executing cell: 5: 50%|████████████▌ | 5/10 [00:00<00:00, 27.93it/s]
Executing cell: 6: 50%|████████████▌ | 5/10 [00:00<00:00, 27.93it/s]
Executing cell: 7: 50%|████████████▌ | 5/10 [00:00<00:00, 27.93it/s]
Executing cell: 7: 80%|████████████████████ | 8/10 [00:00<00:00, 19.38it/s]
Executing cell: 8: 80%|████████████████████ | 8/10 [00:00<00:00, 19.38it/s]
Executing cell: 9: 80%|████████████████████ | 8/10 [00:00<00:00, 19.38it/s]
Executing cell: 9: 100%|████████████████████████| 10/10 [00:00<00:00, 25.26it/s]
Logging print statements#
If your notebook contains print statements and you want to see their output in the current session:
Command-line equivalent
ploomber-engine nb.ipynb output.ipynb --log-output
_ = execute_notebook(
"running-demo.ipynb",
output_path="output.ipynb",
log_output=True,
progress_bar=False,
)
sending something to standard error
some message
showing our logo :)
Parametrizing notebooks#
New in version 0.0.19.
You can parametrize notebooks and change their parameter values at runtime. By default, values are injected in the first cell. However, if you want to provide default values, you may add a cell like this:
# parameters
x = 1
y = 2
If you do so, the passed parameters are injected in a cell below it, replacing the default values. If you prefer, you can tag the cell with default values as "parameters" (à la papermill) instead of adding the # parameters comment.
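To make the two ways of marking a defaults cell concrete, here is a small sketch that locates such a cell in a notebook represented as a dictionary. It is an illustration only: find_parameters_cell is a hypothetical helper, and the matching rules it uses (a first line equal to "# parameters", or a "parameters" entry in the cell's metadata tags) are assumptions that may differ from what ploomber-engine does internally.

```python
def find_parameters_cell(nb):
    """Return the index of the first code cell holding default parameters, or None.

    Illustrative only: the checks below are assumptions of this sketch, not
    ploomber-engine's actual detection logic.
    """
    for index, cell in enumerate(nb["cells"]):
        if cell["cell_type"] != "code":
            continue
        source = cell["source"]
        if isinstance(source, list):
            # The notebook format allows the source as a list of lines
            source = "".join(source)
        tags = cell.get("metadata", {}).get("tags", [])
        stripped = source.strip()
        first_line = stripped.splitlines()[0].strip() if stripped else ""
        if "parameters" in tags or first_line == "# parameters":
            return index
    return None


# Defaults marked with a comment
commented = {
    "cells": [
        {"cell_type": "code", "source": "# parameters\nx = 1\ny = 2", "metadata": {}}
    ]
}

# Defaults marked with a cell tag
tagged = {
    "cells": [
        {"cell_type": "markdown", "source": "# Sum", "metadata": {}},
        {"cell_type": "code", "source": "x = 1\ny = 2", "metadata": {"tags": ["parameters"]}},
    ]
}

print(find_parameters_cell(commented), find_parameters_cell(tagged))
```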
Let’s download a sample notebook that prints x + y:
%%sh
curl https://raw.githubusercontent.com/ploomber/ploomber-engine/main/examples/sum.ipynb \
--output sum-demo.ipynb
If we don’t pass parameters, it uses the default values:
_ = execute_notebook(
"sum-demo.ipynb", output_path=None, log_output=True, progress_bar=False
)
x + y = 0 + 0 = 0
Passing parameters overrides the defaults:
Command-line equivalent
ploomber-engine nb.ipynb output.ipynb -p x 21 -p y 21
_ = execute_notebook(
"sum-demo.ipynb",
output_path=None,
log_output=True,
parameters=dict(x=21, y=21),
progress_bar=False,
)
x + y = 21 + 21 = 42
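Since parameters is a plain dictionary, the same notebook can be run over several combinations of values. The sketch below is one possible way to organize that, not a feature of ploomber-engine: build_runs, run_all, and the output file name template are hypothetical, and only the execute_notebook arguments shown earlier on this page are used. The import is placed inside run_all so that the planning step works even where ploomber-engine is not installed.

```python
from itertools import product


def build_runs(x_values, y_values, template="sum-{x}-{y}.ipynb"):
    """Pair every parameter combination with its own output path."""
    return [
        (dict(x=x, y=y), template.format(x=x, y=y))
        for x, y in product(x_values, y_values)
    ]


def run_all(input_path, runs):
    """Execute the notebook once per parameter set."""
    # Imported lazily so build_runs can be used without ploomber-engine installed
    from ploomber_engine import execute_notebook

    for parameters, output_path in runs:
        execute_notebook(
            input_path,
            output_path=output_path,
            parameters=parameters,
            progress_bar=False,
        )


runs = build_runs([1, 2], [10, 20])
for parameters, output_path in runs:
    print(parameters, "->", output_path)

# To actually execute (requires ploomber-engine and sum-demo.ipynb):
# run_all("sum-demo.ipynb", runs)
```

Writing each run to its own output file keeps the executed notebooks side by side for later comparison.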
Removing cells#
New in version 0.0.21.
Command-line equivalent
ploomber-engine nb.ipynb output.ipynb --remove-tagged-cells remove
If there are cells you want to remove before execution, tag them and use remove_tagged_cells. This sample notebook contains one cell that fails if executed; however, that cell has the tag "remove", so let’s remove it before execution:
%%sh
curl https://raw.githubusercontent.com/ploomber/ploomber-engine/main/examples/remove.ipynb \
--output running-remove.ipynb
_ = execute_notebook(
"running-remove.ipynb",
output_path=None,
remove_tagged_cells="remove",
)
0%| | 0/1 [00:00<?, ?it/s]
Executing cell: 1: 0%| | 0/1 [00:00<?, ?it/s]
Executing cell: 1: 100%|█████████████████████████| 1/1 [00:00<00:00, 297.11it/s]
You may also pass multiple tags to remove_tagged_cells in a list: ["remove", "also-remove"].
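Tags are usually added through the Jupyter interface, but they can also be set programmatically. The sketch below assumes the standard notebook layout, where each cell stores its tags as a list under metadata.tags; tag_cells is a hypothetical helper, and the two-cell notebook is a hand-written stand-in used only for illustration.

```python
import json


def tag_cells(nb, indexes, tag="remove"):
    """Add `tag` to the metadata of the cells at the given positions (in place)."""
    for index in indexes:
        metadata = nb["cells"][index].setdefault("metadata", {})
        tags = metadata.setdefault("tags", [])
        if tag not in tags:  # avoid duplicate tags on repeated calls
            tags.append(tag)
    return nb


# Hand-written stand-in for a notebook whose second cell would fail
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "source": "print('ok')",
            "metadata": {},
            "outputs": [],
            "execution_count": None,
        },
        {
            "cell_type": "code",
            "source": "raise ValueError('fails')",
            "metadata": {},
            "outputs": [],
            "execution_count": None,
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

tag_cells(notebook, [1], tag="remove")
print(json.dumps(notebook["cells"][1]["metadata"]))
```

Once the tagged notebook is saved to disk, for example with json.dump, it can be executed with remove_tagged_cells="remove" as in the example above.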