Memory usage#
New in version 0.0.18: execute_notebook
(API Reference)
With ploomber-engine you can profile a Jupyter notebook's memory usage, something papermill is not capable of.
Install requirements:
%pip install ploomber-engine psutil matplotlib --quiet
Note: you may need to restart the kernel to use updated packages.
Example#
Import the execute_notebook
function:
from ploomber_engine import execute_notebook
We'll now programmatically create a sample notebook and store it in notebook.ipynb
. Note that it creates a 1MB numpy array in cell 3 and a 10MB numpy array in cell 5.
import nbformat
nb = nbformat.v4.new_notebook()
sleep = "time.sleep(0.5)"
cells = [
# cell 1
"import numpy as np; import time",
# cell 2
sleep,
# cell 3
"x = np.ones(131072, dtype='float64')",
# cell 4
sleep,
# cell 5
"y = np.ones(131072*10, dtype='float64')",
# cell 6
sleep,
]
nb.cells = [nbformat.v4.new_code_cell(cell) for cell in cells]
nbformat.write(nb, "notebook.ipynb")
Let’s execute the notebook with profile_memory=True
Command-line equivalent
ploomber-engine notebook.ipynb output.ipynb --profile-memory
_ = execute_notebook("notebook.ipynb", "output.ipynb", profile_memory=True)
0%| | 0/6 [00:00<?, ?it/s]
Executing cell: 1: 0%| | 0/6 [00:00<?, ?it/s]
Executing cell: 2: 0%| | 0/6 [00:00<?, ?it/s]
Executing cell: 2: 33%|████████▋ | 2/6 [00:00<00:01, 3.94it/s]
Executing cell: 3: 33%|████████▋ | 2/6 [00:00<00:01, 3.94it/s]
Executing cell: 4: 33%|████████▋ | 2/6 [00:00<00:01, 3.94it/s]
Executing cell: 4: 67%|█████████████████▎ | 4/6 [00:01<00:00, 3.94it/s]
Executing cell: 5: 67%|█████████████████▎ | 4/6 [00:01<00:00, 3.94it/s]
Executing cell: 6: 67%|█████████████████▎ | 4/6 [00:01<00:00, 3.94it/s]
Executing cell: 6: 100%|██████████████████████████| 6/6 [00:01<00:00, 3.92it/s]

We can also set the output path for the plot with profile_memory=<path_to_png>
We can see that after running cells 1-2, there is no significant increase in memory usage. However, once cell 3 finishes executing, we see a 1MB bump, since we allocated the array there. Cell 4 doesn't increase memory usage, since it only calls time.sleep
, but cell 5 shows a 10MB bump since we allocated the second (larger) array there.
If you want to look at the executed notebook, it’s available at output.ipynb
.
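As a sanity check, the array sizes follow directly from the element counts: a float64 occupies 8 bytes, so the 131072 elements allocated in cell 3 amount to exactly 1 MB (interpreting MB as 1024² bytes):

```python
# Each float64 element occupies 8 bytes.
BYTES_PER_FLOAT64 = 8


def array_mb(n_elements):
    """Size in MB (1024**2 bytes) of a float64 array with n_elements."""
    return n_elements * BYTES_PER_FLOAT64 / 1024**2


print(array_mb(131072))       # array in cell 3 -> 1.0
print(array_mb(131072 * 10))  # array in cell 5 -> 10.0
```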
Customizing the plot#
You can customize the plot by calling the plot_memory_usage
function and passing the output notebook; the returned object is a matplotlib.Axes
.
%%capture
from ploomber_engine.profiling import plot_memory_usage
nb = execute_notebook("notebook.ipynb", "output.ipynb", profile_memory=True)
ax = plot_memory_usage(nb)
_ = ax.set_title("My custom title")

Saving profiling data#
You can save the profiling data by setting save_profiling_data=True
, or by providing a custom path to save it to.
Enable save_profiling_data by setting it to True
#
By default, the file is saved as output-profiling-data.csv
.
%%capture
_ = execute_notebook(
"notebook.ipynb", "output.ipynb", profile_memory=True, save_profiling_data=True
)
import pandas as pd
pd.read_csv("output-profiling-data.csv")
| | cell | runtime | memory |
|---|---|---|---|
| 0 | 1 | 0.002382 | 96.082031 |
| 1 | 2 | 0.503185 | 96.082031 |
| 2 | 3 | 0.002353 | 96.082031 |
| 3 | 4 | 0.503000 | 96.082031 |
| 4 | 5 | 0.003840 | 97.054688 |
| 5 | 6 | 0.503072 | 97.058594 |
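The memory column reports the process's usage measured after each cell, so the allocation attributable to an individual cell shows up as the difference between consecutive rows. A small sketch computing those per-cell deltas, with the values hard-coded from the table above so the example is self-contained:

```python
import pandas as pd

# Values copied from the profiling table above
df = pd.DataFrame(
    {
        "cell": [1, 2, 3, 4, 5, 6],
        "memory": [96.082031, 96.082031, 96.082031, 96.082031, 97.054688, 97.058594],
    }
)

# Memory growth attributable to each cell (first row has no predecessor, so NaN)
df["memory_delta"] = df["memory"].diff()
print(df)
```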
Enable save_profiling_data with a custom file path#
Note that the file path must end with the .csv
extension.
%%capture
_ = execute_notebook(
"notebook.ipynb",
"output.ipynb",
profile_runtime=True,
save_profiling_data="./my_output.csv",
)
import pandas as pd
pd.read_csv("my_output.csv")
| | cell | runtime | memory |
|---|---|---|---|
| 0 | 1 | 0.000421 | NaN |
| 1 | 2 | 0.500884 | NaN |
| 2 | 3 | 0.000664 | NaN |
| 3 | 4 | 0.500946 | NaN |
| 4 | 5 | 0.003532 | NaN |
| 5 | 6 | 0.500967 | NaN |
Note: you must set profile_memory=True
to record memory usage; the example above only sets profile_runtime=True
, which is why the memory column contains NaN values.