
Memory usage#

New in version 0.0.18: execute_notebook (API Reference)

With ploomber-engine, you can profile a Jupyter notebook’s memory usage, something papermill cannot do.

Install requirements:

%pip install ploomber-engine psutil matplotlib --quiet
Note: you may need to restart the kernel to use updated packages.

Example#

Import the execute_notebook function:

from ploomber_engine import execute_notebook

We’ll now programmatically create a sample notebook and store it in notebook.ipynb. Note that it creates a 1MB numpy array in cell 3 and a 10MB numpy array in cell 5.

import nbformat

nb = nbformat.v4.new_notebook()
sleep = "time.sleep(0.5)"
cells = [
    # cell 1
    "import numpy as np; import time",
    # cell 2
    sleep,
    # cell 3
    "x = np.ones(131072, dtype='float64')",
    # cell 4
    sleep,
    # cell 5
    "y = np.ones(131072*10, dtype='float64')",
    # cell 6
    sleep,
]

nb.cells = [nbformat.v4.new_code_cell(cell) for cell in cells]

nbformat.write(nb, "notebook.ipynb")
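
The array sizes quoted above follow from the element count and the width of a float64. The sketch below checks that arithmetic with the standard library only; the helper name array_size_mib is hypothetical and not part of ploomber-engine or numpy.

```python
import struct

# A float64 occupies 8 bytes; "<d" is the standard-size double format.
FLOAT64_BYTES = struct.calcsize("<d")


def array_size_mib(n_elements, itemsize=FLOAT64_BYTES):
    """Size in MiB of an array's data buffer (hypothetical helper)."""
    return n_elements * itemsize / 2**20


small = array_size_mib(131072)       # array created in cell 3
large = array_size_mib(131072 * 10)  # array created in cell 5
print(small, large)
```

With numpy installed, the nbytes attribute of each array should report the same sizes in bytes.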

Let’s execute the notebook with profile_memory=True:

_ = execute_notebook("notebook.ipynb", "output.ipynb", profile_memory=True)
  0%|                                                     | 0/6 [00:00<?, ?it/s]
Executing cell: 1:   0%|                                  | 0/6 [00:00<?, ?it/s]
Executing cell: 2:   0%|                                  | 0/6 [00:00<?, ?it/s]
Executing cell: 2:  33%|████████▋                 | 2/6 [00:00<00:01,  3.94it/s]
Executing cell: 3:  33%|████████▋                 | 2/6 [00:00<00:01,  3.94it/s]
Executing cell: 4:  33%|████████▋                 | 2/6 [00:00<00:01,  3.94it/s]
Executing cell: 4:  67%|█████████████████▎        | 4/6 [00:01<00:00,  3.94it/s]
Executing cell: 5:  67%|█████████████████▎        | 4/6 [00:01<00:00,  3.94it/s]
Executing cell: 6:  67%|█████████████████▎        | 4/6 [00:01<00:00,  3.94it/s]
Executing cell: 6: 100%|██████████████████████████| 6/6 [00:01<00:00,  3.93it/s]
Executing cell: 6: 100%|██████████████████████████| 6/6 [00:01<00:00,  3.93it/s]

[Plot: memory usage per cell]

We can also set the path where the plot is saved with profile_memory=<path_to_png>.

We can see that after running cells 1-2, there isn’t any significant increase in memory usage. However, once cell 3 finishes executing, we see a bump of 1MB, since we allocated the first array there. Cell 4 doesn’t increase memory usage, since it only contains a call to time.sleep, but cell 5 shows a 10MB bump, since we allocated the second (larger) array.
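
To make the "measure after each cell" idea concrete, here is a minimal sketch that runs a few code snippets in sequence and records memory after each one. It is not ploomber-engine's implementation: it swaps in the standard-library tracemalloc module (which only tracks allocations made through Python's allocators) instead of process-level measurements, uses bytearray instead of numpy, and the function name run_steps is hypothetical.

```python
import tracemalloc


def run_steps(steps):
    """Run each snippet and record traced memory (MiB) once it finishes."""
    tracemalloc.start()
    namespace = {}
    readings = []
    for source in steps:
        exec(source, namespace)  # state persists across steps, like notebook cells
        current, _peak = tracemalloc.get_traced_memory()
        readings.append(current / 2**20)
    tracemalloc.stop()
    return readings


steps = [
    "pass",                       # no allocation
    "x = bytearray(2**20)",       # allocates about 1 MiB
    "pass",                       # no allocation
    "y = bytearray(10 * 2**20)",  # allocates about 10 MiB
]

readings = run_steps(steps)
# Change in memory attributable to each step
deltas = [after - before for before, after in zip([0.0] + readings, readings)]
for number, delta in enumerate(deltas, start=1):
    print(f"step {number}: {delta:+.2f} MiB")
```

As in the plot, the steps that only idle leave memory roughly flat, while the allocating steps show a bump close to the size of the object they create.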

If you want to look at the executed notebook, it’s available at output.ipynb.

Customizing the plot#

You can customize the plot by calling the plot_memory_usage function and passing the output notebook (the object returned by execute_notebook). The returned object is a matplotlib.axes.Axes.

%%capture

from ploomber_engine.profiling import plot_memory_usage

nb = execute_notebook("notebook.ipynb", "output.ipynb", profile_memory=True)
ax = plot_memory_usage(nb)
_ = ax.set_title("My custom title")
[Plot: memory usage per cell, titled "My custom title"]

Saving profiling data#

You can save the profiling data by setting save_profiling_data=True, or by providing a custom path for the output file.

Enable save_profiling_data by setting it to True#

The file will be saved as output-profiling-data.csv by default.

%%capture
_ = execute_notebook(
    "notebook.ipynb", "output.ipynb", profile_memory=True, save_profiling_data=True
)

import pandas as pd

pd.read_csv("output-profiling-data.csv")
   cell   runtime     memory
0     1  0.002329  96.402344
1     2  0.502832  96.402344
2     3  0.002088  96.402344
3     4  0.502808  96.402344
4     5  0.003119  96.402344
5     6  0.502937  96.406250
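
Once the data is on disk, it can be summarized without pandas. The sketch below embeds the rows shown above as text so it runs on its own; the comma-separated layout with a cell,runtime,memory header is an assumption about the saved file, inferred from the columns pandas displays.

```python
import csv
import io

# Rows copied from the output above; the exact file layout is an assumption.
CSV_TEXT = """cell,runtime,memory
1,0.002329,96.402344
2,0.502832,96.402344
3,0.002088,96.402344
4,0.502808,96.402344
5,0.003119,96.402344
6,0.502937,96.406250
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Total runtime across all cells, in seconds
total_runtime = sum(float(row["runtime"]) for row in rows)

# The cell that took the longest to run
slowest = max(rows, key=lambda row: float(row["runtime"]))

print(f"total runtime: {total_runtime:.6f}s")
print(f"slowest cell: {slowest['cell']} ({slowest['runtime']}s)")
```

To work with the real file, replace io.StringIO(CSV_TEXT) with open("output-profiling-data.csv", newline="").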

Enable save_profiling_data with custom file path#

Note that the file path must end with the .csv extension.

%%capture
_ = execute_notebook(
    "notebook.ipynb",
    "output.ipynb",
    profile_runtime=True,
    save_profiling_data="./my_output.csv",
)

import pandas as pd

pd.read_csv("my_output.csv")
   cell   runtime  memory
0     1  0.000460     NaN
1     2  0.500927     NaN
2     3  0.000835     NaN
3     4  0.500954     NaN
4     5  0.002833     NaN
5     6  0.501043     NaN

Note: you must set profile_memory=True to get non-NaN values in the memory column.
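
Before analyzing a saved file, it can help to check whether memory was actually profiled. The following is a sketch under assumptions: it treats both an empty field and a literal "nan" as missing, since the way missing values are written to the CSV is not shown here, and the helper names parse_memory and has_memory_data are hypothetical.

```python
import csv
import io
import math


def parse_memory(value):
    """Return the reading as a float, or None if the field is empty or NaN."""
    text = value.strip()
    if not text:
        return None
    number = float(text)
    return None if math.isnan(number) else number


def has_memory_data(csv_file):
    """True if at least one row carries a usable memory reading."""
    reader = csv.DictReader(csv_file)
    return any(parse_memory(row["memory"]) is not None for row in reader)


# Illustrative inputs: runtime-only data (memory missing) and full data
runtime_only = "cell,runtime,memory\n1,0.000460,\n2,0.500927,nan\n"
with_memory = "cell,runtime,memory\n1,0.002329,96.402344\n"

print(has_memory_data(io.StringIO(runtime_only)))
print(has_memory_data(io.StringIO(with_memory)))
```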