Testing existing outputs

Note

This is an experimental feature; please share your feedback on Slack! It requires ploomber-engine 0.0.16 or higher.

With ploomber-engine, you can re-run your notebooks and check that their outputs still match the ones stored in the file.

Example (test passes)

Let’s create a simple notebook that prints a few numbers:

import nbformat

nb = nbformat.v4.new_notebook()

cells = [
    "print(1)",
    "print(2)",
]

nb.cells = [nbformat.v4.new_code_cell(cell) for cell in cells]
nbformat.write(nb, "notebook.ipynb")

Let’s run the notebook and overwrite the original file, so that it now stores the outputs:

from ploomber_engine.ipython import PloomberClient


client = PloomberClient.from_path("notebook.ipynb")
out = client.execute()
nbformat.write(out, "notebook.ipynb")

  0%|                                                     | 0/2 [00:00<?, ?it/s]

Executing cell: 1:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2: 100%|█████████████████████████| 2/2 [00:00<00:00, 440.09it/s]
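At this point the file on disk holds each cell's outputs next to its source, and those stored outputs are what later runs are compared against. As a rough illustration of that layout, the sketch below reads stored outputs using only the standard library. The helper names (`load_notebook`, `stored_outputs`, `stream_text`) are hypothetical and not part of ploomber-engine, and the inline dictionary is an assumed, simplified version of what the file contains.

```python
import json


def load_notebook(path):
    """Read a .ipynb file as a plain dictionary (notebooks are JSON documents)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def stored_outputs(notebook):
    """Return one list of output dictionaries per code cell, in notebook order."""
    return [
        cell.get("outputs", [])
        for cell in notebook["cells"]
        if cell["cell_type"] == "code"
    ]


def stream_text(output):
    """Return the text of a stream output (on disk it may be a string or a list of lines)."""
    text = output.get("text", "")
    return "".join(text) if isinstance(text, list) else text


# Assumed, simplified content of the executed notebook.
# With the real file you could use: notebook = load_notebook("notebook.ipynb")
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": "print(1)",
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "1\n"}],
        },
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {},
            "source": "print(2)",
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
        },
    ],
}

printed = [[stream_text(out) for out in outs] for outs in stored_outputs(notebook)]
print(printed)
```

Each code cell maps to a list because a single cell can produce several outputs.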

Call test_notebook to test the notebook (it won’t raise an error, since re-running the notebook produces the same outputs):

from ploomber_engine.testing import test_notebook

test_notebook("notebook.ipynb")

  0%|                                                     | 0/2 [00:00<?, ?it/s]

Executing cell: 1:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2: 100%|█████████████████████████| 2/2 [00:00<00:00, 601.20it/s]

Test failure: output mismatch

Let’s load the notebook and modify the source code of the first cell, while leaving the stored outputs unchanged:

nb = nbformat.read("notebook.ipynb", as_version=nbformat.NO_CONVERT)

# this was previously: print(1)
nb.cells[0].source = "print(100)"

# store the notebook
nbformat.write(nb, "notebook.ipynb")

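Testing the notebook now fails, because the first cell prints 100 while the stored output is still 1:

test_notebook("notebook.ipynb")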

  0%|                                                     | 0/2 [00:00<?, ?it/s]

Executing cell: 1:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2: 100%|█████████████████████████| 2/2 [00:00<00:00, 497.01it/s]

---------------------------------------------------------------------------
NotebookTestException                     Traceback (most recent call last)
Cell In[6], line 1
----> 1 test_notebook("notebook.ipynb")

File ~/checkouts/readthedocs.org/user_builds/ploomber-engine/checkouts/latest/src/ploomber_engine/testing.py:75, in test_notebook(path_to_nb)
     69 if len_expected != len_actual:
     70     raise NotebookTestException(
     71         f"Error in cell {idx}: Expected number of "
     72         f"cell outputs ({len_expected}), actual ({len_actual})"
     73     )
---> 75 _compare_outputs(idx, expected, actual)

File ~/checkouts/readthedocs.org/user_builds/ploomber-engine/checkouts/latest/src/ploomber_engine/testing.py:46, in _compare_outputs(idx, out_ref, out_actual)
     44 for ref, actual in zip(out_ref, out_actual):
     45     if ref != actual:
---> 46         raise NotebookTestException(
     47             f"Error in cell {idx}: Expected output ({ref}), actual ({actual})"
     48         )

NotebookTestException: Error in cell 1: Expected output (1), actual (100)

Test failure: different number of outputs

test_notebook also checks that the number of outputs for each cell matches.

Let’s modify the notebook so the first cell produces two outputs (the printed 100 and the value of the trailing expression, 200):

nb = nbformat.read("notebook.ipynb", as_version=nbformat.NO_CONVERT)

nb.cells[0].source = "print(100); 200"

# store the notebook
nbformat.write(nb, "notebook.ipynb")

The test fails again, this time because the first cell has one stored output but produces two:

test_notebook("notebook.ipynb")

  0%|                                                     | 0/2 [00:00<?, ?it/s]

Executing cell: 1:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2:   0%|                                  | 0/2 [00:00<?, ?it/s]

Executing cell: 2: 100%|█████████████████████████| 2/2 [00:00<00:00, 396.23it/s]

---------------------------------------------------------------------------
NotebookTestException                     Traceback (most recent call last)
Cell In[8], line 1
----> 1 test_notebook("notebook.ipynb")

File ~/checkouts/readthedocs.org/user_builds/ploomber-engine/checkouts/latest/src/ploomber_engine/testing.py:70, in test_notebook(path_to_nb)
     67 len_actual = len(actual)
     69 if len_expected != len_actual:
---> 70     raise NotebookTestException(
     71         f"Error in cell {idx}: Expected number of "
     72         f"cell outputs ({len_expected}), actual ({len_actual})"
     73     )
     75 _compare_outputs(idx, expected, actual)

NotebookTestException: Error in cell 1: Expected number of cell outputs (1), actual (2)
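The two failures above correspond to two checks that are visible in the tracebacks: first the number of outputs per cell, then each output against its stored counterpart. The sketch below mirrors that logic in plain Python. It is not the library's implementation: `OutputMismatch`, `compare_cell` and `compare_notebook` are hypothetical names, and outputs are represented as plain strings rather than notebook output objects.

```python
class OutputMismatch(Exception):
    """Hypothetical stand-in for the exception raised by test_notebook."""


def compare_cell(idx, expected, actual):
    """Compare the stored outputs of one cell with the outputs of a fresh run."""
    # Check 1: the cell must produce the same number of outputs
    if len(expected) != len(actual):
        raise OutputMismatch(
            f"Error in cell {idx}: Expected number of "
            f"cell outputs ({len(expected)}), actual ({len(actual)})"
        )

    # Check 2: each output must equal its stored counterpart
    for ref, out in zip(expected, actual):
        if ref != out:
            raise OutputMismatch(
                f"Error in cell {idx}: Expected output ({ref}), actual ({out})"
            )


def compare_notebook(expected_cells, actual_cells):
    """Compare cell by cell, numbering cells from 1 as in the messages above."""
    pairs = zip(expected_cells, actual_cells)
    for idx, (expected, actual) in enumerate(pairs, start=1):
        compare_cell(idx, expected, actual)


# Stored outputs of the original notebook, one list per cell (simplified to strings)
stored = [["1"], ["2"]]

# Same outputs: no exception is raised
compare_notebook(stored, [["1"], ["2"]])

# First cell now prints 100: the outputs differ
try:
    compare_notebook(stored, [["100"], ["2"]])
except OutputMismatch as exc:
    mismatch_message = str(exc)

# First cell now produces two outputs: the counts differ
try:
    compare_notebook(stored, [["100", "200"], ["2"]])
except OutputMismatch as exc:
    count_message = str(exc)

print(mismatch_message)
print(count_message)
```

Checking the count first keeps the error message specific. A pairwise comparison with `zip` alone would silently stop at the shorter list and miss extra outputs.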

Limitations

Currently, plots are ignored, since re-running a cell can produce different plot data even when the plots look the same.
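To make the idea of ignoring plots concrete, here is one way image outputs could be filtered out before a comparison. This is a sketch under assumptions, not a description of how ploomber-engine does it: `is_plot_output` and `without_plots` are hypothetical helpers, and the sketch assumes that a plot is an output whose `data` bundle contains an `image/*` MIME type, as in the standard notebook format.

```python
def is_plot_output(output):
    """Return True if the output carries image data (for example image/png)."""
    # Stream outputs have no "data" key; rich outputs map MIME types to content
    data = output.get("data", {})
    return any(mime.startswith("image/") for mime in data)


def without_plots(outputs):
    """Return the outputs of a cell with image outputs removed (assumed strategy)."""
    return [out for out in outputs if not is_plot_output(out)]


# Illustrative outputs of one cell: printed text followed by a rendered figure
outputs = [
    {"output_type": "stream", "name": "stdout", "text": "plotting\n"},
    {
        "output_type": "display_data",
        "metadata": {},
        "data": {
            "image/png": "<base64 image data>",  # placeholder, not real image bytes
            "text/plain": "<Figure size 640x480 with 1 Axes>",
        },
    },
]

comparable = without_plots(outputs)
print(len(outputs), len(comparable))
```

With a filter like this, only the text output would take part in the comparison, so a notebook whose figures change between runs would not be reported as failing for that reason alone.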