Papermill

Jupyter notebooks are awesome for interactive data analysis, but they are poor at parameterization and cumbersome to run as a script. We either have to open the notebook and run all cells, or convert it into a script and then run that.

The lack of parameterization options means that it is difficult to use a notebook as a “function” in a bigger workflow. For instance, suppose we need to run an algorithm with 100 combinations of parameters and we want to parallelize it. Then either all the combinations and the parallelization routine have to live inside the notebook, where they are hidden from the outside, or we have to write a separate script that receives the inputs and use that script instead of the notebook.

Papermill addresses these two issues, as clearly stated at the top of its homepage.

Papermill lets you
- parameterize notebooks
- execute notebooks
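To make this concrete, here is a minimal sketch of how the two features could be combined for the parameter-sweep case described earlier. The notebook name `analysis.ipynb`, the output folder `runs`, and the parameters `alpha` and `n_estimators` are hypothetical placeholders, and the helper functions are my own; only `papermill.execute_notebook` comes from the library. It assumes the notebook has a cell tagged `parameters` holding default values, which Papermill overrides on each run.

```python
import os
from itertools import product


def parameter_grid(**options):
    """Return one dict per combination of the given option lists."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]


def run_notebook(params, input_path="analysis.ipynb", output_dir="runs"):
    """Execute the (hypothetical) notebook once with the given parameters."""
    # Imported here so the grid helper above works without papermill installed.
    import papermill as pm

    os.makedirs(output_dir, exist_ok=True)
    # Build a distinct output file name for each parameter combination.
    tag = "_".join(f"{key}-{value}" for key, value in params.items())
    output_path = os.path.join(output_dir, f"analysis_{tag}.ipynb")
    # Papermill injects `params` after the cell tagged "parameters",
    # runs the notebook top to bottom, and saves the executed copy.
    pm.execute_notebook(input_path, output_path, parameters=params)
    return output_path


# Hypothetical sweep: 2 x 2 = 4 combinations.
grid = parameter_grid(alpha=[0.1, 0.5], n_estimators=[10, 100])
```

Each call to `run_notebook(params)` leaves behind an executed copy of the notebook, so every run stays inspectable. Because the notebook is now called like a function, the parallelization can live outside it, for example by mapping `run_notebook` over `grid` with `concurrent.futures.ProcessPoolExecutor`.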