Open-source software name: tritemio/nbrun
Repository: https://github.com/tritemio/nbrun
Programming language: Python 60.3%
Description:

nbrun

Nbrun automates the execution of Jupyter Notebooks, passing arguments for parametrization. Nbrun offers a simple yet effective way to remove code duplication and manual steps, making it possible to build automated and self-documented analysis pipelines. Nbrun contains a single function ( Nbrun is a small wrapper function around nbconvert.

Usage

Clone the repo and execute the master notebook. To use in a new project,
copy the file

Rationale

Oftentimes we write a notebook to analyze a dataset, then we want to use the same notebook for a different dataset. For reproducibility and for quickly finding results later on, it is good practice to save a fully executed notebook for each data file. So, naturally, we start making copies of the original notebook to process each additional data file. But every time we improve or fix the notebook we need to replicate the changes in all the other similar notebooks.

Enter nbrun. With nbrun we can keep a single "template" notebook which performs the analysis. The template notebook takes one (or more) arguments as input (e.g. the dataset file name or an ID). A second "master" notebook is used to execute the template notebook in batch, once for each input argument. Each time the template is executed with a particular set of arguments, the executed copy is saved to disk. At the end we obtain a fully executed notebook for each dataset/set of arguments, and a single template notebook to modify when we fix or improve the analysis.

Implementation

The mechanism is "low-tech", yet effective. The input arguments are contained in a
dictionary. The function

Why after the first cell? Conventionally, the first cell of the template notebook assigns the variables used as "input arguments". These are the default values, and they allow the template notebook to be executed on its own. When we pass an argument, it is inserted in the second cell, overriding the default value.

Limitations

The dict containing input arguments is converted to a string containing
Python code which assigns a variable for each argument. For this reason, you
cannot pass arbitrary objects, but only objects with a complete
"literal representation" (i.e. using

Also, unlike calling a real Python function, no check is performed
on the input arguments. There is no formal "notebook signature" analogous
to a function signature. The user needs to make sure they are passing the
right arguments when calling/executing a notebook. As a future idea, a "signature"
could be embedded in the template notebook metadata, and checked by
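
The mechanism and its main limitation can be illustrated with a short sketch. This is not nbrun's code: the helper names `args_to_code` and `insert_after_first_cell` are hypothetical, the notebook is modeled as a plain dict in the Jupyter JSON layout rather than through a library, and the `ast.literal_eval` round-trip is only one assumed way to test whether a value has a complete literal representation.

```python
import ast
import keyword


def args_to_code(arguments):
    """Build Python source that assigns one variable per argument.

    Hypothetical helper, not part of nbrun. A value is accepted only if
    its repr() can be parsed back into an equal object.
    """
    lines = []
    for name, value in arguments.items():
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"{name!r} is not a valid variable name")
        literal = repr(value)
        try:
            restored = ast.literal_eval(literal)
        except (ValueError, SyntaxError) as err:
            raise ValueError(
                f"argument {name!r} has no literal representation"
            ) from err
        if restored != value:
            raise ValueError(f"argument {name!r} does not round-trip")
        lines.append(f"{name} = {literal}")
    return "\n".join(lines)


def insert_after_first_cell(notebook, source):
    """Return a copy of `notebook` with a new code cell at position 1.

    `notebook` is assumed to be a dict with a "cells" list, as in the
    notebook JSON layout. The first cell keeps the default values and
    the inserted cell overrides them.
    """
    cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }
    cells = list(notebook["cells"])
    cells.insert(1, cell)
    return {**notebook, "cells": cells}


# Illustrative template: defaults in the first cell, analysis afterwards.
template = {
    "cells": [
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": "filename = 'default.csv'"},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": "print(filename)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

code = args_to_code({"filename": "data_01.csv", "ids": [1, 2]})
parametrized = insert_after_first_cell(template, code)
```

Strings, numbers, lists and dicts pass the check, while an open file handle or an arbitrary class instance is rejected before the notebook is touched. Executing and saving the resulting notebook is left out of the sketch.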

Development

nbrun contains only a single function At the moment you have to copy the file

Some users asked about future development. I don't think there is much functionality left to be added. So, I don't foresee big changes in the future except for compatibility fixes and small API tweaks.

Alternatives

Runipy was the first tool to provide batch execution of notebooks, even before Jupyter was born from IPython. Nowadays most of its functionality is included in nbconvert. With runipy you can pass arguments to notebooks only through environment variables.

nbparameterise uses AST instead of eval to pass arguments (which is a safer and more robust choice). However, the code is much more complex and, last time I checked, it didn't support passing list or dict as arguments. I thought about improving nbparameterise, but in the end I didn't find the time since nbrun was already working for my use case.