This content originally appeared on Level Up Coding - Medium and was authored by José Fernando Costa
You can now reference unpublished Spark notebooks in Azure Synapse Analytics!
Previously, you could only reference a notebook if you had already published it to the Synapse workspace. In other words, if the notebook existed only in the Git repository, you could not call it from another notebook, so you had to put all the code into a single notebook.
Now you can reference notebooks that only exist in the Git repository by toggling an option in the “parent” notebook. This is located in the notebook settings, as seen below.
With this option active, the “Main Notebook” notebook in the screenshot is now able to reference other notebooks that exist only in the Git repository.
“How do I actually reference another notebook?” Glad you asked. Let’s pretend we have a notebook with one cell to define the add_one function below. This notebook will be called “testNB” and stored inside the “TestFolder” folder.
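A minimal sketch of what that single cell in “testNB” could contain, assuming a Python (PySpark) notebook. Only the name add_one and its behaviour come from the description here; the parameter name and type hints are assumptions.

```python
# Sole cell of the "testNB" notebook (stored in "TestFolder").
# The parameter name and type hints are illustrative assumptions.
def add_one(number: int) -> int:
    """Return the given number incremented by one."""
    return number + 1
```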
Pass the function a number and it returns that number incremented by one. Now, what if we want to call add_one inside “Main Notebook”, which can reference unpublished notebooks? That is easy too.
The %run magic command is what you need to reference (that is, run) one notebook from inside another. The notebook editor will even provide IntelliSense to help you reference the notebook name properly.
You can’t run more code in the same cell as %run, so the call to add_one is done in the second cell. Two plus one is three, and so is the output of the cell. After running “testNB”, add_one became available in the Spark session of “Main Notebook”. This was a basic example, but you can imagine how powerful it is to be able to split code into multiple notebooks without the need to publish it to the workspace beforehand :)
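Put together, the two cells of “Main Notebook” might look roughly like this. This is a sketch rather than a copy of the original cells: the exact path form (with or without a leading slash) is an assumption, so let the editor's suggestions guide the notebook path.

```
Cell 1:
%run /TestFolder/testNB

Cell 2:
add_one(2)
```

As described above, the second cell outputs 3 once “testNB” has run in the same Spark session.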
If you’re wondering how Synapse chooses the notebook version to run, this is how the documentation explains it:
- If unpublished notebook reference is disabled, always run published version
- If enabled, priority is: edited / new > committed > published
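Those two rules can be sketched as a small lookup. This is only an illustration of the priority order, not how Synapse implements it: pick_version, the version labels, and returning None when nothing matches are all hypothetical.

```python
from typing import Optional, Set

# Priority order when unpublished references are enabled, highest first.
# "edited" stands for the "edited / new" state in the rule above.
PRIORITY = ("edited", "committed", "published")


def pick_version(available: Set[str], unpublished_reference_enabled: bool) -> Optional[str]:
    """Return which version of a referenced notebook would run (hypothetical helper)."""
    if not unpublished_reference_enabled:
        # Toggle off: only the published version is ever considered.
        return "published" if "published" in available else None
    # Toggle on: the first available version in priority order wins.
    for version in PRIORITY:
        if version in available:
            return version
    # Assumption: nothing to run is modelled as None here.
    return None
```

For example, pick_version({"committed", "published"}, True) picks "committed", while the same set with the toggle off picks "published".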
For all the details on this feature, please refer to the official documentation https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-development-using-notebooks#reference-unpublished-notebook
Run unpublished Spark notebooks in Azure Synapse was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.
José Fernando Costa | Sciencx (2022-05-12T11:39:15+00:00) Run unpublished Spark notebooks in Azure Synapse. Retrieved from https://www.scien.cx/2022/05/12/run-unpublished-spark-notebooks-in-azure-synapse/