Notebooks and folders

A Notebook is a special type of Databricks folder that can be used to create Spark scripts. Notebooks can call other Notebook scripts to create a hierarchy of functionality. When a Notebook is created, its type must be specified (Python, Scala, or SQL), and a cluster can then be specified for the Notebook functionality to run against. The following screenshot shows the Notebook creation.

Note that a menu option, to the right of a Notebook session, allows the type of Notebook to be changed. The following example shows that a Python Notebook can be changed to Scala, SQL, or Markdown:

Note that a Scala Notebook cannot be changed to Python, and a Python Notebook cannot be changed to Scala. The terms Python, Scala, and SQL are well understood as development languages; Markdown, however, is new. Markdown allows formatted documentation to be created from formatting commands embedded in plain text. A simple reference can be found at https://forums.databricks.com/static/markdown/help.html.

This means that formatted comments can be added to the Notebook session as scripts are created. Notebooks are further subdivided into cells, which contain the commands to be executed. Cells can be moved within a Notebook by hovering over the top-left corner, and dragging them into position. New cells can be inserted into a cell list within a Notebook.

Using the %sql command within a Scala or Python Notebook cell allows SQL syntax to be used in that cell. Similarly, using the %md command allows Markdown comments to be added within a cell. Typically, the key combination of Shift + Enter causes the text block in a Notebook cell to be executed. Comments can also be added to a Notebook cell. The menu options available at the top-right section of a Notebook cell, shown in the following screenshot, include the comment option, as well as the minimize and maximize options:

Multiple web-based sessions may share a Notebook. The actions that occur within the Notebook will be propagated to each web interface viewing it. Also, the Markdown and comment options can be used to enable communication between users, which aids interactive data investigation within a distributed group.
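The effect of the %sql and %md prefixes can be pictured as a lookup on the first token of a cell. The following is a minimal Python sketch of that idea, not Databricks code: the function name cell_language, the KNOWN_MAGICS table, and the sample cells (including the sales table) are hypothetical and exist only for illustration.

```python
# Toy model of how a leading command such as %sql or %md selects the
# language of a single cell. Illustrative only; this is NOT Databricks code.

# Only the two prefixes named in the text are modelled here.
KNOWN_MAGICS = {"%sql": "sql", "%md": "markdown"}


def cell_language(cell_text, notebook_language):
    """Return (language, body) for one cell.

    If the first token of the cell is a known prefix, it overrides the
    Notebook's default language for that cell; otherwise the default applies.
    """
    parts = cell_text.lstrip().split(None, 1)
    first_token = parts[0] if parts else ""
    if first_token in KNOWN_MAGICS:
        body = parts[1] if len(parts) > 1 else ""
        return KNOWN_MAGICS[first_token], body
    return notebook_language, cell_text


if __name__ == "__main__":
    # Hypothetical cells from a Python Notebook.
    cells = [
        "%md # Daily load\nRow counts by region.",
        "%sql SELECT region, COUNT(*) FROM sales GROUP BY region",
        "print('hello')",
    ]
    for cell in cells:
        language, _ = cell_language(cell, "python")
        print(language)
```

In this sketch, a cell without a recognized prefix simply falls back to the Notebook's own type, which mirrors the behavior described above for Scala and Python Notebooks.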

The previous screenshot shows the header of a Notebook session for notebook1. It shows the Notebook name and type (Scala). It also shows the option to lock the Notebook to make it read-only, as well as the option to detach it from its cluster. The following screenshot shows the creation of a folder within a Notebook workspace:

A drop-down menu, from the Workspace main menu option, allows for the creation of a folder, in this case named folder1. Later sections will describe the other options in this menu. Once the folder is created and selected, a drop-down menu from the new folder1 shows the actions associated with it, as in the following screenshot:

So, a folder can be exported to a DBC archive. It can be locked, or cloned to create a copy. It can also be renamed, or deleted. Items can be imported into it; for instance, files, which will be explained by example later. Also, new notebooks, dashboards, libraries, and folders can be created within it.

Just as actions can be carried out against a folder, a Notebook has a set of possible actions. The following screenshot shows the actions available via a drop-down menu for the Notebook called notebook1, which is currently attached to the running cluster called semclust1. A Notebook can be renamed, deleted, locked, or cloned. It can be detached from its current cluster, or attached if it is detached. It can also be exported to a file or a DBC archive.

From the folder Import option, files can be imported to a folder. The following screenshot shows the file drop-option window that is invoked if this option is selected. It is possible to either drop a file onto the upload pane from the local server, or click on this pane to open a navigation browser to search the local server for files to upload.

Note that uploaded files need to be of a specific type. The following screenshot, taken from the file browser when browsing for a file to upload, shows the supported file types. These are Scala, SQL, and Python files, as well as DBC archives and JAR file libraries.
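Because only certain file types are accepted, it can be convenient to check a batch of local files before attempting an upload. The following is a minimal Python sketch of such a check. The text names the supported types but not their file extensions, so the extensions in SUPPORTED_EXTENSIONS are assumptions, and the helper names is_importable and split_by_support are hypothetical.

```python
import os

# ASSUMPTION: conventional extensions for the file types listed in the text
# (Scala, SQL, Python, DBC archive, JAR library). The source names the types
# only, not the extensions.
SUPPORTED_EXTENSIONS = {".scala", ".sql", ".py", ".dbc", ".jar"}


def is_importable(filename):
    """Return True if the file name ends in one of the assumed extensions."""
    _, extension = os.path.splitext(filename)
    return extension.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition file names into (importable, rejected), preserving order."""
    importable = [name for name in filenames if is_importable(name)]
    rejected = [name for name in filenames if not is_importable(name)]
    return importable, rejected


if __name__ == "__main__":
    # Hypothetical local files selected for upload.
    candidates = ["etl.scala", "report.sql", "notes.txt", "utils.py", "lib.jar"]
    ok, skipped = split_by_support(candidates)
    print("importable:", ok)
    print("rejected:", skipped)
```

The check is by extension only, so it says nothing about whether the file contents are valid; it simply filters out files that the upload dialog would not offer.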

Before leaving this section, it should also be noted that Notebooks and folders can be dragged and dropped to change their position. The next section will examine Databricks jobs and libraries via simple worked examples.
