Cookiecutter

Another tool we find useful is Cookiecutter. In a nutshell, this is a templating engine for projects. There are two main scenarios where Cookiecutter can be useful.

First, if you regularly work on multiple projects with a similar structure or purpose, you can save time and frustration by creating a single project template. That template can include the folder structure, the project name, default files, Makefiles (https://krzysztofzuraw.com/blog/2016/makefiles-in-python-projects.html), proper .gitignore settings, and anything else you want. Variables can be injected into any text-based file, based on your answers and configuration. As an illustration, we adopted our own templates for our routine data analysis requests.
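To make the idea of variable injection concrete, here is a minimal sketch in plain Python. Cookiecutter itself renders templates with the Jinja2 engine; the regular expression below only imitates the simplest case of a {{ cookiecutter.name }} placeholder, and the template text and answers are hypothetical.

```python
import re

# Matches placeholders of the form {{ cookiecutter.some_name }}.
# This imitates only the simplest Jinja2 expression; it is not Cookiecutter's code.
PLACEHOLDER = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def render(text, context):
    """Replace every placeholder in text with its value from context."""

    def substitute(match):
        key = match.group(1)
        if key not in context:
            raise KeyError(f"no value supplied for template variable {key!r}")
        return str(context[key])

    return PLACEHOLDER.sub(substitute, text)


# Hypothetical template content and answers, for illustration only.
readme_template = (
    "# {{ cookiecutter.project_name }}\n"
    "\n"
    "Maintained by {{ cookiecutter.author_name }}.\n"
)
answers = {"project_name": "Quarterly Churn Request", "author_name": "Philipp Kats"}

rendered = render(readme_template, answers)
print(rendered)
```

The same substitution is applied to file and folder names as well as file contents, which is why a single set of answers can shape the whole project.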

Second (and specific to programming), if you are building a web app, package, library, or anything else based on a particular framework or stack, you will benefit from reusing a single, thoroughly designed structure. There are dozens of pre-designed templates available for you to use (http://cookiecutter-templates.sebastianruml.name/). In fact, many framework and tool developers publish such a template themselves, both for their own benefit and to ease adoption for new users.

Best of all, Cookiecutter itself is language-agnostic (hence, there are plenty of templates for projects in Go, Kotlin, and other languages), but it is written in Python, so it is easy to modify and extend.

Now, let's walk through a basic example. First, let's set up our configuration file. Its default location is ~/.cookiecutterrc. You can point the tool to another location by setting the COOKIECUTTER_CONFIG environment variable, but for now, let's stick with the default one. The following is a simple template for the configuration file:

default_context:
    author_name: "Philipp Kats"
    email: "[email protected]"
    github_username: "casyfill"

You can pass any value you want to this config file. These values will be used as default ones if a particular template has a corresponding variable in place.
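The two rules described so far (where the config file is looked up, and how its values become defaults) can be sketched in a few lines of Python. This is an illustration of the behavior as described here, not Cookiecutter's actual implementation; the function names and the template variables are hypothetical.

```python
import os
from pathlib import Path


def config_path(environ=None):
    """Return COOKIECUTTER_CONFIG if it is set, otherwise ~/.cookiecutterrc."""
    environ = os.environ if environ is None else environ
    override = environ.get("COOKIECUTTER_CONFIG")
    return Path(override) if override else Path.home() / ".cookiecutterrc"


def apply_defaults(template_vars, default_context):
    """Override template defaults, but only for variables the template declares."""
    merged = dict(template_vars)
    for key, value in default_context.items():
        if key in merged:
            merged[key] = value
    return merged


# Hypothetical variables that a template might declare, with its own defaults.
template_vars = {"project_name": "project_name", "author_name": "Your name"}
# Values taken from the default_context section of the config file (email omitted).
default_context = {"author_name": "Philipp Kats", "github_username": "casyfill"}

merged = apply_defaults(template_vars, default_context)
print(config_path({}))  # falls back to the default location in the home folder
print(merged)
```

In this sketch, github_username is ignored because the hypothetical template does not declare it, while author_name replaces the template's placeholder default.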

Once a config file is in place, let's create a new project from a template. As has already been mentioned, there are a vast number of templates for different cases; most of them are stored as repositories – the tool can use both public and private repositories. One that is particularly popular among the data community is Cookiecutter Data Science (https://drivendata.github.io/cookiecutter-data-science/) – a template for a general-purpose data science project. Let's give it a try. 

First, in the terminal, navigate to the folder where the project should be created. Next, type the following command and hit Enter:

cookiecutter https://github.com/drivendata/cookiecutter-data-science

At this point, the program will start hammering you with questions. Note that it will pick up your name from the configuration file. In the future, you can add any other settings, such as the S3 bucket name, to the config file. Once you're done with the questions, it will generate the project folder. Now, open the new folder in VS Code, using code <project_name>, and explore it!

As you can see, there are many files already. These include a README file with the project's full tree and the description you typed in during the questionnaire phase. Your name is in place in setup.py. The Makefile, a convenient interface for frequent command-line operations, knows the bucket you typed in and can upload all the data there. There are a few other pre-generated features as well.

Now, this template is large, and perhaps too complex for some projects. In fact, an entire web page (https://drivendata.github.io/cookiecutter-data-science/) is devoted to how to use it. But that does not mean that you have to adapt to it. Instead, you could clone their template and tailor it to your needs, or even build one of your own, from scratch. For example, we will discuss another excellent tool, DVC, in Chapter 14, Improving Your Model – Pipelines and Experiments. It seems reasonable to integrate it into this template.
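If you do decide to build a template from scratch, the minimum you need is a cookiecutter.json file listing the variables and their defaults, plus a folder whose name is itself a placeholder. The following sketch writes such a skeleton into a temporary folder; the variable names, default values, and README content are hypothetical choices for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_minimal_template(root):
    """Create a bare-bones template: cookiecutter.json plus one templated folder."""
    root.mkdir(parents=True, exist_ok=True)

    # Variables the user will be asked about, with their default values.
    variables = {"project_slug": "analysis_request", "author_name": "Your name"}
    (root / "cookiecutter.json").write_text(json.dumps(variables, indent=2))

    # The folder name is a placeholder; it is rendered into the project name.
    project_dir = root / "{{cookiecutter.project_slug}}"
    project_dir.mkdir(exist_ok=True)
    (project_dir / "README.md").write_text(
        "# {{ cookiecutter.project_slug }}\n\nAuthor: {{ cookiecutter.author_name }}\n"
    )
    return root


template_root = write_minimal_template(Path(tempfile.mkdtemp()) / "my-template")
for path in sorted(template_root.rglob("*")):
    print(path.relative_to(template_root))
```

Passing the path of the generated folder to the cookiecutter command, instead of a repository URL, should then produce a project from it. From there, you can grow the template one file at a time.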

The benefits of using templates may seem small at the start, but the returns are cumulative – the more you use a template, and the more features you add to it, the more value you'll get from it.
