Data science projects

A typical data science project starts with the idea of learning from a set of data and making a prediction. A lot of the upfront work goes into data collection, data cleaning, data analysis, and visualization. Then, the data is further digested into features that serve as inputs to a machine learning model. The process up until this point is called data engineering. The data scientist then chooses one or more machine learning models and keeps refining and tuning them to arrive at a good level of accuracy for the predictive model. This process is called model development. When the model is ready for production, it is deployed, and sometimes a frontend is created for the end user. This final process is referred to as model deployment.
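The three stages can be sketched as plain functions. This is a minimal illustration with hypothetical data and a hand-rolled linear fit; a real project would use libraries such as pandas and scikit-learn.

```python
def clean(rows):
    """Data engineering: drop incomplete records."""
    return [r for r in rows if r["hours"] is not None]

def featurize(rows):
    """Data engineering: turn cleaned records into (feature, label) pairs."""
    return [(r["hours"], r["score"]) for r in rows]

def fit(pairs):
    """Model development: fit score = a * hours + b by least squares."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda hours: a * hours + b

# Hypothetical raw data: study hours versus exam score.
raw = [
    {"hours": 1.0, "score": 52.0},
    {"hours": 2.0, "score": 61.0},
    {"hours": None, "score": 40.0},  # incomplete record, dropped by clean()
    {"hours": 3.0, "score": 70.0},
]
model = fit(featurize(clean(raw)))
print(round(model(4.0), 1))  # → 79.0
```

The point is the shape of the workflow: raw records flow through cleaning and featurization before any model sees them, and the trained model is just a callable that the deployment stage can later wrap.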

The data engineering and model development processes can be interactive at the beginning, but they usually end up getting automated. That's because the process needs to be repeatable and the results have to be consistent. Data scientists may use a variety of tools during development, ranging from a number of Jupyter notebooks to a suite of related libraries and programs.
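One way to make the process repeatable is to collect the interactive steps into a single pipeline function with all sources of randomness pinned down. The function and data below are illustrative, not from any particular library:

```python
import random

def run_pipeline(raw_rows, seed=0):
    """Run every stage with fixed settings so reruns give identical results."""
    rng = random.Random(seed)                      # fixed seed -> reproducible
    rows = [r for r in raw_rows if r is not None]  # cleaning step
    rng.shuffle(rows)                              # e.g. ordering for a split
    train = rows[: int(0.8 * len(rows))]           # deterministic 80% split
    return train

data = list(range(10)) + [None]
# Two runs with the same seed produce exactly the same training set.
assert run_pipeline(data) == run_pipeline(data)
```

Seeding the random number generator and parameterizing the split are small choices, but they are what turn an exploratory notebook into something whose results can be reproduced and automated.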

When a predictive model becomes production-ready, it can be deployed as a web service so that it can be used to make real-time predictions. At this point, the model needs to have a life cycle and be maintained, just like any other production software.
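As a sketch of that deployment step, the snippet below wraps a stand-in model in a tiny HTTP prediction service using only the standard library. The model and endpoint are hypothetical; production deployments would typically use a framework such as Flask or FastAPI behind a proper application server.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def model(hours):
    """Stand-in for a trained model (hypothetical linear fit)."""
    return 9.0 * hours + 43.0

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and return a JSON prediction.
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        reply = json.dumps({"prediction": model(body["hours"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence per-request logging
        pass

# Port 0 asks the OS for any free port; serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: a real-time prediction request against the running service.
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/predict",
    data=json.dumps({"hours": 4.0}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
print(result)  # → {'prediction': 79.0}
server.shutdown()
```

Once the model sits behind an HTTP endpoint like this, it has a life cycle of its own: the service must be versioned, monitored, and redeployed when the model is retrained, just like any other production software.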
