The architecture of the project

For a simple e-commerce site these days, the following entities are obvious:

Products/items and their metadata
Customers/users and their metadata
News items/blog posts related to the products or from editorial
User reviews associated with products
User ratings associated with products

Other than that, there are many more systems required to effectively build a complete e-commerce enterprise. The following diagram highlights them:

Because our objective is to build a recommender system and not a complete e-commerce site, we will narrow our focus to a minimum set of requirements. We will need the following software components:

Persistent/structured data storage
A queuing mechanism
Search support

For data storage, we can use MongoDB, which we covered in a previous chapter. Because MongoDB is a NoSQL document store, we need to design the schema carefully so that different entities, such as users and reviews, can be joined to form a complete product profile.
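To make the schema concern concrete, here is a minimal sketch of an application-side join. It uses plain Python dictionaries in place of MongoDB documents; the field names (`product_id`, `user_id`, `rating`) and the helper `build_product_profile` are hypothetical and chosen only for illustration. With a real MongoDB deployment, a similar join could be expressed with the `$lookup` aggregation stage instead.

```python
# Hypothetical documents, shaped as they might sit in three separate collections.
products = [{"_id": "p1", "name": "Trail Shoes", "category": "footwear"}]
users = [{"_id": "u1", "name": "Asha"}, {"_id": "u2", "name": "Ben"}]
reviews = [
    {"product_id": "p1", "user_id": "u1", "rating": 5, "text": "Great grip"},
    {"product_id": "p1", "user_id": "u2", "rating": 3, "text": "Runs small"},
]


def build_product_profile(product, reviews, users_by_id):
    """Join one product with its reviews and the reviewers' names."""
    own = [r for r in reviews if r["product_id"] == product["_id"]]
    ratings = [r["rating"] for r in own]
    return {
        "product_id": product["_id"],
        "name": product["name"],
        "review_count": len(own),
        "avg_rating": sum(ratings) / len(ratings) if ratings else None,
        "reviews": [
            {
                "user": users_by_id[r["user_id"]]["name"],
                "rating": r["rating"],
                "text": r["text"],
            }
            for r in own
        ],
    }


# Index users by _id once so each review resolves its author in O(1).
users_by_id = {u["_id"]: u for u in users}
profile = build_product_profile(products[0], reviews, users_by_id)
```

The join works only because every review carries both a `product_id` and a `user_id`. Deciding which references each document stores is the part of the schema design that needs care.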

A queuing mechanism is used to process the data as it arrives in a streaming fashion. It also matters when several independent components, such as a search indexer, a recommendation engine, and an e-mailer service, all want to process the data in parallel. We can use Apache Kafka for this purpose; since we already covered it in a previous chapter, let's stick to that.
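The fan-out idea, in which every independent component receives every event, can be sketched with Python's standard library alone. This is an in-memory stand-in and not Kafka's API: the `Topic` class, the consumer names, and the event shape are all assumptions made for illustration.

```python
import queue
import threading


class Topic:
    """In-memory stand-in for a topic: every subscriber receives every event."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        inbox = queue.Queue()
        self._subscribers.append(inbox)
        return inbox

    def publish(self, event):
        for inbox in self._subscribers:
            inbox.put(event)


STOP = object()  # sentinel telling a consumer to shut down


def run_consumer(name, inbox, handled, lock):
    """Drain one subscriber's inbox until the STOP sentinel arrives."""
    while True:
        event = inbox.get()
        if event is STOP:
            break
        with lock:
            handled.append((name, event["type"]))


topic = Topic()
handled, lock = [], threading.Lock()
# All three consumers subscribe before anything is published.
workers = [
    threading.Thread(target=run_consumer, args=(name, topic.subscribe(), handled, lock))
    for name in ("search_indexer", "recommender", "emailer")
]
for worker in workers:
    worker.start()

topic.publish({"type": "review_added", "product_id": "p1"})
topic.publish(STOP)
for worker in workers:
    worker.join()
```

Each consumer has its own inbox, so a slow e-mailer would not hold up the search indexer. That decoupling is the property the queuing layer is meant to provide.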

For search, we can use a popular search technology such as Elasticsearch or Apache Solr, or just plain Apache Lucene. Although MongoDB also supports search queries, its search support is not as extensive as that of Elasticsearch or Apache Solr. To set up Elasticsearch (or Apache Solr), you can refer to their project pages:

Elasticsearch: https://www.elastic.co/
Apache Solr: https://lucene.apache.org/solr/

We will also go through Elasticsearch setup in Chapter 7, Enhancing the User Experience.

The following is the architecture of the application that we will build:

Batch versus online

As shown in the architecture diagram of the application, the data from the different interactions happening in the system is captured as soon as they take place. It is routed via the queuing mechanism to the storage and indexing components (we have ignored other components, such as e-mail and payment, for now).

When this data finally reaches the recommender system component, the recommender system either learns from it instantly (online recommendations) or waits for a specified time, perhaps hours or days, and then regenerates recommendations (batch processing, also called batching). Batching can be appropriate either because not enough data is available yet, so it makes no sense to run a recommender algorithm, or because the recommender algorithm is itself very expensive.
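The difference between the two modes can be illustrated with a minimal sketch. Both classes below are hypothetical, and a simple popularity count stands in for a real recommender algorithm. The batch variant is triggered by a buffer size rather than by elapsed time, purely to keep the example deterministic.

```python
from collections import Counter


class OnlineRecommender:
    """Updates its model on every event, so recommendations are always current."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, event):
        self.counts[event["product_id"]] += 1

    def top(self, n=1):
        return [product for product, _ in self.counts.most_common(n)]


class BatchRecommender:
    """Buffers events and rebuilds its model only once a full batch has arrived."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
        self.counts = Counter()

    def observe(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.rebuild()

    def rebuild(self):
        # The (possibly expensive) recomputation happens here, once per batch.
        self.counts.update(e["product_id"] for e in self.buffer)
        self.buffer.clear()

    def top(self, n=1):
        return [product for product, _ in self.counts.most_common(n)]


events = [{"product_id": p} for p in ["p1", "p2", "p2"]]
online, batch = OnlineRecommender(), BatchRecommender(batch_size=4)
for event in events:
    online.observe(event)
    batch.observe(event)
```

After these three events the online recommender already reflects them, while the batch recommender still returns nothing because its buffer has not filled. It catches up only when the fourth event triggers a rebuild, trading freshness for fewer recomputations.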
