Creating the REST API

The server of EveryPlantSelectionApp is responsible for retrieving the plant names (the plant families, genera, and species) and making them available to our client-side code via a simple REST API. To do this, we can use the express Node.js library, which lets us route HTTP requests to specific functions and easily deliver JSON to our client.

Here are the skeletal beginnings of our server implementation:

import express from 'express';

const app = express();
const port = process.env.PORT || 3000;

app.get('/plants/:query', (req, res) => {
  req.params.query; // => The query
  res.json({
    fakeData: 'We can later place some real data here...'
  });
});

app.listen(
  port,
  () => console.log(`App listening on port ${port}!`)
);

As you can see, we're implementing just one route (/plants/:query). This will be requested by the client whenever a user enters a partial plant name into the <input/>, so that a user typing Carduaceus may produce the following set of requests to the server:

GET /plants/c
GET /plants/ca
GET /plants/car
GET /plants/card
GET /plants/cardu
GET /plants/cardua
...

You can imagine how this may result in a large number of expensive and possibly redundant requests, especially if a user is typing quickly. It's possible that a user will type cardua before any of the previous requests can complete. For that reason, when we come around to implementing the client side, it'll be appropriate for us to use some kind of request throttling (or request debouncing) to ensure that we're only making a reasonable number of requests.

Request throttling is the act of reducing the overall number of requests by only allowing a new request to be performed at a specified time interval, meaning that 100 requests spread over five seconds, throttled to an interval of one second, would produce only five requests. Request debouncing is similar, though instead of performing a single request on every interval, it waits a predesignated amount of time for incoming requests to stop being made before enacting an actual request. So, 100 requests made in quick succession over five seconds, debounced by one second, would produce only a single final request, about one second after the last of them (around the six-second mark).
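To make the distinction concrete, here is a minimal sketch of both techniques in plain JavaScript. The throttle and debounce helpers are hypothetical names written for illustration; they are not taken from any library or from this project's code. The throttle shown is the simple leading-edge variant, which runs the first call and drops the rest until the interval has passed:

```javascript
// Throttle (leading edge): run fn at most once per `interval` milliseconds.
// Calls that arrive too soon after the last run are simply dropped.
function throttle(fn, interval) {
  let lastRun = -Infinity;
  return (...args) => {
    const now = Date.now();
    if (now - lastRun >= interval) {
      lastRun = now;
      fn(...args);
    }
  };
}

// Debounce: run fn only once `wait` milliseconds have passed
// without another call. Each new call restarts the timer.
function debounce(fn, wait) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), wait);
  };
}

// Illustrative usage: simulate a user typing quickly.
const requestPlants = query => console.log(`GET /plants/${query}`);
const debouncedRequest = debounce(requestPlants, 50);

debouncedRequest('c');
debouncedRequest('ca');
debouncedRequest('car'); // only this call results in a request
```

With the debounced version, the three rapid calls collapse into a single request for the last value typed, which is usually the behavior we want for autosuggestion.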

In order to implement the /plants/ endpoint, we need to consider an efficient way to search through the names of over 300,000 different plant species for matches. To accomplish this, we'll be using a special in-memory data structure called a trie. This is also known as a prefix tree and is commonly used in situations where autosuggestion or autocompletion needs to occur.

A trie is a tree-like structure that stores chunks of letters that appear next to each other as a series of nodes attached by branches. It's much easier to visualize than to describe, so let's imagine that we need a trie based on the following data:

['APPLE', 'ACORN', 'APP', 'APPLICATION']

Using that data, the produced trie might look something like this:

A
├─ CORN
└─ PP
   └─ L
      ├─ E
      └─ ICATION

As you can see, our dataset of four words has been represented as a tree-like structure where the first common letter, "A", serves as the root. The "CORN" suffix branches off from this. Additionally, the "PP" branch (forming "APP"), branches off, and the last "P" of that then branches off to "L", which itself then branches off to "E" (forming "APPLE") and "ICATION" (forming "APPLICATION").

This may seem convoluted, but with this trie structure, we can take an initial prefix typed by a user, such as "APPL", and easily find all matching words ("APPLE" and "APPLICATION") by simply stepping through the nodes of the tree. The lookup cost depends on the length of the prefix and the number of matches rather than on the size of the whole dataset, which makes it far faster than a linear search through every name. For our purposes, given a prefix of a plant name, we want to be able to quickly display every plant name that the prefix may lead to.
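The lookup described above can be sketched in a few lines of JavaScript. This is a simplified illustration rather than the implementation we'll use later: the Trie class and its insert and withPrefix methods are hypothetical names, and for simplicity it stores one letter per node instead of the compressed chunks of letters described above.

```javascript
// A minimal, uncompressed trie: one character per node.
class Trie {
  constructor() {
    this.root = { children: new Map(), isWord: false };
  }

  // Walk down the tree, creating nodes as needed, and mark the end of the word.
  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isWord: false });
      }
      node = node.children.get(ch);
    }
    node.isWord = true;
  }

  // Step through the prefix, then collect every word beneath that node.
  withPrefix(prefix) {
    let node = this.root;
    for (const ch of prefix) {
      node = node.children.get(ch);
      if (!node) return [];
    }
    const results = [];
    const walk = (current, word) => {
      if (current.isWord) results.push(word);
      for (const [ch, child] of current.children) {
        walk(child, word + ch);
      }
    };
    walk(node, prefix);
    return results;
  }
}

const exampleTrie = new Trie();
['APPLE', 'ACORN', 'APP', 'APPLICATION'].forEach(w => exampleTrie.insert(w));

console.log(exampleTrie.withPrefix('APPL')); // [ 'APPLE', 'APPLICATION' ]
console.log(exampleTrie.withPrefix('B'));    // []
```

Note that finding the starting node only takes as many steps as there are letters in the prefix, regardless of how many words the trie holds.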

Our specific dataset will include over 300,000 different plant species, but for the purposes of this case study, we'll only be using species from the Acanthaceae family, which amounts to around 8,000 species. These are available to use in the form of JSON as follows:

[
  { id: 105,
    family: 'Acanthaceae',
    genus: 'Andrographis',
    species: 'alata' },
  { id: 106,
    family: 'Acanthaceae',
    genus: 'Justicia',
    species: 'alata' },
  { id: 107,
    family: 'Acanthaceae',
    genus: 'Pararuellia',
    species: 'alata' },
  { id: 108,
    family: 'Acanthaceae',
    genus: 'Thunbergia',
    species: 'alata' },
  // ...
]

We'll be feeding this data into a third-party trie implementation called trie-search on NPM. This package has been selected because it fulfills our requirements and seems like a well-tested and well-maintained library.

In order for the trie to operate as we desire, we'll need to concatenate the family, genus, and species of each plant into a single string. This enables the trie to include both the fully qualified plant name (for example, "Acanthaceae Pararuellia alata") and the split names (["Acanthaceae", "Pararuellia", "alata"]). The split name is automatically generated by the trie implementation we're using (meaning it splits strings on whitespace, via the regex /\s/g):

const trie = new TrieSearch(['name'], {
  ignoreCase: true // Make it case-insensitive
});

trie.addAll(
  data.map(({ family, genus, species, id }) => {
    return { name: family + ' ' + genus + ' ' + species, id };
  })
);

The preceding code enters our dataset into the trie. Following this, it can be queried by simply passing a prefix string to its get(...) method:

trie.get('laxi');

Such a query (for the prefix, laxi) would return the following from our dataset:

[
  { id: 203,
    name: 'Acanthaceae Acanthopale laxiflora' },
  { id: 809,
    name: 'Acanthaceae Andrographis laxiflora' },
  { id: 390,
    name: 'Acanthaceae Isoglossa laxiflora' },
  //... (many more)
]
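The reason a prefix such as laxi can match the third word of a name is that each name is indexed under its individual words as well as under the whole string. The following is only a rough illustration of that idea in plain JavaScript. It is not how trie-search works internally (it performs a linear check, which is exactly what the trie lets us avoid), and matchesAnyWordPrefix is a hypothetical helper:

```javascript
// Hypothetical helper: does `prefix` start any whitespace-separated
// word of `name`, ignoring case?
function matchesAnyWordPrefix(name, prefix) {
  const needle = prefix.toLowerCase();
  return name
    .split(/\s+/)
    .some(word => word.toLowerCase().startsWith(needle));
}

const sampleNames = [
  'Acanthaceae Acanthopale laxiflora',
  'Acanthaceae Justicia alata'
];

console.log(sampleNames.filter(name => matchesAnyWordPrefix(name, 'laxi')));
// [ 'Acanthaceae Acanthopale laxiflora' ]
```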

So, with regard to our REST endpoint, /plants/:query, all it needs to do is return a JSON payload that contains whatever we get from trie.get(query):

app.get('/plants/:query', (req, res) => {
  const queryString = req.params.query;
  if (queryString.length < 3) {
    return res.json([]);
  }
  res.json(
    trie.get(queryString)
  );
});

To separate our concerns a little better and to ensure we're not mixing too many different layers of abstraction (in possible violation of The Law of Demeter), we can abstract away our trie data structure and plant data to a module of its own. We can call this plantData to communicate the fact that it encapsulates and provides access to the plant data. The nature of how it works, which happens to be via an in-memory trie data structure, does not need to be known to its consumers:

// server/plantData.js

import TrieSearch from 'trie-search';
import plantData from './data.json';

const MIN_QUERY_LENGTH = 3;

const trie = new TrieSearch(['fullyQualifiedName'], {
  ignoreCase: true
});

trie.addAll(
  plantData.map(plant => {
    return {
      ...plant,
      fullyQualifiedName:
        `${plant.family} ${plant.genus} ${plant.species}`
    };
  })
);

export default {
  query(partialString) {
    if (partialString.length < MIN_QUERY_LENGTH) {
      return [];
    }
    return trie.get(partialString);
  }
};

As you can see, this module exports an interface that provides one method, query(), which our main HTTP routing code can utilize to deliver the JSON result for /plants/:query:

//...
import plantData from './plantData';
//...
app.get('/plants/:query', (req, res) => {
  const query = req.params.query;
  res.json( plantData.query(query) );
});
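Because a query can contain spaces (for example, Acanthaceae Thunbergia), whatever calls this route needs to URL-encode the query before placing it in the path; Express decodes route parameters before exposing them on req.params. Here is a small sketch of how a caller might build the URL, where plantsUrl is a hypothetical helper rather than part of the project:

```javascript
// Hypothetical helper: build the request path for a (possibly multi-word) query.
function plantsUrl(query) {
  return `/plants/${encodeURIComponent(query)}`;
}

console.log(plantsUrl('Acanthaceae Thunbergia'));
// /plants/Acanthaceae%20Thunbergia
```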

Because we have isolated and contained the plant-querying functionality, it is now far easier to make assertions about it. Writing some tests that target the plantData abstraction will give us a high level of confidence that our HTTP layer is using a reliable abstraction, minimizing the potential bugs that can crop up within our HTTP layer itself.

At this point, since this is the first set of tests we'll be writing for our project, we'll be installing Jest (npm install jest --save-dev). There are a large number of testing frameworks available, with varying styles, but for our purposes, Jest is suitable.

We can write tests for our plantData module in a file intuitively located alongside it and named plantData.test.js:

import plantData from './plantData';

describe('plantData', () => {

  describe('Family+Genus name search (Acanthaceae Thunbergia)', () => {
    it('Returns plants with family and genus of "Acanthaceae Thunbergia"', () => {
      const results = plantData.query('Acanthaceae Thunbergia');
      expect(results.length).toBeGreaterThan(0);
      expect(
        results.filter(plant =>
          plant.family === 'Acanthaceae' &&
          plant.genus === 'Thunbergia'
        )
      ).toHaveLength(results.length);
    });
  });

});

There are a large number of tests within plantData.test.js that aren't included here for the sake of brevity; however, you can view them in the GitHub repository: https://github.com/PacktPublishing/Clean-Code-in-JavaScript.

As you can see, this test is asserting whether an Acanthaceae Thunbergia query intuitively returns plants that have a fully qualified name containing these terms. In our dataset, this will only include plants that have an Acanthaceae family and a Thunbergia genus, so we can simply confirm that the results match that expectation. We can also check that partial searches, such as Acant Thun, also intuitively return any plants that have either family, genus, or species names beginning with Acant or Thun:

describe('Partial family & genus name search (Acant Thun)', () => {
  it('Returns plants that have a fully-qualified name containing both "Acant" and "Thun"', () => {
    const results = plantData.query('Acant Thun');
    expect(results.length).toBeGreaterThan(0);
    expect(
      results.filter(plant =>
        /\bAcant/i.test(plant.fullyQualifiedName) &&
        /\bThun/i.test(plant.fullyQualifiedName)
      )
    ).toHaveLength(results.length);
  });
});

We confirm our expectations here by asserting that every returned result's fullyQualifiedName matches the regular expressions /\bAcant/i and /\bThun/i. The i flag makes the match case-insensitive. The \b here represents a word boundary so that we can ensure that the Acant and Thun substrings appear at the beginning of individual words and are not embedded within words. For example, imagine a plant called Luathunder. We don't want our autosuggestion mechanism to match such instances. We only want it to match prefixes, as that is how users will be entering plant families, genera, or species into <input /> (from the start of each word).
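The effect of a word boundary (written \b in a JavaScript regular expression) is easy to check in isolation. The following short sketch uses the hypothetical Luathunder name from above alongside a real name from our dataset; the variable names are illustrative only:

```javascript
const withBoundary = /\bThun/i;   // "Thun" must start a word
const withoutBoundary = /Thun/i;  // "Thun" may appear anywhere

console.log(withBoundary.test('Acanthaceae Thunbergia alata')); // true
console.log(withBoundary.test('thunbergia'));                   // true
console.log(withBoundary.test('Luathunder'));                   // false
console.log(withoutBoundary.test('Luathunder'));                // true
```

Without the boundary, the assertion would pass even if the search wrongly returned names that merely contain the typed letters somewhere in the middle of a word.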

Now that we have a well-tested and isolated server-side architecture, we can begin to move on to the client side, where we will be rendering the plant names provided by /plants/:query in response to the user typing.
