The server

The core of our application is the server. Let's assume that we've already implemented image classification in the same way we did in the previous section, that is, by using a model saved as a TorchScript snapshot and loaded into the torch::jit::script::Module object. We encapsulated this functionality in the following class:

class Network {
 public:
  Network(const std::string& snapshot_path,
          const std::string& synset_path,
          torch::DeviceType device_type);

  std::string Classify(const at::Tensor& image);

 private:
  torch::DeviceType device_type_;
  Classes classes_;
  torch::jit::script::Module model_;
};

The following code shows an implementation of the main routine of our application:

#include <torch/script.h>
#include "network.h"
#include "third-party/httplib/httplib.h"
#include "utils.h"

int main(int argc, char** argv) {
  try {
    std::string snapshot_path;
    std::string synset_path;
    std::string www_path;
    std::string host = "localhost";
    int port = 8080;

    if (argc >= 4) {
      snapshot_path = argv[1];
      synset_path = argv[2];
      www_path = argv[3];
      if (argc >= 5)
        host = argv[4];
      if (argc >= 6)
        port = std::stoi(argv[5]);

      torch::DeviceType device_type = torch::cuda::is_available()
                                          ? torch::DeviceType::CUDA
                                          : torch::DeviceType::CPU;

      Network network(snapshot_path, synset_path, device_type);
      ...
      // HTTP service implementation
      ...
    } else {
      std::cout << "usage: " << argv[0]
                << " <model snapshot path> <synset file path>"
                   " <www dir=../../client>"
                   " [host=localhost] [port=8080]\n";
    }
  } catch (const std::exception& err) {
    std::cerr << err.what();
  } catch (...) {
    std::cerr << "Unhandled exception";
  }
  return 1;
}

Here, we read the parameters required by our application upon startup. There are three required parameters: the path to the model snapshot file, the path to the synset file, and the path to the directory where we place our HTML client application files. There are also two optional parameters: the server host IP address and the server network port.

After we've read the program parameters, we can initialize the Network type object with the specified model snapshot and synset files. We also dynamically determine whether a CUDA device is available on the machine where we start the server by calling the torch::cuda::is_available() function.

If a CUDA device is available, we can move our model to this device to increase computational performance. The following code shows how we can load a model into the specified device:

model_ = torch::jit::load(snapshot_path, device_type);

The torch::jit::load() function accepts the device type as its second parameter and automatically moves the model to the specified device.

cpp-httplib is a lightweight, single-file, header-only, cross-platform C++ HTTP/HTTPS library. We can use it to implement our server. The following code shows how we used the httplib::Server type to instantiate the server object so that it can handle HTTP requests:

httplib::Server server;

The httplib::Server class also implements a simple static file server. The following code snippet shows how to set up the directory for loading static pages:

server.set_base_dir(www_path.c_str());

The path that's passed into the set_base_dir() method should point to the directory we use to store the HTML pages for our service. To be able to see what's going on in the server when it's launched, we can configure the logging function. The following code shows how to print minimal request information when the server accepts the incoming message:

server.set_logger([](const auto& req, const auto& /*res*/) {
  std::cout << req.method << " " << req.path << std::endl;
});

We can also configure how the server reports HTTP errors while it is running. The following snippet shows how to fill the response object with error status information:

server.set_error_handler([](const auto& /*req*/, auto& res) {
  std::stringstream buf;
  buf << "<p>Error Status: <span style='color:red;'>";
  buf << res.status;
  buf << "</span></p>";
  res.set_content(buf.str(), "text/html");
});

The server sends this response object to the client in the case of an error.

Now, we have to configure the handler for our server object so that it can handle POST requests. There is a Post() method in the httplib::Server class that we can use for this purpose. This method takes the request's URL pattern and the handler object.

The special URL pattern should be used by the client application to perform a request; for example, the address can look like http://localhost:8080/imgclassify, where imgclassify is the pattern. We can have different handlers for different requests. The handler can be any callable object that accepts two arguments: the first should be of the const Request& type, while the second should be of the Response& type. The following code shows our implementation of the image classification request:

server.Post("/imgclassify", [&](const auto& req, auto& res) {
  std::string response_string;
  for (auto& file : req.files) {
    auto body = req.body.substr(file.second.offset,
                                file.second.length);
    try {
      auto img = ReadMemoryImageTensor(body, 224, 224);
      response_string += "; " + network.Classify(img);
    } catch (...) {
      response_string += "; Classification failed";
    }
  }
  res.set_content(response_string.c_str(), "text/html");
});

In this handler, we iterated over all the files in the input request. For each file, we performed the following steps:

  1. Extracted the bytes representing the image
  2. Decoded the bytes into the image object
  3. Converted the image object into a tensor
  4. Classified the image

The Request type object has a files member, which we can use to iterate over the information chunks about the files that were sent with a given request. Each chunk is of the MultipartFile type and contains the filename, the type, the starting position in the whole message body, and the length. The body of the Request object is a std::string object, so we used the substr() method to extract a particular file's data by specifying its start position and length. We used the ReadMemoryImageTensor() function to decode the file data into an image; this function also scales and normalizes the image to satisfy the ResNet model's requirements, and it returns a PyTorch tensor object. Then, we used the Network object to classify the image tensor with the ResNet model that was loaded from the snapshot. The Classify() method returned a string containing the classification information we got from our classifier, which we used to fill the response object.

We can use the listen() method of the httplib::Server type object to enable it to accept incoming connections and process messages. The following code shows how to do this:

if (!server.listen(host.c_str(), port)) {
  std::cerr << "Failed to start server\n";
}

The listen() method automatically binds the server socket to the given IP address and the port number.

In this section, we looked at the main implementation stages for the server part of our service. In the next section, we'll look at the implementation of the client.
