Engineering the application for scalability

At a time when most enterprise projects rely on one framework or another, which usually dictates how the application is served in production, it is still a good idea to look beneath the surface and understand how to develop an application with its scalability in mind.

In this section, we will look at techniques that can help us build a scalable application even when we are not using a pre-built framework that does it for us. We will see how to use thread/process pooling to handle multiple clients at the same time, why pooling of resources is necessary, and what prevents us from starting a separate thread or process to deal with every incoming request.

But before we dive into how to use thread pooling or process pooling in application development, let's first take a look at a simple way to hand off the processing of incoming requests to a background thread.

The following code implements a simple socket server that first accepts an incoming connection and then hands it off to a background thread for reads and writes, freeing the main thread to accept further incoming connections:

#!/usr/bin/python3
# simple_socket_thread.py
import socket
import threading

# Let's first create a TCP type Server for handling clients
class Server(object):
    """A simple TCP Server."""

    def __init__(self, hostname, port):
        """Server initializer

        Keyword arguments:
        hostname -- The hostname to use for the server
        port -- The port on which the server should bind
        """

        self.server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.hostname = hostname
        self.port = port
        self.bind_connection()
        self.listen()

    def bind_connection(self):
        """Bind the server to the host."""

        self.server.bind((self.hostname, self.port))

    def listen(self):
        """Start listening for the incoming connections."""

        self.server.listen(10)  # Queue a maximum of 10 clients
        # Enter the listening loop
        while True:
            client, client_addr = self.server.accept()
            print("Received a connection from %s" % str(client_addr))
            client_thread = threading.Thread(target=self.handle_client, args=(client,))
            client_thread.daemon = True
            client_thread.start()

    def handle_client(self, client):
        """Handle incoming client connection.

        Keyword arguments:
        client -- The client connection socket
        """

        print("Accepted a client connection")
        while True:
            buff = client.recv(1024).decode()
            if not buff:
                break
            print(buff)
        print("Client closed the connection")
        client.close()  # We are done now, let's close the connection

if __name__ == '__main__':
    server = Server('localhost', 7000)

In this code, we have implemented a simple Server class which initializes a TCP-based server on the machine, ready to accept the incoming connections. Without diverting too much, let's try to focus on the important aspect of this code, where we start the listening loop of the server under the listen() method.

Under the listen() method, we first call the listen() method of the socket and tell it to queue, at most, 10 connections that have not yet been accepted. Once this backlog is full, further connection attempts may be refused or dropped by the operating system until the server accepts some of the pending connections. Next, we start an infinite loop in which the first call is made to the accept() method of the socket. The call to accept() blocks until a client attempts to make a connection. On a successful attempt, accept() returns the client connection socket and the client address. The client connection socket can then be used to perform I/O operations with the client.

The fun part happens next: as soon as the client connection is accepted, we launch a daemon thread responsible for handling communication with that client and hand the client connection socket off to it. This frees our main thread from dealing with the client socket's I/O, so it can go back to accepting more clients. The same process repeats for every client that connects to our server.
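To see this hand-off in action, here is a small self-contained sketch, separate from the Server class above. It runs a throwaway accept loop in a daemon thread on an OS-assigned port and connects a few clients to it; the port, message contents, and helper names are illustrative only:

```python
import socket
import threading
import time

received = []  # messages collected by the server-side handler threads

def handle(client):
    """Read one message from the client, then close the socket."""
    received.append(client.recv(1024).decode())
    client.close()

def accept_loop(listener):
    """Accept connections and hand each one to a background thread."""
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            break  # the listening socket was closed; stop accepting
        worker = threading.Thread(target=handle, args=(client,))
        worker.daemon = True
        worker.start()

# Bind a throwaway listener on an OS-assigned free port
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('localhost', 0))
listener.listen(10)
port = listener.getsockname()[1]
threading.Thread(target=accept_loop, args=(listener,), daemon=True).start()

# Connect three clients; each one is served by its own background thread
for i in range(3):
    with socket.create_connection(('localhost', port)) as conn:
        conn.sendall(("client %d" % i).encode())

# Wait briefly for the handler threads to finish their reads
deadline = time.time() + 5
while len(received) < 3 and time.time() < deadline:
    time.sleep(0.05)
listener.close()
print(sorted(received))
```

Because each client is handled on its own thread, the accept loop never waits on any single client's I/O, which is exactly the property the Server class above relies on.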

So far so good; we have a nice way to handle incoming clients, and our service can scale up gradually as the number of clients increases. An easy solution, wasn't it? Well, in arriving at it we ignored a major flaw: we have not implemented any control over how many threads the application can launch to deal with incoming clients. Imagine what will happen if a million clients try to connect to our server. Will we really be running a million threads at the same time? The answer is a big NO.
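One common remedy, sketched here only as a preview of the pooling approach, is to cap the number of handler threads so that a flood of clients cannot launch a flood of threads. The sketch below uses Python's concurrent.futures.ThreadPoolExecutor; the pool size and helper names are illustrative assumptions, not the only way to build a pool:

```python
from concurrent.futures import ThreadPoolExecutor
import socket
import threading
import time

messages = []  # what the pooled handlers received

def handle_client(client):
    """Read from the client until it closes the connection, then clean up."""
    while True:
        buff = client.recv(1024).decode()
        if not buff:
            break
        messages.append(buff)
    client.close()

def serve(listener, pool):
    """Accept connections and submit each one to the bounded pool."""
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            break  # the listening socket was closed
        # submit() queues the socket; at most max_workers threads ever run
        pool.submit(handle_client, client)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('localhost', 0))
listener.listen(10)
port = listener.getsockname()[1]

pool = ThreadPoolExecutor(max_workers=8)  # hard cap on handler threads
threading.Thread(target=serve, args=(listener, pool), daemon=True).start()

# Exercise the server with a single client
with socket.create_connection(('localhost', port)) as conn:
    conn.sendall(b"hello")

deadline = time.time() + 5
while not messages and time.time() < deadline:
    time.sleep(0.05)
listener.close()
pool.shutdown(wait=False)
print(messages)
```

With this shape, the millionth client simply waits in the pool's queue instead of spawning a millionth thread; the trade-off between queueing and spawning is what the rest of this section explores.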

But why isn't it possible? Let's take a look.
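A back-of-the-envelope calculation hints at the answer. Assuming a typical 8 MiB default stack reservation per thread (the common Linux default; the exact figure varies by platform and configuration), a million threads would need thousands of gibibytes of address space for their stacks alone, before doing any actual work:

```python
threads = 1_000_000
stack_mib = 8  # assumed per-thread stack reservation; platform dependent
total_gib = threads * stack_mib / 1024
print(total_gib)  # thousands of GiB just for thread stacks
```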
