Implementing the encoder

We are going to implement both the encoder and the decoder as transform streams. This gives us the most flexibility, and transform streams already provide much of the behavior we need, since we are, after all, transforming data. First, we will need some generic helpers for encoding and decoding our specific data types, and we will put all of these methods in a helper.js file. The encoding functions will look like the following:

export const encodeString = function(str) {
    const buf = Buffer.from(str);
    // 4-byte, big-endian length prefix so the decoder knows how far to read
    const len = Buffer.alloc(4);
    len.writeUInt32BE(buf.byteLength);
    // [type tag 0x03][length][UTF-8 bytes]
    return Buffer.concat([Buffer.from([0x03]), len, buf]);
};
export const encodeNumber = function(num) {
    // 0x01 marks a whole number, 0x02 a floating-point number
    const type = Math.round(num) === num ? 0x01 : 0x02;
    const buf = Buffer.alloc(4);
    buf.writeInt32BE(num);
    return Buffer.concat([Buffer.from([type]), buf]);
};

The encodeString function takes in a string and outputs a buffer holding the information the decoder will need. First, we convert the string to a Buffer. Next, we create a 4-byte buffer to hold the length of the string. Then, we store the length in that buffer utilizing the writeUInt32BE method.

For those that do not know byte/bit conversions, 8 bits of information (a bit is either a 1 or a 0, the lowest form of data we can supply) make up 1 byte. A 32-bit integer, which is what we are trying to write, is then made up of 4 bytes (32/8). The U portion of that method name means it is unsigned; unsigned means we only want positive numbers (lengths can only be 0 or positive in our case). With this information, we can see why we allocated 4 bytes for this operation and why we are utilizing this specific method. For more information on the write/read operations for buffers, go to https://nodejs.org/api/buffer.html, as it explains the buffer operations we have access to in depth. We will only explain the operations that we will be utilizing.
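To make the byte math concrete, here is a small sketch (not part of the encoder itself) showing how writeUInt32BE lays a 32-bit value out across 4 bytes:

```javascript
// 258 is 0x00000102, so big-endian (BE) order stores the most
// significant byte first: 0x00 0x00 0x01 0x02
const len = Buffer.alloc(4); // 4 bytes = 32 bits
len.writeUInt32BE(258);
console.log(len); // <Buffer 00 00 01 02>
```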

Once we have turned the string into a buffer and computed its length, we write out a buffer that has the type as the first byte (in our case, the 0x03 byte); then the length of the string, so we know how much of the incoming buffer is the string; and finally, the string itself. This is the more complicated of the two helper methods, but from a decoding perspective, it should make sense. When we are reading the buffer, we do not know in advance how long a string is going to be, so we need some information in the prefix of this type to know how much to read. In our case, the 0x03 tells us that the type is a string, and we know, based on the data type protocol that we established previously, that the next 4 bytes will be the length of the string. Finally, we can use this information to read that far ahead in the buffer to grab the string bytes and decode them back into a string.
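As a concrete sketch of that layout, here is what encodeString produces for a short string (the helper is re-declared locally so the snippet runs on its own):

```javascript
const encodeString = function(str) {
    const buf = Buffer.from(str);
    const len = Buffer.alloc(4);
    len.writeUInt32BE(buf.byteLength);
    return Buffer.concat([Buffer.from([0x03]), len, buf]);
};

const encoded = encodeString('hi');
// byte 0    : 0x03 (the string type tag)
// bytes 1-4 : 0x00 0x00 0x00 0x02 (the length, 2, big-endian)
// bytes 5-6 : 0x68 0x69 ('hi' in UTF-8)
console.log(encoded);
```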

The encodeNumber method is much easier to understand. First, we check whether the rounding of the number equals the number itself. If it does, we know that we are dealing with a whole number; otherwise, we treat it as a floating-point number. For those that are unaware, in most cases, knowing this information does not matter too much in JavaScript (though there are certain optimizations the V8 engine utilizes when it knows that it is dealing with whole numbers), but if we want to use this data format with other languages, then the difference matters.

Next, we allocate 4 bytes, since we are only going to write out 32-bit signed integers. Signed means they support both positive and negative numbers (again, we won't go into the big difference between the two, but for those that are curious, we actually limit the maximum value we can store if we utilize signed integers, since we have to utilize one of the bits to tell us whether the number is negative or not). We then write out the final buffer, which consists of our type byte and then the number in buffer format.
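A quick sketch of encodeNumber in action (again re-declared locally so the snippet is self-contained):

```javascript
const encodeNumber = function(num) {
    // 0x01 marks a whole number, 0x02 a floating-point number
    const type = Math.round(num) === num ? 0x01 : 0x02;
    const buf = Buffer.alloc(4);
    buf.writeInt32BE(num);
    return Buffer.concat([Buffer.from([type]), buf]);
};

const whole = encodeNumber(-42);
// whole[0] is the type tag; the remaining 4 bytes hold the signed value
console.log(whole[0], whole.readInt32BE(1)); // 1 -42
```

Note that writeInt32BE cannot faithfully store a fractional value even when the type tag ends up as 0x02; a fuller implementation would likely write those values with writeFloatBE instead, though that would also change what the decoder has to read.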

Now, with the helper methods in place, add the following constants to the helper.js file:

export const CONSTANTS = {
    object   : 0x04,
    number   : 0x01,
    floating : 0x02,
    string   : 0x03,
    header   : 0x10,
    body     : 0x11
};

We can create our encoder.js file:

  1. Import the necessary dependencies and also create the shell of our SimpleSchemaWriter class:
import { Transform } from 'stream';
import { encodeString, encodeNumber } from './helper.js';

export default class SimpleSchemaWriter extends Transform {
}
  2. Create the constructor and make sure that objectMode is always turned on:
// inside our SimpleSchemaWriter class
constructor(opts = {}) {
    opts.writableObjectMode = true;
    super(opts);
}
  3. Add a private #encode helper function that will do the underlying data check and conversion for us:
// inside of our SimpleSchemaWriter class
#encode = function(data) {
    return typeof data === 'string' ?
        encodeString(data) :
        typeof data === 'number' ?
            encodeNumber(data) :
            null;
}
  4. Write the main _transform function for our Transform stream. Details of this stream will be explained as follows:
_transform(chunk, encoding, callback) {
    const buf = [];
    // open the header section
    buf.push(Buffer.from([0x10]));
    for(const key of Object.keys(chunk)) {
        const item = this.#encode(key);
        if(item === null) {
            return callback(new Error("Unable to parse!"));
        }
        buf.push(item);
    }
    // close the header section and open the body
    buf.push(Buffer.from([0x10]));
    buf.push(Buffer.from([0x11]));
    for(const val of Object.values(chunk)) {
        const item = this.#encode(val);
        if(item === null) {
            return callback(new Error("Unable to parse!"));
        }
        buf.push(item);
    }
    // close the body, emit the frame, and signal that we are ready for more
    buf.push(Buffer.from([0x11]));
    this.push(Buffer.concat(buf));
    callback();
}

Overall, the transform function should look familiar to previous _transform methods we have implemented, with some exceptions:

  1. The first portion of the encoding wraps our headers (the keys of the object). This means that we need to write out our delineator for headers, which is the 0x10 byte.
  2. We then run through all of the keys of our object and pass each through the private #encode method. This method checks the data type of the key and returns the encoding, utilizing one of the helper methods that we discussed previously. If it gets a type it does not understand, it returns null, and we give back an Error, since our data protocol does not understand the type.
  3. Once we have run through all of the keys, we write out the 0x10 byte again, stating that we are done with the headers, and write out the 0x11 byte to tell the decoder that we are starting the body of our message. (We could have utilized the constants from the helper.js file here, and we probably should, but spelling the bytes out should help with understanding the underlying protocol. The decoder will utilize these constants to showcase better programming practices.)
  4. We now run through the values of the object and pass them through the same encoding system that we used for the headers, again returning an Error if we do not understand the data type.
  5. Once we are finished with the body, we push the 0x11 byte again to say that we are done with the body. This is the signal for the decoder to stop converting this object and to send out the object it has been building. We then push all of this data to the Readable portion of our Transform stream and use the callback to say that we are ready to process more data.
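Putting the steps together, here is a sketch of the encoder in action; everything is inlined (rather than imported from helper.js) so that the snippet is self-contained, and the object being written is just an example:

```javascript
import { Transform } from 'stream';

const encodeString = function(str) {
    const buf = Buffer.from(str);
    const len = Buffer.alloc(4);
    len.writeUInt32BE(buf.byteLength);
    return Buffer.concat([Buffer.from([0x03]), len, buf]);
};

const encodeNumber = function(num) {
    const type = Math.round(num) === num ? 0x01 : 0x02;
    const buf = Buffer.alloc(4);
    buf.writeInt32BE(num);
    return Buffer.concat([Buffer.from([type]), buf]);
};

class SimpleSchemaWriter extends Transform {
    constructor(opts = {}) {
        opts.writableObjectMode = true;
        super(opts);
    }
    #encode(data) {
        return typeof data === 'string' ? encodeString(data) :
               typeof data === 'number' ? encodeNumber(data) :
               null;
    }
    _transform(chunk, encoding, callback) {
        const buf = [Buffer.from([0x10])]; // open the header section
        for (const key of Object.keys(chunk)) {
            const item = this.#encode(key);
            if (item === null) return callback(new Error('Unable to parse!'));
            buf.push(item);
        }
        buf.push(Buffer.from([0x10]), Buffer.from([0x11])); // close headers, open body
        for (const val of Object.values(chunk)) {
            const item = this.#encode(val);
            if (item === null) return callback(new Error('Unable to parse!'));
            buf.push(item);
        }
        buf.push(Buffer.from([0x11])); // close the body
        this.push(Buffer.concat(buf));
        callback();
    }
}

const writer = new SimpleSchemaWriter();
writer.write({ name: 'Alice', age: 30 });
const frame = writer.read();
// the frame starts with the 0x10 header delineator and ends with 0x11
console.log(frame[0], frame[frame.length - 1]); // 16 17
```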

There are some problems with the overall structure of our encoding scheme (we shouldn't be using single bytes for our wrappers, since they can easily be misconstrued by our encoder and decoder), and we should support more data types, but this should give a good understanding of how an encoder can be built for more widely used data formats.
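The wrapper problem is easy to demonstrate with a sketch: the number 17 encodes to a payload whose last byte is 0x11, which is indistinguishable from our body delineator to anything scanning for single bytes:

```javascript
// encode 17 the way encodeNumber does: type tag + 32-bit big-endian value
const payload = Buffer.alloc(4);
payload.writeInt32BE(17); // 17 is 0x00000011
const encoded = Buffer.concat([Buffer.from([0x01]), payload]);

// a naive scan for the 0x11 delineator finds a false positive inside the value
console.log(encoded.includes(0x11)); // true
```

Real formats avoid this by using multi-byte magic sequences, escaping, or (as our length-prefixed strings already do) explicit lengths for every field.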

Right now, we cannot test this beyond checking that it spits out the correct encoding, but once we have the decoder up and running, we will be able to test whether we get the same object on both sides. Let's now take a look at the decoder for this system.
