Custom encoding/decoding with JSON

In the JSON world, we can treat terms like encoding/decoding as synonyms for serializing/deserializing. They all mean transforming data to JSON and back. In the following example, I'm going to show you how to encode complex numbers:

# json_examples/json_cplx.py
import json

class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, complex):
            return {
                '_meta': '_complex',
                'num': [obj.real, obj.imag],
            }
        return super().default(obj)

data = {
    'an_int': 42,
    'a_float': 3.14159265,
    'a_complex': 3 + 4j,
}

json_data = json.dumps(data, cls=ComplexEncoder)
print(json_data)

def object_hook(obj):
    try:
        if obj['_meta'] == '_complex':
            return complex(*obj['num'])
    except (KeyError, TypeError):
        return obj
    # '_meta' is present but is not '_complex': leave the object untouched
    return obj

data_out = json.loads(json_data, object_hook=object_hook)
print(data_out)

We start by defining a ComplexEncoder class, which needs to implement the default method. The encoder calls this method, one object at a time, for every object it does not know how to serialize by itself, passing it in the obj variable. At some point, obj will be our complex number, 3+4j. When that happens, we return a dictionary with some custom meta information and a list that contains both the real and the imaginary part of the number. That is all we need to do to avoid losing information for a complex number.
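To see when default is actually invoked, here is a minimal sketch. RecordingEncoder and its seen attribute are hypothetical names introduced only for this illustration; the encoder records every object that reaches default, which should be only the values the json module cannot serialize on its own.

```python
import json


class RecordingEncoder(json.JSONEncoder):
    """Hypothetical encoder that remembers which objects reach default()."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.seen = []  # objects the base encoder could not handle

    def default(self, obj):
        self.seen.append(obj)
        if isinstance(obj, complex):
            return [obj.real, obj.imag]
        return super().default(obj)


encoder = RecordingEncoder()
text = encoder.encode({'an_int': 42, 'a_complex': 3 + 4j})
print(text)          # {"an_int": 42, "a_complex": [3.0, 4.0]}
print(encoder.seen)  # [(3+4j)]
```

The integer never shows up in seen, because the base encoder already knows how to handle it; only the complex number is handed to default.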

We then call json.dumps, but this time we use the cls argument to specify our custom encoder. The result is printed:

{"an_int": 42, "a_float": 3.14159265, "a_complex": {"_meta": "_complex", "num": [3.0, 4.0]}}

Half the job is done. For the deserialization part, we could have written another class that would inherit from JSONDecoder, but, just for fun, I've used a different technique that is simpler and uses a small function: object_hook.

Within the body of object_hook, we find another try block, but don't worry about it for now; I'll explain it in detail in the next chapter. The important part is the two lines within the body of the try block itself. The function receives an object (notice that the function is only called when obj is a dictionary), and if the metadata matches our convention for complex numbers, we pass the real and imaginary parts to the complex function. The try/except block is there only to prevent JSON objects that don't follow our convention (a missing _meta key, for example) from ruining the party; if that happens, we simply return the object as it is.

The last print outputs:

{'an_int': 42, 'a_float': 3.14159265, 'a_complex': (3+4j)}

You can see that a_complex has been correctly deserialized.
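For comparison, the class-based alternative mentioned earlier could look roughly like the following sketch. ComplexDecoder and complex_hook are hypothetical names, and installing the hook through the constructor is just one way to do it, not necessarily the approach the author had in mind.

```python
import json


class ComplexDecoder(json.JSONDecoder):
    """Hypothetical decoder that applies the '_complex' convention."""

    def __init__(self, **kwargs):
        # Install our hook unless the caller supplied one explicitly.
        kwargs.setdefault('object_hook', self.complex_hook)
        super().__init__(**kwargs)

    @staticmethod
    def complex_hook(obj):
        # obj is always a dict here; anything else never reaches the hook.
        if obj.get('_meta') == '_complex' and 'num' in obj:
            return complex(*obj['num'])
        return obj


json_data = (
    '{"an_int": 42, '
    '"a_complex": {"_meta": "_complex", "num": [3.0, 4.0]}}'
)
data_out = json.loads(json_data, cls=ComplexDecoder)
print(data_out)  # {'an_int': 42, 'a_complex': (3+4j)}
```

The class does the same work as the plain function, with a little more ceremony, which is why the function-based technique is the simpler choice here.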

Let's see a slightly more complex (no pun intended) example now: dealing with datetime objects. I'm going to split the code into two blocks, the serializing part, and the deserializing afterwards:

# json_examples/json_datetime.py
import json
from datetime import datetime, timedelta, timezone

now = datetime.now()
now_tz = datetime.now(tz=timezone(timedelta(hours=1)))

class DatetimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            try:
                # total_seconds() keeps the sign of offsets west of UTC
                off = int(obj.utcoffset().total_seconds())
            except AttributeError:
                # naive datetime: utcoffset() returned None
                off = None

            return {
                '_meta': '_datetime',
                'data': obj.timetuple()[:6] + (obj.microsecond, ),
                'utcoffset': off,
            }
        return super().default(obj)

data = {
    'an_int': 42,
    'a_float': 3.14159265,
    'a_datetime': now,
    'a_datetime_tz': now_tz,
}

json_data = json.dumps(data, cls=DatetimeEncoder)
print(json_data)

This example is slightly more complex because datetime objects in Python can be time zone aware or not; therefore, we need to be more careful. The flow is basically the same as before, only it deals with a different data type. We start by getting the current date and time, both without (now) and with (now_tz) time zone awareness, just to make sure our script works in both cases. We then define a custom encoder as before, implementing the default method once again. The important bits in that method are how we get the time zone offset (off), in seconds, and how we structure the dictionary that returns the data. This time, the metadata says it's a datetime, and we save the first six items of the time tuple (year, month, day, hour, minute, and second) plus the microseconds under the data key, and the offset under the utcoffset key. Could you tell that the value of data is a concatenation of tuples? Good job if you could!
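If the tuple concatenation is not obvious, the following small sketch isolates it. The fixed datetime value and the variable names moment and first_six are chosen only for illustration.

```python
import json
from datetime import datetime

# A fixed value, so the result is the same on every run.
moment = datetime(2018, 3, 18, 17, 57, 27, 438792)

# Slicing the time tuple gives (year, month, day, hour, minute, second).
first_six = moment.timetuple()[:6]

# tuple + tuple builds a new tuple with seven items.
data = first_six + (moment.microsecond, )

print(data)              # (2018, 3, 18, 17, 57, 27, 438792)
print(json.dumps(data))  # [2018, 3, 18, 17, 57, 27, 438792]
```

Notice that JSON has no tuple type, so the tuple is written out as an array and will come back as a list when decoded.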

Once we have our custom encoder, we proceed to create some data, and then we serialize it. The print call outputs (after I've done some prettifying):

{
    "a_datetime": {
        "_meta": "_datetime",
        "data": [2018, 3, 18, 17, 57, 27, 438792],
        "utcoffset": null
    },
    "a_datetime_tz": {
        "_meta": "_datetime",
        "data": [2018, 3, 18, 18, 57, 27, 438810],
        "utcoffset": 3600
    },
    "a_float": 3.14159265,
    "an_int": 42
}

Interestingly, we find out that None is translated to null, its JavaScript equivalent. Moreover, we can see our data seems to have been encoded properly. Let's proceed to the second part of the script:

# json_examples/json_datetime.py
def object_hook(obj):
    try:
        if obj['_meta'] == '_datetime':
            if obj['utcoffset'] is None:
                tz = None
            else:
                tz = timezone(timedelta(seconds=obj['utcoffset']))
            return datetime(*obj['data'], tzinfo=tz)
    except (KeyError, TypeError):
        return obj
    # '_meta' is present but is not '_datetime': leave the object untouched
    return obj

data_out = json.loads(json_data, object_hook=object_hook)

Once again, we first verify that the metadata is telling us it's a datetime, and then we proceed to fetch the time zone information. Once we have that, we pass the seven values stored under data (using * to unpack them in the call) and the time zone information to the datetime call, getting back our original object. Let's verify it by printing data_out:

{
    'a_datetime': datetime.datetime(2018, 3, 18, 18, 1, 46, 54693),
    'a_datetime_tz': datetime.datetime(
        2018, 3, 18, 19, 1, 46, 54711,
        tzinfo=datetime.timezone(datetime.timedelta(seconds=3600))),
    'a_float': 3.14159265,
    'an_int': 42
}

As you can see, we got everything back correctly. As an exercise, I'd like to challenge you to write the same logic, but for a date object, which should be simpler.
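Before trying the exercise, it may help to run the reconstruction step in isolation. The sketch below uses a hand-written payload dictionary (a hypothetical stand-in for one decoded JSON object) and rebuilds a time zone aware datetime from it.

```python
from datetime import datetime, timedelta, timezone

# Hand-written stand-in for one decoded JSON object.
payload = {
    '_meta': '_datetime',
    'data': [2018, 3, 18, 18, 57, 27, 438810],
    'utcoffset': 3600,
}

# Rebuild the fixed-offset time zone from the stored seconds.
tz = timezone(timedelta(seconds=payload['utcoffset']))

# * unpacks the seven values into year, month, day, hour,
# minute, second, and microsecond.
restored = datetime(*payload['data'], tzinfo=tz)

print(restored.isoformat())  # 2018-03-18T18:57:27.438810+01:00
print(restored.utcoffset())  # 1:00:00
```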

Before we move on to the next topic, a word of caution. Perhaps it is counter-intuitive, but working with datetime objects can be one of the trickiest things to do. So, although I'm pretty sure this code does what it is supposed to do, I want to stress that I only tested it very lightly. If you intend to grab it and use it, please test it thoroughly: test for different time zones, for daylight saving time being on and off, for dates before the epoch, and so on. You might find that the code in this section needs some modifications to suit your cases.
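One concrete case worth including in such tests is a time zone west of UTC. A timedelta stores a negative duration as negative days plus a non-negative number of seconds, so its seconds attribute alone is not the signed offset, whereas total_seconds() is. The following sketch, with illustrative names such as west and from_total, shows the difference.

```python
from datetime import datetime, timedelta, timezone

west = timezone(timedelta(hours=-5))  # a fixed offset west of UTC
moment = datetime(2018, 3, 18, 12, 0, tzinfo=west)
offset = moment.utcoffset()

print(offset.days, offset.seconds)  # -1 68400
print(offset.total_seconds())       # -18000.0

# Rebuilding a time zone from each value gives different results.
from_seconds = timezone(timedelta(seconds=offset.seconds))
from_total = timezone(timedelta(seconds=offset.total_seconds()))

print(from_seconds == west)  # False
print(from_total == west)    # True
```

A round trip that relies on the seconds attribute alone would therefore bring back a different offset for any negative one, which is exactly the kind of issue that light testing with a single positive offset does not reveal.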

Let's now move to the next topic, IO.
