So far, both of the data object implementations we've created have overridden the _create and _update methods that were required by BaseDataObject. Under the circumstances, it would be fair to question why those requirements were put in place at all. The short answer is that both implementations so far use the same data-store-level process for creating and updating records and documents, so the separate methods simply haven't been needed. If it were certain that hms_sys would never need any other database backend, we'd be justified in removing them from the code base entirely.
However, what would've happened if the decision to use MongoDB had gone a different way, and the preferred (or mandated) backend data store engine was an RDBMS such as Microsoft SQL Server? Or, worse, what if that sort of change was mandated after the system was operational?
Setting aside the data migration planning that would have to happen and focusing on only the application and service code, what would that kind of change require? Not much, when it comes right down to it. A generic SQL/RDBMS engine ABC (HMSSQLDataObject) might look something like the following, for a given RDBMS API/library:
class HMSSQLDataObject(BaseDataObject, metaclass=abc.ABCMeta):
    """
    Provides baseline functionality, interface requirements, and
    type-identity for objects that can persist their state-data to a
    (GENERIC) SQL-based RDBMS back-end data-store.
    """
The same _configuration class property would probably be in use, serving the same purpose. It's likely that the _data_dict_keys class attribute would also be useful for reducing record fields to a valid argument dictionary in from_data_dict. Since the SQL for the various CRUD operations (or at least templates that serve as starting points for them) would need to be stored where the classes can reach it, a viable option is to attach those statements as class attributes as well:
###################################
# Class attributes/constants      #
###################################

# - Keeps track of the global configuration for data-access
_configuration = None
# - Keeps track of the keys allowed for object-creation from
#   retrieved data
_data_dict_keys = None
# - SQL for various expected CRUD actions:
_sql_create = """Some SQL string goes here"""
_sql_read_oids = """Some SQL string goes here"""
_sql_read_all = """Some SQL string goes here"""
_sql_read_criteria = """Some SQL string goes here"""
_sql_update = """Some SQL string goes here"""
_sql_delete = """Some SQL string goes here"""
Since the SQL for the various CRUD operations would already name the tables that the data is stored in, and since the process of connecting to a database in most RDBMSes handles the equivalents of the connection and database properties in our MongoDB approach, only the connection itself needs to be tracked and made available as a property:
###################################
# Property-getter methods         #
###################################

def _get_connection(self):
    try:
        return self.__class__._connection
    except AttributeError:
        # - Most RDBMS libraries provide a "connect" function, or
        #   allow the creation of a "connection" object, using the
        #   parameters we've named in DatastoreConfig, or simple
        #   variations of them, so all we need to do is connect:
        self.__class__._connection = RDBMS.connect(
            **self.configuration
        )
        return self.__class__._connection
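The same lazy-instantiation pattern can be exercised against a concrete engine. The following is a minimal sketch, using the standard library's sqlite3 module as a stand-in for the generic RDBMS module above; the class name and the hard-coded configuration are assumptions for illustration only:

```python
import sqlite3

class SQLiteDataObject:
    """Hypothetical sketch: caches one connection at the class level."""
    # - Assumed configuration; a real implementation would read this
    #   from something like DatastoreConfig
    _configuration = {'database': ':memory:'}

    def _get_connection(self):
        try:
            return self.__class__._connection
        except AttributeError:
            # - Lazily create and cache the connection on first access
            self.__class__._connection = sqlite3.connect(
                self._configuration['database']
            )
            return self.__class__._connection

    def _del_connection(self):
        try:
            del self.__class__._connection
        except AttributeError:
            # - It may already not exist
            pass

    connection = property(_get_connection, None, _del_connection)

obj1 = SQLiteDataObject()
obj2 = SQLiteDataObject()
# - Both instances share the same lazily created connection
shared = obj1.connection is obj2.connection
# - Deleting the property removes the cached class attribute, so the
#   next access would create a fresh connection
del obj1.connection
```

Because the connection lives on the class rather than the instance, every instance of a given data object type shares one connection, which is usually what's wanted for this kind of persistence layer.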
Like its equivalent in the Mongo-based implementation, the connection is lazily instantiated, and the deleter method performs an actual deletion of the cached attribute rather than resetting it to a default value, as follows:
###################################
# Property-deleter methods        #
###################################

def _del_connection(self) -> None:
    try:
        del self.__class__._connection
    except AttributeError:
        # - It may already not exist
        pass
The related property declaration is identical, and is shown as follows:
###################################
# Instance property definitions   #
###################################

connection = property(
    _get_connection, None, _del_connection,
    'Gets or deletes the database-connection that the instance '
    'will use to manage its persistent state-data'
)
Object initialization is also identical, as follows:
###################################
# Object initialization           #
###################################

def __init__(self,
    oid:(UUID,str,None)=None,
    created:(datetime,str,float,int,None)=None,
    modified:(datetime,str,float,int,None)=None,
    is_active:(bool,int,None)=None,
    is_deleted:(bool,int,None)=None,
    is_dirty:(bool,int,None)=None,
    is_new:(bool,int,None)=None,
):
    """
Object initialization.
self .............. (HMSSQLDataObject instance, required) The
                    instance to execute against
oid ............... (UUID|str, optional, defaults to None) The unique
                    identifier of the object's state-data record in
                    the back-end data-store
created ........... (datetime|str|float|int, optional, defaults to
                    None) The date/time that the object was created
modified .......... (datetime|str|float|int, optional, defaults to
                    None) The date/time that the object was last
                    modified
is_active ......... (bool|int, optional, defaults to None) A flag
                    indicating that the object is active
is_deleted ........ (bool|int, optional, defaults to None) A flag
                    indicating that the object should be considered
                    deleted (and may be in the near future)
is_dirty .......... (bool|int, optional, defaults to None) A flag
                    indicating that the object's data needs to be
                    updated in the back-end data-store
is_new ............ (bool|int, optional, defaults to None) A flag
                    indicating that the object's data needs to be
                    created in the back-end data-store
"""
    # - Call parent initializers if needed
    BaseDataObject.__init__(self,
        oid, created, modified, is_active, is_deleted,
        is_dirty, is_new
    )
    # - Perform any other initialization needed
The substantial differences are mostly in the methods that handle the CRUD operations. The original save method, as implemented in BaseDataObject, is left in place, and will call the _create or _update methods, as determined by the is_dirty or is_new property values of the instance. Each of these methods is responsible for acquiring the SQL template from the appropriate class attribute, populating it as needed with current state data values, sanitizing the resultant SQL, and executing it against the connection:
###################################
# Instance methods                #
###################################

def _create(self):
    # - The base SQL is in self.__class__._sql_create, and the
    #   field-values would be retrieved from self.to_data_dict():
    data_dict = self.to_data_dict()
    SQL = self.__class__._sql_create
    # - Some process would have to add the values, if not the keys,
    #   into the SQL, and the result sanitized, but once that was
    #   done, it'd become a simple query-execution:
    self.connection.execute(SQL)

def _update(self):
    # - The base SQL is in self.__class__._sql_update, and the
    #   field-values would be retrieved from self.to_data_dict():
    data_dict = self.to_data_dict()
    SQL = self.__class__._sql_update
    # - Some process would have to add the values, if not the keys,
    #   into the SQL, and the result sanitized, but once that was
    #   done, it'd become a simple query-execution:
    self.connection.execute(SQL)
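The "add the values and sanitize" step that the sketch leaves open is exactly what DB-API parameter substitution handles. A minimal sketch of a parameterized _create, again using sqlite3 as a stand-in engine (the artisans table and its fields are hypothetical), might look like this:

```python
import sqlite3

# - Hypothetical table, for illustration only
connection = sqlite3.connect(':memory:')
connection.execute(
    'CREATE TABLE artisans (oid TEXT, first_name TEXT, last_name TEXT)'
)

# - A create-template with DB-API "?" placeholders; the driver escapes
#   the supplied values, which covers the "sanitize" step safely
_sql_create = (
    'INSERT INTO artisans (oid, first_name, last_name) '
    'VALUES (?, ?, ?)'
)

# - The values that to_data_dict() would supply in a real data object
data_dict = {
    'oid': '43d240cd-4c9f-44c2-a196-1c7c56068cef',
    'first_name': 'John',
    'last_name': 'Smith',
}
connection.execute(
    _sql_create,
    (data_dict['oid'], data_dict['first_name'], data_dict['last_name'])
)
row_count = connection.execute(
    'SELECT COUNT(*) FROM artisans'
).fetchone()[0]
```

Passing the values as a separate tuple, instead of interpolating them into the SQL string, is the standard defense against SQL injection in any DB-API-compliant library.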
The delete class method is simple:
###################################
# Class methods                   #
###################################

@classmethod
def delete(cls, *oids):
    # - First, we need the database-connection that we're
    #   working with:
    connection = cls.get_connection()
    SQL = cls._sql_delete % oids
    # - Don't forget to sanitize it before executing it!
    result_set = connection.execute(SQL)
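The % interpolation in that sketch is the sort of thing the "sanitize" warning is about. One hedged alternative, sketched here with sqlite3 and a hypothetical artisans table, is to build one placeholder per oid and let the driver handle the values:

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE artisans (oid TEXT)')
connection.executemany(
    'INSERT INTO artisans (oid) VALUES (?)',
    [('oid-1',), ('oid-2',), ('oid-3',)]
)

def delete(connection, *oids):
    # - One "?" placeholder per oid keeps every value parameterized,
    #   no matter how many oids were passed in
    placeholders = ', '.join('?' for _ in oids)
    SQL = 'DELETE FROM artisans WHERE oid IN (%s)' % placeholders
    connection.execute(SQL, oids)

delete(connection, 'oid-1', 'oid-3')
remaining = [
    row[0] for row in connection.execute('SELECT oid FROM artisans')
]
```

Only the placeholder list is interpolated into the SQL string; the oid values themselves never touch it, so there is nothing left to sanitize by hand.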
Most of the pattern and approach behind the get method should look familiar; again, it's got the same signature (and is intended to perform the same activities) as the methods that have been created so far, which implement the required functionality of the BaseDataObject:
@classmethod
def get(cls, *oids, **criteria) -> list:
    # - First, we need the database-connection that we're
    #   working with:
    connection = cls.get_connection()
    # - The first pass of the process retrieves records based
    #   on oids or criteria.
    # - We also need to keep track of whether or not to do a
    #   matches call on the results after the initial data-
    #   retrieval:
    post_filter = False
    # - Records are often returned as a tuple (result_set)
    #   of tuples (rows) of tuples (field-name, field-value):
    #   ( ..., ( ('field-name', 'value'), (...), ... ), ...)
The branch that handles oid requests is as follows:
    if oids:
        # - Need to replace any placeholder values in the raw SQL
        #   with actual values, AND sanitize the SQL string, but
        #   it starts with the SQL in cls._sql_read_oids
        SQL = cls._sql_read_oids
        result_set = connection.execute(SQL)
        if criteria:
            post_filter = True
The criteria branch is as follows:
    elif criteria:
        # - The same sort of replacement would need to happen here
        #   as happens for oids, above. If the query uses just
        #   one criteria key/value pair initially, we can use the
        #   match-based filtering later to filter further as needed
        key = list(criteria.keys())[0]
        value = criteria[key]
        SQL = cls._sql_read_criteria % (key, value)
        result_set = connection.execute(SQL)
        if len(criteria) > 1:
            post_filter = True
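The %-interpolation of both the key and the value in that branch is the riskiest spot in the whole sketch. Column names cannot be passed as DB-API parameters, so one hedged approach is to whitelist the key against the known queryable fields and parameterize only the value. The following sketch assumes a hypothetical artisans table and whitelist:

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE artisans (oid TEXT, first_name TEXT)')
connection.execute(
    "INSERT INTO artisans VALUES ('oid-1', 'John'), ('oid-2', 'Jane')"
)

# - Hypothetical whitelist of queryable columns; column names cannot
#   be supplied as DB-API parameters, so validate them explicitly
ALLOWED_CRITERIA = {'oid', 'first_name'}

def read_by_criterion(connection, key, value):
    if key not in ALLOWED_CRITERIA:
        raise ValueError('Cannot filter on %s' % key)
    # - The key is interpolated only after the whitelist check has
    #   passed; the value goes through a "?" placeholder
    SQL = 'SELECT oid FROM artisans WHERE %s = ?' % key
    return [row[0] for row in connection.execute(SQL, (value,))]

matches = read_by_criterion(connection, 'first_name', 'Jane')
```

A request for an unknown (or malicious) key fails fast with a ValueError instead of ever reaching the database.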
The default branch that simply gets everything else is as follows:
    else:
        SQL = cls._sql_read_all
        result_set = connection.execute(SQL)
All of the branches generate a result_set that can be turned into a list of data_dict values for creating object instances, though the rows may not be returned from the backend data store as dictionaries.
The lowest common denominator results of a query are, as noted in the preceding code comments, a tuple of tuples of tuples, which might look something like the following:
# This is the outermost tuple, collecting all of the
# rows returned into a result_set:
(
    # Each tuple at this level is a single row:
    (
        # Each tuple at this level is a key/value pair:
        ('oid', '43d240cd-4c9f-44c2-a196-1c7c56068cef'),
        ('first_name', 'John'),
        ('last_name', 'Smith'),
        ('email', 'john.smith@example.com'),
        # ...
    ),
    # more rows could happen here, or not...
)
If the engine, or the Python API to the engine, provides a built-in mechanism for converting rows returned into dictionary instances, that's probably the preferred approach to use to make the conversion. If there isn't anything built in to handle that, converting the nested tuples to a series of dictionaries isn't difficult to do:
    # - We should have a result_set value here, so we can convert
    #   it from the tuple of tuples of tuples (or whatever) into
    #   data_dict-compatible dictionaries:
    data_dicts = [
        dict(
            [field_tuple for field_tuple in row]
        )
        for row in result_set
    ]
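That conversion can be exercised directly. Since each row is already an iterable of key/value pairs, dict() accepts it as-is, and the inner list comprehension can actually be dropped:

```python
# - A result_set shaped like the nested-tuple example above
result_set = (
    (
        ('oid', '43d240cd-4c9f-44c2-a196-1c7c56068cef'),
        ('first_name', 'John'),
        ('last_name', 'Smith'),
    ),
)

# - dict() accepts any iterable of two-item sequences, so each row
#   converts to a data_dict-compatible dictionary in one call
data_dicts = [dict(row) for row in result_set]
```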
From this point on, the process is pretty much the same as in the previous implementations, in JSONFileDataObject and HMSMongoDataObject:
    # - With those, we can create the initial list of instances:
    results = [
        cls.from_data_dict(data_dict)
        for data_dict in data_dicts
    ]
    # - If post_filter has been set to True, then the request
    #   was for items by oid *and* that have certain criteria
    if post_filter:
        results = [
            obj for obj in results
            if obj.matches(**criteria)
        ]
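The post-filter step depends only on each instance exposing a matches method, so it can be sketched in isolation. The Record class below is a hypothetical stand-in for a real data object, not part of hms_sys:

```python
class Record:
    """Hypothetical stand-in for a data-object with a matches method."""
    def __init__(self, **data):
        self.__dict__.update(data)

    def matches(self, **criteria):
        # - True only if every criteria key/value pair matches the
        #   corresponding attribute on the instance
        return all(
            getattr(self, key, None) == value
            for key, value in criteria.items()
        )

results = [
    Record(oid='oid-1', first_name='John'),
    Record(oid='oid-2', first_name='Jane'),
]
criteria = {'first_name': 'Jane'}
post_filter = True
if post_filter:
    results = [obj for obj in results if obj.matches(**criteria)]
```

Filtering in Python after the initial query is less efficient than filtering in SQL, but it keeps the SQL templates simple and reuses the same matches logic across every backend.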
Another (potentially major) difference concerns how child objects, such as the products of an Artisan object, have to be handled. If those child objects need to be fetched as objects and attached to the parent, and assuming they use the same BaseDataObject-derived interface, then each child type will have its own class, each of those classes will have a get method, and that get method will accept the oid of the parent object as criteria. That allows for a process like the following, which retrieves and attaches any child objects as needed (using the Artisan and Product classes as an example):
    # - Data-objects that have related child items, like the
    #   Artisan to Product relationship, may need to acquire
    #   those children here before returning the results. If
    #   they do, then a structure like this should work most
    #   of the time:
    for artisan in results:
        artisan._set_products(
            Product.get(artisan_oid=artisan.oid)
        )
    return results
The other members of a final business/data object class that derives from HMSSQLDataObject should, for the most part, be expected by now, since they are also required for the implementation of final data objects derived from the other two DataObject ABCs. They would include the concrete implementations of to_data_dict and matches instance methods and the from_data_dict class method, and the various class-specific variables (mostly the _sql class attributes).
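To make those remaining responsibilities concrete, here is a minimal, hypothetical skeleton of such a final class, with no real persistence behind it. The field names, the _sql_create placeholder, and the class itself are assumptions for illustration, not the actual hms_sys implementation:

```python
class Artisan:
    """Hypothetical final data-object sketch (no real persistence)."""
    # - Keys allowed when rebuilding instances from retrieved data
    _data_dict_keys = ('oid', 'first_name', 'last_name')
    # - The class-specific SQL templates would live here in a real
    #   class; this one is a non-functional placeholder
    _sql_create = 'INSERT INTO artisans (...) VALUES (...)'

    def __init__(self, oid=None, first_name=None, last_name=None):
        self.oid = oid
        self.first_name = first_name
        self.last_name = last_name

    def to_data_dict(self):
        # - State-data as a dictionary, keyed by the allowed fields
        return {key: getattr(self, key) for key in self._data_dict_keys}

    @classmethod
    def from_data_dict(cls, data_dict):
        # - Reduce the incoming record-fields to the allowed keys
        #   before handing them to the initializer
        return cls(**{
            key: value for key, value in data_dict.items()
            if key in cls._data_dict_keys
        })

    def matches(self, **criteria):
        return all(
            getattr(self, key, None) == value
            for key, value in criteria.items()
        )

# - Unknown fields in a retrieved record are silently discarded
artisan = Artisan.from_data_dict({
    'oid': 'oid-1', 'first_name': 'John',
    'last_name': 'Smith', 'junk': 1,
})
round_trip = artisan.to_data_dict()
```

The round trip between from_data_dict and to_data_dict is the contract that every backend, Mongo, JSON file, or RDBMS, relies on, which is why those members live on the final classes rather than in any one of the DataObject ABCs.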