You can find the first post in this series here.
Since we are discussing the specifications that could drive such a subsystem of an architecture, we first need to identify a methodology that provides optimal benefits when building it. We need a design method that both provides a great deal of flexibility and takes a granular approach to dealing with data. In such a case, I would turn to metadata-driven design, on which I have expounded in the past.
So, what is metadata-driven design (i.e., MDD)? For the sake of brevity, MDD can be thought of as an increment to domain-driven design, where metadata provides the blueprint for the storage, the data structures, and the functionality inherent to an enterprise-scale application. By adding more rows of metadata, stakeholders can extend the scope and functionality of the application with little to no additional software development. And if some actual software development is required to enhance the platform due to an unforeseen complexity, it should not present much difficulty. This increment should also be able to use additional sets of metadata, existing as extra layers on top of the original set(s). These additional layers can be thought of as dimensions, and, much like a communications protocol, the set of these layers can be thought of as a stack.
So, let’s showcase an artifact from the InfoQ article that started it all, which provides an introduction to MDD:
In this image, we have an example of a set of metadata that describes a logical group of data points (i.e., Attributes); we can address them as a group, especially since they reside on the same table. Using this metadata, we could generate static data structures, but by using “flexible” structures instead, we gain the benefit of not having to recompile or redeploy any of our server code. So, we will create flexible data structures that act more like a series of nested containers, using the metadata to determine the hierarchy of these containers.
For example, these attributes could be collected in a hash table of “[GroupName] > [Attributes]”, where the key “GroupName” is a string and the value “Attributes” is another hash table; the “Attributes” hash table would contain the actual pairing of each Attribute to its value. This particular metadata constructs the base layer of our stack. On top of that initial dimension (and presented at the bottom of the image), we have a new set of metadata that describes users’ permissions in relation to each Attribute; with it, we have created the preliminary parameters for our permissions schema. However, as stated earlier, we need more information in order to have a more complete perspective.
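To make the nested-container idea concrete, here is a minimal Python sketch of the “[GroupName] > [Attributes]” hash-of-hashes described above. The metadata rows, group names, and attribute names are hypothetical examples, not the actual schema from the InfoQ article; the point is only that adding a metadata row extends the structure without any recompilation or redeployment.

```python
# Hypothetical metadata rows: each row declares an Attribute and the
# logical group it belongs to. Adding a row extends the "schema"
# without touching server code.
metadata = [
    {"group": "Product", "attribute": "Price"},
    {"group": "Product", "attribute": "Description"},
    {"group": "Logistics", "attribute": "Weight"},
]

def build_containers(metadata_rows):
    """Build the [GroupName] -> [Attributes] hash-of-hashes from metadata."""
    containers = {}
    for row in metadata_rows:
        # The outer key is the group name; the inner hash table pairs
        # each Attribute with its (initially empty) value.
        containers.setdefault(row["group"], {})[row["attribute"]] = None
    return containers

record = build_containers(metadata)
record["Product"]["Price"] = "3.99"  # assign a value to a known attribute
```

Because the hierarchy is driven entirely by the metadata rows, a stakeholder-added Attribute appears in the containers the next time they are built, with no new code.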
So far, we have a way of packaging the actual data, and we have a way of storing user permissions in relation to the data points. Now, in order to finally create our permissions subsystem, we need a set of information that describes the state and history of each data point. Let’s assume that whenever we persist data to our system, we also write records to an auditing repository that describe the events that have just occurred. (After all, it’s recommended to have such a recorded history on hand; it’s our protective shield in the face of the dragon known as SOX.) For example, if the stakeholder Bob Smith changed a price from $3.99 to $2.99, we would log just that, along with any other edits from years prior. When we need to know who made the last edit to this product’s price, we could scour this huge table, whose many rows list such details from months and years past…but for the sake of performance and general reliability, we should designate a place for particularly vital information, like data about the most recent edit. So, we will create yet another dimension to the metadata and add it to our stack.
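The dual-write pattern above can be sketched in a few lines of Python. This is an illustrative, in-memory stand-in (the real repositories would be database tables); the function and field names are my own, and the Bob Smith price change is the example from the text.

```python
from datetime import date

audit_log = []      # the large, append-only auditing repository
latest_edits = {}   # small fast-lookup table keyed by (record id, attribute)

def record_edit(record_id, attribute, old_value, new_value, user, when):
    """Append the full event to the audit log AND update the latest-edit table."""
    event = {"record": record_id, "attribute": attribute,
             "old": old_value, "new": new_value,
             "user": user, "date": when}
    audit_log.append(event)                       # complete history (SOX shield)
    latest_edits[(record_id, attribute)] = event  # most recent edit, ready to hand

# Bob Smith changes the price from $3.99 to $2.99 on 5/10/2016.
record_edit("1234567890", "Price", "3.99", "2.99", "Bob Smith", date(2016, 5, 10))

# "Who made the last edit?" is now a single lookup, not a scan of the history.
last = latest_edits[("1234567890", "Price")]
```

The design choice here is simply to pay a tiny extra write at persistence time so that the most common question is answered without querying the immense audit table.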
Much like the initial dimension that described the structures of the actual data, this dimension will describe a structure that represents the state and history of each instance of an Attribute (i.e., per record), and this information will also be persisted to a table. We will call this information contextual data:
Now that we have a definition of contextual metadata, we can start to employ it when we persist main records, writing context records in parallel. Take the following set of contextual data as an example instance, which follows the definition in our metadata:
This contextual data lists a number of important properties regarding the “Price” Attribute on the record with EAN 1234567890. It tells us that Bob Smith locked this record’s price on 5/10/2016, and that he also edited the price that same day. With some simple queries, one could find this data within our immense auditing repository; however, by updating this small table while simultaneously depositing records into that huge vault, we gain performance, with the most pertinent information quickly returned via simple lookups. Now that we have the final, requisite ingredient for our recipe, we can take the steps to create our desired subsystem, which we will do in Part 3.
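As a closing illustration, here is one possible shape for a single contextual-data record in Python, mirroring the example instance in the text: the “Price” Attribute on the record with EAN 1234567890, locked and last edited by Bob Smith on 5/10/2016. The field names are hypothetical, not the article’s actual schema.

```python
# One contextual-data record: per-record, per-Attribute state and history.
# Field names are illustrative assumptions.
context_record = {
    "ean": "1234567890",
    "attribute": "Price",
    "locked_by": "Bob Smith",
    "locked_on": "2016-05-10",
    "last_edited_by": "Bob Smith",
    "last_edited_on": "2016-05-10",
}

def last_editor(context_rows, ean, attribute):
    """Answer 'who made the last edit?' from the small contextual table,
    without scanning the full auditing repository."""
    for row in context_rows:
        if row["ean"] == ean and row["attribute"] == attribute:
            return row["last_edited_by"]
    return None
```

A lock holder, lock date, and last-edit details held in this small table are exactly the ingredients the permissions subsystem in Part 3 will consult.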