Thursday, October 3, 2019

At the beginning there was a schema.


Recently we developed a simple “in house” JSON-based protocol for fetching user details from a directory, where apps would submit a ticket (a string identifying a user) to get user info. For example:
app request:
{ "ticket" : "T9yIC4c2mzR" }
server response:
{
   "authorized" : true, 
   "name" : "John Johnson", 
   "email" : "jj@....", 
   "licenses" : [....]
}  

To communicate in JSON we utilized our own DCodec library, which is capable not only of JSON serialization, but also of JSON validation against an ASN.1 schema when one is provided. So we defined this ASN.1 schema:
UserInfo ::= SEQUENCE 
{
  authorized  BOOLEAN DEFAULT FALSE, 
  name        UTF8String (SIZE (3..161)),
  email       UTF8String (SIZE (5..100)),
  licenses    SEQUENCE (SIZE (0..MAX)) OF License
}
...

Since the protocol was simple, the schema was not mandatory (some nodes couldn’t use it even if they wanted to), but it was convenient as informal documentation for developers (defining the message structure, field optionality, value ranges, defaults, etc.), as well as for message validation. Yet there were a few more benefits of having a schema, which we realized only after we implemented it.

The logic for handling the UserInfo.authorized field appeared to be very simple: the user is authorized only when the field is set to TRUE, while every other condition, including a problem with the message itself (e.g. it wasn't received), results in a non-authorized user. This logic proved correct during testing, so the code went into production.
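The rule above can be sketched in a few lines of Python. This is only an illustration, not the actual application code: the function name `is_authorized` and the use of the standard `json` module (rather than DCodec) are assumptions made for the example.

```python
import json
from typing import Optional


def is_authorized(raw_response: Optional[str]) -> bool:
    """Return True only when the response explicitly carries "authorized": true."""
    if raw_response is None:
        # The message was never received.
        return False
    try:
        message = json.loads(raw_response)
    except json.JSONDecodeError:
        # The message is not valid JSON.
        return False
    if not isinstance(message, dict):
        return False
    # A missing field falls back to the schema default (FALSE).
    return message.get("authorized", False) is True
```

Note that every failure path collapses into the same `False`, so the caller cannot tell a genuinely non-authorized user from a broken message.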

While in production, on rare occasions, we noticed that some responses included an extra field (a way for the directory server to report an internal problem):
{
   "error" : 5,
   "authorized" : false, 
   "name" : "", 
   "email" : "", 
   "licenses" : []
}  

We would never have caught the extra field if not for the “invalid message” events logged on the schema mismatch: from the app’s point of view nothing was wrong, the user just appeared to be “non-authorized”, and there was no visible interruption.
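To make the mechanism concrete, here is a minimal hand-written check that mirrors the UserInfo constraints and logs an “invalid message” event on a mismatch. It is a sketch in plain Python, not the DCodec API: the names `find_violations`, `accept`, `ALLOWED_FIELDS` and `STRING_SIZES` are hypothetical, and only the field names and SIZE limits come from the schema.

```python
import json
import logging

logger = logging.getLogger("userinfo")

# Field names and SIZE limits copied from the UserInfo schema.
ALLOWED_FIELDS = {"authorized", "name", "email", "licenses"}
STRING_SIZES = {"name": (3, 161), "email": (5, 100)}


def find_violations(message: dict) -> list:
    """Return a list of human-readable schema violations (empty if none)."""
    violations = []
    # Any field the schema does not define is a mismatch.
    for field in sorted(set(message) - ALLOWED_FIELDS):
        violations.append(f"unexpected field: {field}")
    # "authorized" may be absent (DEFAULT FALSE) but must be a boolean if present.
    if not isinstance(message.get("authorized", False), bool):
        violations.append("authorized: not a BOOLEAN")
    for field, (low, high) in STRING_SIZES.items():
        value = message.get(field)
        if not isinstance(value, str):
            violations.append(f"{field}: missing or not a string")
        elif not low <= len(value) <= high:
            violations.append(f"{field}: SIZE {len(value)} outside {low}..{high}")
    if not isinstance(message.get("licenses"), list):
        violations.append("licenses: missing or not a list")
    return violations


def accept(raw: str):
    """Parse a response; log and return None if it does not match the schema."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        logger.warning("invalid message: not a JSON object")
        return None
    violations = find_violations(message)
    if violations:
        logger.warning("invalid message: %s", "; ".join(violations))
        return None
    return message
```

The design point is that the check runs once, where the message enters the app, and leaves a log entry even when the downstream logic would have silently treated the user as non-authorized.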

Hence we updated the schema to allow the new “error” field, but we immediately caught another schema mismatch: a SIZE violation for “name” and “email”, which cannot be empty. At that point we realized that we were re-using a single message for two different cases, success and failure of fetching user info. So we changed the schema one more time to add a new message type for errors:
Response ::= CHOICE 
{
  user  UserInfo, -- successful response
  error Error     -- unsuccessful response
}

UserInfo ::= SEQUENCE 
{
  authorized  BOOLEAN DEFAULT FALSE, 
  name        UTF8String (SIZE (3..161)),
  email       UTF8String (SIZE (5..100)),
  licenses    SEQUENCE (SIZE (0..MAX)) OF License
}

Error ::= SEQUENCE
{
  code        INTEGER (0..255), 
  description UTF8String (SIZE (0..256))
}
Now the response could be either user info, say:
{"user":{"authorized":true, "name":"John Johnson", .... }}
or an “error”, like:
{"error":{"code":5, "description":"...."}}
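On the app side, the CHOICE maps naturally onto a dispatch over the single top-level key. The following is a rough sketch under the same assumptions as before (plain Python with the standard `json` module, hypothetical function names), not the DCodec decoding path.

```python
import json
import logging

logger = logging.getLogger("userinfo")


def parse_response(raw: str):
    """Return (alternative, body), where alternative is "user" or "error"."""
    message = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(message, dict) or len(message) != 1:
        raise ValueError("expected exactly one of 'user' or 'error'")
    alternative, body = next(iter(message.items()))
    if alternative not in ("user", "error") or not isinstance(body, dict):
        raise ValueError(f"unexpected alternative: {alternative!r}")
    return alternative, body


def is_authorized(raw: str) -> bool:
    try:
        alternative, body = parse_response(raw)
    except ValueError:
        # Malformed message: treat the user as non-authorized.
        return False
    if alternative == "error":
        # A directory failure is now explicit instead of looking like
        # an ordinary non-authorized user.
        logger.warning("directory error, code %s", body.get("code"))
        return False
    return body.get("authorized", False) is True
```

With this shape, a directory failure and a non-authorized user take different code paths, so the error can be logged or surfaced instead of being silently absorbed.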

Without the schema we could have just patched the app code (as “it’s not my problem”) by ignoring the error, allowing empty fields, or custom-handling every issue somewhere and somehow. Doing so would have masked the problem and/or spread the validation logic all over the application stack.

Another not-so-obvious benefit of using a schema is that the data definition could be stored together with the app sources, so data become a versioned part of Infrastructure as Code, Continuous Integration and Continuous Deployment - following the best practices of DevOps.


WITH SCHEMA
  • Clear boundary for data validation: the point where the data enters the system (layer, node).
  • Catch the problem early.
  • Data and code are in sync, defined and implemented together.
  • Precise logic for data definition/validation.
WITHOUT SCHEMA
  • Debug the entire app stack to pinpoint where the bad data is consumed.
  • Vendor-specific validation logic.
  • Certain conditions might be left unchecked/undetected (e.g. outside the app logic).
