Fun with json-schema

What started as a project to sanitize, generalize and centralize validation of user input to our web services evolved, during the development of our 2.7 release, into a mechanism that drives the development and documentation of both the web services themselves and parts of their consumers.

The problem we initially set out to solve was simple: how do we validate user input in an extensible and reusable way? As our web services produce and consume JSON, we naturally turned our eyes to the then relatively new json-schema IETF draft. We just had to write a json-schema validator, as at the time no maintained Java implementation existed that fit the rest of our architecture. Simple.
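To make the idea concrete, here is a minimal sketch of schema-driven validation in Python. It is not our validator: it handles only a toy subset of the draft (the `type`, `properties` and per-property boolean `required` keywords), and the `USER_SCHEMA` and sample payloads are made up for illustration.

```python
import json

# Toy subset of schema type names mapped to Python types.
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate(instance, schema, path="$"):
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    expected = schema.get("type")
    if expected in _TYPES:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(instance, bool) and expected in ("integer", "number")
        if not isinstance(instance, _TYPES[expected]) or bool_as_number:
            return [f"{path}: expected {expected}"]
    if isinstance(instance, dict):
        for name, sub in schema.get("properties", {}).items():
            if name in instance:
                errors += validate(instance[name], sub, f"{path}.{name}")
            elif sub.get("required", False):
                errors.append(f"{path}.{name}: required property is missing")
    return errors


# Hypothetical schema and request bodies, for illustration only.
USER_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "required": True},
        "age": {"type": "integer"},
    },
}

good = json.loads('{"name": "Ada", "age": 36}')
bad = json.loads('{"age": "thirty-six"}')

print(validate(good, USER_SCHEMA))
print(validate(bad, USER_SCHEMA))
```

The point of the design is that the rules live in data rather than in code: adding a new input type means writing a new schema, not a new validation routine.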

During the development of the json-schema validator we quickly realized that it would be convenient to have native Java representations of the objects described in those schemas, so that we could further process the user input in our management stack without a manual conversion step to our internal, private representations. Furthermore, keeping API documentation up to date is notoriously boring, so we might as well extract that information from the schemas too. I think you can guess where this is going.
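As a rough illustration of the code generation idea, the sketch below renders a Python dataclass from an object schema. It is a toy, not our generator: the `generate_dataclass` helper, the type mapping and the `USER_SCHEMA` example are all assumptions made for this post, and nested objects are simply typed as `dict`.

```python
from dataclasses import dataclass
from typing import Optional

# Toy mapping from schema types to Python type annotations.
_ANNOTATIONS = {
    "string": "str",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "array": "list",
    "object": "dict",
}


def generate_dataclass(class_name, schema):
    """Render the source code of a dataclass for an object schema."""
    lines = ["@dataclass", f"class {class_name}:"]
    props = schema.get("properties", {})
    # Dataclass fields without defaults must come before fields with
    # defaults, so emit required properties first (sorted() is stable).
    ordered = sorted(props.items(), key=lambda item: not item[1].get("required", False))
    for name, sub in ordered:
        annotation = _ANNOTATIONS.get(sub.get("type"), "object")
        if sub.get("required", False):
            lines.append(f"    {name}: {annotation}")
        else:
            lines.append(f"    {name}: Optional[{annotation}] = None")
    if not props:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


# Hypothetical schema, for illustration only.
USER_SCHEMA = {
    "type": "object",
    "properties": {
        "age": {"type": "integer"},
        "name": {"type": "string", "required": True},
    },
}

source = generate_dataclass("User", USER_SCHEMA)
print(source)

# Load the generated source to show that it is usable as a real class.
namespace = {"dataclass": dataclass, "Optional": Optional, "__name__": "generated_models"}
exec(source, namespace)
User = namespace["User"]
user = User(name="Ada")
```

A documentation generator can walk the same schema dictionary in much the same way, emitting prose or tables instead of class definitions.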

As an engineer, there are few things I find more rewarding than having a simple seed of an idea blossom into something much more useful than I had originally anticipated.

To make a long story short: it is a testament to the well-thought-out nature of the json-schema spec that we could simply add documentation and code generation tools (currently Java and Python, with a JavaScript one on the way) on top of our emerging set of schemas without having to break it.

In all honesty, json-schema does not provide as full-featured a type system as, say, Java or Haskell, but with a bit of elbow grease (and judicious use of the extends keyword) you can express fairly complex models without much trouble.
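To give a feel for how extends helps with modeling, here is a small sketch that flattens a chain of schemas into one set of properties. It is a simplification for illustration: the draft defines extends in terms of validation (an instance must also be valid against the base schema), whereas this hypothetical `effective_properties` helper treats it as plain property inheritance. The `RESOURCE`, `VOLUME` and `SNAPSHOT` schemas are invented examples.

```python
def effective_properties(schema):
    """Collect properties from a schema and everything it extends.

    Properties declared on the extending schema override inherited ones.
    """
    merged = {}
    bases = schema.get("extends", [])
    if isinstance(bases, dict):  # a single base schema rather than a list
        bases = [bases]
    for base in bases:
        merged.update(effective_properties(base))
    merged.update(schema.get("properties", {}))
    return merged


# Hypothetical model: every resource has an id, a volume adds a size,
# and a snapshot adds a timestamp on top of that.
RESOURCE = {
    "type": "object",
    "properties": {"id": {"type": "string", "required": True}},
}
VOLUME = {
    "type": "object",
    "extends": RESOURCE,
    "properties": {"size_gb": {"type": "integer", "required": True}},
}
SNAPSHOT = {
    "type": "object",
    "extends": VOLUME,
    "properties": {"taken_at": {"type": "string"}},
}

print(list(effective_properties(SNAPSHOT)))
```

Shared fields are declared once on the base schema, and a tool that generates classes or documentation can use the flattened view to see everything a given model carries.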