Monday, December 7, 2015

Using JSON Schema with Python to validate JSON data

By Vasudev Ram


I got to know about JSON Schema and the jsonschema Python library recently.


JSON Schema is a scheme (pun not intended) or method for checking that input JSON data adheres to a specified schema, roughly similar to what can be done for XML data using an XML Schema.

So I thought of writing a small program to try out the jsonschema library. Here it is:
# test_jsonschema_unix.py
# A program to try the jsonschema Python library.
# Uses it to validate some JSON data.
# Follows the Unix convention of writing normal output to the standard 
# output (stdout), and errors to the standard error output (stderr).
# Author: Vasudev Ram
# Copyright 2015 Vasudev Ram

from __future__ import print_function
import sys
import json
import jsonschema
from jsonschema import validate

# Create the schema, as a nested Python dict, 
# specifying the data elements, their names and their types.
schema = {
    "type" : "object",
    "properties" : {
        "price" : {"type" : "number"},
        "name" : {"type" : "string"},
    },
}

print("Testing use of jsonschema for data validation.")
print("Using the following schema:")
print(schema)
print("Pretty-printed schema:")
print(json.dumps(schema, indent=4))

# The data to be validated:
# Two records OK, three records in ERROR.
data = [
    { "name": "Apples", "price": 10},
    { "name": "Bananas", "price": 20},
    { "name": "Cherries", "price": "thirty"},
    { "name": 40, "price": 40},
    { "name": 50, "price": "fifty"}
]

print("Raw input data:")
print(data)
print("Pretty-printed input data:")
print(json.dumps(data, indent=4))

print("Validating the input data using jsonschema:")
for idx, item in enumerate(data):
    try:
        validate(item, schema)
        sys.stdout.write("Record #{}: OK\n".format(idx))
    except jsonschema.exceptions.ValidationError as ve:
        sys.stderr.write("Record #{}: ERROR\n".format(idx))
        sys.stderr.write(str(ve) + "\n")
The program is named test_jsonschema_unix.py because, as the source code shows, it sends normal output to sys.stdout (standard output) and errors to sys.stderr (standard error output), as Unix tools often do. So, to run it with stdout and stderr redirected to separate files, we can do this:

$ python test_jsonschema_unix.py >out 2>err

(where the filename out is for output and err is for error)

which gives us this for out:
Testing use of jsonschema for data validation.
Using the following schema:
{'type': 'object', 'properties': {'price': {'type': 'number'}, 'name': {'type': 'string'}}}
Pretty-printed schema:
{
    "type": "object", 
    "properties": {
        "price": {
            "type": "number"
        }, 
        "name": {
            "type": "string"
        }
    }
}
Raw input data:
[{'price': 10, 'name': 'Apples'}, {'price': 20, 'name': 'Bananas'}, {'price': 'thirty', 'name': 'Cherries'}, {'price': 40, 'name': 40}, {'price': 'fifty', 'name': 50}]
Pretty-printed input data:
[
    {
        "price": 10, 
        "name": "Apples"
    }, 
    {
        "price": 20, 
        "name": "Bananas"
    }, 
    {
        "price": "thirty", 
        "name": "Cherries"
    }, 
    {
        "price": 40, 
        "name": 40
    }, 
    {
        "price": "fifty", 
        "name": 50
    }
]
Validating the input data using jsonschema:
Record #0: OK
Record #1: OK
and this for err:
Record #2: ERROR
'thirty' is not of type 'number'

Failed validating 'type' in schema['properties']['price']:
    {'type': 'number'}

On instance['price']:
    'thirty'
Record #3: ERROR
40 is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    40
Record #4: ERROR
'fifty' is not of type 'number'

Failed validating 'type' in schema['properties']['price']:
    {'type': 'number'}

On instance['price']:
    'fifty'
So we can see that the good records went to out and the bad ones went to err, which shows that jsonschema validated each record against the schema and rejected the three that had wrongly typed fields.
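
Two things the run above does not exercise. First, validate() raises a single ValidationError per call, so for Record #4, where both name and price are wrong, only the price error shows up in err. Second, the schema has no "required" keyword, so a record that omits a field would pass. Here is a minimal sketch of one way to cover both, assuming the Draft4Validator class and its iter_errors() method from the same jsonschema library. The "required" list and the helper name error_messages are my additions for illustration, not part of the program above:

```python
from __future__ import print_function
from jsonschema import Draft4Validator

# Same schema as in the program above, plus a "required" list.
# (Assumption: "required" is added here for illustration only.)
schema = {
    "type": "object",
    "properties": {
        "price": {"type": "number"},
        "name": {"type": "string"},
    },
    "required": ["name", "price"],
}

validator = Draft4Validator(schema)

def error_messages(record):
    """Hypothetical helper: return every validation error message
    for one record, sorted, instead of stopping at the first one."""
    return sorted(err.message for err in validator.iter_errors(record))

both_wrong = error_messages({"name": 50, "price": "fifty"})
missing_price = error_messages({"name": "Dates"})
all_good = error_messages({"name": "Apples", "price": 10})

for label, messages in [("both_wrong", both_wrong),
                        ("missing_price", missing_price),
                        ("all_good", all_good)]:
    print(label, messages)
```

With this, the record with two bad fields yields two messages, the record without a price yields one message about the missing property, and the good record yields an empty list.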

