Skip to main content

JSON Schema Validation

PostgreSQL offers native support for JSON and JSONB data types, allowing you to store JSON documents directly in your tables. However, ensuring that these JSON documents adhere to a specific schema requires additional mechanisms. There are extensions like json-schema and json-schema-validator that can be used to validate JSON documents against a schema. We can do simple validation without relying on extensions.

Using CHECK constraint

CREATE TABLE IF NOT EXISTS users (
id INT GENERATED BY DEFAULT AS IDENTITY (START WITH 1001) PRIMARY KEY,
profile JSONB NOT NULL,
CONSTRAINT valid_profile CHECK (profile::JSONB @> '{"name": "string", "email": "string", "age": "integer"}'::JSONB)
);

In the above example, we have a table users with a profile column of type JSONB. We have defined a CHECK constraint valid_profile to validate the profile column against a JSON schema. The @> operator is used to check if the left-hand JSONB value contains the right-hand JSONB value. The right-hand JSONB value is the JSON schema that we want to validate against.

-- Insert a valid JSON document
INSERT INTO users (profile) VALUES
('{"name": "John Doe", "email": "[email protected]", "age": 30}'::JSONB);
-- Insert an invalid JSON document
INSERT INTO users (profile) VALUES
('{"name": "Jane Smith", "email": "invalid.email", "age": 30}'::JSONB);

Notice the second INSERT statement (invalid.email). Let's create a custom domain for the email type to validate the email address.

CREATE DOMAIN email AS VARCHAR(255)
CONSTRAINT valid_email CHECK (VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

Now, we can use the email domain in the users table.

CONSTRAINT valid_profile CHECK (profile::JSONB @> '{"name": "string", "email": "email", "age": "integer"}'::JSONB)

With a composite type

We can also use a composite type to define the schema and use it in the table definition.

CREATE TYPE user_type AS (
name VARCHAR(255),
email VARCHAR(255),
age INTEGER
);

CREATE TABLE user_profiles (
id INT GENERATED BY DEFAULT AS IDENTITY (START WITH 1001) PRIMARY KEY,
user_data user_type
);

INSERT INTO user_profiles (user_data) VALUES (row_to_json(('John Doe', '[email protected]', 30)::user_type));

INSERT INTO user_profiles (user_data) VALUES
((ROW('John Doe', '[email protected]', 30)::user_type));

Querying composite type vs JSON/JSONB

  • composite type requires casting to the composite type
SELECT id, user_data::user_type FROM user_profiles;
  • composite type requires accessing the attributes of the composite using the dot notation
SELECT user_data.name FROM user_profiles;
  • JSON/JSONB requires casting to the JSON/JSONB type
SELECT id, row_to_json(user_data) FROM user_profiles;
  • JSON/JSONB requires accessing the attributes of the JSON/JSONB using the arrow notation
SELECT user_data->>'name' FROM user_profiles;

Conclusion

While composite types offer benefits in terms of integration with PostgreSQL's relational model and type safety, JSONB provides superior flexibility, query capabilities, and performance optimization options, especially for unstructured or semi-structured data.

  • Composite Types: Best suited for static or well-defined data structures where the schema is unlikely to change frequently, and performance needs are consistent with the limitations around indexing.

  • JSON/JSONB: Ideal for dynamic or evolving data structures, document storage, or cases where you need to store and query nested or unstructured data efficiently.