Json Schema简介和Json Schema的高性能.net实现库 LateApexEarlySpeed.Json.Schema

Json Schema简介和Json Schema的高性能.net实现库 LateApexEarlySpeed.Json.Schema LateApexEarlySpeed. LateApexEarlySpeed Schema Speed. Early Json. 的高性能 Json .net

什么是Json Schema ?

Json schema是一种声明式语言,它可以用来标识Json的结构,数据类型和数据的具体限制,它提供了描述期望Json结构的标准化方法。
利用Json Schema, 你可以定义Json结构的各种规则,以便确定Json数据在各个子系统中交互传输时保持兼容和一致的格式。

一般来说,系统可以自己实现逻辑来判断当前json是否满足接口要求,比如是否某个字段存在,是否属性值是有效的。但当验证需求变得复杂后,比如有大量嵌套json结构,属性之间的复杂关联限制等等,则容易编写出考虑不全的验证代码。另外,当系统需要动态的json数据要求,比如先由用户自己决定他需要的json结构,然后系统根据用户表达的定制化json结构需求,帮助用户验证后续的json数据。这种系统代码编译时无法确定的json结构,就需要另一种解决方案。

Json Schema就是针对这种问题的比较自然的解决方案。它可以让你或你的用户描述希望的json结构和值的内容限制,有效属性,是否是required, 还有有效值的定义,等等。。利用Json Schema, 人们可以更好的理解Json结构,而且程序也可以根据你的Json Schema验证Json数据。
Json Schema语法的学习见官方介绍

比如下面的一个简单例子,用.net下的Json Schema实现库library LateApexEarlySpeed.Json.Schema进行Json数据的验证:

Json Schema (文件:schema.json):

{
  "type": "object",
  "properties": {
    "propBoolean": {
      "type": "boolean"
    },
    "propArray": {
      "type": "array",
      "uniqueItems": true
    }
  }
}

Json 数据 (文件:instance.json):

{
  "propBoolean": true,
  "propArray": [ 1, 2, 3, 4, 4 ]
}

C# 代码:

            string jsonSchema = File.ReadAllText("schema.json");
            string instance = File.ReadAllText("instance.json");

            var jsonValidator = new JsonValidator(jsonSchema);
            ValidationResult validationResult = jsonValidator.Validate(instance);

            if (validationResult.IsValid)
            {
                Console.WriteLine("good");
            }
            else
            {
                Console.WriteLine($"Failed keyword: {validationResult.Keyword}");
                Console.WriteLine($"ResultCode: {validationResult.ResultCode}");
                Console.WriteLine($"Error message: {validationResult.ErrorMessage}");
                Console.WriteLine($"Failed instance location: {validationResult.InstanceLocation}");
                Console.WriteLine($"Failed relative keyword location: {validationResult.RelativeKeywordLocation}");
            }

输出:

Failed keyword: uniqueItems
ResultCode: DuplicatedArrayItems
Error message: There are duplicated array items
Failed instance location: /propArray
Failed relative keyword location: /properties/propArray/uniqueItems

LateApexEarlySpeed.Json.Schema中文介绍

项目原始文档:https://github.com/lateapexearlyspeed/Lateapexearlyspeed.JsonSchema

中文文档:
LateApexEarlySpeed.Json.Schema是2023年12月发布的一个新的.net下的Json Schema实现库library,基于截止到2023年12月为止最新版的Json schema - draft 2020.12。
Json Schema验证功能经过了official json schema test-suite for draft 2020.12的测试。(部分排除的用例见下面的已知限制章节)

主要特点:

  • 基于微软.net下默认的System.Text.Json而非经典的Newtonsoft.Json
  • 高性能:和已有的知名且杰出的.net下的一些JsonSchema library相比,具有很好的性能 (在common case下,利用BenchmarkDotnet进行的性能测试)。用户请根据自己的使用场景进行性能验证

一些性能测试结果 (下面的测试比较都是在同样的使用方式下进行,见 性能建议 ):
12th Gen Intel Core i7-12800H, 1 CPU, 20 logical and 14 physical cores

[Host] : .NET 6.0.25 (6.0.2523.51912), X64 RyuJIT AVX2

DefaultJob : .NET 6.0.25 (6.0.2523.51912), X64 RyuJIT AVX2

有效数据用例:

Method Mean Error StdDev Gen0 Gen1 Allocated ValidateByPopularSTJBasedValidator 29.80 us 0.584 us 0.573 us 4.4556 0.2441 55.1 KB ValidateByThisValidator 15.99 us 0.305 us 0.300 us 1.9531 - 24.2 KB

无效数据用例:

Method Mean Error StdDev Median Gen0 Gen1 Allocated ValidateByPopularSTJBasedValidator 65.04 us 2.530 us 7.341 us 66.87 us 4.5776 0.1221 56.42 KB ValidateByThisValidator 15.47 us 1.160 us 3.421 us 17.14 us 1.4954 - 18.45 KB

Note: "STJ"是"System.Text.Json"的缩写,这个是.net sdk里默认带的json基础包, LateApexEarlySpeed.Json.Schema这个库也是基于这个STJ编写的。

基准Schema:

{
  "$id": "http://main",
  "type": "object",

  "additionalProperties": false,
  "patternProperties": {
    "propB*lean": {
      "type": "boolean"
    }
  },
  "dependentRequired": {
    "propNull": [ "propBoolean", "propArray" ]
  },
  "dependentSchemas": {
    "propNull": {
      "type": "object"
    }
  },
  "propertyNames": true,
  "required": [ "propNull", "propBoolean" ],
  "maxProperties": 100,
  "minProperties": 0,
  "properties": {
    "propNull": {
      "type": "null"
    },
    "propBoolean": {
      "type": "boolean",
      "allOf": [
        true,
        { "type": "boolean" }
      ]
    },
    "propArray": {
      "type": "array",
      "anyOf": [ false, true ],
      "contains": { "type": "integer" },
      "maxContains": 100,
      "minContains": 2,
      "maxItems": 100,
      "minItems": 1,
      "prefixItems": [
        { "type": "integer" }
      ],
      "items": { "type": "integer" },
      "uniqueItems": true
    },
    "propNumber": {
      "type": "number",
      "if": {
        "const": 1.5
      },
      "then": true,
      "else": true,
      "enum": [ 1.5, 0, 1 ]
    },
    "propString": {
      "type": "string",
      "maxLength": 100,
      "minLength": 0,
      "not": false,
      "pattern": "abcde"
    },
    "propInteger": {
      "$ref": "#/$defs/typeIsInteger",
      "exclusiveMaximum": 100,
      "exclusiveMinimum": 0,
      "maximum": 100,
      "minimum": 0,
      "multipleOf": 0.5,
      "oneOf": [ true, false ]
    }
  },
  "$defs": {
    "typeIsInteger": { "$ref": "http://inside#/$defs/typeIsInteger" },
    "toTestAnchor": {
      "$anchor": "test-anchor"
    },
    "toTestAnotherResourceRef": {
      "$id": "http://inside",
      "$defs": {
        "typeIsInteger": { "type": "integer" }
      }
    }
  }
}

有效基准数据:

{
  "propNull": null,
  "propBoolean": true,
  "propArray": [ 1, 2, 3, 4, 5 ],
  "propNumber": 1.5,
  "propString": "abcde",
  "propInteger": 1
}

无效基准数据:

{
  "propNull": null,
  "propBoolean": true,
  "propArray": [ 1, 2, 3, 4, 4 ], // Two '4', duplicated
  "propNumber": 1.5,
  "propString": "abcde",
  "propInteger": 1
}

基础用法

安装Nuget package

Install-Package LateApexEarlySpeed.Json.Schema
string jsonSchema = File.ReadAllText("schema.json");
string instance = File.ReadAllText("instance.json");

var jsonValidator = new JsonValidator(jsonSchema);
ValidationResult validationResult = jsonValidator.Validate(instance);

if (validationResult.IsValid)
{
    Console.WriteLine("good");
}
else
{
    Console.WriteLine($"Failed keyword: {validationResult.Keyword}");
    Console.WriteLine($"ResultCode: {validationResult.ResultCode}");
    Console.WriteLine($"Error message: {validationResult.ErrorMessage}");
    Console.WriteLine($"Failed instance location: {validationResult.InstanceLocation}");
    Console.WriteLine($"Failed relative keyword location: {validationResult.RelativeKeywordLocation}");
    Console.WriteLine($"Failed schema resource base uri: {validationResult.SchemaResourceBaseUri}");
}

输出信息

当json数据验证失败后,可以查看错误数据的具体信息:

  • IsValid: As summary indicator for passed validation or failed validation.

  • ResultCode: The specific error type when validation failed.

  • ErrorMessage: the specific wording for human readable message

  • Keyword: current keyword when validation failed

  • InstanceLocation: The location of the JSON value within the instance being validated. The value is a JSON Pointer.

  • RelativeKeywordLocation: The relative location of the validating keyword that follows the validation path. The value is a JSON Pointer, and it includes any by-reference applicators such as "$ref" or "$dynamicRef". Eg:

    /properties/width/$ref/minimum
    
  • SubSchemaRefFullUri: The absolute, dereferenced location of the validating keyword when validation failed. The value is a full URI using the canonical URI of the relevant schema resource with a JSON Pointer fragment, and it doesn't include by-reference applicators such as "$ref" or "$dynamicRef" as non-terminal path components. Eg:

    https://example.com/schemas/common#/$defs/count/minimum
    
  • SchemaResourceBaseUri: The absolute base URI of referenced json schema resource when validation failed. Eg:

    https://example.com/schemas/common
    

性能建议

尽可能的重用已实例化的JsonValidator实例(JsonValidator可以简单理解为代表一个json schema验证文档)来验证json数据,以便获得更高性能

外部json schema依赖的支持

除了自动支持当前schema文档内的引用关系,还支持外部json schema依赖:

  • 本地schema依赖文本
var jsonValidator = new JsonValidator(jsonSchema);
string externalJsonSchema = File.ReadAllText("schema2.json");
jsonValidator.AddExternalDocument(externalJsonSchema);
ValidationResult validationResult = jsonValidator.Validate(instance);
  • 远程schema url (实现库将访问网络来获得远程的schema)
var jsonValidator = new JsonValidator(jsonSchema);
await jsonValidator.AddHttpDocumentAsync(new Uri("http://this-is-json-schema-document"));
ValidationResult validationResult = jsonValidator.Validate(instance);

自定义keyword的支持

除了json schema specification中的标准keywords之外,还支持用户创建自定义keyword来实现额外的验证需求:

{
  "type": "object",
  "properties": {
    "prop1": {
      "customKeyword": "Expected value"
    }
  }
}
ValidationKeywordRegistry.AddKeyword<CustomKeyword>();
[Keyword("customKeyword")] // It is your custom keyword name
[JsonConverter(typeof(CustomKeywordJsonConverter))] // Use 'CustomKeywordJsonConverter' to deserialize to 'CustomKeyword' instance out from json schema text
internal class CustomKeyword : KeywordBase
{
    private readonly string _customValue; // Simple example value

    public CustomKeyword(string customValue)
    {
        _customValue = customValue;
    }

    // Do your custom validation work here
    protected override ValidationResult ValidateCore(JsonInstanceElement instance, JsonSchemaOptions options)
    {
        if (instance.ValueKind != JsonValueKind.String)
        {
            return ValidationResult.ValidResult;
        }

        return instance.GetString() == _customValue
            ? ValidationResult.ValidResult
            : ValidationResult.CreateFailedResult(ResultCode.UnexpectedValue, "It is not my expected value.", options.ValidationPathStack, Name, instance.Location);
    }
}
internal class CustomKeywordJsonConverter : JsonConverter<CustomKeyword>
{
    // Library will input json value of your custom keyword: "customKeyword" to this method.
    public override CustomKeyword? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Briefly: 
        return new CustomKeyword(reader.GetString()!);
    }

    public override void Write(Utf8JsonWriter writer, CustomKeyword value, JsonSerializerOptions options)
    {
        throw new NotImplementedException();
    }
}

Format支持

目前library支持如下format:

  • uri
  • uri-reference
  • date
  • time
  • date-time
  • email
  • uuid
  • hostname
  • ipv4
  • ipv6
  • json-pointer
  • regex

如果需要自定义format验证,可以实现一个FormatValidator子类并注册:

[Format("custom_format")] // this is your custom format name in json schema
public class TestCustomFormatValidator : FormatValidator
{
    public override bool Validate(string content)
    {
        // custom format validation logic here...
    }
}

// register it globally
FormatRegistry.AddFormatType<TestCustomFormatValidator>();

Other extension usage doc is to be continued .

限制

  • 目前library关注于验证,暂不支持annotation
  • 因为暂不支持annotation, 所以不支持如下keywords: unevaluatedProperties, unevaluatedItems
  • 目前不支持 content-encoded string

问题报告

欢迎把使用过程中遇到的问题和希望增加的功能发到github repo issue中

More doc is to be written

评论