Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid error output with OpenApi 3.1 schema #157

Closed
ekzobrain opened this issue Nov 16, 2023 · 13 comments · Fixed by #164
Closed

Invalid error output with OpenApi 3.1 schema #157

ekzobrain opened this issue Nov 16, 2023 · 13 comments · Fixed by #164

Comments

@ekzobrain
Copy link

Try this file. It has just one error at /servers/0/description , but validation produces a bunch of error even inside other root keywords. Tested against unreleased 2.1 branch. When changing description type from number to string at /servers/0/description - document passes validation.

oas-order.json

@ekzobrain ekzobrain mentioned this issue Nov 16, 2023
davishmcclurg added a commit that referenced this issue Nov 16, 2023
This cuts down on the number of errors when `UnevaluatedProperties`
fails.

Related:
- #157
- json-schema-org/json-schema-spec#1172
@davishmcclurg
Copy link
Owner

I think the problem here is unevaluatedProperties, which is used in OpenAPI meta/document schemas. It's a bit of a mess. See:

I don't think it's a new issue in the 2.1.0 branch, so I probably won't hold that up.

I need to read more about it, but I think issue is whether unevaluatedProperties can consider sibling keyword evaluations if those keywords failed. I tried out using all sibling keyword evaluations and it resulted in better errors:

irb(main):001:0> JSONSchemer.openapi(JSON.parse(File.read('/Users/dharsha/Downloads/oas-order.json'))).validate(output_format: 'basic', resolve_enumerators: true)
=>
{"valid"=>false,
 "keywordLocation"=>"",
 "absoluteKeywordLocation"=>"json-schemer://openapi31/schema-base#",
 "instanceLocation"=>"",
 "error"=>"value at root does not match schema",
 "errors"=>
  [{"valid"=>false,
    "keywordLocation"=>"/allOf/0/then/$ref/properties/servers/items/$ref/properties/description/type",
    "absoluteKeywordLocation"=>"https://spec.openapis.org/oas/3.1/schema/2022-10-07#/$defs/server/properties/description/type",
    "instanceLocation"=>"/servers/0/description",
    "error"=>"value at `/servers/0/description` is not a string"}]

davishmcclurg added a commit that referenced this issue Nov 28, 2023
This cuts down on the number of errors when `UnevaluatedProperties`
fails.

Related:
- #157
- json-schema-org/json-schema-spec#1172
@exterm
Copy link

exterm commented Dec 7, 2023

I'm experiencing problems with unevaluatedProperties too. I verified that the attached fix works for me.

@davishmcclurg
Copy link
Owner

I'm experiencing promblems with unevaluatedProperties too. I verified that the attached fix works for me.

Thanks for trying it out, @exterm. Just to be clear, the problem you are experiencing is with the extra unevaluatedProperties errors?

@exterm
Copy link

exterm commented Dec 8, 2023

I get extraneous

value at `/type` does not match schema

errors when something else is broken. My setup is a bit complex... let me explain.

I'm defining a custom metaschema. I've been able to simplify the metaschema to a much shorter one that still shows the same behavior:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://app.example.com/schemas/properties/v1/metaschema",
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/applicator": true,
        "https://json-schema.org/draft/2020-12/vocab/validation": true
    },
    "$dynamicAnchor": "meta",

    "title": "Core and Validation specifications meta-schema",
    "type": "object",
    "properties": {
      "properties": {
          "type": "object",
          "additionalProperties": {
              "$dynamicRef": "#meta"
          }
      },
      "type": { "$ref": "#/$defs/simpleTypes" }
    },
    "required": ["type"],
    "$defs": {
      "simpleTypes": {
        "enum": [
            "array",
            "boolean",
            "integer",
            "null",
            "number",
            "object",
            "string"
        ]
      }
  },
    "unevaluatedProperties": false
}

If I use the above metaschema to validate this schema:

{
        "type": "object",
        "properties": {
          "foo": {
            "type": "object",
            "foobar": 1,
          },
        },
      }

I get the expected errors

        "value at `/properties/foo/foobar` does not match schema",
        "value at `/properties` does not match schema",

but also

        "value at `/type` does not match schema",

Which goes away with your patch.

@davishmcclurg
Copy link
Owner

@exterm thanks for the extra info!

I think this is the simplest example that shows the issue:

schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
# =>
# [["/properties/x", "value at `/x` is not an integer"],
#  ["/unevaluatedProperties", "value at `/x` does not match schema"]]

And the same behavior with a valid property:

schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    },
    'y' => {
      'type' => 'string'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid', 'y' => 'valid' }).map { _1.fetch_values('schema_pointer', 'error') }
# =>
# [["/properties/x", "value at `/x` is not an integer"],
#  ["/unevaluatedProperties", "value at `/x` does not match schema"],
#  ["/unevaluatedProperties", "value at `/y` does not match schema"]]

The issue is unevaluatedProperties only considers a property to be "evaluated" if it validated successfully. Since properties/x (in the example above) failed to validate, properties itself also failed, which means unevaluatedProperties doesn't consider any of its children "evaluated".

There's some debate about this interpretation, but from my understanding of the spec, it seems correct. Unfortunately, it produces lots of unhelpful errors, so I'm thinking it may make sense to treat failing properties as evaluated. I need to double check that won't affect the overall validation result, though.

davishmcclurg added a commit that referenced this issue Dec 10, 2023
This cuts down on the number of errors when `unevaluatedProperties`
fails.

Related:
- #157
- json-schema-org/json-schema-spec#1172
@exterm
Copy link

exterm commented Dec 11, 2023

I need to double check that won't affect the overall validation result, though.

Reading through the whole discussion, it seems like the consensus reached in the end is that both interpretations are valid and don't change the validation result, as persisted in the ADR.

Also, at least one other implementation seems to have moved to the new interpretation (JsonSchema.Net).

I particularly found this post by Ben Hutton helpful in understanding the problem; it seems like the spec is a bit confused about itself here. I appreciate they found a conclusion to this though and ended up explicitly allowing both interpretations.

davishmcclurg added a commit that referenced this issue Dec 11, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
davishmcclurg added a commit that referenced this issue Dec 11, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
@davishmcclurg
Copy link
Owner

Reading through the whole discussion, it seems like the consensus reached in the end is that both interpretations are valid and don't change the validation result, as persisted in the ADR.

👍 agreed. Thanks for linking to the ADR—I hadn't seen that. It not changing the validation result made sense to me when I thought about it like this:

The overall validation result shouldn't be affected, since the only additional keywords that it considers are failed ones (meaning the entire schema fails regardless of the unevaluated keys/items).

I opened a pull request with the change (including unevaluatedItems as well): #164

@exterm do you mind checking it out again to make sure it behaves like you expect?

I included a test to describe some of the inconsistent behavior noted in the discussion. Validation result doesn't change, just which keys are considered "unevaluated".

@exterm
Copy link

exterm commented Dec 13, 2023

The fix in that PR seems different from the one that you had previously proposed.

Here's a more concise example of the problem that I am seeing:

    test "it" do
      schema = {
        "$schema": "https://json-schema.org/draft/2020-12/schema",

        "type": "object",
        "properties": {
          "foo": { "type": "string" },
          "bar": { "type": "string" },
        },
        "unevaluatedProperties": false,
      }

      data = {
        "foo": "hello world",
        "bar": {},
      }

      errors = JSONSchemer.schema(schema).validate(data).to_a

      expected_errors = [
        [{ "type" => "string" }, "value at `/bar` is not a string"],
      ]

      assert_equal expected_errors, errors.map { |e| [e["schema"], e["error"]] }
    end
Minitest::Assertion: --- expected
+++ actual
@@ -1,3 +1,4 @@
 #<Set:
  {[{"type"=>"string"}, "value at `/bar` is not a string"],
+  [false, "value at `/foo` does not match schema"],
+  [false, "value at `/bar` does not match schema"]}>

(I actually don't care about "value at `/bar` does not match schema", but I'd like it to stop complaining about /foo)

This is not fixed on #164.

It is however fixed in 3bbefa3 and 719a7da .

@davishmcclurg
Copy link
Owner

This is not fixed on #164.

Hm, it seems like it's working when I try it out your example:

repos/json_schemer unevaluated-properties % bin/console --simple-prompt
?> schema = {
?>   "$schema": "https://json-schema.org/draft/2020-12/schema",
?>
?>   "type": "object",
?>   "properties": {
?>     "foo": { "type": "string" },
?>     "bar": { "type": "string" },
?>   },
?>   "unevaluatedProperties": false,
>> }
=> {:$schema=>"https://json-schema.org/draft/2020-12/schema", :type=>"object", :properties=>{:foo=>{:type=>"string"}, :bar=>{:type=>"string"}}, :unevaluatedProperties=>false}
?> data = {
?>   "foo": "hello world",
?>   "bar": {},
>> }
=> {:foo=>"hello world", :bar=>{}}
>> errors = JSONSchemer.schema(schema).validate(data).to_a
=>
[{"data"=>{},
...
>> errors.map { |e| [e["schema"], e["error"]] }
=> [[{"type"=>"string"}, "value at `/bar` is not a string"]]

Maybe I'm missing something, though. What's your process for trying out the branch?

@exterm
Copy link

exterm commented Dec 14, 2023

OK, I'm thoroughly confused now. I couldn't reproduce the problem in that simple example today either. However, the problem still existed in my real world code. I started from the real world code again and simplified as much as I could. It seems that this might be related to $ref somehow. I then checked out your branch and added my example as an additional test case below the ones you added in the branch. This is the test case I added:

  def test_inconsistent_errors_with_refs
    schema = {
      "$schema" => "https://json-schema.org/draft/2020-12/schema",
      "allOf" => [
        { "$ref" => "#/$defs/rules" },
      ],
      "$defs" => {
        "rules" => {
          "type" => "object",
          "properties" => {
            "foo" => { "type" => "string" },
            "bar" => { "type" => "string" },
          },
        },
      },
      "unevaluatedProperties" => false,
    }

    data = {
      "foo" => "hello world",
      "bar" => 1,
    }

    errors = JSONSchemer.schema(schema).validate(data).to_a

    expected_errors = [
      [{ "type" => "string" }, "value at `/bar` is not a string"],
    ]

    assert_equal expected_errors, errors.map { |e| [e["schema"], e["error"]] }
  end

Which still gives me an error saying "value at `/foo` does not match schema". The error disappears when I remove "unevaluatedProperties": false. However, this doesn't seem to be fixed with any of your earlier commits either. I was previously testing within my Rails app and something seems to have interfered. Maybe this is expected behavior somehow and I'm just misinterpreting the spec?

@davishmcclurg
Copy link
Owner

It seems that this might be related to $ref somehow.

I think you're running into one of the issues they described with this interpretation of unevaluatedProperties. See examples: json-schema-org/json-schema-spec#1172 (comment) and json-schema-org/json-schema-spec#1172 (comment)

The gist is introducing an intermediate schema (eg, $ref or allOf) changes which keys are considered "evaluated".

The fact that you ran into this problem so quickly and that it was confusing makes me question whether this is a good approach.

davishmcclurg added a commit that referenced this issue Dec 16, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
davishmcclurg added a commit that referenced this issue Dec 16, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
@exterm
Copy link

exterm commented Dec 18, 2023

Oh yes, you're right - I completely missed the "subschemas don't forward annotations" part when I read through the discussion the first time. Thank you for the links.

The fact that you ran into this problem so quickly and that it was confusing makes me question whether this is a good approach.

Yes, fully agreed.

It seems this is a place where the spec isn't very mature. Comments in that discussion about revising the whole annotation mechanism make that very obvious.

Not sure what to do for now. I know what behavior I'd like to see, but it doesn't seem to be possible to derive it from the spec, and I'm not even sure it'd be compatible with the spec.

On the other hand, it seems that error reporting isn't very clearly specced at all and there is no test suite for it, so as long as true / false validation results are preserved, there is some room for interpretation?

To fully resolve this, and only report "does not match schema" errors for properties that actually can't be successfully evaluated, it seems that subschemas need to forward their annotations? I wouldn't be surprised if that broke a ton of other functionality that assume the previous behavior.

davishmcclurg added a commit that referenced this issue Mar 2, 2024
Features:

- Global configuration
- Symbol key property defaults
- Better schema validation support

See individual commits for more details.

Closes:

- #157
- #162
- #167
- #173
davishmcclurg added a commit that referenced this issue Mar 2, 2024
Features:

- Global configuration
- Symbol key property defaults
- Better schema validation support

See CHANGELOG.md and individual commits for more details.

Closes:

- #157
- #162
- #167
- #173
@davishmcclurg davishmcclurg mentioned this issue Mar 2, 2024
@davishmcclurg
Copy link
Owner

I went ahead with the imperfect fix. It should reduce extraneous errors for common cases (ie, no intermediate schema) and that seems helpful. I'll revisit if it's confusing for people in practice.

It'll be released shortly in 2.2.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants