-
Notifications
You must be signed in to change notification settings - Fork 356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
infinite loop with Constraint::CHECK_MODE_APPLY_DEFAULTS #359
Comments
Thanks for finding this! Will see if I can get that fixed this morning; infinite recursion is not good! |
Have nailed it down to this so far as a minimal test case: {
"properties": {
"propertyOne": {
"anyOf": [
{ "$ref": "#" }
],
"default": {}
}
}
} |
good job finding the small schema reproducing the issue, my attempts failed because when I replaced and then work got in the way ;) |
Yeah, there's some bloody weird behavior happening in there. Still haven't found the root cause, although I'm getting there. Working on several things at once, so it's taking a while. |
Have located and fixed an unrelated bug with resolving references - @bighappyface, do you want it (and any other bugs I might find along the way to sorting this one) in the same PR, or do you want a separate PR per bug? |
OK, this is really nutty. I'm currently buried in the depths of |
@mathroc @shmax @bighappyface Is there any scenario where a |
Sorry, not an expert on that stuff. If you can provide a minimal code snippet I can help debug. |
The minimal schema example I posted above is all I've got at the moment unfortunately. If you can be a bit more specific I can try to isolate bits that might help. My hope is that if such a reference is never legal, then I can kill the problem when it sees the reference for the first time. If it's legal, then this will be much more complicated to fix. |
Fair enough. If you haven't sorted it out by tomorrow evening I'll fire up the debugger and help research. |
Thanks! :-) |
not sure what you mean. a recursive schema is valid (eg: any tree structure). It was added in #277 maybe @jojo1981 or @araines can help finding the issue here ? |
I know they're valid - I'm not asking about recursive schemas in general; I'm asking very specifically about whether the single case of a schema that infinitely recurses the root is ever valid (and if it is, is it ever something that anybody could actually use). That PR looks very useful - thank you for that. I have some more reading to do! |
well I would probably represent a binary tree like this: {
"type": "object",
"properties": {
"value": {
"type": "integer"
},
"left": {
"$ref": "#"
},
"right": {
"$ref": "#"
}
},
"required": "value"
} |
@mathroc Thank you - exactly what I needed :-). Complicated fix it is then! |
Although thinking about this a bit more, lazy dereferencing should be preventing this problem before it becomes one, so I've clearly got some more hunting to do. It's also quite possible that I've done something stupid in my own code and just haven't realised it yet. |
wondering about something, it might be wrong and unrelated but, in case it can help..
even if that's the case, I don't see what would cause an infinite recursion in I've now read #349 but cannot understand what is going wrong here, however now that I have taken a look at the code I have another question / suggestion, I added it as code review here : https://github.com/justinrainbow/json-schema/pull/349/files#r103655365 (not sure if you receive notification once a PR is closed) |
keeping just this part of the code still trigger the issue : // Apply default values from schema
if ($this->factory->getConfig(self::CHECK_MODE_APPLY_DEFAULTS)) {
if ($this->getTypeCheck()->isObject($value) && isset($schema->properties)) {
// $value is an object, so apply default properties if defined
// foreach ($schema->properties as $i => $propertyDefinition) {
// if (!$this->getTypeCheck()->propertyExists($value, $i) && isset($propertyDefinition->default)) {
// if (is_object($propertyDefinition->default)) {
// $this->getTypeCheck()->propertySet($value, $i, clone $propertyDefinition->default);
// } else {
// $this->getTypeCheck()->propertySet($value, $i, $propertyDefinition->default);
// }
// }
// }
} elseif ($this->getTypeCheck()->isArray($value)) {
// if (isset($schema->properties)) {
// // $value is an array, but default properties are defined, so treat as assoc
// foreach ($schema->properties as $i => $propertyDefinition) {
// if (!isset($value[$i]) && isset($propertyDefinition->default)) {
// if (is_object($propertyDefinition->default)) {
// $value[$i] = clone $propertyDefinition->default;
// } else {
// $value[$i] = $propertyDefinition->default;
// }
// }
// }
// } elseif (isset($schema->items)) {
// // $value is an array, and default items are defined - treat as plain array
// foreach ($schema->items as $i => $itemDefinition) {
// if (!isset($value[$i]) && isset($itemDefinition->default)) {
// if (is_object($itemDefinition->default)) {
// $value[$i] = clone $itemDefinition->default;
// } else {
// $value[$i] = $itemDefinition->default;
// }
// }
// }
// }
} elseif (($value instanceof self || $value === null) && isset($schema->default)) {
// $value is a leaf, not a container - apply the default directly
$value = is_object($schema->default) ? clone $schema->default : $schema->default;
}
} and commenting the last and not setting the defaults (when they're objects) prevent the issue too, so this "wworks (does not end up crashing): // Apply default values from schema
if ($this->factory->getConfig(self::CHECK_MODE_APPLY_DEFAULTS)) {
if ($this->getTypeCheck()->isObject($value) && isset($schema->properties)) {
// $value is an object, so apply default properties if defined
foreach ($schema->properties as $i => $propertyDefinition) {
if (!$this->getTypeCheck()->propertyExists($value, $i) && isset($propertyDefinition->default)) {
if (is_object($propertyDefinition->default)) {
// $this->getTypeCheck()->propertySet($value, $i, clone $propertyDefinition->default);
} else {
$this->getTypeCheck()->propertySet($value, $i, $propertyDefinition->default);
}
}
}
} elseif ($this->getTypeCheck()->isArray($value)) {
if (isset($schema->properties)) {
// $value is an array, but default properties are defined, so treat as assoc
foreach ($schema->properties as $i => $propertyDefinition) {
if (!isset($value[$i]) && isset($propertyDefinition->default)) {
if (is_object($propertyDefinition->default)) {
// $value[$i] = clone $propertyDefinition->default;
} else {
$value[$i] = $propertyDefinition->default;
}
}
}
} elseif (isset($schema->items)) {
// $value is an array, and default items are defined - treat as plain array
foreach ($schema->items as $i => $itemDefinition) {
if (!isset($value[$i]) && isset($itemDefinition->default)) {
if (is_object($itemDefinition->default)) {
// $value[$i] = clone $itemDefinition->default;
} else {
$value[$i] = $itemDefinition->default;
}
}
}
}
} elseif (($value instanceof self || $value === null) && isset($schema->default)) {
// $value is a leaf, not a container - apply the default directly
// $value = is_object($schema->default) ? clone $schema->default : $schema->default;
}
} |
@mathroc It's needed; attempting to apply everything at the leaf stage will result in a lot of failed tests, unless you have a better idea for how to do the leaf sets. In addition, there are precedence rules, which are the reason for the current code layout:
Yes, I do receive emails for closed PR stuff. I did have it using |
I've already done the steps you posted the code for above; that was one of the first things I tried. Am well beyond that now. I appreciate you tackling it though; a fresh pair of eyes never hurts, and you may well spot something that I've missed. |
OK, so I've gotten to the bottom of this, and I'm gonna say it... I don't think that this is a bug. After spending hours combing the code and bashing my head against a metaphorical brick wall when tracing stuff yielded no problematic code that could cause this, I went back to the minimal schema example and ran it in my head. Then I compared it to where the problem occurred in the original schema the bug was reported against. And as it turns out, assuming I'm understanding the spec correctly, the problem has been the schema all along. {
"properties": {
"propertyOne": {
"anyOf": [
{ "$ref": "#" }
],
"default": {}
}
}
} When this is dereferenced, it yields infinitely nested instances of the root object. Which is legal, and this library does lazy-dereferencing, so it's theoretically not a problem. EXCEPT: The way
The result is an infinitely nested set of So... while clearly both legal and desired behavior according to the schema, obeying an instruction to create an infinitely nested set of objects is clearly not a good outcome for anyone who actually wants their validation to finish. @mathroc @shmax @bighappyface Is this something that should be left caveat emptor, and it's the user's responsibility not to try and set defaults with a schema that deliberately pulls this kind of stunt, or is it better to come up with some kind of detection mechanism to notice and abort infinitely recursive situations? |
ok! I think I understand what you mean. actually I think it should default to not create properties that are not required because it's doing more than "APPLY_DEFAULTS" and I'm not sure a flag to force creation of not required properties is needed. that way schema that are "validatable" are always safe what do you think? |
It's intended, because the spec doesn't say anything about only-apply-defaults-if-required, so assuming it seems dangerous. My current thoughts are to implement some kind of mechanism that makes setting defaults-on-defaults a noop (maybe by remembering the path somewhere?), but I haven't finished considering the ramifications of doing this and whether it's a good idea. |
I've filed an issue against the spec - in there, I've asked if there's a recommended way of dealing with this situation. If there is I think we should follow that - if not, then I guess we'll need to roll our own solution, whatever form that ends up taking. |
good point asking on json-schema-org/json-schema-spec. I was thinking that this are you aware of other validation libraries doing this ? In my mind |
I'm not particularly familiar with validation libraries other than this one, so I really can't say. I did search for previous issues on the spec repository regarding this and found nothing. The schema recommends validating defaults, so unless I'm badly misunderstanding something this is actually the intended behavior... and if so, then the intended behavior violates the spec. |
Here is how some of this was addressed in the past: Interesting stuff... |
@bighappyface it would prevent exhausting the memory (or reaching the nested call limit) but it would also build a partial recursive structure on the first "recursive default" found and might not apply defaults found afterward
@erayd not sure what would the resulting data look like but this might work ! @erayd is there anything wrong with only applying defaults for required properties ? I can't think of a use case where I could not make the properties |
Unless you have nested defaults, it would look normal, just as you'd expect with all defaults applied. If you defined nested defaults, then it would become noncompliant in order to break the recursion situations - but unfortunately would also catch non-recursive situations with nested defaults, which isn't ideal. I'd like to come up with a better approach that actually detects the recursion if I can think of a sane way to do it (I haven't thought of one yet; currently considering how to utilise analysis of the path for this).
Yes - behaving like this is risky, because it assumes that the schema author knows we only apply required defaults, or that the user can modify the schema (which if the schema URI isn't under their control may not be easily possible). Ignoring non-required defaults breaks any situation where a schema author (or a user of this library) expects it to be applied but isn't aware of the situation. An example is a webservice which consumes a user message and uses a schema to clamp it to ensure legitimacy and completeness before processing. In such a case, the webapp isn't necessarily doing any further checks on the user data, because it assumes the schema has done the work, including filling in missing elements. That was my primary motivation for adding the defaults feature in the first place - a simple way of ensuring data completeness prior to processing by the ingesting service, given that I'm already validating it against a schema anyway. |
that's exactly how I'm planning to use it :) we just make different assumptions, as said in json-schema-org/json-schema-spec/issues/263 it's because
just to be clear, that's where I have a different opinion, in the app I believe I should only rely on properties being present if they are required. I'm now thinking that the behavior of what do you think about having multiple modes for multiple interpretations of the |
Having multiple modes for applying defaults seems like a good idea. Draft-06 will obviate it, but older schemas will still be around for a long time, so it's a feature that makes a lot of sense. I've added |
Following discussion on #263, I think that for now, not setting defaults-on-defaults is a good idea. Does anyone object to this? If not, then I will go ahead and implement this restriction. |
Hy @erayd , it's exactly what I meant, I just tested and it works great, thx ! |
As has been noted, the problem is that "defaults" doesn't necessarially mean "the value to assume if unspecified" but usually, merely "a suitable initial value when creating an instance" (usually for user interfaces). Mutating defaults, I think, is somewhat outside the scope of JSON Schema, but it's still a useful feature that implementations can adopt, that applications may frequently may decide to make use of. For these cases where you do want to mutate the instance and apply defaults, yeah, I think it makes sense that you don't recursively mutate the default, just assume defaults are already fully expanded. If there's any suggestions for better names for these two behaviors, I'm interested in knowing. Normally the distinction is unambiguous, but here, where it's being used in the context of both user interfaces and application configuration, different people see "default" and think different things. |
@awwright What about something like this?
Using Using |
…365) * Don't try to fetch files that don't exist Throws an exception when the ref can't be resolved to a useful file URI, rather than waiting for something further down the line to fail after the fact. * Refactor defaults code to use LooseTypeCheck where appropriate * Test for not treating non-containers like arrays * Update comments * Rename variable for clarity * Add CHECK_MODE_ONLY_REQUIRED_DEFAULTS If CHECK_MODE_ONLY_REQUIRED_DEFAULTS is set, then only apply defaults if they are marked as required. * Workaround for $this scope issue on PHP-5.3 * Fix infinite recursion via $ref when applying defaults * Add missing second test for array case * Add test for setting a default value for null * Also fix infinite recursion via $ref for array defaults * Move nested closure into separate method * $parentSchema will always be set when $name is, so don't check it * Handle nulls properly - fixes issue #377
See #365 - Will be official in 6.0.0 |
…nbow#359) (jsonrainbow#365) * Don't try to fetch files that don't exist Throws an exception when the ref can't be resolved to a useful file URI, rather than waiting for something further down the line to fail after the fact. * Refactor defaults code to use LooseTypeCheck where appropriate * Test for not treating non-containers like arrays * Update comments * Rename variable for clarity * Add CHECK_MODE_ONLY_REQUIRED_DEFAULTS If CHECK_MODE_ONLY_REQUIRED_DEFAULTS is set, then only apply defaults if they are marked as required. * Workaround for $this scope issue on PHP-5.3 * Fix infinite recursion via $ref when applying defaults * Add missing second test for array case * Add test for setting a default value for null * Also fix infinite recursion via $ref for array defaults * Move nested closure into separate method * $parentSchema will always be set when $name is, so don't check it * Handle nulls properly - fixes issue jsonrainbow#377
* Add URI translation for retrieval & add local copies of spec schema * Add use line for InvalidArgumentException & adjust scope (#372) Fixes issue #371 * add quiet option (#382) * add quiet option * use verbose instead of quiet * add quiet option * always output dump-schema * always output dump-schema-url * fix typo and ws * [BUGFIX] Add provided schema under a dummy / internal URI (fixes #376) (#378) * Add provided schema under a dummy / internal URI (fixes #376) In order to resolve internal $ref references within a user-provided schema, SchemaStorage needs to know about the schema. As user-supplied schemas do not have an associated URI, use a dummy / internal one instead. * Remove dangling use * Change URI to class constant on SchemaStorage * Add option to disable validation of "format" constraint (#383) * Add more unit tests (#366) * Add test coverage for coercion API * Complete test coverage for SchemaStorage * Add test coverage for ObjectIterator * Add exception test for JsonPointer * MabeEnum\Enum appears to use singletons - add testing const * Don't check this line for coverage mbstring is on all test platforms, so this line will never be reached. * Add test for TypeConstraint::validateTypeNameWording() * Add test for exception on TypeConstraint::validateType() * PHPunit doesn't like an explanation with its @codeCoverageIgnore... * Add various tests for UriRetriever * Add tests for FileGetContents * Add tests for JsonSchema\Uri\Retrievers\Curl * Add missing bad-syntax test file * Restrict ignore to the exception line only * Fix exception scope * Allow the schema to be an associative array (#389) * Allow the schema to be an associative array Implements #388. * Use json_decode(json_encode()) for array -> object cast * Skip exception check on PHP versions < 5.5.0 * Skip test on HHVM, as it's happy to encode resources * Enable FILTER_FLAG_EMAIL_UNICODE for email format if present (#398) * Don't throw exceptions until after checking anyOf / oneOf (#394) Fixes #393 * Fix infinite recursion on some schemas when setting defaults (#359) (#365) * Don't try to fetch files that don't exist Throws an exception when the ref can't be resolved to a useful file URI, rather than waiting for something further down the line to fail after the fact. * Refactor defaults code to use LooseTypeCheck where appropriate * Test for not treating non-containers like arrays * Update comments * Rename variable for clarity * Add CHECK_MODE_ONLY_REQUIRED_DEFAULTS If CHECK_MODE_ONLY_REQUIRED_DEFAULTS is set, then only apply defaults if they are marked as required. * Workaround for $this scope issue on PHP-5.3 * Fix infinite recursion via $ref when applying defaults * Add missing second test for array case * Add test for setting a default value for null * Also fix infinite recursion via $ref for array defaults * Move nested closure into separate method * $parentSchema will always be set when $name is, so don't check it * Handle nulls properly - fixes issue #377 * Add option to also validate the schema (#357) * Remove stale files from #357 (obviated by #362) (#400) * Stop #386 sneaking in alongside another PR backport
…nbow#359) (jsonrainbow#365) * Don't try to fetch files that don't exist Throws an exception when the ref can't be resolved to a useful file URI, rather than waiting for something further down the line to fail after the fact. * Refactor defaults code to use LooseTypeCheck where appropriate * Test for not treating non-containers like arrays * Update comments * Rename variable for clarity * Add CHECK_MODE_ONLY_REQUIRED_DEFAULTS If CHECK_MODE_ONLY_REQUIRED_DEFAULTS is set, then only apply defaults if they are marked as required. * Workaround for $this scope issue on PHP-5.3 * Fix infinite recursion via $ref when applying defaults * Add missing second test for array case * Add test for setting a default value for null * Also fix infinite recursion via $ref for array defaults * Move nested closure into separate method * $parentSchema will always be set when $name is, so don't check it * Handle nulls properly - fixes issue jsonrainbow#377
I've found an issue, with some schema, the validator will go into an infinite loop and eventually terminate when the allowed memory is exhausted, you can reproduce the issue with this code :
I'm not sure yet what is causing this. I'll guess a infinite recursion has been introduced in UndefinedConstraint::validateCommonProperties. I'll try to make another reproduction of the bug with a smaller schema
The text was updated successfully, but these errors were encountered: