Add URI translation, package:// URI scheme & bundle spec schemas #362

Merged: bighappyface merged 3 commits into jsonrainbow:6.0.0-dev from the distribution-schemas branch on Mar 7, 2017

erayd
Contributor

@erayd erayd commented Feb 24, 2017

What

  • Adds URI translation for schema retrieval.
  • Adds default translation for package:// URI scheme, to allow loading schema files from within the json-schema package root.
  • Adds default translation to intercept requests for the official spec schemas and retrieve them from disk instead.

Why

  • So that otherwise remote requests can be served from disk (reliability, performance).
  • Remove potential for users to depend on json-schema.org being up (see this).
  • So that refs to remote schemas that no longer exist can be remapped to a new location.

Summary

  • package://* is translated by default to file:///path/to/json-schema/*
  • draft-03 and draft-04 official spec schemas are translated by default to package://dist/schema/*
  • New translations can be added by the user via UriRetriever::setTranslation().

Adding translations is a bit cumbersome for the user ($factory->getSchemaStorage()->getUriRetriever()->setTranslation()), but is unlikely to be used often, and this seemed like the best place to put it.
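
As a rough illustration of the behavior summarized above, here is a minimal standalone sketch of a regex-based URI rewriter. It is not the library's implementation: the function name `translateUri()`, its parameters, and the `$root` value are assumptions made for the example. Only the spec-schema rule and the `/path/to/json-schema` placeholder are taken from this PR.

```php
<?php
// Illustrative sketch only; not the code merged in this PR.

/**
 * Apply each regex rule in order, then resolve package:// last.
 *
 * @param array  $rules       pattern => replacement pairs (hypothetical rule set)
 * @param string $uri         URI to translate
 * @param string $packageRoot assumed absolute path of the package root
 *
 * @return string the translated URI
 */
function translateUri(array $rules, $uri, $packageRoot)
{
    foreach ($rules as $pattern => $replacement) {
        $uri = preg_replace($pattern, $replacement, $uri);
    }

    // package:// is resolved after all other rules, so a rule may target it.
    $prefix = 'package://';
    if (strpos($uri, $prefix) === 0) {
        $uri = 'file://' . rtrim($packageRoot, '/') . '/' . substr($uri, strlen($prefix));
    }

    return $uri;
}

$rules = array(
    // Spec-schema rule as quoted in this PR's diff.
    '|^https?://json-schema.org/draft-(0[34])/schema#?|' => 'package://dist/schema/json-schema-draft-$1.json',
);

$root = '/path/to/json-schema'; // placeholder path from the PR description

echo translateUri($rules, 'http://json-schema.org/draft-04/schema#', $root), "\n";
echo translateUri($rules, 'https://example.com/other.json', $root), "\n";
```

Under these assumptions, the draft-04 URL resolves to a file:// path under the package root, while the unrelated URL passes through unchanged.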

No need to keep duplicate files around in package://tests/fixtures/ if
we're distributing them for users anyway.
@erayd erayd changed the title Add URI translation, package-local scheme & bundle spec schemas Add URI translation, package:// URI scheme & bundle spec schemas Feb 24, 2017
    */
    protected $translationMap = array(
        // use local copies of the spec schemas
        '|^https?://json-schema.org/draft-(0[34])/schema#?|' => 'package://dist/schema/json-schema-draft-$1.json'
Contributor

nice, I thought I would add https:// support once it's supported but it's even better :)

Collaborator

This feels vaguely not right to me... can't quite put my finger on it, but something about baking these specific use cases into what was previously a very generic class. I'm not totally sure I understand the concept--the idea is that we can sort of intercept what would otherwise be an external request for a resource, and instead proxy one of the local files?

Contributor Author

@shmax The feature itself is generic - it's essentially a regex URI rewriter; you can transform any URI into any other URI.

There are two predefined rules:

  • A rule that allows you to load any schema file from within the package by using a package:// URL prefix;
  • A rule that rewrites official spec schema URLs into local ones.

The user can add more rules, or not, as they see fit.

Are you able to clarify in more detail what aspect of this you're uncomfortable with? If I can get a better understanding of what you're objecting to, then I can try to come up with an approach you're happier with.

Collaborator

It just seems that the feature is entirely bespoke for the needs of the schema validator feature you're working on. You supply an API for adding new translations, but then instead of using it you just pre-config this generic class for your specific needs. Minimally, I would wonder if you could instantiate an instance of this class in-place where you need it, and eat your own dog food. I guess I'm not even too sure if redirecting what one expects to be an external request to a local file is even a good idea? I know we don't generally expect the remote file to change, but what if it does, particularly down the road when we get to draft 5? If the goal is to avoid remote requests, I'm wondering if a better idea might be to add a /cache directory with some kind of expiry or hash mechanism, go ahead and do the remote request, and store a copy of the file. If you're worried about remote being unavailable, you could prime the directory with the files, and check it in.

I'm just spouting ideas, here. I don't know that we necessarily have to solve all this now, and if everybody else is okay with this as-is I can live with it.

Contributor Author

  1. Yes, the original motivation for adding this was the schema validation stuff. It has usefulness beyond just that though - I figured that as long as I was creating a mechanism for bundling local files anyway, then I might as well make it as widely useful as possible, rather than writing something that only serves my specific use-case.
  2. I preconfigured this specific use-case, because it seems like the one that will be overwhelmingly useful to lots of people, and the users it's most likely to help are also the users least likely to explicitly enable it, because reaching that class through the current API is both a pain in the ass and non-obvious. Doing it this way will also benefit the people who dereference spec URLs anywhere, not just specifically inside the schema-validation feature. It's also a significant security improvement over requesting the spec via HTTP.
  3. I put the preconfiguration in that class specifically, because it seemed like the best place to put it - UriRetriever is where it's actually used, and the factory isn't available from inside UriRetriever or any of its related classes.
  4. Is part of your argument that this behavior shouldn't be default-enabled? I personally feel that having it default-enabled is a good thing (per reasons above), but if you disagree then that does change things somewhat.
  5. If the URL asks for anything other than the existing draft-03 or draft-04 specs, the request will just go through unaltered. It only intercepts those specific files, and leaves everything else alone.
  6. If the file located at that URL changes, then IMO the most likely cause is either some kind of error / infrastructure problem (in which case getting a valid copy of the file is problematic), or malicious action (in which case the file can't be trusted). It's a widely-used spec; that file should never change - if there is an error in the spec, then a new version of the spec should be released, rather than silently changing the validation behavior of an existing spec version.
  7. Adding a cache directory implies write access to the filesystem, which comes with its own problems. Personally, I feel that's a road we shouldn't go down - but I'd be interested to hear your thoughts on why it might be a good idea.

You're right, we don't need to solve all this now - but it seems worth having the discussion. There's no huge rush to get any of this stuff merged; it's all nice-to-haves rather than anything critical. If this PR goes nowhere, I'd be OK with that - it just seemed like a good way to solve the current problem while also providing additional value.

Contributor Author

@shmax If I've made you feel like you're having to pick battles, then something is wrong with my approach to discussing this, and for that I am sincerely sorry. I consider your input valuable, and I'm also new to this project - I have no desire or right to be stepping on toes.

I know you've said LGTM above, but would you prefer it if we just put this whole thing on hold for a while and worked on other things, and come back to it later?

I do still need to fix that recursion bug, but I can refrain from any feature-add work for a while if you feel that a break would be helpful. I don't want you to feel like I'm pressuring you; no objection to robust discussion on the merits, but if it's getting to the point of you wanting to back out, then I can't help but feel I've been a bit too robust with the way I've gone about it.

Collaborator

No, no, you're doing great. I only meant that none of my concerns are really heavy enough to warrant stalling this if it will hold up your other PR (which is the fun one). If you want to keep mulling it, one idea would be to merge it into your other PR. Nothing else really depends on this, does it?

Contributor Author

@shmax Oh, gotcha - thanks for clarifying!

I think that perhaps this warrants a bit more thought - letting it sit here a while is not a problem; as you say, nothing else depends on it - and the schema-validation thing is something I'm doing because there was an issue that asked about it, I have a bit of time to contribute, and I figured it would be helpful. There's no pressing need for that to be finished in a hurry.

I'll base #357 on this PR for now - that way I can keep working on that one, and it won't hold anything up over there - and we can revisit this once more before the schema-validation stuff is merged, to see whether I / anyone else still thinks this is a good idea once it's sat quietly for a bit. If we all like it at that point, great! If not, then I'm happy to refactor to a different approach.

Does that sound like a good plan?

Collaborator

SGTM 🥇

Contributor Author

Sweet :-).

@mathroc
Contributor

mathroc commented Feb 24, 2017

translations map is a nice addition, I usually extend the SchemaStorage, this is going to make that a bit easier :)
I map https://my.app.com/schema/ to a folder and want to have the local files when developing/testing and to avoid a network request on production

👍
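
The kind of mapping described in this comment could be expressed as a single translation rule. The sketch below is only an assumption about how such a rule might be built: the helper `makePrefixRule()` and the directory `/srv/app/schemas` are hypothetical, and only the URL prefix comes from the comment.

```php
<?php
// Hypothetical helper: build a [pattern, replacement] pair that maps a URL
// prefix to a local directory.
function makePrefixRule($urlPrefix, $localDir)
{
    // preg_quote() escapes regex metacharacters in the prefix, e.g. the dots.
    $pattern = '|^' . preg_quote($urlPrefix, '|') . '|';
    $replacement = 'file://' . rtrim($localDir, '/') . '/';

    return array($pattern, $replacement);
}

// '/srv/app/schemas' is an assumed location for the local schema files.
list($pattern, $replacement) = makePrefixRule('https://my.app.com/schema/', '/srv/app/schemas');

$local = preg_replace($pattern, $replacement, 'https://my.app.com/schema/user.json');
echo $local, "\n";
```

A pair like this could presumably be registered through the setTranslation() call mentioned in the PR description, assuming that method accepts a pattern and a replacement. Note that a local directory containing `$` or backslashes would need escaping before being used as a preg_replace replacement.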

Allows users to rewrite to package:// targets and still have the URI
work.
@erayd
Contributor Author

erayd commented Feb 25, 2017

@bighappyface Per above discussion with @shmax, please don't merge this one yet - we'd like to think on it a bit more first.

@erayd erayd changed the base branch from master to 6.0.0-dev March 2, 2017 21:49
@erayd erayd mentioned this pull request Mar 2, 2017
@erayd
Contributor Author

erayd commented Mar 3, 2017

@shmax I still like this approach the way it is - do you still feel that things would be better changed? If so, do you have any suggestions for what should take its place?

@shmax
Collaborator

shmax commented Mar 3, 2017

Naw, it's fine. :shipit:

@erayd
Contributor Author

erayd commented Mar 3, 2017

@shmax OK. Ready to roll for merging this / the schema validation?

Also, would you prefer these merge first, or for your error one to be first? I'm not fussed on the order.

@shmax
Collaborator

shmax commented Mar 3, 2017

Yep, go for it. I'm in no rush, either. I don't mind resolving.

@erayd
Contributor Author

erayd commented Mar 3, 2017

OK. @bighappyface?

Collaborator

@bighappyface bighappyface left a comment

+1

@bighappyface bighappyface merged commit 72b94c1 into jsonrainbow:6.0.0-dev Mar 7, 2017
@erayd erayd deleted the distribution-schemas branch March 7, 2017 18:54
@erayd erayd mentioned this pull request Mar 7, 2017
erayd added a commit to erayd/json-schema that referenced this pull request Mar 21, 2017
erayd added a commit to erayd/json-schema that referenced this pull request Mar 21, 2017
bighappyface pushed a commit that referenced this pull request Mar 22, 2017
* Add URI translation for retrieval & add local copies of spec schema

* Add use line for InvalidArgumentException & adjust scope (#372)

Fixes issue #371

* add quiet option (#382)

* add quiet option

* use verbose instead of quiet

* add quiet option

* always output dump-schema

* always output dump-schema-url

* fix typo and ws

* [BUGFIX] Add provided schema under a dummy / internal URI (fixes #376) (#378)

* Add provided schema under a dummy / internal URI (fixes #376)

In order to resolve internal $ref references within a user-provided
schema, SchemaStorage needs to know about the schema. As user-supplied
schemas do not have an associated URI, use a dummy / internal one instead.

* Remove dangling use

* Change URI to class constant on SchemaStorage

* Add option to disable validation of "format" constraint (#383)

* Add more unit tests (#366)

* Add test coverage for coercion API

* Complete test coverage for SchemaStorage

* Add test coverage for ObjectIterator

* Add exception test for JsonPointer

* MabeEnum\Enum appears to use singletons - add testing const

* Don't check this line for coverage

mbstring is on all test platforms, so this line will never be reached.

* Add test for TypeConstraint::validateTypeNameWording()

* Add test for exception on TypeConstraint::validateType()

* PHPunit doesn't like an explanation with its @codeCoverageIgnore...

* Add various tests for UriRetriever

* Add tests for FileGetContents

* Add tests for JsonSchema\Uri\Retrievers\Curl

* Add missing bad-syntax test file

* Restrict ignore to the exception line only

* Fix exception scope

* Allow the schema to be an associative array (#389)

* Allow the schema to be an associative array

Implements #388.

* Use json_decode(json_encode()) for array -> object cast

* Skip exception check on PHP versions < 5.5.0

* Skip test on HHVM, as it's happy to encode resources

* Enable FILTER_FLAG_EMAIL_UNICODE for email format if present (#398)

* Don't throw exceptions until after checking anyOf / oneOf (#394)

Fixes #393

* Fix infinite recursion on some schemas when setting defaults (#359) (#365)

* Don't try to fetch files that don't exist

Throws an exception when the ref can't be resolved to a useful file URI,
rather than waiting for something further down the line to fail after
the fact.

* Refactor defaults code to use LooseTypeCheck where appropriate

* Test for not treating non-containers like arrays

* Update comments

* Rename variable for clarity

* Add CHECK_MODE_ONLY_REQUIRED_DEFAULTS

If CHECK_MODE_ONLY_REQUIRED_DEFAULTS is set, then only apply defaults
if they are marked as required.

* Workaround for $this scope issue on PHP-5.3

* Fix infinite recursion via $ref when applying defaults

* Add missing second test for array case

* Add test for setting a default value for null

* Also fix infinite recursion via $ref for array defaults

* Move nested closure into separate method

* $parentSchema will always be set when $name is, so don't check it

* Handle nulls properly - fixes issue #377

* Add option to also validate the schema (#357)

* Remove stale files from #357 (obviated by #362) (#400)

* Stop #386 sneaking in alongside another PR backport
erayd added a commit to erayd/json-schema that referenced this pull request Mar 22, 2017
…nrainbow#362)

* Add URI translation for retrieval & add local copies of spec schema

* Use dist copies of schemas

No need to keep duplicate files around in package://tests/fixtures/ if
we're distributing them for users anyway.

* Move package:// translation after all other rules

Allows users to rewrite to package:// targets and still have the URI
work.
erayd added a commit to erayd/json-schema that referenced this pull request Mar 22, 2017

4 participants