Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with custom tags #457

Closed
isaacs opened this issue Apr 4, 2023 · 11 comments · Fixed by #467
Closed

Issues with custom tags #457

isaacs opened this issue Apr 4, 2023 · 11 comments · Fixed by #467
Labels
bug Something isn't working

Comments

@isaacs
Copy link
Contributor

isaacs commented Apr 4, 2023

Describe the bug
The docs contain a link to https://github.com/eemeli/yaml/blob/master/src/tags/yaml-1.1/set.js in the custom tags section.

To Reproduce
Self explanatory

Expected behaviour
Should link to the implementation of the !!set custom tag. (I'm guessing it's this?)

Versions (please complete the following information):

  • Environment: [e.g. Node.js 14.7.0 or Chrome 87.0]
  • yaml: 2.2

Additional context

Assuming that https://github.com/eemeli/yaml/blob/main/src/schema/yaml-1.1/set.ts is the correct link, there's still (at least) a documentation shortcoming in this section.

It is provided as an example of how to implement a complex custom type, but looking at that code, it appears to be relying on createPair, Pair, createNode, and other functions that are not exported by the library. The mapTag is exported, but attempting to just hijack its createNode method fails with TypeError: schema.createPair is not a function

It feels like something was supposed to be exported (or documented?) when parseMap was removed, but I'm not finding it.

@eemeli
Copy link
Owner

eemeli commented Apr 4, 2023

You're right; the link should've pointed at the file you found and there were a couple of missing exports. Pair is already exported from 'yaml', but now these are also available:

import { createNode, createPair } from 'yaml/util'

I've released v2.3.0-1 that should make the necessary things public; it's available on npm as yaml@next. I'd be happy to leave this issue open for a short while effectively as a tracker for any other issues with custom tags that you might come across, so that they can get fixed in the next proper release.

@eemeli eemeli changed the title broken doc link to https://github.com/eemeli/yaml/blob/master/src/tags/yaml-1.1/set.js Issues with custom tags Apr 4, 2023
@isaacs
Copy link
Contributor Author

isaacs commented Apr 4, 2023

Fantastic, thanks! I'll make another pass at this soon and report back if there are any other blockers. (Stretch goal is to send a doc PR adding a complex object/map example to the existing string-type example.)

@isaacs
Copy link
Contributor Author

isaacs commented Apr 5, 2023

Got this pretty much working, here's an example of a type for a null-prototype object, about the simplest non-scalar type tap-yaml has: https://github.com/tapjs/tap-yaml/blob/isaacs/yaml2/lib/types/null-object.js

This bit here feels like it could be more elegant somehow: https://github.com/tapjs/tap-yaml/blob/isaacs/yaml2/lib/types/null-object.js#L15-L29

It's kind of surprising that I'm getting a YAMLMap in the resolve() call, I had thought from the docs that the nodeClass would be used to instantiate the node. Creating a map and pulling off its items works, but it feels like I'm delving into internals in a bad way.

Same with createNode, it seems like there's got to be a cleaner way to "parse all the things in the configured way and just give me the items". The alternative approach I'd tried (which also worked, but was worse) was to just copy-paste the logic out of YAMLMap, but that seemed to me like it would be even more brittle, since any change or addition in the config options would mean updating to support it properly.

@isaacs
Copy link
Contributor Author

isaacs commented Apr 5, 2023

@eemeli So here's what I'm thinking would make this a whole lot nicer, with relatively little change to the inner workings.

First, move the existing logic in mapTag.createNode to a static method on the YAMLMap class, which instead of calling new YAMLMap(...), call new this(...), so that any child class will use itself by default. (Unless it overrides the method entirely, which is also fine, I guess.)

Then, in src/doc/createNode.ts, do this:

diff --git a/src/doc/createNode.ts b/src/doc/createNode.ts
index d6e7241..fbea500 100644
--- a/src/doc/createNode.ts
+++ b/src/doc/createNode.ts
@@ -97,7 +97,9 @@ export function createNode(
     delete ctx.onTagObj
   }
 
-  const node = tagObj?.createNode
+  const node = typeof tagObj?.nodeClass?.createNode === 'function'
+    ? tagObj?.nodeClass?.createNode(ctx.schema, value, ctx)
+    : tagObj?.createNode
     ? tagObj.createNode(ctx.schema, value, ctx)
     : new Scalar(value)
   if (tagName) node.tag = tagName

That seems pretty low impact. Would probably want to move the other collection tag definitions to use a similar pattern, but very straightforward.

For the other part, in src/compose/compose-collection.ts, it calls one of resolveBlockMap, resolveBlockSeq, or resolveFlowCollection, based on the tagToken. Then in there is where it creates the YAMLMap or YAMLSeq object, which is why my nodeClass isn't being instantiated.

What I'd like to do with my custom tag is to say that it must be a map style (ie, you can't do !nullobject [1,2,3], that doesn't make sense). I could imagine cases where you'd say it must be a sequence, like if you wanted to create an !arguments tag to represent JS Arguments objects or some other array-ish thing.

So, it seems to me that the thing to do here is to move the ctx.schema.knownTags[tagName] check up in that function, and in each of those resolve methods, it instantiates the nodeClass instead of YAMLMap or YAMLSeq. And if it's a map, and tagObj.collection is not set to 'map', or it's a seq and tagObj.collection is not set to 'seq', then blow up. If there's no tagObj.nodeClass, then just use YAMLMap or YAMLSeq as appropriate.

That's a somewhat bigger change, so I wanted to check before diving into it. (I'm sure you know all this, just reporting what I found in digging so that if there's something obvious I missed, it'll stand out.)

But with those two, the total implementation for a custom !nullobject tag is:

const { YAMLMap } = require('yaml')
const tag = '!nullobject'

class YAMLNullObject extends YAMLMap {
  get tag() { return tag }
  set tag(_) {}
  toJSON(_, ctx) {
    const obj = super.toJSON(_, {...ctx, mapAsMap: false}, Object)
    return Object.assign(Object.create(null), obj)
  }
}

module.exports = {
  tag,
  collection: 'map',
  nodeClass: YAMLNullObject,
  identify: v => !!(
    typeof v === 'object' &&
    v &&
    !Object.getPrototypeOf(v)
  )
}

or even cuter:

import { YAMLMap } from 'yaml'
const tag = '!nullobject'
export default class YAMLNullObject extends YAMLMap {
  get tag() { return tag }
  set tag(_) {}
  toJSON(_, ctx) {
    const obj = super.toJSON(_, {...ctx, mapAsMap: false}, Object)
    return Object.assign(Object.create(null), obj)
  }

  static get tag() { return tag }
  static get collection() { return 'map' }
  static identify(v) {
    return !!(
      typeof v === 'object' &&
      v &&
      !Object.getPrototypeOf(v)
    )
  }
  static get nodeClass () { return YAMLNullObject }
}

@eemeli
Copy link
Owner

eemeli commented Apr 13, 2023

Sorry for the delay; needed to think about your proposals and was afk for Easter.

I rather like the idea of moving the currently hard-to-reach createNode functions to be static factory methods; that would indeed make them much more useful for custom collections. As a bit of bikeshedding, something like YAMLMap.from() or YAMLMap.create() would probably be more appropriate than YAMLMap.createNode(). I don't think I've ever written code calling new this(), but of course that makes sense in a static method.

For backwards compatibility, I think tagObj.createNode needs to take priority over the potentially inherited factory method of the collection. Of the existing YAML 1.1 types, at least !!omap would otherwise break without further changes.

Would you be interested in submitting a PR with these changes?

I've also been thinking that it might make sense to put together a separate package of useful custom tags; e.g. null objects and regexps could be useful to many users. Would you be interested in contributing to such a collection?

@isaacs
Copy link
Contributor Author

isaacs commented Apr 13, 2023

Sorry for the delay; needed to think about your proposals and was afk for Easter.

It's taken me a year to even start looking at yaml2 for node-tap, so I think you're good 😂

Would you be interested in submitting a PR with these changes?

Sure, thanks for the direction. That all sounds very reasonable.

Re the bikeshed, YAMLMap.from() seems to mirror the idiom in JavaScript like Array.from(), Uint8Array.from(), etc.

I've also been thinking that it might make sense to put together a separate package of useful custom tags; e.g. null objects and regexps could be useful to many users. Would you be interested in contributing to such a collection?

Yes! That's essentially all that tap-yaml is, ideally such a collection would make it unnecessary.

@eemeli
Copy link
Owner

eemeli commented Apr 13, 2023

Sorry for the delay; needed to think about your proposals and was afk for Easter.

It's taken me a year to even start looking at yaml2 for node-tap, so I think you're good 😂

Heh, sometimes fixing what ain't broken does take quite a bit of energy.

Re the bikeshed, YAMLMap.from() seems to mirror the idiom in JavaScript like Array.from(), Uint8Array.from(), etc.

Yeah, that was my thinking as well. Of course in this case there are a couple of arguments instead of just one, but from() does sound like the most appropriate name.

I've also been thinking that it might make sense to put together a separate package of useful custom tags; e.g. null objects and regexps could be useful to many users. Would you be interested in contributing to such a collection?

Yes! That's essentially all that tap-yaml is, ideally such a collection would make it unnecessary.

Excellent! I'll ping you if/once I get something like that off the ground.

@isaacs
Copy link
Contributor Author

isaacs commented Apr 13, 2023

Yeah, that was my thinking as well. Of course in this case there are a couple of arguments instead of just one, but from() does sound like the most appropriate name.

The TypedArray.from() methods all take multiple args, you can pass in a map function and thisp. Definitely feels right to me. ClassName.create() was pretty popular a while back, but seems to have fallen out of fashion.

@isaacs
Copy link
Contributor Author

isaacs commented Apr 20, 2023

Is there a build step or something to get the yaml/json fixtures for tests? I'm getting this failure:

 FAIL  tests/json-test-suite.ts
  ● Test suite failed to run

    ENOENT: no such file or directory, scandir '/Users/isaacs/dev/tapjs/yaml/tests/json-test-suite/test_parsing'

isaacs added a commit to isaacs/yaml that referenced this issue Apr 20, 2023
@eemeli
Copy link
Owner

eemeli commented Apr 21, 2023

The JSON and YAML test suites use data from git submodules, so run this in the repo root to set them up:

git submodule update --init

@isaacs
Copy link
Contributor Author

isaacs commented Apr 21, 2023

@eemeli If you have any time to provide eyes on #467, it'd be much appreciated.

It's just a first pass, not sure if it's the most elegant way to go about it, but seems to work for my custom tags as described in this issue (albeit without checking if the custom tag gets the right kind of collection), and passes existing tests.

isaacs added a commit to isaacs/yaml that referenced this issue Apr 29, 2023
isaacs added a commit to isaacs/yaml that referenced this issue Apr 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants