Generate datasources based on OpenAPI specification #127

Open
wants to merge 2 commits into
base: main

Conversation

remilapeyre

This is an experiment to see whether generating the provider based on
the OpenAPI specification at https://github.com/hashicorp/boundary/blob/main/internal/gen/controller.swagger.json
could work.

The schema is converted from the definitions given in the document to
`map[string]*schema.Schema`, with two special cases:

  • when there is an object nested in an object, I convert it to a
    one-element list, as terraform-plugin-sdk v2 does not know how to
    express this,
  • when there is an opaque attribute (`map[string]interface{}`), I
    skip it completely, as terraform-plugin-sdk does not expose the
    `DynamicPseudoType` that would make it possible to express this
    attribute in native Terraform. The workaround I use in the Consul
    provider is to use `schema.TypeString` and `jsonencode()`, but it is
    not ideal. Since I only worked on the datasources here, I chose to
    skip those attributes for now.
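As a rough illustration of the two special cases, here is a stdlib-only sketch of the conversion. The simplified `Schema` type stands in for the real `schema.Schema` from terraform-plugin-sdk, and the type handling is reduced to the minimum; this is not the generator's actual code.

```go
package main

import "fmt"

// Simplified stand-in for *schema.Schema from
// github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.
type Schema struct {
	Type string // "string", "boolean", "integer", or "list"
	Elem map[string]*Schema
}

// convertProperty turns one OpenAPI property definition into a schema,
// applying the two special cases: nested objects become one-element
// lists, and opaque objects (no "properties") are skipped entirely.
func convertProperty(prop map[string]interface{}) (*Schema, bool) {
	switch prop["type"] {
	case "object":
		props, ok := prop["properties"].(map[string]interface{})
		if !ok {
			// Opaque attribute (map[string]interface{}): skipped,
			// since the SDK has no DynamicPseudoType equivalent.
			return nil, false
		}
		// Nested object: expressed as a one-element list.
		elem := map[string]*Schema{}
		for name, p := range props {
			if s, ok := convertProperty(p.(map[string]interface{})); ok {
				elem[name] = s
			}
		}
		return &Schema{Type: "list", Elem: elem}, true
	case "string", "boolean", "integer":
		return &Schema{Type: prop["type"].(string)}, true
	}
	return nil, false
}

func main() {
	nested := map[string]interface{}{
		"type": "object",
		"properties": map[string]interface{}{
			"name":       map[string]interface{}{"type": "string"},
			"attributes": map[string]interface{}{"type": "object"}, // opaque, skipped
		},
	}
	s, _ := convertProperty(nested)
	fmt.Println(s.Type, len(s.Elem)) // prints "list 1": the opaque attribute is dropped
}
```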

Once the schema is converted, we create the `ReadContext` function that
is needed for the datasource. As it can be a bit tricky to use the Go
client for each service, I chose to use the global `*api.Client` directly
and to manually add the query params and get the raw response. While it
would not be recommended for an external project to use the client this
way, it fits nicely here and keeps the code simple. Finally, the result is
written to the state, using the schema we generated previously to
convert it.

The tests are written manually so the developer can make sure that
everything is working as expected even though the code was generated and
not written by hand.

While the conversion of the schema could be done at runtime and only
one `ReadContext` function is actually needed, I find that generating
the code makes it quite easy to review, and it should make it easier for
contributors already accustomed to writing Terraform providers to look
for errors or fork the provider for their needs.

While I only worked on datasources returning lists of elements for now,
I think the same approach could be used to generate datasources returning
a single element and ultimately resources. This would make it very easy
to keep the Terraform provider in sync with new Boundary versions,
especially as the OpenAPI spec is created from the Protobuf files and
the CLI is already generated on a similar principle.

The code in generate_datasource.go is not very nice, but it does get the
job done. I may spin it off into its own project in the future to add
more features to it.

Closes #99

@malnick
Contributor

malnick commented Aug 7, 2021

Thank you for submitting this and a BIG apology on our part for letting it languish for the last couple of weeks! I'm going to split this review between me, @louisruch and @s-christoff since it's a large PR. Love seeing the inclusion of data sources and autogen from our API spec.

@malnick malnick added the enhancement New feature or request label Aug 7, 2021
@remilapeyre
Author

Thank you for submitting this and a BIG apology on our part for letting it languish for the last couple of weeks! I'm going to split this review between me, @louisruch and @s-christoff since it's a large PR. Love seeing the inclusion of data sources and autogen from our API spec.

Hi @malnick, thanks for taking the time to review this. One thing that I did not add in this PR is setting the `Sensitive: true` field where appropriate; I wanted to have your feedback on this. I think there are basically two options: either add a parameter to list the sensitive fields in scripts/generate_datasource.go, or add this information to the OpenAPI spec as a vendor extension (I don't think there is a way to get this information without custom properties). I would prefer the second one. It would require cooperation from the Boundary team, but it would make things much easier, as the whole Terraform schema would come from a single source.
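For the vendor-extension option, the OpenAPI document could carry a custom property on the relevant fields. The extension name `x-terraform-sensitive` and the definition path below are hypothetical, purely to illustrate the shape:

```json
{
  "definitions": {
    "controller.api.resources.accounts.v1.Account": {
      "properties": {
        "attributes": {
          "type": "object",
          "x-terraform-sensitive": true
        }
      }
    }
  }
}
```

The generator would then set `Sensitive: true` on the corresponding entry in the generated schema whenever it sees the extension.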

@malnick
Contributor

malnick commented Aug 11, 2021

I like the idea of adding it to the API spec as well, there might be other wins we can get from that in tooling that leverages that spec too.

I'm roping in @jefferai on this PR since it's fairly cross-functional and hits on a few key areas that are under his purview. More on this very soon as we figure out the best way to break down the review. Thank you again!

@malnick malnick requested review from jefferai and removed request for schristoff August 11, 2021 21:11
@jefferai
Member

I'm actually going to tag @jimlambrt on this since marking things sensitive is a part of the next step of eventing and probably we can have some fields automatically marked this way in openapi.

That said, I also don't think it necessarily matters, as Boundary will never round-trip a secret value set from Terraform back in a read. I will admit I don't have a great idea of how Sensitive is handled in TF, so maybe that's not relevant?

malnick previously approved these changes Aug 12, 2021
Contributor

@malnick malnick left a comment


This looks great!

I had some general questions that I answered myself through the review. @louisruch is going to take a pass on this PR Monday, so there might be some more feedback on that review soon. Thank you again for this great contribution, both in terms of short term gains (data sources!) and the automation to make it easy for us in the future. It's a huge win for us!

resource.TestCheckResourceAttr("data.boundary_accounts.foo", "items.#", "1"),
resource.TestCheckResourceAttr("data.boundary_accounts.foo", "items.0.%", "11"),
resource.TestCheckResourceAttrSet("data.boundary_accounts.foo", "items.0.auth_method_id"),
resource.TestCheckResourceAttr("data.boundary_accounts.foo", "items.0.authorized_actions.#", "6"),
Contributor


Love the detailed checking on all resource attributes here, including authorized actions.

internal/provider/data_source_auth_methods.go
scripts/generate_datasource.go
@remilapeyre
Author

Hi @jefferai,

That said, I also don't think it necessarily matters, as Boundary will never round trip a secret value set from Terraform back in a read.

This is good to know; we don't have to set anything Sensitive for the datasources then. I do plan to extend this to generate the resources later on, as it would make it very easy to have the complete provider auto-generated. Once we get there we will need to know which fields are Sensitive, as they would be leaked in the plan otherwise. This will be for a later PR though.

@jefferai
Member

jefferai commented Aug 16, 2021

Curious, any reason you based this on OpenAPI?

The OpenAPI output itself is generated based on our source of truth, our proto struct and service definitions. Because the OpenAPI generation is subject to various constraints of the OpenAPI v2 spec, and may one day become a v3 spec, we use the proto definitions to build our Go API libraries and our CLI. That way we're not generating based on something that is itself generated.

Edit: one of the reasons this has me concerned is that we produce OpenAPI because people ask us for it and because it might be useful for dynamic generation of web UI elements in the future but we don't really curate it. It's pretty much whatever grpc-gateway spits out. So I worry that something in there is currently or will be a bit funky because we're not paying close attention to it in the way that we are the underlying protos/services and the Go SDK output.

@remilapeyre
Author

Hi @jefferai, I initially tried to base the generator on the protobufs, but I found it more difficult to get working than basing it on the OpenAPI. I think it was mostly because everything regarding the protobufs is in internal/, so it was difficult to get Go to read them properly without vendoring everything (which would have required keeping them in sync) or writing a lot of boilerplate. I also wanted to avoid moving the Terraform provider generator into github.com/hashicorp/boundary, both in fear that it may be cumbersome for you to have to check that the updates to the provider are correct every time you make a change to Boundary, as opposed to launching the script once per release, and because this is not what any of the users would expect, so it would be an additional thing to learn in order to contribute to the provider.

I share your fears regarding relying on the OpenAPI document, and I don't like OpenAPI in general, but since most of the work here is done by grpc-gateway, which is the JSON proxy anyway, I expect it to be mostly correct, as some other users of the grpc-gateway project are probably relying on it.

Regarding the transition from OpenAPI 2 to OpenAPI 3, I don't think it will be very difficult to update the generator, which is currently quite short.

@remilapeyre
Author

Also regarding Boundary and Terraform, but this time to make it possible to use Boundary proxies from Terraform: I opened hashicorp/terraform#29402, which may interest you.

@jefferai
Member

Everything is in internal by default until we need to move it out. It may be a good time to consider moving the generated files into the sdk module given that this would let them be used for purposes like this and we keep those APIs stable(ish) at this point anyways given that they are now part of the product. That would let this be based on the generated protos.

We also generate the Go structs in api/ from the protos. Depending on whether you need things like the comments from the OpenAPI, this might actually be the easiest thing to use, and we do curate the Go types that are created. Either of those options seems better than using the OpenAPI, though.

@jefferai
Member

Talked with the team internally -- we're going to move some of the files out of internal and into sdk. We're still figuring out exactly how we want to do this based on some other needs we have. So it may take the form of the generated protobufs (and the .proto files) being put in sdk, it may take the form of those types being wrapped by SDK types, or it could be type aliases. Stay tuned...

@jefferai
Member

Hi there -- we have two paths that are being considered, and curious which is easier for you:

  1. You could base on the Go SDK (under github.com/hashicorp/boundary/api) which contains Go structs converted via protoreflect from the generated proto structs. This is slightly lossy compared to the original proto messages, but it's just type conversions to our API which you'd likely be doing anyways. So this may save you some work if you reflect against it.
  2. We could alias or embed the generated proto types into the sdk module. From there you could use protoreflect against the message types.

Thoughts?
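Option 1 might look roughly like this, using the standard `reflect` package against a hypothetical struct standing in for the generated types under github.com/hashicorp/boundary/api (the `Account` fields here are illustrative, not the real API type):

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Account is a hypothetical stand-in for a generated type in api/.
type Account struct {
	Id           string                 `json:"id"`
	AuthMethodId string                 `json:"auth_method_id"`
	Attributes   map[string]interface{} `json:"attributes"`
}

// fieldNames walks a struct with the standard reflect package and
// collects the attribute names the generator would emit, taking them
// from the json tags (stripping options such as ",omitempty").
func fieldNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	var names []string
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("json")
		names = append(names, strings.Split(tag, ",")[0])
	}
	return names
}

func main() {
	fmt.Println(fieldNames(Account{})) // prints "[id auth_method_id attributes]"
}
```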

@remilapeyre
Author

Hi @jefferai, sorry to answer this late, I've had a lot of work this week. I thought about both options and I don't have a strong preference for either; it seems to me that making the protobufs part of the external API would be more of a commitment regarding backward compatibility. Since the types are already in github.com/hashicorp/boundary/api, using those in the generator seems better to me, as it would add fewer constraints.

@jefferai
Member

You could use either! The PB files are now in the sdk module in the main branch.

@remilapeyre
Author

Hi everyone, sorry for the delay in getting back, I've had a lot of work these last weeks. I've been looking into converting the current generator from using the swagger definition to using either reflect or protoreflect, but there are some details I'm not sure about. For example, we extract the Description from the swagger, but it is not in the Go types, and only present as a comment in the protobufs, so we can't extract it (unless we use the ast package, but I suppose we want to avoid that as it would make maintenance harder).

Do you think we should add this to the Go types as a struct tag?

@malnick
Contributor

malnick commented Oct 25, 2021

Sorry for the lull in comms over here @remilapeyre ! We've been a little caught up in the day-to-day, but @jefferai will be reaching out again soon.

@jefferai
Member

Sorry for the delay!

I think encoding the description in struct tags would likely then break whatever grpc-gateway is doing to use that description for openapi.

Probably we should either:

  1. Not include description in the data source schema (asking people to reference the non-data-source resources for the values)
  2. Copy whatever grpc-gateway's openapi generator is doing to populate the description for openapi

There's an option I think is better than either but is more work:
3. Separate the descriptions from the protobufs: put the descriptions in some kind of external structure in the sdk package and have templating code insert them into the proto files before generation, or something similar. There are a couple of real benefits to this approach:

  • Easier translation, if we ever translate the descriptions, because the strings can be separated by lang/locale
  • Exact same descriptions can be used in openapi, TF, and anywhere else that we want to use them, without them getting out-of-sync

So while I can live with (1) or (2) right now and either will work, I think long term we (being, the Boundary team, and you if interested) should push towards (3).
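A minimal sketch of option (3), assuming a hypothetical external structure keyed by locale and field path; all names and strings here are invented for illustration, not anything that exists in the sdk package today:

```go
package main

import "fmt"

// descriptions is the hypothetical external structure: locale ->
// field path -> description. Both the proto templating step and the
// Terraform generator would read from this single source.
var descriptions = map[string]map[string]string{
	"en": {
		"accounts.Account.auth_method_id": "The ID of the Auth Method that owns this account.",
	},
}

// describe looks up a description for a locale, falling back to
// English when no translation exists yet.
func describe(locale, path string) string {
	if d, ok := descriptions[locale][path]; ok {
		return d
	}
	return descriptions["en"][path]
}

func main() {
	// No French translation yet, so this falls back to the English text.
	fmt.Println(describe("fr", "accounts.Account.auth_method_id"))
}
```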

@hashicorp-cla

hashicorp-cla commented Mar 12, 2022

CLA assistant check
All committers have signed the CLA.

@macmiranda

Any plans to give this traction in 2024?

@jefferai
Member

jefferai commented May 6, 2024

@macmiranda The original author seems to have abandoned it but I think someone on the Boundary team might be picking it up soon.

Successfully merging this pull request may close these issues.

Data providers