Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add config parsing module #4228

Closed

Conversation

codeboten
Copy link
Contributor

@codeboten codeboten commented Aug 21, 2023

This code is moved from the OpenTelemetry Collector to attempt sharing the same code across both the collector and the go otel implementation.

TODO:

  • removing the confmap import
  • add make target to generate config code from json schema

config/config.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Sep 27, 2023

Codecov Report

Merging #4228 (5c2b253) into main (e901cf0) will decrease coverage by 1.1%.
The diff coverage is 50.6%.

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #4228     +/-   ##
=======================================
- Coverage   82.2%   81.1%   -1.1%     
=======================================
  Files        144     147      +3     
  Lines      10005   10364    +359     
=======================================
+ Hits        8229    8411    +182     
- Misses      1636    1798    +162     
- Partials     140     155     +15     
Files Coverage Δ
config/metric_view.go 90.4% <90.4%> (ø)
config/config.go 74.7% <74.7%> (ø)
config/generated_config.go 17.5% <17.5%> (ø)

@codeboten codeboten changed the title [wip] add config parsing library add config parsing module Sep 27, 2023
@codeboten codeboten force-pushed the codeboten/prototype-go-lib branch 3 times, most recently from b787ecf to c04e435 Compare September 27, 2023 21:03
This module parses configuration as defined per the opentelemetry-configuration schema.

Signed-off-by: Alex Boten <aboten@lightstep.com>
@@ -0,0 +1,661 @@
// Code generated by github.com/atombender/go-jsonschema, DO NOT EDIT.
Copy link
Member

@pellared pellared Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a high level look. Here are some questions which come to my mind.

  1. How and where do we want to use this module? What is it responsible for?
  2. What we will do once there would be a breaking change in the JSON schema? If it will make a breakin change in the exported API then users may encounter problems at build time when bumping modules.
  3. Is this really work properly with JSON unmarshal? Does it not miss json tags?
  4. Do we want to support yaml.v3?
  5. Are we able e.g. to fix some naming nit issues like: Otlp -> OTLP?
  6. Do we think that it is a good idea to have the generated code exported? Maybe we should have it unexported or internal and then map it some more Go idomatic types (e.g. get rid of pointers etc)?

I have a feeling that the module should provide functions like func FromJSON(r io.Reader) (*Config, error) and func FromYAML(r io.Reader) (*Config, error) that would read a config in given format from a stream. The function would unmarshal -> validate -> map to a hand-made Config struct.

Probably adding package docs and testable example would also help in reviewing and validating the design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the scope of this PR could be the part defined as Configurator.Parse in open-telemetry/opentelemetry-specification#3437.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a part of the Configuration WG, but this is my best attempt at answering the questions.

How and where do we want to use this module? What is it responsible for?

This code comes from the Collector. For example, the generated config file current location is this: https://github.com/open-telemetry/opentelemetry-collector/blob/main/service/telemetry/generated_config.go. It is part of the work from the OpenTelemetry Configuration WG and it would have the same owners here as it has there.

The intent in the Collector is to provide the configuration structs to be used to configure the Collector's internal telemetry, but it makes more sense to have it here since these would also be used as part of the OpenTelemetry Go support for the general configuration schema being worked on by the Config WG.

What we will do once there would be a breaking change in the JSON schema? If it will make a breakin change in the exported API then users may encounter problems at build time when bumping modules.

I think that is to be expected at this point, and the way to signal this is to release this module with 0.X versioning.

Is this really work properly with JSON unmarshal? Does it not miss json tags?

Json tags are not necessary in general (https://go.dev/play/p/5ZH3i283Ufi) and this uses mapstructure tags in any case to provide JSON support.

Do we want to support yaml.v3?

mapstructure tags should provide support for YAML, it is what we use in the Collector.

Are we able e.g. to fix some naming nit issues like: Otlp -> OTLP?

I think that would be an issue on https://github.com/open-telemetry/opentelemetry-configuration

Do we think that it is a good idea to have the generated code exported? Maybe we should have it unexported or internal and then map it some more Go idomatic types (e.g. get rid of pointers etc)?

This is necessary today due to the Collector code structure. We can think about a refactor later before making this module 1.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pellared & @mx-psi for the comments.

Are we able e.g. to fix some naming nit issues like: Otlp -> OTLP?

Yes this is configurable via the go-jsonschema which should be ok to do. I suppose we could also propose the change upstream as @mx-psi suggested, but if all we care about are the generated go structs, it's probably enough to move forward w/ the changes in the go-jsonschema generation.

Do we think that it is a good idea to have the generated code exported? Maybe we should have it unexported or internal and then map it some more Go idomatic types (e.g. get rid of pointers etc)?

I don't have a good alternative at this point. We may want to follow the same model as semantic conventions and release the module behind a version that matches released versions of the configuration. WDYT?

Probably adding package docs and testable example would also help in reviewing and validating the design.

Yup... 100%. I was mostly just trying to move the minimum amount of code over from the Collector into a separate package to start untangling the dependencies from that side.

I'm ok with adding the functionality to do the parsing and configure the SDK as part of this initial PR, or we can break it up. Whatever makes the most sense here, i'm open to suggestions

Copy link
Member

@pellared pellared Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First I want to understand how people would like to use it. From my perspective the OpenTelemetry Go users would primarily need a configuration model (POJO like, without any tags) and something which creates the providers based on the model. I do not want the config package to leak things like mapstructure tags only because some user wants to use it with github.com/mitchellh/mapstructure.

How about a design like the following:

package config

// Config represents the SDK configuration model.
type Config struct {}

// NewSDK creates SDK providers based on the configuration model.
func NewSDK(ctx, context.Context, c Config) (*SDK, error) {}

type SDK struct {}
func (sdk *SDK) Shutdown() error {}
func (sdk *SDK) TraceProvider() *trace.TraceProvider {}
func (sdk *SDK) MeterProvider() *meter.MeterProvider {}

The config parsers could be located in child modules to reduce the number of dependencies (not everyone wants to take e.g. yaml.v3 as a dependency). We can have config/configyaml module with exported API like following:

package configyaml

func Parse(r io.Reader) (Config, error) {}

Then in future we could create more modules handling other "protocols" such as json, protobuf.

Example usage:

package main

import (
	"context"
	"errors"
	"log"
	"os"
	"os/signal"

	"go.opentelemetry.io/contrib/config"
	"go.opentelemetry.io/contrib/config/configyaml"
)

func main() {
	if err := run(); err != nil {
		log.Fatalln(err)
	}
}


func run() (err error) {
	// Handle SIGINT (CTRL+C) gracefully.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	otelSDK, err := setupOTelSDK(ctx, serviceName, serviceVersion)
	if err != nil {
		return
	}
	// Handle shutdown properly so nothing leaks.
	defer func() {
		err = errors.Join(err, otelSDK.Shutdown(context.Background()))
	}()

	// ... other application bootstrapping code ...

	return
}

func setupOTelSDK() (*config.SDK, error) {
	f, err := os.ReadFile("otel.yaml")
	if err != nil {
		return nil, err
	}
	defer f.Close()

	cfg, err := configyaml.Parse(f)
	if err != nil {
		return nil, err
	}

	return config.NewSDK(ctx, cfg)
}

Would such design be usable for the collector?

As a side note, I can predict that they may be OTel Go users who would even prefer using such config package (without any file parsing) by creating config.Config in code than using the SDK.
Pros: it may be less code to write, more declarative
Cons: not strongly typed, lower control.

Maybe even the config package could be moved to the main go.opentelemetry.io/otel when it stabilizes? It depends if the configuration model specification and it if would be part of the SDK 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the collector still benefit from a config package which would bootstrap the SDK based on the passed config?

The collector could still benefit from the package for configuring the SDK. This would at least remove the need to have this live in multiple places.

Why not having the parsing code in collector repository?

I do worry that having to parse the configuration in a separate repo, then mapping the values to the config.Config object will end up causing some unnecessary duplicate work for both repositories but maybe there's no way around it if the decision in this repo is not to make struct tags available for consumers.

@pellared Just to be clear, the idea of config.Config would be to make structs generated by jsonschema available without the struct tags?

Copy link
Member

@pellared pellared Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pellared Just to be clear, the idea of config.Config would be to make structs generated by jsonschema available without the struct tags?

Correct. If the JSON schema is used to define the configuration model (and not only the YAML/JSON file scheme for defining the configuration) then I think that using code generation using jsonschema is probably the most effective solution 👍 Even if the JSON schema will not be used to define the model, it could be still the most pragmatic approach as it reduces the possibility of human errors.

The only think regarding the current struct output (apart from naming) that worries me is the usage of pointers. I find pointers annoying. It leads to creating helpers like ptrInt, ptrString.

Maybe we could use (and define) something like this instead of using pointers:

package schema

type Optional[T any] struct {
  value T
  empty bool
}

func OptionalValue[T any](T v) {
  return Optional{value: v}
}

func OptionalEmpty[T any](T v) {
  return Optional{empty: true}
}

func (o Optional[T]) Empty() bool {
  return o.empty
}

func (o Optional[T]) Value() T {
  if o.empty {
    panic("empty value")
  }
  return o.value
}

func (o Optional[T]) ValueOrDefault(dft T) T {
  if o.empty {
    return dft
  }
  return o.value
}

I think if we could use such generic type then the code would be more readable and maintainable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having that said (written 😉) I think that using jsonschema for generation with not perfect names and pointers would be an excellent initial PR.

We can then create follow-up issues (and PRs) to handle other concerns (step by step) e.g.

  • config: Refine model naming
  • config: Do not use pointers in model
  • config: Add a function which creates SDK from model

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to push back on the use of a custom Optional type. For as long as we had json and yaml marshalers, a pointer with omitEmpty has been how to represent this kind of Optional. We shouldn't make a helper type just to avoid some helper functions.

I am a bit concerned about the names of the schema and how they might be looking to change over time. After the first release with this module there needs to be little churn as it can make different versions incompatible with each other.

Copy link
Member

@pellared pellared Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created area: config Related to config functionality for easier tracking. The first PR could be only about solving #4369.

@pellared
Copy link
Member

I think this PR helps validating the open-telemetry/opentelemetry-specification#3437

CC @jack-berg

Copy link
Contributor

@MadVikingGod MadVikingGod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an excellent start to this approach.

A README.md with how this will be used, what constraints there are, and what features are needed would go a long way to help evaluate if this is good enough for the first draft.

sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func viewOptionsFromConfig(views []View) []sdkmetric.Option {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we have these private functions and not building the SDKs from these options?

@codeboten
Copy link
Contributor Author

@pellared @mx-psi @MadVikingGod i've produced another PR that includes a subset of the changes here. It is trimmed down to the code to generate models from the jsonschema, the generated code, and a readme... i figured this would be a better way to get the changes in. See #4376

Moving just that code over would also allow us to pull code out of the collector contrib package right away, see the draft pr here: open-telemetry/opentelemetry-collector#8620

@pellared
Copy link
Member

@codeboten Can we close this one?

Copy link
Contributor Author

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closing this PR as it will be broken up into several PRs

@codeboten codeboten closed this Oct 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants