Make semconv packages explicitly versioned #3764

Closed
tigrannajaryan opened this issue Oct 18, 2021 · 16 comments

@tigrannajaryan
Member

tigrannajaryan commented Oct 18, 2021

Is your feature request related to a problem? Please describe.
OpenTelemetry semantic conventions are considered to be a living standard to which changes are allowed over time.
As the conventions change over time, every version of the OTel specification captures/freezes a version of semantic
conventions. The telemetry sources are expected to explicitly declare the version of the semantic conventions
they use by recording the corresponding Schema URL in the emitted telemetry.

This repo currently declares a single semconv package. When the semantic conventions change in the specification
we will be forced to change the semconv package, which may break users of the semconv package (e.g. if the name of an
attribute is changed, the importing code will break: it won't compile any more).

Describe the solution you'd like
We iterated on this problem in Go SDK and the solution we came up with is to have one semconv package per version
of semantic conventions, see the PR and current structure.

I suggest that we implement the same approach here, in this repo.
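
For illustration only, a Java analogue of the one-package-per-version layout might look roughly like the sketch below. Everything in it is an assumption made for the example: the class names, the attribute names, and the rename between the two versions are hypothetical, and top-level classes stand in for what would really be separate Java packages with the version in the package name.

```java
// Hypothetical sketch: each class stands in for one versioned semconv package.
final class SemconvV1_7_0 {
    // Schema URL recorded in telemetry emitted against this version.
    static final String SCHEMA_URL = "https://opentelemetry.io/schemas/1.7.0";
    // Hypothetical attribute, under the name it has in this version.
    static final String EXAMPLE_OLD_NAME = "example.old_name";

    private SemconvV1_7_0() {}
}

final class SemconvV1_8_0 {
    static final String SCHEMA_URL = "https://opentelemetry.io/schemas/1.8.0";
    // Same concept after a hypothetical rename; the old constant is absent here.
    static final String EXAMPLE_NEW_NAME = "example.new_name";

    private SemconvV1_8_0() {}
}

final class PinnedInstrumentation {
    private PinnedInstrumentation() {}

    // Written against 1.7.0. Adding SemconvV1_8_0 later does not break this
    // code, because it names the 1.7.0 holder explicitly.
    static String describe() {
        return SemconvV1_7_0.EXAMPLE_OLD_NAME + " @ " + SemconvV1_7_0.SCHEMA_URL;
    }
}
```

With a single unversioned holder, the rename would have removed `EXAMPLE_OLD_NAME` and `PinnedInstrumentation` would stop compiling on upgrade.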

Describe alternatives you've considered
One alternative is to keep the current design, but it doesn't match the intent of telemetry schemas and how they allow
semantic conventions to evolve.

Additional context
See OTEP0152 for the motivation of schemas and how the changes are allowed.

@tigrannajaryan tigrannajaryan added the Feature Request Suggest an idea for this project label Oct 18, 2021
@jkwatson
Contributor

Hi @tigrannajaryan. I was going to bring up this precise issue at the maintainers meeting today. In last week's Java SIG meeting, we went through a few possible options, and we're still ruminating on it. One thing that we're considering is to move the semconv module out of this repository altogether, and publish it like we do the proto bindings now.

We have yet to decide on a final implementation. One option is definitely to have versioned artifacts, but then we would also have to have versioned Java packages, as the Java module system does not allow exporting the same package from more than one jar artifact in the same classloader.

@tigrannajaryan
Member Author

One thing that we're considering is to move the semconv module out of this repository altogether, and publish it like we do the proto bindings now.

This is also a good approach.

as the Java module system does not allow exporting the same package from more than one jar artifact in the same classloader.

What if the version number is explicitly part of the package name? In that case it will be allowed, right?

We do want multiple semconv packages to be importable by one application. The use case is when you use multiple instrumentation libraries and they happen to depend on different versions of semconv. This is an explicitly allowed situation by OTEP.
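
A rough sketch of that use case, under stated assumptions: the two "library" classes, the attribute names, and the `Emitted` holder are all hypothetical, and plain classes stand in for versioned semconv packages. The point is only that both versions coexist in one application and each source declares its own Schema URL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for two versioned semconv packages.
final class ConvOlder {
    static final String SCHEMA_URL = "https://opentelemetry.io/schemas/1.7.0";
    static final String EXAMPLE_ATTR = "example.old_name"; // hypothetical name
}

final class ConvNewer {
    static final String SCHEMA_URL = "https://opentelemetry.io/schemas/1.8.0";
    static final String EXAMPLE_ATTR = "example.new_name"; // hypothetical rename
}

// Minimal stand-in for emitted telemetry: attributes plus the declared Schema URL.
final class Emitted {
    final String schemaUrl;
    final Map<String, String> attributes = new LinkedHashMap<>();

    Emitted(String schemaUrl) {
        this.schemaUrl = schemaUrl;
    }
}

// Hypothetical instrumentation library built against the older conventions.
final class LibraryA {
    static Emitted emit() {
        Emitted e = new Emitted(ConvOlder.SCHEMA_URL);
        e.attributes.put(ConvOlder.EXAMPLE_ATTR, "value-from-a");
        return e;
    }
}

// Hypothetical instrumentation library built against the newer conventions.
final class LibraryB {
    static Emitted emit() {
        Emitted e = new Emitted(ConvNewer.SCHEMA_URL);
        e.attributes.put(ConvNewer.EXAMPLE_ATTR, "value-from-b");
        return e;
    }
}
```

A consumer can then tell the two apart by `schemaUrl` rather than guessing which attribute names to expect.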

@jkwatson
Contributor

Yes, we almost certainly need to version the Java packages. But if we do that, we could publish a single jar file with all the generated code packaged in it, rather than also having to include the version in the artifact name.

We also pondered whether we could somehow generate diffs in subsequent releases, rather than generating the entire contents every time. In addition, we suspect it would be very useful to break up the semconv by type in some way, so we could publish different files/packages/jars/something per convention type (http vs. rpc vs. db, etc.).

@anuraaga
Contributor

I think one big issue we have right now is semantic conventions usually don't change in spec versions. Sometimes they do though. We don't want to version semantic attributes with respect to the spec version since it means many versions with no breaking changes. Ideally semantic attributes would be versioned separately from the spec, using actual semver to understand the changes. It means any attribute name change in semantic conventions bumps its major version, possibly causing us to be at v2 or v3 of semantic attributes while the spec is at 1.8 or 1.9.

Is semver of semantic attributes achievable?

@tigrannajaryan
Member Author

We don't want to version semantic attributes with respect to the spec version since it means many versions with no breaking changes. Ideally semantic attributes would be versioned separately from the spec, using actual semver to understand the changes.

OTEP0152 explicitly says that semconv version and spec version are the same. This is simple to understand and I believe is the right approach.

The duplication is likely not a big deal in most compiled languages (strings should be interned). For dynamic languages perhaps the semconv package generator can avoid creating duplicate strings (still won't help with attribute names). Perhaps a different approach is needed for dynamic languages to avoid bloating the package size.
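
To make the interning remark concrete for Java, here is a small sketch. The two holder classes are hypothetical stand-ins for two versioned semconv packages that both define the same attribute name. Identical compile-time string constants resolve to one shared `String` instance at runtime, although each compiled class file still carries its own copy of the literal in its constant pool.

```java
// Hypothetical stand-ins for two versioned semconv packages.
final class InternedV1 {
    static final String HTTP_METHOD = "http.method";
}

final class InternedV2 {
    static final String HTTP_METHOD = "http.method";
}

final class InterningDemo {
    // Reference comparison on purpose: checks that both constants are the
    // same String object, not merely equal in content.
    static boolean sameInstance() {
        return InternedV1.HTTP_METHOD == InternedV2.HTTP_METHOD;
    }
}
```

So the runtime heap cost of duplicated names is small, while the size of the published artifact is a separate question.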

@jkwatson
Contributor

The package size has nothing to do with compiled vs. dynamic. If we have class files that represent the constants (like we do today), then the semconv package will continue to grow over time, as more versions of the classes are piled in. Unless we go with putting the version into the artifact name, which is definitely not the normal way that Java packaging would be done (you'd have to change the maven coordinates and the version with every upgrade...something people will not be used to having to do).

@iNikem
Contributor

iNikem commented Oct 18, 2021

Do current spec versioning and compatibility guidelines say anything about deprecating and removing features with time? Should read it again...

@tigrannajaryan
Member Author

Do current spec versioning and compatibility guidelines say anything about deprecating and removing features with time? Should read it again...

@iNikem at the moment OTEP0152 is the authoritative source about how schemas/semantic conventions evolve. I have PRs in progress to get it merged to the spec. However, there is nothing about removal of semantic conventions for now. We only support renaming conventions. More types of changes can be added as amending OTEPs.
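
As a minimal sketch of what a rename-only change means in practice: a consumer could translate attributes recorded under an older version by applying a map of old name to new name. The class and method names below are hypothetical and this is not the schema file format or any OpenTelemetry API, just an illustration of the idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RenameTranslator {
    private RenameTranslator() {}

    // Returns a copy of the attributes with every renamed key replaced by
    // its new name. Keys that are not in the rename map are kept unchanged.
    static Map<String, String> upgrade(Map<String, String> attributes,
                                       Map<String, String> renames) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            String newName = renames.getOrDefault(e.getKey(), e.getKey());
            out.put(newName, e.getValue());
        }
        return out;
    }
}
```

Because renames are reversible, the same helper works for downgrading if the map is inverted, which is one reason a rename-only rule is easy to support.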

@iNikem
Contributor

iNikem commented Oct 18, 2021

@tigrannajaryan I am thinking about https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md, which says almost nothing about semantic conventions :(

@iNikem
Contributor

iNikem commented Oct 18, 2021

We iterated on this problem in Go SDK and the solution we came up with is to have one semconv package per version
of semantic conventions, see the PR and current structure.

Which means 12 packages per year, with mostly identical content. 36 packages in 3 years... I don't dare to imagine the auto-completion experience in IDEs.

Full disclosure: I don't have any better idea :(

@tigrannajaryan
Member Author

Full disclosure: I don't have any better idea :(

A possible alternate approach is to only create new packages when there are any changes in the semantic conventions. If there are no changes, you use the latest version that is no greater than the one you target (we only had one release with a schema change so far). But, I feel it may cause confusion. One version = one package is easy to understand.
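
The lookup rule in this alternative ("the latest version that is no greater than the one you target") can be sketched as follows. The class name and the list of published versions are hypothetical; versions are assumed to be plain `major.minor.patch` strings and are compared numerically.

```java
import java.util.List;
import java.util.Optional;

final class SemconvVersionResolver {
    private SemconvVersionResolver() {}

    // Parses "major.minor.patch" into three ints.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Numeric comparison, so that 1.10.0 sorts after 1.8.0.
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    // Picks the newest published package whose version is <= the target.
    // Empty if every published package is newer than the target.
    static Optional<String> resolve(List<String> published, String target) {
        String best = null;
        for (String candidate : published) {
            boolean notNewerThanTarget = compare(candidate, target) <= 0;
            if (notNewerThanTarget && (best == null || compare(candidate, best) > 0)) {
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

The indirection is the source of the confusion mentioned above: a user targeting one spec version ends up importing a package that carries a different version number.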

@tigrannajaryan
Member Author

@tigrannajaryan I am thinking about https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md, which says almost nothing about semantic conventions :(

Yes, it says "NOT DEFINED:" :-)

Once schemas OTEP is merged to the spec I can work on defining this. OTEP has the ideas we need.

@anuraaga
Contributor

A possible alternate approach is to only create new packages when there are any changes in the semantic conventions.

This is currently the approach I'm most in favor of. I think it causes the least confusion for actual users of the conventions in instrumentation without having many duplicate packages. Without any corresponding version in the spec, we could just bump the package version in our repo when we find a breaking change. Wouldn't real semver provide the least confusion, both for SDKs and backends so they know when something needs to be updated in a hard way? I don't think end users care about the version of the spec at all. They just use versioned instrumentation libraries or backends that provide features; the version of the spec / semantic attributes is basically abstracted away.

@tigrannajaryan
Member Author

Wouldn't real semver provide the least confusion, both for SDKs and backends so they know when something needs to be updated in a hard way?

I am not sure I fully understand what "semver" means in this context. The semantic conventions do change in a breaking manner. This is allowed and it is the primary purpose of schemas. How would "semver" apply here? Will we increment the major version number every time? It will mean schema version numbers are completely decoupled from spec version numbers. The approved Schemas OTEP chose not to do this. I believe this is the right approach: by keeping schema version numbers and spec version numbers aligned there is less confusion. It is now too late to change this, as we have already made multiple schema releases according to this principle. Changing the principle now is against the OTEP and would cause more confusion.

@anuraaga
Contributor

semver here means incrementing the major version when there is a breaking change as usual - this has become the popular way of versioning things since it tends to be important to know when there's a breaking change ;) I have a feeling this point was lost in the OTEP.

I don't think the package versions in Java have to match a version in the spec, so if the spec decides not to change semantic attributes to semver, it should be OK. I don't think we'll be publishing a new package and/or artifact on every version bump when there aren't any breaking changes, though. While not ideal conceptually, that seems better than #3764 (comment)
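
One way to read "real semver" for a set of attribute names is sketched below, purely as an illustration. The class name and the classification rule are assumptions, not anything the spec or this repo defines: a name that disappears (which includes a rename) counts as breaking and bumps the major version, a purely additive change bumps the minor version, and anything else bumps the patch version.

```java
import java.util.Set;

final class SemverBump {
    private SemverBump() {}

    // Computes the next "major.minor.patch" version from the current one and
    // the attribute name sets before and after a change.
    static String next(String current, Set<String> oldNames, Set<String> newNames) {
        String[] p = current.split("\\.");
        int major = Integer.parseInt(p[0]);
        int minor = Integer.parseInt(p[1]);
        int patch = Integer.parseInt(p[2]);

        if (!newNames.containsAll(oldNames)) {
            // A previously published name is gone: breaking change.
            return (major + 1) + ".0.0";
        }
        if (!oldNames.containsAll(newNames)) {
            // Only new names were added: backward compatible.
            return major + "." + (minor + 1) + ".0";
        }
        // Same set of names, e.g. a documentation-only change.
        return major + "." + minor + "." + (patch + 1);
    }
}
```

Under this rule the package version would move independently of the spec version, which is exactly the decoupling debated above.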

@jack-berg
Member

Resolved by #5786.
