Duplicate metrics on vcenter receiver in a dual cluster deployment #30879

Closed
adasescu opened this issue Jan 30, 2024 · 5 comments · Fixed by #31113
Labels
bug, receiver/vcenter

Comments

@adasescu

Component(s)

receiver/vcenter

What happened?

Description

Metrics from VMs belonging to a cluster are shown in the other cluster as well.
In our vCenter we have the following tree structure:

<root>
+-Datacenter1
   +-Cluster1
   | +-Host1
   | +-Host2
   +-Cluster2
   | +-Host-C2-1
   | +-Host-C2-2
   | | +-VM1
   | | +-VM2
   | | +-VM3
   | | +-VM4
   | +-Host-C2-3
   | +-ResourcePool1 
   | | +-VM1
   | | +-VM2
   | | +-VM3
   | | +-VM4

Metrics from VM1, VM2, VM3 and VM4 appear both under Cluster1->Host-C2-2 and under Cluster2->Host-C2-2, even though Host-C2-2 belongs only to Cluster2. It seems that the relationship between the cluster and the host isn't reflected in the scraper code.

Also, I couldn't find any integration tests that cover multi-cluster setups.

Steps to Reproduce

Expected Result

VM metrics should be reported only under the cluster the VM belongs to.

Actual Result

The VMs are reported under both clusters.

Collector version

0.77.0-dev, 0.90.1-dev

Environment information

Environment

vCenter Version: 7.0.3 Build: 19234570

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

adasescu added the bug and needs triage labels Jan 30, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski
Member

@schmikei, this seems like an important bug. Are you able to take a look?

djaglowski removed the needs triage label Jan 31, 2024
@schmikei
Contributor

schmikei commented Feb 1, 2024

> @schmikei, this seems like an important bug. Are you able to take a look?

Yeah, I think there may be a legitimate issue, since VMs are not directly assigned to a cluster when requested from the Inventory API. From my understanding, the VM inventory is at a per-datacenter level, so each VM will need a request to resolve its host to a cluster, if my head is heading in the right direction for a fix.

I can look into whether we can derive it from the runtime host, but doing that lookup for each VM during the scrape may have some performance implications.

Will try and take a look!
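The lookup described above (VM to runtime host to parent compute) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `ref`, `vmRecord`, and `hostRecord` types and the reference values such as `host-5` and `domain-c2` are hypothetical stand-ins, not the receiver's or the vSphere SDK's actual types.

```go
package main

import "fmt"

// ref is a hypothetical, simplified stand-in for a managed object reference.
type ref struct {
	Type  string // e.g. "HostSystem" or "ClusterComputeResource"
	Value string // e.g. "host-5" or "domain-c2" (illustrative values)
}

// vmRecord and hostRecord are hypothetical, trimmed-down inventory records.
type vmRecord struct {
	Name        string
	RuntimeHost ref // host the VM currently runs on
}

type hostRecord struct {
	Name   string
	Parent ref // owning compute resource (cluster or standalone compute)
}

// clusterForVM resolves VM -> runtime host -> parent compute and reports the
// parent only when it is a cluster. The boolean is false when the host is
// unknown or is not part of a cluster.
func clusterForVM(v vmRecord, hosts map[string]hostRecord) (ref, bool) {
	h, ok := hosts[v.RuntimeHost.Value]
	if !ok || h.Parent.Type != "ClusterComputeResource" {
		return ref{}, false
	}
	return h.Parent, true
}

func main() {
	hosts := map[string]hostRecord{
		"host-1": {Name: "Host1", Parent: ref{Type: "ClusterComputeResource", Value: "domain-c1"}},
		"host-5": {Name: "Host-C2-2", Parent: ref{Type: "ClusterComputeResource", Value: "domain-c2"}},
	}
	v := vmRecord{Name: "VM1", RuntimeHost: ref{Type: "HostSystem", Value: "host-5"}}
	if c, ok := clusterForVM(v, hosts); ok {
		fmt.Printf("%s belongs to cluster %s\n", v.Name, c.Value)
	}
}
```

With a host table already in memory, the per-VM step is a map lookup; the performance concern applies if each lookup instead needs its own API request.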

@schmikei
Contributor

schmikei commented Feb 6, 2024

Small update on this; I've just been swamped with other work.

I believe I have a fix: build an in-memory map from the VM references that the Inventory API returns under each Host or Resource Pool to their respective Cluster. I think I have most of the spiked-out code in place and will hopefully get a PR out sometime this week, if not next week.
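A minimal sketch of that kind of in-memory index, assuming the VM references under each host and resource pool have already been fetched. The `compute` type, the `buildVMToCompute` helper, and the `vm-101` style reference values are hypothetical and only mirror the tree from the issue description; they are not taken from the actual fix.

```go
package main

import "fmt"

// compute is a hypothetical, simplified view of a compute resource with the
// VM reference values found beneath its hosts and resource pools.
type compute struct {
	Name    string
	HostVMs map[string][]string // host name -> VM reference values
	PoolVMs map[string][]string // resource pool name -> VM reference values
}

// buildVMToCompute indexes every VM reference value by the name of the
// compute that owns the host or resource pool it was found under.
func buildVMToCompute(computes []compute) map[string]string {
	index := make(map[string]string)
	for _, c := range computes {
		for _, refs := range c.HostVMs {
			for _, r := range refs {
				index[r] = c.Name
			}
		}
		for _, refs := range c.PoolVMs {
			for _, r := range refs {
				index[r] = c.Name
			}
		}
	}
	return index
}

func main() {
	vms := []string{"vm-101", "vm-102", "vm-103", "vm-104"}
	computes := []compute{
		{Name: "Cluster1", HostVMs: map[string][]string{"Host1": nil, "Host2": nil}},
		{
			Name:    "Cluster2",
			HostVMs: map[string][]string{"Host-C2-2": vms},
			PoolVMs: map[string][]string{"ResourcePool1": vms},
		},
	}
	index := buildVMToCompute(computes)
	for _, r := range vms {
		fmt.Printf("%s -> %s\n", r, index[r])
	}
}
```

Building the index once per scrape keeps the per-VM cost to a single map lookup, which addresses the earlier concern about doing a lookup step for every VM.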

@djaglowski
Member

Thanks @schmikei

djaglowski pushed a commit that referenced this issue Mar 8, 2024
…31113)

**Description:** First pass at better contextualizing VMs in a
multicluster deployment.

## The Problem

For multicluster deployments the original scraper code made incorrect
assumptions about VM ownership, so VMs were attributed to multiple clusters.

## The Solution

The VM folder in a datacenter is completely separated from a cluster.
Because of this we have to determine the VM's host system or resource
pool in order to set the `vcenter.cluster.name` resource attribute
appropriately.

## Things that have changed 
- ClusterComputes are now retrieved as part of the superset Computes
- For the resource attribute `vcenter.cluster.name` to show up, the
inventory type must match the string "ClusterComputeResource"
- The receiver now maintains a dynamic map of VM Managed Object
Reference value to compute name

## Customer impacts
- VM metrics will more appropriately set the `vcenter.cluster.name`
resource attribute in multi-cluster environments. This resource
attribute will not be emitted for VMs on runtime hosts outside of a
cluster.
- VM metrics will now emit with `vcenter.resource_pool.name` and
`vcenter.resource_pool.inventory_path` when the VM is a member of one.
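
The attribute rules above could be sketched as follows. This is a hedged illustration, not the receiver's code: the `vmContext` type, the `resourceAttributes` helper, and the example inventory path are hypothetical; only the attribute names and the "ClusterComputeResource" type string come from the description.

```go
package main

import "fmt"

// vmContext is a hypothetical summary of where a VM lives.
type vmContext struct {
	ComputeName       string
	ComputeType       string // "ClusterComputeResource" for clusters
	PoolName          string // empty when the VM is not in a resource pool
	PoolInventoryPath string
}

// resourceAttributes returns the placement-related resource attributes for a
// VM. The cluster name is set only when the owning compute is a cluster, and
// the resource pool attributes only when the VM is a member of a pool.
func resourceAttributes(c vmContext) map[string]string {
	attrs := make(map[string]string)
	if c.ComputeType == "ClusterComputeResource" {
		attrs["vcenter.cluster.name"] = c.ComputeName
	}
	if c.PoolName != "" {
		attrs["vcenter.resource_pool.name"] = c.PoolName
		attrs["vcenter.resource_pool.inventory_path"] = c.PoolInventoryPath
	}
	return attrs
}

func main() {
	clustered := vmContext{
		ComputeName:       "Cluster2",
		ComputeType:       "ClusterComputeResource",
		PoolName:          "ResourcePool1",
		PoolInventoryPath: "/Datacenter1/host/Cluster2/Resources/ResourcePool1", // illustrative path
	}
	standalone := vmContext{ComputeName: "StandaloneHost", ComputeType: "ComputeResource"}

	fmt.Println(resourceAttributes(clustered))
	fmt.Println(resourceAttributes(standalone))
}
```
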


**Link to tracking Issue:** Resolves #30879

**Testing:**

**Documentation:** <Describe the documentation added.>
DougManton pushed a commit to DougManton/opentelemetry-collector-contrib that referenced this issue Mar 13, 2024
XinRanZhAWS pushed a commit to XinRanZhAWS/opentelemetry-collector-contrib that referenced this issue Mar 13, 2024