Feature request: metrics from /sys/class/watchdog #2309

Closed
mirzak opened this issue Mar 4, 2022 · 3 comments · Fixed by #2880

Comments

@mirzak

mirzak commented Mar 4, 2022

Documentation: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-watchdog

The most interesting metric would be:

What:		/sys/class/watchdog/watchdogn/bootstatus
Date:		August 2015
Contact:	Wim Van Sebroeck <wim@iguana.be>
Description:
		It is a read only file. It contains status of the watchdog
		device at boot. It is equivalent to WDIOC_GETBOOTSTATUS of
		ioctl interface.

This lets you determine the cause of a reboot or reset.
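
For illustration, here is a minimal Go sketch of reading that file for every watchdog device. It assumes the value is printed as a decimal integer and that each device appears as a `watchdog*` directory under `/sys/class/watchdog`. The helper names `parseBootStatus` and `readBootStatuses` are hypothetical and are not taken from node_exporter or procfs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseBootStatus converts the raw contents of a bootstatus file into an
// integer. Assumption: the kernel prints the value in decimal.
func parseBootStatus(raw string) (int64, error) {
	return strconv.ParseInt(strings.TrimSpace(raw), 10, 64)
}

// readBootStatuses returns the boot status of every watchdog device found
// under root, keyed by device name (for example "watchdog0").
func readBootStatuses(root string) (map[string]int64, error) {
	paths, err := filepath.Glob(filepath.Join(root, "watchdog*", "bootstatus"))
	if err != nil {
		return nil, err
	}
	out := make(map[string]int64, len(paths))
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return nil, fmt.Errorf("read %s: %w", p, err)
		}
		v, err := parseBootStatus(string(data))
		if err != nil {
			return nil, fmt.Errorf("parse %s: %w", p, err)
		}
		out[filepath.Base(filepath.Dir(p))] = v
	}
	return out, nil
}

func main() {
	statuses, err := readBootStatuses("/sys/class/watchdog")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for name, v := range statuses {
		fmt.Printf("%s bootstatus=%d\n", name, v)
	}
}
```

Reading the sysfs file only needs read access to the attribute, which is why it is attractive compared with opening the watchdog device node.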

@discordianfish
Member

What does the context look like? But yeah, in general that sounds reasonable.

@discordianfish
Member

As usual, though, the parsing should go into procfs. Alternatively, we could use the ioctl interface if that can be done as non-root.

@gavinkflam
Contributor

gavinkflam commented Dec 3, 2023

Hi all, I am adding watchdog stats to procfs. I am waiting for the pull request to be merged and included in a new procfs release.

prometheus/procfs#594

gavinkflam added a commit to gavinkflam/node_exporter that referenced this issue Dec 5, 2023
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
gavinkflam added a commit to gavinkflam/node_exporter that referenced this issue Dec 20, 2023
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
gavinkflam added a commit to gavinkflam/node_exporter that referenced this issue Mar 8, 2024
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
gavinkflam added a commit to gavinkflam/node_exporter that referenced this issue Mar 8, 2024
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
gavinkflam added a commit to gavinkflam/node_exporter that referenced this issue Mar 9, 2024
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
gavinkflam added a commit to gavinkflam/node_exporter that referenced this issue Mar 9, 2024
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
SuperQ pushed a commit that referenced this issue Mar 9, 2024
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
gitperr pushed a commit to gitperr/node_exporter that referenced this issue Apr 30, 2024
gitperr pushed a commit to gitperr/node_exporter that referenced this issue Apr 30, 2024
Signed-off-by: David O'Rourke <david.orourke@gmail.com>

chore:remove constant from function (prometheus#2884)

Signed-off-by: tyltr <tylitianrui@126.com>

build(deps): bump github.com/jsimonetti/rtnetlink from 1.4.0 to 1.4.1 (prometheus#2909)

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.4.0 to 1.4.1.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](jsimonetti/rtnetlink@v1.4.0...v1.4.1)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix hwmon nil ptr (prometheus#2873)

* fix hwmon nil ptr

The symlink may be lost in some cases.

---------

Signed-off-by: TaoGe <6657718+yowenter@users.noreply.github.com>

Fix hwmon error capture (prometheus#2915)

Fix golangci-lint "ineffectual assignment" by correctly capturing any
errors within the hwmon gathering loop.

Signed-off-by: Ben Kochie <superq@gmail.com>

Update common Prometheus files (prometheus#2917)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Revert "Add ZFS freebsd per dataset stats (prometheus#2753)" (prometheus#2925)

This reverts commit f34aaa6.

Signed-off-by: Caleb Webber <caleb@codingthemsoftly.com>

filesystem: fix mountTimeout not working issue (prometheus#2903)

Signed-off-by: DongWei <jiangxuege@hotmail.com>

Fix description for NodeDiskIOSaturation alert (prometheus#2929)

NodeDiskIOSaturation description should say 30m per the "for" clause

Signed-off-by: Taylor Sly <slyt@users.noreply.github.com>

Enforce no subprocess policy (prometheus#2926)

Add depguard to golangci-lint to enforce the no-os/exec policy.

Signed-off-by: Ben Kochie <superq@gmail.com>

filesystem: surface device errors (prometheus#2923)

filesystem: surface filesystem device error

Fixes: prometheus#2918
---------

Signed-off-by: Pamela Mei i540369 <pamela.mei@sap.com>

Revert "filesystem: fix mountTimeout not working issue (prometheus#2903)" (prometheus#2932)

This reverts commit 9f1f791.

Signed-off-by: Ben Kochie <superq@gmail.com>

Update common Prometheus files (prometheus#2939)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Update common Prometheus files (prometheus#2946)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Update common Prometheus files (prometheus#2949)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Add multi-cluster support for Nodes dashboard (prometheus#2945)

Signed-off-by: Adrian Berger <adria.berger94@gmail.com>

disable selinux,fix end-to-end-test.sh error(prometheus#2934) (prometheus#2937)

Signed-off-by: heyitao <heyitao@uniontech.com>
Co-authored-by: heyitao <heyitao@uniontech.com>

Add new collector and metrics for watchdog (prometheus#2309) (prometheus#2880)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

Enable watchdog module by default; Add no data error (prometheus#2953)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

Update common Prometheus files (prometheus#2954)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

build(deps): bump google.golang.org/protobuf from 1.32.0 to 1.33.0 (prometheus#2955)

Bumps google.golang.org/protobuf from 1.32.0 to 1.33.0.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Update common Prometheus files (prometheus#2959)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Sanitize ethtool metric name keys

Apply the same metric name sanitization to the keys as to the metric
names. This avoids conflicting help strings in the metric registry.

Fixes: prometheus#2893

Signed-off-by: Ben Kochie <superq@gmail.com>
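
As a rough sketch of the idea in the commit above: if two raw statistic names collapse to the same metric name after sanitization, keying the lookup table by the sanitized name keeps a single entry per metric. The regular expression, the `sanitize` helper, and the `dedupeKeys` helper are assumptions made for illustration and are not copied from the ethtool collector.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidChars matches anything outside a conservative metric-name alphabet.
// Assumption: this approximates the sanitization the collector applies.
var invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// sanitize replaces every invalid character with an underscore.
func sanitize(name string) string {
	return invalidChars.ReplaceAllString(name, "_")
}

// dedupeKeys maps each sanitized name to the first raw name that produced it,
// so raw names that collide after sanitization share one entry.
func dedupeKeys(raw []string) map[string]string {
	seen := make(map[string]string, len(raw))
	for _, r := range raw {
		key := sanitize(r)
		if _, ok := seen[key]; !ok {
			seen[key] = r
		}
	}
	return seen
}

func main() {
	keys := dedupeKeys([]string{"rx-packets", "rx_packets", "tx.bytes"})
	for k, v := range keys {
		fmt.Printf("%s <- %s\n", k, v)
	}
}
```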

Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

chore: fix some typos (prometheus#2974)

Signed-off-by: occupyhabit <wangmengjiao@outlook.com>

collector/textfile: Avoid inconsistent help-texts (prometheus#2962)

Avoid metrics with inconsistent help-texts. The earlier behaviour has
been preserved in the sense that the first encountered instance is still
used to generate metrics, whereas the subsequent inconsistent ones are
ignored along with a few peripheral changes.

```
 # HELP node_scrape_collector_duration_seconds node_exporter: Duration of a collector scrape.
 # TYPE node_scrape_collector_duration_seconds gauge
 node_scrape_collector_duration_seconds{collector="textfile"} 0.0004005
 # HELP node_scrape_collector_success node_exporter: Whether a collector succeeded.
 # TYPE node_scrape_collector_success gauge
 node_scrape_collector_success{collector="textfile"} 1
 # HELP node_textfile_mtime_seconds Unixtime mtime of textfiles successfully read.
 # TYPE node_textfile_mtime_seconds gauge
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-bar.prom"} 1.710812009e+09
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-foo.prom"} 1.710811982e+09
 # HELP node_textfile_scrape_error 1 if there was an error opening or reading a file, 0 otherwise
 # TYPE node_textfile_scrape_error gauge
 node_textfile_scrape_error 1
 # HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
 # TYPE promhttp_metric_handler_errors_total counter
 promhttp_metric_handler_errors_total{cause="encoding"} 0
 promhttp_metric_handler_errors_total{cause="gathering"} 0
 # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
 # TYPE promhttp_metric_handler_requests_in_flight gauge
 promhttp_metric_handler_requests_in_flight 1
 # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
 # TYPE promhttp_metric_handler_requests_total counter
 promhttp_metric_handler_requests_total{code="200"} 0
 promhttp_metric_handler_requests_total{code="500"} 0
 promhttp_metric_handler_requests_total{code="503"} 0
 # HELP tau_infrastructure_performing_maintenance_task At what timestamp a given task started or stopped, the last time it was run.
 # TYPE tau_infrastructure_performing_maintenance_task gauge
 tau_infrastructure_performing_maintenance_task{main_task="nightly",start_or_stop="start",sub_task="main"} 1.64728080198446e+09
```

Fixes: prometheus#2317

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
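
The keep-first behaviour described in the commit message above can be sketched as follows. This is a simplified illustration, assuming each parsed sample carries its metric name and help text. The `sample` type and `filterConsistent` function are hypothetical and do not mirror the textfile collector's real data structures.

```go
package main

import "fmt"

// sample is a hypothetical, flattened view of one parsed metric.
type sample struct {
	Name  string
	Help  string
	Value float64
}

// filterConsistent keeps samples whose help text matches the first help text
// seen for that metric name and counts the ones it drops.
func filterConsistent(samples []sample) (kept []sample, dropped int) {
	firstHelp := make(map[string]string)
	for _, s := range samples {
		help, seen := firstHelp[s.Name]
		if !seen {
			// The first encountered instance defines the help text.
			firstHelp[s.Name] = s.Help
			kept = append(kept, s)
			continue
		}
		if help != s.Help {
			// A later instance with a different help text is ignored.
			dropped++
			continue
		}
		kept = append(kept, s)
	}
	return kept, dropped
}

func main() {
	kept, dropped := filterConsistent([]sample{
		{Name: "task_started", Help: "When the task started.", Value: 1},
		{Name: "task_started", Help: "Start time of the task.", Value: 2},
	})
	fmt.Println(len(kept), dropped)
}
```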

Update common Prometheus files (prometheus#2973)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

zfs: Log mib when sysctl read fails on FreeBSD

When the zfs collector fails on FreeBSD it doesn't log which `mib` triggered the issue. This makes diagnostics hard.

Incompatibilities in the list of supported mibs are not uncommon with major OS updates. With this change, it will be easier for users to report the specific mib that is triggering the failure.

Related to prometheus#2847

Signed-off-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com>

chore: fix typo in comment

Signed-off-by: looklose <shishuaiqun@yeah.net>

fibre_channel: update procfs to take into account optional attributes (prometheus#2933)

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

refactor: Optimize code by using built-in constants in the standard library (prometheus#2989)

Signed-off-by: coderwander <770732124@qq.com>

os_release.go: Removed caching of modtime/filename of os-release file. (prometheus#2987)

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

fix: data race of NetClassCollector metrics initialization when multiple requests happen (prometheus#2995)

Signed-off-by: John Guo <john@johng.cn>

Update common Prometheus files (prometheus#2992)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Update build (prometheus#3000)

* Update Go to 1.22.
* Update Go modules.
* Use new version collector.
* Use standard library slices package.

Signed-off-by: Ben Kochie <superq@gmail.com>

Fix watchdog_test lint and test failures on macos. (prometheus#3003)

Ensure identical build flags embedded in both files.

Signed-off-by: Chris Cleeland <chris.cleeland@gmail.com>

Release v1.8.0 (prometheus#3002)

* [CHANGE] exec_bsd: Fix labels for `vm.stats.sys.v_syscall` sysctl prometheus#2895
* [CHANGE] diskstats: Ignore zram devices on linux systems prometheus#2898
* [CHANGE] textfile: Avoid inconsistent help-texts  prometheus#2962
* [CHANGE] os: Removed caching of modtime/filename of os-release file prometheus#2987
* [FEATURE] xfrm: Add new collector prometheus#2866
* [FEATURE] watchdog: Add new collector prometheus#2880
* [ENHANCEMENT] cpu_vulnerabilities: Add mitigation information label prometheus#2806
* [ENHANCEMENT] nfsd: Handle new `wdeleg_getattr` attribute prometheus#2810
* [ENHANCEMENT] netstat: Add TCPOFOQueue to default netstat metrics prometheus#2867
* [ENHANCEMENT] filesystem: surface device errors prometheus#2923
* [ENHANCEMENT] os: Add support end parsing prometheus#2982
* [ENHANCEMENT] zfs: Log mib when sysctl read fails on FreeBSD prometheus#2975
* [ENHANCEMENT] fibre_channel: update procfs to take into account optional attributes prometheus#2933
* [BUGFIX] cpu: Fix debug log in cpu collector prometheus#2857
* [BUGFIX] hwmon: Fix hwmon nil ptr prometheus#2873
* [BUGFIX] hwmon: Fix hwmon error capture prometheus#2915
* [BUGFIX] zfs: Revert "Add ZFS freebsd per dataset stats" prometheus#2925
* [BUGFIX] ethtool: Sanitize ethtool metric name keys prometheus#2940
* [BUGFIX] fix: data race of NetClassCollector metrics initialization prometheus#2995

Signed-off-by: Ben Kochie <superq@gmail.com>

Add logging for ethtool device include/exclude and metrics include flags (prometheus#2979)

Signed-off-by: Sam Leiken <sam.k.leiken@gmail.com>