Support time based retention #17413

Merged 33 commits on Jun 7, 2024

stelfrag
Collaborator

@stelfrag stelfrag commented Apr 16, 2024

Summary

Fixes: #17082

Changes

  • Three tiers are enabled by default
    • Tier 0 - high resolution (update every)
    • Tier 1 - per minute
    • Tier 2 - per hour
    • Configure disk space using dbengine tier x disk space MB = nnnn
      • e.g. for tier 0, use dbengine tier 0 disk space MB = 1024
    • Configure time retention using dbengine tier x retention days = nnnn
      • e.g. for tier 0, use dbengine tier 0 retention days = 14
  • The backfill option is now global for all tiers (none, full, new)
    • Set it with the dbengine tier backfill option
  • The dbengine tier x disk space MB options can be set to 0 to use as much space as needed for the desired time-based retention
    • The remaining disk space for each tier (minus 5%) will be used for the dbengine retention chart (and alert)
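
As an illustration of the options listed above, here is a minimal Python sketch that renders a [db] section for a set of tiers. The option names and the backfill values (none, full, new) come from this description; the helper name render_db_section, the indentation, and the example values are assumptions made for illustration and are not part of Netdata.

```python
VALID_BACKFILL = {"none", "full", "new"}  # values listed in the PR description


def render_db_section(tiers, backfill):
    """Render a [db] section from {tier: (disk_space_mb, retention_days)}.

    Hypothetical helper for illustration only; Netdata does not ship it.
    """
    if backfill not in VALID_BACKFILL:
        raise ValueError(f"unknown backfill mode: {backfill!r}")
    lines = ["[db]", f"    dbengine tier backfill = {backfill}"]
    for tier in sorted(tiers):
        disk_mb, days = tiers[tier]
        lines.append(f"    dbengine tier {tier} disk space MB = {disk_mb}")
        lines.append(f"    dbengine tier {tier} retention days = {days}")
    return "\n".join(lines)


# Example values only: tier 0 mirrors the examples given in the description.
print(render_db_section({0: (1024, 14), 1: (1024, 90)}, backfill="new"))
```

The printed text can be compared against an existing netdata.conf before applying any change by hand.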

The disk space related options

  • dbengine multihost disk space MB
  • dbengine disk space MB

have been removed. The dbengine multihost disk space MB option is automatically renamed to dbengine tier 0 disk space MB.

Disk space for the other tiers (up to 5) is configured similarly, e.g.:

  • dbengine tier 0 disk space MB
  • dbengine tier 1 disk space MB
  • dbengine tier 2 disk space MB
  • dbengine tier 3 disk space MB
  • dbengine tier 4 disk space MB

You can disable the disk space check (and use the available space) by setting the disk space to 0, e.g.:

  • dbengine tier 0 disk space MB = 0
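
The rename described above can be sketched as a small Python function that rewrites a dictionary of [db] options. This only models the behaviour stated in this description; the function name migrate_db_options is hypothetical, and the choice to let an explicitly set new-style option take precedence over the legacy one is an assumption, not something the description confirms.

```python
LEGACY_TIER0 = "dbengine multihost disk space MB"  # renamed per the description
NEW_TIER0 = "dbengine tier 0 disk space MB"
REMOVED = "dbengine disk space MB"                 # removed per the description


def migrate_db_options(options):
    """Return a copy of `options` using the new option names.

    Assumption: if both the legacy and the new tier 0 option are present,
    the new one wins.
    """
    migrated = {}
    for name, value in options.items():
        if name == REMOVED:
            continue  # no replacement is named for this option
        if name == LEGACY_TIER0:
            migrated.setdefault(NEW_TIER0, value)
        else:
            migrated[name] = value
    return migrated


old = {"dbengine multihost disk space MB": 2048, "dbengine disk space MB": 256}
print(migrate_db_options(old))
```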

Defaults

The default settings on a newly installed agent when no options are set are:

  • High resolution tier (tier 0)
    • Default disk space: 1 GB or 14 days of data
    • Stores values per update every (default = 1)
  • Tier 1
    • Default disk space: 1 GB or 90 days of data
    • Stores values per minute
  • Tier 2
    • Default disk space: 1 GB or 2 years (2 x 365 days) of data
    • Stores values per hour
  • Retention charts that store the percentage of space and time used (vs the configured values)
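
To make the "size or days" defaults concrete, the following Python sketch models a tier with both limits and checks whether either one has been exceeded. It is a simplified model, not the dbengine implementation: the names TierLimits and over_limit are hypothetical, 1 GB is represented as 1024 MB, and using None for "no time limit" is an assumption. Treating a disk space of 0 as "no size check" follows the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimits:
    disk_space_mb: int             # 0 disables the size check (per the description)
    retention_days: Optional[int]  # None means no time limit (assumption)


# Defaults listed in the description, with 1 GB written as 1024 MB (assumption).
DEFAULTS = {
    0: TierLimits(1024, 14),
    1: TierLimits(1024, 90),
    2: TierLimits(1024, 2 * 365),
}


def over_limit(limits, used_mb, oldest_age_days):
    """Return True if the tier exceeds its size limit or its time limit."""
    too_big = limits.disk_space_mb > 0 and used_mb > limits.disk_space_mb
    too_old = (limits.retention_days is not None
               and oldest_age_days > limits.retention_days)
    return too_big or too_old


# Tier 0 with 500 MB on disk but data that is 15 days old: the time limit applies.
print(over_limit(DEFAULTS[0], used_mb=500, oldest_age_days=15))
```

With a disk space of 0, only the time limit can trigger in this model, which matches the stated purpose of that setting.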

Time retention not automatically activated

If any of the following options is specified in the config ([db] section), time-based retention will not be activated automatically. In that case, the dbengine tier x retention days option must be set explicitly.

  • dbengine multihost disk space MB
  • dbengine tier 1 update every iterations
  • dbengine tier 2 update every iterations
  • dbengine tier 3 update every iterations
  • dbengine tier 4 update every iterations
  • dbengine tier 1 disk space MB
    • or dbengine tier 1 multihost disk space MB
  • dbengine tier 2 disk space MB
    • or dbengine tier 2 multihost disk space MB
  • dbengine tier 3 disk space MB
    • or dbengine tier 3 multihost disk space MB
  • dbengine tier 4 disk space MB
    • or dbengine tier 4 multihost disk space MB
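
The rule above can be expressed as a short check over the option names present in the [db] section. This Python sketch encodes only the list given in this description; the function name time_retention_auto_activated is hypothetical, and the real agent may apply additional conditions.

```python
import re

# Option names from the list above: the legacy multihost option, plus the
# "update every iterations" and disk space options for tiers 1 to 4.
_BLOCKING = re.compile(
    r"dbengine (?:multihost disk space MB"
    r"|tier [1-4] update every iterations"
    r"|tier [1-4] (?:multihost )?disk space MB)"
)


def time_retention_auto_activated(db_option_names):
    """Return False if any option that blocks automatic activation is set."""
    return not any(_BLOCKING.fullmatch(name) for name in db_option_names)


print(time_retention_auto_activated(["dbengine tier 1 disk space MB"]))
```

Note that dbengine tier 0 disk space MB is not in the list above, so it does not block activation in this sketch.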

@stelfrag stelfrag force-pushed the limit_tier0_update_every branch 3 times, most recently from db8ff30 to 56f2033 Compare April 23, 2024 11:36
@stelfrag stelfrag force-pushed the limit_tier0_update_every branch 2 times, most recently from d280a21 to fbec2cb Compare April 29, 2024 06:35
@ilyam8 ilyam8 force-pushed the limit_tier0_update_every branch from fb168d5 to b3f487e Compare May 6, 2024 10:26
@stelfrag stelfrag force-pushed the limit_tier0_update_every branch 2 times, most recently from 15c5f52 to 6dc60fa Compare May 13, 2024 10:57
@thiagoftsm
Contributor

thiagoftsm commented May 16, 2024

@stelfrag I ran different tests with this PR, and I did not observe anything anomalous:

  • I compiled on a host without previous netdata installation
  • I tested on a host running netdata with default options, and the PR ran normally. I then changed the default collection time to an invalid value and got the error message. Netdata continued running as expected.
  • I also tested using ram mode to be sure nothing was changed with it.

Tests were done on builds compiled with netdata-installer.

@ktsaou
Member

ktsaou commented May 16, 2024

@stelfrag please rebase this to test it.

@stelfrag stelfrag force-pushed the limit_tier0_update_every branch 6 times, most recently from d84d12b to 3fe699b Compare June 5, 2024 19:55
stelfrag and others added 20 commits June 6, 2024 11:49
Add new dbengine tier 0 multihost disk space MB
Time based retention defaults to disabled if all parameters are default
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
…ngine tier 0 multihost disk space MB" is not specified
Calculated disk space matches the calculated used for datafile rotation
Fix percentage contribution calculation of metadata to each tier
…ion charts vs space reported in /api/v2/node_instances
RRDENG_MIN_DISK_SPACE_MB 256 MB (from 64)
dbengine multihost disk space MB maps to dbengine tier 0 disk space MB
Disk space for tiers "dbengine tier X disk space MB"
… MB.

If the value is non zero for tier 0, it must be at least 256 MB
@stelfrag stelfrag marked this pull request as draft June 6, 2024 20:45
@stelfrag stelfrag marked this pull request as ready for review June 7, 2024 13:12
@stelfrag stelfrag merged commit 1aa8a3b into netdata:master Jun 7, 2024
144 of 145 checks passed
@stelfrag stelfrag deleted the limit_tier0_update_every branch June 10, 2024 07:56
Development

Successfully merging this pull request may close these issues.

[Feat]: Retention in days with cap in size
4 participants