[RLlib] Add off-policy'ness metric to new API stack. #48227

sven1977 · 2024-10-23T19:37:09Z

Add off-policy'ness metric to new API stack.

The old stack has a convenient metric measuring how many policy updates the sampling/behavior policy is behind the policy actually being trained. For algorithms like APPO and IMPALA, this metric is crucial for understanding learning deficiencies, and its value should be as close to 1 as possible (1 update behind).
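
As a toy illustration of what such a metric measures (this is not RLlib's implementation; the function and both counter names are hypothetical), the lag can be thought of as the difference between two gradient-update counters: one for the policy being trained and one for the weights the sampler used when it collected the batch.

```python
def grad_update_lag(learner_num_updates: int, sampler_num_updates: int) -> int:
    """Return how many gradient updates the sampling policy is behind the trained one.

    Both arguments are hypothetical counters used only for illustration.
    """
    if sampler_num_updates > learner_num_updates:
        raise ValueError("The sampler cannot be ahead of the learner.")
    return learner_num_updates - sampler_num_updates


# Made-up numbers: the learner has applied 10 updates, while the batch was
# sampled with weights that reflected 9 of them.
print(grad_update_lag(10, 9))  # 1
```

With these made-up counters, a result of 1 corresponds to the "1 update behind" case mentioned above; larger values mean the sampled data is more off-policy.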

Analogous to the old API stack's metric, the new metric can be found in the result dict under:
learners/[module_id]/diff_num_grad_updates_vs_sampler_policy
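
A minimal sketch of reading this value, assuming the result dict is a plain nested mapping that follows the path above. The helper `get_off_policyness`, the hand-built `example_result`, and the module ID `"default_policy"` are illustrative assumptions, not part of this PR.

```python
from typing import Any, Mapping, Optional

# Metric name as given in the PR description.
METRIC_KEY = "diff_num_grad_updates_vs_sampler_policy"


def get_off_policyness(result: Mapping[str, Any], module_id: str) -> Optional[float]:
    """Look up learners/[module_id]/<METRIC_KEY>; return None if it is absent."""
    module_stats = result.get("learners", {}).get(module_id, {})
    return module_stats.get(METRIC_KEY)


# Hand-built stand-in for a result dict (assumption: nested plain dicts).
example_result = {"learners": {"default_policy": {METRIC_KEY: 1.0}}}
print(get_off_policyness(example_result, "default_policy"))  # 1.0
```

Returning `None` for a missing key, rather than raising, keeps the helper usable on results that do not report the metric.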

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>


simonsays1980

LGTM.

simonsays1980 · 2024-10-24T16:05:34Z

rllib/core/learner/learner.py

@@ -98,21 +98,6 @@
 LR_KEY = "learning_rate"


-@dataclass


Finally this thing is gone :)


wip


2342bc9

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 requested a review from simonsays1980 as a code owner October 23, 2024 19:37

sven1977 enabled auto-merge (squash) October 24, 2024 09:42

github-actions bot added the go label Oct 24, 2024

sven1977 added 2 commits October 24, 2024 11:47

Merge branch 'master' of https://github.com/ray-project/ray into add_…


3707a13

…off_policyness_metric_to_new_api_stack

wip


e708c21

Signed-off-by: sven1977 <svenmika1977@gmail.com>

github-actions bot disabled auto-merge October 24, 2024 12:34

sven1977 enabled auto-merge (squash) October 24, 2024 13:38

sven1977 added tests-ok rllib rllib-newstack labels Oct 24, 2024

simonsays1980 approved these changes Oct 24, 2024


sven1977 merged commit 2673bf3 into ray-project:master Oct 24, 2024
6 checks passed

sven1977 deleted the add_off_policyness_metric_to_new_api_stack branch October 24, 2024 17:54

Jay-ju pushed a commit to Jay-ju/ray that referenced this pull request Nov 5, 2024

[RLlib] Add off-policy'ness metric to new API stack. (ray-project#48227)

f14706b

JP-sDEV pushed a commit to JP-sDEV/ray that referenced this pull request Nov 14, 2024

[RLlib] Add off-policy'ness metric to new API stack. (ray-project#48227)

d4a5ec4

Signed-off-by: JP-sDEV <jon.pablo80@gmail.com>

mohitjain2504 pushed a commit to mohitjain2504/ray that referenced this pull request Nov 15, 2024

[RLlib] Add off-policy'ness metric to new API stack. (ray-project#48227)

94aea7d

Signed-off-by: mohitjain2504 <mohit.jain@dream11.com>
