
Resource optimized placement strategy #8815

Merged
53 commits merged into dotnet:main on Jan 16, 2024

Conversation

@ledjon-behluli (Contributor) commented Jan 11, 2024

This PR adds support for resource optimized placement strategy.

ResourceOptimizedPlacement is a placement strategy which attempts to optimize resource distribution across the cluster.

It assigns weights to runtime statistics to prioritize different resources and calculates a normalized score for each silo.
The silo with the lowest score is chosen for placing the activation. Normalization ensures that each property contributes proportionally to the overall score. Users can adjust the weights based on their specific requirements and priorities for load balancing.
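For illustration, here is a minimal sketch of what a weighted, normalized score could look like. The statistic names, field set, and weight values below are assumptions made for the example, not the PR's actual defaults:

```csharp
// Hypothetical statistic and weight names, used only to illustrate the scoring idea.
public readonly record struct SiloRuntimeStatistics(
    float CpuUsagePercent,      // 0..100
    float MemoryUsagePercent,   // 0..100
    long AvailableMemoryBytes,
    long MaximumMemoryBytes);

public static class ScoreSketch
{
    // Relative weights; after normalization only their proportions matter.
    private const float CpuWeight = 0.4f;
    private const float MemoryWeight = 0.4f;
    private const float AvailableMemoryWeight = 0.2f;

    // Lower score = better placement candidate.
    public static float CalculateScore(in SiloRuntimeStatistics s)
    {
        float cpu = s.CpuUsagePercent / 100f;
        float memory = s.MemoryUsagePercent / 100f;
        // Less available memory relative to the maximum means a higher (worse) contribution.
        float availableMemory = 1f - (float)s.AvailableMemoryBytes / s.MaximumMemoryBytes;

        float totalWeight = CpuWeight + MemoryWeight + AvailableMemoryWeight;
        return (CpuWeight * cpu + MemoryWeight * memory + AvailableMemoryWeight * availableMemory)
               / totalWeight;
    }
}
```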

In addition to normalization, an online adaptive algorithm provides a smoothing effect (it filters out high-frequency components) and avoids rapid signal drops by transforming them into a polynomial-like decay process. This helps avoid resource saturation on the silos, especially newly joined ones.
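As a rough illustration of that smoothing idea only, a scalar Kalman-style filter can switch its process noise depending on whether the signal is rising or falling, so rises are tracked quickly while drops decay slowly. The constants and the switching rule below are assumptions; the PR's DualModeKalmanFilter is more involved:

```csharp
// Illustrative sketch, not the PR's DualModeKalmanFilter.
public sealed class SmoothedStatistic
{
    private double _estimate;
    private double _errorCovariance = 1d;

    public double Filter(double measurement)
    {
        // Rising signal: large process noise, so the estimate follows the rise quickly.
        // Falling signal: zero process noise, so the estimate decays slowly (roughly ~1/n).
        double processNoise = measurement > _estimate ? 100d : 0d;

        // Standard scalar Kalman predict/update with measurement noise fixed at 1.
        double predictedCovariance = _errorCovariance + processNoise;
        double gain = predictedCovariance / (predictedCovariance + 1d);

        _estimate += gain * (measurement - _estimate);
        _errorCovariance = (1d - gain) * predictedCovariance;

        return _estimate;
    }
}
```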

Silos that are overloaded, as defined by the load-shedding mechanism, are not considered as candidates for new placements.

When the local silo's score is within the preference margin of another remote silo, the local silo is picked as the target.

Since more than one silo could have the exact same score, we pick one of the short-listed candidates at random, so that we don't continuously pick the first one.
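Putting the selection rules together, a sketch of the decision could look like the following. The SiloCandidate type, the IsOverloaded flag, and the comparison of the local score against the best score are illustrative simplifications, not the PR's exact code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record SiloCandidate(string Name, float Score, bool IsOverloaded);

public static class PlacementSketch
{
    public static SiloCandidate Pick(
        IReadOnlyList<SiloCandidate> silos, SiloCandidate local, float preferenceMargin)
    {
        // Overloaded silos (per the load-shedding mechanism) are never candidates.
        var candidates = silos.Where(s => !s.IsOverloaded).ToList();
        float bestScore = candidates.Min(s => s.Score);

        // Prefer the local silo when its score is within the margin of the best candidate.
        if (!local.IsOverloaded && local.Score - bestScore <= preferenceMargin)
        {
            return local;
        }

        // Several silos can share the best score; pick one of them at random
        // so the first one in the list is not chosen every time.
        var shortListed = candidates.Where(s => s.Score == bestScore).ToList();
        return shortListed[Random.Shared.Next(shortListed.Count)];
    }
}
```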

This strategy is 'static' because it does the best possible placement given the current view of the whole cluster; that view may change dramatically even if no new placement is requested, because of the various business logic in user code.
A 'dynamic' resource optimization could be attempted to rebalance the silos, but it is out of the scope of this PR because it is:

  1. More complicated.
  2. Better suited to a live-migration strategy, not a placement one.

@ledjon-behluli (Contributor, Author) commented Jan 12, 2024

@ReubenBond Here are some benchmarks for the moded filter (passing processNoiseCovariance as a parameter vs. it being a field). Speed is pretty much the same (1000 measurements), but there is a 50% drop in allocations. Albeit it's only ~60 bytes and it's allocated only once upon instantiation, so we save ~6 KB for 100 silos 😂.

| Method         | Mean     | Error    | StdDev   | Median   | Ratio | RatioSD | Allocated | Alloc Ratio |
|--------------- |---------:|---------:|---------:|---------:|------:|--------:|----------:|------------:|
| OriginalFilter | 18.02 μs | 0.784 μs | 2.311 μs | 17.14 μs |  1.00 |    0.00 |     120 B |        1.00 |
| ModedFilter    | 14.87 μs | 0.310 μs | 0.910 μs | 14.69 μs |  0.84 |    0.10 |      56 B |        0.47 |

@ReubenBond (Member)

If we end up only using DualModeKalmanFilter<float>, then perhaps we should specialize it. Looking at the JIT output, it's quite a bit shorter, with fewer calls.

@ledjon-behluli (Contributor, Author)

> If we end up only using DualModeKalmanFilter<float>, then perhaps we should specialize it. Looking at the JIT output, it's quite a bit shorter, with fewer calls.

Wouldn't dynamic PGO take care of de-virtualizing them once it "warms up"?

@ReubenBond (Member) commented Jan 12, 2024

> Wouldn't dynamic PGO take care of de-virtualizing them once it "warms up"?

I don't know, but we can check. Do you have that benchmark code somewhere?

@ledjon-behluli (Contributor, Author) commented Jan 12, 2024

> Wouldn't dynamic PGO take care of de-virtualizing them once it "warms up"?
>
> I don't know, but we can check. Do you have that benchmark code somewhere?

Yeah, here:

using System;
using System.Linq;
using BenchmarkDotNet.Attributes;

[SimpleJob, MemoryDiagnoser]
public class KalmanFilterBenchmarks
{
    private double[] _measurements = new double[1000];

    [GlobalSetup]
    public void Setup() => _measurements =
        Enumerable.Range(0, 1000).Select(_ => Random.Shared.NextDouble() * 99.8 + 0.1).ToArray();

    [Benchmark(Baseline = true)]
    public double OriginalFilter()
    {
        DualModeKalmanFilter<double> filter = new();
        double result = 0;
        foreach (double measurement in _measurements)
        {
            result = filter.Filter(measurement);
        }
        return result;
    }

    [Benchmark]
    public double ModedFilter()
    {
        DualModeKalmanFilter_Moded<double> filter = new();
        double result = 0;
        foreach (double measurement in _measurements)
        {
            result = filter.Filter(measurement);
        }
        return result;
    }
}

@ReubenBond (Member)

| Method     | Mean     | Error     | StdDev    | Ratio | Code Size | Allocated | Alloc Ratio |
|----------- |---------:|----------:|----------:|------:|----------:|----------:|------------:|
| NonGeneric | 7.512 us | 0.0093 us | 0.0082 us |  1.00 |     460 B |      40 B |        1.00 |
| Generic    | 9.770 us | 0.0224 us | 0.0209 us |  1.30 |     376 B |      40 B |        1.00 |

Generic is 30% slower than non-generic on my machine. The alloc is the filter instance itself. Unsure about the code size, but my guess would be that non-generic inlines more.
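As a toy illustration of what "specializing" means here, the sketch below contrasts a generic filter-like type whose arithmetic goes through generic-math interfaces with a plain float version. It uses a simple exponential moving average rather than the actual DualModeKalmanFilter, so treat it purely as an assumption-laden example of the generic vs. non-generic shape:

```csharp
using System.Numerics;

// Generic version: arithmetic is expressed through INumber<T>'s static abstract members.
public sealed class GenericEwma<T> where T : INumber<T>
{
    private T _estimate = T.Zero;

    public T Filter(T measurement)
    {
        T alpha = T.CreateChecked(0.1);
        _estimate += alpha * (measurement - _estimate);
        return _estimate;
    }
}

// Non-generic specialization: plain float operations, no generic-math indirection.
public sealed class FloatEwma
{
    private float _estimate;

    public float Filter(float measurement)
    {
        _estimate += 0.1f * (measurement - _estimate);
        return _estimate;
    }
}
```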

@ledjon-behluli (Contributor, Author)

@ReubenBond Non-generic is a bit faster:

| Method                 | Mean     | Error    | StdDev   | Median   | Ratio | RatioSD | Allocated | Alloc Ratio |
|----------------------- |---------:|---------:|---------:|---------:|------:|--------:|----------:|------------:|
| OriginalFilter         | 19.48 μs | 1.130 μs | 3.331 μs | 19.24 μs |  1.00 |    0.00 |     104 B |        1.00 |
| ModedFilter            | 15.53 μs | 0.418 μs | 1.231 μs | 15.47 μs |  0.82 |    0.16 |      40 B |        0.38 |
| ModedFilter_NonGeneric | 11.39 μs | 0.327 μs | 0.889 μs | 11.11 μs |  0.60 |    0.10 |      40 B |        0.38 |

@ReubenBond (Member)

I think we should go with non-generic for now

@ledjon-behluli (Contributor, Author)

Posted at the same time 😂

@ledjon-behluli (Contributor, Author)

> I think we should go with non-generic for now

Agreed; even for 256 GB of RAM, that's 2.56e+11 bytes, which is still well within the range of float's maximum of ~3.4e+38.

…ke it easier for the users to understand + add comments to explain that weights are relative to each other + modified the director to take into account potential totalWeight = 0 + removed config exception throwing if sum = 0; as the score will be 0 but due to the jitter it will act as it were RandomPlacement
@ledjon-behluli (Contributor, Author)

@ReubenBond I've made some small fixes, added some comments, and switched the options to take int instead of float to make them more natural for end users.

The weights don't strictly need a hard upper limit (currently 100) thanks to normalization, but I believe it's better to place a boundary for the sake of sanity. This is debatable, of course!

Other than the above "issue", I don't see anything further we need to do; please let me know if you have something else in mind, otherwise this LGTM and is ready for merging.

@ledjon-behluli (Contributor, Author)

Update:

  1. I've changed ResourceStatistics to contain only non-nullable elements, for two reasons:
  • It simplifies CalculateScore by removing the null checks, which is no problem as the logic remains the same and is correct; CDM-KF was treating nulls as 0s either way.
  • Removing the nullable types from this struct makes its size go down from 56 bytes to 32 bytes.

One might expect 'float?' to have a size of 5 bytes = 4 (float) + 1 (hasValue), but the struct is padded to the alignment of its largest field.
For 'float?': 4 (float) + 1 (hasValue) + 3 (padding to the largest field, i.e. 'float', 4 bytes) = 8 bytes total.
For 'long?': 8 (long) + 1 (hasValue) + 7 (padding to the largest field, i.e. 'long', 8 bytes) = 16 bytes total.
Total (nullable): 8 (float?) + 8 (float?) + 16 (long?) + 16 (long?) + 1 (bool) + 7 (padding) = 56 bytes
Total (non-nullable): 4 (float) + 4 (float) + 8 (long) + 8 (long) + 1 (bool) + 7 (padding) = 32 bytes

  2. I applied packing to the struct to shave off an extra 7 bytes, making it 25 bytes in the end.

This will help increase the number of ValueTuple<int, ResourceStatistics> elements (inside MakePick) that can be stack-allocated from 64 to 128 (at a 4 KB stack limit), therefore covering clusters with up to 128 silos before switching to ArrayPool.
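For illustration, here is a sketch of the two pieces described above: a Pack = 1 statistics struct (4 + 4 + 8 + 8 + 1 = 25 bytes) and a buffer that is stack-allocated up to a fixed budget and rented from ArrayPool beyond it. The type names, field set, and the budget constant are stand-ins, not the PR's exact code:

```csharp
using System;
using System.Buffers;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical packed statistics struct: 25 bytes with Pack = 1.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public readonly struct PackedStats
{
    public readonly float CpuUsage;        // 4 bytes
    public readonly float MemoryUsage;     // 4 bytes
    public readonly long AvailableMemory;  // 8 bytes
    public readonly long MaximumMemory;    // 8 bytes
    public readonly bool IsOverloaded;     // 1 byte, no trailing padding due to Pack = 1
}

public static class CandidateBufferSketch
{
    private const int StackAllocBudgetBytes = 4 * 1024;

    public static void Process(int siloCount)
    {
        // ~163 of these 25-byte entries fit in 4 KB; the PR's ValueTuple<int, ResourceStatistics>
        // entries are larger, hence its 128-silo figure.
        int maxOnStack = StackAllocBudgetBytes / Unsafe.SizeOf<PackedStats>();

        PackedStats[]? rented = null;
        Span<PackedStats> buffer = siloCount <= maxOnStack
            ? stackalloc PackedStats[siloCount]
            : (rented = ArrayPool<PackedStats>.Shared.Rent(siloCount)).AsSpan(0, siloCount);

        try
        {
            buffer.Clear(); // placeholder for filling and scoring per-silo statistics
        }
        finally
        {
            if (rented is not null)
            {
                ArrayPool<PackedStats>.Shared.Return(rented);
            }
        }
    }
}
```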

@ledjon-behluli changed the title from "Static resource optimized placement strategy" to "Resource optimized placement strategy" on Jan 15, 2024
@ReubenBond merged commit 2e7714c into dotnet:main on Jan 16, 2024
19 checks passed
@github-actions bot locked and limited conversation to collaborators on Feb 16, 2024