Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide "what changed" array (of Booleans) to derived stores #6777

Open
rmunn opened this issue Sep 27, 2021 · 11 comments · May be fixed by #8656
Open

Provide "what changed" array (of Booleans) to derived stores #6777

rmunn opened this issue Sep 27, 2021 · 11 comments · May be fixed by #8656

Comments

@rmunn
Copy link
Contributor

rmunn commented Sep 27, 2021

Describe the problem

When a derived store has multiple parent stores, its callback might sometimes want to know which parent store changed and triggered the callback. This would mostly be useful in advanced callbacks that use set (and, if #6750 is merged, update) to calculate their value.

Describe the proposed solution

The derived store's callback could take an optional parameter after set (and after update, if #6750 is merged), which would be an array of Booleans describing which parent store(s) were updated. The Booleans would be in the same order as the parent stores, i.e. if stores[i] was updated, then updated[i] (or changed[i], see naming discussion below) would be true.

This would actually be quite simple to implement, looking something like this:

diff --git a/src/runtime/store/index.ts b/src/runtime/store/index.ts
index 51a13b85e..a67f11fe2 100644
--- a/src/runtime/store/index.ts
+++ b/src/runtime/store/index.ts
@@ -182,6 +182,7 @@ export function derived<T>(stores: Stores, fn: Function, initial_value?: T): Rea
                const values = [];
 
                let pending = 0;
+               let changed = [];
                let cleanup = noop;
 
                const sync = () => {
@@ -189,7 +190,8 @@ export function derived<T>(stores: Stores, fn: Function, initial_value?: T): Rea
                                return;
                        }
                        cleanup();
-                       const result = fn(single ? values[0] : values, set, update);
+                       const result = fn(single ? values[0] : values, set, update, single ? changed[0] : changed);
+                       changed.fill(false);
                        if (auto) {
                                set(result as T);
                        } else {
@@ -202,6 +204,7 @@ export function derived<T>(stores: Stores, fn: Function, initial_value?: T): Rea
                        (value) => {
                                values[i] = value;
                                pending &= ~(1 << i);
+                               changed[i] = true;
                                if (inited) {
                                        sync();
                                }

Of course, type definitions would also need to change, so the actual PR would have more to it than that. But the core of this feature can be implemented in just four lines of code.

Alternatives considered

Some use cases for this feature can be handled with the status quo:

// Can't do this yet:
const discountedPrice = derived(
  [price, discount],
  ([$price, $discount], set, [priceChanged, discountChanged]) => {
  if (priceChanged) {
    console.log('we promise to keep discounted price the same even when regular price changes');
  }
  if (discountChanged) {
    console.log('new discount is calculated from current price');
    set($price - $discount);
  }
});

// Have to do this instead:
const manualDiscounts = (() => {
  let currentPrice;
  price.subscribe(p => currentPrice = p);
  return derived(discount, $discount => currentPrice - $discount);
})();

Perhaps not the best example, as the manualDiscounts store is simpler to read than the version with a changed array. But imagine a complex state being stored in the derived store: the current user, the shopping cart contents, and data about the product being viewed right now (product description, review scores, top five reviews...).

Importance

nice to have

Miscellaneous information

The first version of this proposal asked for a bitmap instead of a Boolean array. But once I realized that array destructuring would allow giving names to the items in the array, so that you could write [priceChanged, discountChanged] and then reference priceChanged instead of changed[0], the Boolean array version of the idea became much, much better, so I abandoned the bitmap idea.

@rmunn
Copy link
Contributor Author

rmunn commented Sep 27, 2021

Like #6750, this would probably be a breaking change for anyone creating custom derived stores, because the Typescript definition of the callback passed to derived would end up changing. But existing code that actually uses derived stores would not be broken, because you can always accept fewer arguments in a callback than are actually passed to it (c.f. Array.map and so on).

@Prinzhorn
Copy link
Contributor

The answer is probably performance/gc (is it though?), but why not pass an array like [true, false] and check changed[0]? This also aligns with the array indices of [$price, $discount], which could become useful if you dynamically create derived stores. To me using bit masks in JavaScript is incredibly rare and relying on falsyness also feels dirty to me (I remember when people, who presumably never heard of code quality, used to do ~a.indexOf(b) and thought it was a good idea). In addition your example neatly conceals that the third check would be changed & 4 and not changed & 3. So the ux of this API does not feel Svelte at all.

@rmunn
Copy link
Contributor Author

rmunn commented Sep 28, 2021

I thought of this while working on implementing #6750, when I saw that the code tracks a pending bitmap. The advantage of a bitmap over a boolean array is indeed size, but you're right that a boolean array would be more user-friendly. And the GC issues would be nullified if the array was re-used the way the values array gets re-used. It would need to be documented — "Treat the boolean array as read-only, and if you modify it, you have only yourself to blame if your store acts funny" — but it would work just fine to pass [false, false, true] instead of a bitmap equal to 4.

The only downside to a boolean array, then, is a performance cost that I think would be slight, but would be worth measuring. Every derived store would create an extra array of N booleans, where N is the number of stores the derived store is subscribing to. Would that extra array per store, and the work done to keep it up-to-date, be worth the performance cost? The performance cost of one extra int per store would be minimal, but I don't know the performance cost of an extra array. I suspect the better DX will make it worth it, though; I'll think about it.

In addition your example neatly conceals that the third check would be changed & 4 and not changed & 3.

I actually had a third store in my original example and then realized that the example was getting bigger than it needed to be. Missed the fact that it would hide the bitmap-ness of the check needed.

@Prinzhorn
Copy link
Contributor

Please take anything I say with a grain of salt, I've never needed a derived store and it's my first time looking at the store source.

when I saw that the code tracks a pending bitmap

I figured that and on first sight it does seem like a viable idea. I don't know why the original code went with a bitmap, I assume nobody tried to use a derived store with more than 32 stores yet? Or maybe people do and don't know they run into undefined behavior? Or am I missing something and it doesn't break with 33 stores? And it's entirely plausible that there are situations (like in a game or simulation) where you automatically create derived stores of dynamic length > 32.

"you have only yourself to blame if your store acts funny"

Does it act funny? I thought the store would use this array write-only? Reset it using changed.fill(false) and then update it changed[i] = true. So if the user touches it, all that happens is that they might get their touched values back in the next call. Can this even happen, isn't the entire derived store code synchronous, so between fill() and [i] = true no user gets the chance to touch the thing? Also that's already the case right now with the values array, it could be modified by a user, right?

The advantage of a bitmap over a boolean array is indeed size
The only downside to a boolean array, then, is a performance cost that I think would be slight, but would be worth measuring

I'd be surprised if the difference can be measured, but I'm curious if you really want to set up a benchmark. But looking at the current implementation there is also a values array even if single === true, so I don't think these micro optimizations matter.

Every derived store would create an extra array of N booleans, where N is the number of stores the derived store is subscribing to

Only track it if fn.length === 3?

const auto = fn.length < 2;

@rmunn
Copy link
Contributor Author

rmunn commented Sep 28, 2021

The bit that would break with more than 32 stores is the pending bitmap, which is only used to ensure that the "diamond dependency" problem doesn't cause undesired updates. It's hard to explain without an example, and the best example is probably the unit test in test/store/index.ts:

svelte/test/store/index.ts

Lines 234 to 254 in e45d180

it('prevents glitches', () => {
const lastname = writable('Jekyll');
const firstname = derived(lastname, n => n === 'Jekyll' ? 'Henry' : 'Edward');
const fullname = derived([firstname, lastname], names => names.join(' '));
const values = [];
const unsubscribe = fullname.subscribe(value => {
values.push(value);
});
lastname.set('Hyde');
assert.deepEqual(values, [
'Henry Jekyll',
'Edward Hyde'
]);
unsubscribe();
});

See how the lastname store updating causes a single update in the fullname store, where both firstname and lastname are updated at once? That's the purpose of the pending bitmap. Try removing the code that sets bits in the pending bitmap and you'll see that that test will fail. That's probably the best way to understand what that bitmap is doing.

So if derived store C were deriving from 33 or more stores, including store A in the 33rd or later position, and it also happens to be deriving from store B in the 33rd or later position, and store B derives from store A, then there would be a glitch where store A updating would cause store C to see store A's new value with store B's old value once, then a second update with store A's new value with store B's new value.

Was that paragraph as clear as mud? :-) Then you'll see why nobody has run into this issue yet, because it's something that could almost never happen. Though perhaps it would be best to document this situation and say "There's a practical limit of 32 stores" or something.

@Prinzhorn
Copy link
Contributor

@rmunn thanks for the explanation! I also dug into the store code a bit. I was a little confused about the second argument to subscribe, which is not documented anywhere. But now the whole thing makes sense with invalidating the store to make derived stores work.

Also TIL the iteration order of Set is stable and matches insertion order.

@rmunn rmunn changed the title Provide "what changed" bitmap to derived stores Provide "what changed" array (of Booleans) to derived stores Sep 29, 2021
@rmunn
Copy link
Contributor Author

rmunn commented Sep 29, 2021

Just realized that with a Boolean array, you can use array dereferencing just as you can with store values. So instead of writing if (changed[0]) you can write if (priceChanged). That's way better DX, and is a big enough win over bitmaps that my hesitation over Boolean arrays is just gone. I've updated the feature request to ask for Boolean arrays rather than bitmaps.

@WHenderson
Copy link

The bit that would break with more than 32 stores is the pending bitmap, which is only used to ensure that the "diamond dependency" problem doesn't cause undesired updates. It's hard to explain without an example, and the best example is probably the unit test in test/store/index.ts:

svelte/test/store/index.ts

Lines 234 to 254 in e45d180

it('prevents glitches', () => {
const lastname = writable('Jekyll');
const firstname = derived(lastname, n => n === 'Jekyll' ? 'Henry' : 'Edward');
const fullname = derived([firstname, lastname], names => names.join(' '));
const values = [];
const unsubscribe = fullname.subscribe(value => {
values.push(value);
});
lastname.set('Hyde');
assert.deepEqual(values, [
'Henry Jekyll',
'Edward Hyde'
]);
unsubscribe();
});

FYI, here is a diagram of a diamond dependency:

  • As soon as a changes, b and c are recalculated.
  • As soon as b or c changes, d is recalculated.
graph TD
    a --> b --> d
    a --> c --> d

Tracking "pending" and the undocumented second argument to subscribe is for ensuring d doesn't get calculated twice here.

Worth noting that it doesn't handle what I call the "diamond+ dependency" problem:

graph TD
    a       --> d
    a --> b --> d
    a --> c --> d

in this case, when a changes, svelte stores will recalculate d more than necessary. I should probably raise a bug about this...

Pulled from the documentation of one of my store libraries. See: @crikey/stores-base/README.md for more dependency examples

@WHenderson
Copy link

I can see utility with this proposal.

I would suggest some discussion around the callback signature though:

With the proposed signature (values, set, update, changed), users are (almost1) forced to declare their callbacks as asynchronous. What if a user want's a simple synchronous derived value (one that returns the calculated result directly) but also wants to interrogate the changed argument?

Furthermore, I do wonder if this belongs in the basic implementation of derived. It's use cases are niche enough that it may be better suited to a separate library?

1 a user could access the provided argument via the arguments array. but.. yuck

@rmunn rmunn linked a pull request May 29, 2023 that will close this issue
5 tasks
@rmunn
Copy link
Contributor Author

rmunn commented May 29, 2023

PR #6786 implementing this change has been replaced with PR #8656.

@rmunn
Copy link
Contributor Author

rmunn commented May 29, 2023

With the proposed signature (values, set, update, changed), users are (almost1) forced to declare their callbacks as asynchronous. What if a user want's a simple synchronous derived value (one that returns the calculated result directly) but also wants to interrogate the changed argument?

I agree that discussion about the callback signature is useful. However, there's nothing wrong with not using a parameter, and calling it _ as a standard convention to signal that this parameter is unused. In Javascript you can't have two parameters named _, but you could write something like this:

var d = derived([parent1, parent2], (values, _s, _u, changed) => "some derived value");

This signals that you're not using the set or update parameters, while remaining quite readable. It's not quite as good as being able to write (values, _, _, changed) the way you can in some languages, but (values, _s, _u, changed) is still quite readable and signals the coder's intent nicely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants