
feature: allow persistent_cache to be used as a decorator #3550

Merged 4 commits into main on Jan 24, 2025

Conversation

@dmadisetti (Collaborator) commented Jan 23, 2025

📝 Summary

Fixes #2653, #3471

🔍 Description of Changes

Enables `persistent_cache` to be used as a decorator on functions, and `cache` to be used as a context block, e.g.

```python
@mo.persistent_cache
def expensive_function_written_to_disk():
    ...

# or

with mo.cache("expensive_block_in_memory") as c:
    ...
```

`cache` is also used as the general entry point for custom "Loaders".


The breadth of the API makes the implementation a bit hairy, but I think that if it's a smooth experience for the user then it's worth it.

@akshayka


@dmadisetti linked an issue Jan 23, 2025 that may be closed by this pull request
@akshayka (Contributor) left a comment


Clean implementation! Couple questions on signature changes.

Comment on lines +596 to +600
```python
with persistent_cache(name="my_cache"):
    variable = expensive_function()  # This will be cached to disk.
    print("hello, cache")  # This will be skipped on cache hits.
```
Contributor
Can you add a decorator example as well?

Collaborator Author

Let me know what you think of the overloading? Could walk it back a little too

Collaborator Author

(that being said, I added examples in some of the overload docs)


```python
def cache(
    name: Union[str, Optional[Callable[..., Any]]] = None,
    *args: Any,
```
Contributor

Why change the signature to use varargs/varkwargs?

Contributor

IIUC this can dispatch to either _cache_call or _cache_context, but they take different args/kwargs, so variability is needed. Rather than using Any, though, I think this could have more specific type annotations using Unpack and a union of those two signatures' args/kwargs?
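(For reference, a minimal sketch of the `Unpack`-style annotation being suggested. The TypedDict keys below are illustrative placeholders, not marimo's actual parameters; PEP 692-style `**kwargs: Unpack[...]` needs Python 3.11+ or `typing_extensions`, and type-checker support for it is relatively recent.)

```python
from typing import TypedDict, Unpack  # typing_extensions on older Pythons


class ContextCacheKwargs(TypedDict, total=False):
    save_path: str      # hypothetical option, for illustration only
    pin_modules: bool   # hypothetical option, for illustration only


def cache_block(name: str, **kwargs: Unpack[ContextCacheKwargs]) -> None:
    # With Unpack, a type checker rejects keyword arguments not declared
    # in the TypedDict above; runtime behavior is unchanged.
    print(name, kwargs)


cache_block("expensive_block", save_path="/tmp/cache")
```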

Collaborator Author

Any is pretty consistent with other uses of * packing in the codebase, but I did leave a comment, since * is more to keep values ambiguous until they're fed into the correct constructors

Contributor

Yea, to be clear, wasn't saying that needed to happen, just that more specific typing was an option if desired -- while a nice addition, unpack is pretty recent and not super widely used (in marimo and beyond, ime)

Thanks for your work on this! I'm trying this branch out on a couple long computations rn 🚀

Comment on lines 577 to 578
```python
    *args: Any,
    **kwargs: Any,
```
Contributor

Same question, why varargs/kwargs?

Collaborator Author

In addition to matching the signature throughout, extra kwargs are now used by Loaders to enable:

```python
with MyLoader.cache("my_cache", constructor_values=123) as c:
    ...
```

or

```python
@MyLoader.cache
def my_fn(args):
    ...
```

A bit of feature creep, but even when testing these changes, I noticed how cumbersome

```python
with cache(_loader=MyLoader.partial(constructor_values=123)) as c:
    ...
```

was, and it seemed like low-hanging fruit.

Comment on lines 369 to 378
```python
@singledispatch
def _cache_invocation(
    arg: Any,
    loader: Union[LoaderPartial, Loader, LoaderType],
    *args: Any,
    frame_offset: int = 1,
    **kwargs: Any,
) -> Union[_cache_call, _cache_context]:
    del loader, args, kwargs, frame_offset
    raise TypeError(f"Invalid type for cache: {type(arg)}")
```
Contributor

Pretty clean!

Collaborator Author

Cheers!

@dmadisetti (Collaborator Author)

Tried to throw in some overloading to clean up the type signatures.


From a comment I left:

> Single dispatch cues off only the first argument, and expects a similar signature for every overload: https://peps.python.org/pep-0443/
> However, the context and call APIs are slightly different, so `*` expansions are used to propagate that information down to the actual implementation.
> As such, we also leverage the `@overload` decorator to provide the correct signature and documentation for the singledispatch entry points.
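For illustration, a small self-contained sketch of that pattern: dispatch on the first argument's type to either a decorator wrapper or a context manager, with the remaining `*args`/`**kwargs` left opaque, and `@overload` restoring precise signatures for callers. This is not marimo's implementation; all names are placeholders and the bodies just print instead of caching.

```python
import types
from contextlib import contextmanager
from functools import singledispatch, wraps
from typing import Any, Callable, Iterator, Optional, Union, overload


@singledispatch
def _dispatch(arg: Any, *args: Any, **kwargs: Any) -> Any:
    # Fallback: first argument is neither a block name nor a function.
    raise TypeError(f"Invalid type for cache: {type(arg)}")


@_dispatch.register(str)
def _(name: str, *args: Any, **kwargs: Any) -> Any:
    # `with cache("name"): ...` -> hand back a context manager for the block.
    @contextmanager
    def block() -> Iterator[None]:
        print(f"entering cached block {name!r}")
        yield

    return block()


@_dispatch.register(types.FunctionType)
def _(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Callable[..., Any]:
    # `@cache` on a function -> hand back a wrapped callable.
    @wraps(fn)
    def wrapper(*fargs: Any, **fkwargs: Any) -> Any:
        print(f"caching call to {fn.__name__}")
        return fn(*fargs, **fkwargs)

    return wrapper


# @overload stubs give callers and tooling the precise signatures.
@overload
def cache(name: Callable[..., Any]) -> Callable[..., Any]: ...
@overload
def cache(name: Optional[str] = None, *args: Any, **kwargs: Any) -> Any: ...


def cache(
    name: Union[str, Optional[Callable[..., Any]]] = None,
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Cache a function (decorator form) or a block (context-manager form)."""
    return _dispatch(name, *args, **kwargs)


@cache
def expensive(x: int) -> int:
    return x * x


with cache("expensive_block"):
    pass

print(expensive(3))  # prints "caching call to expensive", then 9
```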


That being said, overloading seems to do interesting things in Jedi. Hovering over `persistent_cache` shows the implementation docstring, but opening it as a function shows the first overload (screenshots omitted).
Also unsure what will happen when the docs try to deploy this.

The updated docs are a little half-baked (but acceptable) until we decide how we want to handle this, or whether the signature hacks are going a bit too far.

@dmadisetti (Collaborator Author)

Ok, so only the implementation docstrings are rendered (screenshot omitted).

Which means it's probably best to have the big docstrings on the implementation, and maybe more minor ones on the @overloads. Or is this a Jedi bug?

@akshayka (Contributor)

> Which means it's probably best to have the big docstrings on the implementation, and maybe more minor ones on the @overloads. Or is this a Jedi bug?

Hmm, the docs aren't rendered using Jedi but using mkdocs.

For completions, I wouldn't worry too much about how Jedi renders things. We are exploring moving to an LSP anyway.

@dmadisetti (Collaborator Author)

I was referring to showing both docstrings in the documentation panel. But true, the LSP is likely to change it. I'll remove the little stubs I wrote.

@akshayka (Contributor)

> Which means it's probably best to have the big docstrings on the implementation

But yes, that makes sense to me.
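For illustration, a standalone sketch of that arrangement (not marimo's code): one-line docstrings on the `@overload` stubs, the full docstring on the implementation, which is the binding that survives at runtime and the one the docs build renders.

```python
from typing import Any, Callable, Union, overload


@overload
def persistent_cache(name: str) -> Any:
    """Context-manager form."""
    ...


@overload
def persistent_cache(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Decorator form."""
    ...


def persistent_cache(arg: Union[str, Callable[..., Any]]) -> Any:
    """Cache a block (``with persistent_cache("name"):``) or a function
    (``@persistent_cache``) to disk.

    This full docstring lives on the implementation, which is the module-level
    binding that survives at runtime.
    """
    ...


# Tooling that reads __doc__ (and, per the observation above, the docs build)
# sees the implementation's docstring, not the stubs'.
print(persistent_cache.__doc__)
```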

@akshayka (Contributor) left a comment

I'm happy with the overloading approach! LGTM modulo moving main docstrings to the implementation, if that's what you think is best.

@akshayka merged commit f253192 into main on Jan 24, 2025
34 checks passed
@akshayka deleted the dm/cache-decorator-api branch on January 24, 2025, 02:28

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.10.17-dev12

Development

Successfully merging this pull request may close these issues.

persisted memoization (ie saving @mo.cache values)
3 participants