Caching
marimo comes with utilities to cache intermediate computations. These utilities come in two types: caching the return values of expensive functions in memory, and caching the values of variables to disk.
Caching expensive functions
Use mo.cache to cache the return values of functions in memory, based on the function arguments, closed-over values, and the notebook code defining the function.
The resulting cache is similar to functools.cache, with two added benefits: mo.cache won't return stale values (because it keys on closed-over values), and it isn't invalidated when the cell defining the decorated function is simply re-run (because it keys on notebook code). This means that, like marimo notebooks themselves, mo.cache has no hidden state associated with the cached function, which makes you more productive while developing iteratively.
For a cache with bounded size, use mo.lru_cache.
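To make the stale-value problem concrete, here is a small sketch that uses only functools.cache from the standard library. The names config and scaled are hypothetical. It reproduces the behavior that, per the description above, mo.cache avoids by also keying on closed-over values.

```python
from functools import cache

# A value the function depends on but does not take as an argument.
config = {"scale": 2}


@cache
def scaled(x):
    # functools.cache keys only on x; it never looks at config.
    return x * config["scale"]


first = scaled(10)    # computed with scale == 2
config["scale"] = 3   # change the closed-over value
second = scaled(10)   # served from the cache, so the old scale is still used
```

Here second equals first even though config changed; the only way to get a fresh result is to call scaled.cache_clear() by hand.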
marimo.cache
Cache the value of a function based on args and closed-over variables.
Decorating a function with @mo.cache will cache its value based on the function's arguments, closed-over values, and the notebook code.
Usage.
mo.cache is similar to functools.cache, but with three key benefits:
1. mo.cache persists its cache even if the cell defining the cached function is re-run, as long as the code defining the function (excluding comments and formatting) has not changed.
2. mo.cache keys on closed-over values in addition to function arguments, preventing accumulation of hidden state associated with functools.cache.
3. mo.cache does not require its arguments to be hashable (only pickleable), meaning it can work with lists, sets, NumPy arrays, PyTorch tensors, and more.
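The hashability restriction mentioned in the last point is easy to see with functools.cache itself. The sketch below uses only the standard library, and the function name total is hypothetical. It shows the limitation that mo.cache is described as lifting; it does not call mo.cache.

```python
from functools import cache


@cache
def total(values):
    return sum(values)


# Tuples are hashable, so this call is cached normally.
tuple_result = total((1, 2, 3))

# Lists are not hashable, so functools.cache raises TypeError
# while building the cache key, before the function body runs.
try:
    total([1, 2, 3])
    list_error = None
except TypeError as err:
    list_error = err
```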
mo.cache obtains these benefits at the cost of slightly higher overhead than functools.cache, so it is best used for expensive functions.
Like functools.cache, mo.cache is thread-safe.
The cache has an unlimited maximum size. To limit the cache size, use @mo.lru_cache. mo.cache is slightly faster than mo.lru_cache, but in most applications the difference is negligible.
Args:
pin_modules: if True, the cache will be invalidated if module versions differ.
marimo.lru_cache
lru_cache(
    _fn: Optional[Callable[..., Any]] = None,
    *,
    maxsize: int = 128,
    pin_modules: bool = False
) -> _cache_base
Decorator for LRU caching the return value of a function.
mo.lru_cache is a version of mo.cache with a bounded cache size. As an LRU (Least Recently Used) cache, it retains only the maxsize most recently used values and discards the least recently used ones. For more information, see the documentation of mo.cache.
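The eviction policy can be illustrated with functools.lru_cache from the standard library, which follows the same least-recently-used rule. This is a sketch of the policy only, not of mo.lru_cache itself; the names calls and square are hypothetical, and maxsize is set to 2 to keep the trace short.

```python
from functools import lru_cache

calls = []  # records every argument that was actually computed


@lru_cache(maxsize=2)
def square(x):
    calls.append(x)
    return x * x


square(1)  # miss: cache holds 1
square(2)  # miss: cache holds 1 and 2
square(1)  # hit: 1 becomes the most recently used entry
square(3)  # miss: cache is full, so 2 (least recently used) is evicted
square(1)  # hit: 1 survived the eviction
square(2)  # miss: 2 was evicted and must be recomputed

info = square.cache_info()
```

After these calls, calls is [1, 2, 3, 2]: the value for 2 was computed twice because it was the least recently used entry when 3 arrived.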
Usage.
Args:
maxsize: the maximum number of entries in the cache; defaults to 128. Setting to -1 disables cache limits.
pin_modules: if True, the cache will be invalidated if module versions differ.
Caching variables to disk
Use mo.persistent_cache to cache variables computed in an expensive block of code to disk. The next time this block of code is run, if marimo detects a cache hit, the code will be skipped and your variables will be loaded into memory, letting you pick up where you left off.
Cache location
By default, caches are stored in __marimo__/cache/, in the directory of the current notebook. For projects versioned with git, consider adding **/__marimo__/cache/ to your .gitignore.
marimo.persistent_cache
persistent_cache(
    name: str,
    *,
    save_path: str | None = None,
    pin_modules: bool = False,
    _loader: Optional[Loader] = None
)
Bases: object
Save variables to disk and restore them thereafter.
The mo.persistent_cache context manager lets you delimit a block of code in which variables will be cached to disk when they are first computed. On subsequent runs of the cell, if marimo determines that neither this block of code nor its ancestors have changed, it will restore the variables from disk instead of re-computing them, skipping execution of the block entirely.
Restoration happens even across notebook runs, meaning you can use mo.persistent_cache to make notebooks start instantly, with variables that would otherwise be expensive to compute already materialized in memory.
Usage.
import marimo as mo

with mo.persistent_cache(name="my_cache"):
    variable = expensive_function()  # this will be cached to disk
    print("hello, cache")  # this will be skipped on cache hits
In this example, variable will be cached the first time the block is executed, and restored on subsequent runs of the block. On a cache hit, the contents of the with block are skipped entirely. This means that side effects, such as writing to stdout and stderr, are also skipped on cache hits.
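The difference between a cache miss and a cache hit can be sketched with a hand-rolled "load or compute" helper built on pickle. This is not how marimo implements mo.persistent_cache, and it ignores the code and ancestor checks described above; the names load_or_compute, expensive, and side_effects are hypothetical and exist only to show that a hit skips the computation and its side effects.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(cache_file, compute):
    """Return (value, was_hit); call compute() only if no cached file exists."""
    if cache_file.exists():
        # Cache hit: restore from disk and skip the computation entirely.
        return pickle.loads(cache_file.read_bytes()), True
    value = compute()
    cache_file.write_bytes(pickle.dumps(value))
    return value, False


side_effects = []  # stands in for prints or other side effects


def expensive():
    side_effects.append("ran")
    return sum(range(1000))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_cache.pickle"
    first, first_hit = load_or_compute(path, expensive)    # miss: runs expensive()
    second, second_hit = load_or_compute(path, expensive)  # hit: loads from disk
```

The second call returns the same value, but side_effects still contains a single entry, mirroring how the print in the example above is skipped on a cache hit.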
For function-level memoization, use @mo.cache or @mo.lru_cache.
Note that changes to mo.state and UIElement values also invalidate the cache, and the cached variables are updated accordingly.
Warning. Because this context manager manipulates the frame trace through sys.settrace, it may conflict with debugging tools or libraries that also use sys.settrace.
Args:
name: the name of the cache, used to set the save path; to manually invalidate the cache, change the name.
save_path: the folder in which to save the cache; defaults to "__marimo__/cache" in the directory of the notebook file.
pin_modules: if True, the cache will be invalidated if module versions differ between runs; defaults to False.