How Package Metadata Retrieval (Caching) Works

Evaluating activity (both contributor and community) and generating recommended alternatives are relatively expensive operations for Trusty. Additionally, while there are millions of packages, only a relatively small subset accounts for the majority of developer dependencies. With that in mind, we implemented a caching strategy to reduce the number of calls to GitHub and to avoid repeating the costly generative AI work of modelling alternatives.

If a package is cached (i.e. we have pre-computed the rankings and generated package alternatives), we return and display the result of the query immediately. If, however, the package has not yet been analyzed, we enqueue it for processing. Once the results are available, they are displayed on the next refresh of the page. Retrieving an uncached package may take a few seconds or a few minutes, depending on the web traffic we are experiencing.
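The hit-or-enqueue flow described above can be sketched roughly as follows. This is a minimal in-memory illustration, not Trusty's actual implementation: the names `PackageReport`, `MetadataCache`, `lookup`, and `process_next` are all hypothetical, and a real service would presumably use a persistent store and a separate worker process.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PackageReport:
    """Hypothetical shape of a pre-computed result."""
    name: str
    score: float
    alternatives: list[str]


class MetadataCache:
    """Illustrative cache: return hits at once, enqueue misses for analysis."""

    def __init__(self) -> None:
        self._reports: dict[str, PackageReport] = {}
        self._pending: deque[str] = deque()
        self._queued: set[str] = set()

    def lookup(self, name: str) -> Optional[PackageReport]:
        report = self._reports.get(name)
        if report is not None:
            return report  # cache hit: result is available immediately
        if name not in self._queued:
            # Cache miss: enqueue once, even if the page is refreshed repeatedly.
            self._queued.add(name)
            self._pending.append(name)
        return None  # caller shows a "processing" state until a later refresh

    def process_next(self, analyze: Callable[[str], PackageReport]) -> Optional[str]:
        """Worker step: run the expensive analysis for the oldest queued package."""
        if not self._pending:
            return None
        name = self._pending.popleft()
        self._reports[name] = analyze(name)
        self._queued.discard(name)
        return name
```

In this sketch the first `lookup` for an unknown package returns `None` and queues it. Once a worker has called `process_next`, the next `lookup` (the next page refresh) returns the stored report.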

We pre-populate the cache with the most-used packages to improve the user experience. A cache miss may therefore itself indicate that a package is not widely known or used.

We invalidate the cache every few days so that the information presented stays current.
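One common way to implement this kind of periodic invalidation is a per-entry time-to-live (TTL). The sketch below assumes that approach and a three-day TTL. Both are illustrative assumptions, since the text above only says "every few days" and does not say whether entries expire individually or the whole cache is flushed. `ExpiringCache` and `TTL_SECONDS` are hypothetical names.

```python
import time
from typing import Any, Callable, Optional

# Assumed value for "every few days"; the real interval is not specified.
TTL_SECONDS = 3 * 24 * 60 * 60


class ExpiringCache:
    """Illustrative cache whose entries expire after a fixed TTL."""

    def __init__(
        self,
        ttl_seconds: float = TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: dict[str, tuple[float, Any]] = {}

    def put(self, name: str, value: Any) -> None:
        # Record when the entry was stored alongside its value.
        self._entries[name] = (self._clock(), value)

    def get(self, name: str) -> Optional[Any]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Stale entry: drop it so the package is treated as a cache miss
            # and gets re-analyzed.
            del self._entries[name]
            return None
        return value
```

With this design an expired entry behaves exactly like a package that was never cached, so it would flow through the same enqueue-and-refresh path as any other miss.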