fix: use lru-cache for packuments by wraithgar · Pull Request #7463 · npm/cli · GitHub
Skip to content

fix: use lru-cache for packuments#7463

Merged
wraithgar merged 2 commits into
latestfrom
gar/oom
May 7, 2024
Merged

fix: use lru-cache for packuments#7463
wraithgar merged 2 commits into
latestfrom
gar/oom

Conversation

@wraithgar

@wraithgar wraithgar commented May 2, 2024

Copy link
Copy Markdown
Contributor

Fixes #7276

Comment thread workspaces/arborist/lib/arborist/index.js Outdated
Comment thread workspaces/arborist/lib/arborist/index.js Outdated
@wraithgar

Copy link
Copy Markdown
Contributor Author

Comment thread workspaces/arborist/lib/arborist/index.js Outdated
lukekarrys added a commit that referenced this pull request May 3, 2024
Somewhat related to #7276 and #7463. I don't think there is a reason to cache the promise here. And if we ever did choose to replace this with an LRUCache we would need to know the size of what we are caching which will be easier if we only cache the resulting manifest.

I also added a comment about why I think we are removing the license from manifests here. License was removed in #7126 which looks to be a purposeful change but I could not find a reason. Adding the license back in causes many snapshots to fail because the license is now present in lockfiles, so that's how I came up with the comment.
@wraithgar

Copy link
Copy Markdown
Contributor Author

_contentLength is 0 when revalidating the cache with new content. This needs to be fixed before we can trust _contentLength.

@wraithgar

Copy link
Copy Markdown
Contributor Author

Packument size to memory ratio (units are 1000 not 1024):

44.19% | [mean contentLength < 10k]
46.89% | [median contentLength < 10k]
61.59% | [mean contentLength < 1M]
60.48% | [median contentLength < 1M]
74.53% | [mean contentLength > 1M]
80.35% | [median contentLength > 1M]

The current tradeoff here is that we are swapping memory for disk reads (the cache reads from disk). On modern machines with large amounts of memory this is likely not a very big tradeoff, most high-memory high-end systems use ssd disks. This is more impactful of low memory systems, where running out of memory means the process dies.

I suggest an initial limit of 1M for the maximum packument we'll put in the cache, and then we use a multiplier of .6 for all packuments when reporting their size to lru-cache. This will over-report smaller packuments in favor of not accidentally under reporting medium sized ones an accidentally going further over the desired memory usage.

I don't have a good way to calculate what percentage of the heap_size_limit we should allow this cache to use. Again this is a memory/disk read tradeoff. Perhaps conservatively limiting the lru cache is best here.

lukekarrys added a commit that referenced this pull request May 6, 2024
Somewhat related to #7276 and #7463. I don't think there is a reason to cache the promise here. And if we ever did choose to replace this with an LRUCache we would need to know the size of what we are caching which will be easier if we only cache the resulting manifest.

I also added a comment about why I think we are removing the license from manifests here. License was removed in #7126 which looks to be a purposeful change but I could not find a reason. Adding the license back in causes many snapshots to fail because the license is now present in lockfiles, so that's how I came up with the comment.
Comment thread workspaces/arborist/lib/arborist/index.js Outdated
@lukekarrys lukekarrys changed the title WIP oom testing fix: use lru-cache for packuments May 6, 2024
@lukekarrys lukekarrys marked this pull request as ready for review May 6, 2024 22:28
@lukekarrys lukekarrys requested a review from a team as a code owner May 6, 2024 22:28
@lukekarrys

Copy link
Copy Markdown
Contributor

This is ready for tweaking based on your findings @wraithgar. I was able to get all the tests passing with the new packument cache in place. The only changes were to logging and a few tests with the mock-registry needed times: N to be set since packuments aren't cached as aggressively as before.

Comment thread workspaces/arborist/lib/arborist/index.js
Comment thread workspaces/arborist/lib/packument-cache.js Outdated
Comment thread workspaces/arborist/lib/packument-cache.js Outdated
Comment thread workspaces/arborist/lib/packument-cache.js Outdated
Comment thread workspaces/arborist/lib/packument-cache.js Outdated
@lukekarrys

Copy link
Copy Markdown
Contributor

@wraithgar Approved and ready to land I think. Can you squash with an appropriate body message that closes #7276?

@wraithgar

Copy link
Copy Markdown
Contributor Author

npm/pacote#369 is still an issue but this doesn't make that issue any worse so it shouldn't block this PR

This adds a new packument cache that is an instance of `lru-cache`.
It uses that package's ability to limit content based on size, and has
some multipliers based on research to mostly correctly approximate the
correlation between packument size and its memory usage.  It also limits
the total size of the cache based on the actual heap available.

Closes: #7276
Related: npm/pacote#369
@wraithgar

Copy link
Copy Markdown
Contributor Author

Here is the research used to calculate the multipliers.

packument size profiling.xlsx

@wraithgar wraithgar merged commit 722c0fa into latest May 7, 2024
@wraithgar wraithgar deleted the gar/oom branch May 7, 2024 16:33
@github-actions github-actions Bot mentioned this pull request May 7, 2024
@melroy89

melroy89 commented May 7, 2024

Copy link
Copy Markdown

The current tradeoff here is that we are swapping memory for disk reads (the cache reads from disk)

We could of course also try to increase the default node memory limit? Or is that a bad idea.

@wraithgar

Copy link
Copy Markdown
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] npm 10.4.0+ running out of memory with no package-lock.json

3 participants