feat: pregel memory optimizations by SemyonSinchenko · Pull Request #847 · graphframes/graphframes · GitHub

feat: pregel memory optimizations#847

Merged
SemyonSinchenko merged 5 commits into
graphframes:mainfrom
SemyonSinchenko:835-required-edge-columns
Jun 11, 2026

@SemyonSinchenko SemyonSinchenko commented Jun 9, 2026

Collaborator

What changes were proposed in this pull request?

Pregel no longer persists edge columns by default.

Why are the changes needed?

About 2x lower peak memory usage, plus a small improvement on local benchmarks:

This PR

[info] Result "org.graphframes.benchmarks.LabelPropagationBenchmark.benchmarkLabelPropagation":
[info]   40.889 ±(99.9%) 14.652 s/op [Average]
[info]   (min, avg, max) = (39.969, 40.889, 41.450), stdev = 0.803
[info]   CI (99.9%): [26.237, 55.540] (assumes normal distribution)

Main

[info] Result "org.graphframes.benchmarks.LabelPropagationBenchmark.benchmarkLabelPropagation":
[info]   41.504 ±(99.9%) 12.715 s/op [Average]
[info]   (min, avg, max) = (40.712, 41.504, 42.025), stdev = 0.697
[info]   CI (99.9%): [28.789, 54.219] (assumes normal distribution)
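To put the two timings side by side, here is a minimal sketch that rebuilds each 99.9% interval as mean ± error and compares them. The numbers are copied from the JMH output above; the helper names (`interval`, `overlaps`) are illustrative and not part of GraphFrames or JMH. Because the inputs are already rounded, the last digit of a bound may differ slightly from what JMH printed.

```python
# Mean and 99.9% error half-width in s/op, copied from the JMH output above.
pr_mean, pr_err = 40.889, 14.652      # this PR
main_mean, main_err = 41.504, 12.715  # main branch


def interval(mean, err):
    """Return the (low, high) bounds of mean ± err."""
    return (mean - err, mean + err)


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


pr_ci = interval(pr_mean, pr_err)
main_ci = interval(main_mean, main_err)

# Relative change in average time per operation, PR versus main.
speedup_pct = (main_mean - pr_mean) / main_mean * 100

print(f"PR   CI: [{pr_ci[0]:.3f}, {pr_ci[1]:.3f}]")
print(f"Main CI: [{main_ci[0]:.3f}, {main_ci[1]:.3f}]")
print(f"Intervals overlap: {overlaps(pr_ci, main_ci)}")
print(f"Average time change: {speedup_pct:.1f}% faster on the PR branch")
```

The intervals overlap and the averages differ by roughly 1.5%, so on this benchmark the runtime gain is within measurement noise. The memory reduction is the main effect of the change.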

Close #835

P.S. I do not see this as a breaking change: it was never documented that edges are persisted by default. I see it as a small change that benefits 99% of users; only the ~1% who used edge columns in Pregel will need a small update.

@SemyonSinchenko SemyonSinchenko self-assigned this Jun 9, 2026
@SemyonSinchenko SemyonSinchenko merged commit 10ecd74 into graphframes:main Jun 11, 2026
8 checks passed
@SemyonSinchenko SemyonSinchenko deleted the 835-required-edge-columns branch June 11, 2026 21:21

Development

Successfully merging this pull request may close these issues.

feat: requiredEdgeColumns API in Pregel

2 participants