feat: louvain community detection for module boundary analysis by carlos-alm · Pull Request #133 · optave/ops-codegraph-tool · GitHub

feat: louvain community detection for module boundary analysis#133

Merged: carlos-alm merged 1 commit into main from feat/community-detection on Feb 26, 2026

@carlos-alm

Contributor

Summary

  • Add codegraph communities command that runs Louvain clustering on the dependency graph, compares discovered communities against directory structure, and surfaces architectural drift (split/merge candidates, drift score)
  • Supports file-level (default) and function-level (--functions) modes with configurable resolution parameter
  • Integrated into codegraph stats, MCP server (communities tool), and programmatic API exports

Details

New file: src/communities.js (~200 lines) — core module with communitiesData(), communitySummaryForStats(), and communities() CLI display function. Uses graphology + graphology-communities-louvain (pure JS, ~50KB).

CLI options:

  • --functions — function-level instead of file-level
  • --resolution <n> — Louvain resolution (default 1.0, higher = more communities)
  • --drift — show only drift analysis
  • -T, -j, -d — standard flags

Drift analysis:

  • Split candidates: directories with members in 2+ communities
  • Merge candidates: communities spanning 2+ directories
  • Drift score: 0-100 composite metric
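
A minimal sketch of how the split and merge candidates listed above could be derived from a community assignment. The function name `analyzeDrift`, the input shape (a `Map` from file path to community id), and the drift-score formula are all assumptions made for illustration; the PR does not state how the 0-100 composite is actually computed.

```javascript
// Hypothetical helper: not the actual code in src/communities.js.
// `assignments` maps a file path to the community id assigned by clustering.
function analyzeDrift(assignments) {
  const dirToCommunities = new Map(); // directory -> Set of community ids
  const communityToDirs = new Map(); // community id -> Set of directories

  for (const [file, community] of assignments) {
    const slash = file.lastIndexOf('/');
    const dir = slash === -1 ? '.' : file.slice(0, slash);

    if (!dirToCommunities.has(dir)) dirToCommunities.set(dir, new Set());
    dirToCommunities.get(dir).add(community);

    if (!communityToDirs.has(community)) communityToDirs.set(community, new Set());
    communityToDirs.get(community).add(dir);
  }

  // Split candidates: directories whose files landed in 2+ communities.
  const splitCandidates = [...dirToCommunities]
    .filter(([, communities]) => communities.size >= 2)
    .map(([dir]) => dir);

  // Merge candidates: communities whose files span 2+ directories.
  const mergeCandidates = [...communityToDirs]
    .filter(([, dirs]) => dirs.size >= 2)
    .map(([community]) => community);

  // ASSUMED scoring: average the share of split directories and the share
  // of merged communities, then scale to 0-100.
  const splitRatio = dirToCommunities.size
    ? splitCandidates.length / dirToCommunities.size
    : 0;
  const mergeRatio = communityToDirs.size
    ? mergeCandidates.length / communityToDirs.size
    : 0;
  const driftScore = Math.round(((splitRatio + mergeRatio) / 2) * 100);

  return { splitCandidates, mergeCandidates, driftScore };
}

const exampleAssignments = new Map([
  ['src/a.js', 0],
  ['src/b.js', 1],
  ['lib/c.js', 1],
]);
console.log(analyzeDrift(exampleAssignments));
```

In this toy input, `src` is a split candidate (its two files sit in communities 0 and 1), community 1 is a merge candidate (it spans `src` and `lib`), and the assumed formula yields a score of 50.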

Dependencies: graphology ^0.25.4, graphology-communities-louvain ^2.0.2 (regular deps)

Test plan

  • 13 new integration tests covering file-level, function-level, drift analysis, noTests filtering, resolution parameter, drift-only mode, stats summary, and empty graph edge case
  • Full test suite passes (710 tests)
  • Lint clean (Biome)
  • CLI verified: communities, --functions, --drift -T, --resolution 1.5 -j, stats -T

Add `codegraph communities` command that runs Louvain clustering on the
dependency graph, compares discovered communities against directory
structure, and surfaces architectural drift (split/merge candidates,
drift score). Supports file-level (default) and function-level modes,
configurable resolution, and drift-only output. Integrated into stats,
MCP, and programmatic API.

Impact: 9 functions changed, 8 affected
@claude

claude Bot commented Feb 26, 2026


@greptile-apps

greptile-apps Bot commented Feb 26, 2026

Contributor

Greptile Summary

This PR adds Louvain community detection to analyze module boundaries and architectural drift. The implementation is well-structured with comprehensive test coverage (13 tests, 710 total passing).

Key strengths:

  • Clean separation of concerns: graph construction, clustering, drift analysis, and display logic are properly isolated
  • Robust edge case handling: empty graphs, self-loops, duplicate edges, and missing dependencies all handled gracefully
  • Smart integration: async imports with try-catch in stats() allow graceful degradation if graphology is unavailable
  • Proper SQL safety: uses prepared statements, read-only operations, and validates node existence before creating edges
  • Comprehensive testing: covers both file-level and function-level modes, resolution parameters, drift-only mode, and empty graph scenarios
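
The graceful-degradation pattern mentioned under "Smart integration" could look roughly like the sketch below. `loadOptional` and `statsWithCommunities` are hypothetical names, the `summarize` callback is an illustrative stand-in for whatever the real summary function does, and the module specifier is a placeholder; this is not the actual code in stats().

```javascript
// Hypothetical sketch of an optional dynamic import with a fallback.
async function loadOptional(specifier) {
  try {
    return await import(specifier);
  } catch {
    // The module is not installed or failed to load: report "unavailable".
    return null;
  }
}

// Returns the base stats and adds a community summary only when the
// optional module could be loaded; otherwise the field is null.
async function statsWithCommunities(baseStats, specifier, summarize) {
  const mod = await loadOptional(specifier);
  return { ...baseStats, communities: mod === null ? null : summarize(mod) };
}

// A deliberately missing package exercises the fallback path.
statsWithCommunities(
  { files: 3 },
  'a-package-that-is-not-installed-xyz',
  () => 'summary',
).then((stats) => console.log(stats));
```

The point of the design is that a missing optional dependency turns into a `null` field rather than an exception, so the rest of the stats output is unaffected.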

Technical details:

  • Builds undirected graphs from SQLite (file nodes + imports OR function nodes + calls)
  • Filters self-loops and deduplicates edges correctly
  • Drift analysis identifies split candidates (directories spanning communities) and merge candidates (communities spanning directories)
  • Modularity score properly bounded 0-1, drift score normalized 0-100
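
To make the bullets above concrete, here is a dependency-free sketch that normalizes raw edges into an undirected, deduplicated set without self-loops and then computes Newman modularity for a given partition. The PR itself delegates clustering to graphology and graphology-communities-louvain; `buildUndirectedEdges` and `modularity` are hypothetical names, and unweighted edges are an assumption. In general, modularity can range from -0.5 to 1, so the 0-1 figure presumably describes what the tool reports rather than the metric's theoretical range.

```javascript
// Hypothetical helpers, independent of graphology.
function buildUndirectedEdges(nodes, rawEdges) {
  const known = new Set(nodes);
  const seen = new Set();
  const edges = [];
  for (const [source, target] of rawEdges) {
    if (source === target) continue; // drop self-loops
    if (!known.has(source) || !known.has(target)) continue; // skip unknown nodes
    // Order the endpoints so a->b and b->a map to the same key.
    const [a, b] = source < target ? [source, target] : [target, source];
    const key = JSON.stringify([a, b]);
    if (seen.has(key)) continue; // drop duplicate edges
    seen.add(key);
    edges.push([a, b]);
  }
  return edges;
}

// Newman modularity: Q = sum over communities c of L_c / m - (d_c / 2m)^2,
// where m is the edge count, L_c the number of edges inside c, and d_c the
// sum of the degrees of the nodes in c.
function modularity(edges, communityOf) {
  const m = edges.length;
  if (m === 0) return 0;
  const internal = new Map();
  const degree = new Map();
  const bump = (map, key, by) => map.set(key, (map.get(key) || 0) + by);
  for (const [a, b] of edges) {
    const ca = communityOf.get(a);
    const cb = communityOf.get(b);
    bump(degree, ca, 1);
    bump(degree, cb, 1);
    if (ca === cb) bump(internal, ca, 1);
  }
  let q = 0;
  for (const [community, d] of degree) {
    q += (internal.get(community) || 0) / m - (d / (2 * m)) ** 2;
  }
  return q;
}

// Two triangles joined by one edge, plus noise that should be filtered out.
const demoNodes = ['a', 'b', 'c', 'd', 'e', 'f'];
const demoEdges = buildUndirectedEdges(demoNodes, [
  ['a', 'b'], ['b', 'c'], ['a', 'c'],
  ['d', 'e'], ['e', 'f'], ['d', 'f'],
  ['c', 'd'],
  ['b', 'a'], // duplicate in the opposite direction
  ['a', 'a'], // self-loop
  ['a', 'ghost'], // endpoint that is not a known node
]);
const demoPartition = new Map([
  ['a', 0], ['b', 0], ['c', 0],
  ['d', 1], ['e', 1], ['f', 1],
]);
console.log(demoEdges.length, modularity(demoEdges, demoPartition));
```

Seven edges survive the filtering, and splitting the graph into its two triangles gives a modularity of 5/14 (about 0.357).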

The code follows project conventions, has no security issues, and integrates cleanly into CLI, MCP, and programmatic APIs.

Confidence Score: 5/5

  • This PR is safe to merge with no issues found
  • The implementation is thoroughly tested, handles edge cases properly, uses safe SQL practices, integrates cleanly with existing code, and follows established project patterns. No logical errors, security vulnerabilities, or breaking changes detected.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/communities.js | New core module implementing Louvain clustering with proper graph construction, drift analysis, and edge case handling |
| src/cli.js | Added communities command with proper option parsing and async handler for stats integration |
| src/mcp.js | Added communities tool to MCP server with proper input schema and async import handling |
| src/queries.js | Integrated communities summary into stats with graceful fallback for missing dependencies |
| tests/integration/communities.test.js | Comprehensive test suite with 13 tests covering file/function-level modes, drift analysis, and edge cases |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CLI[CLI: codegraph communities] --> CommData[communitiesData]
    Stats[CLI: codegraph stats] --> CommSummary[communitySummaryForStats]
    MCP[MCP Server] --> CommData
    API[Programmatic API] --> CommData
    
    CommData --> BuildGraph[buildGraphologyGraph]
    CommSummary --> CommData
    
    BuildGraph --> DB[(SQLite DB)]
    BuildGraph --> FileLevel{Mode?}
    FileLevel -->|file-level| FileNodes[nodes: files<br/>edges: imports]
    FileLevel -->|function-level| FnNodes[nodes: functions<br/>edges: calls]
    
    FileNodes --> GraphObj[Graphology Graph]
    FnNodes --> GraphObj
    
    GraphObj --> Louvain[Louvain Algorithm]
    Louvain --> Communities[Community Assignments]
    
    Communities --> DirAnalysis[Directory Analysis]
    DirAnalysis --> Split[Split Candidates]
    DirAnalysis --> Merge[Merge Candidates]
    DirAnalysis --> DriftScore[Drift Score 0-100]
    
    Communities --> Output[JSON Output]
    Split --> Output
    Merge --> Output
    DriftScore --> Output

Last reviewed commit: cc28daa

@greptile-apps greptile-apps Bot left a comment


9 files reviewed, no comments


carlos-alm merged commit f3e36ad into main on Feb 26, 2026 (18 checks passed)
carlos-alm deleted the feat/community-detection branch on February 26, 2026 at 23:35
carlos-alm pushed a commit that referenced this pull request Feb 27, 2026
Update README, CLAUDE.md, BACKLOG, titan-paradigm, recommended-practices,
and CLI/MCP examples to reflect today's merged PRs: complexity metrics
(#130/#139), Louvain community detection (#133/#134), and manifesto rule
engine (#138). Updates MCP tool count from 21 to 24 (25 in multi-repo),
marks backlog items 6/11/21/22 as done, and adds real CLI output examples.
carlos-alm added a commit that referenced this pull request Feb 27, 2026
* fix: strict type validation for threshold values in complexity queries

Replace loose `!= null` checks with `typeof === 'number' && Number.isFinite()`
to prevent `Number("")`, `Number(null)`, and `Number(true)` from silently
coercing into valid SQL values. Add integration test verifying exceeds
arrays and summary.aboveWarn are correctly computed.

Addresses Greptile review feedback on #136.

Impact: 2 functions changed, 3 affected

* docs: add complexity, communities, and manifesto to all docs

Update README, CLAUDE.md, BACKLOG, titan-paradigm, recommended-practices,
and CLI/MCP examples to reflect today's merged PRs: complexity metrics
(#130/#139), Louvain community detection (#133/#134), and manifesto rule
engine (#138). Updates MCP tool count from 21 to 24 (25 in multi-repo),
marks backlog items 6/11/21/22 as done, and adds real CLI output examples.

* fix: remove redundant condition in paginate guard clauses

When limit === undefined, limit !== 0 is always true — the && check
was dead code. Simplified to just check limit === undefined.

Impact: 2 functions changed, 18 affected

* docs: update dogfood report with fix statuses

All 4 bugs now fixed (PR #117 merged, #116 closed via reverse-dep
cascade). 3 of 4 suggestions addressed. MCP tool counts updated
18→23 / 19→24. Rating upgraded 7/10 → 9/10 post-fix.

* fix: rename misleading test to match actual behavior

Test was named "handles non-numeric thresholds gracefully" but only
validated baseline exceeds/aboveWarn with valid thresholds. Actual
non-numeric threshold tests exist separately. Renamed to "produces
correct exceeds and aboveWarn with valid thresholds".

* fix: update stale MCP tool count in dogfood skill (21→24)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
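
The coercion pitfall behind the threshold-validation fix in the commit above can be reproduced in a few lines. `isValidThreshold` is a hypothetical name used for illustration; only the `typeof === 'number' && Number.isFinite()` check itself comes from the commit message.

```javascript
// Values such as '' and true pass a loose `!= null` check, and Number()
// then silently turns them into ordinary numbers.
console.log(Number('')); // 0
console.log(Number(true)); // 1

// Hypothetical helper using the stricter check from the commit message:
// accept only real, finite numbers and reject everything else.
function isValidThreshold(value) {
  return typeof value === 'number' && Number.isFinite(value);
}

const thresholdCandidates = [10, 0, '', '5', true, null, undefined, NaN, Infinity];
for (const candidate of thresholdCandidates) {
  console.log(String(candidate), isValidThreshold(candidate));
}
```

Only `10` and `0` are accepted here; strings, booleans, `null`, `undefined`, `NaN`, and `Infinity` are all rejected before they can reach a query.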
carlos-alm added a commit that referenced this pull request Feb 27, 2026
* feat: add complexity analysis for Python, Go, Rust, Java, C#, Ruby, PHP

Parameterize the complexity algorithm to support all 10 languages instead
of just JS/TS/TSX. Add per-language COMPLEXITY_RULES, HALSTEAD_RULES, and
COMMENT_PREFIXES with three else-if detection patterns (else-wraps-if,
explicit elif, alternative field). Guard against tree-sitter keyword leaf
tokens that share node type names with their parent constructs.

Impact: 4 functions changed, 4 affected

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
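
The paginate guard-clause simplification described in the commit messages above rests on a small logical fact that can be checked directly. The exact shape of the original guard is not shown in this thread, so `guardBefore` and `guardAfter` below are assumed forms, not the project's code.

```javascript
// ASSUMED shape of the guard before and after the fix.
const guardBefore = (limit) => limit === undefined && limit !== 0;
const guardAfter = (limit) => limit === undefined;

// If limit is undefined it cannot also be 0, so the second condition never
// changes the result: both guards agree for every input.
const limitSamples = [undefined, null, 0, 1, 50, NaN, '10'];
const guardsAgree = limitSamples.every(
  (limit) => guardBefore(limit) === guardAfter(limit),
);
console.log(guardsAgree); // true
```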
Zeeeepa pushed a commit to Zeeeepa/codegraph that referenced this pull request Jun 22, 2026
…e#133)

Zeeeepa pushed a commit to Zeeeepa/codegraph that referenced this pull request Jun 22, 2026
Zeeeepa pushed a commit to Zeeeepa/codegraph that referenced this pull request Jun 22, 2026