Sunbelt Computer Software

Code Historian

AI-Powered Code History Tracking with RAG-Based Natural Language Search

Never lose track of your code changes again. Search, explore, and restore any version with natural language.

Features • Installation • Usage • Configuration • Architecture

✨ Features

🔄 Automatic Change Capture

Real-time capture of all code changes as you work
Intelligent debouncing to avoid capturing every keystroke
Configurable exclusion patterns for node_modules, build files, etc.
Tracks file creates, modifies, deletes, and renames
Session-based organization for better context
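
The debouncing mentioned above can be pictured as one timer per file that restarts on every edit, so only the last edit in a quiet window triggers a capture. The following is a minimal sketch, not the extension's actual code: `ChangeDebouncer`, `Scheduler`, and `notify` are hypothetical names, and the scheduler is injectable purely so the behaviour can be exercised without real timers.

```typescript
// Hypothetical sketch of per-file debouncing; names are illustrative only.
type TimerHandle = unknown;

interface Scheduler {
  setTimeout(fn: () => void, ms: number): TimerHandle;
  clearTimeout(handle: TimerHandle): void;
}

// Default scheduler backed by the global timer functions.
const realScheduler: Scheduler = {
  setTimeout: (fn, ms) => setTimeout(fn, ms),
  clearTimeout: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

class ChangeDebouncer {
  private pending = new Map<string, TimerHandle>();
  private debounceMs: number;
  private onCapture: (filePath: string) => void;
  private scheduler: Scheduler;

  constructor(
    debounceMs: number,
    onCapture: (filePath: string) => void,
    scheduler: Scheduler = realScheduler,
  ) {
    this.debounceMs = debounceMs;
    this.onCapture = onCapture;
    this.scheduler = scheduler;
  }

  // Call on every edit event. Each call cancels the file's pending timer
  // and starts a new one, so only the last edit in the window is captured.
  notify(filePath: string): void {
    if (this.pending.has(filePath)) {
      this.scheduler.clearTimeout(this.pending.get(filePath));
    }
    const handle = this.scheduler.setTimeout(() => {
      this.pending.delete(filePath);
      this.onCapture(filePath);
    }, this.debounceMs);
    this.pending.set(filePath, handle);
  }
}
```

With a 2000 ms window, a burst of keystrokes in one file would produce a single capture once typing pauses, while edits to other files are timed independently.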

🧠 Semantic Search with RAG

Natural language queries: "What changes did I make to the authentication logic?"
Hybrid search combining vector similarity and keyword matching (BM25)
Context-aware results with relevant code snippets
Temporal filtering: "Changes from last week"
Works with zero configuration — a neural embedding model (all-MiniLM-L6-v2) runs locally out of the box; or plug in Ollama, HuggingFace, or OpenAI
Symbol time-travel: trace the history of a single function/class/variable

💬 Chat Integration

Use @historian in VS Code Chat to explore your code history
Conversational interface powered by your choice of LLM
Ask questions like:
- "When did I last modify the User class?"
- "Find all changes related to database queries"
- "What did the login function look like before the refactor?"

⏪ Code Restoration

Restore any previous version of your code with one click
One-click Undo after a restore
Preview changes, or open them in VS Code's native diff editor (Compare with current / Open diff)
Automatic backup creation before restoration
Works seamlessly with your existing git workflow

🔎 Inline History (CodeLens)

"N changes in history" above a file and "k versions" above each function/class — click to jump straight to that file's or symbol's history
Toggle with codeHistorian.ui.showInlineHistory

📊 Visual Timeline

Beautiful, modern timeline view of all changes
Multiple view modes: Timeline, Cards, or Compact list
Stats dashboard with activity heatmap
Group by date, file, or folder
Inline diff preview with syntax highlighting
Filter by change type, date range, and more

🚀 Installation

VS Code Marketplace (Recommended)

1. Open VS Code
2. Go to Extensions (Cmd/Ctrl + Shift + X)
3. Search for "Code Historian"
4. Click Install

Or install directly from the Code Historian page on the VS Code Marketplace.

From Source

git clone https://github.com/KirtiJha/code-historian.git
cd code-historian
npm install
npm run build

Then press F5 in VS Code to launch the extension in development mode.

⚙️ Configuration

Open VS Code Settings (Cmd/Ctrl + ,) and search for "Code Historian".

Embedding Provider

Code Historian supports multiple embedding providers for semantic search:

Provider	Model	Local	Cost	Dimensions	Setup
Built-in (default)	all-MiniLM-L6-v2 (neural)	✅	Free	384	None — runs locally
Ollama	nomic-embed-text	✅	Free	768	Install Ollama
HuggingFace	BAAI/bge-large-en-v1.5	❌	Free tier	1024	API token
OpenAI	text-embedding-3-small	❌	Paid	1536	API key

The built-in provider needs no configuration and runs a small neural sentence-transformer (all-MiniLM-L6-v2) locally via Transformers.js — the model (~23 MB) is downloaded once and cached for offline use. If the local ONNX runtime can't load, it automatically falls back to a lightweight hashing embedding so search always works. Switch to Ollama / HuggingFace / OpenAI for other models:

{
  "codeHistorian.embedding.provider": "ollama",
  "codeHistorian.embedding.model": "nomic-embed-text"
}
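
To make the idea of a "lightweight hashing embedding" concrete, here is a minimal sketch of feature hashing: each token is hashed into one of a fixed number of buckets, and the resulting vector is normalized so that a dot product gives cosine similarity. This is an illustration under assumptions, not the extension's fallback implementation; the function names, the tokenizer, the FNV-1a hash, and the reuse of 384 dimensions are all choices made for the example.

```typescript
// Hypothetical feature-hashing embedding; not the extension's actual fallback.
const DIMENSIONS = 384; // assumed: same size as the built-in model's vectors

// 32-bit FNV-1a hash over UTF-16 code units.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function hashingEmbedding(text: string, dims: number = DIMENSIONS): number[] {
  const vector = new Array<number>(dims).fill(0);
  const tokens = text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
  for (const token of tokens) {
    const h = fnv1a(token);
    const index = h % dims;               // bucket for this token
    const sign = h >>> 31 === 0 ? 1 : -1; // top bit decides the sign
    vector[index] += sign;
  }
  // L2-normalize so the dot product of two embeddings is their cosine similarity.
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

// Assumes both inputs are unit-length vectors of equal size.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}
```

A vector like this only captures token overlap, not meaning, which is why it is plausible as a fallback rather than a replacement for the neural model.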

🔒 API keys are stored in VS Code's encrypted Secret Storage, not in settings.json.

LLM Provider

For the chat interface, configure your preferred LLM:

Provider	Models	Local	Setup
Ollama	llama3.2, mistral, codellama	✅	Free, local
OpenAI	gpt-4o, gpt-4-turbo, gpt-3.5-turbo	❌	API key required
Anthropic	claude-sonnet-4-20250514, claude-3-haiku	❌	API key required
Google Gemini	gemini-pro, gemini-1.5-flash	❌	API key required

{
  "codeHistorian.llm.provider": "openai",
  "codeHistorian.llm.model": "gpt-4o"
}

API keys are configured per provider (e.g. codeHistorian.llm.openaiApiKey, codeHistorian.llm.anthropicApiKey, codeHistorian.llm.googleApiKey) and are moved into encrypted Secret Storage automatically.

Capture Settings

{
  "codeHistorian.capture.enabled": true,
  "codeHistorian.capture.debounceMs": 2000,
  "codeHistorian.capture.excludePatterns": [
    "**/node_modules/**",
    "**/.git/**",
    "**/dist/**",
    "**/*.lock"
  ],
  "codeHistorian.capture.maxFileSizeKB": 1024
}
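
As a rough illustration of how these settings could gate a capture, the sketch below checks the enabled flag, the size limit, and the exclusion globs for a workspace-relative path. It is a simplified assumption of the behaviour: `shouldCapture` and `globToRegExp` are hypothetical helpers, and the glob translation handles only `**`, `*`, and `?`, which is enough for the patterns shown above but is not a full glob implementation.

```typescript
// Hypothetical capture filter; helper names and glob subset are assumptions.
interface CaptureSettings {
  enabled: boolean;
  excludePatterns: string[];
  maxFileSizeKB: number;
}

// Translate a small glob subset into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      source += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith("**", i)) {
      source += ".*";       // anything, including slashes
      i += 2;
    } else if (glob[i] === "*") {
      source += "[^/]*";    // anything within one path segment
      i += 1;
    } else if (glob[i] === "?") {
      source += "[^/]";
      i += 1;
    } else {
      // Escape regex metacharacters in literal text.
      source += glob[i].replace(/[.+^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

function shouldCapture(
  relativePath: string,
  sizeBytes: number,
  settings: CaptureSettings,
): boolean {
  if (!settings.enabled) return false;
  if (sizeBytes > settings.maxFileSizeKB * 1024) return false;
  const normalized = relativePath.replace(/\\/g, "/"); // Windows separators
  return !settings.excludePatterns.some((p) => globToRegExp(p).test(normalized));
}
```

Under these assumptions, `src/app.ts` would be captured, while anything under a `node_modules` or `dist` directory, any `*.lock` file, and any file larger than 1024 KB would be skipped.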

📖 Usage

Timeline View

1. Click the Code Historian icon in the Activity Bar (sidebar)
2. Browse your change history with multiple view options:
   - Timeline View: Classic vertical timeline with connecting lines
   - Cards View: Grid layout for visual scanning
   - Compact View: Dense list for maximum information
3. Use filters to narrow down results:
   - Filter by change type (Created, Modified, Deleted)
   - Filter by date range
   - Search by filename or content
4. Click any change to see a detailed diff view
5. Restore any previous version with one click

Chat Commands

Open VS Code Chat (Cmd/Ctrl + Shift + I) and use @historian:

@historian What changes did I make to the authentication module?
@historian Show me the login function from last week
@historian Find all database-related changes
@historian When did I add the validation logic?

Keyboard Shortcuts

Shortcut	Command
`Ctrl+Shift+H` / `Cmd+Shift+H`	Open Timeline
`Ctrl+Alt+F` / `Cmd+Alt+F`	Search History

🏗️ Architecture

Code Historian uses a modern architecture optimized for VS Code extensions:

┌─────────────────────────────────────────────────────────────┐
│                     VS Code Extension                        │
├──────────────┬──────────────┬───────────────┬───────────────┤
│   Capture    │   Embedding  │    Search     │     LLM       │
│   Engine     │   Service    │    Engine     │  Orchestrator │
│              │              │               │               │
│  • Debounce  │  • HuggingFace│ • Hybrid     │  • OpenAI     │
│  • Diff Gen  │  • Ollama    │   Search     │  • Anthropic  │
│  • Sessions  │  • OpenAI    │ • BM25+Vector│  • Ollama     │
├──────────────┴──────────────┴───────────────┴───────────────┤
│                      Database Layer                          │
│     SQLite (sql.js)          │        LanceDB               │
│     • Metadata               │        • Vector embeddings   │
│     • BM25 keyword search    │        • Similarity search   │
├──────────────────────────────┴──────────────────────────────┤
│                     React Webview UI                         │
│  Timeline • Search • Settings • Diff Viewer • Chat          │
└─────────────────────────────────────────────────────────────┘

Note: the SQLite layer uses sql.js (WebAssembly), which is not built with the FTS5 extension. Keyword search therefore uses candidate retrieval plus a JavaScript BM25 ranker rather than SQLite FTS5.
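
For readers unfamiliar with BM25, the sketch below shows what a JavaScript-side ranker over already-retrieved candidates might look like. It uses the common Okapi BM25 formula with the conventional defaults k1 = 1.2 and b = 0.75; the tokenizer, the `Candidate` shape, and the function names are assumptions for illustration and are not taken from the extension's source.

```typescript
// Hypothetical BM25 ranker over a small candidate set; names are illustrative.
interface Candidate {
  id: string;
  text: string;
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

function bm25Rank(
  query: string,
  candidates: Candidate[],
  k1: number = 1.2,
  b: number = 0.75,
): { id: string; score: number }[] {
  const docs = candidates.map((c) => ({ id: c.id, tokens: tokenize(c.text) }));
  const totalLength = docs.reduce((sum, d) => sum + d.tokens.length, 0);
  const avgLength = totalLength / Math.max(docs.length, 1);
  const queryTerms = Array.from(new Set(tokenize(query)));

  // Document frequency: how many candidates contain each query term.
  const docFreq = new Map<string, number>();
  queryTerms.forEach((term) => {
    docFreq.set(term, docs.filter((d) => d.tokens.indexOf(term) !== -1).length);
  });

  const scored = docs.map((d) => {
    let score = 0;
    queryTerms.forEach((term) => {
      const tf = d.tokens.filter((t) => t === term).length;
      if (tf === 0) return;
      const n = docFreq.get(term) ?? 0;
      // Smoothed IDF that stays non-negative for common terms.
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + b * (d.tokens.length / avgLength);
      score += (idf * (tf * (k1 + 1))) / (tf + k1 * lengthNorm);
    });
    return { id: d.id, score };
  });

  return scored.sort((x, y) => y.score - x.score);
}
```

Because the statistics are computed over the candidate set only, scores are relative to that set rather than to the whole history, which is a reasonable trade-off when the candidates come from a cheaper pre-filter.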

Key Technologies

Component	Technology	Purpose
Metadata DB	SQLite (sql.js)	In-process metadata storage + BM25 keyword ranking
Vector DB	LanceDB	Embedded vector database with ANN search
Embeddings	Built-in/Ollama/HuggingFace/OpenAI	Semantic code understanding
UI Framework	React 18	Modern, reactive webview interface
Build Tool	esbuild	Fast TypeScript bundling
Chat API	VS Code Chat	Native chat participant integration

Search Pipeline

User Query → Embedding → Vector Search (top-k)
                     ↘
                       RRF Fusion → (optional rerank) → Ranked Results
                     ↗
            → BM25 Keyword Search

The hybrid search combines:

Vector similarity (default 60% weight): Semantic understanding of code
Keyword matching (default 40% weight): BM25 ranking over candidate matches
Reciprocal Rank Fusion: Combines both result sets with overlap boosting
Reranking: a free local BM25 reranker refines the top results (a cloud cross-encoder via Cohere/HuggingFace can be configured instead)
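
The fusion step can be sketched with weighted Reciprocal Rank Fusion, where each result list contributes weight / (k + rank) to a document's score. How the extension actually applies its weights and overlap boost is not specified here, so treat this as one plausible reading: the function name, the use of k = 60 (a conventional RRF constant), and the direct use of the 0.6 / 0.4 defaults as list weights are assumptions.

```typescript
// Hypothetical weighted RRF; parameter names and k = 60 are assumptions.
function reciprocalRankFusion(
  vectorIds: string[],  // ids ordered best-first by vector similarity
  keywordIds: string[], // ids ordered best-first by BM25
  vectorWeight: number = 0.6,
  keywordWeight: number = 0.4,
  k: number = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();

  const accumulate = (ids: string[], weight: number): void => {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  };

  accumulate(vectorIds, vectorWeight);
  accumulate(keywordIds, keywordWeight);

  // Documents found by both searches receive two contributions,
  // which acts as a natural boost for overlapping results.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

For example, fusing the vector ranking `["a", "b", "c"]` with the keyword ranking `["b", "d"]` would place `b` first, since it is the only id present in both lists.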

📈 Performance

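Approximate design targets (vary by machine, provider, and history size):

Operation	Target	Notes
Change capture	non-blocking	Debounced, batched, off the UI thread
Embedding generation	provider-dependent	Built-in runs locally with no network round trip; cloud adds network latency
Vector search	low latency	LanceDB ANN over the local store
Hybrid search	interactive	Vector + BM25 keyword fusion
UI render	smooth	Virtualized timeline
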
Large diffs are gzip-compressed at rest, and history older than maxHistoryDays is pruned automatically.
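
The pruning rule can be illustrated with a small pure function that splits records into those inside and outside the retention window. This is a sketch under assumptions: the `ChangeRecord` shape and the `pruneHistory` name are hypothetical, and only the `maxHistoryDays` setting name comes from the text above.

```typescript
// Hypothetical retention check; record shape and function name are assumptions.
interface ChangeRecord {
  id: string;
  timestamp: number; // milliseconds since the Unix epoch
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function pruneHistory(
  records: ChangeRecord[],
  maxHistoryDays: number,
  now: number = Date.now(),
): { kept: ChangeRecord[]; pruned: ChangeRecord[] } {
  const cutoff = now - maxHistoryDays * MS_PER_DAY;
  const kept: ChangeRecord[] = [];
  const pruned: ChangeRecord[] = [];
  for (const record of records) {
    // Records exactly at the cutoff are treated as still within the window.
    (record.timestamp >= cutoff ? kept : pruned).push(record);
  }
  return { kept, pruned };
}
```

Returning both lists, rather than deleting in place, would let a caller remove the matching rows and vectors together and log what was dropped.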

🔒 Privacy

Your data stays 100% local by default:

✅ SQLite database in VS Code's global storage
✅ LanceDB vectors stored locally
✅ Optional Ollama for completely local AI
✅ No telemetry or external data sharing
✅ API keys stored in VS Code's encrypted Secret Storage, not in settings.json

When using cloud providers (OpenAI, HuggingFace, Anthropic, Google Gemini), only embedding requests and chat queries are sent externally, plus rerank requests if a cloud reranker is configured.

🛠️ Development

# Clone the repository
git clone https://github.com/KirtiJha/code-historian.git
cd code-historian

# Install dependencies
npm install

# Build the extension
npm run build

# Watch mode (auto-rebuild on changes)
npm run watch

# Type checking
npm run typecheck

# Linting
npm run lint

# Run tests
npm test

Project Structure

code-historian/
├── src/
│   ├── extension.ts        # Extension entry point
│   ├── constants.ts        # Configuration constants
│   ├── types/              # TypeScript type definitions
│   ├── database/           # SQLite & LanceDB wrappers
│   ├── services/           # Core services
│   │   ├── capture.ts      # Change capture engine
│   │   ├── embedding.ts    # Embedding service
│   │   ├── search.ts       # Hybrid search engine
│   │   ├── llm.ts          # LLM orchestrator
│   │   └── restoration.ts  # Code restoration
│   ├── chat/               # VS Code Chat participant
│   ├── webview/            # React webview UI
│   │   ├── ui/             # React components
│   │   └── provider.ts     # Webview provider
│   └── utils/              # Utilities
├── media/                  # Icons and assets
├── dist/                   # Build output
└── package.json            # Extension manifest

🤝 Contributing

Contributions are welcome! Here's how you can help:

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

Please read our Contributing Guide for details on our code of conduct and development process.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

LanceDB - Excellent embedded vector database
Ollama - Local AI inference made easy
HuggingFace - State-of-the-art embeddings
VS Code - Amazing extension API
sql.js - SQLite compiled to WebAssembly

Made with ❤️ for developers who value their code history

Report Bug • Request Feature
