rainbow-365/free-claude-code
Free Claude Code

Use Claude Code CLI & VSCode for free. No Anthropic API key required.

License: MIT · Python 3.14 · uv · Tested with Pytest · Type checking: Ty · Code style: Ruff · Logging: Loguru

A lightweight proxy that routes Claude Code's Anthropic API calls to NVIDIA NIM (40 req/min free), OpenRouter (hundreds of models), or LM Studio (fully local).

Features · Quick Start · How It Works · Discord Bot · Configuration


Free Claude Code in action: Claude Code running via NVIDIA NIM, completely free.

Features

| Feature | Description |
| --- | --- |
| Zero Cost | 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio |
| Drop-in Replacement | Set 2 env vars. No modifications to Claude Code CLI or VSCode extension needed |
| 3 Providers | NVIDIA NIM, OpenRouter (hundreds of models), LM Studio (local & offline) |
| Thinking Token Support | Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks |
| Heuristic Tool Parser | Models outputting tool calls as text are auto-parsed into structured tool use |
| Request Optimization | 5 categories of trivial API calls intercepted locally, saving quota and latency |
| Discord Bot | Remote autonomous coding with tree-based threading, session persistence, and live progress (Telegram also supported) |
| Smart Rate Limiting | Proactive rolling-window throttle + reactive 429 exponential backoff + optional concurrency cap across all providers |
| Subagent Control | Task tool interception forces `run_in_background=False`. No runaway subagents |
| Extensible | Clean `BaseProvider` and `MessagingPlatform` ABCs. Add new providers or platforms easily |
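The smart rate limiting row above combines a proactive rolling-window throttle with reactive 429 backoff. Here is a minimal sketch of the rolling-window half, assuming a simple synchronous `try_acquire` API (the proxy's real implementation is async and layers exponential backoff on top; class and method names here are illustrative, not the repo's):

```python
import time
from collections import deque
from typing import Optional


class RollingWindowThrottle:
    """Allow at most `limit` requests in any `window`-second span."""

    def __init__(self, limit: int = 40, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._timestamps: deque = deque()  # monotonic times of recent requests

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Return True and record the request if under the limit, else False."""
        now = time.monotonic() if now is None else now
        # Evict timestamps that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False  # caller should wait, or back off after a 429
        self._timestamps.append(now)
        return True
```

With `limit=40` and `window=60.0` this matches the NVIDIA NIM free-tier budget of 40 requests per minute.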

Quick Start

Prerequisites

  1. Get an API key from NVIDIA NIM or OpenRouter (or use LM Studio locally)
  2. Install Claude Code
  3. Install uv

Clone & Configure

git clone https://github.com/Alishahryar1/free-claude-code.git
cd free-claude-code
cp .env.example .env

Choose your provider and edit .env:

NVIDIA NIM (40 req/min free, recommended):

NVIDIA_NIM_API_KEY="nvapi-your-key-here"
MODEL="nvidia_nim/stepfun-ai/step-3.5-flash"

OpenRouter (hundreds of models):

OPENROUTER_API_KEY="sk-or-your-key-here"
MODEL="open_router/stepfun/step-3.5-flash:free"

LM Studio (fully local, no API key):

MODEL="lmstudio/lmstudio-community/qwen2.5-7b-instruct"

Run It

Terminal 1: Start the proxy server:

uv run uvicorn server:app --host 0.0.0.0 --port 8082

Terminal 2: Run Claude Code:

ANTHROPIC_AUTH_TOKEN="freecc" ANTHROPIC_BASE_URL="http://localhost:8082" claude

That's it! Claude Code now uses your configured provider for free.

Multi-Model Support (Model Picker)

claude-pick is an interactive model selector that lets you choose any model from your active provider each time you launch Claude, without editing MODEL in .env.


1. Install fzf (highly recommended for the interactive picker):

brew install fzf        # macOS/Linux

2. Add the alias to ~/.zshrc or ~/.bashrc:

# Use the absolute path to your cloned repo
alias claude-pick="/absolute/path/to/free-claude-code/claude-pick"

Then reload your shell (source ~/.zshrc or source ~/.bashrc) and run claude-pick to pick a model and launch Claude.

To skip the picker entirely, pin a fixed model in an alias:

alias claude-kimi='ANTHROPIC_BASE_URL="http://localhost:8082" ANTHROPIC_AUTH_TOKEN="freecc:moonshotai/kimi-k2.5" claude'

VSCode Extension Setup
  1. Start the proxy server (same as above).
  2. Open Settings (Ctrl + ,) and search for claude-code.environmentVariables.
  3. Click Edit in settings.json and add:
"claude-code.environmentVariables": [
  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
]
  4. Reload extensions.
  5. If you see the login screen ("How do you want to log in?"), click Anthropic Console, then authorize. The extension will start working. You may be redirected to buy credits in the browser; ignore it, the extension already works.

To switch back to Anthropic models, comment out the added block and reload extensions.


How It Works

┌─────────────────┐        ┌──────────────────────┐        ┌──────────────────┐
│  Claude Code    │───────>│  Free Claude Code    │───────>│  LLM Provider    │
│  CLI / VSCode   │<───────│  Proxy (:8082)       │<───────│  NIM / OR / LMS  │
└─────────────────┘        └──────────────────────┘        └──────────────────┘
   Anthropic API                     │                       OpenAI-compatible
   format (SSE)              ┌───────┴────────┐                format (SSE)
                             │ Optimizations  │
                             ├────────────────┤
                             │ Quota probes   │
                             │ Title gen skip │
                             │ Prefix detect  │
                             │ Suggestion skip│
                             │ Filepath mock  │
                             └────────────────┘
  • Transparent proxy: Claude Code sends standard Anthropic API requests to the proxy server
  • Request optimization: 5 categories of trivial requests (quota probes, title generation, prefix detection, suggestions, filepath extraction) are intercepted and responded to instantly without using API quota
  • Format translation: real requests are translated from Anthropic format to the provider's OpenAI-compatible format and streamed back
  • Thinking tokens: <think> tags and reasoning_content fields are converted into native Claude thinking blocks so Claude Code renders them correctly
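The format-translation step can be sketched as follows. This is an illustrative function, not the proxy's actual converter; real code must also handle tool definitions, images, content-block system prompts, and thinking blocks:

```python
def anthropic_to_openai(payload: dict) -> dict:
    """Translate an Anthropic Messages API request body into an
    OpenAI-compatible chat-completions body (simplified sketch)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI-compatible APIs expect it as the first message.
    if payload.get("system"):
        messages.append({"role": "system", "content": payload["system"]})
    for msg in payload.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):  # Anthropic content blocks
            content = "".join(
                block["text"] for block in content if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": payload["model"],
        "messages": messages,
        "max_tokens": payload.get("max_tokens", 1024),
        "stream": True,  # the proxy streams responses back as SSE
    }
```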

Providers

| Provider | Cost | Rate Limit | Models | Best For |
| --- | --- | --- | --- | --- |
| NVIDIA NIM | Free | 40 req/min | Kimi K2, GLM5, Devstral, MiniMax | Daily driver, generous free tier |
| OpenRouter | Free / Paid | Varies | 200+ (GPT-4o, Claude, Step, etc.) | Model variety, fallback options |
| LM Studio | Free (local) | Unlimited | Any GGUF model | Privacy, offline use, no rate limits |

Switch providers by changing MODEL in .env. Models use the prefix format provider/model/name; an unrecognized prefix raises an error.

| Provider | MODEL prefix | API Key Variable | Base URL |
| --- | --- | --- | --- |
| NVIDIA NIM | nvidia_nim/... | NVIDIA_NIM_API_KEY | integrate.api.nvidia.com/v1 |
| OpenRouter | open_router/... | OPENROUTER_API_KEY | openrouter.ai/api/v1 |
| LM Studio | lmstudio/... | (none) | localhost:1234/v1 |

LM Studio runs locally. Start the server in the Developer tab or via lms server start, load a model, and set MODEL to the model identifier.
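The prefix routing described above can be sketched as follows (`KNOWN_PREFIXES` and the function name are hypothetical; the repo's registry may be organised differently):

```python
class UnknownProviderError(ValueError):
    """Raised when MODEL does not start with a recognised provider prefix."""


KNOWN_PREFIXES = {"nvidia_nim", "open_router", "lmstudio"}


def split_model(model: str) -> tuple:
    """Split MODEL ("provider/model/name") into (provider, model_id).

    Only the first path segment is the provider prefix; the remainder
    (which may itself contain slashes) is passed to the provider verbatim.
    """
    prefix, _, model_id = model.partition("/")
    if prefix not in KNOWN_PREFIXES or not model_id:
        raise UnknownProviderError(f"Invalid MODEL prefix: {model!r}")
    return prefix, model_id
```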


Discord Bot

Control Claude Code remotely from Discord. Send tasks, watch live progress, and manage multiple concurrent sessions. Telegram is also supported.

Capabilities:

  • Tree-based message threading: reply to a message to fork the conversation
  • Session persistence across server restarts
  • Live streaming of thinking tokens, tool calls, and results
  • Unlimited concurrent Claude CLI sessions (provider concurrency controlled by PROVIDER_MAX_CONCURRENCY)
  • Voice notes: send voice messages; they are transcribed and processed like regular prompts (see Voice Notes)
  • Commands: /stop (cancel tasks; reply to a message to stop only that task), /clear (standalone: reset all sessions; reply to a message to clear that branch downwards), /stats
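The tree-based threading model above (reply to fork, clear a branch downwards) can be sketched with simple parent/child nodes. Class and method names here are hypothetical; the bot's real session store is more involved:

```python
class SessionNode:
    """One message in the conversation tree; replying to it forks a branch."""

    def __init__(self, message_id, parent=None):
        self.message_id = message_id
        self.parent = parent
        self.children = []
        if parent:
            parent.children.append(self)

    def history(self):
        """Walk parent pointers to rebuild the conversation for this branch."""
        node, path = self, []
        while node:
            path.append(node.message_id)
            node = node.parent
        return list(reversed(path))

    def clear_downwards(self):
        """Collect this node and every descendant (what /clear-on-reply removes)."""
        removed = [self.message_id]
        for child in self.children:
            removed.extend(child.clear_downwards())
        return removed
```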

Setup

  1. Create a Discord Bot: Go to Discord Developer Portal, create an application, add a bot, and copy the token. Enable Message Content Intent under Bot settings.

  2. Edit .env:

MESSAGING_PLATFORM="discord"
DISCORD_BOT_TOKEN="your_discord_bot_token"
ALLOWED_DISCORD_CHANNELS="123456789,987654321"

Enable Developer Mode in Discord (Settings → Advanced), then right-click a channel and "Copy ID" to get channel IDs. Comma-separate multiple channels. If empty, no channels are allowed.

  3. Configure the workspace (where Claude will operate):
CLAUDE_WORKSPACE="./agent_workspace"
ALLOWED_DIR="C:/Users/yourname/projects"
  4. Start the server:
uv run uvicorn server:app --host 0.0.0.0 --port 8082
  5. Invite the bot (OAuth2 URL Generator; scopes: bot; permissions: Read Messages, Send Messages, Manage Messages, Read Message History). Send a task to an allowed channel and Claude responds with live thinking tokens and tool calls. Use the commands above to cancel or clear.
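The ALLOWED_DISCORD_CHANNELS rule from step 2 (comma-separated IDs; an empty value allows nothing) can be sketched as:

```python
def parse_allowed_channels(raw: str) -> frozenset:
    """Parse ALLOWED_DISCORD_CHANNELS ("123,456"). Empty string allows nothing."""
    return frozenset(int(part) for part in raw.split(",") if part.strip())


def is_channel_allowed(channel_id: int, allowed: frozenset) -> bool:
    # An empty allowlist means no channels are allowed, per the setup notes.
    return channel_id in allowed
```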

Telegram (Alternative)

To use Telegram instead, set MESSAGING_PLATFORM=telegram and configure:

TELEGRAM_BOT_TOKEN="123456789:ABCdefGHIjklMNOpqrSTUvwxYZ"
ALLOWED_TELEGRAM_USER_ID="your_telegram_user_id"

Get a token from @BotFather; find your user ID via @userinfobot.

Voice Notes

Send voice messages on Telegram or Discord; they are transcribed to text and processed as regular prompts. Two transcription backends are available:

  • Local Whisper (default): Uses Hugging Face transformers Whisper — free, no API key, works offline, CUDA compatible. No ffmpeg required.
  • NVIDIA NIM: Uses NVIDIA NIM Whisper/Parakeet models via gRPC; requires NVIDIA_NIM_API_KEY.

Install the optional voice extras:

# For local Whisper (cpu/cuda)
uv sync --extra voice_local

# For NVIDIA NIM transcription
uv sync --extra voice

# Install both
uv sync --extra voice --extra voice_local

Configuration:

| Variable | Description | Default |
| --- | --- | --- |
| VOICE_NOTE_ENABLED | Enable voice note handling | true |
| WHISPER_DEVICE | cpu, cuda, or nvidia_nim | cpu |
| WHISPER_MODEL | See supported models below | base |
| HF_TOKEN | Hugging Face token for faster model downloads (optional, for local Whisper) | |
| NVIDIA_NIM_API_KEY | API key for NVIDIA NIM (required for the nvidia_nim device) | |

Supported WHISPER_MODEL values:

| Model | Device | Description |
| --- | --- | --- |
| tiny, base, small, medium, large-v2, large-v3, large-v3-turbo | cpu / cuda | Local Whisper (Hugging Face) |
| openai/whisper-large-v3 | nvidia_nim | Auto language detection (best overall) |
| nvidia/parakeet-ctc-1.1b-asr | nvidia_nim | English-only |
| nvidia/parakeet-ctc-0.6b-asr | nvidia_nim | English-only |
| nvidia/parakeet-ctc-0.6b-zh-cn | nvidia_nim | Mandarin Chinese |
| nvidia/parakeet-ctc-0.6b-zh-tw | nvidia_nim | Traditional Chinese |
| nvidia/parakeet-ctc-0.6b-es | nvidia_nim | Spanish |
| nvidia/parakeet-ctc-0.6b-vi | nvidia_nim | Vietnamese |
| nvidia/parakeet-1.1b-rnnt-multilingual-asr | nvidia_nim | Multilingual RNNT |
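The device/model pairing rules in the table above can be sketched as a validation helper (function and constant names are hypothetical; the repo's settings code may differ):

```python
LOCAL_MODELS = {"tiny", "base", "small", "medium", "large-v2", "large-v3", "large-v3-turbo"}
NIM_MODELS = {
    "openai/whisper-large-v3",
    "nvidia/parakeet-ctc-1.1b-asr",
    "nvidia/parakeet-ctc-0.6b-asr",
    "nvidia/parakeet-ctc-0.6b-zh-cn",
    "nvidia/parakeet-ctc-0.6b-zh-tw",
    "nvidia/parakeet-ctc-0.6b-es",
    "nvidia/parakeet-ctc-0.6b-vi",
    "nvidia/parakeet-1.1b-rnnt-multilingual-asr",
}


def validate_voice_config(device: str, model: str) -> None:
    """Reject WHISPER_DEVICE / WHISPER_MODEL combinations the table disallows."""
    if device in ("cpu", "cuda"):
        if model not in LOCAL_MODELS:
            raise ValueError(f"{model!r} is not a local Whisper size")
    elif device == "nvidia_nim":
        if model not in NIM_MODELS:
            raise ValueError(f"{model!r} is not an NVIDIA NIM ASR model")
    else:
        raise ValueError(f"Unknown WHISPER_DEVICE: {device!r}")
```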

Models

NVIDIA NIM

Full list in nvidia_nim_models.json.

Popular models:

  • nvidia_nim/minimaxai/minimax-m2.5
  • nvidia_nim/qwen/qwen3.5-397b-a17b
  • nvidia_nim/z-ai/glm5
  • nvidia_nim/stepfun-ai/step-3.5-flash
  • nvidia_nim/moonshotai/kimi-k2.5

Browse: build.nvidia.com

Update model list:

curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json

OpenRouter

Hundreds of models from StepFun, OpenAI, Anthropic, Google, and more.

Popular models:

  • open_router/stepfun/step-3.5-flash:free
  • open_router/deepseek/deepseek-r1-0528:free
  • open_router/openai/gpt-oss-120b:free

Browse: openrouter.ai/models

Browse free models: https://openrouter.ai/collections/free-models

LM Studio

Run models locally with LM Studio. Load a model in the Chat or Developer tab, then set MODEL to its identifier.

Examples (native tool-use support):

  • lmstudio-community/qwen2.5-7b-instruct
  • lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
  • bartowski/Ministral-8B-Instruct-2410-GGUF

Browse: model.lmstudio.ai


Configuration

| Variable | Description | Default |
| --- | --- | --- |
| MODEL | Model to use (prefix format provider/model/name; invalid prefix raises an error) | nvidia_nim/stepfun-ai/step-3.5-flash |
| NVIDIA_NIM_API_KEY | NVIDIA API key (NIM provider) | required |
| OPENROUTER_API_KEY | OpenRouter API key (OpenRouter provider) | required |
| LM_STUDIO_BASE_URL | LM Studio server URL | http://localhost:1234/v1 |
| PROVIDER_RATE_LIMIT | LLM API requests per window | 40 |
| PROVIDER_RATE_WINDOW | Rate-limit window (seconds) | 60 |
| PROVIDER_MAX_CONCURRENCY | Max simultaneous open provider streams | 5 |
| HTTP_READ_TIMEOUT | Read timeout for provider API requests (seconds) | 300 |
| HTTP_WRITE_TIMEOUT | Write timeout for provider API requests (seconds) | 10 |
| HTTP_CONNECT_TIMEOUT | Connect timeout for provider API requests (seconds) | 2 |
| FAST_PREFIX_DETECTION | Enable fast prefix detection | true |
| ENABLE_NETWORK_PROBE_MOCK | Enable network probe mock | true |
| ENABLE_TITLE_GENERATION_SKIP | Skip title generation | true |
| ENABLE_SUGGESTION_MODE_SKIP | Skip suggestion mode | true |
| ENABLE_FILEPATH_EXTRACTION_MOCK | Enable filepath extraction mock | true |
| MESSAGING_PLATFORM | Messaging platform: discord or telegram | discord |
| DISCORD_BOT_TOKEN | Discord bot token | "" |
| ALLOWED_DISCORD_CHANNELS | Comma-separated channel IDs (empty = none allowed) | "" |
| TELEGRAM_BOT_TOKEN | Telegram bot token | "" |
| ALLOWED_TELEGRAM_USER_ID | Allowed Telegram user ID | "" |
| VOICE_NOTE_ENABLED | Enable voice note handling | true |
| WHISPER_MODEL | Local Whisper model size | base |
| WHISPER_DEVICE | cpu or cuda | cpu |
| MESSAGING_RATE_LIMIT | Messaging messages per window | 1 |
| MESSAGING_RATE_WINDOW | Messaging window (seconds) | 1 |
| CLAUDE_WORKSPACE | Directory for agent workspace | ./agent_workspace |
| ALLOWED_DIR | Allowed directories for agent | "" |

See .env.example for all supported parameters.
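Boolean flags such as ENABLE_TITLE_GENERATION_SKIP are typically read from the environment with a fallback to the documented default. The helper below is hypothetical, illustrating how the defaults in the table above could be applied; the repo's settings module may differ:

```python
import os


def env_bool(name: str, default: bool) -> bool:
    """Read a boolean env var, falling back to `default` when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Accept the usual truthy spellings; anything else is False.
    return raw.strip().lower() in ("1", "true", "yes", "on")
```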


Development

Project Structure

free-claude-code/
├── server.py              # Entry point
├── api/                   # FastAPI routes, request detection, optimization handlers
├── providers/             # BaseProvider, OpenAICompatibleProvider, NIM, OpenRouter, LM Studio
│   └── common/            # Shared utils (SSE builder, message converter, parsers, error mapping)
├── messaging/             # MessagingPlatform ABC + Discord/Telegram bots, session management
├── config/                # Settings, NIM config, logging
├── cli/                   # CLI session and process management
└── tests/                 # Pytest test suite

Commands

uv run ruff format     # Format code
uv run ruff check      # Code style checking
uv run ty check        # Type checking
uv run pytest          # Run tests

Extending

Adding a Provider

For OpenAI-compatible APIs (Groq, Together AI, etc.), extend OpenAICompatibleProvider:

from providers.openai_compat import OpenAICompatibleProvider
from providers.base import ProviderConfig

class MyProvider(OpenAICompatibleProvider):
    def __init__(self, config: ProviderConfig):
        super().__init__(config, provider_name="MYPROVIDER",
                         base_url="https://api.example.com/v1", api_key=config.api_key)

    def _build_request_body(self, request):
        return build_request_body(request)  # Your request builder

For fully custom APIs, extend BaseProvider directly:

from providers.base import BaseProvider, ProviderConfig

class MyProvider(BaseProvider):
    async def stream_response(self, request, input_tokens=0, *, request_id=None):
        # Yield Anthropic SSE format events
        ...

Adding a Messaging Platform

Extend MessagingPlatform in messaging/ to add Slack or other platforms:

from messaging.base import MessagingPlatform

class MyPlatform(MessagingPlatform):
    async def start(self):
        # Initialize connection
        ...

    async def stop(self):
        # Cleanup
        ...

    async def send_message(self, chat_id, text, reply_to=None, parse_mode=None, message_thread_id=None):
        # Send a message
        ...

    async def edit_message(self, chat_id, message_id, text, parse_mode=None):
        # Edit an existing message
        ...

    def on_message(self, handler):
        # Register callback for incoming messages
        ...

Contributing

  • Report bugs or suggest features via Issues
  • Add new LLM providers (Groq, Together AI, etc.)
  • Add new messaging platforms (Slack, etc.)
  • Improve test coverage
  • Docker integration is not being accepted at this time
# Fork the repo, then:
git checkout -b my-feature
# Make your changes
uv run ruff format && uv run ruff check && uv run ty check && uv run pytest
# Open a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Built with FastAPI, OpenAI Python SDK, discord.py, and python-telegram-bot.
