naimjeem/hybrid-ai-cli: a small terminal tool that routes each prompt to local Ollama or the cloud so you’re not burning API tokens on “quick questions,” but you still get a strong model when the task needs it.

Hybrid AI CLI

A lightweight CLI tool that dynamically routes user prompts to either a Local AI model (for simple tasks, saving cost and preserving privacy) or a Cloud AI model (for complex logic and reasoning).

Built as a proof-of-concept for Hybrid Agentic Development.
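The shared routing behavior can be sketched as follows. This is an illustrative TypeScript outline, not the actual src/ API; the function and parameter names are assumptions. The one detail taken from the source is the fallback: if routing fails, the prompt goes to the cloud model.

```typescript
// Hypothetical sketch of the hybrid routing flow (names are illustrative):
// classify the prompt with the local model, then dispatch accordingly.
type Route = "LOCAL" | "CLOUD";

async function answer(
  prompt: string,
  classify: (p: string) => Promise<Route>, // local intent router
  local: (p: string) => Promise<string>,   // Ollama handler
  cloud: (p: string) => Promise<string>    // cloud provider handler
): Promise<string> {
  let route: Route;
  try {
    route = await classify(prompt);
  } catch {
    route = "CLOUD"; // routing failure falls back to the cloud model
  }
  return route === "LOCAL" ? local(prompt) : cloud(prompt);
}
```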

This repository includes two implementations with the same behavior:

  • Node.js (src/, npm run build) — hyai via npm link or node ./dist/index.js
  • Rust (rust/) — native hyai binary from cargo build --release

Features

  • Cost Efficient & Fast: Uses Ollama (default: llama3.2) to quickly assess intent and execute simple queries locally.
  • Multi-Cloud Support: For complex queries, connects to your choice of Cloud AI. Supports:
    • OpenAI (gpt-4o, gpt-4-turbo, etc.)
    • Anthropic (claude-3-5-sonnet-latest, claude-3-opus-20240229, etc.)
    • Google Gemini (gemini-2.5-pro, gemini-2.5-flash, etc.)
    • Zhipu AI / GLM (glm-4, etc.)
  • Interactive Terminal UI: Chat interface with spinners and formatted output in both implementations.

Setup (shared)

  1. Prerequisites: Make sure you have Ollama running locally with a model.

    ollama pull llama3.2
  2. API keys: Copy .env.example to .env and add the keys for the cloud providers you use (see .env.example).

    cp .env.example .env

    The CLI loads .env from your current working directory when you run a command.

  3. Optional — API base URLs: Override endpoints when using proxies, self-hosted gateways, or non-default regions. Both Node and Rust read the same variables (defaults match each provider’s public API):

    • OLLAMA_HOST — local Ollama (default http://127.0.0.1:11434)
    • OPENAI_BASE_URL — OpenAI-compatible root including /v1 (default https://api.openai.com/v1)
    • ANTHROPIC_BASE_URL — Anthropic API origin (default https://api.anthropic.com; requests use /v1/messages)
    • GEMINI_BASE_URL — Google Generative Language API root (default https://generativelanguage.googleapis.com/v1beta)
    • ZHIPU_BASE_URL — Zhipu GLM OpenAI-compatible root (default https://open.bigmodel.cn/api/paas/v4)

    See .env.example for placeholders.
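Putting the steps above together, a .env might look like the fragment below. The API-key variable names are assumptions based on common provider conventions; .env.example has the exact names the CLI reads.

```
# Local Ollama (optional; this is the default)
OLLAMA_HOST=http://127.0.0.1:11434

# Cloud provider keys (add only the ones you use;
# variable names are assumptions, see .env.example)
OPENAI_API_KEY=
ANTHROPIC_API_KEY=
GEMINI_API_KEY=
ZHIPU_API_KEY=
```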

Install — Node.js

npm install
npm run build
npm link

After npm link, the hyai command points at the Node build.

Install — Rust (native)

Requires a Rust toolchain (cargo).

From the repository root:

cargo build --manifest-path rust/Cargo.toml --release

The binary is rust/target/release/hyai. Add it to your PATH, run it by full path, or install it globally:

cargo install --path rust

If both the npm-linked and Cargo-installed binaries are named hyai, whichever comes first on your PATH wins; use full paths if you need to run a specific build.

Usage

Commands and flags are the same for both implementations.

Interactive chat

hyai chat

Optional model overrides (short flags):

hyai chat -l phi3 -c gpt-4-turbo

Single question

hyai ask "Write an email to my boss asking for a day off"

(Often routed to your local model.)

hyai ask "Write a Python script that calculates the nth Fibonacci number efficiently using dynamic programming and logs the memory usage"

(Often routed to your cloud model.)

Rust-only note: From the repo root, you can run without installing:

cargo run --manifest-path rust/Cargo.toml --release -- chat
cargo run --manifest-path rust/Cargo.toml --release -- ask "Hello"

Architecture

Node.js

  1. Config (src/config.ts): Default API base URLs and OLLAMA_HOST (all overridable via .env).
  2. Intent router (src/router.ts): Calls the local Ollama model with a fixed routing prompt; outputs LOCAL or CLOUD (falls back to cloud if routing fails).
  3. Local handler (src/localHandler.ts): Runs the user prompt through Ollama.
  4. Cloud handler (src/cloudHandler.ts): Dispatches by model name to OpenAI, Anthropic, Gemini (fetch to GEMINI_BASE_URL), or GLM (Zhipu).
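The dispatch-by-model-name step can be sketched as a lookup on the model-name prefix. The prefixes below are inferred from the models listed under Features; the actual selection rules in src/cloudHandler.ts may differ.

```typescript
type Provider = "openai" | "anthropic" | "gemini" | "zhipu";

// Pick a cloud provider from the model-name prefix.
// Prefixes inferred from the models listed in Features; illustrative only.
function pickProvider(model: string): Provider {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "gemini";
  if (model.startsWith("glm-")) return "zhipu";
  throw new Error(`Unknown cloud model: ${model}`);
}
```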

Rust

  1. Intent router (rust/src/router.rs): Same routing prompt and Ollama call as the TypeScript version.
  2. Local handler (rust/src/local_handler.rs): Ollama HTTP API (/api/chat).
  3. Cloud handler (rust/src/cloud_handler.rs): Same provider selection rules; HTTP clients via reqwest.

Shared Ollama client logic lives in rust/src/ollama.rs. Default HTTP roots and join_url helpers: rust/src/urls.rs. Entry point and CLI parsing: rust/src/main.rs.
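For illustration, the kind of joining a join_url helper typically does can be sketched in TypeScript: concatenate a base URL and a path without doubling or dropping slashes. The actual implementation in rust/src/urls.rs may behave differently.

```typescript
// Join an API base URL and a path, normalizing the slash between them.
// (Illustrative sketch; not the rust/src/urls.rs implementation.)
function joinUrl(base: string, path: string): string {
  return base.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
}
```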
