A lightweight CLI tool that dynamically routes user prompts to either a Local AI model (for simple tasks, saving cost and preserving privacy) or a Cloud AI model (for complex logic and reasoning).
Built as a proof-of-concept for Hybrid Agentic Development.
This repository includes two implementations with the same behavior:
- Node.js (`src/`, `npm run build`) — `hyai` via `npm link` or `node ./dist/index.js`
- Rust (`rust/`) — native `hyai` binary from `cargo build --release`
- Cost Efficient & Fast: Uses `ollama` (default: `llama3.2`) to quickly assess intent and execute simple queries locally.
- Multi-Cloud Support: For complex queries, connects to your choice of Cloud AI. Supports:
  - OpenAI (`gpt-4o`, `gpt-4-turbo`, etc.)
  - Anthropic (`claude-3-5-sonnet-latest`, `claude-3-opus-20240229`, etc.)
  - Google Gemini (`gemini-2.5-pro`, `gemini-2.5-flash`, etc.)
  - Zhipu AI / GLM (`glm-4`, etc.)
- Seamless Integration: Interactive terminal UI with spinners and formatting (both implementations).
- Prerequisites: Make sure you have Ollama running locally with a model:

  ```sh
  ollama pull llama3.2
  ```
- API keys: Copy `.env.example` to `.env` and add the keys for the cloud providers you use (see `.env.example`):

  ```sh
  cp .env.example .env
  ```

  The CLI loads `.env` from your current working directory when you run a command.
- Optional — API base URLs: Override endpoints when using proxies, self-hosted gateways, or non-default regions. Both Node and Rust read the same variables (defaults match each provider's public API):
  - `OLLAMA_HOST` — local Ollama (default `http://127.0.0.1:11434`)
  - `OPENAI_BASE_URL` — OpenAI-compatible root including `/v1` (default `https://api.openai.com/v1`)
  - `ANTHROPIC_BASE_URL` — Anthropic API origin (default `https://api.anthropic.com`; requests use `/v1/messages`)
  - `GEMINI_BASE_URL` — Google Generative Language API root (default `https://generativelanguage.googleapis.com/v1beta`)
  - `ZHIPU_BASE_URL` — Zhipu GLM OpenAI-compatible root (default `https://open.bigmodel.cn/api/paas/v4`)

  See `.env.example` for placeholders.
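As an illustration, a filled-in `.env` might look like the sketch below. The base-URL variables are the ones documented above; the API key variable names shown here are assumptions, so check `.env.example` for the exact names this CLI reads:

```ini
# Keys for whichever cloud providers you use (variable names are assumptions;
# see .env.example for the names the CLI actually expects)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...

# Optional endpoint overrides (documented defaults shown)
OLLAMA_HOST=http://127.0.0.1:11434
OPENAI_BASE_URL=https://api.openai.com/v1
```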
```sh
npm install
npm run build
npm link
```

After `npm link`, the `hyai` command points at the Node build.
Requires a Rust toolchain (`cargo`).
From the repository root:
```sh
cargo build --manifest-path rust/Cargo.toml --release
```

The binary is `rust/target/release/hyai`. Add it to your PATH, run it by full path, or install it globally:

```sh
cargo install --path rust
```

If both the npm-linked and Cargo-installed binaries are named `hyai`, whichever comes first on your PATH wins; use full paths if you need to run a specific build.
Commands and flags are the same for both implementations.
```sh
hyai chat
```

Optional model overrides (short flags):

```sh
hyai chat -l phi3 -c gpt-4-turbo
```

```sh
hyai ask "Write an email to my boss asking for a day off"
```

(Often routed to your local model.)

```sh
hyai ask "Write a Python script that calculates the nth Fibonacci number efficiently using dynamic programming and logs the memory usage"
```

(Often routed to your cloud model.)
Rust-only note: From the repo root, you can run without installing:
```sh
cargo run --manifest-path rust/Cargo.toml --release -- chat
cargo run --manifest-path rust/Cargo.toml --release -- ask "Hello"
```

- Config (`src/config.ts`): Default API base URLs and `OLLAMA_HOST` (all overridable via `.env`).
- Intent router (`src/router.ts`): Calls the local Ollama model with a fixed routing prompt; outputs `LOCAL` or `CLOUD` (falls back to cloud if routing fails).
- Local handler (`src/localHandler.ts`): Runs the user prompt through Ollama.
- Cloud handler (`src/cloudHandler.ts`): Dispatches by model name to OpenAI, Anthropic, Gemini (`fetch` to `GEMINI_BASE_URL`), or GLM (Zhipu).
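The dispatch-by-model-name rule described for the cloud handler can be sketched as below. This is a minimal illustration, not the repository's actual code; the exact prefix rules are assumptions based on the model names listed above:

```typescript
// Hypothetical sketch of provider selection by model-name prefix.
// The real rules live in src/cloudHandler.ts and may differ.
type Provider = "openai" | "anthropic" | "gemini" | "zhipu";

function pickProvider(model: string): Provider {
  if (model.startsWith("claude")) return "anthropic"; // claude-3-5-sonnet-latest, ...
  if (model.startsWith("gemini")) return "gemini";    // gemini-2.5-pro, ...
  if (model.startsWith("glm")) return "zhipu";        // glm-4, ...
  return "openai";                                    // gpt-4o, gpt-4-turbo, ...
}
```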
- Intent router (`rust/src/router.rs`): Same routing prompt and Ollama call as the TypeScript version.
- Local handler (`rust/src/local_handler.rs`): Ollama HTTP API (`/api/chat`).
- Cloud handler (`rust/src/cloud_handler.rs`): Same provider selection rules; HTTP clients via `reqwest`.
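The LOCAL/CLOUD decision shared by both routers, including the fall-back to cloud when routing fails, might look like this minimal TypeScript sketch; the parsing details are assumptions, not the repository's actual code:

```typescript
type Route = "LOCAL" | "CLOUD";

// Interpret the routing model's raw reply. Anything that is not clearly
// "LOCAL" (including a null/empty reply from a failed Ollama call) falls
// back to CLOUD, matching the fallback behavior described above.
function parseRoute(raw: string | null): Route {
  const answer = (raw ?? "").trim().toUpperCase();
  return answer.startsWith("LOCAL") ? "LOCAL" : "CLOUD";
}
```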
Shared Ollama client logic lives in `rust/src/ollama.rs`. Default HTTP roots and `join_url` helpers: `rust/src/urls.rs`. Entry point and CLI parsing: `rust/src/main.rs`.
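To illustrate what a `join_url`-style helper typically does (the real implementation in `rust/src/urls.rs` may differ), here is a TypeScript sketch that joins a base root and a path without doubling or dropping slashes:

```typescript
// Hypothetical sketch: join a configured base URL (e.g. ANTHROPIC_BASE_URL)
// with an endpoint path, tolerating trailing/leading slashes on either side.
function joinUrl(base: string, path: string): string {
  return `${base.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
}
```

This keeps endpoint overrides forgiving: `ANTHROPIC_BASE_URL` with or without a trailing slash produces the same request URL.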
