Solutions — InferencePort AI
Solutions Overview

One stack for hosting, chat, and billing.

Use the the subscription (Gen) API for day-to-day chatting, the P2G API for production integrations, AI Shield for fraud detection, and InferencePort AI Server to self-host local AI models with secure team access.

The services behind the InferencePort AI stack.

Each part of the platform has a specific job: AI Shield handles fraud detection and abuse prevention, the generation API powers the default chat experience, the P2G API handles metered production workloads, InferencePort AI Server enables secure frictionless self-hosting of local models, and the console ties billing together.

Generation API
Subscription chat
Best for high-volume, low-token chatting. This is the default chat path, but it uses strict plan quotas and abuse controls.
  • Ideal for everyday chatbot traffic
  • Subscription-backed limits
  • OpenAI-style chat endpoint
P2G API
Credit-billed production API
Best for enterprise integrations and stable production services. Requests charge credits from your wallet instead of consuming plan quotas.
  • Predictable wallet balance
  • Separate credit ledger
  • Managed through the console
Tool Marketplace
InferencePort AI Tool Registry
A registry of tools that anyone can upload to that can be used in the app and explored on the website.
  • Create custom tools in minutes
  • Upload tools to the registry
  • Download and run tools for any case
Chat Services
Local + cloud chat
Use local chat for offline work and the cloud chat stack when you need hosted access, sync, or shared usage across devices.
  • Local chat for private work
  • Cloud chat for hosted access
  • Optional sync and media storage
Self Hosting
InferencePort AI Server
A zero-setup AI server that can be hosted directly from the app. Run your own local LLMs and marketplace models while securely sharing access with your team, customers, or organization.
  • Host local models in minutes
  • Invite users by email
  • Authentication via InferencePort AI accounts
  • Create custom API keys for access control
  • 100 free token verifications daily
Fraud Detection
AI Shield
Real-time abuse and fraud detection for sign-ups, logins, and API traffic. Analyzes email, phone, IP, username, and device signals in a single call.
  • Risk score with confidence level
  • Duplicate and linked-account detection
  • Threat intelligence and recommended actions

Subscription generation for chatty workloads, P2G for production.

Two billing models. Two different jobs.

The subscription generation API is bundled with plan quotas, while P2G is charged against credits in your wallet. The console always shows the live pricing and usage values.

P2G (Credits)
Best for production APIs
Credit pack billing
Current default server rates are credit-based and visible in the console. The repository defaults are 0.75 credits per million text tokens, 0.02 per image, 0.01 per video second, and 0.01 per audio second.
  • Separate wallet and ledger
  • Recharge by purchasing credit packs
  • Recommended for enterprise and production use
AI Shield
Best for fraud prevention
Daily limits by plan
Per-plan daily quotas for abuse detection. Contact us to increase your AI Shield limit for high-volume production use cases.
  • 2–500 analyses per day by plan
  • Email, phone, IP, username, and device signals
  • Risk score, confidence, and recommended actions
InferencePort AI Server
Best for self-hosting
Free to start
Every server starts with 100 free token verifications per day. Invite teammates and users securely through InferencePort AI accounts or custom API keys. Additional verification capacity is available on request.
  • 100 free token verifications daily
  • Email-based team access
  • Custom API key management
  • Run local and marketplace models
  • Contact server@inferenceport.ai for higher limits

Start with the right path, then scale from there.

Step 1
Sign in and inspect your account
Open the console to view your wallet, current plan, usage history, and API keys.
Step 2
Choose cloud or self-hosted AI
Use the Generation API and P2G API for hosted workloads, or launch an InferencePort AI Server directly from the app to run your own local models.
Step 3
Ship with the right billing model
Keep chat traffic on subscription quotas and move production or enterprise traffic to the credit-based P2G API.