Setu

AI inference proxy powered by Solana USDC payments. No API keys — just a wallet.

What is Setu?

Setu is an AI inference proxy that lets any developer access models from OpenAI, Anthropic, Google, Moonshot, Zai, and MiniMax using a single Solana wallet. Instead of managing separate API keys and billing accounts with each provider, you top up a USDC balance and Setu routes your requests to the right provider.

  • No API keys needed — authenticate with your Solana wallet
  • Pay with USDC — top up via on-chain USDC transfers or credit card (Polar)
  • Pure passthrough — request bodies are forwarded unchanged, preserving full feature parity with native APIs
  • Pay-as-you-go — per-token billing with a 0.5% markup over base provider rates

Architecture

Client (with Solana wallet)
  │
  ├─ Signs request with wallet private key
  │
  ▼
Setu Router (Cloudflare Worker)
  │
  ├─ Verifies wallet signature (auth middleware)
  ├─ Checks USDC balance (balance-check middleware)
  ├─ If balance < $0.05 → returns 402 with x402 payment options
  │
  ├─ Routes by model:
  │   ├─ OpenAI models    → /v1/responses         → api.openai.com
  │   ├─ Anthropic models → /v1/messages           → api.anthropic.com
  │   ├─ Google models    → /v1/models/{model}     → generativelanguage.googleapis.com
  │   ├─ Moonshot models  → /v1/chat/completions   → api.moonshot.ai
  │   ├─ Zai models       → /v1/chat/completions   → open.bigmodel.cn
  │   ├─ MiniMax models   → /v1/messages           → api.minimax.io
  │   └─ Google (compat)  → /v1/chat/completions   → Google OpenAI-compat endpoint
  │
  ├─ Tracks token usage from provider response
  ├─ Deducts cost from user balance
  └─ Returns response with cost metadata
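
The flow above boils down to two decisions: gate on balance, then pick an upstream by model. The sketch below is a hypothetical illustration, not Setu's source. The names `resolveRoute` and `decide` are invented here, the model-name prefixes are inferred from the model lists in this document, and the OpenAI-compatible Google route is left out for brevity.

```typescript
// Hypothetical sketch of the router's decision steps (not Setu's actual code).
type Route = { provider: string; path: string; upstream: string };

// Prefix-to-route table. Prefixes are an assumption inferred from the model names
// in this document; paths and hosts are copied from the diagram above.
const ROUTES: Array<{ prefix: string; route: Route }> = [
  { prefix: "gpt-", route: { provider: "openai", path: "/v1/responses", upstream: "api.openai.com" } },
  { prefix: "claude-", route: { provider: "anthropic", path: "/v1/messages", upstream: "api.anthropic.com" } },
  { prefix: "gemini-", route: { provider: "google", path: "/v1/models/{model}", upstream: "generativelanguage.googleapis.com" } },
  { prefix: "kimi-", route: { provider: "moonshot", path: "/v1/chat/completions", upstream: "api.moonshot.ai" } },
  { prefix: "glm-", route: { provider: "zai", path: "/v1/chat/completions", upstream: "open.bigmodel.cn" } },
  { prefix: "minimax-", route: { provider: "minimax", path: "/v1/messages", upstream: "api.minimax.io" } },
];

// Balance threshold taken from the diagram: below this, the router answers 402.
const MIN_BALANCE_USD = 0.05;

// Case-insensitive prefix match; returns undefined for models not in the table.
function resolveRoute(model: string): Route | undefined {
  const name = model.toLowerCase();
  return ROUTES.find((entry) => name.startsWith(entry.prefix))?.route;
}

type Decision =
  | { kind: "payment_required"; status: 402 }
  | { kind: "unknown_model" }
  | { kind: "forward"; route: Route };

// Balance gate first, then routing, mirroring the order of the middleware above.
function decide(model: string, balanceUsd: number): Decision {
  if (balanceUsd < MIN_BALANCE_USD) return { kind: "payment_required", status: 402 };
  const route = resolveRoute(model);
  if (!route) return { kind: "unknown_model" };
  return { kind: "forward", route };
}
```

For example, `decide("gpt-5", 0.01)` yields the `payment_required` branch, while `decide("claude-sonnet-4-5", 1)` forwards to `api.anthropic.com`. How the real router reports an unrecognized model is not stated in this document, so the sketch leaves that branch without a status code.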

Supported Providers

Provider   Endpoint                            API Format                     Features
OpenAI     /v1/responses                       OpenAI Responses API           Reasoning, tool calling, vision, streaming, background mode (Pro models)
Anthropic  /v1/messages                        Anthropic Messages API         Prompt caching, extended thinking, tool calling, vision, streaming
Google     /v1/models/{model}:generateContent  Google Generative AI (native)  Tool calling, reasoning, streaming
Google     /v1/chat/completions                OpenAI-compatible              Tool calling, reasoning, streaming
Moonshot   /v1/chat/completions                OpenAI-compatible              Tool calling, reasoning, streaming
Zai        /v1/chat/completions                OpenAI-compatible              Tool calling, reasoning, streaming
MiniMax    /v1/messages                        Anthropic Messages API         Tool calling, reasoning, streaming
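
Because bodies are passed through unchanged, the client has to send each endpoint the body shape its native API expects. The sketch below shows a minimal single-prompt body for each API format in the table. The helper name `buildRequest` and the `ApiFormat` labels are made up for illustration, and the `max_tokens` value of 1024 is an arbitrary placeholder.

```typescript
// Hypothetical helper: minimal request bodies for each API format listed above.
type ApiFormat =
  | "openai-responses"
  | "anthropic-messages"
  | "google-native"
  | "openai-compatible";

type BuiltRequest = { path: string; body: Record<string, unknown> };

function buildRequest(format: ApiFormat, model: string, prompt: string): BuiltRequest {
  switch (format) {
    case "openai-responses":
      // OpenAI Responses API: the prompt goes in `input`.
      return { path: "/v1/responses", body: { model, input: prompt } };
    case "anthropic-messages":
      // Anthropic Messages API: `max_tokens` is required (1024 is a placeholder).
      return {
        path: "/v1/messages",
        body: { model, max_tokens: 1024, messages: [{ role: "user", content: prompt }] },
      };
    case "google-native":
      // Google generateContent: the model is part of the path, not the body.
      return {
        path: `/v1/models/${model}:generateContent`,
        body: { contents: [{ role: "user", parts: [{ text: prompt }] }] },
      };
    case "openai-compatible":
      // OpenAI-style chat completions, used for Moonshot, Zai, and Google (compat).
      return {
        path: "/v1/chat/completions",
        body: { model, messages: [{ role: "user", content: prompt }] },
      };
  }
}
```

Authentication headers are deliberately omitted: the wallet-signature scheme is not specified in this section.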

Available Models

OpenAI

Model                Input $/1M  Output $/1M  Cache Read  Context  Max Output
gpt-5                $1.25       $10.00       $0.125      400K     128K
gpt-5-chat-latest    $1.25       $10.00       -           400K     128K
gpt-5-codex          $1.25       $10.00       $0.125      400K     128K
gpt-5-mini           $0.25       $2.00        $0.025      400K     128K
gpt-5-nano           $0.05       $0.40        $0.005      400K     128K
gpt-5-pro            $15.00      $120.00      -           400K     272K
gpt-5.1              $1.25       $10.00       $0.13       400K     128K
gpt-5.1-chat-latest  $1.25       $10.00       $0.125      128K     16K
gpt-5.1-codex        $1.25       $10.00       $0.125      400K     128K
gpt-5.1-codex-max    $1.25       $10.00       $0.125      400K     128K
gpt-5.1-codex-mini   $0.25       $2.00        $0.025      400K     128K
gpt-5.2              $1.75       $14.00       $0.175      400K     128K
gpt-5.2-chat-latest  $1.75       $14.00       $0.175      128K     16K
gpt-5.2-codex        $1.75       $14.00       $0.175      400K     128K
gpt-5.2-pro          $21.00      $168.00      -           400K     128K
gpt-5.3-codex        $1.75       $14.00       $0.175      400K     128K
gpt-5.3-codex-spark  $1.75       $14.00       $0.175      128K     32K

Anthropic

Model                       Input $/1M  Output $/1M  Cache Read  Cache Write  Context  Max Output
claude-sonnet-4-6           $3.00       $15.00       $0.30       $3.75        200K     64K
claude-sonnet-4-5           $3.00       $15.00       $0.30       $3.75        200K     64K
claude-sonnet-4-5-20250929  $3.00       $15.00       $0.30       $3.75        200K     64K
claude-sonnet-4-0           $3.00       $15.00       $0.30       $3.75        200K     64K
claude-sonnet-4-20250514    $3.00       $15.00       $0.30       $3.75        200K     64K
claude-opus-4-6             $5.00       $25.00       $0.50       $6.25        200K     128K
claude-opus-4-5             $5.00       $25.00       $0.50       $6.25        200K     64K
claude-opus-4-5-20251101    $5.00       $25.00       $0.50       $6.25        200K     64K
claude-opus-4-1             $15.00      $75.00       $1.50       $18.75       200K     32K
claude-opus-4-1-20250805    $15.00      $75.00       $1.50       $18.75       200K     32K
claude-opus-4-0             $15.00      $75.00       $1.50       $18.75       200K     32K
claude-opus-4-20250514      $15.00      $75.00       $1.50       $18.75       200K     32K
claude-haiku-4-5            $1.00       $5.00        $0.10       $1.25        200K     64K
claude-haiku-4-5-20251001   $1.00       $5.00        $0.10       $1.25        200K     64K
claude-3-5-haiku-latest     $0.80       $4.00        $0.08       $1.00        200K     8K
claude-3-5-haiku-20241022   $0.80       $4.00        $0.08       $1.00        200K     8K
claude-3-5-sonnet-20241022  $3.00       $15.00       $0.30       $3.75        200K     8K
claude-3-5-sonnet-20240620  $3.00       $15.00       $0.30       $3.75        200K     8K

Google

Model                               Input $/1M  Output $/1M  Cache Read  Context  Max Output
gemini-3.1-pro-preview              $2.00       $12.00       $0.20       1M       65K
gemini-3.1-pro-preview-customtools  $2.00       $12.00       $0.20       1M       65K
gemini-3-pro-preview                $2.00       $12.00       $0.20       1M       64K
gemini-3-flash-preview              $0.50       $3.00        $0.05       1M       65K

Moonshot (Kimi)

Model                   Input $/1M  Output $/1M  Cache Read  Context  Max Output
kimi-k2.5               $0.60       $3.00        $0.10       256K     256K
kimi-k2-thinking        $0.60       $2.50        $0.15       256K     256K
kimi-k2-thinking-turbo  $1.15       $8.00        $0.15       256K     256K
kimi-k2-turbo-preview   $2.40       $10.00       $0.60       256K     256K
kimi-k2-0905-preview    $0.60       $2.50        $0.15       256K     256K
kimi-k2-0711-preview    $0.60       $2.50        $0.15       128K     16K

Zai

Model          Input $/1M  Output $/1M  Cache Read  Context  Max Output
glm-5          $1.00       $3.20        $0.20       204K     131K
glm-4.7        $0.60       $2.20        $0.11       204K     131K
glm-4.7-flash  free        free         -           200K     131K

MiniMax

Model         Input $/1M  Output $/1M  Context  Max Output
MiniMax-M2.5  $0.30       $1.20        204K     131K
MiniMax-M2.1  $0.30       $1.20        204K     131K

All prices are base provider rates per 1M tokens. Setu applies a 0.5% markup on top. Live pricing is available at GET /v1/models.
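
To make the markup concrete, here is a rough cost estimate for a single request. This is a sketch under stated assumptions: the function name `estimateCostUsd` is hypothetical, cache read and cache write tokens are ignored, and the exact rounding Setu applies when deducting from a balance is not documented here.

```typescript
// Setu's markup over base provider rates, as stated above (0.5%).
const MARKUP = 0.005;

// Base rates in USD per 1M tokens, as listed in the model tables.
type Rates = { inputPerM: number; outputPerM: number };

// Hypothetical estimator: base cost from token counts, then the markup on top.
// Cache read/write pricing is intentionally left out of this sketch.
function estimateCostUsd(rates: Rates, inputTokens: number, outputTokens: number): number {
  const base =
    (inputTokens / 1_000_000) * rates.inputPerM +
    (outputTokens / 1_000_000) * rates.outputPerM;
  return base * (1 + MARKUP);
}

// claude-sonnet-4-5 at $3.00 input and $15.00 output per 1M tokens:
// 1,000 input + 500 output tokens has a base cost of $0.0105,
// which comes to roughly $0.0105525 after the 0.5% markup.
const sonnetCost = estimateCostUsd({ inputPerM: 3.0, outputPerM: 15.0 }, 1000, 500);
console.log(sonnetCost.toFixed(7));
```

At these rates, the $0.05 minimum balance from the architecture section covers only a handful of small requests on the larger models, so top-ups are worth sizing against expected token volume.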

Environments

Environment  Network           USDC Mint         Min Top-up  Top-up Options
Development  solana-devnet     4zMMC9...TDt1v    $0.10       $0.10, $1, $5, $10
Production   solana (mainnet)  EPjFWdd5...TDt1v  $5.00       $5, $10, $25, $50
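
The environment settings above could be captured in a small config object on the client side. The sketch below is illustrative only: the type and function names are hypothetical, the mint addresses are kept truncated exactly as they appear in this document, and treating the listed top-up options as the only accepted amounts is an assumption.

```typescript
type EnvName = "development" | "production";

interface EnvConfig {
  network: string;
  usdcMint: string; // truncated in this document; substitute the full mint address
  minTopUpUsd: number;
  topUpOptionsUsd: number[];
}

// Values copied from the environments table above.
const ENVIRONMENTS: Record<EnvName, EnvConfig> = {
  development: {
    network: "solana-devnet",
    usdcMint: "4zMMC9...TDt1v",
    minTopUpUsd: 0.1,
    topUpOptionsUsd: [0.1, 1, 5, 10],
  },
  production: {
    network: "solana",
    usdcMint: "EPjFWdd5...TDt1v",
    minTopUpUsd: 5,
    topUpOptionsUsd: [5, 10, 25, 50],
  },
};

// Assumption: only the listed preset amounts are accepted as top-ups.
function isAllowedTopUp(env: EnvName, amountUsd: number): boolean {
  const cfg = ENVIRONMENTS[env];
  return amountUsd >= cfg.minTopUpUsd && cfg.topUpOptionsUsd.includes(amountUsd);
}
```

With this config, `isAllowedTopUp("development", 1)` is true while `isAllowedTopUp("production", 1)` is false, since $1 is below the $5 production minimum.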

Base URL

https://api.setu.ottocode.io

All endpoints are prefixed with /v1.
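
A small helper can keep the base URL and the /v1 prefix in one place. The sketch below builds endpoint URLs and fetches the model list from GET /v1/models. The helper names are hypothetical, the response is typed as `unknown` because its schema is not documented in this section, and whether this endpoint needs wallet-auth headers is not stated, so headers are left to the caller.

```typescript
const BASE_URL = "https://api.setu.ottocode.io";

// Build a full URL for an endpoint name such as "models" or "/models".
function endpointUrl(path: string): string {
  const clean = path.replace(/^\/+/, ""); // drop leading slashes so the prefix is applied once
  return new URL(`/v1/${clean}`, BASE_URL).toString();
}

// Fetch the live model and pricing list. The response shape is not specified
// here, so it is returned as `unknown` for the caller to validate.
async function fetchModels(headers: Record<string, string> = {}): Promise<unknown> {
  const res = await fetch(endpointUrl("models"), { headers });
  if (!res.ok) {
    throw new Error(`GET /v1/models failed with status ${res.status}`);
  }
  return res.json();
}
```

`endpointUrl("models")` resolves to `https://api.setu.ottocode.io/v1/models`.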

Client SDK

The @ottocode/ai-sdk package is the recommended way to integrate with Setu. It handles wallet auth, x402 payments, provider routing, and Anthropic prompt caching automatically.

bun add @ottocode/ai-sdk ai

See the AI SDK docs for full usage examples, or the Integration Guide for raw HTTP usage.