# Caching

Anthropic prompt caching and Setu server-side caching in `@ottocode/ai-sdk`.
## Anthropic Cache Control
By default, the SDK automatically injects `cache_control: { type: "ephemeral" }` on the first system block and the last message for Anthropic models. Cached input tokens are billed at a fraction of the base rate, saving roughly 90% on those tokens.
```ts
// Default: auto caching (1 system + 1 message breakpoint)
createSetu({ auth });

// Disable completely
createSetu({ auth, cache: { anthropicCaching: false } });

// Manual: the SDK won't inject cache_control; set it yourself in messages
createSetu({ auth, cache: { anthropicCaching: { strategy: "manual" } } });

// Custom breakpoint count and placement
createSetu({
  auth,
  cache: {
    anthropicCaching: {
      systemBreakpoints: 2, // cache first 2 system blocks
      systemPlacement: "first", // "first" | "last" | "all"
      messageBreakpoints: 3, // cache last 3 messages
      messagePlacement: "last", // "first" | "last" | "all"
    },
  },
});

// Full custom transform
createSetu({
  auth,
  cache: {
    anthropicCaching: {
      strategy: "custom",
      transform: (body) => {
        // modify body however you want
        return body;
      },
    },
  },
});
```

## Options Reference
| Option | Default | Description |
|---|---|---|
| `strategy` | `"auto"` | `"auto"`, `"manual"`, `"custom"`, or `false` |
| `systemBreakpoints` | `1` | Number of system blocks to cache |
| `messageBreakpoints` | `1` | Number of messages to cache |
| `systemPlacement` | `"first"` | Which system blocks to cache: `"first"`, `"last"`, or `"all"` |
| `messagePlacement` | `"last"` | Which messages to cache: `"first"`, `"last"`, or `"all"` |
| `cacheType` | `"ephemeral"` | The `cache_control.type` value |
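The default `"auto"` strategy can be pictured as a small transform over the outgoing Anthropic-style request body. Below is a minimal illustrative sketch with assumed body shapes; it is not the SDK's actual implementation:

```typescript
// Assumed, simplified shapes for an Anthropic-style request body.
type Block = { type: string; text: string; cache_control?: { type: string } };

interface RequestBody {
  system?: Block[];
  messages: { role: string; content: Block[] }[];
}

// Sketch of the default "auto" strategy: one breakpoint on the first
// system block, one on the last content block of the last message.
function applyAutoCaching(body: RequestBody): RequestBody {
  if (body.system && body.system.length > 0) {
    body.system[0].cache_control = { type: "ephemeral" };
  }
  const last = body.messages[body.messages.length - 1];
  if (last && last.content.length > 0) {
    last.content[last.content.length - 1].cache_control = { type: "ephemeral" };
  }
  return body;
}
```

With `strategy: "manual"` you would place these same `cache_control` markers yourself instead of relying on a transform like this.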
## Setu Server-Side Caching
Provider-agnostic caching at the Setu proxy layer:
```ts
createSetu({
  auth,
  cache: {
    promptCacheKey: "my-session-123",
    promptCacheRetention: "in_memory", // or "24h"
  },
});
```

## OpenAI / Google
- OpenAI: Automatic server-side prefix caching; no configuration needed
- Google: Requires pre-uploaded `cachedContent` at the application level
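For Google, the cached content must be created ahead of time and referenced by name; it is not configured through the Setu cache options. A hedged sketch using the `GoogleAICacheManager` from the `@google/generative-ai` server SDK (names from that SDK, not from `@ottocode/ai-sdk`; adapt to your SDK version):

```typescript
// Sketch only: upload a large shared prefix once, then reference it
// as cachedContent on later generate calls.
import { GoogleAICacheManager } from "@google/generative-ai/server";

const cacheManager = new GoogleAICacheManager(process.env.GOOGLE_API_KEY!);

const cached = await cacheManager.create({
  model: "models/gemini-1.5-flash-001",
  contents: [{ role: "user", parts: [{ text: "<large shared context>" }] }],
  ttlSeconds: 3600, // keep the cache alive for one hour
});

// cached.name identifies the uploaded content; pass it as the
// cachedContent reference when constructing subsequent requests.
```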