
# Caching

promptcmd includes built-in response caching to reduce API costs and improve response times for identical requests. When caching is enabled, identical prompts with the same parameters return cached responses instead of making new API calls.

## How Caching Works

When you run a prompt:

1. promptcmd generates a cache key based on (see the sketch after this list):
   - Prompt content
   - Model configuration
   - Input parameters
   - STDIN content
2. If a cached response exists and hasn't expired, it is returned immediately.
3. Otherwise, the request is sent to the AI provider and the response is cached.
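
To make this flow concrete, here is a minimal Python sketch of the steps above. It is illustrative only: the function names, the in-memory store, and the SHA-256-over-JSON key derivation are assumptions, not promptcmd's actual implementation.

```python
import hashlib
import json
import time

# Hypothetical in-memory store: key -> (stored_at, response).
_cache: dict[str, tuple[float, str]] = {}

def cache_key(prompt: str, model_config: dict, params: dict, stdin: str) -> str:
    # Serialize every input that affects the response, with stable key
    # ordering so identical requests always produce the same hash.
    payload = json.dumps(
        {"prompt": prompt, "model": model_config, "params": params, "stdin": stdin},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_response(key: str, ttl: int) -> str | None:
    if ttl <= 0:
        return None  # a TTL of 0 disables caching entirely
    entry = _cache.get(key)
    if entry is not None and time.time() - entry[0] < ttl:
        return entry[1]  # fresh hit: reuse the stored response
    return None  # miss or expired: send the request, then store the result

def store_response(key: str, response: str) -> None:
    _cache[key] = (time.time(), response)
```

The key property is that the hash covers every input that can change the response, so two requests share a cache entry only when they are truly identical.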

## Enabling Caching

### Global Caching

Enable caching for all providers:

```toml
[providers]
cache_ttl = 60  # Cache for 60 seconds
```

Set `cache_ttl` to `0` to disable caching globally:

```toml
[providers]
cache_ttl = 0  # No caching
```

### Provider-Specific Caching

Override global settings per provider:

```toml
[providers]
cache_ttl = 30  # Global: 30 seconds

[providers.anthropic]
cache_ttl = 60  # Anthropic: 60 seconds

[providers.openai]
cache_ttl = 15  # OpenAI: 15 seconds

[providers.ollama]
cache_ttl = 0  # Ollama: No caching
```

### Variant-Specific Caching

Fine-tune caching per variant:

```toml
[providers]
cache_ttl = 20

[providers.anthropic]
cache_ttl = 40

[providers.anthropic.frequently-used]
cache_ttl = 120  # Cache for 2 minutes

[providers.anthropic.always-fresh]
cache_ttl = 0  # Never cache
```
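
These three levels compose by specificity: a variant setting overrides its provider's setting, which in turn overrides the global `[providers]` value. The hypothetical resolver below models that order against the config above; the function name, the nested-dict shape, and the fallback to `0` when nothing is set are illustrative assumptions, not promptcmd's actual code.

```python
def effective_cache_ttl(config: dict, provider: str, variant: str | None = None) -> int:
    # Most specific setting wins: variant, then provider, then global.
    providers = config.get("providers", {})
    ttl = providers.get("cache_ttl", 0)  # assumed default when unset
    prov = providers.get(provider, {})
    ttl = prov.get("cache_ttl", ttl)
    if variant is not None:
        ttl = prov.get(variant, {}).get("cache_ttl", ttl)
    return ttl

# The variant-specific example above, as nested dicts:
config = {
    "providers": {
        "cache_ttl": 20,
        "anthropic": {
            "cache_ttl": 40,
            "frequently-used": {"cache_ttl": 120},
            "always-fresh": {"cache_ttl": 0},
        },
    },
}

assert effective_cache_ttl(config, "anthropic", "frequently-used") == 120
assert effective_cache_ttl(config, "anthropic", "always-fresh") == 0
assert effective_cache_ttl(config, "anthropic") == 40
assert effective_cache_ttl(config, "openai") == 20  # falls back to the global value
```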