Groups & Load Balancing

Load balancing allows you to distribute load across multiple AI providers. This helps optimize costs and avoid rate limits.

INFO

Distribution is based on the total amount of tokens consumed by referenced models in the group.
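The exact selection algorithm is not documented here. As a rough illustration only, the following sketch assumes the balancer picks the model whose consumed tokens are lowest relative to its weight. The function name `pick_model` and the fixed request size are hypothetical, not part of the tool.

```python
def pick_model(weights, tokens_used):
    """Return the model with the lowest token usage relative to its weight.

    Assumption: this is one plausible reading of token-based distribution,
    not the tool's confirmed algorithm.
    """
    return min(weights, key=lambda name: tokens_used.get(name, 0) / weights[name])


# Hypothetical group: anthropic should receive twice the load of openai.
weights = {"openai": 1, "anthropic": 2}
tokens_used = {"openai": 0, "anthropic": 0}

# Simulate six requests that each consume 100 tokens.
for request_tokens in [100] * 6:
    model = pick_model(weights, tokens_used)
    tokens_used[model] += request_tokens

print(tokens_used)
```

Under these assumptions the consumed tokens settle into the same ratio as the weights, which is the behavior a token-based balancer aims for regardless of how many requests each model handled.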

Defining Load Balancing Groups

Define groups in your config.toml; they currently cannot be configured anywhere else. The syntax has a compact form and an expanded form:

Compact form:

toml

[groups.{group-name}]
models = [model1, model2, ...]

Expanded form:

toml

[groups.{group-name}]
models = [
  { name = "openai", weight = 1 },
  { name = "anthropic", weight = 2 }
]

INFO

The referenced model names are resolved to full model names in the same way as in Model Resolution. Their respective configurations are likewise resolved according to the Config Resolution rules.
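The relationship between the compact and expanded forms can be sketched in Python. This assumes that a bare model name in the compact form carries an implicit weight of 1, which the equal-distribution behavior suggests but the text does not state outright. `normalize_models` is a hypothetical helper, not part of the tool.

```python
def normalize_models(models):
    """Convert compact entries (plain names) into expanded name/weight entries."""
    normalized = []
    for entry in models:
        if isinstance(entry, str):
            # Assumption: a bare name carries an implicit weight of 1.
            normalized.append({"name": entry, "weight": 1})
        else:
            # Expanded entries keep their weight; a missing weight defaults to 1.
            normalized.append({"name": entry["name"], "weight": entry.get("weight", 1)})
    return normalized


compact = ["anthropic", "openai"]
expanded = [
    {"name": "openai", "weight": 1},
    {"name": "anthropic", "weight": 2},
]

print(normalize_models(compact))
print(normalize_models(expanded))
```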

Example: Equal Distribution

The following defines a group called balanced that evenly distributes load across anthropic and openai.

toml

[groups.balanced]
models = ["anthropic", "openai"]

Multiple Providers

toml

[groups.all-providers]
models = ["anthropic", "openai", "google"]

Each provider receives approximately 33% of the load.

Example: Weighted Distribution

Assign weights to send a larger share of the load to specific providers:

toml

[groups.weighted]
models = [
  { name = "openai", weight = 1 },
  { name = "anthropic", weight = 2 }
]

In this example:

anthropic: approximately 67% of the load (weight 2 out of 3 total)
openai: approximately 33% of the load (weight 1 out of 3 total)
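The arithmetic behind these percentages is each weight divided by the sum of all weights in the group. A minimal sketch, where `load_shares` is a hypothetical helper used only for illustration:

```python
def load_shares(weights):
    """Return each model's expected fraction of the load: weight / total weight."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# The weighted group from the example above.
weighted = load_shares({"openai": 1, "anthropic": 2})

# An equal-weight group of three providers, each with an assumed weight of 1.
equal = load_shares({"anthropic": 1, "openai": 1, "google": 1})

for name, share in weighted.items():
    print(f"{name}: {share:.1%}")
```

The same formula gives one third per provider for a three-provider group with equal weights.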

Usage of Groups

A group can be referenced by using its name as the model name wherever a model name can be configured.

For example in a prompt file:

yaml

---
model: balanced
---
Your prompt here
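Conceptually, a configured model name that matches a group name stands for that group's members instead of a single model. The sketch below illustrates that lookup under stated assumptions: `candidate_models` and the in-memory `groups` dictionary are hypothetical and do not reflect the tool's internals.

```python
def candidate_models(model_name, groups):
    """Return the group's member names if model_name is a group, else the name itself."""
    if model_name in groups:
        return [
            entry["name"] if isinstance(entry, dict) else entry
            for entry in groups[model_name]
        ]
    # Not a group: treat the value as an ordinary model name.
    return [model_name]


# Hypothetical in-memory view of the [groups.*] tables from config.toml.
groups = {
    "balanced": ["anthropic", "openai"],
    "weighted": [
        {"name": "openai", "weight": 1},
        {"name": "anthropic", "weight": 2},
    ],
}

print(candidate_models("balanced", groups))
print(candidate_models("openai", groups))
```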
