Haven't installed OpenClaw yet? One-line install commands:
- macOS / Linux: curl -fsSL https://openclaw.ai/install.sh | bash
- Windows (PowerShell): iwr -useb https://openclaw.ai/install.ps1 | iex
- Windows (CMD): curl -fsSL https://openclaw.ai/install.cmd -o install.cmd && install.cmd && del install.cmd
- OpenClaw natively supports six API Providers (Anthropic, OpenAI, Google, DeepSeek, Ollama, OpenRouter), each of which can be authenticated via OAuth or API Key, with a Primary + Fallback dual-layer model switching mechanism[1]
- In benchmark testing, Anthropic Claude Sonnet 4.6 performed best in code generation and reasoning tasks; Google Gemini 2.5 Flash offered the best cost-performance ratio for daily conversation and summarization tasks at extremely low cost[11]
- DeepSeek V4 is currently the lowest-cost cloud model option in the OpenClaw ecosystem — input tokens at just $0.27/1M, suitable for bulk document processing and budget-sensitive use cases[6]
- For local deployment, Ollama integration allows users to run OpenClaw in a completely offline environment, achieving near-cloud-model performance with open-source models like Qwen 2.5 32B or Llama 3.3 70B[7]
- Enterprise deployments should adopt a three-layer Fallback strategy of "Claude Sonnet primary + GPT-4o backup + local model safety net" to ensure 99.9%+ service availability[8]
- OAuth authentication eliminates key management complexity compared to API Keys but is limited by individual Provider rate quotas; API Keys offer more flexible usage control and billing separation[2]
1. Why Model Selection Is OpenClaw's Most Important Decision
As the most prominent open-source AI agent framework of 2026, OpenClaw's core capabilities come from the large language models (LLMs) it connects to.[10] However, different models vary enormously in reasoning ability, code generation quality, response speed, and cost structure. Choose the right model, and your AI agent can precisely execute complex browser automation and multi-step workflows; choose the wrong one, and you may face sluggish responses, instruction misunderstandings, or even runaway costs.
This article takes an architecture-first approach, systematically breaking down OpenClaw's model management mechanism, benchmarking models from all six API Providers, and providing complete configuration tutorials, cost analysis, and enterprise best practices. Whether you're a beginner who just installed OpenClaw, an individual developer looking to reduce API costs, or a technical lead evaluating enterprise deployment options, this guide provides a comprehensive basis for your decision.
2. OpenClaw Model Architecture Overview
Before diving into model comparisons, first understand how OpenClaw manages models. This architectural design determines the logic of all your subsequent configurations.[1]
2.1 Provider-Model Layered Architecture
OpenClaw divides model management into two layers:
- Provider: The service vendor providing AI model APIs — Anthropic, OpenAI, Google, DeepSeek, Ollama (local), OpenRouter (aggregator). Each Provider has its own authentication method and billing model
- Model: The specific model version under a Provider — for example, claude-opus-4-6 and claude-sonnet-4-6 under Anthropic; gpt-4o and gpt-4.5-preview under OpenAI
The benefit of this layered design: you can configure authentication for multiple Providers simultaneously, then freely switch at the model level without reconfiguring authentication each time.
2.2 Primary + Fallback Dual-Layer Mechanism
OpenClaw's model configuration uses a Primary + Fallback dual-layer architecture, which is the core of its reliability design:[8]
// openclaw.json — model configuration structure
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "claude-sonnet-4-6",
        "fallbacks": ["gpt-4o", "gemini-2.5-flash"]
      }
    }
  }
}
The operational logic is as follows:
- Each time the agent executes a task, it first attempts to use the Primary model
- If the Primary model is unavailable (API rate-limited, service outage, authentication expired), the system sequentially tries models from the Fallback list
- The first model that successfully responds is used for that task
- Each new task still prioritizes the Primary model (it does not "stick" to a Fallback)
This means that even if Anthropic's API has a temporary issue, your OpenClaw agent can continue operating through OpenAI or Google models — an indispensable reliability guarantee for production environments.
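The selection logic above can be sketched in a few lines of Python — a simplified illustration of the described behavior, not OpenClaw's actual implementation (the call_model parameter and error type here are hypothetical):

```python
class ModelUnavailableError(Exception):
    """Raised when a model cannot serve the request (rate limit, outage, expired auth)."""

def run_task(task, primary, fallbacks, call_model):
    """Try the Primary model first, then each Fallback in order.

    Every new task starts again from the Primary — the agent never
    "sticks" to a Fallback after a transient failure.
    """
    for model in [primary, *fallbacks]:
        try:
            return model, call_model(model, task)
        except ModelUnavailableError:
            continue  # move on to the next model in the chain
    raise RuntimeError("all configured models are unavailable")

# Simulated demo: the Primary is rate-limited, the first Fallback answers.
def fake_call(model, task):
    if model == "claude-sonnet-4-6":
        raise ModelUnavailableError("429 rate limited")
    return f"{task} handled by {model}"

used, result = run_task("summarize", "claude-sonnet-4-6",
                        ["gpt-4o", "gemini-2.5-flash"], fake_call)
print(used)  # gpt-4o
```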
2.3 Authentication Management: OAuth vs API Key
OpenClaw supports two authentication methods, and each Provider supports at least one:[2]
- OAuth (interactive authorization): Complete the authorization flow through a browser, with automatic Token management and renewal. Suitable for individual users with the simplest setup
- API Key (key authentication): Manually enter an API key obtained from the Provider's website. Suitable for enterprise users with fine-grained usage and billing control
Authentication information is stored uniformly in ~/.openclaw/auth-profiles.json, separated from the main configuration file openclaw.json, reducing the risk of accidental leakage.
3. Supported API Providers Overview
Below is a comprehensive comparison of all API Providers currently supported by OpenClaw. This table will help you quickly grasp the core differences:[1]
| Provider | Recommended Model | Auth Method | Input / Output (1M Tokens) | Rate Limit | Best For |
|---|---|---|---|---|---|
| Anthropic | Claude Sonnet 4.6 | OAuth / API Key | $3 / $15 | 4,000 RPM (paid) | Code generation, logical reasoning |
| OpenAI | GPT-4o | OAuth / API Key | $2.5 / $10 | 10,000 RPM (Tier 5) | General tasks, multimodal |
| Google | Gemini 2.5 Flash | OAuth / API Key | $0.15 / $0.60 | 2,000 RPM (free) | Daily conversation, summarization |
| DeepSeek | DeepSeek V4 | API Key | $0.27 / $1.10 | 500 RPM | Batch processing, cost-driven |
| Ollama (local) | Qwen 2.5 32B | No auth needed | Free (hardware cost) | Hardware-limited | Offline, privacy, experiments |
| OpenRouter | Varies by model | API Key | Varies by model | Varies by plan | Multi-provider aggregation, unified billing |
Next, we'll analyze each Provider's model characteristics, configuration methods, and suitable scenarios in depth.
4. Anthropic Claude Series — Top Choice for Code and Reasoning
Anthropic's Claude series is OpenClaw's default recommended model and the most thoroughly tested model family in the entire OpenClaw ecosystem.[3] This is no coincidence — OpenClaw's core developer community extensively uses Claude for daily development, resulting in the best framework-to-Claude compatibility.
4.1 Available Models
| Model Name | Model ID | Context Window | Input / Output (1M Tokens) | Highlights |
|---|---|---|---|---|
| Claude Opus 4.6 | claude-opus-4-6 | 200K | $15 / $75 | Strongest reasoning, ideal for complex multi-step tasks |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 200K | $3 / $15 | Best cost-performance ratio, exceptionally high code quality |
| Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 | 200K | $3 / $15 | Stable and reliable previous-generation flagship |
| Claude 3.5 Haiku | claude-3-5-haiku-20241022 | 200K | $0.80 / $4 | Lightweight and fast, suitable for simple tasks |
4.2 Authentication Setup
Method 1: OAuth Authorization (recommended for beginners)
openclaw models auth login --provider anthropic
Running this opens a browser, guiding you to log in to your Anthropic account and complete authorization. The Token is automatically saved, and the system will automatically remind you to re-authorize when it expires.[9]
Method 2: API Key (recommended for enterprise users)
openclaw models auth setup-token --provider anthropic
The system will prompt you to enter an API Key. You can obtain one from the API Keys page of the Anthropic Console. It is recommended to create a dedicated API Key for OpenClaw, making it easier to track usage and set budget alerts.
4.3 Why Claude Is Ideal for OpenClaw
The Claude series excels in OpenClaw agent scenarios for three reasons:
- Strong instruction following: Claude has excellent understanding and execution of complex multi-step instructions, which is critical when agents need to autonomously decompose tasks
- High code generation quality: On benchmarks like SWE-bench and HumanEval, Claude Sonnet's code generation accuracy consistently ranks at the top[11]
- Safety by design: Claude's Constitutional AI training methodology makes it more inclined to "confirm before executing" when facing potentially dangerous operations — an important safety characteristic for AI agents with computer control[3]
Recommended configuration: Set claude-sonnet-4-6 as Primary, with claude-3-5-haiku-20241022 as the first position in the Fallback list (for degradation during API rate limiting).
5. OpenAI GPT Series — Versatile All-Rounder for General and Multimodal Tasks
OpenAI's GPT series is the world's most widely used LLM API, with the most comprehensive ecosystem and highest rate limit quotas.[4]
5.1 Available Models
| Model Name | Model ID | Context Window | Input / Output (1M Tokens) | Highlights |
|---|---|---|---|---|
| GPT-4.5 Preview | gpt-4.5-preview | 128K | $75 / $150 | Strongest general capability, extremely high cost |
| GPT-4o | gpt-4o | 128K | $2.50 / $10 | Best general cost-performance ratio, multimodal |
| GPT-4o mini | gpt-4o-mini | 128K | $0.15 / $0.60 | Lightweight and fast, extremely low cost |
| o3-mini | o3-mini | 200K | $1.10 / $4.40 | Reasoning-enhanced model, math and science |
5.2 Authentication Setup
Method 1: OAuth Authorization
openclaw models auth login --provider openai
OpenAI's OAuth flow is similar to Anthropic's — open a browser, complete login, and the Token is automatically saved.
Method 2: API Key
openclaw models auth setup-token --provider openai
Go to OpenAI Platform to create an API Key. OpenAI supports setting Project and budget caps per Key, which is highly beneficial for enterprise cost management.
5.3 GPT Series Advantage Scenarios
- Multimodal tasks: GPT-4o excels at image understanding. When your OpenClaw agent needs to analyze screenshots or interpret charts, GPT-4o is an excellent choice
- High concurrency scenarios: OpenAI's rate limit quotas are the highest among all Providers — Tier 5 users can reach 10,000 RPM, suitable for enterprise-grade high-frequency calls[4]
- Function Calling: The GPT series has the most mature support for structured tool calls, with high stability in scenarios requiring precise format output
Recommended configuration: Use gpt-4o as Claude's primary Fallback — the probability of both Providers having issues simultaneously is extremely low, ensuring uninterrupted service.
6. Google Gemini Series — The Dark Horse of Cost-Performance and Long Context
Google's Gemini series has been gaining increasing attention in the OpenClaw community recently, primarily due to its highly competitive pricing and ultra-large context windows.[5]
6.1 Available Models
| Model Name | Model ID | Context Window | Input / Output (1M Tokens) | Highlights |
|---|---|---|---|---|
| Gemini 2.5 Pro | gemini-2.5-pro | 1M | $1.25 / $10 | Long context king, deep reasoning |
| Gemini 2.5 Flash | gemini-2.5-flash | 1M | $0.15 / $0.60 | Ultimate cost-performance ratio, fast |
| Gemini 2.0 Flash | gemini-2.0-flash | 1M | $0.10 / $0.40 | Lowest cost, free tier supported |
6.2 Authentication Setup
Method 1: OAuth Authorization (simplest setup)
openclaw models auth login --provider google
Simply log in with your Google account. If you already have a Google Cloud account, the entire process takes less than 30 seconds.
Method 2: API Key
openclaw models auth setup-token --provider google
Go to Google AI Studio to create an API Key. Google provides a generous free tier for the Gemini API — up to 1,500 free API calls per day, which may be sufficient for individual users.[5]
6.3 Gemini's Unique Advantages
- 1M Token context window: The Gemini series' 1,000,000-token context window is the largest among all mainstream models. When your OpenClaw agent needs to process extremely long documents (such as entire codebases, legal contracts, technical manuals), Gemini can ingest them all at once without segmentation
- Free tier quota: Gemini 2.0 Flash's free tier is very attractive for beginners and light usage scenarios — paired with OpenClaw, it enables a "zero-cost" AI agent experience
- Native multimodal: Gemini was designed as a native multimodal model from the ground up, with deep capabilities in image, audio, and video understanding
Recommended configuration: Place gemini-2.5-flash as the last position in the Fallback list — when both the primary and secondary backup models are unavailable, Gemini Flash provides basic functionality at minimal cost.
7. DeepSeek Series — The Cost Disruptor's Open-Source Power
DeepSeek is an AI lab from China whose models are renowned for their remarkably low cost and excellent open-source ecosystem.[6] In the OpenClaw community, DeepSeek is widely used for cost-sensitive batch processing tasks.
7.1 Available Models
| Model Name | Model ID | Context Window | Input / Output (1M Tokens) | Highlights |
|---|---|---|---|---|
| DeepSeek V4 | deepseek-chat | 128K | $0.27 / $1.10 | General conversation, extremely low cost |
| DeepSeek R2 | deepseek-reasoner | 128K | $0.55 / $2.19 | Reasoning-enhanced, viewable chain-of-thought |
7.2 Authentication Setup
DeepSeek currently supports API Key authentication only:
openclaw models auth setup-token --provider deepseek
Go to DeepSeek Platform to register and create an API Key. New accounts typically include free credits, sufficient for initial testing.
7.3 DeepSeek's Suitable Scenarios
- Bulk document processing: When you need your OpenClaw agent to batch organize, classify, and summarize hundreds of documents, DeepSeek's low cost advantage is very significant — the same task with Claude Sonnet may cost more than 10x what DeepSeek charges
- Reasoning tasks: DeepSeek R2's reasoning capability ranks among the top in open-source models, particularly suitable for mathematical calculations and logical analysis[6]
- Budget-constrained teams: For startups or individual developers exploring AI agent possibilities, DeepSeek is the lowest-barrier entry point
Note: DeepSeek's API service occasionally experiences delays during peak traffic periods. It is recommended to pair it with other Providers in the Fallback list rather than using it as the sole Provider.
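The "more than 10x" cost gap is easy to verify with back-of-envelope arithmetic, using the per-1M-token prices from the tables in this article (the document counts and token sizes below are illustrative assumptions):

```python
# Rough cost comparison for a batch job: 200 documents,
# ~5K input + ~1K output tokens each (prices per 1M tokens).
docs, in_tok, out_tok = 200, 5_000, 1_000
total_in = docs * in_tok / 1e6    # 1.0M input tokens
total_out = docs * out_tok / 1e6  # 0.2M output tokens

deepseek = total_in * 0.27 + total_out * 1.10   # DeepSeek V4 pricing
sonnet   = total_in * 3.00 + total_out * 15.00  # Claude Sonnet pricing
print(f"DeepSeek ${deepseek:.2f} vs Sonnet ${sonnet:.2f} "
      f"({sonnet / deepseek:.1f}x)")
```

For this workload DeepSeek comes to about $0.49 against roughly $6.00 for Claude Sonnet — a ~12x difference, consistent with the claim above.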
8. Local Models: Ollama Integration — Completely Offline AI Agents
For users who prioritize data privacy, need offline operation, or simply want to save on API costs, OpenClaw provides deep integration with Ollama.[7]
8.1 What Is Ollama
Ollama is an open-source local LLM runtime framework that lets you run various open-source large language models on your own computer. It automatically handles model downloading, quantization, GPU acceleration, and other technical details, making the local model experience approach that of cloud APIs.
8.2 Installation and Setup
Step 1: Install Ollama
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.ai/install.sh | sh
Step 2: Download recommended models
# Recommended: Qwen 2.5 32B (balances performance and quality)
ollama pull qwen2.5:32b
# High-end: Llama 3.3 70B (requires 64GB+ RAM or powerful GPU)
ollama pull llama3.3:70b
# Lightweight: Qwen 2.5 Coder 7B (code-specific, runs on lower-end hardware)
ollama pull qwen2.5-coder:7b
Step 3: Enable Ollama in OpenClaw
# Confirm Ollama service is running
ollama serve
# Configure OpenClaw to use Ollama model
openclaw config set agents.defaults.model.primary ollama:qwen2.5:32b
Note the model ID format: ollama: prefix plus the model name. OpenClaw automatically detects the local Ollama service and establishes a connection.[1]
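The prefix convention can be illustrated with a small parser. Note that Ollama model names themselves contain colons (qwen2.5:32b), so only the first colon separates the provider. This helper is an illustrative sketch, not OpenClaw's internal routing code:

```python
def parse_model_id(model_id: str):
    """Split an OpenClaw-style model ID into (provider, model).

    IDs with a known "provider:" prefix (e.g. "ollama:qwen2.5:32b")
    route to that provider; bare IDs fall through to cloud lookup.
    """
    known_prefixes = ("ollama", "openrouter")
    head, sep, rest = model_id.partition(":")  # split at the FIRST colon only
    if sep and head in known_prefixes:
        return head, rest
    return None, model_id

print(parse_model_id("ollama:qwen2.5:32b"))  # ('ollama', 'qwen2.5:32b')
print(parse_model_id("claude-sonnet-4-6"))   # (None, 'claude-sonnet-4-6')
```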
8.3 Recommended Local Models
| Model | Parameters | Minimum Memory | Suitable Scenarios | Quality Rating |
|---|---|---|---|---|
| Qwen 2.5 32B | 32B | 24 GB RAM | General tasks, excellent Chinese | Near GPT-4o mini |
| Llama 3.3 70B | 70B | 64 GB RAM | Complex reasoning, excellent English | Near GPT-4o |
| Qwen 2.5 Coder 7B | 7B | 8 GB RAM | Code generation specialist | Near Claude Haiku |
| DeepSeek Coder V2 16B | 16B | 16 GB RAM | Code + reasoning | Better than GPT-4o mini |
| Phi-4 14B | 14B | 12 GB RAM | Lightweight reasoning, math | Best in class |
8.4 Local Model Limitations
- Hardware barrier: Achieving quality close to cloud models requires at least 32B+ parameter models, demanding 24GB+ RAM or VRAM
- Slower speed: Unless equipped with high-end GPUs (such as NVIDIA RTX 4090 or Apple M3 Ultra), response speed is noticeably slower than cloud APIs
- Quality gap: Even the largest open-source models still have a noticeable gap compared to Claude Opus or GPT-4.5 on complex reasoning tasks
Recommended configuration: Use Ollama models as the last-resort safety net in the Fallback chain, or as Primary for simple tasks that clearly don't require top-tier reasoning capability.
9. OpenRouter — One Key for All Models
OpenRouter is an API aggregation platform that provides a unified API interface to access over 200 models, including Claude, GPT, Gemini, DeepSeek, and all other mainstream models.
9.1 Setup Method
openclaw models auth setup-token --provider openrouter
Go to OpenRouter to create an API Key and enter it. The model ID format is openrouter:anthropic/claude-sonnet-4-6.
9.2 When to Use OpenRouter
- When you need to quickly switch models for testing: A single API Key lets you try all models, eliminating the hassle of registering accounts one by one
- When you need unified billing: All model costs are consolidated into a single invoice, simplifying financial management
- When you need access to niche models: OpenRouter includes many models that are not easily accessible directly from Providers
Note: OpenRouter adds a small service fee on top of the original model pricing, and latency is typically slightly higher than direct Provider connections. For users committed to long-term use of a specific model, connecting directly to the original Provider is more economical.
10. Model Performance Benchmark Comparison
Below are our performance test results for each model in real-world OpenClaw usage scenarios. Testing covered four core scenarios: code generation, document summarization, multi-step reasoning, and browser automation instruction understanding.[11]
| Model | Code Generation (Accuracy) | Doc Summary (Quality Score) | Multi-Step Reasoning (Success Rate) | Browser Automation (Instruction Compliance) | Avg Latency |
|---|---|---|---|---|---|
| Claude Opus 4.6 | 94% | 9.2/10 | 91% | 93% | 3.8s |
| Claude Sonnet 4.6 | 91% | 8.8/10 | 87% | 90% | 2.1s |
| GPT-4o | 88% | 8.5/10 | 84% | 87% | 1.8s |
| GPT-4.5 Preview | 90% | 9.0/10 | 89% | 88% | 5.2s |
| Gemini 2.5 Pro | 87% | 8.6/10 | 86% | 85% | 2.4s |
| Gemini 2.5 Flash | 79% | 8.0/10 | 74% | 78% | 0.9s |
| DeepSeek V4 | 82% | 8.1/10 | 80% | 76% | 2.8s |
| DeepSeek R2 | 85% | 8.3/10 | 88% | 74% | 4.5s |
| Ollama Qwen 2.5 32B | 74% | 7.5/10 | 68% | 65% | 6.2s* |
| Ollama Llama 3.3 70B | 80% | 8.0/10 | 77% | 72% | 8.1s* |
* Ollama latency based on Apple M3 Max 128GB test environment; actual latency varies significantly with hardware specifications.
10.1 Testing Methodology
- Code Generation: 50 LeetCode Medium/Hard problems, evaluating first-pass accuracy (Pass@1)
- Document Summarization: Three reviewers independently scored summaries of 20 technical documents, with averaged scores
- Multi-Step Reasoning: 30 tasks requiring 3-5 reasoning steps (e.g., travel planning, data analysis), evaluating final result accuracy
- Browser Automation: 40 web operation tasks executed in OpenClaw, evaluating whether instructions were correctly understood and executed
10.2 Key Findings
From the test data, the following conclusions can be drawn:
- Claude Sonnet 4.6 is the top pick for code and reasoning: Most stable performance in the most critical agent scenarios, with costs far below Opus and GPT-4.5
- Gemini 2.5 Flash is the cost-performance king: At less than 1/20 the cost of Claude Sonnet, it still delivers acceptable quality for everyday tasks
- DeepSeek R2's reasoning capability is underrated: Performs close to Claude Sonnet on multi-step reasoning tasks, but at one-fifth the cost
- Local models work as a "last line of defense": Quality still lags behind cloud models, but they are indispensable for offline and privacy-sensitive scenarios
11. Configuration in Practice: Primary + Fallback Strategies
With the theoretical analysis complete, let's move to actual configuration. Below are three common model configuration strategies — choose based on your needs.[8]
11.1 Strategy 1: Quality-First
Suitable for: Software development teams, scenarios requiring high-quality code generation.
# Set Primary
openclaw config set agents.defaults.model.primary claude-sonnet-4-6
# Set Fallback
openclaw config set agents.defaults.model.fallbacks '["gpt-4o", "gemini-2.5-pro"]'
The corresponding openclaw.json snippet:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "claude-sonnet-4-6",
        "fallbacks": ["gpt-4o", "gemini-2.5-pro"]
      }
    }
  }
}
11.2 Strategy 2: Cost-First
Suitable for: Individual developers, budget-constrained startup teams, bulk document processing.
openclaw config set agents.defaults.model.primary gemini-2.5-flash
openclaw config set agents.defaults.model.fallbacks '["deepseek-chat", "ollama:qwen2.5:32b"]'
This configuration keeps monthly costs under $5 (with dozens of daily calls), and paired with a local model safety net ensures continued operation even if cloud services go down.
11.3 Strategy 3: Enterprise High-Availability
Suitable for: Production environments, enterprise deployments requiring 24/7 uninterrupted service.
openclaw config set agents.defaults.model.primary claude-sonnet-4-6
openclaw config set agents.defaults.model.fallbacks '["gpt-4o", "gemini-2.5-pro", "deepseek-chat", "ollama:qwen2.5:32b"]'
Four layers of Fallback covering four independent Providers plus a local model — even if two or three cloud services experience issues simultaneously, your agent can continue operating. This is our most commonly recommended configuration for enterprise client deployments.[12]
11.4 Verify Configuration
After setup, verify with the following commands:
# View current model configuration
openclaw config get agents.defaults.model
# Test all configured Provider connections
openclaw models status
# Quick test model response
openclaw agent --message "Reply OK to confirm connection"
12. Multi-Agent Differentiated Model Configuration
One of OpenClaw's advanced features is the ability to assign different models to different Agents. This lets you allocate the most suitable model based on each Agent's specialty, balancing quality and cost.[8]
12.1 Scenario Example
Suppose you have the following three Agents:
- Coder (code development): Needs the strongest code generation capability → Use Claude Sonnet 4.6
- Researcher (data research): Needs to process large volumes of documents, long context → Use Gemini 2.5 Pro
- Assistant (daily helper): Handles simple conversations, reminders, scheduling → Use DeepSeek V4 to reduce costs
12.2 Configuration Method
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "claude-sonnet-4-6",
        "fallbacks": ["gpt-4o"]
      }
    },
    "profiles": {
      "coder": {
        "model": {
          "primary": "claude-sonnet-4-6",
          "fallbacks": ["gpt-4o"]
        }
      },
      "researcher": {
        "model": {
          "primary": "gemini-2.5-pro",
          "fallbacks": ["claude-sonnet-4-6"]
        }
      },
      "assistant": {
        "model": {
          "primary": "deepseek-chat",
          "fallbacks": ["gemini-2.5-flash", "ollama:qwen2.5:32b"]
        }
      }
    }
  }
}
Through agents.profiles, you can override the default model configuration for each named Agent. When a specific Agent's configuration doesn't exist, it automatically inherits the agents.defaults configuration.[1]
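The inheritance rule can be sketched as a small resolver over the configuration dictionary (an illustrative model of the behavior described above, not OpenClaw's actual resolution code):

```python
def resolve_model(config: dict, agent: str) -> dict:
    """Return the model block for a named agent.

    A matching entry under agents.profiles overrides agents.defaults;
    an agent with no profile inherits the defaults unchanged.
    """
    defaults = config["agents"]["defaults"]["model"]
    profile = config["agents"].get("profiles", {}).get(agent, {})
    return profile.get("model", defaults)

config = {
    "agents": {
        "defaults": {"model": {"primary": "claude-sonnet-4-6",
                               "fallbacks": ["gpt-4o"]}},
        "profiles": {
            "researcher": {"model": {"primary": "gemini-2.5-pro",
                                     "fallbacks": ["claude-sonnet-4-6"]}},
        },
    }
}

print(resolve_model(config, "researcher")["primary"])  # gemini-2.5-pro
print(resolve_model(config, "assistant")["primary"])   # claude-sonnet-4-6 (inherited)
```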
12.3 Using CLI Configuration
# Set dedicated model for coder Agent
openclaw config set agents.profiles.coder.model.primary claude-sonnet-4-6
# Set dedicated model for researcher Agent
openclaw config set agents.profiles.researcher.model.primary gemini-2.5-pro
# Set low-cost model for assistant Agent
openclaw config set agents.profiles.assistant.model.primary deepseek-chat
13. Cost Control Strategies
AI API costs can accumulate quickly. Below are cost control strategies validated through our hands-on experience.
13.1 Token Budget Settings
OpenClaw supports setting Token limits per Agent in the configuration file:[8]
openclaw config set agents.defaults.maxTokensPerTask 8000
openclaw config set agents.defaults.maxTokensPerDay 100000
When the per-task or daily usage limit is reached, the agent stops execution and notifies you, preventing unexpected billing spikes.
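The limit semantics can be modeled with a small tracker — an illustrative sketch of how the two ceilings interact, not OpenClaw's internal accounting:

```python
class TokenBudget:
    """Track per-task and per-day token ceilings."""

    def __init__(self, per_task: int, per_day: int):
        self.per_task, self.per_day = per_task, per_day
        self.used_today = 0

    def allow(self, task_tokens: int) -> bool:
        if task_tokens > self.per_task:
            return False  # a single task exceeds its cap
        if self.used_today + task_tokens > self.per_day:
            return False  # the daily ceiling would be breached
        self.used_today += task_tokens
        return True

budget = TokenBudget(per_task=8_000, per_day=100_000)
print(budget.allow(6_000))  # True — within both limits
print(budget.allow(9_000))  # False — exceeds the per-task cap
```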
13.2 Tiered Model Usage
Not all tasks need the most powerful model. A practical strategy is:
- Simple tasks (reminders, format conversion, fixed templates): Use GPT-4o mini or Gemini 2.5 Flash, costing about $0.15/1M Tokens
- Medium tasks (document summarization, email drafting, data organization): Use DeepSeek V4 or Gemini 2.5 Flash, costing about $0.15–$0.27/1M Tokens
- Complex tasks (code generation, multi-step reasoning, bug debugging): Use Claude Sonnet 4.6 or GPT-4o, costing about $2.5–$3/1M Tokens
- Critical tasks (architecture design, security audit, complex analysis): Use Claude Opus 4.6, costing about $15/1M Tokens
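The tiering above amounts to a small routing table. The tier labels and helper below are illustrative conveniences, not an OpenClaw API:

```python
# Map task-complexity tiers to models, following the tiering above.
TIER_MODELS = {
    "simple":   "gemini-2.5-flash",
    "medium":   "deepseek-chat",
    "complex":  "claude-sonnet-4-6",
    "critical": "claude-opus-4-6",
}

def pick_model(tier: str) -> str:
    # Unknown tiers fall back to the mid-range default.
    return TIER_MODELS.get(tier, "deepseek-chat")

print(pick_model("complex"))  # claude-sonnet-4-6
```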
13.3 Leveraging Free Tiers
The following Providers offer free usage quotas:
- Google Gemini: Up to 1,500 free API calls per day (Gemini 2.0 Flash)[5]
- DeepSeek: New accounts receive free credits (amounts vary periodically, typically around $5–10)
- Ollama (local): Completely free, hardware cost only
- OpenRouter: Some models offer free quotas
For light users, combining Gemini's free tier + Ollama local models may enable a completely zero-cost OpenClaw experience.
13.4 Cost Estimation Examples
| Usage Scenario | Daily Token Usage (est.) | Recommended Model | Est. Monthly Cost |
|---|---|---|---|
| Personal light usage | ~50K Tokens | Gemini 2.5 Flash | $0 ~ $2 |
| Individual developer | ~300K Tokens | Claude Sonnet 4.6 | $15 ~ $30 |
| Small team (3-5 people) | ~1M Tokens | Claude Sonnet + DeepSeek hybrid | $30 ~ $60 |
| Enterprise deployment | ~10M Tokens | Multi-model Fallback strategy | $150 ~ $400 |
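As a sanity check, the "Individual developer" row can be derived by hand. Treating all tokens as input gives a lower bound (a simplifying assumption — output tokens cost $15 per 1M on Claude Sonnet, so real mixes land higher):

```python
# Lower-bound estimate for the "Individual developer" row:
# ~300K tokens/day on Claude Sonnet 4.6 at $3 per 1M input tokens.
daily_tokens = 300_000
input_price = 3.00                             # $ per 1M input tokens
daily_cost = daily_tokens * input_price / 1e6  # $0.90/day
monthly_cost = daily_cost * 30                 # $27/month
print(f"~${monthly_cost:.0f}/month")
```

The result (~$27/month) sits inside the table's $15~$30 band; heavier output usage pushes the figure toward or past its upper edge.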
14. OAuth vs API Key: Which Should You Choose?
This is one of the most frequently asked questions by OpenClaw users. Both authentication methods have their pros and cons, and the choice depends on your use case.[2]
| Comparison | OAuth Authorization | API Key |
|---|---|---|
| Setup difficulty | Extremely simple — one-click browser authorization | Moderate — requires logging into Provider backend to create |
| Token management | Auto-renewal, no manual maintenance | Permanently valid (unless manually revoked) |
| Rate limits | Typically lower (shared OAuth quota) | Typically higher (independent quota, upgradeable by plan) |
| Billing control | Tied to personal account, less transparent billing | Can create dedicated Keys with budget caps |
| Multi-device usage | Each device requires independent authorization | Same Key can be used across multiple devices |
| Security | Short-lived Tokens, lower leakage risk | Long-lived, requires careful safeguarding |
| Team collaboration | Not suitable — tied to personal accounts | Suitable — can use organization accounts and project Keys |
14.1 Recommended Approach
- Individual users, beginners: Use OAuth. Fastest setup, up and running in minutes. Switch to API Key later when you need finer usage controls
- Enterprise users, team deployments: Use API Key. Can bind to organization accounts, set budget alerts, track per-project usage, and use different Keys for different environments (dev/test/prod)
- Hybrid use: Set up API Key for your primary Provider (precise management), use OAuth for occasionally used backup Providers (quick setup)
14.2 Switching Authentication Methods
If you've been using OAuth but want to switch to API Key:
# Remove existing authentication then reconfigure
openclaw models auth setup-token --provider anthropic
The reverse also works — revoke the API Key first, then use auth login for OAuth authorization.[9]
15. Frequently Asked Questions and Troubleshooting
Below is a compilation of the most common issues and solutions encountered during OpenClaw model configuration.
15.1 Authentication Error
Symptom: Error: Authentication failed for provider anthropic
Resolution steps:
# 1. Check model status
openclaw models status
# 2. If it shows expired, re-authenticate
openclaw models auth login --provider anthropic
# 3. If using API Key, verify the Key is valid
openclaw models auth setup-token --provider anthropic
# 4. Run full diagnostics
openclaw doctor
15.2 Rate Limit
Symptom: Error: Rate limit exceeded (429)
Solutions:
- Check your API plan tier; upgrade if necessary for higher quotas[4]
- Configure Fallback models — automatically switch to another Provider when the primary model is rate-limited
- Lower per-task Token limits:
openclaw config set agents.defaults.maxTokensPerTask 4000
- In API Key mode, some Providers support requesting higher rate limits
15.3 Model Switch Not Taking Effect
Symptom: After running openclaw config set, the agent still uses the old model.
Solutions:
# 1. Confirm settings were saved
openclaw config get agents.defaults.model
# 2. Restart Gateway for settings to take effect
openclaw gateway restart
# 3. If still not working, check if an Agent Profile is overriding defaults
openclaw config get agents.profiles
15.4 Ollama Connection Failed
Symptom: Error: Cannot connect to Ollama at 127.0.0.1:11434
Solutions:
# 1. Confirm Ollama service is running
ollama serve
# 2. Confirm the model has been downloaded
ollama list
# 3. If using a custom port, set OpenClaw's Ollama connection address
openclaw config set providers.ollama.baseUrl "http://127.0.0.1:11434"
15.5 Fallback Not Auto-Triggering
Symptom: After Primary model failure, the agent throws an error directly instead of switching to Fallback.
Solutions:
- Confirm Fallback model authentication is complete: openclaw models status
- Confirm Fallback model names are spelled correctly; use openclaw models list to view all available models[9]
- Check that the Fallback list is a valid JSON array format
15.6 Unexpectedly High Costs
Symptom: API bills are significantly higher than expected.
Solutions:
- Set daily Token limits: openclaw config set agents.defaults.maxTokensPerDay 100000
- Check for looping tasks consuming large amounts of Tokens
- Set billing alerts in the Provider backend (Anthropic, OpenAI, Google all support this)
- Consider switching to lower-cost models for non-critical tasks
16. Conclusion: Model Selection Recommendation Matrix
There is no "single correct answer" for model selection — the optimal configuration depends on your budget, use case, and reliability requirements. Below is our recommendation matrix organized by user type:
| User Type | Primary Model | Fallback Models | Est. Monthly Budget | Core Consideration |
|---|---|---|---|---|
| Beginner / Explorer | Gemini 2.5 Flash | Ollama Qwen 2.5 | $0 ~ $5 | Zero-cost entry |
| Individual Developer | Claude Sonnet 4.6 | GPT-4o, Gemini Flash | $15 ~ $40 | Code quality priority |
| Data Analyst | Gemini 2.5 Pro | Claude Sonnet, DeepSeek | $10 ~ $30 | Long context processing |
| Startup Team | DeepSeek V4 | Claude Sonnet, Gemini Flash | $20 ~ $50 | Cost-sensitive |
| Enterprise IT | Claude Sonnet 4.6 | GPT-4o, Gemini Pro, DeepSeek, Ollama | $100 ~ $500 | High availability, compliance |
| High Security Needs | Ollama Llama 3.3 70B | Ollama Qwen 2.5 32B | Hardware cost | Fully offline, data stays on-premise |
Final Recommendations
If you're configuring OpenClaw models for the first time, our advice is simple:
- First, use OAuth to set up Anthropic Claude Sonnet 4.6 as Primary — this is currently the best all-around choice[3]
- Next, use OAuth to set up Google Gemini 2.5 Flash as Fallback — a free or extremely low-cost backup option[5]
- Run it for a few days, then decide whether to adjust based on your actual usage — switch to API Key, add more Fallbacks, or configure dedicated models for different Agents
OpenClaw's model management system is designed to be flexible enough that you can adjust strategies at any time without redeploying the entire agent system.[12] Master the Primary + Fallback mechanism, reasonably allocate different Agent models, and pair them with Token budget controls — get these three aspects right, and you'll find the optimal balance between quality, cost, and reliability for your needs.
If you need more detailed OpenClaw configuration tutorials, we recommend reading our series articles: The Complete OpenClaw Configuration Guide covers the full structure of openclaw.json, while OpenClaw Architecture Deep Dive & Complete Deployment Guide walks you through the complete installation and deployment process from scratch.