Baseten
Discover how to configure Baseten's Model APIs for use with Sypha. Gain access to leading open-source models offering enterprise-grade performance, reliability, and competitive pricing.
Baseten provides on-demand, frontier model APIs that are built for production-ready applications rather than just experimentation. Powered by the Baseten Inference Stack, these APIs offer highly optimized inference for open-source models from providers like OpenAI, DeepSeek, Moonshot AI, and Alibaba Cloud.
Website: https://www.baseten.co/products/model-apis/
Getting an API Key
- Sign Up or Log In: Visit Baseten and log in to your account.
- Go to API Keys: In your dashboard, navigate to the API Keys section.
- Create a New Key: Generate a new key and give it a recognizable name (e.g., "Sypha").
- Save the Key: Copy the generated API key immediately and store it in a secure location.
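Once you have a key, you can sanity-check it against Baseten's OpenAI-compatible API before configuring Sypha. This is a minimal sketch using only the Python standard library; the base URL `https://inference.baseten.co/v1` is an assumption here, so confirm the current endpoint in Baseten's documentation.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible base URL for Baseten Model APIs;
# verify the current endpoint in Baseten's docs before relying on it.
BASE_URL = "https://inference.baseten.co/v1"

def auth_headers(api_key: str) -> dict:
    """Build the standard Bearer-token Authorization header."""
    return {"Authorization": f"Bearer {api_key}"}

def list_models(api_key: str) -> list:
    """Fetch available model IDs from the OpenAI-compatible /models endpoint."""
    req = urllib.request.Request(f"{BASE_URL}/models", headers=auth_headers(api_key))
    with urllib.request.urlopen(req, timeout=10) as resp:
        data = json.load(resp)
    return [m["id"] for m in data.get("data", [])]

if __name__ == "__main__":
    key = os.environ.get("BASETEN_API_KEY")
    if key:
        print(list_models(key))
    else:
        print("Set BASETEN_API_KEY to try this against the live API.")
```

If the call returns a list of model slugs, the key is valid and ready to paste into Sypha.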
Configuration in Sypha
- Open Sypha Settings: Click the settings gear icon (⚙️) within the Sypha panel.
- Choose Provider: Select "Baseten" under the "API Provider" dropdown menu.
- Enter API Key: Paste the Baseten API key you created into the "Baseten API Key" field.
- Select Your Model: Choose the model you want to use from the "Model" dropdown list.
IMPORTANT: To use the moonshotai/Kimi-K2-Thinking model, you must enable Native Tool Call (Experimental) in your Sypha settings. This reasoning model requires native tool calling to function correctly, as it allows Sypha to interact with tools directly.
Supported Models
Sypha supports all models currently available through Baseten Model APIs, including the following (check Baseten's pricing page for the most accurate rates):
- `moonshotai/Kimi-K2-Thinking` (Moonshot AI) - Enhanced reasoning with step-by-step logic (262K context) - $0.60/$2.50 per 1M tokens
- `zai-org/GLM-4.6` (Z AI) - Frontier capabilities for agents, reasoning, and coding (200K context) - $0.60/$2.20 per 1M tokens
- `moonshotai/Kimi-K2-Instruct-0905` (Moonshot AI) - Enhanced capability update from September (262K context) - $0.60/$2.50 per 1M tokens
- `openai/gpt-oss-120b` (OpenAI) - 120B MoE with powerful reasoning capabilities (128K context) - $0.10/$0.50 per 1M tokens
- `Qwen/Qwen3-Coder-480B-A35B-Instruct` - Top-tier coding and reasoning (262K context) - $0.38/$1.53 per 1M tokens
- `Qwen/Qwen3-235B-A22B-Instruct-2507` - High performance in math and logic (262K context) - $0.22/$0.80 per 1M tokens
- `deepseek-ai/DeepSeek-R1` - First-generation reasoning model from DeepSeek (163K context) - $2.55/$5.95 per 1M tokens
- `deepseek-ai/DeepSeek-R1-0528` - Revised DeepSeek reasoning model (163K context) - $2.55/$5.95 per 1M tokens
- `deepseek-ai/DeepSeek-V3-0324` - Fast general-purpose model with strong reasoning (163K context) - $0.77/$0.77 per 1M tokens
- `deepseek-ai/DeepSeek-V3.1` - Hybrid reasoning with enhanced tool usage (163K context) - $0.50/$1.50 per 1M tokens
- `deepseek-ai/DeepSeek-V3.2` - Efficient long-context hybrid reasoning (163K context) - $0.30/$0.45 per 1M tokens
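Because the API is OpenAI-compatible, a chat completion request uses the familiar `messages` format with one of the model slugs above. The sketch below just builds the request body; the system prompt and parameter values are illustrative, not prescribed by Baseten.

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-style chat completion request body.

    The model slug should be one of the Baseten Model APIs slugs
    listed above (e.g. "deepseek-ai/DeepSeek-V3.1").
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("deepseek-ai/DeepSeek-V3.1", "Explain Python decorators.")
print(json.dumps(payload, indent=2))
```

POST this body to the `/chat/completions` path of Baseten's OpenAI-compatible endpoint with your Bearer token, or point the official `openai` Python client at that base URL via its `base_url` parameter.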
Production-First Architecture
Baseten's APIs are constructed specifically for enterprise production environments:
Enterprise-Grade Reliability
- 99.99% uptime backed by active-active redundancy
- Multi-cluster autoscaling for consistent performance, independent of the cloud provider
- SOC 2 Type II certified and HIPAA compliant
Optimized Performance
- Pre-optimized inference thanks to the Baseten Inference Stack
- Latest GPUs distributed across multi-cloud infrastructure
- Ultra-fast processing for enterprise workloads
Cost Efficiency
- 5-10x cheaper than comparable proprietary models
- Resource management optimized through a multi-cloud network
- Transparent billing with straightforward pricing
Developer Experience
- OpenAI-compatible API, making migration straightforward
- Drop-in replacements for popular cloud models with built-in observability
- Easy scaling to dedicated endpoints when your application grows
Special Features
Function Calling & Tool Use
Every model offered through Baseten supports advanced structured outputs and tool operations, making them a great fit for agentic development inside Sypha.
Tips and Notes
- Model Auto-Updates: Sypha will dynamically fetch model lists from Baseten, automatically exposing new open-source options as they launch.
- High Availability: Baseten's architecture ensures low-latency routing for global availability.
- Dedicated Support: Speak to Baseten about dedicated capacity setups for heavy production workloads.
Pricing Information
Baseten provides transparent and competitive rates for their models. Prices range from roughly $0.10 to $6.00 per million tokens, depending on the model. Be sure to check the Baseten Model APIs page for the most up-to-date figures.
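Since rates are quoted per million tokens with separate input and output prices, estimating a request's cost is simple arithmetic. A small sketch, using the rates listed above:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate USD cost given per-1M-token input/output rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: 200K input + 50K output tokens on deepseek-ai/DeepSeek-V3.1
# ($0.50 input / $1.50 output per 1M tokens, per the list above)
cost = estimate_cost(200_000, 50_000, 0.50, 1.50)
print(f"${cost:.4f}")  # $0.1750
```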