Baseten
Discover how to configure Baseten's Model APIs for use with Sypha. Gain access to leading open-source models offering enterprise-grade performance, reliability, and competitive pricing.
Baseten provides on-demand, frontier model APIs that are built for production-ready applications rather than just experimentation. Powered by the Baseten Inference Stack, these APIs offer highly optimized inference for open-source models from providers like OpenAI, DeepSeek, Moonshot AI, and Alibaba Cloud.
Website: https://www.baseten.co/products/model-apis/
Getting an API Key
- Sign Up or Log In: Visit Baseten and log in to your account.
- Go to API Keys: In your dashboard, navigate to the API Keys section.
- Create a New Key: Generate a new key and give it a recognizable name (e.g., "Sypha").
- Save the Key: Copy the generated API key immediately and store it in a secure location.
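Once you have a key, you can sanity-check it against Baseten's OpenAI-compatible API before configuring Sypha. This is a minimal sketch using only the Python standard library; the base URL `https://inference.baseten.co/v1` is an assumption here, so confirm the current endpoint in Baseten's documentation.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible base URL for Baseten Model APIs;
# verify the current endpoint in Baseten's docs before relying on it.
BASE_URL = "https://inference.baseten.co/v1"

def auth_headers(api_key: str) -> dict:
    """Build the standard Bearer-token Authorization header."""
    return {"Authorization": f"Bearer {api_key}"}

def list_models(api_key: str) -> list:
    """Fetch available model IDs from the OpenAI-compatible /models endpoint."""
    req = urllib.request.Request(f"{BASE_URL}/models", headers=auth_headers(api_key))
    with urllib.request.urlopen(req, timeout=10) as resp:
        data = json.load(resp)
    return [m["id"] for m in data.get("data", [])]

if __name__ == "__main__":
    key = os.environ.get("BASETEN_API_KEY")
    if key:
        print(list_models(key))
    else:
        print("Set BASETEN_API_KEY to try this against the live API.")
```

If the call returns a list of model slugs, the key is valid and ready to paste into Sypha.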
Configuration in Sypha
- Open Sypha Settings: Click the settings gear icon (⚙️) within the Sypha panel.
- Choose Provider: Select "Baseten" under the "API Provider" dropdown menu.
- Enter API Key: Paste the Baseten API key you created into the "Baseten API Key" field.
- Select Your Model: Choose the model you want to use from the "Model" dropdown list.
IMPORTANT: To use the moonshotai/Kimi-K2-Thinking model, you must enable Native Tool Call (Experimental) in your Sypha settings. This reasoning model requires native tool calling to function correctly, as it allows Sypha to interact with tools directly.
Supported Models
Sypha supports all models currently available through Baseten Model APIs, including the following (check Baseten's pricing page for the most accurate rates):
- `moonshotai/Kimi-K2-Thinking` (Moonshot AI) - Enhanced reasoning with step-by-step logic (262K context) - $0.60/$2.50 per 1M tokens
- `zai-org/GLM-4.6` (Z AI) - Frontier capabilities for agents, reasoning, and coding (200K context) - $0.60/$2.20 per 1M tokens
- `moonshotai/Kimi-K2-Instruct-0905` (Moonshot AI) - Enhanced capability update from September (262K context) - $0.60/$2.50 per 1M tokens
- `openai/gpt-oss-120b` (OpenAI) - 120B MoE with powerful reasoning capabilities (128K context) - $0.10/$0.50 per 1M tokens
- `Qwen/Qwen3-Coder-480B-A35B-Instruct` - Top-tier coding and reasoning (262K context) - $0.38/$1.53 per 1M tokens
- `Qwen/Qwen3-235B-A22B-Instruct-2507` - High performance in math and logic (262K context) - $0.22/$0.80 per 1M tokens
- `deepseek-ai/DeepSeek-R1` - First-generation reasoning model from DeepSeek (163K context) - $2.55/$5.95 per 1M tokens
- `deepseek-ai/DeepSeek-R1-0528` - Revised DeepSeek reasoning model (163K context) - $2.55/$5.95 per 1M tokens
- `deepseek-ai/DeepSeek-V3-0324` - Fast general-purpose model with strong reasoning (163K context) - $0.77/$0.77 per 1M tokens
- `deepseek-ai/DeepSeek-V3.1` - Hybrid reasoning with enhanced tool usage (163K context) - $0.50/$1.50 per 1M tokens
- `deepseek-ai/DeepSeek-V3.2` - Efficient long-context hybrid reasoning (163K context) - $0.30/$0.45 per 1M tokens
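Because the API is OpenAI-compatible, a chat completion request uses the familiar `messages` format with one of the model slugs above. The sketch below just builds the request body; the system prompt and parameter values are illustrative, not prescribed by Baseten.

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-style chat completion request body.

    The model slug should be one of the Baseten Model APIs slugs
    listed above (e.g. "deepseek-ai/DeepSeek-V3.1").
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("deepseek-ai/DeepSeek-V3.1", "Explain Python decorators.")
print(json.dumps(payload, indent=2))
```

POST this body to the `/chat/completions` path of Baseten's OpenAI-compatible endpoint with your Bearer token, or point the official `openai` Python client at that base URL via its `base_url` parameter.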
Production-First Architecture
Baseten's APIs are constructed specifically for enterprise production environments:
Enterprise-Grade Reliability
- 99.99% uptime backed by active-active redundancy
- Multi-cluster autoscaling for consistent performance, independent of the cloud provider
- SOC 2 Type II certified and HIPAA compliant
Optimized Performance
- Pre-optimized inference thanks to the Baseten Inference Stack
- Latest GPUs distributed across multi-cloud infrastructure
- Ultra-fast processing for enterprise workloads
Cost Efficiency
- 5-10x cheaper than comparable proprietary models
- Resource management optimized through a multi-cloud network
- Transparent billing with straightforward pricing
Developer Experience
- OpenAI-compatible API, making migration straightforward
- Drop-in replacements for popular cloud models with built-in observability
- Easy scaling to dedicated endpoints when your application grows
Special Features
Function Calling & Tool Use
Every model offered through Baseten supports advanced structured outputs and tool operations, making them a great fit for agentic development inside Sypha.
Tips and Notes
- Model Auto-Updates: Sypha will dynamically fetch model lists from Baseten, automatically exposing new open-source options as they launch.
- High Availability: Baseten's architecture ensures low-latency routing for global availability.
- Dedicated Support: Speak to Baseten about dedicated capacity setups for heavy production workloads.
Pricing Information
Baseten provides transparent and competitive rates for their models. Prices range from roughly $0.10 to $6.00 per million tokens, depending on the model. Be sure to check the Baseten Model APIs page for the most up-to-date figures.
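Since rates are quoted per million tokens with separate input and output prices, estimating a request's cost is simple arithmetic. A small sketch, using the rates listed above:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate USD cost given per-1M-token input/output rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: 200K input + 50K output tokens on deepseek-ai/DeepSeek-V3.1
# ($0.50 input / $1.50 output per 1M tokens, per the list above)
cost = estimate_cost(200_000, 50_000, 0.50, 1.50)
print(f"${cost:.4f}")  # $0.1750
```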