FW-Kimi-K2.6

FW-Kimi-K2.6

Kimi K2.6 is an open-source, native multimodal agentic model with 1T total parameters and 32B activated, featuring long-horizon coding, coding-driven design, elevated agent swarm, and proactive autonomous execution.
Fireworks
Version: 1
Models available for use with Fireworks on Foundry deliver optimized, best-in-class performance on the Fireworks Inference Cloud. Fireworks on Foundry is a Non-Microsoft Product. The following terms apply to a Customer's use of Fireworks on Foundry: When you use Fireworks on Foundry, data is shared between Microsoft and Fireworks AI, Customer Data will be sent outside of Microsoft systems, Customer Data will not be processed pursuant to any Foundry data residency documentation, and different compliance and data handling rules will apply. See Trust Center - Fireworks AI for details. Customers are responsible for evaluating whether data sharing between Microsoft and Fireworks is appropriate for their organization's compliance requirements.

About this model

Kimi K2.6 is Moonshot AI's open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration. It shares the same architecture as Kimi K2.5, built on the Kimi K2 base — a Mixture-of-Experts (MoE) language model with 1 trillion total parameters and 32 billion activated parameters per forward pass — with a MoonViT vision encoder (400M parameters) for native multimodal understanding. It supports text, image, and video inputs, with thinking and instant (non-thinking) modes, and a 256K token context window.

Key model capabilities

  • Long-Horizon Coding: significant improvements on complex end-to-end coding tasks across programming languages (Rust, Go, Python) and domains (front-end, DevOps, performance optimization)
  • Coding-Driven Design: transforms simple prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows
  • Elevated Agent Swarm: scales horizontally to 300 sub-agents executing 4,000 coordinated steps, dynamically decomposing tasks into parallel domain-specialized subtasks
  • Proactive & Open Orchestration: powers persistent, 24/7 background agents that proactively manage schedules, execute code, and orchestrate cross-platform operations
  • Preserve Thinking mode: retains full reasoning content across multi-turn interactions for enhanced coding agent performance
  • Interleaved Thinking and Multi-Step Tool Call
  • Native multimodal: supports text, image, and video inputs
  • Dual reasoning modes: instant (non-thinking) and thinking
  • 256K token context window
  • Function calling and tool use

Quick facts

Model providerFireworks
TypeChat completion
LifecycleGenerally available (GA)
Input typetext
Output typetext
Context window262.144k