The AI Summit New York

2025 Agenda

Building AI Inference at Scale: Why Public Benchmarks Mislead

Dec 11, 2025
Headliners Stage

How did we build Nebius Token Factory to address the challenges of production-scale inference? What do customers' SLAs look like, and what delivers optimized performance across the trade-offs of cost, latency, throughput, and quality?

As organizations race to deploy AI into real products, the gap between public benchmarks of models and providers and real-world performance has never been more apparent. In this keynote, we’ll explore why benchmark metrics fail to capture the true behavior of production-scale inference systems, and what it actually takes to build them.

We’ll share how we designed Nebius Token Factory to meet demanding customer SLAs and deliver predictable performance under real workloads. The talk will walk through the practical trade-offs between cost, latency, throughput, and quality, and the engineering techniques that make it possible to optimize across all four.

Speakers
Roman Chernin, Co-Founder - Nebius

Session Type

Presentation

Content Focus

Technical