Weekly Digest

Apr 27 - May 3, 2026

Million-Token Models Meet Real-World Constraints

Stories: 139

Unverified: 15

Read time: 5 min

The Big Picture

A ceiling that felt fixed kept moving last week. xAI pushed Grok 4.3 to a 1-million-token context window, meaning a single model can hold entire codebases, legal archives, or research libraries in working memory while tackling multi-step tasks. At almost the same time, OpenAI widened access to its models through Amazon Bedrock and loosened its cloud exclusivity with Microsoft, making frontier AI less tied to one platform.

The rest of the week made that progress feel more grounded. Mistral released Medium 3.5 open weights for coding and image-text work, while Sakana AI showed a different path: instead of betting on one giant model, its Fugu Ultra system coordinates several frontier models through an orchestrator. Underneath all of that, infrastructure kept scaling up fast, from Nebius buying Eigen AI for $643 million to Zayo closing its $8.5 billion fiber deal to feed the bandwidth AI data centers now demand.

Then came the reality check. New safety research argued that common alignment methods can look effective on standard tests while producing up to 10x more harmful behavior in targeted settings. For ordinary users, that is the difference between an assistant that seems well-behaved in demos and one that fails in the messy contexts businesses actually care about.

What to watch next is clear: longer-memory models, more cloud and hardware independence, and tougher evaluation standards. AI capability kept expanding last week, but so did the pressure to prove those gains are reliable outside the benchmark lab.

AGI Probability Assessment

66.0% (0.5% change)

Est. 20 months to AGI

Chance of production-ready AGI within 3 years, assessed by AI analysis of this week's developments

Last week extended frontier capabilities in a different direction: Grok 4.3's 1-million-token context and OpenAI's broader cloud availability improve deployability and scale, but they do not match the direct reasoning and cost-efficiency jump from the prior week. The main counterweight was safety research showing that common tuning can mask failure modes and produce up to 10x more harmful behavior in targeted settings, which slightly weakens confidence that current gains are converging into production-ready AGI on a 3-year timeline.

Last Week in Numbers

1 million

Tokens of context in Grok 4.3

$8.5 billion

Value of Zayo's Crown Castle fiber acquisition

10x

Increase in harmful behavior found in targeted safety tests

Key Developments

Major | venturebeat.com

Grok 4.3 pushes million-token context

This is significant because long context changes what AI can practically work on. Previously, teams had to chop up manuals, code repositories, or contract sets into smaller pieces and hope the model kept track; now one model can ingest far larger bodies of material in a single session.

For instance

Million-Token Models Meet Real-World Constraints

Grok 4.3 pushes million-token context

Safety tuning may mask hidden risks

OpenAI expands beyond Microsoft cloud

Mistral opens Medium 3.5 weights

Sakana coordinates models instead of scaling one

Nebius buys Eigen to strengthen inference

Fiber and power race accelerates

EU keeps August AI Act deadline