Weekly Digest

May 11-17, 2026

AI Crosses Into Math Gold and Real-World Cyberattacks

Stories: 104

Unverified: 7

Read time: 5 min


The Big Picture

For decades, Olympiad gold in math meant a level of reasoning that clearly separated top human problem-solvers from machines. Last week, a 30B model reportedly hit the gold-medal threshold on both IMO 2025 and USAMO 2026, while Google said it detected the first known case of attackers using AI to build a working zero-day exploit in an active campaign. In one week, AI looked more capable at abstract reasoning and more dangerous in the hands of attackers.

Elsewhere, enterprise adoption kept accelerating. PwC expanded Claude across its global workforce and plans to train 30,000 professionals, while OpenAI launched a new deployment unit backed by a strategic partnership that includes a $500 million Brookfield investment and up to $4 billion for customer buildouts. Anthropic also introduced Glasswing, a program and toolset aimed at using frontier models to find software flaws before criminals do.

The pattern is getting hard to miss: AI is moving from demos into infrastructure. A consultant can now use AI as part of daily client work, a large bank can hire OpenAI engineers to wire models directly into operations, and security teams may soon rely on AI both to discover bugs and to defend against AI-assisted exploitation. At the same time, researchers showed how fragile some safeguards remain, including a result suggesting one neuron can disable refusal behavior in several models.

Watch two fronts over the next few weeks: whether the Olympiad math results hold up under broader scrutiny, and whether labs can harden models quickly enough as AI capability spills into cybersecurity and high-stakes enterprise systems.

AGI Probability Assessment


66.9% (+0.1%)

Est. 18 months to AGI

Chance of production-ready AGI within 3 years, assessed by AI analysis of this week's developments

Last week extended the prior momentum in reasoning with a reported 30B model reaching Olympiad gold thresholds, but that result still needs broader validation and does not by itself show reliable general expert autonomy. The most concrete shifts were outside core AGI capability: Google’s report of an AI-assisted zero-day in the wild and OpenAI’s deeper enterprise deployment show real-world usefulness and risk rising fast, yet they do not substantially reduce the remaining gaps in robustness, breadth, and minimal-oversight operation.

Last Week in Numbers

30B

Size of the model that reportedly reached the Olympiad gold threshold

30,000

PwC professionals to be trained through the Claude partnership

$500 million

Brookfield investment in the OpenAI strategic partnership


Key Developments

Major|x.com

30B model reportedly reaches Olympiad gold

This is significant because Olympiad-level math is a strong test of multi-step reasoning, not just memorization. Previously, top benchmark gains often came from narrow test optimization; now a relatively compact 30B model is claimed to have cleared gold-medal thresholds on two elite contests.

For instance


30B model reportedly reaches Olympiad gold

Google spots first AI-made zero-day exploit

OpenAI pushes deeper into enterprise deployment

PwC scales Claude to global workforce

Anthropic launches Glasswing cyber defense effort

Single neuron study bypasses refusal safeguards

Copilot Studio agents start using computers

Colorado tightens rules for AI decisions