RSI chases AGI-style acceleration, but definitions blur

AI · May 28, 2026 at 05:00 PM

Source: TechCrunch

TLDR: ATHENS, Greece - AI labs are chasing recursive self-improvement (RSI), with efforts from Richard Socher, Andrej Karpathy, and Anthropic, yet experts cannot pin down a definition or a timeline.

Key Takeaways:

RSI follows in the wake of AGI marketing hype, promising AI that improves its own research loop until humans become optional.
Richard Socher aims for fully automatic ideation, implementation, and validation, while Andrej Karpathy builds Auto Research and Anthropic shows heavily tool-written code.
Georgetown experts and Helen Toner say RSI needs an AI-only loop with human-free research, and that reliability and handoff challenges still slow it down.

AI is learning to write parts of the roadmap faster than humans can debate the wording. RSI sounds like a countdown, but most teams are still stuck on the unglamorous reliability chores that make the next step possible.

Q&A

If Claude Code can write much of the code, what missing capability most limits RSI to a slogan rather than a system?

The bottlenecks cluster around self-direction on long, ambiguous tasks, prioritization against organizational goals, and verification, all of which are central to a human-free loop.

Why does “automatic ideation, implementation, and validation” matter more than faster coding in RSI projects?

RSI depends on closing the loop across research stages; speeding up one stage without dependable validation keeps the process open to human correction.

What could cause the predicted RSI timelines to swing in either direction even if scaling laws hold?

Adding compute is not enough; progress hinges on engineering the handoffs between research stages, plus alignment and reliability, any of which can stall growth or unlock sudden compounding.

How does METR’s adequacy, parity, and supremacy framework change how investors and labs interpret “good enough” progress?

It reframes milestones: passing adequacy may look incremental, while parity and supremacy signal qualitatively different takeover dynamics for AI research.

If humans become less involved in research, what new failure mode should teams design for first?

The main risk shifts from moving too slowly to silent bad loops, where an AI repeatedly generates plausible but wrong experiments and then validates them itself, with no human epistemic oversight to catch the error.
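
One way to picture a guard against such a silent bad loop is a gate that accepts a result only when a check independent of the loop agrees with the loop's own validation. The Python sketch below is illustrative only: `run_gated_loop`, `self_validate`, `independent_check`, and `tolerance` are hypothetical names invented for this example, not anything described in the source story or used by any lab mentioned in it.

```python
from typing import Callable, Iterable, List, Tuple


def run_gated_loop(
    proposals: Iterable[str],
    self_validate: Callable[[str], float],
    independent_check: Callable[[str], float],
    tolerance: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split proposals into accepted and flagged-for-review lists.

    self_validate is the loop's own score for an experiment (hypothetical).
    independent_check is a score from a separate evaluator, such as a
    held-out benchmark or a human audit (hypothetical).
    A proposal is accepted only if the two scores agree within tolerance.
    """
    accepted: List[str] = []
    flagged: List[str] = []
    for name in proposals:
        claimed = self_validate(name)
        audited = independent_check(name)
        if abs(claimed - audited) <= tolerance:
            accepted.append(name)
        else:
            # Disagreement suggests the loop may be grading its own homework.
            flagged.append(name)
    return accepted, flagged


# Toy usage: the loop rates every experiment highly, the audit does not.
audit_scores = {"exp_a": 0.90, "exp_b": 0.40}
accepted, flagged = run_gated_loop(
    ["exp_a", "exp_b"],
    self_validate=lambda name: 0.90,
    independent_check=lambda name: audit_scores[name],
)
print(accepted, flagged)
```

In this toy run, the experiment whose self-reported score diverges from the audit is routed to review instead of feeding the next iteration. The design choice is that disagreement, not a low score, is what triggers escalation.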
