Major · @AlphaSignalAI
Researchers train 14B model without backpropagation
The work shows that a 14B language model can be pretrained from scratch using evolution strategies instead of gradient-based training. The method reportedly delivers 100x higher training throughput, runs at 91% of normal inference speed, and stays competitive on reasoning and math tasks, suggesting a practical path for training non-differentiable systems that standard gradient-based methods cannot handle.
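For context, evolution strategies replace backpropagation's exact gradient with an estimate built from fitness evaluations of randomly perturbed parameter copies, so only forward passes are needed. Below is a minimal sketch of the generic (OpenAI-style) ES update, not the paper's specific method; the toy quadratic objective is a hypothetical stand-in for a language-model loss.

```python
import numpy as np

# Generic evolution-strategies loop: estimate an update direction from
# fitness scores of perturbed parameters -- no backpropagation involved.
# This is an illustrative sketch, not the method reported in the story.

rng = np.random.default_rng(0)

def fitness(theta):
    # Hypothetical stand-in objective (higher is better); a real run
    # would score a model's outputs instead.
    return -np.sum((theta - 1.0) ** 2)

theta = np.zeros(10)             # parameters being trained
sigma, lr, pop = 0.1, 0.05, 64   # noise scale, step size, population size

for step in range(200):
    eps = rng.standard_normal((pop, theta.size))   # perturbation directions
    rewards = np.array([fitness(theta + sigma * e) for e in eps])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Reward-weighted average of perturbations approximates the gradient.
    theta += lr / (pop * sigma) * eps.T @ rewards

print("final fitness:", fitness(theta))
```

Because each population member only needs a forward evaluation, the loop parallelizes trivially across devices, which is the usual argument for ES throughput at scale.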
Entities
NVIDIA, Eggroll
Categories
LLM, Research, Infrastructure