Major · @AlphaSignalAI
Researchers train 14B model without backpropagation
The work shows that a 14B language model can be pretrained from scratch using evolution strategies instead of gradient-based training. The method reportedly delivers 100x higher training throughput, runs at 91% of normal inference speed, and stays competitive on reasoning and math tasks, suggesting a practical path for training non-differentiable systems that standard gradient-based methods cannot handle.
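For context, evolution strategies replace backpropagation's exact gradient with an estimate built from fitness evaluations of randomly perturbed parameter copies, so only forward passes are needed. Below is a minimal sketch of the generic (OpenAI-style) ES update, not the paper's specific method; the toy quadratic objective is a hypothetical stand-in for a language-model loss.

```python
import numpy as np

# Generic evolution-strategies loop: estimate an update direction from
# fitness scores of perturbed parameters -- no backpropagation involved.
# This is an illustrative sketch, not the method reported in the story.

rng = np.random.default_rng(0)

def fitness(theta):
    # Hypothetical stand-in objective (higher is better); a real run
    # would score a model's outputs instead.
    return -np.sum((theta - 1.0) ** 2)

theta = np.zeros(10)             # parameters being trained
sigma, lr, pop = 0.1, 0.05, 64   # noise scale, step size, population size

for step in range(200):
    eps = rng.standard_normal((pop, theta.size))   # perturbation directions
    rewards = np.array([fitness(theta + sigma * e) for e in eps])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Reward-weighted average of perturbations approximates the gradient.
    theta += lr / (pop * sigma) * eps.T @ rewards

print("final fitness:", fitness(theta))
```

Because each population member only needs a forward evaluation, the loop parallelizes trivially across devices, which is the usual argument for ES throughput at scale.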
Entities
NVIDIA, Eggroll
Categories
LLM, Research, Infrastructure