Why haven't asynchronous gradient updates become popular in the LLM community?

Article URL: https://github.com/sighingnow/Megatron-LM/blob/ht/dev-pipe/megatron/core/pipeline_parallel/schedules.py

Comments URL: https://news.ycombinator.com/item?id=37831330

Points: 1

# Comments: 1
