This might help.
John6666