6 Comments
N426678's avatar

If you are considering using "more/better" data, I would recommend this source:

https://www.financialdatasets.ai/

N426678's avatar

Did you use separate train, validation, and test datasets, or did you use the test dataset for both validation and testing?

idunno's avatar

10% of the dataset was used for validation. Stock ticker data from outside the dataset's date range was also used for testing. Qwen2.5-1.5B-Instruct was surprisingly able to generalize far beyond what was expected. The next model being worked on is Qwen2.5-14B-Instruct, and the newest trainer has far more engineering behind it than the first attempt.

Check out the work in progress below. The latest GRPO WIP trainer is Frieza QLoRA_4Bit.py. It fits on 1 A100 GPU:

https://github.com/IYamHim/Ginyu-Unit/
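For readers who want to reproduce a split like the one described above (10% held out for validation, plus a test set drawn from outside the dataset's date range), here is a minimal sketch. It is not the code from the repo: the function name `split_records`, the record fields `ticker` and `date`, and the choice to treat everything after a cutoff date as out-of-range are all assumptions made for illustration.

```python
import random
from datetime import date


def split_records(records, cutoff, val_fraction=0.1, seed=0):
    """Hypothetical helper: split records into train / validation / test.

    Assumption: records dated after `cutoff` count as "outside the date
    range" and form the test set; the rest are shuffled and a fraction
    `val_fraction` of them is held out for validation.
    """
    in_range = [r for r in records if r["date"] <= cutoff]
    test = [r for r in records if r["date"] > cutoff]

    # Seeded shuffle so the split is reproducible across runs.
    rng = random.Random(seed)
    rng.shuffle(in_range)

    n_val = round(len(in_range) * val_fraction)
    val = in_range[:n_val]
    train = in_range[n_val:]
    return train, val, test


# Toy data: 20 in-range records and 5 later records (made-up tickers).
records = [{"ticker": "AAA", "date": date(2024, 1, 1 + i)} for i in range(20)]
records += [{"ticker": "BBB", "date": date(2024, 3, 1 + i)} for i in range(5)]

train, val, test = split_records(records, cutoff=date(2024, 2, 1))
print(len(train), len(val), len(test))  # 18 2 5
```

Splitting by date first keeps the test set free of any period the model saw during training, which is the point of testing outside the dataset's date range.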

N426678's avatar

Is it possible to run this on multiple GPUs?

Lukas Nel's avatar

Yup yup, but it takes a bit of doing

N426678's avatar

Thank you for this amazing tutorial and the insights. Can I run this repo on your dataset, or do I need to change something? This is a repo for multi-GPU GRPO:

https://github.com/Jiayi-Pan/TinyZero