- Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
- Training Dataset: 0xZee/dataset-CoT-Quantum-Mechanics-1224
- HF Finetuned Model: 0xZee/DeepSeek-R1-8b-ft-QuantumMechanics-CoT
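The identifiers listed above are Hugging Face Hub repo ids. As a minimal sketch, and not the repository's own code, they could be loaded roughly as follows. This assumes the `datasets` and `transformers` libraries are installed and that the model repos contain full weights loadable by `AutoModelForCausalLM` (if the fine-tuned repo holds only adapters or a quantized export, a different loader would be needed). The helper names `hub_url`, `load_training_dataset`, and `load_model` are hypothetical, introduced only for illustration.

```python
"""Sketch: locate and load the artifacts listed above from the Hugging Face Hub."""

# Repo ids copied from the list above.
BASE_MODEL_ID = "unsloth/DeepSeek-R1-Distill-Llama-8B"
DATASET_ID = "0xZee/dataset-CoT-Quantum-Mechanics-1224"
FINETUNED_MODEL_ID = "0xZee/DeepSeek-R1-8b-ft-QuantumMechanics-CoT"


def hub_url(repo_id: str, kind: str = "model") -> str:
    """Return the Hub page URL for a repo id (hypothetical helper).

    Dataset pages live under /datasets/; model pages sit at the site root.
    """
    prefix = "datasets/" if kind == "dataset" else ""
    return f"https://huggingface.co/{prefix}{repo_id}"


def load_training_dataset():
    """Download the training dataset (needs `datasets` and network access)."""
    from datasets import load_dataset  # imported lazily: optional dependency

    # No split is passed, so this returns whatever splits the repo defines.
    return load_dataset(DATASET_ID)


def load_model(model_id: str = BASE_MODEL_ID):
    """Load a tokenizer and causal LM (needs `transformers` and `torch`).

    Assumption: the repo stores standard full-model weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint
    # instead of upcasting an 8B-parameter model to float32.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model


if __name__ == "__main__":
    # Printing the URLs needs no downloads or extra packages.
    print(hub_url(BASE_MODEL_ID))
    print(hub_url(DATASET_ID, kind="dataset"))
    print(hub_url(FINETUNED_MODEL_ID))
```

Calling `load_model(FINETUNED_MODEL_ID)` would fetch the fine-tuned checkpoint instead of the base model, under the same assumption about the repo's contents. The heavy imports are kept inside the functions so the module can be imported without the optional dependencies installed.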
0xZee/DeepSeek-R1-FineTuning
About
Fine-Tuning of DeepSeek-Style Reasoning Models | RL + Quantization Implementation