Fine-Tuning of DeepSeek-Style Reasoning Models | RL + Quantization Implementation

0xZee/DeepSeek-R1-FineTuning

Fine-tune DeepSeek-R1 (8B) on a Quantum Mechanics Dataset

  • Base Model : unsloth/DeepSeek-R1-Distill-Llama-8B
  • Training Dataset : 0xZee/dataset-CoT-Quantum-Mechanics-1224
  • HF Finetuned Model : 0xZee/DeepSeek-R1-8b-ft-QuantumMechanics-CoT
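The identifiers listed above can be wired together roughly as follows. This is a minimal sketch, not the repository's own code: it assumes the fine-tuned checkpoint on the Hugging Face Hub holds full weights loadable with `transformers` (not only a LoRA adapter), that `accelerate` is installed for `device_map="auto"`, and the prompt layout in `build_prompt` is a hypothetical placeholder, since the template used for training is not shown here.

```python
# Identifiers taken from the list above.
BASE_MODEL = "unsloth/DeepSeek-R1-Distill-Llama-8B"
DATASET = "0xZee/dataset-CoT-Quantum-Mechanics-1224"
FINETUNED_MODEL = "0xZee/DeepSeek-R1-8b-ft-QuantumMechanics-CoT"


def build_prompt(question: str) -> str:
    """Build a chain-of-thought prompt.

    Hypothetical format: the template actually used for fine-tuning
    is an assumption and may differ.
    """
    return f"### Question:\n{question}\n\n### Response:\n<think>\n"


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the fine-tuned model and generate an answer.

    Assumes the Hub repository contains full model weights and that
    `transformers`, `torch`, and `accelerate` are installed.
    """
    # Imported lazily so the helpers above work without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(FINETUNED_MODEL)
    model = AutoModelForCausalLM.from_pretrained(
        FINETUNED_MODEL,
        device_map="auto",  # requires the `accelerate` package
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_answer("What is the ground-state energy of a particle in a 1D infinite well?")` downloads the model on first use, so it needs network access and enough memory for an 8B-parameter model.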

Demo Notebook
