This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
A curated list of materials providing an introduction to RL and RLHF:
- Research papers and books covering key concepts in reinforcement learning.
- Video lectures explaining the fundamentals of RLHF.
An extensive collection of state-of-the-art approaches for preference optimization and model alignment:
- Key techniques such as PPO, DPO, KTO, and ORPO.
- The latest arXiv publications and publicly available implementations.
- Analysis of effectiveness across different optimization strategies.
This repository is designed as a reference for researchers and engineers working on reinforcement learning and large language models. If you're interested in model alignment, experiments with DPO and its variants, or alternative RL-based methods, you will find valuable resources here.
- Reinforcement Learning: An Overview
- A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More
- Book-Mathematical-Foundation-of-Reinforcement-Learning
- The FASTEST introduction to Reinforcement Learning on the internet
- rlhf-book
- PPO - Proximal Policy Optimization Algorithm - OpenAI
- DPO - Direct Preference Optimization: Your Language Model is Secretly a Reward Model - Stanford (a minimal loss sketch appears after this list)
- online DPO
- KTO - KTO: Model Alignment as Prospect Theoretic Optimization
- SimPO - Simple Preference Optimization with a Reference-Free Reward - Princeton
- ORPO - Monolithic Preference Optimization without Reference Model - Kaist AI
- Sample Efficient Reinforcement Learning with REINFORCE
- REINFORCE++
- RPO - Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment
- RLOO - Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
- GRPO
- ReMax - Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
- DPOP - Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
- BCO - Binary Classifier Optimization for Large Language Model Alignment
| Method |
| --- |
| DPO |
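To ground the DPO entry above, here is a minimal sketch of the DPO loss in PyTorch. It assumes per-sequence log-probabilities for the chosen and rejected responses have already been computed for both the policy and a frozen reference model; the function and variable names are illustrative, not taken from any particular codebase.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-ratios of the policy vs. the frozen reference for each response.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Implicit reward margin; beta controls how far the policy may drift from the reference.
    logits = beta * (chosen_logratio - rejected_logratio)
    # Bradley-Terry negative log-likelihood of preferring the chosen response.
    return -F.logsigmoid(logits).mean()
```

The Bradley-Terry formulation used here is the same one covered in the DPO explainer video listed further down.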
Notes for learning RL: Value Iteration -> Q Learning -> DQN -> REINFORCE -> Policy Gradient Theorem -> TRPO -> PPO
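As a small illustration of the last step in that path, here is a sketch of PPO's clipped surrogate objective (policy term only). Names and shapes are illustrative assumptions; a full RLHF loop also needs a value loss, an entropy bonus, and a KL penalty against the SFT/reference model.

```python
import torch

def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    # Ratio between the current policy and the policy that generated the rollouts.
    ratio = torch.exp(new_logprobs - old_logprobs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic bound and negate, since optimizers minimize.
    return -torch.min(unclipped, clipped).mean()
```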
- CS234: Reinforcement Learning Winter 2025
- CS285 Deep Reinforcement Learning
- Welcome to Spinning Up in Deep RL
- deep-rl-course from Huggingface
- RL Course by David Silver
- Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.
- Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math
- GRPO vs PPO (see the group-relative advantage sketch after this list)
- Reasoning LLMs
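A rough sketch of the key difference: GRPO drops PPO's learned value network and instead normalizes rewards within a group of responses sampled for the same prompt. The tensor layout below is an assumption made for illustration.

```python
import torch

def grpo_advantages(rewards, eps=1e-6):
    # rewards: [num_prompts, group_size] scalar rewards for the sampled responses.
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    # Each response's advantage is its reward standardized within its own group,
    # and is shared by every token of that response.
    return (rewards - mean) / (std + eps)
```

These advantages then feed a PPO-style clipped objective with a KL term toward the reference model.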
- Process Reinforcement through Implicit Rewards
- DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL
- On the Emergence of Thinking in LLMs I: Searching for the Right Intuition
- LIMR: Less is More for RL Scaling
- LIMO: Less Is More for Reasoning
- s1: Simple test-time scaling and s1.1
- The 37 Implementation Details of Proximal Policy Optimization
- Online-DPO-R1: Unlocking Effective Reasoning Without the PPO Overhead (blog and GitHub)
- a reinforcement learning guide
- Approximating KL Divergence (a short estimator sketch appears after this list)
- How to align open LLMs in 2025 with DPO & synthetic data
- DeepSeek-R1 -> The Illustrated DeepSeek-R1; DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs; DeepSeek R1 and R1-Zero Explained
- SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
- ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
- A Minimalist Approach to Offline Reinforcement Learning
- Training Language Models to Reason Efficiently
- Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
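Following up on the "Approximating KL Divergence" post listed above, here is a small sketch of the k1/k2/k3 Monte-Carlo estimators it discusses, written for per-sample log-probabilities; the function name is a placeholder of mine.

```python
import torch

def kl_estimators(logp, logq):
    # Estimate KL(q || p) from samples x ~ q, using the log ratio log p(x) - log q(x).
    logr = logp - logq
    k1 = -logr                        # unbiased, but high variance
    k2 = 0.5 * logr ** 2              # biased, lower variance
    k3 = (logr.exp() - 1.0) - logr    # unbiased and low variance; often used in RLHF code
    return k1.mean(), k2.mean(), k3.mean()
```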
- [R1 - distill] https://huggingface.co/datasets/open-r1/OpenR1-Math-220k
- [R1 - distill] https://huggingface.co/datasets/simplescaling/s1K-1.1
- [R1 - distill] https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k
- [R1 - distill] https://huggingface.co/datasets/GAIR/LIMO
- [R1 - distill] https://huggingface.co/datasets/AI-MO/NuminaMath-CoT