v0.3.0: Full-Parameter RLHF

@hiyouga released this 16 Nov 08:24

New features

  • Support full-parameter RLHF training (RM & PPO)
  • Refactor llmtuner core in #1525 by @hiyouga
  • Improved LLaMA Board: adds full-parameter RLHF and a demo mode

New models

  • Base models
    • ChineseLLaMA-1.3B
    • LingoWhale-8B
  • Instruct/Chat models
    • ChineseAlpaca-1.3B
    • Zephyr-7B-Alpha/Beta

Bug fix