RuleR: Improving LLM Controllability by Rule-based Data Recycling (NAACL'25)
Chinese Version: Zhihu
This is the repo for the RuleR project, which proposes a data augmentation method incorporating multiple constraints into the original data samples according to predefined rules, without any human/LLM editing on responses.
(Feel free to email minglii@umd.edu for any questions or feedback.)
News
- [2025/01] Our paper has been accepted to the NAACL 2025 main conference!
- [2024/06] We initialized the RuleR repo.
Contents
- Overview
- Highlights
- Install
- Run Code
- ToDo
- Citation
- Our Related Works
Overview
Large language models (LLMs) still lack delicate controllability over their responses, which is critical to enhancing their performance and the user experience. However, curating supervised fine-tuning (SFT) datasets to improve LLM controllability usually relies on human experts or proprietary LLMs, which incurs additional costs. To bridge this gap, we propose Rule-based Data Recycling (RuleR), a data augmentation method that incorporates multiple constraints into the original data samples according to predefined rules, creating new training tasks to consolidate the controllability of LLMs. Instead of creating new data from scratch, RuleR "recycles" existing data by simply applying rule-based edits to the responses and appending the rule-instructions to the original instructions. Experimental results demonstrate RuleR's effectiveness in improving LLM controllability while maintaining general instruction-following capabilities.
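To make the "recycling" idea concrete, here is a minimal illustrative sketch with a hypothetical rule and sample (not the project's actual rule set): a sentence-count constraint is appended to the instruction, and the response is edited by a rule to satisfy it.

```python
# Hypothetical RuleR-style recycling of one Alpaca-format sample.
original = {
    "instruction": "Name a primary color.",
    "output": "Red. Other primary colors are blue and yellow.",
}

# Rule (illustrative): keep only the first sentence of the response,
# and append the matching constraint to the instruction.
first_sentence = original["output"].split(". ")[0].rstrip(".") + "."
recycled = {
    "instruction": original["instruction"] + " Answer in exactly one sentence.",
    "output": first_sentence,
}

print(recycled["instruction"])
print(recycled["output"])
```

No human or model is needed here: the edited response satisfies the appended constraint by construction.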
Comparing existing methods (top) and our RuleR (bottom) for enhancing LLM controllability. Most existing methods rely on extra human/model supervision to generate or edit instructions and responses, neglecting the remaining potential of the original data. In contrast, RuleR demonstrates that simple rule-based (human/model-free) editing of existing data can greatly improve LLM controllability.
Highlights
- RuleR is the first human/model-free data augmentation approach designed to improve LLM controllability in enforcing multiple constraints on LLM-generated responses.
Install
- Install the dependencies
Note: RuleR itself only requires the spacy and tqdm packages. We recommend installing these two packages manually; you do not need to install everything from requirements.txt.
- Install the spaCy model
Run Code
Single-Round Data (Alpaca format)
--data_path xxx.json \ # Alpaca format needed here
--save_path xxx_augmented.json \
--augment_rate 0.9 \
--epo_num 2 \
--concate_layer 3
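The file passed to `--data_path` must be in the Alpaca format: a JSON list of records with `instruction`, `input`, and `output` fields. A minimal example (the sample text is illustrative):

```python
import json

# Minimal Alpaca-format data file, as expected by --data_path.
samples = [
    {
        "instruction": "Summarize the following text.",
        "input": "RuleR recycles existing SFT data with rule-based edits.",
        "output": "RuleR augments SFT data using predefined rules.",
    }
]

with open("alpaca_example.json", "w") as f:
    json.dump(samples, f, indent=2)
```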
Multi-Round Data (ShareGPT format)
--data_path xxx.json \ # ShareGPT format needed here
--save_path xxx_augmented.json \
--augment_rate 0.9 \
--epo_num 2 \
--concate_layer 3
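For multi-round data, `--data_path` expects the ShareGPT format: each record holds a `conversations` list of alternating `human`/`gpt` turns. A minimal example (the turn contents are illustrative):

```python
import json

# Minimal ShareGPT-format data file, as expected by --data_path.
samples = [
    {
        "conversations": [
            {"from": "human", "value": "Hi"},
            {"from": "gpt", "value": "Hello! How can I help you?"},
        ]
    }
]

with open("sharegpt_example.json", "w") as f:
    json.dump(samples, f, indent=2)
```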
--data_path: Input data path.
--save_path: Save data path.
--augment_rate: The probability of applying augmentation to each sample.
--epo_num: The number of times the random augmentation process is run.
--concate_layer: The max rule number for each sample.
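A rough sketch of how these three flags interact (hypothetical simplification, not the repository's actual implementation): the random augmentation pass runs `epo_num` times over the data; each pass augments a sample with probability `augment_rate`, attaching up to `concate_layer` rules to it.

```python
import random

def augment(samples, augment_rate=0.9, epo_num=2, concate_layer=3,
            rules=("rule_a", "rule_b", "rule_c")):
    # Illustrative only: rule names and selection logic are placeholders.
    out = []
    for _ in range(epo_num):           # --epo_num passes over the data
        for s in samples:
            applied = []
            if random.random() < augment_rate:   # --augment_rate
                k = random.randint(1, concate_layer)  # --concate_layer cap
                applied = random.sample(rules, k)
            out.append({"sample": s, "rules": applied})
    return out

data = augment(["s1", "s2"], augment_rate=1.0, epo_num=2, concate_layer=3)
```

With `augment_rate=1.0` and two passes, the two input samples yield four augmented records, each carrying between one and three rules.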
Training
We use the prompt and code base from FastChat:
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT: I am .........
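The prompt above follows the Vicuna v1.1 template from FastChat. A minimal sketch of assembling it (the helper name is ours; the system text is quoted from the template, and `</s>` is the separator after each completed assistant turn):

```python
SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed, and polite "
          "answers to the user's questions.")

def build_prompt(turns):
    """turns: list of (user, assistant) pairs; assistant is None for the
    final turn the model is expected to complete."""
    prompt = SYSTEM + " "
    for user, assistant in turns:
        prompt += f"USER: {user} ASSISTANT:"
        if assistant is not None:
            prompt += f" {assistant}</s>"  # close each finished turn
    return prompt

prompt = build_prompt([("Hi", "Hello."), ("Who are you?", None)])
```

The training loss is then typically computed only on the assistant spans of this concatenated prompt.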