Synthesized various resources for this “pre-training for alignment” type work:
Data
Synthetic Data
The RetroInstruct Guide To Synthetic Text Data
Alignment In The Age of Synthetic Data
Leveraging Agentic AI for Synthetic Data Generation
**AutoEvol**: Automatic Instruction Evolving for Large Language Models. Builds a fully automated Evol-Instruct pipeline to create high-quality, highly complex instruction-tuning data. (A minimal sketch of the evolve loop follows this sub-list.)
Synthetic Data Generation and AI Feedback notebook
The impact of models training on their own outputs, and how it's actually done well in practice
Google presents Best Practices and Lessons Learned on Synthetic Data for Language Models
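A minimal sketch of the evolve loop that Evol-Instruct-style pipelines like AutoEvol automate, assuming an OpenAI-compatible client; the model name and evolution prompt are illustrative stand-ins, not the paper's exact ones:

```python
# Evol-Instruct-style evolution loop (sketch). The prompt and model name are
# placeholders; AutoEvol's contribution is automating the design of this
# evolving prompt itself.
from openai import OpenAI

client = OpenAI()

EVOLVE_PROMPT = (
    "Rewrite the following instruction to be more complex by adding "
    "constraints, rare requirements, or multi-step reasoning, while keeping "
    "it answerable and self-contained:\n\n{instruction}"
)

def evolve(instruction: str, rounds: int = 3) -> list[str]:
    """Return the chain of progressively more complex instructions."""
    chain = [instruction]
    for _ in range(rounds):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder generator model
            messages=[{"role": "user",
                       "content": EVOLVE_PROMPT.format(instruction=chain[-1])}],
        )
        chain.append(resp.choices[0].message.content.strip())
    return chain
```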
Transformed/Enrichment of Data
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling. TLDR: you can train 3x faster and with up to 10x less data using just synthetic rephrases of the web.
Better Synthetic Data by Retrieving and Transforming Existing Datasets
Rho-1: Not All Tokens Are What You Need. RHO-1-1B and 7B achieve SotA results of 40.6% and 51.8% on the MATH dataset, respectively, matching DeepSeekMath with only 3% of the pretraining tokens.
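A loss-masking sketch of Rho-1's selective language modeling, assuming HF-style causal LMs for `model` and `ref_model`; the keep fraction is illustrative:

```python
# Rho-1-style Selective Language Modeling (sketch): score each token's excess
# loss against a high-quality reference model and backprop only through the
# top fraction of tokens.
import torch
import torch.nn.functional as F

def slm_loss(model, ref_model, input_ids: torch.Tensor, keep_frac: float = 0.6):
    labels = input_ids[:, 1:].flatten()
    logits = model(input_ids).logits[:, :-1].flatten(0, 1)
    loss = F.cross_entropy(logits, labels, reduction="none")
    with torch.no_grad():
        ref_logits = ref_model(input_ids).logits[:, :-1].flatten(0, 1)
        ref_loss = F.cross_entropy(ref_logits, labels, reduction="none")
    excess = loss.detach() - ref_loss      # high = model lags the reference
    k = max(1, int(keep_frac * excess.numel()))
    keep = torch.topk(excess, k).indices   # only these tokens get gradients
    return loss[keep].mean()
```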
Data Attribution
In-Run Data Shapley
Scaling Laws for the Value of Individual Data Points in Machine Learning. Shows that some data points are only valuable in small training sets, while others only shine in large datasets.
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
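The methods above use Shapley values or influence functions; a common cheap first-order stand-in is the TracIn-style gradient dot product, sketched here under that simplifying assumption:

```python
# First-order influence sketch (gradient dot product): positive values mean a
# gradient step on the training example should lower validation loss. A cheap
# stand-in for the full Shapley / influence-function machinery linked above.
import torch

def grad_vector(model, loss: torch.Tensor) -> torch.Tensor:
    params = [p for p in model.parameters() if p.requires_grad]
    return torch.cat([g.flatten() for g in torch.autograd.grad(loss, params)])

def influence(model, loss_fn, train_batch, val_batch) -> float:
    g_train = grad_vector(model, loss_fn(model, train_batch))
    g_val = grad_vector(model, loss_fn(model, val_batch))
    return torch.dot(g_train, g_val).item()
```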
Data Mixtures
Methods for finding an optimal data mixture
RegMix: Data Mixture as Regression for Language Model Pre-training
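A sketch of the RegMix recipe under simplifying assumptions: the paper fits a boosted-trees regressor on many small proxy runs; here the proxy run is stubbed with a synthetic quadratic so the sketch stays self-contained:

```python
# RegMix-style mixture search (sketch): regress validation loss on mixture
# weights from cheap proxy runs, then pick the simplex point with the lowest
# predicted loss. `train_proxy_and_eval` is a synthetic stand-in; in practice
# it trains a tiny model on that mixture and evaluates it.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n_domains, n_runs = 5, 64

def train_proxy_and_eval(mix: np.ndarray) -> float:
    target = np.array([0.4, 0.2, 0.2, 0.1, 0.1])  # pretend-optimal mixture
    return float(((mix - target) ** 2).sum())     # synthetic stand-in loss

mixes = rng.dirichlet(np.ones(n_domains), size=n_runs)   # random mixtures
losses = np.array([train_proxy_and_eval(m) for m in mixes])

reg = GradientBoostingRegressor().fit(mixes, losses)

candidates = rng.dirichlet(np.ones(n_domains), size=100_000)
best_mix = candidates[np.argmin(reg.predict(candidates))]
```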
Curriculum Learning
On transforming data into a curriculum to improve learning efficiency and capability (one concrete instantiation is sketched after this sub-list)
Curriculum learning that actually works?
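One concrete instantiation of the curriculum idea (illustrative, not taken from either link): score difficulty with a small reference model's loss and feed data easy-to-hard:

```python
# Curriculum by reference-model loss (sketch). `ref_model` is any HF-style
# causal LM; `dataset` is an iterable of 1-D token-id tensors.
import torch

@torch.no_grad()
def difficulty(ref_model, example_ids: torch.Tensor) -> float:
    out = ref_model(example_ids.unsqueeze(0), labels=example_ids.unsqueeze(0))
    return out.loss.item()

def curriculum_order(ref_model, dataset):
    """Returns the dataset sorted easy-to-hard."""
    return sorted(dataset, key=lambda ex: difficulty(ref_model, ex))
```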
Active Data Selection
MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models. MATES significantly elevates the scaling curve by selecting data based on the model's evolving needs.
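A rough sketch of the MATES loop under stated assumptions: the paper trains a small BERT-style data influence model on oracle influence probes; here `embed` and `estimate_influence` are caller-supplied stand-ins (the gradient dot product sketched under Data Attribution works for the latter), and a ridge regressor replaces the BERT scorer:

```python
# MATES-flavored selection loop (sketch, not the paper's exact pipeline):
# probe the current model's data influence on a held-out batch, fit a small
# regressor on those probes, and keep the top-scoring candidates.
import numpy as np
from sklearn.linear_model import Ridge

def select_for_next_stage(model, candidates, val_batch, embed,
                          estimate_influence, probe_size=1000, keep_frac=0.2):
    probes = candidates[:probe_size]
    X = np.stack([embed(ex) for ex in probes])
    y = np.array([estimate_influence(model, ex, val_batch) for ex in probes])
    influence_model = Ridge().fit(X, y)  # cheap stand-in for MATES' scorer
    scores = influence_model.predict(np.stack([embed(ex) for ex in candidates]))
    keep = np.argsort(-scores)[: int(keep_frac * len(candidates))]
    return [candidates[i] for i in keep]
```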
Data Filtering
Scaling Laws for Data Filtering—Data Curation cannot be Compute Agnostic. Argues that data curation cannot be agnostic of the total compute a model will be trained for. (GitHub)
How to Train Data-Efficient LLMs. Models trained on ASK-LLM data consistently outperform full-data training, even when 90% of the original dataset is rejected, while converging up to 70% faster.
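A sketch of ASK-LLM-style scoring, assuming an OpenAI-compatible judge model; the prompt paraphrases the paper's idea rather than quoting it:

```python
# ASK-LLM-style filtering (sketch): ask a proxy LLM whether an example is
# useful training data and score it by the probability of "yes".
import math
from openai import OpenAI

client = OpenAI()

ASK_PROMPT = (
    "###\n{example}\n###\n"
    "Does the previous text contain informative content that would be useful "
    "for training a language model? Answer yes or no."
)

def ask_llm_score(example: str) -> float:
    """Quality score = probability the judge answers 'yes'."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder judge model
        messages=[{"role": "user",
                   "content": ASK_PROMPT.format(example=example)}],
        logprobs=True, top_logprobs=5, max_tokens=1,
    )
    for tok in resp.choices[0].logprobs.content[0].top_logprobs:
        if tok.token.strip().lower() == "yes":
            return math.exp(tok.logprob)
    return 0.0

def filter_corpus(corpus: list[str], keep_frac: float = 0.1) -> list[str]:
    """Keep only the highest-scoring fraction, as in the 90%-rejection runs."""
    scored = sorted(corpus, key=ask_llm_score, reverse=True)
    return scored[: int(keep_frac * len(corpus))]
```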
On Pre-Training
Pre-Training from Human Preferences
Ethan Perez wondering if jailbreaks would be solved with this pre-training approach
LAION uses this approach for fine-grained control over outputs during inference (see the conditional-training sketch after this block).
Nora Belrose thinks that alignment via pre-training would make models more robust to unlearning (she doesn't say so herself, but this could be a good thing if you pre-train such that you don't need unlearning)
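The paper's best-performing variant is conditional training: tag each pretraining segment with a control token derived from a reward model, then condition on the "good" token at inference, which is also the fine-grained control the LAION note refers to. A minimal sketch, with placeholder tag names and reward model:

```python
# Conditional-training tagging in the spirit of Pre-Training from Human
# Preferences. `reward_model` and the tag strings are placeholders; the
# paper uses task-specific scorers.

GOOD, BAD = "<|good|>", "<|bad|>"

def tag_segment(text: str, reward_model, threshold: float = 0.0) -> str:
    """Prepend a control token so the LM learns p(text | tag) in pretraining."""
    tag = GOOD if reward_model(text) >= threshold else BAD
    return tag + text

# At inference, prepend GOOD to the prompt: the model has learned to
# associate it with high-reward text, without a separate RLHF stage.
```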
Tomek describing some research directions for improving pre-training alignment
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Neural Networks Learn Statistics of Increasing Complexity
Pre-Training towards the basin of attraction for alignment
Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis
Requirements for a Basin of Attraction to Alignment
A “Bitter Lesson” Approach to Aligning AGI and ASI
Alignment techniques
AlignEZ: Using self-generated preference data, identifies the subspaces that (1) facilitate and (2) are harmful to alignment; during inference, the LM embedding is surgically modified using these identified subspaces. Jacques note: could we apply this iteratively throughout training (and with other similar methods)?
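A heavily simplified sketch of the subspace surgery: the paper identifies helpful and harmful subspaces per layer from self-generated preference pairs; here each is collapsed to a single direction, and all names are illustrative rather than the paper's:

```python
# AlignEZ-flavored embedding surgery (sketch). Extract a dominant direction
# separating preferred from dispreferred response embeddings, then edit
# hidden states at inference only.
import torch

def top_direction(pref_emb: torch.Tensor, dispref_emb: torch.Tensor):
    """pref_emb, dispref_emb: [n_pairs, d] hidden states of paired responses;
    returns the dominant unit direction separating them."""
    _, _, vh = torch.linalg.svd(pref_emb - dispref_emb, full_matrices=False)
    return vh[0]

def edit_hidden(h: torch.Tensor, helpful: torch.Tensor, harmful: torch.Tensor,
                alpha: float = 1.0) -> torch.Tensor:
    """h: [d] hidden state. Project out the harmful direction and nudge
    along the helpful one."""
    harmful = harmful / harmful.norm()
    h = h - (h @ harmful) * harmful   # remove harmful component
    return h + alpha * helpful        # boost helpful component
```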
What do we mean by “alignment”? What makes the model safe?
Values
What does it mean for a model to have a value?
On making the model “care”