- Paper: "Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences" (arXiv:2404.03715, published Apr 4, 2024)
- Collection: "Awesome feedback datasets": a curated list of datasets with human or AI feedback, useful for training reward models or applying techniques like DPO (19 items, updated Apr 12)
- Collection: "Awesome SFT datasets": a curated list of interesting datasets for fine-tuning language models (43 items, updated Apr 12)
- Paper: "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" (arXiv:2307.05695, published Jul 11, 2023)