Hello everyone!
Alignment Lab AI is pleased to introduce our latest research effort: Buzz!
A highly curated, pretraining-scale assistant dataset unifying RL and SFT, developed with @HIVEDigitalTech
The Buzz model, Dataset,…
Buzz is a large, pretraining-scale instruction-tuning dataset built from the most performant instruction datasets, including many of the chosen examples from popular preference-tuning datasets,
all of that, reformatted, deduped, and cleaned…
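For readers wondering what "chosen examples, reformatted and deduped" can look like in practice, here is a minimal sketch. It is not the actual Buzz pipeline: the field names (`prompt`, `response`, `chosen`, `rejected`), the helper functions, and the exact-match hashing strategy are all assumptions made for illustration.

```python
import hashlib
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()

def chosen_to_sft(pref: dict) -> dict:
    """Hypothetical reformat step: keep only the preferred response of a preference pair."""
    return {"prompt": pref["prompt"], "response": pref["chosen"]}

def dedupe(records: list) -> list:
    """Drop records whose normalized (prompt, response) pair was already seen."""
    seen = set()
    unique = []
    for rec in records:
        joined = normalize(rec["prompt"]) + "\x00" + normalize(rec["response"])
        key = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Toy data: one SFT row and two preference rows, one of which repeats the SFT row.
sft_rows = [{"prompt": "What is 2+2?", "response": "4"}]
pref_rows = [
    {"prompt": "what is  2+2?", "chosen": "4", "rejected": "5"},
    {"prompt": "Name a prime.", "chosen": "7", "rejected": "8"},
]

merged = dedupe(sft_rows + [chosen_to_sft(p) for p in pref_rows])
print(len(merged))
```

Exact hashing only catches copies that match after normalization; near-duplicate detection (for example MinHash) would be a separate step and is not shown here.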
Buzz-8b-Large-v0.5 is an early checkpoint of the model we've been continuing to pretrain on the dataset!
remarkably, despite the extraordinarily large volume of data, we are able to train it with a much higher learning rate than expected, and do not…
these are curated sets made from the *unprocessed, most recent Stack Exchange dump*, which we were surprised to find had not been optimized, despite the RL focus currently in the ecosystem,
RLSTACK contains…
That's all for today!
keep your ears peeled and eyes perked though, we're in the thick of it and have WAY more stuff to release over the coming days!
keep AI open source!
@alignment_lab
@HIVEDigitalTech
This is really exciting work! If you think there are still areas where human labels/preferences could help improve this further, I would be happy to chat!