Inproceedings,

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Advances in Neural Information Processing Systems 35 (NeurIPS 2022), volume 35, pages 31809-31826. Curran Associates, Inc. (December 2022)
