Inproceedings

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pages 31809-31826. Curran Associates, Inc. (December 2022)
