The BigScience project was initiated by the company Hugging Face and is supported by the CNRS, GENCI and the French Ministry of Higher Education and Research. Bloom is an artificial intelligence model that learns from a large corpus of texts and whose initial goal is to generate text by completing statements. During training, each of the model's predictions is compared with the correct word, and the difference is used to adjust the model's internal parameters.
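To make the "compare each prediction with the correct word" step concrete, here is a minimal sketch in Python. It is not Bloom's training code: the three-word vocabulary, the scores and the function names are all invented for illustration, and the cross-entropy loss shown is the standard choice for this kind of model rather than something the article specifies.

```python
import math

def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def next_word_loss(scores, correct_index):
    """Cross-entropy: small when the correct word gets a high probability."""
    probs = softmax(scores)
    return -math.log(probs[correct_index])

# Hypothetical toy vocabulary and scores for completing "The cat sat on the ..."
vocab = ["cat", "dog", "mat"]
scores = [1.0, 0.5, 2.0]  # invented numbers; the model favours "mat"

loss_if_mat = next_word_loss(scores, vocab.index("mat"))
loss_if_dog = next_word_loss(scores, vocab.index("dog"))
print(f"loss if the correct word is 'mat': {loss_if_mat:.3f}")
print(f"loss if the correct word is 'dog': {loss_if_dog:.3f}")
```

The loss is lower when the word the model favoured is the correct one. Training nudges the parameters in the direction that reduces this loss, repeated over billions of words.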
Bloom, short for BigScience Large Open-science Open-access Multilingual Language Model, has 70 layers of neurons and 112 attention heads. It was trained on billions of words, resulting in a model with 176 billion parameters. Training ran for several months on the Jean Zay supercomputer and required several hundred graphics processors running in parallel, i.e. 5 million hours of computation.
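As a rough sanity check on how 70 layers lead to 176 billion parameters, the sketch below uses a common rule of thumb for transformer models: about 12 × (hidden size)² parameters per layer, plus the word-embedding table. The layer and head counts come from the figures quoted above; the hidden size (14336) and vocabulary size (250880) are not given in this article and are taken here as assumptions from Bloom's published configuration. The estimate ignores small terms such as biases and normalization layers.

```python
def estimate_parameters(layers, hidden_size, vocab_size):
    """Rule-of-thumb transformer size: attention + feed-forward + embeddings."""
    per_layer = 12 * hidden_size ** 2      # ~4*d^2 for attention, ~8*d^2 for feed-forward
    embeddings = vocab_size * hidden_size  # one vector per vocabulary entry
    return layers * per_layer + embeddings

LAYERS = 70           # from the article
HEADS = 112           # from the article
HIDDEN_SIZE = 14336   # assumption: not stated in the article
VOCAB_SIZE = 250880   # assumption: not stated in the article

total = estimate_parameters(LAYERS, HIDDEN_SIZE, VOCAB_SIZE)
head_dim = HIDDEN_SIZE // HEADS  # each attention head works on an equal slice

print(f"about {total / 1e9:.1f} billion parameters")
print(f"{head_dim} dimensions per attention head")
```

Under these assumptions the estimate lands close to the 176 billion parameters quoted for the model, with almost all of them sitting in the 70 layers rather than in the embedding table.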
As already indicated, Bloom's innovation is that it can handle 46 different languages, across literary, scientific and sports texts alike. It can even read computer code (13 programming languages at the moment). Another distinctive feature is that the model is fully available under open science principles, to facilitate research on language models.
This graphic indicates the languages used for training Bloom.
Thomas Wolf, co-founder and chief scientific officer of the start-up Hugging Face, says: "The creation of the Bloom model and the success of the BigScience research collaboration show that another way of creating, studying and sharing innovations in AI is possible, bringing together industrialists, academics and associations around an international, multidisciplinary and open-access project. I am delighted that Hugging Face was able to find the necessary support in France for this approach, which is unprecedented on a global scale."
The post artificial intelligence that manages 46 languages appeared first on Gamingsym.