A group of girls with headphones and a man with a backpack

Meta has created artificial intelligence models capable of recognizing and delivering speech in more than 1,000 languages, which is ten times more than currently. The company believes that this is a significant step in preserving languages that are on the verge of extinction. Meta has released its models to the public on GitHub. She claims that such a solution will help developers working in different languages to create new speech applications, such as instant messengers that can understand everyone, or virtual reality systems that can be used in any language. There are about 7,000 languages in the world, but existing speech recognition models cover only about 100 of them. Meta solved this problem by retraining an existing AI model developed by the company in 2020 that can learn speech patterns from audio without the need for large amounts of labeled data such as transcripts. The model was trained on two new datasets: audio recordings of the New Testament of the Bible and corresponding text in 1,107 languages, as well as unlabeled audio recordings of the New Testament in 3,809 languages. The team processed the speech audio and text data to improve its quality, and then ran an algorithm designed to align the audio recordings with the accompanying text. After that, they repeated this process using a second algorithm trained on the new aligned data. With this method, the researchers were able to train the algorithm to learn new languages faster, even without accompanying text. However, the team cautions that the model is still prone to errors in the transcription of some words or phrases, which can lead to inaccurate or potentially offensive labels. They also admit that their speech recognition models produce more biased words than other models, although only 0.7% more. --auto --s2

Meta has created artificial intelligence models capable of recognizing and delivering speech in more than 1 , 000 languages , which is ten times more than currently . The company believes that this is a significant step in preserving languages that are on the verge of extinction . Meta has released its models to the public on GitHub . She claims that such a solution will help developers working in different languages to create new speech applications , such as instant messengers that can understand everyone , or virtual reality systems that can be used in any language . There are about 7 , 000 languages in the world , but existing speech recognition models cover only about 100 of them . Meta solved this problem by retraining an existing AI model developed by the company in 2020 that can learn speech patterns from audio without the need for large amounts of labeled data such as transcripts . The model was trained on two new datasets: audio recordings of the New Testament of the Bible and corresponding text in 1 , 107 languages , as well as unlabeled audio recordings of the New Testament in 3 , 809 languages . The team processed the speech audio and text data to improve its quality , and then ran an algorithm designed to align the audio recordings with the accompanying text . After that , they repeated this process using a second algorithm trained on the new aligned data . With this method , the researchers were able to train the algorithm to learn new languages faster , even without accompanying text . However , the team cautions that the model is still prone to errors in the transcription of some words or phrases , which can lead to inaccurate or potentially offensive labels . They also admit that their speech recognition models produce more biased words than other models , although only 0 . 7% more . --auto --s2

INFO

Tamaño

3072X2048

Fecha

Jun 12, 2023

Modo

Por defecto

Tipo

upscale

Checkpoint & LoRA

ReV Animated

0 comentario(s)

0

2

0