
The world needs a state-of-the-art, easy-to-integrate speech-to-text library so that we don't have to rely on sending all of our voice data to the big tech giants. I imagine this was the motivation behind it.

That being said, I think the current open source options are actually okay and constantly getting better. They're certainly a lot better today than they were 3 years ago.



The thing is, good speech to text AI needs good training data. And lots of it. With as little error as possible.

They were also doing this. They made an easy-to-use website where you could contribute, and people did. I cannot imagine it was so expensive that they now have to throw it under the bus.


I agree with you, their Common Voice project is extremely important. From the outside it seems like the engineering effort behind this is done: they just need to keep the site running to keep collecting data. Actually, now that I think about it, validating the data might be a lot of man hours.

I very much hope they don't abandon it; the Common Voice project is arguably more important than their speech-to-text engine. There are competing open source speech-to-text engines. There is no other project like Common Voice that I am aware of.


"validating the data might be a lot of man hours."

It totally is, which also makes it a very good fit for crowdsourcing. They already set up a nice website with badges and other things that motivate people to contribute. Fifteen minutes a day from a lot of people would mean lots of validated data, and with user accounts you can also check the quality of each user's contributions, etc.



