
Just curious: which human languages were in GPT-3's training data set? Was it trained only on English text, or does it transcend language barriers?


The vast majority (>93%) is English (by document): https://github.com/openai/gpt-3/blob/master/dataset_statisti...



