Arcee AI Trinity Mini and Nano – US based open weight models (arcee.ai)
4 points by BarakWidawsky 27 days ago | 3 comments


If the performance is comparable to Qwen3 in practice, that's quite impressive.

Half the dataset being synthetic is interesting. I wonder what that actually means. They say that Datology needed 2048 H100s to generate the synthetic data. Does that mean they were generating data using other open weight LLMs? Seems like that would undermine the integrity of a "US based" dataset.
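For anyone unfamiliar with what "generating synthetic data with an open-weight LLM" usually looks like in practice, here's a minimal sketch using the Hugging Face transformers library. Everything here is an assumption for illustration: the model choice, the seed prompts, and the pipeline structure are hypothetical, not anything Datology has confirmed about their setup.

    # Hypothetical sketch of LLM-based synthetic data generation.
    # Model name and seed prompts are illustrative placeholders.
    from transformers import pipeline

    # Any US-origin open-weight model could fill this role;
    # Llama is used here purely as an example.
    generator = pipeline(
        "text-generation",
        model="meta-llama/Llama-3.1-8B-Instruct",  # assumed choice
        device_map="auto",
    )

    seed_topics = [
        "Explain binary search step by step.",
        "Summarize the water cycle for a high school student.",
    ]

    synthetic_rows = []
    for topic in seed_topics:
        out = generator(
            [{"role": "user", "content": topic}],
            max_new_tokens=256,
            do_sample=True,
            temperature=0.8,
        )
        # The full conversation is returned; the last message is the
        # model's reply. Each prompt/response pair becomes one
        # synthetic training example.
        reply = out[0]["generated_text"][-1]["content"]
        synthetic_rows.append({"prompt": topic, "response": reply})

At the scale mentioned in the post, a fleet of 2048 H100s would be running something like this loop over millions of seed prompts, which is why the question of which model sits in the generator seat matters for dataset provenance.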


Why would that undermine its integrity? AFAICT there's a selection of "open" US-based LLMs to choose from: Google's Gemma, Microsoft's Phi, Meta's Llama, and OpenAI's GPT-OSS. Phi is licensed under MIT and GPT-OSS under Apache 2.0.


Because at that point you don't know where the data came from. You could be training on foreign propaganda without realizing it.

Presumably they wouldn't be training on synthetic data produced by anything less than an open frontier model, and those are almost exclusively Chinese.



