
Basically, “faster” means better performance, e.g. more tokens/s, without losing quality (benchmark scores for the models). So when we say faster, we mean we deliver more tokens per second than llama.cpp. We get there by making effective use of the available hardware APIs (for example, we wrote our own kernels).
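For anyone who wants to check a tokens/s claim like this themselves, here is a minimal sketch of the measurement, not the actual benchmark harness used for the comparison. The names `tokens_per_second` and `generate` are hypothetical: `generate` stands in for whatever callable streams tokens from the runtime under test, and the `clock` parameter exists only so the timing can be replaced in tests.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(
    generate: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return throughput as generated tokens divided by wall-clock seconds.

    `generate` is a hypothetical stand-in for a runtime's streaming
    generation call; it must return an iterable that yields one item
    per generated token.
    """
    start = clock()
    # Consume the whole stream so the timing covers every token.
    count = sum(1 for _ in generate())
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed
```

To compare two runtimes fairly you would run this with the same model, quantization, prompt, and output length on the same hardware, and separately confirm that the benchmark scores match, since a higher tokens/s figure only counts as "faster" in the sense above if quality is unchanged.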

