

rnxrx 5 months ago | on: Show HN: We made our own inference engine for Appl...

I'm curious about why the performance gains mentioned were so substantial for Qwen vs Llama?

AlekseiSavin 5 months ago

It looks like llama.cpp has some performance issues with bf16.
