cjbprime on April 18, 2024 | on: Meta Llama 3
(You can't compare parameter count with a mixture of experts model, which is what the 1.8T rumor says that GPT-4 is.)
schleck8 on April 18, 2024
You absolutely can, since it has a size advantage either way. MoE means the expert model performs better BECAUSE of the overall model size.
cjbprime on April 18, 2024
Fair enough, although it means we don't know whether a 1.8T MoE GPT-4 will have a "size advantage" over Llama 3 400B.
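
The disagreement above turns on total versus active parameters: an MoE model routes each token through only a few experts, so per-token compute tracks the active count while total capacity tracks the overall size. A minimal sketch of that distinction, with illustrative numbers chosen only so the total lands near 1.8T (the expert count, expert size, routing, and shared-parameter figures here are assumptions, not the actual rumored GPT-4 architecture):

    # Illustrative numbers only -- not the real (or even rumored) GPT-4 configuration.
    def moe_param_counts(n_experts, params_per_expert, experts_per_token, shared_params):
        """Return (total, active-per-token) parameter counts for a simple MoE."""
        total = shared_params + n_experts * params_per_expert
        active = shared_params + experts_per_token * params_per_expert
        return total, active

    total, active = moe_param_counts(
        n_experts=16,             # assumed expert count
        params_per_expert=111e9,  # assumed parameters per expert
        experts_per_token=2,      # assumed number of experts routed per token
        shared_params=55e9,       # assumed shared attention/embedding parameters
    )
    print(f"total ~{total / 1e12:.2f}T parameters, ~{active / 1e9:.0f}B active per token")
    # -> total ~1.83T parameters, ~277B active per token
    # A dense model such as Llama 3 400B applies all ~400B parameters to every token,
    # which is why total parameter counts alone don't settle the "size advantage" question.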