
It's a rolling buffer, so each new item just gets upserted at index % 4 in this case.
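To make the index % 4 idea concrete, here is a minimal sketch of a rolling buffer. It is only an illustration: the `RollingBuffer` class, its method names, and the sample tokens are hypothetical and not taken from the project being discussed.

```python
class RollingBuffer:
    """Fixed-size buffer that overwrites its oldest slot (hypothetical sketch)."""

    def __init__(self, size=4):
        self.size = size
        self.slots = [None] * size
        self.count = 0  # total number of items written so far

    def push(self, item):
        # Write at count % size; once the buffer is full this replaces the oldest item.
        self.slots[self.count % self.size] = item
        self.count += 1

    def in_order(self):
        # Return the stored items from oldest to newest.
        if self.count < self.size:
            return self.slots[:self.count]
        start = self.count % self.size
        return self.slots[start:] + self.slots[:start]


buf = RollingBuffer(4)
for tok in ["a", "b", "c", "d", "e", "f"]:
    buf.push(tok)

print(buf.slots)       # ['e', 'f', 'c', 'd']
print(buf.in_order())  # ['c', 'd', 'e', 'f']
```

Note that the physical slot order (`slots`) differs from the logical order (`in_order()`): the same four tokens are there either way, just rotated.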


Thanks, so does that mean position within the buffer is irrelevant?


It does feel that way. The position eventually loses its meaning as more and more data gets crunched by the training process; in the end it feels like it's just a context of the past 4 tokens.



