
I don't get the point. A simple CNN with stride = 1 should be able to solve this perfectly and generalize to any input size.
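
For what it's worth, a minimal sketch of what I mean (PyTorch; my own example, since the task details aren't in the thread): a fully convolutional stack with stride 1 and no fixed-size dense head runs unchanged on any input length:

  import torch
  import torch.nn as nn

  # Fully convolutional: stride-1 convs and no fixed-size dense head,
  # so the same weights apply to inputs of any length.
  model = nn.Sequential(
      nn.Conv1d(1, 16, kernel_size=3, stride=1, padding=1),
      nn.ReLU(),
      nn.Conv1d(16, 1, kernel_size=3, stride=1, padding=1),
  )

  for length in (32, 128, 1000):
      x = torch.randn(1, 1, length)
      print(model(x).shape)  # torch.Size([1, 1, length])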


It wasn't obvious that a transformer could do this and learn to reproduce a convolution via attention.


But it can, as long as the positional embeddings are sufficient, i.e. you use relative positional embeddings here.
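
To spell out the construction (my own sketch, not from the work under discussion): give each attention head a hard relative-position bias so it attends only to one fixed offset. Each head then just copies x[i + offset], and summing the heads weighted by the kernel taps is a 1-D convolution:

  import torch
  import torch.nn.functional as F

  def conv_via_attention(x, kernel):
      # x: (seq_len, d); kernel: list of (offset, tap_weight) pairs.
      # Caveat: at the sequence edges the fully masked softmax row
      # degrades to a uniform average, so boundaries only approximate
      # zero padding.
      n = x.shape[0]
      idx = torch.arange(n)
      rel = idx[None, :] - idx[:, None]    # rel[i, j] = j - i
      out = torch.zeros_like(x)
      for offset, w in kernel:
          bias = torch.full((n, n), -1e9)  # mask everything...
          bias[rel == offset] = 0.0        # ...except j = i + offset
          attn = F.softmax(bias, dim=-1)   # one-hot "copy" attention
          out = out + w * (attn @ x)       # one head per kernel tap
      return out

  x = torch.randn(10, 4)
  y = conv_via_attention(x, kernel=[(-1, 0.25), (0, 0.5), (1, 0.25)])

With learned (rather than hard-coded) relative position biases, gradient descent can land on the same attention pattern, which is why the relative embeddings matter here.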



