
It would be more convincing if they did an exhaustive enumeration and verified that the learned NN is correct for every possible 3x3 Life neighborhood (there are only 2^9 = 512 of them). How do I know, looking at a speckled screenshot, that it is exactly correct and there isn't a little floating point error somewhere that leaves one edge case slightly off? If the only testing is '100 Life games for 100 steps', that isn't watertight. (Whereas if you do exhaustive enumeration, it has to be correct, because the NN is deterministic and fixed, so there is no way for it to go wrong on an input you have already checked.)
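For what it's worth, the exhaustive check I have in mind is tiny. Below is a minimal sketch, not the post's code: `life_rule` is the standard Game of Life rule, while `stand_in_model` is a hypothetical placeholder for whatever function maps a 3x3 patch to the predicted next state of its center cell (in practice that would wrap the trained NN).

```python
from itertools import product

def life_rule(patch):
    """Reference rule: patch is a 3x3 tuple of 0/1 rows; returns the center's next state."""
    center = patch[1][1]
    neighbors = sum(sum(row) for row in patch) - center
    return 1 if neighbors == 3 or (center == 1 and neighbors == 2) else 0

def all_patches():
    """Yield every one of the 2**9 = 512 possible 3x3 neighborhoods."""
    for bits in product((0, 1), repeat=9):
        yield (bits[0:3], bits[3:6], bits[6:9])

def exhaustive_check(predict_center):
    """Return every patch on which predict_center disagrees with life_rule."""
    return [p for p in all_patches() if predict_center(p) != life_rule(p)]

def stand_in_model(patch):
    """Hypothetical stand-in for the trained model (assumption, not the real NN).

    Uses an equivalent formulation of the rule: the center is alive next step
    iff the patch total (center included) is 3, or is 4 with the center alive.
    """
    center = patch[1][1]
    total = sum(sum(row) for row in patch)
    return 1 if total == 3 or (total == 4 and center == 1) else 0

mismatches = exhaustive_check(stand_in_model)
print(f"checked {len(list(all_patches()))} patches, {len(mismatches)} mismatches")
```

An empty `mismatches` list would settle the question for any model whose output for a cell depends only on that cell's 3x3 neighborhood.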


Edit: increased the validation to 10,000 Life grids for 100 steps each (taking 16 minutes to check), which is hopefully somewhat more convincing. That's 1,000,000 Life steps computed without errors in total, plus 32,000 steps computed without error during training.

When the attention grid is manually computed (to be equivalent to 3 by 3 conv), the model can be trained to be 100% perfect, verified by checking all 3 by 3 grid states. (And this manually computed attention matrix means that once the tokens reach the classifier layer, each token contains only the information of the relevant 3 by 3 grid, and the whole thing is deterministic as you say.)

However, when the model is computing the attention grid itself, checking that every 3 by 3 sub-grid state is handled correctly is not enough, because the position of a sub-grid can affect the attention matrix, and so can the state of other cells. So as shown in the post, it does approximate 3 by 3 conv, but if it doesn't get the approximation quite right, there could be errors. I would still say that it's computing the Game of Life algorithm in an interpretable way; it's just that it may have struggled to create a perfect 3 by 3 convolution via attention in that particular case. (Checking this exhaustively would require checking all 2^(16x16) = 2^256 grids.)
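To put a number on that, here is a quick back-of-the-envelope sketch, assuming a 16x16 board with two states per cell. The count is 2 raised to the number of cells, which comes out at about 1.16 × 10^77 boards, versus 512 possible 3x3 neighborhoods.

```python
# Assumption: a 16x16 board with two states (dead/alive) per cell.
GRID_SIDE = 16

# Every cell independently takes one of two states, so counts are powers of two.
full_grid_states = 2 ** (GRID_SIDE * GRID_SIDE)  # every possible whole board
patch_states = 2 ** 9                            # every possible 3x3 neighborhood

print(f"3x3 neighborhoods: {patch_states}")
print(f"16x16 boards: about {full_grid_states:.3e}")
```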


I think it would also have been very interesting to manually construct a NN that represents the rules exactly. Maybe there is some nice mathematical way to describe them, or some constraints that need to be fulfilled.

Then afterwards you can check the neural network against the exact algorithm.
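Here is one way such a hand-built network could look. This is a sketch of my own, not anything from the post: a single 3x3 "convolution" with weight 2 on each neighbor and 1 on the center gives s = 2 * neighbors + center, and the cell is alive next step exactly when s is 5, 6, or 7. Four ReLU units with integer biases turn that band into an exact 0/1 output. The names `KERNEL`, `HIDDEN_BIASES`, and `OUTPUT_WEIGHTS` are just illustrative.

```python
from itertools import product

# Hand-picked weights (illustrative construction, not learned).
KERNEL = ((2, 2, 2),
          (2, 1, 2),
          (2, 2, 2))                 # s = 2 * neighbors + center
HIDDEN_BIASES = (-4, -5, -7, -8)     # four ReLU units, all fed by s
OUTPUT_WEIGHTS = (1, -1, -1, 1)      # linear readout of the hidden units

def relu(x):
    return x if x > 0 else 0

def handmade_net(patch):
    """Conv -> 4 ReLUs -> linear. Returns exactly 0 or 1 for any binary 3x3 patch."""
    s = sum(KERNEL[r][c] * patch[r][c] for r in range(3) for c in range(3))
    hidden = [relu(s + b) for b in HIDDEN_BIASES]
    # relu(s-4) - relu(s-5) is a step up at s >= 5;
    # relu(s-7) - relu(s-8) is a step up at s >= 8; their difference is the band 5..7.
    return sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden))

def life_rule(patch):
    """Reference Game of Life rule for the center cell of a 3x3 patch."""
    center = patch[1][1]
    neighbors = sum(sum(row) for row in patch) - center
    return 1 if neighbors == 3 or (center == 1 and neighbors == 2) else 0

# Check the hand-built network against the exact rule on all 512 neighborhoods.
patches = [(b[0:3], b[3:6], b[6:9]) for b in product((0, 1), repeat=9)]
agrees_everywhere = all(handmade_net(p) == life_rule(p) for p in patches)
print("hand-built net matches the rule on all patches:", agrees_everywhere)
```

Because all weights and inputs are integers, there is no floating point rounding to worry about in this particular construction.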



