
You might be interested, if you aren't already familiar, in some of the work going on in the mechanistic interpretability field. Neel Nanda has a lot of approachable work on the topic: https://www.neelnanda.io/mechanistic-interpretability


I was not familiar with it, and that does look fascinating, thank you. If anyone else is interested, this guide "Concrete Steps to Get Started in Transformer Mechanistic Interpretability" on his site looks like a great place to start:

https://www.neelnanda.io/mechanistic-interpretability/gettin...



