Hacker News

From a practitioner's perspective, I view this type of behavior as tuning the model toward the exploitation side of the exploration/exploitation trade-off. I think a lot of recommendation engines do this (looking at you, YouTube) because it's more profitable.


Which at its core is probably an alignment problem in the way the models are evaluated: they are measured on their short-term effects, and on that horizon exploitation wins. But if you look at the long-term effect of recommendations, you really need a healthy dose of exploration to keep your users around.
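The trade-off described above can be sketched with an epsilon-greedy bandit, the textbook knob between exploitation and exploration. Everything here is illustrative: `estimates` stands in for a recommender's per-item reward averages, and the numbers are made up.

```python
import random

def epsilon_greedy(estimates, epsilon):
    """Pick an item index: explore with probability epsilon, else exploit
    the item with the highest current reward estimate."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))  # explore: random item
    return max(range(len(estimates)), key=estimates.__getitem__)  # exploit

# Hypothetical running averages of observed reward per item.
estimates = [0.2, 0.5, 0.1]

# epsilon = 0 is pure exploitation: the engine always serves item 1
# and never learns whether items 0 or 2 would retain users better.
assert epsilon_greedy(estimates, 0.0) == 1

# A small epsilon keeps some exploration in the mix.
random.seed(0)
choices = [epsilon_greedy(estimates, 0.1) for _ in range(1000)]
print(choices.count(1) / len(choices))  # mostly item 1, with occasional detours
```

Tuning toward exploitation, as the comment suggests, amounts to shrinking epsilon: short-term metrics improve, but the estimates for everything else go stale.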



