
I guess I'm concerned with rounding errors in the intermediate results leading to a similarity of 1.00000001 or something. When you say 10/14, I don't know what you are referring to btw. I couldn't easily find the original bug or library.
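For what it's worth, here is a minimal sketch of one common way to guard against that kind of overshoot: clamp the result to [-1, 1] before using it. The function name is made up for illustration and is not from whatever library is being discussed; it also assumes both vectors are nonzero.

```python
import math

def clamped_cosine_similarity(a, b):
    """Cosine similarity, clamped so rounding error cannot push it outside [-1, 1].

    Hypothetical helper for illustration; assumes a and b are nonzero
    and have the same length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    raw = dot / (norm_a * norm_b)
    # The intermediate sqrt and division can leave raw a hair above 1.0
    # or below -1.0, which would make math.acos(raw) raise ValueError.
    return max(-1.0, min(1.0, raw))

v = [0.1, 0.2, 0.3]
s = clamped_cosine_similarity(v, v)
angle = math.acos(s)  # safe because s is guaranteed to be within [-1, 1]
```

Whether the clamp is needed depends on what you do with the value afterwards; it matters mostly if you feed it into something like acos.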


> When you say 10/14, I don't know what you are referring to btw.

I did the calculation by hand, and the answer is exactly 10 divided by 14. Basically you just take ( 1 * 3 + 2 * 2 + 3 * 1 ) / (3 * 3 + 2 * 2 + 1 * 1), which is 10 / 14. (The denominator is the product of the two norms, and here both norms are sqrt(14).) Computing cosine similarities is really simple, as I said.
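Here is a rough sketch of the same calculation in code. The vectors [1, 2, 3] and [3, 2, 1] are my reading of the arithmetic above, not something taken from the original bug report, and the function is a plain textbook version rather than any particular library's implementation.

```python
import math
from fractions import Fraction

def cosine_similarity(a, b):
    """Textbook cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Assumed inputs, inferred from (1*3 + 2*2 + 3*1) / (3*3 + 2*2 + 1*1).
a = [1, 2, 3]
b = [3, 2, 1]

exact = Fraction(10, 14)            # the by-hand answer, kept as an exact fraction
approx = cosine_similarity(a, b)    # the floating-point answer

print(exact, float(exact), approx)
```

The floating-point result should agree with 10/14 to within rounding error, though not necessarily in the very last bit, since the two square roots are each rounded.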

Edit: The funny thing is that his answer is actually a bit too low, since he truncates instead of rounding at the end. 10 / 14 is 0.7 followed by 142857, which then repeats. So if the library's result is smaller than his, then the library is wrong and not him. Anyway, blindly trusting the results of a library is just dumb; libraries make mistakes often enough that sometimes you are right and they are wrong.


The matrix library only truncates the printout of matrix entries. The stored value is, of course, a full 32-bit or 64-bit floating-point number.
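A small sketch of that distinction, using plain Python floats rather than the matrix library in question (which I have not looked at). Note that Python's format specifier rounds instead of truncating, so this only illustrates the "short printout, full value underneath" point.

```python
value = 10 / 14                 # stored as a 64-bit float

printed = f"{value:.4f}"        # short display form; this specifier rounds
full = repr(value)              # shortest string that round-trips to the same float

print(printed)
print(full)

# The short printout is lossy, the stored value is not.
lossy = float(printed) != value
round_trips = float(full) == value
```

Comparing against the printed digits instead of the stored value is where the apparent mismatch comes from.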



