People with the mathematical maturity to understand it don't really need the "review". And the people who need the review probably don't have the mathematical background to appreciate it.
Probability theory is not exactly a mathematical field that one can put down and forget (especially if one is interested in ML). You brush against it very often in other fields, not to mention other disciplines.
Aaand to top it off, it has no references. Not even the requisite "see Feller" footnote.
>People with the mathematical maturity to understand it don't really need the "review".
I don't really think that holds water. I read through the document and understood perfectly everything it said, but I couldn't have told you right before reading it how the variance of a distribution is defined.
Plus, even if everyone's up to date on probability, the doc defines the notation the course will use. It's unlikely that people with different backgrounds all share common notation and (to some extent) terminology. 229 is a class for grad students as well as "advanced undergrads", so much of the class won't have learned probability in the same place. Hell, I took probability at Stanford, and if we ever used omega to denote an outcome space, I don't remember it.
I think this is definitely readable for someone who has stepped away from probability for maybe a year or two after having taken a couple courses on the subject at the undergrad level. And that seems like the target audience as well--undergrad students who need a refresher at the start of a new course.
Actually, you don't want to see Feller, not volume I and rarely volume II! :-)
Probability is a field with a curious mix: (1) Intuition goes a long way but (2) doing mathematically what is easy intuitively can require some relatively advanced math.
I looked over the paper of this thread: Looks like the author's best qualification was learning TeX! Yes, he should have given some references. He omitted little things like the central limit theorem and the law of large numbers -- little things like that! He mentioned the Poisson distribution but without saying where and why it is common to encounter it. Actually, here is one place you can see Feller II -- on renewal processes, with a nice limit result that shows why Poisson is so common.
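To make the "why Poisson is so common" point a bit more concrete, here is a minimal sketch. It is not the renewal-process limit from Feller II, only the simpler law of rare events: a count of n independent events, each with small probability lam/n, is approximately Poisson with mean lam once n is large. The function names (binomial_pmf, poisson_pmf, max_pmf_gap) and the choice lam = 3 are mine, purely for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    if k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_pmf_gap(n, lam, k_max=20):
    """Largest pointwise gap between Binomial(n, lam/n) and Poisson(lam)
    over k = 0..k_max (an illustrative cutoff, not a formal distance)."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


if __name__ == "__main__":
    lam = 3.0  # assumed mean count, chosen only for the demo
    for n in (10, 100, 1000, 10000):
        # The gap should shrink as n grows with the mean held at lam.
        print(n, max_pmf_gap(n, lam))
```

Running it should show the gap shrinking as n grows, which is the rough intuition for why counts of many independent rare events tend to look Poisson.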
For the intended purpose of the paper, that is, an elementary review, just see some well regarded first text on probability.
But the grown-up stuff in probability is much more difficult, although the paper did hint at it. Basically you need quite a lot of mathematical analysis as a prerequisite, and the standard is Rudin's Principles of Mathematical Analysis. Then you need quite a lot of measure theory and then some functional analysis. For that, the nicest is Royden's Real Analysis, but I also like the first (real) half of Rudin's Real and Complex Analysis.
Then for the probability, sure, Loeve, Neveu, and Breiman. Loeve was long at Berkeley and wrote the big book. His writing style draws heavily on his native language, French! I like Loeve, but a lot of people don't. Two Loeve students were Neveu and Breiman. Neveu went back to France, but Breiman has long been at Berkeley.
For more, pursue stochastic processes. Generally a good guy to follow is Cinlar at Princeton.
You can see the rest of the notes, lectures and other course materials at http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a... .