Overfitting can happen in many ways -- your training objective can be different at train and test time, or, as you suggest, the datapoints you use can be different at train and test time.
For overfitting induced by datapoints: If you include the datapoints in your problem specification, then you can say they induce bias at test time. If you treat the choice of training datapoints as a random variable, separate from the problem specification, then you can say they induce variance at test time. The difference is essentially semantic though. In general, you can freely move contributions to the error between bias and variance terms by changing which aspects of the modeling framework you define as fixed by the problem definition, and which you take to be stochastic.
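To make that bookkeeping concrete, here's a minimal sketch (a made-up sine-fitting setup; all names and constants are illustrative, not from the article): the same test-time error shows up as variance when you treat the training set as a random draw, and as bias when you condition on one fixed training set.

```python
import numpy as np

def true_f(x):
    return np.sin(x)

X_TEST = 1.0  # a single test point (arbitrary choice for illustration)

def fit_and_predict(seed):
    # Draw one training set and fit an (overly) flexible model to it.
    r = np.random.default_rng(seed)
    x = r.uniform(0, np.pi, 10)
    y = true_f(x) + r.normal(0, 0.3, 10)
    coef = np.polyfit(x, y, deg=5)
    return np.polyval(coef, X_TEST)

# View 1: the training set is a random variable -> error splits into
# bias^2 (systematic offset of the average prediction) plus variance
# (scatter of predictions across resampled training sets).
preds = np.array([fit_and_predict(s) for s in range(2000)])
bias_sq = (preds.mean() - true_f(X_TEST)) ** 2
variance = preds.var()
print(f"random-sample view: bias^2={bias_sq:.4f}, variance={variance:.4f}")

# View 2: the training set is fixed by the problem specification -> there is
# no sampling randomness left, so the entire squared error counts as bias.
one_pred = fit_and_predict(seed=42)
print(f"fixed-sample view:  bias^2={(one_pred - true_f(X_TEST)) ** 2:.4f}, variance=0")
```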
My first impression also agrees with the parent. The blog post appears to use some terms loosely in order to make the connection between overfitting and Goodhart's law stronger. For example, calling the training sample a "proxy" and stating that it is a slightly different goal already leads towards the predefined conclusion.
And the reply also leaves me with a similar impression:
> your training objective can be different at train and test time
But this is not overfitting; this is concept drift, a different and well-defined concept in ML.
> the datapoints you use can be different at train and test time
Both the train and test data come from the same population. They are just different, incomplete random samples.
I guess what I am getting at is that overfitting happens because we know we are training a model on an incomplete representation of the whole. But that representation is not a proxy, as suggested in the article -- it is not slightly different from the goal. It's an incomplete piece of the goal.
A gentle note that an incomplete piece of a goal (e.g. a loss function computed on a subset of the data) is a proxy for the full goal (e.g. the loss function on the full dataset).
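To illustrate with a minimal sketch (hypothetical toy data, not from the post): the loss on an incomplete random subset is the proxy, the loss on the full dataset is the goal, and optimizing the proxy hard makes the two diverge -- which is exactly overfitting.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Full goal": the loss over the whole dataset/population.
x_all = rng.uniform(-1, 1, 10_000)
y_all = x_all**2 + rng.normal(0, 0.1, x_all.size)

# "Proxy": the loss over an incomplete random subset used for training.
idx = rng.choice(x_all.size, 15, replace=False)
x_tr, y_tr = x_all[idx], y_all[idx]

for deg in (1, 3, 8, 14):
    coef = np.polyfit(x_tr, y_tr, deg)  # optimize against the proxy
    proxy_loss = np.mean((np.polyval(coef, x_tr) - y_tr) ** 2)
    goal_loss = np.mean((np.polyval(coef, x_all) - y_all) ** 2)
    print(f"deg={deg:2d}  proxy(train)={proxy_loss:.5f}  goal(full)={goal_loss:.5f}")
```

As the model flexibility grows, the proxy loss goes to zero while the goal loss blows up: Goodhart's law applied to a training objective.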
Similarly, concept drift can be a source of overfitting -- the objective you care about is the one after the concept drift occurred, but the objective you trained on is the one from before the concept drift. (Here's a scholar search for papers where the two concepts co-occur: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&as_vis... )
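And a minimal sketch of the drift case (an artificially flipped relationship, purely for illustration): the model is fit on the pre-drift objective and scored on the post-drift one.

```python
import numpy as np

rng = np.random.default_rng(2)

# Pre-drift concept: y = 2x. Post-drift concept: y = -2x (made-up drift).
x_train = rng.normal(size=500)
y_train = 2.0 * x_train + rng.normal(0, 0.1, 500)
x_test = rng.normal(size=500)
y_test = -2.0 * x_test + rng.normal(0, 0.1, 500)

# Fit on the pre-drift objective...
w = np.polyfit(x_train, y_train, 1)

# ...and score on both. Training error is tiny; post-drift error is large,
# because the objective we optimized is no longer the one we care about.
print("train MSE:", np.mean((np.polyval(w, x_train) - y_train) ** 2))
print("test  MSE:", np.mean((np.polyval(w, x_test) - y_test) ** 2))
```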
I think this is a difficult concept for many without statistical training: the fact that different outcomes can be "the same" from a practical perspective.