How does "use a generator" answer this question? A generator can yield an endless amount of data, but if they asked "our customers buy things 24/7 on our website, how would you pull the endless orders into Python?" and you said "a generator", I'd think you were trying to guess the teacher's password ( https://www.lesswrong.com/posts/NMoLJuDJEms7Ku9XS/guessing-t... )
It surely matters whether the data source is push or poll or something else, whether it's a lot of data or a trickle, whether you need all of it or can miss some (and how much), and whether the source buffers it or not. I'd expect discussion of message buses, queueing, buffering (maybe ring buffers), data stores (maybe SQL or NoSQL), as well as concurrency or parallelism, anticipating that the question could move to "and what if there was 1000x more data?". "Process stock market data" needs more than "read a temperature every minute".
Or at least I'd expect a candidate to ask about some of them and be told "this isn't so complicated", not "it's up to you". If it's only asking whether you can loop over a generator instead of trying to read all the data up front, the question could surely be more focused?
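To make the distinction concrete, here is a rough sketch of what "loop over a generator" covers and what it leaves open. Everything in it is made up for illustration: `fetch_batch` stands in for whatever poll-style source you actually have, and `keep_latest` assumes you are allowed to drop old orders, which is exactly the kind of thing the question doesn't say.

```python
from collections import deque
from typing import Callable, Iterator, List


def poll_orders(fetch_batch: Callable[[], List[dict]]) -> Iterator[dict]:
    """Yield orders one at a time from a poll-style source.

    fetch_batch is a hypothetical callable that returns the next batch of
    orders, or an empty list when nothing new is available.
    """
    while True:
        batch = fetch_batch()
        if not batch:
            return  # a real service would sleep and retry instead of stopping
        yield from batch


def keep_latest(orders: Iterator[dict], capacity: int) -> deque:
    """Ring buffer: keep only the newest `capacity` orders, dropping older ones."""
    return deque(orders, maxlen=capacity)


# Fake source for illustration: three batches, then nothing.
_batches = iter([[{"id": 1}, {"id": 2}], [{"id": 3}], [{"id": 4}, {"id": 5}], []])
latest = keep_latest(poll_orders(lambda: next(_batches)), capacity=3)
print([order["id"] for order in latest])  # → [3, 4, 5]
```

The generator only answers "how do I avoid reading everything up front". Whether dropping orders 1 and 2 is acceptable, and what happens when the source pushes faster than you consume, are separate decisions.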
The question is garbage as phrased, to be sure, but I think the auxiliary context matters a lot in this type of question. Mentioning "python", when databases are language-agnostic, heavily implies the answer is a language feature. Furthermore, the way the question was asked (read from a script) also implies there is a specific, non-open-ended answer.
I still answered the question in my head on a whim, half-expecting my answer to be wrong; it's a really vague formulation. Sounds like the interviewer was randomly browsing language features and copy-pasting descriptions from whatever tutorial. Or maybe the company was using Python generators in a really specific "data pulling" niche, so the answer seemed extremely obvious to the technical team writing the questions, and they forgot that outsiders are not wired to think in the same patterns.