Basically, back in the day a ton of apps ran their own notification servers, each one clashing with the others, waking the device up, etc.
Google Cloud Messaging was then set to be the only service that could wake the device up when it was sleeping. Supposedly it batches messages and sends them at optimal times to save battery.
That restriction doesn't apply when the device is awake or an app holds a wakelock. But since Android introduced Doze (basically a deep-sleep battery-saving state), GCM is the only service that can wake the device up from it.
I thought the device wakes up from Doze regularly, and then you can poll for notifications. The SDK docs read that way too: "While the device is in Doze, apps' access to certain battery-intensive resources is deferred until maintenance windows."
But it seems these maintenance windows aren't frequent enough (every 15 minutes?) for some apps.
The minimum interval for periodic jobs is 15 minutes. When the device is in Doze mode, the jobs are deferred to the next maintenance window. The longer the device remains in Doze mode, the longer the gap between two maintenance windows.
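To make the deferral concrete, here is a rough sketch of how long a polling app might take to notice a message. It is a toy model, not Android code: the function names and the maintenance window times (60, 180, 420 minutes) are made up for illustration, since the real spacing is decided by the system. Only the 15-minute polling interval comes from the point above.

```python
from bisect import bisect_left

POLL_INTERVAL_MIN = 15  # minimum periodic-job interval mentioned above

def next_poll(arrival_min, interval=POLL_INTERVAL_MIN):
    """First scheduled poll at or after the arrival (polls at 0, 15, 30, ...)."""
    return -(-arrival_min // interval) * interval  # round up to a multiple

def delivery_time(arrival_min, maintenance_windows=None):
    """Minute at which a polling app would notice the message.

    maintenance_windows: sorted minutes at which Doze lets deferred jobs run
    (hypothetical values); None means the device is not in Doze.
    Returns None if no window in the list is late enough.
    """
    poll = next_poll(arrival_min)
    if maintenance_windows is None:
        return poll  # the job runs on schedule
    # In Doze the job is deferred to the first window at or after its slot.
    i = bisect_left(maintenance_windows, poll)
    return maintenance_windows[i] if i < len(maintenance_windows) else None

# Hypothetical, increasingly spaced windows (assumption, not real values).
WINDOWS = [60, 180, 420]

for arrival in (1, 61):
    awake = delivery_time(arrival)
    dozing = delivery_time(arrival, WINDOWS)
    print(f"arrives at {arrival} min: awake -> {awake}, in Doze -> {dozing}")
```

With these invented numbers, a message arriving one minute in is picked up at minute 15 on an awake device but only at minute 60 in Doze, and the gap gets worse the longer the device stays dozing.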
Do you want to use a messenger or any other social media app where the worst case scenario is that it takes 15 minutes for the message to reach the recipient?
Certainly not, so the app vendor is forced to use FCM (Firebase Cloud Messaging, GCM's successor) to display notifications immediately even if the recipient's device is in Doze mode.
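For what it's worth, on the server side this usually comes down to marking the message as high priority. Below is a minimal sketch of such a request body, assuming the FCM HTTP v1 message format; the helper name and the token are placeholders, and nothing is actually sent.

```python
import json

def build_high_priority_message(device_token, title, body):
    """Build a request body in the shape of an FCM HTTP v1 message.

    device_token is a placeholder here; a real one comes from the client app.
    """
    return {
        "message": {
            "token": device_token,
            "notification": {"title": title, "body": body},
            # High priority asks FCM to deliver right away instead of
            # waiting for a maintenance window.
            "android": {"priority": "HIGH"},
        }
    }

payload = build_high_priority_message(
    "DEVICE_TOKEN_PLACEHOLDER", "New message", "Hi!"
)
print(json.dumps(payload, indent=2))
```

A normal-priority message would instead be subject to the same deferral as everything else while the device is dozing.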