I'm also pretty sure that it's immaterial whether Haskell does 1 or not. That is an implementation detail and has no bearing on whether something is a Monad.
My understanding is that requiring 1 essentially forces you to think of every Monad as being free.
Ah! My favourite Haskell discussion. So, consider these two programs, the first in Haskell:

    main :: IO ()
    main = do
      foo
      foo

    foo :: IO ()
    foo = putStrLn "Hello"

and the second in Python:

    def main():
        foo()
        foo()

    def foo():
        print("Hello")

For the Python one I'd say "I/O is done inside `foo` before returning". Would you? If not, why not? And if so, what purpose does it serve to not say the same for the Haskell one?
My Haskell is rusty enough that I don't know the proper syntax for it, but you can write a program that calls `foo` and then throws away (never uses) the resulting IO computation. Because Haskell is lazy, "Hello" will never be printed.
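A minimal sketch of such a program, for anyone who wants to try it. The `_unused` binding, the `seq` line, and the final "done" message are illustrative additions, not something from the comments above:

```haskell
foo :: IO ()
foo = putStrLn "Hello"

main :: IO ()
main = do
  -- Bind the IO action to a name without sequencing it into main.
  -- This only names the value; it does not run it.
  let _unused = foo
  -- Forcing the action to weak head normal form does not run it either.
  foo `seq` return ()
  -- The only action actually sequenced into main, so the only output.
  putStrLn "done"
```

Running this prints only "done". Note that the `seq` line forces `foo` to be evaluated and still nothing is printed, which suggests the behaviour may have less to do with laziness than with `foo` being an ordinary value of type `IO ()` that is only executed once it is sequenced into `main`.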