I've worked at several companies where we'd discover hours later that critical webhooks from Stripe/Shopify never arrived (deployment, timeout, bug, etc.).
Every team ended up building the same solution: retry logic, dead letter queue, monitoring.
Curious how others handle this:
- Do you rely on the provider's retry policy?
- Built your own reliability layer?
- Use a service?
- Just manually reconcile when it happens?
(Context: Building https://relaehook.com to solve this, but genuinely curious what the norm is)
It's a trivial Go program, a day's work. Store each incoming webhook in Postgres and run the processor continuously.
Bizarrely, there are vendors who are weird about webhooks. Lifefile, as an example, charges pharmacies a dollar per webhook firing. So the pharmacies are crappy about retry policy.
Tbh I wouldn't buy any product in this space. It's too simple to build yourself: a dedicated HTTP server plus Postgres plus a processing loop. And with something this delicate already, I would rather not introduce more vendors.
No, not even if you converted it into an event queue via WebSocket or ZMQ or what have you.
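For anyone wondering what "HTTP server plus Postgres plus processing loop" might look like, here is a rough sketch, not anyone's production code. To keep it runnable without a database, the hypothetical `MemStore` stands in for the Postgres table; `Event`, `Worker`, `RunOnce`, `ErrTransient`, the status strings, and the port are all names and values made up for illustration. The idea is: the handler only persists the raw payload and acknowledges, and a separate loop processes pending events, retries with exponential backoff, and marks an event "dead" after too many attempts.

```go
package main

import (
	"errors"
	"io"
	"log"
	"net/http"
	"sync"
	"time"
)

// Event is one received webhook. With Postgres this would be a table row;
// the field names are illustrative.
type Event struct {
	ID          int
	Source      string
	Payload     []byte
	Status      string // "pending", "done", or "dead"
	Attempts    int
	NextAttempt time.Time
}

// MemStore is an in-memory stand-in for the Postgres table (assumption).
type MemStore struct {
	mu     sync.Mutex
	events []*Event
}

// Insert stores a new pending event and returns its ID.
func (s *MemStore) Insert(source string, payload []byte) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := &Event{ID: len(s.events) + 1, Source: source, Payload: payload,
		Status: "pending", NextAttempt: time.Now()}
	s.events = append(s.events, e)
	return e.ID
}

// Status returns an event's status, or "" if the ID is unknown.
func (s *MemStore) Status(id int) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if id < 1 || id > len(s.events) {
		return ""
	}
	return s.events[id-1].Status
}

// ErrTransient is a placeholder error a handler can return to request a retry.
var ErrTransient = errors.New("transient failure")

// Worker is the processing loop's configuration.
type Worker struct {
	Store       *MemStore
	Handle      func(Event) error
	MaxAttempts int
	BaseDelay   time.Duration // delay doubles after each failed attempt
}

// RunOnce tries every due pending event once and returns how many it tried.
// It holds the store lock for the whole pass to stay short; with Postgres you
// would more likely claim rows with SELECT ... FOR UPDATE SKIP LOCKED.
func (w *Worker) RunOnce() int {
	w.Store.mu.Lock()
	defer w.Store.mu.Unlock()
	now, tried := time.Now(), 0
	for _, e := range w.Store.events {
		if e.Status != "pending" || e.NextAttempt.After(now) {
			continue
		}
		tried++
		e.Attempts++
		switch err := w.Handle(*e); {
		case err == nil:
			e.Status = "done"
		case e.Attempts >= w.MaxAttempts:
			e.Status = "dead" // the "dead letter queue" is just this status
		default:
			e.NextAttempt = now.Add(w.BaseDelay << (e.Attempts - 1))
		}
	}
	return tried
}

// receive persists the raw body and acknowledges right away. A real handler
// would also verify the provider's signature before storing.
func receive(store *MemStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		store.Insert(r.URL.Path, body)
		w.WriteHeader(http.StatusOK)
	}
}

func main() {
	store := &MemStore{}
	worker := &Worker{Store: store, MaxAttempts: 5, BaseDelay: time.Second,
		Handle: func(e Event) error {
			log.Printf("processing event %d from %s", e.ID, e.Source)
			return nil
		}}
	go func() {
		for range time.Tick(time.Second) {
			worker.RunOnce()
		}
	}()
	log.Fatal(http.ListenAndServe(":8080", receive(store)))
}
```

Acknowledging before processing is the design choice that matters here: a bug or slow downstream call in `Handle` can no longer cause the provider to see a timeout, and anything that ends up "dead" is still sitting in the table for manual reconciliation.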