Webhook Performance Requirements and Design Considerations

Webhook endpoints need to respond immediately with a 2xx status code, in less than 1500 ms. Returning unsuccessful status codes will result in timeouts, confusing experiences, and data loss.

How Should My Integration Respond to Requests From Mango?

Many of Mango's events, such as line extension state changes, are time-sensitive and only viable for a short period of time. The webhook system has been designed as a fire-and-forget system. Given the nature of these events, and the high volume of events that move through Mango every second, there is no retry logic.

Therefore, we request that your integration always respond with a 200 HTTP response code.

What Happens If My Integration Responds With Anything Besides a 200?

After the third time Mango receives an unsuccessful response code (i.e., a 3xx, 4xx, or 5xx response code), the endpoint is detected as unavailable and traffic is cut off temporarily to help the third-party system recover. Requests will not be sent to the unavailable endpoint for 30 seconds. Events that occur during that window are lost, and because they are time-sensitive it would not make sense to replay them. When the timeout expires, the next event is sent to the endpoint. The timeout renews until we get a 2xx response, so the number of events potentially missed is enormous.
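
The cutoff behavior described above can be pictured with a small model. This is only an illustrative sketch, not Mango's actual implementation: the class name EndpointCutoffModel and its methods are hypothetical, and it assumes that failures are counted consecutively and that any 2xx response resets the count, which the text above does not spell out.

```python
class EndpointCutoffModel:
    """Toy model of the cutoff described above (hypothetical, not Mango's code)."""

    FAILURE_LIMIT = 3       # unsuccessful responses before traffic is cut off
    CUTOFF_SECONDS = 30.0   # how long the endpoint receives nothing

    def __init__(self) -> None:
        self.failures = 0
        self.blocked_until = 0.0

    def should_send(self, now: float) -> bool:
        """True if an event occurring at time `now` would be delivered."""
        return now >= self.blocked_until

    def record_response(self, status: int, now: float) -> None:
        """Update the model with the status code the endpoint returned."""
        if 200 <= status < 300:
            # Assumption: a success clears the failure count and any cutoff.
            self.failures = 0
            self.blocked_until = 0.0
            return
        self.failures += 1
        if self.failures >= self.FAILURE_LIMIT:
            # Every further failure renews the 30-second window.
            self.blocked_until = now + self.CUTOFF_SECONDS
```

In this model, every event that occurs while should_send returns False is simply dropped. At a hypothetical rate of 10 events per second, a single 30-second window would discard about 300 events, and each renewed window would discard roughly as many again.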

Instead of responding with non-success HTTP status codes, please promptly return a success code (2xx) to confirm that the message was received, and handle any other error states in your application logs or exception logging system.
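
One way to follow this advice is to wrap all handling in a catch-all that logs the problem and still reports success. The sketch below is a minimal illustration, not a prescribed implementation: the function name acknowledge, the logger name, and the process callback are hypothetical, and it assumes the payload is JSON, which you should verify against the events your integration actually receives.

```python
import json
import logging

logger = logging.getLogger("mango_webhook")  # hypothetical logger name


def acknowledge(raw_body: bytes, process) -> int:
    """Return the HTTP status to send back: always 200, even on failure."""
    try:
        event = json.loads(raw_body)  # assumption: the payload is JSON
        process(event)                # your own handling logic
    except Exception:
        # Record the failure locally instead of signalling it to the sender.
        logger.exception("Failed to handle webhook payload")
    return 200
```

Your web framework's request handler would call acknowledge with the request body and send back whatever status it returns. Parsing errors and bugs in process end up in your logs rather than in the response code.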

How Much Time Does My Integration Have to Respond?

Because all webhook integrations share the same cluster of workers, a slow response is detrimental to all integrations. Therefore (at the time of this writing), an integrated endpoint has up to 1.5 seconds to connect and 1.5 seconds to respond, but the quicker the better.
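
A common way to stay well inside this limit is to acknowledge the request first and do the slow work afterwards. The sketch below shows one possible approach using an in-memory queue and a background thread from the Python standard library. The names receive, start_worker, and events are hypothetical, and nothing here is specific to Mango.

```python
import logging
import queue
import threading

logger = logging.getLogger("mango_webhook")  # hypothetical logger name

# In-memory buffer between the HTTP handler and the slow work.
events = queue.Queue()


def receive(raw_body: bytes) -> int:
    """Call from the HTTP handler: enqueue the payload and acknowledge at once."""
    events.put_nowait(raw_body)
    return 200


def start_worker(process) -> threading.Thread:
    """Run `process` on queued payloads in a background thread."""
    def drain() -> None:
        while True:
            raw_body = events.get()
            try:
                process(raw_body)
            except Exception:
                logger.exception("Failed to process queued webhook payload")
            finally:
                events.task_done()

    thread = threading.Thread(target=drain, daemon=True)
    thread.start()
    return thread
```

With this split, the time spent in the request path is only the cost of a queue insert, no matter how long process takes. An in-memory queue is lost if the server restarts, so a durable queue may be a better fit where that matters.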

If you think about it, by the time we fire off a "Call Ringing" event, there is already some latency behind that event. If your server takes even a second to accept the connection and start processing that request, the phone could have already been ringing for a few seconds before your customer gets the screen pop. Remember, these events are time-sensitive, and a sluggish server inevitably has other call state events piling up behind it.

Will My Integration Be Blacklisted For Temporary "Bad Behavior"?

Servers go down and bad things happen. But if your server is 1) always returning a 200 and 2) usually responding quickly and consistently, you'll probably never hear from us. On the other hand, if your server is not behaving properly and it is having a detrimental effect on our platform, we may need to disable the integration until the problems are resolved.