Webhooks error handling and limitations
Webhooks have certain limitations. If your system’s API has requirements that a webhook can’t meet, you can set the target to be:
- A function, such as AWS Lambda, Google Cloud Functions, or Azure Functions, which can then call the third-party API.
- An iPaaS platform, such as Zapier or Workato, which can then call the third-party API.
HTTP response codes
Stedi considers a 2xx response a success, and marks any other response as a failure.
Stedi retries events that receive a status code other than 2xx up to 4 times, waiting 90 seconds between retries.
If the maximum number of retries has been exhausted, Stedi adds the event to the error queue for the webhook.
You can set the Concurrency when configuring the webhook to prevent throttling. This setting determines the maximum number of deliveries that Stedi attempts to the endpoint at one time.
Timeouts
The target endpoint must respond with a 2xx status code within 5 seconds, or the event will be counted as a failed delivery.
Because of this timeout limitation, we recommend designing your webhook endpoints to immediately acknowledge receipt with a 2xx response, then process the data asynchronously. See Best practices for webhook endpoints.
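A minimal sketch of this acknowledge-then-process pattern, using only the Python standard library. The names inbox, accept, process_event, and the port 8080 are illustrative assumptions rather than anything Stedi defines, and the in-memory queue stands in for whatever durable storage you would use in production.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory hand-off between the HTTP handler and the worker.
# Assumption for illustration: a durable queue or database would
# survive restarts, this one does not.
inbox = queue.Queue()
processed = []


def accept(raw: bytes) -> int:
    """Store the raw payload and return the status code to send back."""
    inbox.put(raw)  # store only, no processing on the request path
    return 200


def process_event(event: dict) -> None:
    """Hypothetical placeholder for slow business logic."""
    processed.append(event)


def worker() -> None:
    """Drain the inbox in the background, one payload at a time."""
    while True:
        raw = inbox.get()
        try:
            process_event(json.loads(raw))
        except Exception as exc:
            print(f"processing failed: {exc}")
        finally:
            inbox.task_done()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = accept(self.rfile.read(length))
        self.send_response(status)  # acknowledge right away
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the background worker, then serve webhook requests."""
    threading.Thread(target=worker, daemon=True).start()
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the worker and the listener. Because the request path only enqueues the payload, the response time stays well under the 5 second limit even when process_event is slow.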
Retries and duplicate deliveries
When a delivery fails, Stedi retries up to 4 times, waiting 90 seconds between retries. If the final retry also fails, Stedi moves the event to the error queue.
If your webhook doesn’t respond within 5 seconds, Stedi marks that as a failure and then automatically retries. This can result in duplicate deliveries.
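As a rough worked example of this schedule: with up to 4 retries spaced 90 seconds apart, an event can be attempted 5 times in total over about 6 minutes before it reaches the error queue. The sketch below assumes the wait is measured from one attempt to the next and ignores the time each request takes; the exact timing is not specified here, so treat the numbers as approximate.

```python
MAX_RETRIES = 4          # from the retry policy described above
RETRY_WAIT_SECONDS = 90  # from the retry policy described above


def attempt_offsets(max_retries: int = MAX_RETRIES,
                    wait: int = RETRY_WAIT_SECONDS) -> list[int]:
    """Seconds after the first attempt at which each attempt starts.

    Assumption: waits are measured attempt to attempt, and request
    duration (up to the 5 second timeout) is ignored.
    """
    return [n * wait for n in range(max_retries + 1)]


print(attempt_offsets())  # [0, 90, 180, 270, 360]
```

An endpoint outage longer than roughly 6 minutes is therefore likely to push events into the error queue rather than being absorbed by automatic retries.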
Error queue
Each webhook includes an error queue. Each item in the queue contains the original event that Stedi attempted to deliver. If the target service has downtime, or anything else goes wrong, the missed events can be retried later. The error queue retains items for 14 days.
The order of the error queue is not guaranteed. The downstream service must be designed to be idempotent to handle at-least-once delivery of events, and must accept events out of order.
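One way a downstream service can tolerate out-of-order and repeated events is to keep only the newest update per record and ignore anything older or identical. The sketch below is an illustration under assumptions: key and sequence are hypothetical values that your own data would need to supply (for example a record identifier and a version or timestamp), and they are not presented here as fields of the Stedi event payload.

```python
from typing import Any

# key -> (sequence, value) for the newest update applied so far.
latest: dict[str, tuple[int, Any]] = {}


def apply_update(key: str, sequence: int, value: Any) -> bool:
    """Apply an update unless a newer or identical one was already applied.

    Returns True if the stored value changed.
    """
    current = latest.get(key)
    if current is not None and current[0] >= sequence:
        return False  # stale or duplicate event: safe to ignore
    latest[key] = (sequence, value)
    return True
```

With this approach, replaying the error queue in any order leaves each record in the same final state.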
Logs
To view logs, click the webhook to go to its detail page, and then navigate to the Logs tab.
Deauthorized connections
If a webhook sends a message to an endpoint that returns a 401 (Unauthorized) response, the destination will be ‘deauthorized’. In this state, the webhook won’t be able to deliver messages.
If there is an issue with your authentication information (such as the password, API key, or OAuth settings), edit the webhook to fix it.
If the authentication information is correct, and there was a different reason for the endpoint returning a 401, you can try again by adding a temporary header, for example x-stedi-reauthorize with today’s date as the value. When you save, the webhook will attempt to deliver again. You can remove this header later. Editing the value of an existing header also restarts deliveries.
You will likely have a queue of messages to deliver, so Stedi will automatically start retrying them after you make this change. If the endpoint is still returning an invalid response, the webhook will return to Deauthorized.
Best practices for webhook endpoints
When creating endpoints to receive webhooks from Stedi, we recommend the following architecture:
- Acknowledge first, process later: Design your endpoint to immediately return a 2xx status code to acknowledge receipt, then process the payload asynchronously.
- Store payloads for processing: Capture the webhook data in a queue, database, or other storage mechanism before processing.
- Process asynchronously: Handle the actual business logic in a separate process or worker after acknowledging receipt.
- Implement idempotency: Use idempotency keys from the event payload to prevent duplicate processing.
  - Store the eventId from each webhook payload in your database.
  - Before processing an incoming webhook, check if its eventId has already been processed.
  - Design operations to be idempotent, ensuring that processing the same event multiple times doesn’t cause issues (e.g., avoid incrementing counters on each processing attempt).
This architecture prevents timeouts, handles potential duplicate deliveries, and allows you to process high volumes of events.
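The eventId check described above could be sketched as follows, using SQLite from the Python standard library. The table name processed_events and the helper handle_once are hypothetical, and extracting the eventId from the payload is left to the caller because the payload layout is not covered here.

```python
import sqlite3

# Assumption for illustration: an in-memory database. Use a file or
# your real database so the record of processed events survives restarts.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")


def handle_once(event_id: str, handler) -> bool:
    """Run handler at most once per event_id. Returns False for duplicates."""
    with db:  # one transaction: commit on success, roll back on error
        cur = db.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 0:
            return False  # eventId already recorded: duplicate delivery
        handler()  # if this raises, the insert above is rolled back
    return True
```

Recording the eventId and running the handler in one transaction means a failed handler leaves no record, so a later retry is processed normally. Side effects that the handler performs outside this database are not rolled back, so those operations still need to be idempotent on their own.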