Rate Limits

UniAuth enforces rate limits on all API endpoints to protect the platform from abuse, prevent brute-force attacks, and ensure fair usage for all integrations. This page documents the rate limit configuration for every endpoint, how limits are enforced, and best practices for staying within them.

Overview

Rate limits are enforced using a sliding window algorithm. Each window is 15 minutes long and limits are applied per IP address. When you make a request, UniAuth checks how many requests the originating IP has made to that specific endpoint within the current 15-minute window. If the count exceeds the limit, the request is rejected with a 429 Too Many Requests response.

Sensitive authentication endpoints have stricter limits than general-purpose endpoints. This design allows normal application usage to proceed unimpeded while making automated attacks impractical.

Note: Rate limits apply to all callers equally, including authenticated requests. If you are building a backend integration that calls UniAuth APIs on behalf of multiple users, all requests from your server IP count toward the same limit.
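The per-IP, per-endpoint sliding window described above can be sketched in Python. This is an illustrative in-memory version, not UniAuth's actual implementation; the class and method names are hypothetical:

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 15 * 60  # all UniAuth windows are 15 minutes

class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the sliding window."""

    def __init__(self, limit):
        self.limit = limit
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key, now=None):
        now = now if now is not None else time.time()
        window = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if len(window) >= self.limit:
            return False  # caller should respond 429 Too Many Requests
        window.append(now)
        return True

# e.g. POST /api/auth/login: 10 requests per 15 minutes per IP
limiter = SlidingWindowLimiter(limit=10)
limiter.allow("203.0.113.7")  # True until the 11th call within 15 minutes
```

The key would typically combine the client IP and the endpoint path, since limits are tracked per endpoint.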

Per-Endpoint Limits

The following table lists the rate limit for each endpoint. All windows are 15 minutes.

Endpoint                           Limit          Window
POST /api/auth/login               10 requests    15 minutes
POST /api/auth/register            5 requests     15 minutes
POST /api/auth/forgot-password     5 requests     15 minutes
POST /api/auth/2fa/send-code       10 requests    15 minutes
POST /api/oauth/token              30 requests    15 minutes
GET /api/oauth/authorize           20 requests    15 minutes
POST /api/oauth/introspect         30 requests    15 minutes
POST /api/oauth/end-session        10 requests    15 minutes
All other /api/* endpoints         100 requests   15 minutes

The limits above are defaults. Self-hosted deployments can adjust these thresholds through environment configuration. Contact your administrator if you need higher limits for a specific use case.

Rate Limit Response

When a rate limit is exceeded, UniAuth returns the following response:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 900

{
  "success": false,
  "message": "Too many requests. Please try again later."
}

Response Headers

Header         Description
Retry-After    Number of seconds until the current rate limit window resets. Wait at least this long before retrying.

Handling Rate Limits

Your application should gracefully handle rate limit responses by implementing retry logic with exponential backoff. Below are examples in multiple languages.

JavaScript / TypeScript

async function fetchWithRetry(url, options, maxRetries = 3) {
  let attempt = 0;

  while (attempt < maxRetries) {
    const response = await fetch(url, options);

    if (response.status !== 429) {
      return response;
    }

    attempt++;
    if (attempt >= maxRetries) {
      throw new Error("Rate limit exceeded after maximum retries");
    }

    // Use the Retry-After header, or fall back to exponential backoff.
    // parseInt yields NaN if the header is missing or non-numeric
    // (e.g. an HTTP-date), so guard with Number.isFinite.
    const retryAfter = parseInt(response.headers.get("Retry-After"), 10);
    const delay = Number.isFinite(retryAfter)
      ? retryAfter * 1000
      : Math.min(1000 * Math.pow(2, attempt), 30000);

    console.log(`Rate limited. Retrying in ${delay / 1000}s (attempt ${attempt}/${maxRetries})`);
    await new Promise(resolve => setTimeout(resolve, delay));
  }
}

// Usage
const response = await fetchWithRetry("https://your-uniauth-instance.com/api/oauth/token", {
  method: "POST",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  body: new URLSearchParams({
    grant_type: "authorization_code",
    code: authorizationCode,
    redirect_uri: "https://your-app.com/callback",
    client_id: "your-client-id",
    client_secret: "your-client-secret",
    code_verifier: codeVerifier,
  }),
});

Python

import time
import requests

def fetch_with_retry(url, method="GET", max_retries=3, **kwargs):
    for attempt in range(max_retries):
        response = requests.request(method, url, **kwargs)

        if response.status_code != 429:
            return response

        if attempt + 1 >= max_retries:
            raise Exception("Rate limit exceeded after maximum retries")

        # Use the Retry-After header or exponential backoff. Guard against
        # non-numeric values (e.g. an HTTP-date), which int() would reject.
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = int(retry_after)
        else:
            delay = min(2 ** (attempt + 1), 30)

        print(f"Rate limited. Retrying in {delay}s (attempt {attempt + 1}/{max_retries})")
        time.sleep(delay)

# Usage
response = fetch_with_retry(
    "https://your-uniauth-instance.com/api/oauth/token",
    method="POST",
    data={
        "grant_type": "authorization_code",
        "code": authorization_code,
        "redirect_uri": "https://your-app.com/callback",
        "client_id": "your-client-id",
        "client_secret": "your-client-secret",
        "code_verifier": code_verifier,
    },
)

Go

package main

import (
    "fmt"
    "math"
    "net/http"
    "strconv"
    "time"
)

func fetchWithRetry(req *http.Request, maxRetries int) (*http.Response, error) {
    client := &http.Client{}

    // Note: a request with a body can only be re-sent if req.GetBody is set,
    // because the body is consumed on the first attempt. Requests built with
    // http.NewRequest from a bytes.Reader or strings.Reader set this automatically.
    for attempt := 0; attempt < maxRetries; attempt++ {
        resp, err := client.Do(req)
        if err != nil {
            return nil, err
        }

        if resp.StatusCode != http.StatusTooManyRequests {
            return resp, nil
        }
        resp.Body.Close()

        if attempt+1 >= maxRetries {
            return nil, fmt.Errorf("rate limit exceeded after %d retries", maxRetries)
        }

        retryAfter := resp.Header.Get("Retry-After")
        var delay time.Duration
        if seconds, err := strconv.Atoi(retryAfter); err == nil {
            delay = time.Duration(seconds) * time.Second
        } else {
            delay = time.Duration(math.Min(math.Pow(2, float64(attempt+1)), 30)) * time.Second
        }

        fmt.Printf("Rate limited. Retrying in %v (attempt %d/%d)\n", delay, attempt+1, maxRetries)
        time.Sleep(delay)
    }

    return nil, fmt.Errorf("unreachable")
}

Distributed Rate Limiting

By default, rate limits are enforced per application instance using in-memory storage. This works well for single-server deployments but can lead to inconsistent enforcement when running multiple instances behind a load balancer.

When Redis is configured, UniAuth automatically switches to distributed rate limiting. Limits are enforced globally across all instances using a Redis-backed sliding window algorithm. This ensures that a client sending requests to different instances is still correctly rate-limited against the same counter.

Mode                        Storage     Behavior
Single instance (default)   In-memory   Limits enforced per instance. Counters reset when the server restarts.
Multi-instance (Redis)      Redis       Limits enforced globally across all instances. Counters persist across restarts and are shared.

To enable Redis-backed rate limiting, configure the REDIS_URL environment variable in your deployment. See the Configuration guide for details.
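A common way to implement a Redis-backed sliding window is a sorted set of request timestamps, scored by time. The sketch below is illustrative only, not UniAuth's actual code; it assumes a redis-py-style client, and the `allow_request` function and `ratelimit:` key prefix are hypothetical:

```python
import time
import uuid

WINDOW_SECONDS = 15 * 60

def allow_request(redis_client, key, limit, now=None):
    """Sliding-window check backed by a Redis sorted set of request timestamps."""
    now = now if now is not None else time.time()
    zkey = f"ratelimit:{key}"

    # Evict entries older than the window, then count what remains.
    pipe = redis_client.pipeline()
    pipe.zremrangebyscore(zkey, 0, now - WINDOW_SECONDS)
    pipe.zcard(zkey)
    _, count = pipe.execute()

    if count >= limit:
        return False  # caller should respond 429 with Retry-After

    # Record this request; a unique member avoids collisions at equal timestamps.
    pipe = redis_client.pipeline()
    pipe.zadd(zkey, {uuid.uuid4().hex: now})
    pipe.expire(zkey, WINDOW_SECONDS)  # self-clean keys for idle clients
    pipe.execute()
    return True
```

Because every instance reads and writes the same sorted set, a client spreading requests across instances behind a load balancer is still counted against one shared limit.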

Best Practices

Follow these guidelines to minimize rate limit issues in your integration:

  • Cache token responses. Access tokens are valid for 1 hour. Store the token and its expiration time, and reuse it for subsequent API calls rather than requesting a new token for every request.
  • Use refresh tokens instead of re-authenticating. When an access token expires, exchange the refresh token for a new one. This uses the token endpoint (30 requests/15 min) rather than the login endpoint (10 requests/15 min), and does not require user interaction.
  • Implement client-side throttling. Before sending requests, maintain a local counter and delay or queue requests when approaching the limit. This provides a better user experience than hitting the rate limit and retrying.
  • Spread batch operations over time. If you need to perform bulk operations (such as provisioning multiple users), spread the requests across the rate limit window rather than sending them all at once. Use the SCIM API for bulk user provisioning.
  • Use webhooks for real-time updates. Instead of polling the API for changes, configure webhooks to receive push notifications when events occur. This eliminates the need for repeated API calls.
  • Respect the Retry-After header. When you receive a 429 response, always use the Retry-After header value rather than a fixed delay. This ensures you resume at the earliest possible time without unnecessary waiting.
  • Avoid retry storms. If multiple instances of your application are hitting rate limits simultaneously, add jitter to your retry delays. Instead of all instances retrying at the same time, each should add a random offset:
// Add jitter to prevent retry storms
const baseDelay = parseInt(retryAfter, 10) * 1000;
const jitter = Math.random() * 2000; // 0-2 second random jitter
await new Promise(resolve => setTimeout(resolve, baseDelay + jitter));
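The token-caching practice from the list above can be sketched as follows. `TokenCache` and `fetch_token` are illustrative names; in practice `fetch_token` would call your token endpoint (POST /api/oauth/token) and return the access token and its `expires_in` value:

```python
import time

class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, skew=60):
        self.fetch_token = fetch_token  # callable returning (token, expires_in)
        self.skew = skew                # refresh this many seconds early
        self.token = None
        self.expires_at = 0.0

    def get(self, now=None):
        now = now if now is not None else time.time()
        if self.token is None or now >= self.expires_at - self.skew:
            # Token missing or about to expire: fetch a fresh one.
            self.token, expires_in = self.fetch_token()
            self.expires_at = now + expires_in
        return self.token
```

With a 1-hour token lifetime, this turns what could be hundreds of token requests into at most one per hour per process, keeping you well under the 30 requests/15 min limit on the token endpoint.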

Rate Limits vs. Account Lockout

Rate limits and account lockout are separate security mechanisms that work in parallel:

Mechanism         Scope                          Trigger                                                 HTTP Status
Rate Limit        Per IP address, per endpoint   Too many requests from one IP, regardless of outcome    429
Account Lockout   Per user account               Too many failed login attempts for a specific account   423

A single attacker can trigger both: the rate limit (by sending many requests from one IP) and the account lockout (by failing authentication for a specific user). Your error handling should distinguish between the two status codes and display appropriate messages to the user. See the Error Reference for details on both response formats.
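A minimal sketch of this branching, with illustrative user-facing messages (the function name and wording are hypothetical, not part of the UniAuth API):

```python
def describe_auth_error(status, headers):
    """Map rate-limit vs. account-lockout responses to distinct user messages."""
    if status == 429:
        # Rate limited: tell the caller when to retry, per the Retry-After header.
        retry_after = headers.get("Retry-After", "900")
        return f"Too many requests. Try again in {retry_after} seconds."
    if status == 423:
        # Account locked: retrying from the same IP will not help.
        return "This account is temporarily locked after repeated failed logins."
    return None  # not an error this handler deals with
```

The important distinction: a 429 is tied to the caller's IP and clears when the window resets, while a 423 is tied to the account and persists until the lockout period ends, regardless of which IP retries.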

Related documentation: Error Reference · Security Guide · Token Reference · Webhooks