Your API was designed for human developers reading docs and writing integration code. But increasingly, the "developer" on the other end is an LLM deciding in real time which endpoint to call, how to handle the response, and whether to retry. That changes what "good API design" means.
This isn't about building a separate "AI API." It's about making your existing API better in ways that benefit both human developers and autonomous agents. Every pattern here is simply good engineering; agents force you to be honest about it.
When a human developer gets an error, they read the message, check the docs, fix their code, and redeploy. When an agent gets an error, it needs to decide right now — should I retry? With different parameters? After a delay? Or should I give up and try a different approach?
That decision tree requires structured, predictable error responses.
HTTP 400 Bad Request
{
  "error": "Something went wrong"
}

HTTP 422 Unprocessable Entity
{
  "error": {
    "type": "validation_error",
    "code": "invalid_parameter",
    "message": "email must be valid",
    "param": "email",
    "is_retriable": false
  }
}
The key fields an agent needs:
- type: a machine-readable error category (not a human sentence). The agent can switch on this.
- code: a specific, stable error code. invalid_parameter is actionable; "error" is not.
- param: which field caused the error. The agent can fix exactly that field.
- is_retriable: can the agent try again? This saves a round trip of guessing.

Stripe's error responses are the gold standard here. Each error carries a type and, where applicable, a code, a message, and the param that caused it. An agent can programmatically handle every failure mode without parsing English.
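The decision tree described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference client: the decide function, the Decision type, and the three action names are hypothetical, and the error body is assumed to follow the 422 example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    action: str                   # "retry", "fix_param", or "give_up"
    param: Optional[str] = None   # field to correct, when the API names one


def decide(body: dict) -> Decision:
    """Map an error response body to the agent's next step."""
    error = body.get("error")
    if not isinstance(error, dict):
        # Unstructured error such as {"error": "Something went wrong"}:
        # there is nothing to switch on, so the only safe choice is to stop.
        return Decision("give_up")
    if error.get("is_retriable"):
        return Decision("retry")
    if error.get("type") == "validation_error" and error.get("param"):
        return Decision("fix_param", param=error["param"])
    return Decision("give_up")
```

With the structured 422 body the agent learns it should correct the email field; with the bare string error it can only give up.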
Offset-based pagination (?page=3&per_page=20) seems simple, but it breaks for agents in subtle ways. If records are inserted or deleted between page requests, items get skipped or duplicated. An agent processing all records can't detect this — it just gets silently wrong data.
Cursor-based pagination solves this and is now the preferred pattern in 2026.
GET /api/users?page=3&per_page=20
{
  "data": [...],
  "page": 3,
  "total": 247
}
// Agent must calculate: am I done?
// What if total changed between requests?

GET /api/users?limit=20&after=usr_abc123
{
  "data": [...],
  "has_more": true,
  "next_cursor": "usr_xyz789"
}
// Agent knows: has_more? Use next_cursor.
// Stable across insertions/deletions.
The agent-friendly pagination response includes:
- has_more: boolean. The agent's stop condition. No arithmetic needed.
- next_cursor: opaque string. The agent doesn't need to understand it, just pass it back.
- /batch endpoint.
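The stop condition can be sketched as a short loop. This is an illustration under assumptions: iter_all and fetch_page are hypothetical names, and FAKE_PAGES stands in for real HTTP calls to an endpoint shaped like the cursor example above.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record, following next_cursor until has_more is false."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        if not page.get("has_more"):
            return
        cursor = page["next_cursor"]  # opaque: passed back unchanged


# Stand-in for GET /api/users?limit=2&after=<cursor>.
# A real client would issue an HTTP request here instead.
FAKE_PAGES = {
    None: {"data": [{"id": "usr_a"}, {"id": "usr_b"}], "has_more": True, "next_cursor": "usr_b"},
    "usr_b": {"data": [{"id": "usr_c"}], "has_more": False, "next_cursor": None},
}

ids = [user["id"] for user in iter_all(lambda cursor: FAKE_PAGES[cursor])]
```

The loop never inspects the cursor or computes page counts; it only checks has_more and hands next_cursor back.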
Also consider the Link header (RFC 8288) — it's how HTTP was designed to do this:
Link: </api/users?after=usr_xyz789>; rel="next",
</api/users?before=usr_abc123>; rel="prev"
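Reading that header takes little code. The parser below is a deliberately simplified sketch: parse_link_header is a hypothetical helper that only handles a quoted rel given as the first parameter, which is narrower than what RFC 8288 allows. HTTP client libraries often do this for you; the Python requests library, for example, exposes parsed links as Response.links.

```python
import re


def parse_link_header(value: str) -> dict:
    """Return {rel: url} for a Link header value like the one above."""
    links = {}
    for match in re.finditer(r'<([^>]*)>\s*;\s*rel="([^"]*)"', value):
        url, rel = match.groups()
        links[rel] = url
    return links


header = (
    '</api/users?after=usr_xyz789>; rel="next", '
    '</api/users?before=usr_abc123>; rel="prev"'
)
links = parse_link_header(header)
next_url = links.get("next")  # None when there is no further page
```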
Human developers can read prose docs, scan examples, and infer patterns. Agents need a structured contract that tells them exactly what's available, what parameters are required, and what the response looks like — before making a single request.
OpenAPI (Swagger) is the standard. If you do nothing else from this article, publish an OpenAPI spec. An agent that has your /openapi.json knows every endpoint, every parameter type, every possible response shape. It's the difference between "explore and guess" and "read the map."
{
  "paths": {
    "/users/{id}": {
      "get": {
        "summary": "Get a user by ID",
        "parameters": [{
          "name": "id",
          "in": "path",
          "required": true,
          "schema": { "type": "string", "pattern": "^usr_[a-z0-9]+$" }
        }],
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": { "$ref": "#/components/schemas/User" }
              }
            }
          },
          "404": {
            "content": {
              "application/json": {
                "schema": { "$ref": "#/components/schemas/Error" }
              }
            }
          }
        }
      }
    }
  }
}
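To show what "reading the map" can look like, here is a rough sketch that walks a parsed spec and lists each operation with its required parameters. The list_operations helper is hypothetical, the inline spec is a trimmed copy of the fragment above, and the sketch does not resolve $ref entries or path-level parameters.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec: dict) -> list:
    """Return (METHOD, path, required parameter names) for each operation."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            required = [
                p["name"]
                for p in operation.get("parameters", [])
                if p.get("required")
            ]
            operations.append((method.upper(), path, required))
    return operations


spec = {
    "paths": {
        "/users/{id}": {
            "get": {
                "summary": "Get a user by ID",
                "parameters": [{"name": "id", "in": "path", "required": True}],
            }
        }
    }
}
ops = list_operations(spec)
```

In practice the spec would be loaded from /openapi.json with json.load rather than written inline.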
Beyond OpenAPI, consider:
The auth spectrum, from best to worst for agents:
- API key: Authorization: Bearer sk_live_.... The agent stores it, sends it, never thinks about it again.
- OAuth 2.0 client_credentials: machine-to-machine flow. The agent exchanges a client ID and secret for a token. No browser, no human consent. Many APIs built for server-to-server access support this.
- OAuth 2.0 authorization_code: requires redirecting a human to a consent screen. An agent can't do this alone. It needs a human to authorize first, then it can use the refresh token.

If your API only supports session-based auth or OAuth authorization_code, consider adding an API key option for programmatic access. It's the single highest-impact change you can make for agent accessibility.
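For the client_credentials flow, the token exchange is a single form-encoded POST. The sketch below only builds the request and does not send it. The build_token_request and auth_header helpers, the token URL, and the credentials are all placeholders, and some authorization servers expect the client ID and secret via HTTP Basic auth rather than in the form body.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) an OAuth 2.0 client_credentials token request."""
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def auth_header(access_token: str) -> dict:
    """Header to attach to API calls once a token has been issued."""
    return {"Authorization": f"Bearer {access_token}"}


# Placeholder values for illustration only.
req = build_token_request("https://auth.example.com/oauth/token", "my-client", "my-secret")
```

No step in this exchange needs a browser or a person, which is what makes the flow workable for an agent.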
Rate limiting is necessary. But how you communicate limits determines whether an agent gracefully throttles or blindly hammers your endpoint.
HTTP 403 Forbidden
{
  "error": "Access denied"
}
// Is this auth? Rate limit? IP block?
// Agent has no idea what to do.

HTTP 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710400000
Retry-After: 30

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded",
    "is_retriable": true,
    "retry_after_seconds": 30
  }
}
The essential headers:
- X-RateLimit-Limit: total requests allowed per window. The agent can pace itself.
- X-RateLimit-Remaining: how many are left. The agent can preemptively slow down at 10% remaining.
- X-RateLimit-Reset: when the window resets (Unix timestamp).
- Retry-After: seconds to wait (on 429 responses). The most critical header for agents.
Send these on every response, not just 429s. An agent should be able to see Remaining: 5 and proactively slow down before hitting the wall. This is a conversation: "here's where you stand, here's what I can handle."
There's also an IETF Internet-Draft (not yet a published RFC) to standardize rate limit headers. Aligning with it future-proofs your API.
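On the agent side, those headers reduce to one question: how long should I wait before the next request? The function below is a sketch under assumptions. The name seconds_to_wait is hypothetical, the header names and the 10% threshold follow the text above, the headers are assumed to arrive with exactly this capitalization, and only the integer-seconds form of Retry-After is handled.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long to pause before the next request, in seconds."""
    now = time.time() if now is None else now

    # On a 429, the server's explicit instruction wins.
    retry_after = headers.get("Retry-After", "")
    if status == 429 and retry_after.isdigit():
        return float(retry_after)

    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return 0.0  # no usable rate limit information

    window_left = max(0.0, reset_at - now)
    if remaining == 0:
        return window_left  # budget exhausted: wait for the reset
    if remaining * 10 <= limit:
        # At or below 10% remaining: spread what is left over the window.
        return window_left / remaining
    return 0.0
```

Because the function works on any response, not just a 429, it only helps if the server sends the headers every time.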
Network failures happen. Agents retry. If a POST request creates a duplicate resource every time it's retried, you've got a problem. Idempotency keys solve this:
POST /api/payments
Idempotency-Key: pay_req_abc123
Content-Type: application/json

{
  "amount": 5000,
  "currency": "usd"
}
// First request: creates payment, returns 200
// Retry with same key: returns same 200, no duplicate
The agent generates a unique key per logical operation. If the network drops mid-response and the agent retries, the server recognizes the key and returns the original result. No duplicate charges, no duplicate records.
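The server side of this contract can be sketched with an in-memory lookup. This is an illustration only: PaymentService and its fields are hypothetical, and a production implementation would also persist and expire keys, handle concurrent requests, and reject a reused key that arrives with a different body.

```python
import uuid


class PaymentService:
    """In-memory stand-in for a server that honors Idempotency-Key."""

    def __init__(self) -> None:
        self._responses = {}  # idempotency key -> stored response
        self.payments = []    # payments actually created

    def create_payment(self, idempotency_key: str, amount: int, currency: str) -> dict:
        if idempotency_key in self._responses:
            # Retry of an operation already performed: replay the stored result.
            return self._responses[idempotency_key]
        payment = {
            "id": f"pay_{len(self.payments) + 1}",
            "amount": amount,
            "currency": currency,
        }
        self.payments.append(payment)
        response = {"status": 200, "body": payment}
        self._responses[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())  # one key per logical operation, reused on every retry
first = service.create_payment(key, 5000, "usd")
retry = service.create_payment(key, 5000, "usd")
```

The retry returns the stored response and the payments list still holds a single entry, which is the guarantee the agent relies on.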
If you're building an API in 2026, here's the minimum bar for agent-readiness:
- An OpenAPI spec published at /openapi.json
- Structured errors with type, code, param, and is_retriable
- Cursor-based pagination with has_more and next_cursor
- A Retry-After header on 429 responses

None of this is exotic. These are all well-established HTTP and REST conventions. The difference is that agents force you to actually implement them, because they can't work around the gaps the way human developers do.