Write the description like a one-paragraph briefing
The AI uses the description to decide when to call your tool. Generic descriptions get generic behavior.
Always include a hint in your response
The AI reads the entire response body. If you return raw data without guidance, the AI guesses at what to say next. Bake in the next-action hint:
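A minimal sketch of such a response, assuming a hypothetical order-status tool. The `build_order_response` helper and the `status` and `eta` fields are illustrative and not part of the platform; only the idea of a plain-English `hint` comes from this guide.

```python
import json


def build_order_response(order: dict) -> str:
    """Return a small JSON body: the data plus a next-action hint."""
    body = {
        "status": order["status"],
        "eta": order["eta"],
        # One plain-English line telling the AI what to say next.
        "hint": (
            f"Tell the caller the order is {order['status']} "
            f"and should arrive {order['eta']}."
        ),
    }
    return json.dumps(body)


print(build_order_response({"status": "shipped", "eta": "Thursday"}))
```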
The hint field name is a convention, not a requirement; instruction, next_step, and say all work. The point is: include a one-liner in plain English that tells the AI what to say next.
Keep responses small
| Body size | Behavior |
|---|---|
| < 1 KB | Ideal. AI uses everything. |
| 1–4 KB | Fine. Slight latency cost. |
| 4–64 KB | Acceptable, but only the first 4 KB is shown in the dashboard preview. |
| > 64 KB | Truncated. AI gets first 64 KB + a [truncated] footer. |
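One way to stay inside the smaller rows of the table is to measure the encoded body and drop trailing list items until it fits. The sketch below assumes that 1 KB means 1024 bytes of UTF-8 JSON and that the response carries one list that is safe to shorten; `trim_results` and `total_available` are hypothetical names.

```python
import json

IDEAL_BYTES = 1024            # assumed meaning of the "< 1 KB" row
HARD_LIMIT_BYTES = 64 * 1024  # assumed meaning of the "> 64 KB" row


def body_size(payload: dict) -> int:
    """Size of the compact JSON-encoded body in bytes (UTF-8)."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def trim_results(payload: dict, list_key: str, budget: int = IDEAL_BYTES) -> dict:
    """Drop trailing items from payload[list_key] until the body fits the budget."""
    items = list(payload.get(list_key, []))  # copy, so the input is not mutated
    trimmed = {**payload, list_key: items, "total_available": len(items)}
    while items and body_size(trimmed) > budget:
        items.pop()
    return trimmed
```

Keeping `total_available` in the body lets the hint say something like "there are more results" without shipping all of them.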
Make endpoints idempotent
The AI might call the same tool twice in a row with the same arguments, usually because it got distracted by the caller’s next sentence and asks again. Side-effecting tools should be idempotent under that pattern. If you can’t make the tool idempotent (e.g. it charges a card), include a transaction_id in the response and have the AI confirm before reusing it.
Respond in under 2 seconds
The caller is waiting on a live phone call. The 10s budget exists for emergencies; the actual target is sub-2-second p95. Common ways to blow past 2s:
- Chained API calls inside your endpoint. Pre-cache.
- Database round-trips on every call. Add an index, denormalize, or memcache.
- Synchronous third-party API calls (Stripe, Salesforce). Cache aggressively.
If a lookup genuinely cannot finish in time, split it into two tools:
- start_lookup: kicks off the query, returns a job ID immediately.
- check_lookup: polls the job ID, returns “still processing” or the result.
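The two-tool split can be sketched as follows. This is an in-process illustration under stated assumptions: `_slow_query` stands in for your real backend, jobs live in a dict instead of a durable store, and the response field names are hypothetical.

```python
import threading
import time
import uuid

_jobs: dict = {}
_lock = threading.Lock()


def _slow_query(customer_id: str) -> dict:
    time.sleep(0.05)  # stand-in for a slow backend call
    return {"customer_id": customer_id, "balance": "42.00"}


def _run(job_id: str, customer_id: str) -> None:
    result = _slow_query(customer_id)
    with _lock:
        _jobs[job_id] = {"status": "done", "result": result}


def start_lookup(customer_id: str) -> dict:
    """Kick off the query in the background and return a job ID immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "processing"}
    threading.Thread(target=_run, args=(job_id, customer_id), daemon=True).start()
    return {"job_id": job_id,
            "hint": "Tell the caller you are looking it up, then call check_lookup."}


def check_lookup(job_id: str) -> dict:
    """Return 'still processing' or the finished result for a job ID."""
    with _lock:
        job = _jobs.get(job_id)
    if job is None:
        return {"error": "Unknown job", "hint": "Apologize and offer a callback."}
    if job["status"] != "done":
        return {"status": "still processing",
                "hint": "Ask the caller to hold a moment, then check again."}
    return {"status": "done", **job["result"],
            "hint": "Read the balance to the caller."}
```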
Don’t leak PII you didn’t intend to
Once data crosses the wire, the AI may verbalize it. Common leaks:
- Card numbers: Send the last 4 digits, never the full number.
- National IDs: Mask before returning.
- Internal SKU codes: Translate to customer-facing names.
- Stack traces: On error, return {"error": "Lookup failed", "hint": "Apologize and offer a callback."}, not the raw exception.
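A small sketch of two of these defenses, masking a card number and swallowing exceptions. The `fetch` callable and the `card_number` field are hypothetical; the error body is the one shown in the list above.

```python
import logging

logger = logging.getLogger("tool")


def mask_card(card_number: str) -> str:
    """Keep only the last four digits of a card number."""
    digits = [c for c in card_number if c.isdigit()]
    return "ending in " + "".join(digits[-4:])


def safe_response(fetch) -> dict:
    """Call fetch() and return a body that is safe for the AI to read aloud."""
    try:
        record = fetch()
    except Exception:
        # The full traceback goes to your logs, never into the response body.
        logger.exception("lookup failed")
        return {"error": "Lookup failed", "hint": "Apologize and offer a callback."}
    return {
        "card": mask_card(record["card_number"]),
        "hint": "Confirm the card on file by its last four digits only.",
    }
```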
Don’t trust LLM-provided arguments as gospel
The schema validates types but not content. If your tool takes a phone_number, the AI might pass "my mom's". Defend on your side:
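A minimal sketch of that defense, assuming you want numbers in E.164 form with a pragmatic minimum length. The `handle_send_sms` handler and its response fields are hypothetical; the pattern is to return an error plus a hint the AI can act on.

```python
import re
from typing import Optional

# Assumption: "+" followed by 8 to 15 digits, first digit non-zero.
_E164 = re.compile(r"^\+[1-9]\d{7,14}$")


def normalize_phone(raw) -> Optional[str]:
    """Strip common separators and return the number, or None if it is not valid."""
    if not isinstance(raw, str):
        return None
    candidate = re.sub(r"[\s\-().]", "", raw)
    return candidate if _E164.match(candidate) else None


def handle_send_sms(args: dict) -> dict:
    phone = normalize_phone(args.get("phone_number"))
    if phone is None:
        return {
            "error": "Invalid phone number",
            "hint": "Ask the caller to say their phone number with the "
                    "country code, one digit at a time.",
        }
    return {"ok": True, "phone_number": phone,
            "hint": "Tell the caller the text message is on its way."}
```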
When the response carries an error and a hint, the AI reads the hint and recovers. Returning 4xx would also work, but the AI’s response is less specific (HTTP_TOOL_CUSTOMER_ERROR).
Use Bearer tokens, even for “internal” endpoints
Anyone on the public internet can find your endpoint URL (it’s in our invocation logs, copied as curl, etc.). Without auth, anyone can hit it once they know the URL.
- Generate a token specific to this tool.
- Save it in the tool’s auth field.
- Verify it on your side and verify the HMAC signature. Both gates.
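A sketch of the two gates in Python. The signature scheme is an assumption, since this page does not specify it: the example treats the signature as a hex-encoded HMAC-SHA256 of the raw request body, sent in a hypothetical `X-Signature` header. Check the platform's signing documentation for the real header name and format before relying on this.

```python
import hashlib
import hmac


def verify_request(headers: dict, raw_body: bytes, token: str, signing_secret: bytes) -> bool:
    """Return True only if both the Bearer token and the HMAC signature check out."""
    # Gate 1: Bearer token, compared in constant time.
    auth = headers.get("Authorization", "")
    if not hmac.compare_digest(auth.encode("utf-8"), f"Bearer {token}".encode("utf-8")):
        return False

    # Gate 2: HMAC over the raw body (assumed scheme: hex HMAC-SHA256).
    sent = headers.get("X-Signature", "")  # hypothetical header name
    expected = hmac.new(signing_secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sent.encode("utf-8"), expected.encode("utf-8"))
```

Verify against the raw bytes you received, not a re-serialized copy of the parsed JSON, because re-encoding can change whitespace and key order.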
Audit the test fire before going live
Before you flip the tool to active, run Send test at least once and check:
- Does your endpoint receive the request? (Check your access logs.)
- Does signature verification succeed? (Check your auth logs.)
- Is the response shape what the AI will get? (Look at the Test panel.)
- Does the latency look acceptable?
Version your responses if you change them
If you change the shape of your tool’s response (e.g. you used to return extension and now you return routing_target), the AI’s description-based behavior will lag. Either:
- Make the new field a superset (return BOTH extension and routing_target for a transition window).
- Update the tool description to reflect the new shape.
- Test in a webhook.site / ngrok sandbox first.
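The superset option can be as small as emitting both keys during the transition. The `routing_response` helper and the hint wording are illustrative assumptions.

```python
def routing_response(target: str) -> dict:
    """Return the new field and the legacy field side by side."""
    return {
        "routing_target": target,  # new name
        "extension": target,       # legacy name, kept for the transition window
        "hint": f"Tell the caller you are transferring them to {target}.",
    }
```

Once the tool description has been updated and the invocations look healthy, the legacy key can be dropped.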
Monitor the invocations log
The Invocations panel shows the last 30 days of activity. Skim it weekly. Common things to look for:
- HTTP_TOOL_CUSTOMER_ERROR rate > 5% → your endpoint is flaky.
- HTTP_TOOL_TIMEOUT showing up → your endpoint is slow.
- HTTP_TOOL_INVALID_ARGS repeating → your description is misleading the AI.
- A particular tool is never called → maybe the AI can’t figure out when to use it; rewrite the description.
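If you copy invocation records out of the panel, the weekly skim can be scripted. The sketch below assumes a hypothetical record shape, a list of dicts with a `tool` name and an `error_code` that is `None` on success; this page does not document an export format, so adapt the field names to whatever you actually have.

```python
from collections import Counter


def weekly_flags(invocations: list, defined_tools: list) -> list:
    """Apply the checklist above to a list of invocation records."""
    flags = []
    total = len(invocations)
    errors = Counter(inv["error_code"] for inv in invocations if inv.get("error_code"))

    if total and errors["HTTP_TOOL_CUSTOMER_ERROR"] / total > 0.05:
        flags.append("customer error rate above 5%: endpoint is flaky")
    if errors["HTTP_TOOL_TIMEOUT"] > 0:
        flags.append("timeouts present: endpoint is slow")
    if errors["HTTP_TOOL_INVALID_ARGS"] > 1:
        flags.append("repeated invalid args: description may be misleading the AI")

    called = {inv["tool"] for inv in invocations}
    for name in sorted(set(defined_tools) - called):
        flags.append(f"{name} is never called: consider rewriting its description")
    return flags
```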
When NOT to use HTTP Tools
HTTP Tools is the right shape for synchronous, agent-decided lookups during a live call. Don’t use it for:
- Background work: webhooks pushed from your side to ours. We don’t support inbound webhooks via this mechanism.
- File uploads: body must be JSON / text. No multipart, no binary.
- Long-running jobs: anything > 10s. Split into start/check tools.
- Anything the AI shouldn’t be deciding to call: purely deterministic logic belongs in your own backend, called from your own systems.

