Chatbot UX Best Practices
How AffZero's AI assistant implements the same interaction patterns used by ChatGPT, Claude, and Gemini for reliable, safe, and pleasant chat experiences.
This document captures the principles behind AffZero's AI assistant design and explains how each one was implemented. It is intended for developers extending or maintaining the chat feature.
1. Never ask the model to author UI
The problem. If you instruct the LLM to print a JSON object that the client then parses, the model is acting as a template engine. It may paraphrase, omit fields, add extra text, or wrap the object in markdown code fences. Any of those deviations silently breaks the UI.
The pattern used by ChatGPT / Claude / Gemini. Tool calls are structured events in the streaming protocol — not text. The client reacts to the data, not to anything the model writes.
How AffZero implements it. The server returns result.toUIMessageStreamResponse() (AI SDK v6), which streams UIMessageChunk SSE events. When a write tool runs, the SDK emits a tool-output-available chunk containing the tool result as typed data. The client parses only text-delta chunks for the visible bubble and tool-output-available chunks (where output.__confirmation_required === true) for confirmation cards. The model's text never influences whether a card appears.
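As a rough illustration of the client side, the sketch below routes already-parsed chunks into visible text and confirmation cards. The `StreamChunk` shape, `routeChunks`, and `isConfirmation` are hypothetical names with assumed fields (`delta`, `toolCallId`, `output`); the real chunk types come from the AI SDK and may carry more fields.

```typescript
// Assumed, simplified chunk shape; the SDK's real chunk types are richer.
interface StreamChunk {
  type: string;
  delta?: string;
  toolCallId?: string;
  output?: unknown;
}

interface ConfirmationCard {
  toolCallId: string;
  output: Record<string, unknown>;
}

// True only for typed tool output that carries the confirmation flag.
function isConfirmation(output: unknown): output is Record<string, unknown> {
  return (
    typeof output === "object" &&
    output !== null &&
    (output as Record<string, unknown>).__confirmation_required === true
  );
}

// Text chunks feed the bubble; flagged tool outputs feed the cards.
function routeChunks(chunks: StreamChunk[]): { text: string; confirmations: ConfirmationCard[] } {
  let text = "";
  const confirmations: ConfirmationCard[] = [];
  for (const chunk of chunks) {
    const { type, delta, toolCallId, output } = chunk;
    if (type === "text-delta" && typeof delta === "string") {
      text += delta;
    } else if (
      type === "tool-output-available" &&
      typeof toolCallId === "string" &&
      isConfirmation(output)
    ) {
      confirmations.push({ toolCallId, output });
    }
    // Every other chunk type is ignored by this sketch.
  }
  return { text, confirmations };
}
```

Note that a text chunk containing the literal string `__confirmation_required` never produces a card here, because cards are built only from typed tool output.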
2. Halt generation after a pending action
The problem. When a model keeps generating text after calling write tools, it tends to hallucinate the outcome ("Done!") before the user has approved anything, or to invent UI limitations ("I can only show one card at a time").
The pattern. Stop the agentic loop the moment a confirmation is pending; resume after the user responds.
How AffZero implements it. A custom stopWhen predicate is passed alongside stepCountIs(8):
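import type { StopCondition } from "ai";

// Stop the loop as soon as any tool result in the latest step requests confirmation.
const stopOnConfirmation: StopCondition<any> = ({ steps }) => {
  const last = steps[steps.length - 1];
  return !!last?.toolResults.some((r: any) => r?.output?.__confirmation_required);
};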
The loop halts immediately; the user sees the confirmation card(s) and nothing else until they act.
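To see how the predicate behaves next to a step limit, here is a small self-contained simulation. `StepLike`, `StopPredicate`, `maxSteps`, and `shouldStop` are local stand-ins invented for this sketch, and it assumes that multiple stop conditions combine with OR semantics (the loop stops when any one of them is true).

```typescript
// Local stand-ins for the SDK's step shape; only the fields used here.
interface ToolResultLike {
  output?: { __confirmation_required?: boolean };
}
interface StepLike {
  toolResults: ToolResultLike[];
}
type StopPredicate = (state: { steps: StepLike[] }) => boolean;

// Same logic as the predicate above, typed against the local stand-ins.
const stopOnConfirmation: StopPredicate = ({ steps }) => {
  const last = steps[steps.length - 1];
  return !!last?.toolResults.some((r) => r.output?.__confirmation_required);
};

// Hypothetical equivalent of a step-count limit.
const maxSteps = (n: number): StopPredicate => ({ steps }) => steps.length >= n;

// Assumption: the loop stops when ANY condition holds.
function shouldStop(conditions: StopPredicate[], steps: StepLike[]): boolean {
  return conditions.some((condition) => condition({ steps }));
}

const readStep: StepLike = { toolResults: [{ output: {} }] };
const writeStep: StepLike = { toolResults: [{ output: { __confirmation_required: true } }] };
const conditions = [stopOnConfirmation, maxSteps(8)];

const afterRead = shouldStop(conditions, [readStep]); // false: keep looping
const afterWrite = shouldStop(conditions, [readStep, writeStep]); // true: card pending
```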
3. Parallel multi-action confirmations
The problem. Forcing a single confirmation per turn makes editing 10 conversions take 10 round-trips. Some earlier implementations also told the model that "only one card can be shown at a time", which was both wrong and user-hostile.
The pattern. Let the model call write tools multiple times in one turn (parallel tool calls). Each call produces one card. All cards render simultaneously.
How AffZero implements it. The stream-parsing loop collects every confirmation found in tool-output-available chunks into a newConfirmations array, deduplicated by toolCallId (earlier versions used actionKey; see section 4). After the stream ends, setPendingConfirmations(newConfirmations) fires once with the full set. The system prompt explicitly says:
"To act on multiple records, call the write tool once per record in the SAME turn — each renders its own card. Never claim only one can be shown at a time."
4. Deferred direct execution — no model round-trip on Confirm
The old problem. Previously, "confirming" meant re-sending the conversation to the model and hoping it re-called the write tool with the exact same parameters. This had two failure modes:
- Infinite confirmation loop — if the model re-derived parameters slightly differently, a fresh confirmation card appeared instead of executing.
- Re-fetch slowness — the model re-called read tools on every turn because tool results were discarded between turns.
The pattern. Capture the full {action, params} payload from the pending confirmation card on the client. On Confirm, POST those payloads directly to a server endpoint that runs getAction(action).run(ctx, params) without invoking the LLM at all. Return confirmedResults as plain JSON. The client renders a deterministic summary.
How AffZero implements it.
- hooks/use-chat-engine.ts: confirmActionsImpl POSTs { confirmedActions: [{ action, params }, ...] }.
- app/api/ai/chat/route.ts: when confirmedActions is present and non-empty, the route short-circuits before streamText, executes each action through the registry, logs to chat_agent_runs, and returns { confirmedResults }. incrementAiCalls is NOT called, because no LLM token was used.
- buildChatTools no longer has a confirmedSet check: write tools ALWAYS return __confirmation_required. Execution only happens through the direct branch.
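The direct-execution branch can be sketched roughly as follows. The registry contents, the `ActionContext` fields, and the result shape are hypothetical; only the `getAction(action).run(ctx, params)` call pattern and the `{ confirmedResults }` return value come from the description above. Logging to chat_agent_runs is omitted.

```typescript
interface ActionContext {
  orgId: string;
}
interface ConfirmedAction {
  action: string;
  params: Record<string, unknown>;
}
interface ActionResult {
  action: string;
  success: boolean;
  result?: unknown;
  error?: string;
}
interface ActionDef {
  run(ctx: ActionContext, params: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical registry with one illustrative action.
const registry = new Map<string, ActionDef>([
  ["tracker_action_edit", { run: async (_ctx, params) => ({ updated: params.id }) }],
]);

function getAction(name: string): ActionDef | undefined {
  return registry.get(name);
}

// Runs each confirmed action directly; the LLM is never invoked on this path.
async function runConfirmedActions(
  ctx: ActionContext,
  confirmedActions: ConfirmedAction[],
): Promise<{ confirmedResults: ActionResult[] }> {
  const confirmedResults: ActionResult[] = [];
  for (const { action, params } of confirmedActions) {
    const def = getAction(action);
    if (!def) {
      confirmedResults.push({ action, success: false, error: `Unknown action: ${action}` });
      continue;
    }
    try {
      confirmedResults.push({ action, success: true, result: await def.run(ctx, params) });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      confirmedResults.push({ action, success: false, error: message });
    }
  }
  return { confirmedResults };
}
```

Because the parameters are replayed exactly as captured on the card, the executed action cannot drift from what the user approved.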
Deduplication. Pending confirmations are now deduplicated by toolCallId (unique per invocation, assigned by the AI SDK) rather than by actionKey. This ensures that 7 offer edits on the same connection produce 7 distinct cards even when the earlier computeActionKey hash was colliding.
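A minimal sketch of deduplication by toolCallId, assuming a hypothetical `PendingConfirmation` shape. The first occurrence of each id wins, so a chunk that is replayed does not create a second card, while distinct calls to the same action stay distinct.

```typescript
// Assumed shape of a pending confirmation held on the client.
interface PendingConfirmation {
  toolCallId: string;
  action: string;
  params: Record<string, unknown>;
}

// Keeps the first confirmation seen for each toolCallId, preserving order.
function dedupeByToolCallId(items: PendingConfirmation[]): PendingConfirmation[] {
  const byId = new Map<string, PendingConfirmation>();
  for (const item of items) {
    if (!byId.has(item.toolCallId)) {
      byId.set(item.toolCallId, item);
    }
  }
  return Array.from(byId.values());
}
```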
computeActionKey fix. The function now uses a fully recursive stable JSON serialisation so data: { id: "195" } and data: { id: "196" } produce different keys. The key is used only for logging and display — never for execution matching.
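One way to get a recursive, order-independent serialisation is sketched below. This is not the project's actual computeActionKey: the `stableStringify` helper and the plain string key format (no hashing) are assumptions made for illustration.

```typescript
// Serialises a JSON-like value with object keys sorted at every depth.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") {
    // JSON.stringify returns undefined for undefined; fall back to "null".
    return JSON.stringify(value) ?? "null";
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  const obj = value as Record<string, unknown>;
  const keys = Object.keys(obj).sort();
  const parts = keys.map((key) => JSON.stringify(key) + ":" + stableStringify(obj[key]));
  return "{" + parts.join(",") + "}";
}

// Hypothetical key format: action name plus the stable serialisation of params.
function computeActionKey(action: string, params: unknown): string {
  return action + ":" + stableStringify(params);
}

const keyA = computeActionKey("tracker_action_edit", { data: { id: "195" } });
const keyB = computeActionKey("tracker_action_edit", { data: { id: "196" } });
// keyA and keyB differ because nested values are part of the key.
```

Sorting keys at every level means two payloads that differ only in property order map to the same key, while any difference in a nested value changes it.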
4a. Short-TTL read cache
The problem. Without caching, re-opening the chatbot or starting a follow-up turn (e.g. "also add aff_sub3 to those offers") forces the model to re-fetch all platform data — adding 2–5 s round-trips per turn.
The pattern. Cache expensive read-tool results in server memory for a short TTL (3 minutes). On the next turn, the same query returns instantly from memory.
How AffZero implements it. lib/ai/tool-cache.ts holds a module-level Map<string, {value, expires}>. In buildChatTools, pull_stats and tracker_action_get calls check getCached with a key of the form orgId::toolId::params before executing. On success, setCached stores the result. Error results are never cached. The cache is bounded at 500 entries with LRU-style eviction.
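A cache along these lines could look like the sketch below. The function bodies, the `cacheKey` helper, the injectable `now` parameter, and the use of Map insertion order for eviction are assumptions; only the names getCached and setCached, the 3-minute TTL, and the 500-entry bound come from the text.

```typescript
interface CacheEntry {
  value: unknown;
  expires: number; // epoch milliseconds
}

const TTL_MS = 3 * 60 * 1000;
const MAX_ENTRIES = 500;
const cache = new Map<string, CacheEntry>();

// Hypothetical key builder following the orgId::toolId::params pattern.
function cacheKey(orgId: string, toolId: string, params: unknown): string {
  return `${orgId}::${toolId}::${JSON.stringify(params)}`;
}

// Returns the cached value, or undefined when missing or expired.
function getCached(key: string, now: number = Date.now()): unknown {
  const entry = cache.get(key);
  if (!entry) return undefined;
  cache.delete(key);
  if (entry.expires <= now) return undefined; // expired entries are dropped
  cache.set(key, entry); // re-insert so the key counts as most recently used
  return entry.value;
}

// Stores a value; evicts the least recently used key when the cache is full.
function setCached(key: string, value: unknown, now: number = Date.now()): void {
  cache.delete(key);
  if (cache.size >= MAX_ENTRIES) {
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) cache.delete(oldest);
  }
  cache.set(key, { value, expires: now + TTL_MS });
}
```

A Map iterates in insertion order, so deleting and re-inserting on every hit keeps the least recently used key at the front, which is what makes the eviction "LRU-style". Callers would skip setCached for error results.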
4b. Compact tool-result memory (client → server)
The problem. Even with a server-side cache, the model still thinks it needs to re-fetch data because it has no memory of what was returned in previous turns (tool results are not included in the re-sent conversation history by default).
The pattern. The client accumulates a compact "recently fetched data" log (up to 5 entries, 800-char preview each) from tool-output-available stream chunks. On the next request, this is sent as toolMemory. The server injects it as a ## Recently fetched data section into the system context block.
How AffZero implements it. hooks/use-chat-engine.ts collects non-error read-tool outputs into toolMemoryRef.current and sends them as toolMemory on every subsequent request. app/api/ai/chat/route.ts appends the memory section to the context block when present. The system prompt instructs the model: "Do NOT call read tools again for data already in the 'Recently fetched data' section."
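The memory log and the injected section might be built roughly like this. `ToolMemoryEntry`, `rememberToolOutput`, `buildMemorySection`, and the error check are hypothetical; the limits (5 entries, 800 characters) and the section heading come from the description above.

```typescript
interface ToolMemoryEntry {
  tool: string;
  preview: string;
}

const MAX_MEMORY_ENTRIES = 5;
const MAX_PREVIEW_CHARS = 800;

// Assumption: error outputs are objects with an "error" property.
function isErrorOutput(output: unknown): boolean {
  return typeof output === "object" && output !== null && "error" in output;
}

// Client side: append a truncated preview and keep only the newest entries.
function rememberToolOutput(
  memory: ToolMemoryEntry[],
  tool: string,
  output: unknown,
): ToolMemoryEntry[] {
  if (isErrorOutput(output)) return memory;
  const json = JSON.stringify(output) ?? "";
  const preview = json.slice(0, MAX_PREVIEW_CHARS);
  return memory.concat([{ tool, preview }]).slice(-MAX_MEMORY_ENTRIES);
}

// Server side: render the memory as a section for the system context block.
function buildMemorySection(memory: ToolMemoryEntry[]): string {
  if (memory.length === 0) return "";
  const lines = memory.map((entry) => `- ${entry.tool}: ${entry.preview}`);
  return ["## Recently fetched data"].concat(lines).join("\n");
}
```

Truncating each preview keeps the added context small and bounded regardless of how large the original tool result was.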
5. Optimistic streaming text
The pattern. Show text as it arrives — don't buffer the full response before rendering. Users perceive streamed output as faster even at the same tokens-per-second.
How AffZero implements it. The streaming loop calls setMessages on every text-delta chunk, so the assistant bubble grows incrementally as each delta arrives, much like ChatGPT.
6. Never leak raw data into the chat bubble
The problem. If a JSON blob with an encrypted connection ID or internal key appears verbatim in the chat history, it confuses users and may expose internal IDs.
The patterns.
- Structured streaming means tool result objects never travel through the text channel at all.
- A defense-in-depth sanitizer in formatMessage (components/chat/chat-helpers.ts) strips any stray {..."__confirmation_required":true...} blob before rendering.
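A sanitizer of this kind could be sketched as a brace-balanced scan, which (unlike a simple regular expression) copes with nested objects such as params. This is an illustrative version, not the project's implementation; `findBalancedEnd` and `isConfirmationBlob` are hypothetical helpers, and only blobs whose top-level object carries the flag are removed.

```typescript
// Returns the index of the "}" matching the "{" at start, or -1 if unbalanced.
function findBalancedEnd(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}" && --depth === 0) {
      return i;
    }
  }
  return -1;
}

// True when the candidate parses as an object flagged for confirmation.
function isConfirmationBlob(candidate: string): boolean {
  try {
    const parsed: unknown = JSON.parse(candidate);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as Record<string, unknown>).__confirmation_required === true
    );
  } catch {
    return false;
  }
}

// Removes confirmation blobs from message text; other JSON is left untouched.
function stripConfirmationJson(text: string): string {
  let out = "";
  let i = 0;
  while (i < text.length) {
    if (text[i] === "{") {
      const end = findBalancedEnd(text, i);
      if (end !== -1 && isConfirmationBlob(text.slice(i, end + 1))) {
        i = end + 1;
        // Avoid leaving a double space where the blob used to be.
        if (out[out.length - 1] === " " && text[i] === " ") i++;
        continue;
      }
    }
    out += text[i];
    i++;
  }
  return out;
}
```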
7. Human-readable confirmation cards
The problem. Showing Execute "edit_conversion" on S3afxrJwlh74Mw...? tells the user nothing actionable.
How AffZero implements it. Two fixes work together:
- The confirmMessage for tracker_action_edit now looks up the connection name from ctx.connections instead of displaying the raw ID.
- ConfirmationPreview in both chatbot-widget.tsx and mobile-chat.tsx renders key fields from params.data: conversion ID, human-readable status label (e.g. "Confirmed (Approved)"), and comment. This applies to tracker_action_edit, clone_offer, and the template/automation creation flows.
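The two fixes can be illustrated with the sketch below. The `EditParams` shape, the status codes in `STATUS_LABELS`, the message wording, and `previewRows` are all hypothetical; the document only states that the connection name comes from ctx.connections and that the preview shows conversion ID, status label, and comment.

```typescript
interface Connection {
  id: string;
  name: string;
}
interface ActionContext {
  connections: Connection[];
}
interface EditParams {
  connectionId: string;
  data: { id: string; status?: string; comment?: string };
}

// Hypothetical status codes; the real mapping is not given in this document.
const STATUS_LABELS: Record<string, string> = {
  approved: "Confirmed (Approved)",
  rejected: "Declined (Rejected)",
};

// Builds the card title using the connection's name rather than its raw ID.
function confirmMessage(ctx: ActionContext, params: EditParams): string {
  const connection = ctx.connections.find((c) => c.id === params.connectionId);
  const name = connection ? connection.name : "an unknown connection";
  return `Edit conversion ${params.data.id} on ${name}?`;
}

// Produces label/value rows for the preview; unknown statuses fall back to the raw value.
function previewRows(params: EditParams): Array<[string, string]> {
  const rows: Array<[string, string]> = [["Conversion ID", params.data.id]];
  const { status, comment } = params.data;
  if (status !== undefined) rows.push(["Status", STATUS_LABELS[status] ?? status]);
  if (comment) rows.push(["Comment", comment]);
  return rows;
}
```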
8. No double-confirmation
The rule. Never ask "Are you sure?" in text before calling the write tool. The confirmation card is the approval step. Asking twice is frustrating and erodes trust.
How AffZero enforces it. The system prompt explicitly states: "Do NOT ask 'Should I proceed?' or 'Are you sure?' before calling a write tool. The confirmation card IS the approval step."
9. Honest post-action reporting
The rule. After a confirmed tool runs, the model must report what the tool actually returned. Writing "Done!" when the tool returned an error is a lie.
How AffZero enforces it. The system prompt states: "Report what the tool ACTUALLY returned. If success: true → tell the user what was done. If error: ... → report the error honestly. Writing 'Done!' does NOTHING unless the tool returned success: true."
10. Clarify before acting, not after
The rule. If required information is missing, ask for it before calling the write tool. Never call a tool with guessed parameters and then apologise.
How AffZero enforces it. The system prompt provides explicit guidance: missing required fields → ask first, then call. If all information is present → call immediately. This keeps the conversation linear and predictable.
Summary table
| Practice | Where implemented |
|---|---|
| Structured tool-call streaming | app/api/ai/chat/route.ts → toUIMessageStreamResponse() |
| Halt on pending confirmation | app/api/ai/chat/route.ts → custom stopOnConfirmation StopCondition |
| Parse confirmations from typed data | hooks/use-chat-engine.ts → tool-output-available SSE chunk loop |
| Parallel multi-card confirmations | hooks/use-chat-engine.ts + system prompt |
| Stable idempotent action keys | lib/ai/chat-tools.ts actionKey + confirmedActionIds |
| Human-readable card preview | components/chat/chatbot-widget.tsx, mobile-chat.tsx ConfirmationPreview |
| Connection name (not encrypted ID) | lib/ai/actions/registry.ts confirmMessage |
| Sanitize leaked JSON blobs | components/chat/chat-helpers.ts stripConfirmationJson |
| No double-confirmation | System prompt |
| Honest result reporting | System prompt |