Introduction

Zypsy designs, builds, and evaluates agentic user experiences for founders. We specialize in four surfaces that make AI products trustworthy and usable at scale: Agents, RAG/HITL, Orchestration, and Multimodal. Proof points include work with Captions and Robust Intelligence.

Proof • Captions: 10M downloads, 66.75% conversion, $100M+ raised; rebrand, design system, and web platform delivered in ~2 months. Source: case study.
Proof • Robust Intelligence: AI security from inception through acquisition by Cisco; branding, web, embedded UX/engineering; recognized as 2024 Most Innovative Data Science Company. Source: case study.

Agents

Design agent experiences that are goal-oriented, observable, and controllable.

What we design

Mental model and task grammar: intents, tools, constraints, and success criteria.
Agent affordances: goals, sub-tasks, tool use, memory, and recap/undo patterns.
Status and accountability: real‑time plan, action logs, citations, and editable prompts.
Safety and reliability: failure states, guardrails, and human takeover.

Deliverables

Task models, user journeys, and wireframes for agent-first flows.
Prompt, tool, and memory schemas; evaluation harness with acceptance thresholds.
High‑fidelity prototypes with interaction specs and componentized design system.
Analytics plan: task success rate, intervention rate, and time‑to‑value.
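
As an illustration of the analytics plan, the sketch below computes the three metrics from a list of session records. The record keys (`succeeded`, `interventions`, `seconds_to_first_value`) and the metric definitions are assumptions made for this example: intervention rate is taken as the share of sessions with at least one human intervention, and time‑to‑value as the median seconds to the first useful result. A real engagement would agree these definitions during intake.

```python
from statistics import median


def agent_metrics(sessions):
    """Summarize agent sessions.

    Each session is a dict with hypothetical keys:
      succeeded (bool), interventions (int),
      seconds_to_first_value (float, or None if value was never reached).
    """
    n = len(sessions)
    if n == 0:
        raise ValueError("no sessions to summarize")

    # Only sessions that reached a useful result contribute to time-to-value.
    ttv = [
        s["seconds_to_first_value"]
        for s in sessions
        if s["seconds_to_first_value"] is not None
    ]
    return {
        "task_success_rate": sum(s["succeeded"] for s in sessions) / n,
        # Assumed definition: share of sessions with >= 1 human intervention.
        "intervention_rate": sum(s["interventions"] > 0 for s in sessions) / n,
        "median_time_to_value_s": median(ttv) if ttv else None,
    }
```

Reporting the median rather than the mean keeps a few very slow sessions from dominating the time‑to‑value figure; that is a design choice for this sketch, not a fixed rule.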

RAG/HITL

Make answers verifiable and workflows supervisable.

RAG

Retrieval strategy mapped to user questions; chunking, metadata, and recency policies.
Inline evidence UI: expandable citations, source confidence, and freshness indicators.
Fallback states: no‑answer patterns, query rewrite, and content gap reporting.

HITL (Human‑in‑the‑Loop)

Review queues and escalation triggers for high‑risk tasks.
Editable drafts with diff/track‑changes and rationale capture.
Approval SLAs and audit trails for compliance‑sensitive actions.

Acceptance criteria

Evidence coverage ≥ target (e.g., ≥95% answers with at least one citation).
Hallucination rate below threshold on golden sets; human override < X% after tuning.
End‑to‑end time within guardrails per task class.
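
The acceptance criteria above can be expressed as a simple gate over golden‑set results. This is a minimal sketch: the `EvalResult` fields and the function name are hypothetical, the 95% coverage default mirrors the example in the list, and the hallucination and override thresholds are placeholders to be set per project, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """One golden-set answer (all field names are hypothetical)."""
    citations: int       # number of citations attached to the answer
    hallucinated: bool   # a reviewer marked the answer as unsupported
    overridden: bool     # a human replaced or rejected the answer


def acceptance_report(results, min_coverage=0.95,
                      max_hallucination=0.02, max_override=0.10):
    """Compute coverage, hallucination, and override rates and a pass flag.

    The two max_* defaults are placeholders, not recommended thresholds.
    """
    n = len(results)
    if n == 0:
        raise ValueError("golden set is empty")

    coverage = sum(r.citations >= 1 for r in results) / n
    hallucination = sum(r.hallucinated for r in results) / n
    override = sum(r.overridden for r in results) / n
    return {
        "evidence_coverage": coverage,
        "hallucination_rate": hallucination,
        "override_rate": override,
        # Coverage must meet the target; the other two must stay below theirs.
        "passed": (coverage >= min_coverage
                   and hallucination < max_hallucination
                   and override < max_override),
    }
```

The end‑to‑end time criterion is left out here because it depends on per‑task‑class budgets that the source does not specify.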

Orchestration

Coordinate tools, services, and multi‑agent roles with predictable UX.

Patterns we use

Single‑agent with tools vs. multi‑agent with roles (planner, executor, verifier).
Router UX for task handoffs; user‑visible state diagrams and pausable workflows.
Idempotent retries, compensating actions, and resumable sessions after errors.
Governance surfaces: policy explainer, permission scopes, and data boundaries.
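
To make the retry and recovery pattern concrete, here is a minimal sketch of idempotent retries with compensating actions. All names (`run_with_retries`, `run_workflow`, `StepFailed`, the `completed` store) are hypothetical, and the in‑memory dict stands in for whatever durable store a real orchestrator would use to resume sessions.

```python
import time


class StepFailed(Exception):
    """Raised when a step still fails after all retry attempts."""


def run_with_retries(action, idempotency_key, completed,
                     max_attempts=3, delay_s=0.0):
    """Run `action` at most once per idempotency key, retrying on failure."""
    # A key already in the store means the step ran in an earlier attempt
    # or session, so it is skipped instead of being executed twice.
    if idempotency_key in completed:
        return completed[idempotency_key]

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = action()
            completed[idempotency_key] = result
            return result
        except Exception as exc:
            last_error = exc
            time.sleep(delay_s * attempt)  # linear backoff between attempts
    raise StepFailed(
        f"{idempotency_key} failed after {max_attempts} attempts"
    ) from last_error


def run_workflow(steps, completed):
    """Run (key, action, compensate) steps in order.

    If a step fails for good, undo the finished steps in reverse order.
    """
    done = []
    for key, action, compensate in steps:
        try:
            run_with_retries(action, key, completed)
            done.append((key, compensate))
        except StepFailed:
            for done_key, undo in reversed(done):
                undo()
                completed.pop(done_key, None)
            raise
```

Passing the same `completed` store into a later call is what makes a session resumable in this sketch: finished steps are skipped, and only the remaining ones run.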

Artifacts

System maps, sequence diagrams, and error taxonomies.
Guardrail catalog tied to UX states (rate limits, PII redaction, policy blocks).
Observability plan: span events for tool calls, cost/latency budgets, win/loss traces.
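
One way to picture span events with cost and latency budgets is a small context manager that records one event per tool call. This is an assumption‑laden sketch: `tool_span`, the event fields, and the budget parameters are invented for illustration, and a production system would more likely emit these through its existing tracing library.

```python
import time
from contextlib import contextmanager


@contextmanager
def tool_span(name, events, latency_budget_s, cost_budget_usd):
    """Record one span event per tool call and flag budget overruns."""
    span = {"tool": name, "cost_usd": 0.0, "status": "ok"}
    start = time.perf_counter()
    try:
        yield span  # the caller sets span["cost_usd"] during the call
    except Exception:
        span["status"] = "error"
        raise
    finally:
        # The event is recorded whether the call succeeded or failed.
        span["latency_s"] = time.perf_counter() - start
        span["over_latency"] = span["latency_s"] > latency_budget_s
        span["over_cost"] = span["cost_usd"] > cost_budget_usd
        events.append(span)
```

A call site would wrap each tool invocation, for example `with tool_span("search", events, 2.0, 0.01) as span: ...`, and set `span["cost_usd"]` from the provider's usage data.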

Multimodal

Design agents that see, hear, speak, and manipulate media.

Focus areas

Voice: barge‑in, turn‑taking, confirmations, and hands‑free controls.
Vision: region selection, OCR confidence callouts, and privacy for sensitive frames.
Video/media: progressive results, streaming transcripts, and asset provenance.
Latency budgets: streaming tokens and skeleton states to preserve flow.

Outputs

Conversation and screenflow specs; prompt audio/visual tokens; accessibility plan.
Test scripts for noisy environments, low bandwidth, and device constraints.

Agent Pilot (2–4 weeks)

A compressed engagement to ship and learn fast. Apply via the contact form.

Scope table

Week	Focus	Key outputs	Success signals
0 (prep)	Intake & goals	Use cases, risks, golden tasks	Alignment on KPIs & guardrails
1	Flows & prototype	Agent/task models, wireflows, prompt/tool schema	Stakeholder sign‑off
2	Hi‑fi & instrumentation	Clickable prototype, component set, event plan	Task completion in usability tests
3	Eval & iterate	Golden‑set runs, RAG/HITL tuning, failure‑state UX	Target TTV/accuracy met
4 (optional)	Hardening	Design QA, handoff kit, backlog	Ready for build/scale

What you get

High‑fidelity agent UX prototype with design tokens and component specs.
Prompt, tool, and memory schemas aligned to orchestrator requirements.
Evaluation plan: golden questions, acceptance thresholds, and telemetry schema.
Build‑ready handoff package for product and engineering.

Who this benefits

Founders and product teams building net‑new agents or adding agentic flows to existing SaaS.
AI/security teams needing HITL, auditability, and policy‑aware UX.

Why Zypsy for agent UX

Proven at AI scale: Captions rebrand + unified design system powering a cross‑platform AI creator studio (10M downloads; 66.75% conversion).
Proven in AI safety: Robust Intelligence partnership from inception through Cisco acquisition; enterprise‑grade brand, web, and product UX.
Integrated delivery: brand → product → web → code under one roof. See our capabilities.

Get started

Start an Agent Pilot (2–4 weeks): tell us your use case and success criteria via the Zypsy contact form.
Prefer equity‑aligned work? See our services‑for‑equity model in Design Capital and pair design with capital via Zypsy Capital on our site.