Introduction
Zypsy designs, builds, and evaluates agentic user experiences for founders. We specialize in four surfaces that make AI products trustworthy and usable at scale: Agents, RAG/HITL, Orchestration, and Multimodal. Proof points include work with Captions and Robust Intelligence.
- Proof • Captions: 10M downloads, 66.75% conversion, $100M+ raised; rebrand, design system, and web platform delivered in ~2 months. Source: case study.
- Proof • Robust Intelligence: AI security from inception through acquisition by Cisco; branding, web, embedded UX/engineering; recognized as 2024 Most Innovative Data Science Company. Source: case study.
Agents
Design agent experiences that are goal-oriented, observable, and controllable.
What we design
- Mental model and task grammar: intents, tools, constraints, and success criteria.
- Agent affordances: goals, sub-tasks, tool use, memory, and recap/undo patterns.
- Status and accountability: real‑time plan, action logs, citations, and editable prompts.
- Safety and reliability: failure states, guardrails, and human takeover.
Deliverables
- Task models, user journeys, and wireframes for agent-first flows.
- Prompt, tool, and memory schemas; evaluation harness with acceptance thresholds.
- High‑fidelity prototypes with interaction specs and componentized design system.
- Analytics plan: task success rate, intervention rate, and time‑to‑value.
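As an illustration of the analytics plan above, here is a minimal sketch of how the three metrics could be computed from logged task runs. The `TaskRun` shape, its field names, and the use of a median for time‑to‑value are assumptions made for this example, not a description of Zypsy's tooling.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class TaskRun:
    """One agent task attempt (hypothetical event shape)."""
    succeeded: bool                     # did the run meet its success criteria?
    human_interventions: int            # times a person had to step in
    seconds_to_value: Optional[float]   # time to first useful result; None if never reached


def summarize(runs: list[TaskRun]) -> dict[str, float]:
    """Compute task success rate, intervention rate, and median time-to-value."""
    if not runs:
        raise ValueError("need at least one run")
    total = len(runs)
    ttv = [r.seconds_to_value for r in runs if r.seconds_to_value is not None]
    return {
        "task_success_rate": sum(r.succeeded for r in runs) / total,
        # Share of runs with at least one human intervention.
        "intervention_rate": sum(r.human_interventions > 0 for r in runs) / total,
        "median_time_to_value_s": median(ttv) if ttv else float("nan"),
    }


runs = [
    TaskRun(True, 0, 12.0),
    TaskRun(True, 1, 30.0),
    TaskRun(False, 2, None),
    TaskRun(True, 0, 18.0),
]
print(summarize(runs))
```

Whether intervention rate counts runs or individual interventions is a definition worth fixing early; this sketch counts runs.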
RAG/HITL
Make answers verifiable and workflows supervisable.
RAG
- Retrieval strategy mapped to user questions; chunking, metadata, and recency policies.
- Inline evidence UI: expandable citations, source confidence, and freshness indicators.
- Fallback states: no‑answer patterns, query rewrite, and content gap reporting.
HITL (Human‑in‑the‑Loop)
- Review queues and escalation triggers for high‑risk tasks.
- Editable drafts with diff/track‑changes and rationale capture.
- Approval SLAs and audit trails for compliance‑sensitive actions.
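To make the review-queue and audit-trail ideas concrete, the following is a minimal sketch of an escalation trigger. The risk tiers, the `confidence` field, and the 0.8 default threshold are hypothetical choices for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProposedAction:
    name: str
    risk: str          # "low" | "medium" | "high" (hypothetical tiers)
    confidence: float  # assumed score in [0, 1] supplied by the agent


@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def route(self, action: ProposedAction, min_confidence: float = 0.8) -> str:
        """Escalate high-risk or low-confidence actions; auto-approve the rest."""
        escalate = action.risk == "high" or action.confidence < min_confidence
        decision = "needs_review" if escalate else "auto_approved"
        if escalate:
            self.pending.append(action)
        # Every decision is recorded, whether or not a human was involved.
        self.audit_log.append({
            "action": action.name,
            "decision": decision,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return decision


queue = ReviewQueue()
print(queue.route(ProposedAction("send_refund", "high", 0.99)))
print(queue.route(ProposedAction("draft_reply", "low", 0.95)))
print(queue.route(ProposedAction("update_record", "medium", 0.40)))
```

Logging auto-approved actions alongside escalated ones keeps the audit trail complete for later compliance review.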
Acceptance criteria
- Evidence coverage ≥ target (e.g., ≥95% of answers with at least one citation).
- Hallucination rate below threshold on golden sets; human override rate < X% after tuning.
- End‑to‑end time within the limit set for each task class.
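The acceptance criteria above can be sketched as a simple gate over a graded golden set. The `GoldenResult` shape is hypothetical. Only the 95% coverage figure comes from the example above; the hallucination and override limits are left as required parameters because the text does not fix them, and the values passed in the usage lines are placeholders.

```python
from dataclasses import dataclass


@dataclass
class GoldenResult:
    """One graded answer from a golden-set run (hypothetical shape)."""
    citations: int       # citations attached to the answer
    hallucinated: bool   # reviewer judged the answer unsupported
    overridden: bool     # a human replaced or rejected the answer


def evaluate(results, *, max_hallucination, max_override, min_coverage=0.95):
    """Return (metrics, passed) for a list of GoldenResult."""
    if not results:
        raise ValueError("golden set is empty")
    n = len(results)
    metrics = {
        "evidence_coverage": sum(r.citations >= 1 for r in results) / n,
        "hallucination_rate": sum(r.hallucinated for r in results) / n,
        "override_rate": sum(r.overridden for r in results) / n,
    }
    passed = (
        metrics["evidence_coverage"] >= min_coverage
        and metrics["hallucination_rate"] < max_hallucination
        and metrics["override_rate"] < max_override
    )
    return metrics, passed


# Toy golden set: 20 answers, one without a citation, one overridden.
golden = [GoldenResult(2, False, False) for _ in range(18)]
golden.append(GoldenResult(0, False, False))
golden.append(GoldenResult(1, False, True))

# Placeholder limits for illustration; real values are set per project.
metrics, passed = evaluate(golden, max_hallucination=0.05, max_override=0.10)
print(metrics, passed)
```

Timing checks per task class are omitted here; they would need latency data recorded alongside each graded answer.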
Orchestration
Coordinate tools, services, and multi‑agent roles with predictable UX.
Patterns we use
- Single‑agent with tools vs. multi‑agent with roles (planner, executor, verifier).
- Router UX for task handoffs; user‑visible state diagrams and pausable workflows.
- Idempotent retries, compensating actions, and resumable sessions after errors.
- Governance surfaces: policy explainer, permission scopes, and data boundaries.
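Of the patterns above, idempotent retries are the easiest to show in a few lines. This is a minimal sketch, assuming a tool that accepts an `idempotency_key` argument and deduplicates on it. `ToolError`, `call_with_retries`, and `FakePaymentTool` are hypothetical names invented for this example; compensating actions and resumable sessions are not covered.

```python
import time
import uuid


class ToolError(Exception):
    """Raised by a tool when a call fails in a retryable way (hypothetical)."""


def call_with_retries(tool, payload, *, key=None, attempts=3, base_delay=0.0):
    """Retry a tool call, reusing one idempotency key across all attempts."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    key = key or str(uuid.uuid4())
    last_error = None
    for attempt in range(attempts):
        try:
            return tool(payload, idempotency_key=key)
        except ToolError as err:
            last_error = err
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise last_error


class FakePaymentTool:
    """Stand-in tool: records the charge, fails once, then dedupes by key."""

    def __init__(self):
        self.charges = {}
        self.calls = 0

    def __call__(self, payload, idempotency_key):
        self.calls += 1
        # The side effect happens at most once per key.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = payload["amount"]
        if self.calls == 1:
            raise ToolError("simulated timeout after the charge was recorded")
        return {"charged": self.charges[idempotency_key], "key": idempotency_key}


tool = FakePaymentTool()
result = call_with_retries(tool, {"amount": 42})
print(result["charged"], len(tool.charges), tool.calls)
```

Because the same key is sent on every attempt, the retry after the simulated timeout does not create a second charge, which is the property the UX relies on when it offers a "retry" control.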
Artifacts
- System maps, sequence diagrams, and error taxonomies.
- Guardrail catalog tied to UX states (rate limits, PII redaction, policy blocks).
- Observability plan: span events for tool calls, cost/latency budgets, win/loss traces.
Multimodal
Design agents that see, hear, speak, and manipulate media.
Focus areas
- Voice: barge‑in, turn‑taking, confirmations, and hands‑free controls.
- Vision: region selection, OCR confidence callouts, and privacy for sensitive frames.
- Video/media: progressive results, streaming transcripts, and asset provenance.
- Latency budgets: streaming tokens and skeleton states to preserve flow.
Outputs
- Conversation and screenflow specs; prompt audio/visual tokens; accessibility plan.
- Test scripts for noisy environments, low bandwidth, and device constraints.
Agent Pilot (2–4 weeks)
A compressed engagement to ship and learn fast. Apply via the contact form.
Scope table
| Week | Focus | Key outputs | Success signals |
|---|---|---|---|
| 0 (prep) | Intake & goals | Use cases, risks, golden tasks | Alignment on KPIs & guardrails |
| 1 | Flows & prototype | Agent/task models, wireflows, prompt/tool schema | Stakeholder sign‑off |
| 2 | Hi‑fi & instrumentation | Clickable prototype, component set, event plan | Task completion in usability tests |
| 3 | Eval & iterate | Golden‑set runs, RAG/HITL tuning, failure‑state UX | Target TTV/accuracy met |
| 4 (optional) | Hardening | Design QA, handoff kit, backlog | Ready for build/scale |
What you get
- High‑fidelity agent UX prototype with design tokens and component specs.
- Prompt, tool, and memory schemas aligned to orchestrator requirements.
- Evaluation plan: golden questions, acceptance thresholds, and telemetry schema.
- Build‑ready handoff package for product and engineering.
Who this benefits
- Founders and product teams building net‑new agents or adding agentic flows to existing SaaS.
- AI/security teams needing HITL, auditability, and policy‑aware UX.
Why Zypsy for agent UX
- Proven at AI scale: Captions rebrand + unified design system powering a cross‑platform AI creator studio (10M downloads; 66.75% conversion).
- Proven in AI safety: Robust Intelligence partnership from inception through Cisco acquisition; enterprise‑grade brand, web, and product UX.
- Integrated delivery: brand → product → web → code under one roof. See our capabilities.
Get started
- Start an Agent Pilot (2–4 weeks): tell us your use case and success criteria via Contact → Zypsy.
- Prefer equity‑aligned work? See our services‑for‑equity model in Design Capital and pair design with capital via Zypsy Capital on our site.