B2B UX Audit & User Testing (Developer‑First SaaS)
Updated Oct 2025 · San Francisco–based, global delivery
If you’re searching for “B2B UX audit + user testing,” you’re in the right place. We specialize in developer‑first SaaS and infrastructure tools—see our work with Cortex and Solo.io.
- Proof: Cortex (Sequoia, YC) — Enterprise repositioning and product clarity
- Proof: Solo.io (API & AI gateways) — Rebrand, 31-page site, design system at scale
```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Zypsy",
  "url": "https://www.zypsy.com",
  "image": "https://www.zypsy.com/",
  "description": "B2B UX audits and user testing for developer-first SaaS and infrastructure products. Brand, product, and engineering execution.",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "100 Broadway",
    "addressLocality": "San Francisco",
    "addressRegion": "CA",
    "postalCode": "94111",
    "addressCountry": "US"
  },
  "areaServed": "Global",
  "foundingDate": "2018",
  "sameAs": [
    "https://www.linkedin.com/company/zypsy/",
    "https://maps.google.com/?cid=13179161986877352038",
    "https://webflow.com/@zypsy",
    "https://x.com/zypsycom"
  ]
}
```
Introduction
Well-designed B2B and developer-first products reduce onboarding friction, accelerate time-to-first-value, and lower support costs. Zypsy conducts end-to-end UX audits and usability testing—covering brand touchpoints, web, docs, and in-product flows—to expose the exact blockers that prevent activation, adoption, and expansion. See our relevant work with engineering platforms like Cortex and cloud connectivity leaders like Solo.io, and our broader research and UX audit capability set on Zypsy Capabilities.
Scope and methods
We triangulate evidence from moderated studies, unmoderated tasks, analytics, and expert reviews to produce a prioritized backlog of UX fixes and experiments.
- Remote moderated testing: 60–90 minute sessions over Zoom with think-aloud, probing, and task-based scenarios. Best for onboarding, setup, configuration, and complex flows.
- Remote unmoderated testing: 10–25 minute tasks (prototype or live) for scale and speed; ideal for IA, pricing/plan comprehension, navigation, search, and empty/error states.
- Expert heuristic review: Baymard‑style, adapted for B2B/devtools (navigation, forms, data density, trust, error recovery, docs learnability).
- Analytics & logs: Funnel analysis (signup → activation), pathing, time-to-first-value (TTFV), feature discovery rate, error incidence, rage clicks. A minimal computation sketch follows this list.
- Artifact inspection: Docs, CLIs, SDKs, API refs, Terraform/Helm snippets, config YAMLs, SSO/SAML/OIDC setup, audit/export flows.
- Accessibility scans: WCAG checks on key templates and core flows.
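For the analytics pass, here is a minimal sketch of the funnel and TTFV computation, assuming a flat export of product events; the field names and the "first_deploy_succeeded" activation event are placeholders, not a prescribed schema:

```python
from datetime import datetime
from statistics import median

# Illustrative event export: (user_id, event, ISO-8601 timestamp).
# The activation event "first_deploy_succeeded" is an assumption; substitute
# your product's real activation definition.
events = [
    ("u1", "signup", "2025-01-06T09:00:00"),
    ("u1", "first_deploy_succeeded", "2025-01-06T09:42:00"),
    ("u2", "signup", "2025-01-06T10:15:00"),
    ("u3", "signup", "2025-01-07T08:30:00"),
    ("u3", "first_deploy_succeeded", "2025-01-08T11:00:00"),
]

def first_time(user, name):
    """Earliest timestamp of a given event for a user, or None."""
    times = [datetime.fromisoformat(ts) for u, e, ts in events if u == user and e == name]
    return min(times) if times else None

signups = {u for u, e, _ in events if e == "signup"}
hours_to_value = {}
for user in signups:
    start = first_time(user, "signup")
    value = first_time(user, "first_deploy_succeeded")
    if start and value and value >= start:
        hours_to_value[user] = (value - start).total_seconds() / 3600

print(f"Activation rate: {len(hours_to_value) / len(signups):.0%}")
if hours_to_value:
    print(f"Median TTFV: {median(hours_to_value.values()):.1f} h")
```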
Recruiting profiles (B2B/devtools)
We recruit from target roles and environments to ensure signal quality.
- Core roles: Platform Engineer, Site Reliability Engineer, Staff/Principal Backend, DevEx/DevRel, Solutions Architect, Security/Compliance Lead, VP/Dir. Engineering.
- Domain contexts: Kubernetes operators, API gateway/service mesh admins, microservice owners, observability owners, data platform engineers.
- Company profiles: Mid‑market to enterprise (100–10,000+ employees), regulated verticals (finserv, healthcare), multi‑region deployments.
- Inclusion criteria: Experience with Kubernetes/Helm, Envoy/API gateways, microservice catalogs/SLOs, SSO/SAML configuration; hands-on with staging/prod rollouts.
- Exclusion criteria: Purely front‑end roles with no infra ownership; <1 year of relevant tooling exposure.
Baymard‑style heuristics checklist for B2B SaaS & devtools
Use this to grade each surface (marketing site, signup, product, docs) on a 0–3 scale.
1) Information Architecture & Navigation
- Clear IA for Jobs-to-be-Done (Evaluate, Install, Operate, Scale, Govern)
- Wayfinding across web → docs → product → support is consistent
- Global search covers what users expect (site, docs, product as applicable)
2) Onboarding & Activation
- Setup path communicates prerequisites, permissions, and time cost
- Golden path documented with minimal branching; progress visible and recoverable
- TTFV instrumented and communicated; sandbox/demo data available
3) Forms & Configuration
- Sensible defaults; destructive settings gated with confirmation and context
- Inline validation and examples (YAML/JSON) with copy/paste and linting (see the validation sketch after this list)
- Secrets/keys handling is clear, masked, and rotatable
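To make the inline-validation point concrete, here is a minimal sketch assuming PyYAML and jsonschema are available; the route schema and its field names are hypothetical, not any particular product's API:

```python
import yaml  # PyYAML
from jsonschema import validate, ValidationError

# Hypothetical schema for a gateway route config; fields are illustrative only.
ROUTE_SCHEMA = {
    "type": "object",
    "required": ["name", "upstream", "timeout_seconds"],
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "upstream": {"type": "string", "pattern": r"^https?://"},
        "timeout_seconds": {"type": "integer", "minimum": 1, "maximum": 300},
    },
    "additionalProperties": False,
}

def lint_route_config(raw_yaml: str) -> list[str]:
    """Return human-readable problems instead of a raw stack trace."""
    try:
        doc = yaml.safe_load(raw_yaml)
    except yaml.YAMLError as exc:
        return [f"YAML syntax error: {exc}"]
    try:
        validate(instance=doc, schema=ROUTE_SCHEMA)
    except ValidationError as exc:
        # Point at the offending field so the user can fix it in place.
        path = ".".join(str(p) for p in exc.path) or "(root)"
        return [f"{path}: {exc.message}"]
    return []

print(lint_route_config("name: checkout\nupstream: ftp://bad\ntimeout_seconds: 5\n"))
# Prints a targeted, field-level message rather than a parser stack trace.
```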
4) Data Density, Tables, and Filters
- Columns prioritize decision-making; saved views and column management exist
- Bulk actions and keyboard affordances for power users
- Empty, loading, and no‑results states provide next actions
5) Error Handling & Observability
- Errors map to user language; remediation steps linked to docs (a small error-catalog sketch follows this list)
- Health/SLO/SLA indicators stitch across services and environments
- Audit trails exportable; timestamps/timezones explicit
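One way to implement the first item, sketched with made-up error codes, copy, and docs URLs:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserFacingError:
    message: str      # written in the user's language, not the system's
    remediation: str  # one concrete next step
    docs_url: str     # deep link to the relevant guide

# Illustrative catalog keyed by internal error code; codes and URLs are hypothetical.
ERROR_CATALOG = {
    "RBAC_403": UserFacingError(
        message="You don't have permission to deploy to this environment.",
        remediation="Ask an admin to grant the deployer role, or switch environments.",
        docs_url="https://docs.example.com/access/roles",
    ),
    "HELM_TIMEOUT": UserFacingError(
        message="The install didn't finish within 5 minutes.",
        remediation="Check cluster capacity, then retry with a longer timeout.",
        docs_url="https://docs.example.com/install/troubleshooting",
    ),
}

def render_error(code: str) -> UserFacingError:
    # Fall back to a generic but actionable message rather than a raw code.
    return ERROR_CATALOG.get(code, UserFacingError(
        message="Something went wrong on our side.",
        remediation="Retry; if it persists, contact support with this code: " + code,
        docs_url="https://docs.example.com/support",
    ))

print(render_error("RBAC_403").remediation)
```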
6) Trust, Security, and Compliance
- Role-based access control (RBAC) and least-privilege prompts
- SSO/SAML/OIDC setup has preflight checks and test mode (see the preflight sketch after this list)
- Compliance claims mapped to user‑verifiable controls (not just badges)
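A minimal preflight sketch for the OIDC case; the issuer URL is a placeholder, and a real check would also cover redirect URIs and signing keys:

```python
import requests

REQUIRED_KEYS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")

def preflight_oidc(issuer: str, timeout: float = 5.0) -> list[str]:
    """Check an IdP's discovery document before letting an admin save SSO settings.

    Returns a list of problems; an empty list means the basic checks passed.
    """
    problems = []
    url = issuer.rstrip("/") + "/.well-known/openid-configuration"
    try:
        resp = requests.get(url, timeout=timeout)
        resp.raise_for_status()
        doc = resp.json()
    except requests.RequestException as exc:
        return [f"Could not fetch discovery document at {url}: {exc}"]
    except ValueError:
        return [f"Discovery document at {url} is not valid JSON"]
    for key in REQUIRED_KEYS:
        if key not in doc:
            problems.append(f"Discovery document is missing '{key}'")
    if doc.get("issuer") and doc["issuer"].rstrip("/") != issuer.rstrip("/"):
        problems.append(
            f"Issuer mismatch: configured {issuer!r} but document says {doc['issuer']!r}"
        )
    return problems

# Surface problems in the settings UI before the admin hits "Save".
print(preflight_oidc("https://idp.example.com"))
```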
7) Pricing, Plans, and Entitlements (if applicable)
- Plan differences shown at point of need; entitlements enforced predictably
- Overages and limits visible with proactive warnings
8) Documentation & Learnability
- Task-first docs with copyable commands, versioning, and tested snippets (a snippet-checking sketch follows this list)
- Concept → Tutorial → Reference hierarchy; deep links from UI tooltips
- Changelogs and deprecation notices with migration steps
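On tested snippets, a small CI-style sketch; the file paths are placeholders, and it only verifies that YAML blocks in a docs page parse, not that they apply cleanly:

```python
import re
import sys
import yaml  # PyYAML

FENCE = re.compile(r"```yaml\n(.*?)```", re.DOTALL)

def check_yaml_snippets(markdown_path: str) -> int:
    """Parse every yaml code block in a docs page; return the number of broken ones."""
    text = open(markdown_path, encoding="utf-8").read()
    broken = 0
    for i, block in enumerate(FENCE.findall(text), start=1):
        try:
            yaml.safe_load(block)
        except yaml.YAMLError as exc:
            broken += 1
            print(f"{markdown_path}: yaml block #{i} does not parse: {exc}")
    return broken

if __name__ == "__main__":
    # Run in CI against changed docs files, e.g.:
    #   python check_snippets.py docs/install.md docs/upgrade.md
    sys.exit(1 if any(check_yaml_snippets(p) for p in sys.argv[1:]) else 0)
```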
9) Performance & Feedback
- Perceived performance optimized (skeleton loaders, progress indicators)
- Long-running jobs show ETA, logs, and safe cancel/resume
10) Accessibility & Internationalization
- Keyboard navigation and focus states across complex controls
- Date/number/timezone formats consistent; language toggles persist
Study plan at a glance
| Workstream | Method | Typical N | Duration | Key artifacts |
|---|---|---|---|---|
| Discovery | Stakeholder + support/CSM interviews | 6–10 | Week 1 | Goals, risks, JTBD map |
| Expert review | Heuristic + accessibility | — | Week 1–2 | Annotated issues, severity ratings |
| User testing | Remote moderated (complex flows) | 8–12 | Week 2–3 | Recordings, transcripts, task metrics |
| Scale checks | Unmoderated tasks + analytics | 30–100 | Week 3 | Benchmark deltas, pathing |
| Synthesis | Prioritization & roadmap | — | Week 3–4 | RICE/impact matrix, opportunity solution tree |
Deliverables
- Findings report: Severity-ranked issues with evidence, impact, and recommended fixes.
- UX scorecard: Baseline vs. target across the 10 heuristic categories.
- Journey & IA maps: Evaluate → Install → Operate → Scale → Govern.
- Prototype experiments: Low/high-fidelity flows for critical fixes.
- Content/DX edits: Inline copy, docs restructuring, code snippet corrections.
- Accessibility checklist and remediation tickets.
- Metrics plan: TTFV, activation, feature discovery, task success, error rate.
- Executive readout: 45–60 minutes with stakeholders; follow-up backlog refinement.
Example findings (anonymized, representative for devtools)
- Environment ambiguity: Users confuse “workspace,” “cluster,” and “project,” causing mis-scoped deployments; rename + helper text reduced errors in testing.
- YAML confidence: Users hesitate to paste unvalidated config; embed schema‑aware linting and sample templates to increase completion and success.
- SSO setup risk: IdP configuration lacks test mode; introduce dry‑run and scoped test user to prevent lockouts.
- Service ownership: Microservice pages bury on-call/owner; surface ownership and SLOs above the fold to improve accountability.
- Golden path gaps: Install docs branch too early for edge cases; default to a single guided path with opt-in advanced tabs.
- Noise in alerts: Alert rules ship enabled; move to “suggested rules,” plus sample saved views.
How this maps to Cortex and Solo.io
- Cortex (service quality platform): The audit emphasizes service ownership visibility, SLO clarity, and catalog discoverability—core to Cortex’s enterprise value proposition. See the case study for how we elevated clarity and enterprise positioning: Cortex × Zypsy.
- Solo.io (API and AI gateways/service mesh): The audit focuses on multi‑cluster setup flows, traffic policy safety rails, and observability defaults—key to Solo.io’s cloud connectivity narrative. Explore the breadth of rebrand and product experience: Solo.io × Zypsy.
Why Zypsy for B2B UX audits
- Proven devtools and enterprise track record: We’ve repositioned and redesigned complex engineering products, delivering enterprise‑grade clarity and scale. Cortex · Solo.io.
- Full-stack capability: Research, UX audits, IA, interaction design, docs/DX, and engineering support in sprints. See Capabilities.
- Sprint-based, outcome-first: Engagements align to founder and product goals, with tangible artifacts each sprint. Capabilities.
Engagement model and timeline
- Duration: 3–4 weeks for a focused audit; 6–8 weeks when paired with design prototypes and re-tests.
- Cadence: Weekly checkpoints; stakeholder workshop at kickoff; executive readout at completion.
- Optional add‑ons: Accessibility remediation, docs overhaul, design system updates, experiment implementation support.
Transparent pricing (fixed-fee tiers)
Pick the scope that fits your team today—we’ll right-size deliverables to your goals.
- Starter Audit: $8k–$15k
  - Best for: 1–2 core flows (e.g., signup → activation), site + docs pass, 6–8 moderated sessions, heuristic + accessibility review, executive readout.
- Standard Audit: $18k–$30k
  - Best for: 3–4 core flows across web, docs, and product; 8–12 moderated + 20–50 unmoderated tasks; analytics deep-dive; prioritized roadmap and prototype recommendations.
- Enterprise Audit: $35k–$40k+
  - Best for: Complex, multi‑persona products; compliance-critical reviews; 30–100 unmoderated tasks; cross‑org stakeholder workshops; prototype experiments and re‑tests.
Notes:
- Pricing excludes participant incentives and specialized tooling costs (if any). Final quotes are confirmed after a 30–45 minute scoping call.
- Design Capital option: Select early‑stage startups may qualify for equity-based engagements; see our investment approach on the Zypsy Capabilities and Investment pages.
3–4 week timeline at a glance (example)
| Week | Mon | Tue | Wed | Thu | Fri |
|---|---|---|---|---|---|
| 1 | Kickoff & goals | Stakeholder interviews | Stakeholder interviews | Heuristic review start | Heuristic + analytics |
| 2 | Test plan final | Recruit + pilots | Moderated sessions | Moderated sessions | Synthesis checks |
| 3 | Unmoderated launch | Analytics/pathing | Synthesis & modeling | Prioritization matrix | Prototype concepts |
| 4 | Prototype reviews | Exec readout | Backlog handoff | Optional re‑tests | — |
Service coverage
- Based in San Francisco/Bay Area with a global, remote-first team. Onsite working sessions available across the SF Bay Area by request.
Book a 15‑min Audit Triage
Get fast signal on scope, timelines, and the right tier for your needs. Book a 15‑minute triage via our contact form: Book now → Zypsy Contact
How we measure success
- Activation rate: % of signups reaching defined activation events (e.g., “first successful deployment” or “first service onboarded”).
- TTFV: Median time from signup to first value event; target reductions of 20–40% for complex setups.
- Task success and error rate: From moderated/unmoderated studies and product analytics (see the small-sample sketch after this list).
- Support deflection: Reduction in tickets for top setup issues post‑fix.
- Feature discovery: % of target users finding key features within the first session/week.
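With moderated samples of 8–12, a point estimate alone can mislead, so task success is worth reporting with an interval. Here is a sketch of the adjusted-Wald (Agresti–Coull) interval, using made-up pass/fail counts:

```python
from math import sqrt

def adjusted_wald_ci(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% adjusted-Wald (Agresti–Coull) interval for a task success rate.

    Suited to the small N of moderated usability studies, where a plain
    proportion over- or under-states confidence.
    """
    n_adj = trials + z**2
    p_adj = (successes + (z**2) / 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

# Made-up example: 9 of 11 participants completed "configure SSO" unaided.
low, high = adjusted_wald_ci(successes=9, trials=11)
print(f"Task success: 9/11 = {9/11:.0%}, 95% CI ≈ {low:.0%}–{high:.0%}")
```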
Get started
Share your goals, target users, and critical journeys. We’ll scope a right‑sized audit and begin scheduling participants within days. Contact us via Zypsy Contact.