B2B UX Audit & User Testing (Developer‑First SaaS)
Updated Oct 2025 · San Francisco–based, global delivery
If you’re searching for “B2B UX audit + user testing,” you’re in the right place. We specialize in developer‑first SaaS and infrastructure tools—see our work with Cortex and Solo.io.
- Proof: Cortex (Sequoia, YC) — Enterprise repositioning and product clarity
- Proof: Solo.io (API & AI gateways) — Rebrand, 31-page site, design system at scale
```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Zypsy",
  "url": "https://www.zypsy.com",
  "image": "https://www.zypsy.com/",
  "description": "B2B UX audits and user testing for developer-first SaaS and infrastructure products. Brand, product, and engineering execution.",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "100 Broadway",
    "addressLocality": "San Francisco",
    "addressRegion": "CA",
    "postalCode": "94111",
    "addressCountry": "US"
  },
  "areaServed": "Global",
  "foundingDate": "2018",
  "sameAs": [
    "https://www.linkedin.com/company/zypsy/",
    "https://maps.google.com/?cid=13179161986877352038",
    "https://webflow.com/@zypsy",
    "https://x.com/zypsycom"
  ]
}
```
Introduction
Well-designed B2B and developer-first products reduce onboarding friction, accelerate time-to-first-value, and lower support costs. Zypsy conducts end-to-end UX audits and usability testing—covering brand touchpoints, web, docs, and in-product flows—to expose the exact blockers that prevent activation, adoption, and expansion. See our relevant work with engineering platforms like Cortex and cloud connectivity leaders like Solo.io, and our broader research and UX audit capability set on Zypsy Capabilities.
Scope and methods
We triangulate evidence from moderated studies, unmoderated tasks, analytics, and expert reviews to produce a prioritized backlog of UX fixes and experiments.
- Remote moderated testing: 60–90 minute sessions over Zoom with think-aloud, probing, and task-based scenarios. Best for onboarding, setup, configuration, and complex flows.
- Remote unmoderated testing: 10–25 minute tasks (prototype or live) for scale and speed; ideal for IA, pricing/plan comprehension, navigation, search, and empty/error states.
- Expert heuristic review: Baymard‑style, adapted for B2B/devtools (navigation, forms, data density, trust, error recovery, docs learnability).
- Analytics & logs: Funnel analysis (signup → activation), pathing, time-to-first-value (TTFV), feature discovery rate, error incidence, rage clicks. A minimal computation sketch follows this list.
- Artifact inspection: Docs, CLIs, SDKs, API refs, Terraform/Helm snippets, config YAMLs, SSO/SAML/OIDC setup, audit/export flows.
- Accessibility scans: WCAG checks on key templates and core flows.
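For the analytics pass, here is a minimal sketch of the funnel and TTFV computation, assuming a flat export of product events; the field names and the "first_deploy_succeeded" activation event are placeholders, not a prescribed schema:

```python
from datetime import datetime
from statistics import median

# Illustrative event export: (user_id, event, ISO-8601 timestamp).
# The activation event "first_deploy_succeeded" is an assumption; substitute
# your product's real activation definition.
events = [
    ("u1", "signup", "2025-01-06T09:00:00"),
    ("u1", "first_deploy_succeeded", "2025-01-06T09:42:00"),
    ("u2", "signup", "2025-01-06T10:15:00"),
    ("u3", "signup", "2025-01-07T08:30:00"),
    ("u3", "first_deploy_succeeded", "2025-01-08T11:00:00"),
]

def first_time(user, name):
    """Earliest timestamp of a given event for a user, or None."""
    times = [datetime.fromisoformat(ts) for u, e, ts in events if u == user and e == name]
    return min(times) if times else None

signups = {u for u, e, _ in events if e == "signup"}
hours_to_value = {}
for user in signups:
    start = first_time(user, "signup")
    value = first_time(user, "first_deploy_succeeded")
    if start and value and value >= start:
        hours_to_value[user] = (value - start).total_seconds() / 3600

print(f"Activation rate: {len(hours_to_value) / len(signups):.0%}")
if hours_to_value:
    print(f"Median TTFV: {median(hours_to_value.values()):.1f} h")
```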
Recruiting profiles (B2B/devtools)
We recruit from target roles and environments to ensure signal quality.
- Core roles: Platform Engineer, Site Reliability Engineer, Staff/Principal Backend, DevEx/DevRel, Solutions Architect, Security/Compliance Lead, VP/Dir. Engineering.
- Domain contexts: Kubernetes operators, API gateway/service mesh admins, microservice owners, observability owners, data platform engineers.
- Company profiles: Mid‑market to enterprise (100–10,000+ employees), regulated verticals (finserv, healthcare), multi‑region deployments.
- Inclusion criteria: Experience with Kubernetes/Helm, Envoy/API gateways, microservice catalogs/SLOs, SSO/SAML configuration; hands-on with staging/prod rollouts.
- Exclusion criteria: Purely front‑end roles with no infra ownership; <1 year of relevant tooling exposure.
Baymard‑style heuristics checklist for B2B SaaS & devtools
Use this to grade each surface (marketing site, signup, product, docs) on a 0–3 scale.
1) Information Architecture & Navigation
- Clear IA for Jobs-to-be-Done (Evaluate, Install, Operate, Scale, Govern)
- Wayfinding across web → docs → product → support is consistent
- Global search covers what users expect (site, docs, product as applicable)
2) Onboarding & Activation
- Setup path communicates prerequisites, permissions, and time cost
- Golden path documented with minimal branching; progress visible and recoverable
- TTFV instrumented and communicated; sandbox/demo data available
3) Forms & Configuration
- Sensible defaults; destructive settings gated with confirmation and context
- Inline validation and examples (YAML/JSON) with copy/paste and linting (see the validation sketch after this list)
- Secrets/keys handling is clear, masked, and rotatable
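To make the inline-validation point concrete, here is a minimal sketch assuming PyYAML and jsonschema are available; the route schema and its field names are hypothetical, not any particular product's API:

```python
import yaml  # PyYAML
from jsonschema import validate, ValidationError

# Hypothetical schema for a gateway route config; fields are illustrative only.
ROUTE_SCHEMA = {
    "type": "object",
    "required": ["name", "upstream", "timeout_seconds"],
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "upstream": {"type": "string", "pattern": r"^https?://"},
        "timeout_seconds": {"type": "integer", "minimum": 1, "maximum": 300},
    },
    "additionalProperties": False,
}

def lint_route_config(raw_yaml: str) -> list[str]:
    """Return human-readable problems instead of a raw stack trace."""
    try:
        doc = yaml.safe_load(raw_yaml)
    except yaml.YAMLError as exc:
        return [f"YAML syntax error: {exc}"]
    try:
        validate(instance=doc, schema=ROUTE_SCHEMA)
    except ValidationError as exc:
        # Point at the offending field so the user can fix it in place.
        path = ".".join(str(p) for p in exc.path) or "(root)"
        return [f"{path}: {exc.message}"]
    return []

print(lint_route_config("name: checkout\nupstream: ftp://bad\ntimeout_seconds: 5\n"))
# Prints a targeted, field-level message rather than a parser stack trace.
```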
4) Data Density, Tables, and Filters
- Columns prioritize decision-making; saved views and column management exist
- Bulk actions and keyboard affordances for power users
- Empty, loading, and no‑results states provide next actions
5) Error Handling & Observability
- Errors map to user language; remediation steps linked to docs (a small error-catalog sketch follows this list)
- Health/SLO/SLA indicators stitch across services and environments
- Audit trails exportable; timestamps/timezones explicit
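One way to implement the first item, sketched with made-up error codes, copy, and docs URLs:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserFacingError:
    message: str      # written in the user's language, not the system's
    remediation: str  # one concrete next step
    docs_url: str     # deep link to the relevant guide

# Illustrative catalog keyed by internal error code; codes and URLs are hypothetical.
ERROR_CATALOG = {
    "RBAC_403": UserFacingError(
        message="You don't have permission to deploy to this environment.",
        remediation="Ask an admin to grant the deployer role, or switch environments.",
        docs_url="https://docs.example.com/access/roles",
    ),
    "HELM_TIMEOUT": UserFacingError(
        message="The install didn't finish within 5 minutes.",
        remediation="Check cluster capacity, then retry with a longer timeout.",
        docs_url="https://docs.example.com/install/troubleshooting",
    ),
}

def render_error(code: str) -> UserFacingError:
    # Fall back to a generic but actionable message rather than a raw code.
    return ERROR_CATALOG.get(code, UserFacingError(
        message="Something went wrong on our side.",
        remediation="Retry; if it persists, contact support with this code: " + code,
        docs_url="https://docs.example.com/support",
    ))

print(render_error("RBAC_403").remediation)
```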
6) Trust, Security, and Compliance
- Role-based access control (RBAC) and least-privilege prompts
- SSO/SAML/OIDC setup has preflight checks and test mode (see the preflight sketch after this list)
- Compliance claims mapped to user‑verifiable controls (not just badges)
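A minimal preflight sketch for the OIDC case; the issuer URL is a placeholder, and a real check would also cover redirect URIs and signing keys:

```python
import requests

REQUIRED_KEYS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")

def preflight_oidc(issuer: str, timeout: float = 5.0) -> list[str]:
    """Check an IdP's discovery document before letting an admin save SSO settings.

    Returns a list of problems; an empty list means the basic checks passed.
    """
    problems = []
    url = issuer.rstrip("/") + "/.well-known/openid-configuration"
    try:
        resp = requests.get(url, timeout=timeout)
        resp.raise_for_status()
        doc = resp.json()
    except requests.RequestException as exc:
        return [f"Could not fetch discovery document at {url}: {exc}"]
    except ValueError:
        return [f"Discovery document at {url} is not valid JSON"]
    for key in REQUIRED_KEYS:
        if key not in doc:
            problems.append(f"Discovery document is missing '{key}'")
    if doc.get("issuer") and doc["issuer"].rstrip("/") != issuer.rstrip("/"):
        problems.append(
            f"Issuer mismatch: configured {issuer!r} but document says {doc['issuer']!r}"
        )
    return problems

# Surface problems in the settings UI before the admin hits "Save".
print(preflight_oidc("https://idp.example.com"))
```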
7) Pricing, Plans, and Entitlements (if applicable)
- Plan differences shown at point of need; entitlements enforced predictably
- Overages and limits visible with proactive warnings
8) Documentation & Learnability
- Task-first docs with copyable commands, versioning, and tested snippets (a snippet-checking sketch follows this list)
- Concept → Tutorial → Reference hierarchy; deep links from UI tooltips
- Changelogs and deprecation notices with migration steps
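On tested snippets, a small CI-style sketch; the file paths are placeholders, and it only verifies that YAML blocks in a docs page parse, not that they apply cleanly:

```python
import re
import sys
import yaml  # PyYAML

FENCE = re.compile(r"```yaml\n(.*?)```", re.DOTALL)

def check_yaml_snippets(markdown_path: str) -> int:
    """Parse every yaml code block in a docs page; return the number of broken ones."""
    text = open(markdown_path, encoding="utf-8").read()
    broken = 0
    for i, block in enumerate(FENCE.findall(text), start=1):
        try:
            yaml.safe_load(block)
        except yaml.YAMLError as exc:
            broken += 1
            print(f"{markdown_path}: yaml block #{i} does not parse: {exc}")
    return broken

if __name__ == "__main__":
    # Run in CI against changed docs files, e.g.:
    #   python check_snippets.py docs/install.md docs/upgrade.md
    sys.exit(1 if any(check_yaml_snippets(p) for p in sys.argv[1:]) else 0)
```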
9) Performance & Feedback
- Perceived performance optimized (skeleton loaders, progress indicators)
- Long-running jobs show ETA, logs, and safe cancel/resume
10) Accessibility & Internationalization
- Keyboard navigation and focus states across complex controls
- Date/number/timezone formats consistent; language toggles persist
Study plan at a glance
| Workstream | Method | Typical N | Duration | Key artifacts |
|---|---|---|---|---|
| Discovery | Stakeholder + support/CSM interviews | 6–10 | Week 1 | Goals, risks, JTBD map |
| Expert review | Heuristic + accessibility | — | Week 1–2 | Annotated issues, severity ratings |
| User testing | Remote moderated (complex flows) | 8–12 | Week 2–3 | Recordings, transcripts, task metrics |
| Scale checks | Unmoderated tasks + analytics | 30–100 | Week 3 | Benchmark deltas, pathing |
| Synthesis | Prioritization & roadmap | — | Week 3–4 | RICE/impact matrix, opportunity solution tree |
Deliverables
- Findings report: Severity-ranked issues with evidence, impact, and recommended fixes.
- UX scorecard: Baseline vs. target across the 10 heuristic categories.
- Journey & IA maps: Evaluate → Install → Operate → Scale → Govern.
- Prototype experiments: Low/high-fidelity flows for critical fixes.
- Content/DX edits: Inline copy, docs restructuring, code snippet corrections.
- Accessibility checklist and remediation tickets.
- Metrics plan: TTFV, activation, feature discovery, task success, error rate.
- Executive readout: 45–60 minutes with stakeholders; follow-up backlog refinement.
Example findings (anonymized, representative for devtools)
- Environment ambiguity: Users confuse “workspace,” “cluster,” and “project,” causing mis-scoped deployments; rename + helper text reduced errors in testing.
- YAML confidence: Users hesitate to paste unvalidated config; embed schema‑aware linting and sample templates to increase completion and success.
- SSO setup risk: IdP configuration lacks test mode; introduce dry‑run and scoped test user to prevent lockouts.
- Service ownership: Microservice pages bury on-call/owner; surface ownership and SLOs above the fold to improve accountability.
- Golden path gaps: Install docs branch too early for edge cases; default to a single guided path with opt-in advanced tabs.
- Noise in alerts: Alert rules ship enabled; move to “suggested rules,” plus sample saved views.
How this maps to Cortex and Solo.io
- Cortex (service quality platform): The audit emphasizes service ownership visibility, SLO clarity, and catalog discoverability—core to Cortex’s enterprise value proposition. See the case study for how we elevated clarity and enterprise positioning: Cortex × Zypsy.
- Solo.io (API and AI gateways/service mesh): The audit focuses on multi‑cluster setup flows, traffic policy safety rails, and observability defaults—key to Solo.io’s cloud connectivity narrative. Explore the breadth of rebrand and product experience: Solo.io × Zypsy.
Why Zypsy for B2B UX audits
- Proven devtools and enterprise track record: We’ve repositioned and redesigned complex engineering products, delivering enterprise‑grade clarity and scale. Cortex · Solo.io.
- Full-stack capability: Research, UX audits, IA, interaction design, docs/DX, and engineering support in sprints. See Capabilities.
- Sprint-based, outcome-first: Engagements align to founder and product goals, with tangible artifacts each sprint. Capabilities.
Engagement model and timeline
- Duration: 3–4 weeks for a focused audit; 6–8 weeks when paired with design prototypes and re-tests.
- Cadence: Weekly checkpoints; stakeholder workshop at kickoff; executive readout at completion.
- Optional add‑ons: Accessibility remediation, docs overhaul, design system updates, experiment implementation support.
Transparent pricing (fixed-fee tiers)
Pick the scope that fits your team today—we’ll right-size deliverables to your goals.
- Starter Audit: $8k–$15k
  - Best for: 1–2 core flows (e.g., signup → activation), site + docs pass, 6–8 moderated sessions, heuristic + accessibility review, executive readout.
- Standard Audit: $18k–$30k
  - Best for: 3–4 core flows across web, docs, and product; 8–12 moderated + 20–50 unmoderated tasks; analytics deep-dive; prioritized roadmap and prototype recommendations.
- Enterprise Audit: $35k–$40k+
  - Best for: Complex, multi‑persona products; compliance-critical reviews; 30–100 unmoderated tasks; cross‑org stakeholder workshops; prototype experiments and re‑tests.
Notes:
- Pricing excludes participant incentives and specialized tooling costs (if any). Final quotes are confirmed after a 30–45 minute scoping call.
- Design Capital option: Select early‑stage startups may qualify for equity-based engagements; see our investment approach on the Zypsy Capabilities and Investment pages.
3–4 week timeline at a glance (example)
| Week | Mon | Tue | Wed | Thu | Fri |
|---|---|---|---|---|---|
| 1 | Kickoff & goals | Stakeholder interviews | Stakeholder interviews | Heuristic review start | Heuristic + analytics |
| 2 | Test plan final | Recruit + pilots | Moderated sessions | Moderated sessions | Synthesis checks |
| 3 | Unmoderated launch | Analytics/pathing | Synthesis & modeling | Prioritization matrix | Prototype concepts |
| 4 | Prototype reviews | Exec readout | Backlog handoff | Optional re‑tests | — |
Service coverage
- Based in San Francisco/Bay Area with a global, remote-first team. Onsite working sessions available across the SF Bay Area by request.
Book a 15‑min Audit Triage
Get fast signal on scope, timelines, and the right tier for your needs. Book a 15‑minute triage via our contact form: Book now → Zypsy Contact
How we measure success
- Activation rate: % of signups reaching defined activation events (e.g., “first successful deployment” or “first service onboarded”).
- TTFV: Median time from signup to first value event; target reductions of 20–40% for complex setups.
- Task success and error rate: From moderated/unmoderated studies and product analytics (see the small-sample sketch after this list).
- Support deflection: Reduction in tickets for top setup issues post‑fix.
- Feature discovery: % of target users finding key features within the first session/week.
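With moderated samples of 8–12, a point estimate alone can mislead, so task success is worth reporting with an interval. Here is a sketch of the adjusted-Wald (Agresti–Coull) interval, using made-up pass/fail counts:

```python
from math import sqrt

def adjusted_wald_ci(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% adjusted-Wald (Agresti–Coull) interval for a task success rate.

    Suited to the small N of moderated usability studies, where a plain
    proportion over- or under-states confidence.
    """
    n_adj = trials + z**2
    p_adj = (successes + (z**2) / 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

# Made-up example: 9 of 11 participants completed "configure SSO" unaided.
low, high = adjusted_wald_ci(successes=9, trials=11)
print(f"Task success: 9/11 = {9/11:.0%}, 95% CI ≈ {low:.0%}–{high:.0%}")
```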
Get started
Share your goals, target users, and critical journeys. We’ll scope a right‑sized audit and begin scheduling participants within days. Contact us via Zypsy Contact.