Timeline: 2021 – 2023
Platform: Web App + MS Word Add-in
Primary Users: Lawyers, procurement managers, finance teams
Key Methods: User interviews, task analysis, prototyping, usability testing
Tools: Figma, Miro, Adobe CC, HTML/CSS, JS
The central design challenge at Legartis wasn't building AI features. It was designing the interface layer that made AI output trustworthy, actionable, and professionally defensible. Legal professionals have personal liability attached to every contract they sign off. They are among the most sceptical adopters of automated tools precisely because the stakes of a missed clause are measured in litigation costs, regulatory penalties, and professional reputation.
The product had 90%-accurate AI. The adoption problem wasn't accuracy. It was trust. And trust in this context required transparency. Not just a clear UI, but an interface that exposed the AI's reasoning at every step, let reviewers interrogate it, override it, and annotate their disagreement. The design problem was making the AI professionally defensible, not just usable.

Legartis was building AI technology capable of reviewing contracts with over 90% accuracy, identifying missing, non-compliant, or risky clauses in seconds. The technical capability existed. The product didn't yet match it. Legal professionals are among the most sceptical adopters of automated tools: their professional liability depends on the accuracy of every contract they sign off. For AI to earn a place in that workflow, the design had to communicate reliability, transparency, and control, not just speed.
When I joined as Lead Product Designer, the platform had a functional proof-of-concept but lacked the information architecture, interaction model, and visual language to make it credible at enterprise level. My scope spanned the full product: web application, Microsoft Word add-in, AI insights layer, collaboration hub, analytics dashboards, and the cross-platform design system.
The business objective was to move from a specialist legal tool to a cross-functional workspace: one that legal, procurement, and sales teams could all use independently, without sacrificing the depth that trained lawyers required.

Contract Playbook design session (April 2022). Cross-functional review with Legartis legal team and enterprise customer. Interface visible: Contract Checker, 17 open tasks / 6 completed. Advocating for Playbook Builder as primary onboarding path required pushing against engineering complexity — and the 30-day retention data validated the call.
Contract review is one of the most time-intensive, high-stakes tasks in any legal department. A single NDA could take a lawyer 30–45 minutes to review manually; a complex data processing agreement, several hours. At scale (across hundreds of contracts per month) this was unsustainable.
But the barrier to automation wasn't just technical. It was psychological.
Key pain points:
— Legal counsel, enterprise customer · discovery interview


The distinction between trust and confidence reshaped the entire design programme.
Legal professionals were not opposed to AI assistance. They were opposed to AI they couldn't verify. This reframed the core design challenge from "make AI easier to use" to "make AI legible enough to be professionally defensible."
Everything that followed was downstream of this finding.
Four decisions that changed how lawyers used the AI.
The redesign wasn't about adding features. It was about restructuring the relationship between the reviewer and the AI output, making the AI a trusted collaborator in the legal workflow rather than an opaque suggestion box.
1. Structured review flow, from flat list to step-by-step sequence
The original interface presented all AI findings simultaneously as a flat list beside the document. Reviewers were left to manage their own review progress, decide their own sequence, and track their own completeness. We restructured this as a discrete, progressive review sequence — each step corresponding to a contract section or clause type, with explicit progress tracking and clear completion states.
Before · v1.2 approach
✕ All findings surfaced simultaneously as an undifferentiated list
✕ No indication of review progress or remaining work
✕ Reviewer tracks their own state mentally
✕ AI finding and document clause presented in separate panes, with no direct linkage
✕ No explicit action model: reviewers could read and close, but couldn't "decide"
After · redesign
✓ Findings sequenced as discrete review steps with progress indicator
✓ Explicit progress: "Step 3 of 12 — Insurance obligations"
✓ System tracks review state: resumable, auditable, transferable
✓ AI finding anchored to specific clause with bidirectional navigation
✓ Three-action model: Accept · Annotate · Escalate, with every decision explicit
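
To make the "system tracks review state" idea concrete, here is a minimal TypeScript sketch of a resumable step sequence with a progress label. It is an illustration only, not the shipped Legartis code: the type names, function names, and every step title except "Insurance obligations" are hypothetical.

```typescript
// Hypothetical model of a resumable, step-by-step review session.
type StepStatus = "pending" | "done";

interface ReviewStep {
  title: string; // a contract section or clause type
  status: StepStatus;
}

interface ReviewSession {
  contractId: string;
  steps: ReviewStep[];
}

// Index of the first step that still needs a decision, or -1 when finished.
function currentStepIndex(session: ReviewSession): number {
  return session.steps.findIndex((s) => s.status === "pending");
}

// Progress label derived from stored state, so a review can be resumed
// or handed to a colleague without anyone remembering where it stopped.
function progressLabel(session: ReviewSession): string {
  const i = currentStepIndex(session);
  const step = session.steps[i];
  if (step === undefined) return "Review complete";
  return `Step ${i + 1} of ${session.steps.length}: ${step.title}`;
}

// Returns a new session with the current step marked done. The input is
// not mutated, so earlier states can be kept for audit purposes.
function completeCurrentStep(session: ReviewSession): ReviewSession {
  const i = currentStepIndex(session);
  if (i === -1) return session;
  const steps = session.steps.map((s, idx) =>
    idx === i ? { ...s, status: "done" as StepStatus } : s
  );
  return { ...session, steps };
}

// Illustrative data only.
const demoSession: ReviewSession = {
  contractId: "demo-contract",
  steps: [
    { title: "Confidentiality", status: "done" },
    { title: "Liability", status: "done" },
    { title: "Insurance obligations", status: "pending" },
    { title: "Termination", status: "pending" },
  ],
};

console.log(progressLabel(demoSession)); // "Step 3 of 4: Insurance obligations"
```

Because the label is computed from the stored step statuses rather than from UI position, the same session object can be reopened later or by another reviewer and show the same progress.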

Word Add-in · v1.5 (2022). The step-by-step review flow in action — AI findings are surfaced inline alongside the contract, anchored to specific clause positions. Each finding shows its status (Needs to be verified / Task completed by the software), the finding type, and its location in the document. The reviewer's right panel provides clause-level context with accept/dismiss actions. Progress is tracked by step and visible throughout.
2. AI rationale as the primary trust signal
Research showed that the same AI finding was trusted at significantly different rates depending on how it was explained. We redesigned the finding presentation to lead with the rationale (the specific company requirement being checked, the clause location, and the sentence where it was found) before asking the reviewer to act.
Each finding in the Word Add-in now showed: what the AI checked ("Subprocessing: authorisation requirement"), where it found it ("Found in 1 sentence — 1.1.4"), what the company standard was ("Der Auftragnehmer muss die vorgängige Einwilligung..."), and what decision was needed ("Requirement is fulfilled / Does not apply"). The reviewer was never asked to trust a score. They were given the basis for a judgment and asked to make one.
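
The four elements above can be read as a small data structure. The following TypeScript sketch shows one way to model a finding so that the rationale comes first and no score field exists at all. The interface, field names, and output labels are assumptions for illustration; only the example strings come from the interface described above.

```typescript
// Hypothetical shape of a finding: rationale fields only, deliberately no score.
interface Finding {
  check: string;         // what the AI checked
  clauseRef: string;     // where in the document it was found
  sentenceCount: number; // how many sentences matched
  standard: string;      // the company's own requirement text
  decisions: string[];   // the choices offered to the reviewer
}

// Renders the finding rationale-first: check, location, standard, and only
// then the decision the reviewer is asked to make.
function renderFinding(f: Finding): string[] {
  const unit = f.sentenceCount === 1 ? "sentence" : "sentences";
  return [
    `Checked: ${f.check}`,
    `Found in: ${f.sentenceCount} ${unit}, clause ${f.clauseRef}`,
    `Company standard: ${f.standard}`,
    `Decision: ${f.decisions.join(" / ")}`,
  ];
}

const exampleFinding: Finding = {
  check: "Subprocessing: authorisation requirement",
  clauseRef: "1.1.4",
  sentenceCount: 1,
  standard: "Der Auftragnehmer muss die vorgängige Einwilligung...", // truncated as in the UI
  decisions: ["Requirement is fulfilled", "Does not apply"],
};

renderFinding(exampleFinding).forEach((line) => console.log(line));
```

The ordering in `renderFinding` is the design decision: the reviewer sees the basis for the judgment before the decision controls.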

Word Add-in · v1.4 (2022). The AI rationale model in production. The right panel presents each finding with its full transparency stack: the category ("TOMs"), the status ("Needs to be verified"), the exact clause reference ("1.1.4 Any defects or faults which appear…"), the company's specific requirement, and the explicit decision interface. The reviewer is never left with a score to interpret; they're given a judgment to make.
3. The three-action model: Accept, Annotate, Escalate
Previous versions allowed reviewers to read findings and close the panel. There was no explicit decision model, no mechanism for recording that a finding had been reviewed and actioned, or by whom. This made the review process impossible to audit, impossible to hand off, and impossible to resume.
The redesign introduced three explicit actions for every finding: Accept (the AI assessment is correct and the requirement is met), Annotate (the reviewer disagrees or wants to qualify), or Escalate (the finding requires senior review before a decision can be made). Every action creates a traceable record. Every decision is attributable and reversible. The review becomes an auditable document, not just a process.
This single change addressed three separate pain points: the absence of audit trail, the inability to hand off a review mid-process, and the lack of clarity about what "done" looked like for a given contract.
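
As a rough illustration of how three explicit actions can produce an attributable, reversible record, here is a TypeScript sketch of an append-only decision log. It is not the production implementation. The names are hypothetical, and two rules are my own assumptions for the example: an annotation must carry a note, and a review counts as "done" once every finding has a latest decision that is not an escalation.

```typescript
type Action = "accept" | "annotate" | "escalate";

interface DecisionRecord {
  findingId: string;
  action: Action;
  reviewer: string; // who decided
  at: string;       // ISO timestamp of the decision
  note?: string;    // reviewer's qualification or disagreement
}

// Append-only log: a later decision supersedes an earlier one but never
// erases it, which keeps every decision attributable and reversible.
class ReviewLog {
  private records: DecisionRecord[] = [];

  record(entry: DecisionRecord): void {
    // Assumption for this sketch: annotating without a note is rejected.
    if (entry.action === "annotate" && !entry.note) {
      throw new Error("An annotation needs a note");
    }
    this.records.push({ ...entry });
  }

  // The most recent decision for a finding, if any.
  current(findingId: string): DecisionRecord | undefined {
    for (let i = this.records.length - 1; i >= 0; i--) {
      const r = this.records[i];
      if (r !== undefined && r.findingId === findingId) return r;
    }
    return undefined;
  }

  // Full audit trail for one finding, oldest first.
  history(findingId: string): DecisionRecord[] {
    return this.records.filter((r) => r.findingId === findingId);
  }

  // Assumed definition of "done": every finding decided, none awaiting senior review.
  isDone(findingIds: string[]): boolean {
    return findingIds.every((id) => {
      const latest = this.current(id);
      return latest !== undefined && latest.action !== "escalate";
    });
  }
}

const demoLog = new ReviewLog();
demoLog.record({ findingId: "f1", action: "escalate", reviewer: "junior", at: "2022-10-03T09:00:00Z" });
demoLog.record({ findingId: "f1", action: "accept", reviewer: "senior", at: "2022-10-03T11:30:00Z" });
console.log(demoLog.isDone(["f1"]), demoLog.history("f1").length);
```

In this sketch the escalation and the later acceptance both remain in the history, so a hand-off or audit can reconstruct who decided what and in which order.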

Word Add-in · v1.3 (2022). The "Missing" state: when the AI cannot find a required clause, the interface moves from detection to guidance. Rather than surfacing a negative finding and leaving the reviewer to act, the panel provides the company's standard clause text with a "Needs to be inserted" instruction. The missing finding becomes an actionable task, not a data point to interpret.
4. The platform dashboard: AI as a visible pipeline, not a black box
Alongside the Word Add-in redesign, I led the design of the Legartis web platform, the contract management layer where documents were uploaded, processed, and tracked across the full review lifecycle.
The central design challenge was surfacing the AI review pipeline in a way that was legible to users at different levels of technical understanding. The solution was to represent the AI process as four discrete, auditable pipeline stages: Pre-check (initial document scan), Generator (clause extraction and classification), Comparison (comparison against company standards), and Analysis (risk and compliance assessment). Each stage showed its status, its result, and its available action, making the AI process transparent and navigable rather than opaque.
The process log below each contract provided a complete audit history: who ran each stage, when, with what result, and whether any overrides had been applied. For legal teams working under compliance requirements, this was not a nice-to-have. It was a prerequisite for adoption.
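
The pipeline and its process log can be sketched as a fixed list of stages plus an append-only log from which the next available action is derived. This TypeScript example is illustrative only. The stage names come from the platform; the log fields, the "passed"/"failed" result values, and the rule that stages run strictly in order are assumptions made for the sketch.

```typescript
// Stage names as shown on the dashboard; order assumed to be sequential.
const STAGES = ["Pre-check", "Generator", "Comparison", "Analysis"] as const;
type Stage = (typeof STAGES)[number];

// One row of the process log: who ran which stage, when, with what result,
// and whether a human override was applied.
interface ProcessLogEntry {
  stage: Stage;
  runBy: string;
  at: string; // ISO timestamp
  result: "passed" | "failed"; // hypothetical result values
  overridden: boolean;
}

// A stage is cleared once any log entry shows it passed or was overridden.
function isCleared(log: ProcessLogEntry[], stage: Stage): boolean {
  return log.some(
    (e) => e.stage === stage && (e.result === "passed" || e.overridden)
  );
}

// The next stage the user can act on, or undefined when all four are cleared.
function nextStage(log: ProcessLogEntry[]): Stage | undefined {
  return STAGES.find((stage) => !isCleared(log, stage));
}

const demoProcessLog: ProcessLogEntry[] = [
  { stage: "Pre-check", runBy: "system", at: "2022-10-03T09:00:00Z", result: "passed", overridden: false },
  { stage: "Generator", runBy: "system", at: "2022-10-03T09:01:00Z", result: "failed", overridden: false },
];

console.log(nextStage(demoProcessLog)); // "Generator"
```

Deriving the dashboard state from the log, rather than storing it separately, means the audit history and the visible pipeline status cannot drift apart.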

We ran several rounds of usability testing, at least one per quarter, and saw measurable movement on every target metric.
We ran two structured usability test programmes (Phase I against the v1.3 prototype and Phase II against v1.4) with 8 participants per round drawn from active legal, procurement, and finance users. Each session used a standardised task protocol against a representative contract set.

Full review task completion rate v1.3 → v1.4. Target was 70%+; v1.4 reached 81%. Primary driver: structured step sequence replaced free-form navigation.
Reduction in manual re-reads of AI-flagged clauses between prototype rounds. Reviewers who understood the rationale didn't re-read the source clause to verify the finding.
End-to-end NDA review time in guided prototype vs. baseline. Measured across 8 test participants.
Percentage of test participants who used the Accept/Annotate/Escalate model for every finding without prompting in Phase II, up from 41% in Phase I.
These figures are derived from internal usability test session data (Phase I: July 2022, Phase II: October 2022). They represent controlled prototype testing performance, not live deployment analytics. Live deployment KPI data was not accessible to the design team directly.
UX Test Phase II (2022) — full contract view. The test document used across Phase II sessions: a General Terms & Conditions contract with AI annotations surfaced inline across all five sections. Findings shown include: clause extractions with "Task completed by the software" (green), findings requiring human verification (orange "Needs to be verified"), and flagged deletions ("Needs to be deleted"). The density of the annotation layer across this contract was deliberately chosen to stress-test the interface's legibility under real-world volume conditions.
Beyond the contract review flow.
The redesigned review flow was the strategic core of the work, but the platform scope extended across the full contract lifecycle. Each area below was designed to the same standard of transparency, auditability, and role-appropriate information density.
The component system and interaction patterns established during the v1.3–v1.5 redesign were also documented as a design system foundation. This gave design and engineering a shared language, which reduced handoff friction and let the team move faster across all of the above without inconsistency.
Impact & Results
The Legartis redesign ran across two years and three major version milestones (v1.3 → v1.4 → v1.5). The primary measure of success was adoption: whether legal professionals used the AI review path rather than bypassing it.
Complete platform redesign delivered across four major versions from 2021 to 2023, each driven by structured usability testing and validated before engineering implementation.
Legal counsel, procurement lead, finance controller, CLO, external reviewer, IT admin, power user, onboarding user — each with distinct workflows and success metrics.
From MVP (2020) to a fully transparent, step-by-step review interface (2023). Each version shipped with validated interaction improvements from usability testing.
Accept/Annotate/Escalate model implemented without modification in engineering — a high-fidelity translation from design intent to shipped product.