How to prove that the redesign worked: metrics that convince management
The redesign has launched. The team is happy: it looks better and is more convenient to use. Management looks at us and asks, "So, did this make us money?"
"Well... it got better" is a bad answer. Not because the redesign didn't work, but because the designer didn't prepare for this question in advance.
Proving the result of a redesign begins before launch, with the right choice of metrics and a recorded baseline.
Mistake #1: No baseline
The most expensive mistake is to start measuring only after launch. There is nothing to compare against.
Is a 4.3% conversion rate good or bad? Unknown. "Conversion went from 2.8% to 4.3%" is +54%, and that is a conversation.
**Rule:** 2-4 weeks before any launch, record all the metrics you plan to improve. A dated screenshot of the dashboard is the minimum. A CSV with the data is better.
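The +54% figure above is a relative change, not a difference in percentage points (which would be 1.5). A minimal sketch of the arithmetic, where the function name `relative_change` is only illustrative:

```python
def relative_change(before: float, after: float) -> float:
    """Return the relative change from `before` to `after` as a fraction."""
    if before == 0:
        raise ValueError("baseline must be non-zero to compute a relative change")
    return (after - before) / before


# Conversion went from 2.8% to 4.3% (the example from the text).
lift = relative_change(2.8, 4.3)
print(f"{lift:+.0%}")  # prints "+54%"
```

Without the baseline value there is nothing to pass as `before`, which is the whole point of recording it ahead of launch.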
What Metrics to Choose: Three Levels
Level 1: Behavioral Metrics (UX Metrics)
This is what changes the fastest and is directly related to the design solution.
**For onboarding redesign:**
- Onboarding completion rate
- Time to first key action
- Drop-off on every step
**For form redesign:**
- Form completion rate
- Error rate by field
- Time to complete
**For the redesign of the main flow:**
- Task success rate
- Task completion time
- Error rate on key steps
These metrics show a change 1-2 weeks after launch.
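Drop-off per step is the easiest of these to compute from raw counts. A small sketch, assuming you can export the number of users who reached each step; the step names and counts below are made up for illustration:

```python
def step_drop_off(step_counts: list[tuple[str, int]]) -> list[tuple[str, float]]:
    """For each step after the first, return the share of users lost since the previous step."""
    result = []
    for (_, prev_n), (name, n) in zip(step_counts, step_counts[1:]):
        drop = 0.0 if prev_n == 0 else 1 - n / prev_n
        result.append((name, drop))
    return result


# Hypothetical funnel: (step name, users who reached it).
funnel = [("opened form", 1000), ("filled email", 800), ("submitted", 600)]
drops = step_drop_off(funnel)
for name, drop in drops:
    print(f"{name}: {drop:.0%} dropped")
```

Running the same function on the baseline export and on the post-launch export gives a step-by-step before/after comparison.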
Level 2: Product metrics
This is what matters to the product and the team.
- **Activation rate** - whether users reach the aha moment
- D7/D30 retention - whether users come back
- Feature adoption - whether key features are used
- Funnel conversion - conversion across the whole funnel
These metrics show a change in 3-6 weeks.
Level 3: Business Metrics
This is what convinces management.
- Conversion rate (if you redesigned the acquisition funnel)
- Trial-to-paid conversion
- Churn rate (after 60-90 days)
- Revenue impact - additional revenue
- CAC - if the funnel was improved
These metrics show a change in 2-3 months.
How to structure measurement: before, during, after
Prior to launch (2-4 weeks)
Record the baseline on all three metric levels. If something isn't being measured yet, now is the time to set up analytics.
Also record a qualitative baseline:
- SUS test with 5 users
- Typical patterns from session recordings
- Quotes from interviews or reviews
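The quantitative part of the baseline can be stored as a dated CSV, as suggested earlier. A minimal sketch using only the standard library; the metric names and values are placeholders, not real data, and the column layout is an assumption:

```python
import csv
import io
from datetime import date


def baseline_csv(metrics: dict[str, float], recorded_on: date) -> str:
    """Serialize a dated baseline snapshot as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["recorded_on", "metric", "value"])
    for name, value in metrics.items():
        writer.writerow([recorded_on.isoformat(), name, value])
    return buffer.getvalue()


snapshot = baseline_csv(
    {"form_completion_rate": 0.42, "d7_retention": 0.18},  # made-up example values
    date(2024, 3, 1),
)
print(snapshot)
```

Writing the returned text to a file in the project folder gives a record with the date attached that nobody can later argue about.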
First 2 weeks after launch
Look at the UX metrics, not the business metrics: it is too early for those. Do not draw conclusions from the first week's data because of the novelty effect.
What you're looking for: clear regressions. If something gets worse, you have to react.
4-6 weeks after launch
Look at product metrics. The novelty effect has passed and the data has stabilized.
Prepare the first results report for the team.
2-3 months after launch
Look at business metrics. This is the final report for management.
A/B Method: The Only Way to Prove Causation
The problem with before/after measurement is that other factors change too. Marketing campaigns, seasonality, and industry news all affect metrics at the same time as the redesign.
"Conversion increased 40 percent after the redesign" is a correlation. An A/B test proves causation: you show the old and new designs at the same time, so the difference is explained only by the design.
When A/B is required: for high investment decisions. If the redesign took 3 months and cost 2 million rubles, you need proof, not correlation.
When A/B is not required: for small changes with obvious direction. Fix broken UX, remove a critical error in the form - the result is quite obvious without a control group.
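An A/B result still needs a check that the difference is larger than random noise. One common way to do that (not prescribed by this article) is a two-proportion z-test; the sketch below uses only the standard library, and the visitor and conversion counts are invented for illustration:

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test: old design 280 of 10,000 converted, new design 430 of 10,000.
z, p = two_proportion_z_test(280, 10_000, 430, 10_000)
print(f"z = {z:.2f}, p = {p:.2g}")
```

A small p-value says the gap is unlikely to be chance. It does not say the gap is large enough to matter for the business, so report the rates alongside it.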
What not to do when measuring
Don’t just look at “good” metrics
Always check guardrail metrics - the ones that shouldn't get worse.
For example, the redesign of the registration form increased the completion rate by 25%. Great. But the D30 retention of new users fell 15% - because the form now registers non-target users.
The bottom line: mixed. Without guardrail metrics, you would only see the "victory".
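The example above can be turned into an automatic check. This is a sketch under two assumptions that are mine, not the article's: every guardrail metric is "higher is better", and a relative drop of more than 5% counts as a violation. The absolute values are invented so that they reproduce the +25% and -15% from the example:

```python
def guardrail_violations(
    before: dict[str, float],
    after: dict[str, float],
    guardrails: list[str],
    tolerance: float = 0.05,  # assumed threshold; pick your own
) -> dict[str, float]:
    """Return guardrail metrics whose relative drop exceeds `tolerance`."""
    violations = {}
    for name in guardrails:
        change = (after[name] - before[name]) / before[name]
        if change < -tolerance:
            violations[name] = change
    return violations


before = {"form_completion_rate": 0.40, "d30_retention": 0.20}
after = {"form_completion_rate": 0.50, "d30_retention": 0.17}  # +25% and -15%
violations = guardrail_violations(before, after, guardrails=["d30_retention"])
for name, change in violations.items():
    print(f"guardrail broken: {name} changed by {change:+.0%}")
```

The list of guardrails should be fixed before launch, together with the success metrics.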
Don't draw conclusions too early
The first week's data is unreliable:
- Novelty effect: Users behave differently because something has changed
- Sample bias: the most active users arrive first
- Cache and CDN: some users see the old version for a few more days
A minimum of 2 full weeks before any conclusions.
Don't credit the redesign with the effect of other changes
If an email campaign launches at the same time as the redesign, it is impossible to separate the effects accurately without an A/B test. Be honest: "We're seeing the metric rise after launch. This is probably the redesign, but a campaign ran in parallel. The two cannot be separated precisely without A/B."
Outcome report format
You need a report that takes 2 minutes to read. Here's the structure:
Version for CEO (1 slide / 1 page)
REDESIGN: [name]
Launched: [date]
Measured: [after how long]
RESULTS
[Metric 1]: was X% → became Y% (+Z%)
[Metric 2]: was X → became Y
[Financial effect]: +N rubles per month / +M users
CONCLUSION
[1–2 sentences: What it means for business]
NEXT STEP
[What we plan to do based on these results]
Team version (more detailed)
Add to that:
- Data on each step of the funnel (before and after)
- Segment analysis (mobile vs. desktop, new vs. returned)
- Unexpected findings (what surprised us)
- What didn't work and why (if any)
- Hypotheses for the next iteration
What to do if the result is disappointing
It's not over. It's data.
The metric hasn't changed:
- Was the right metric chosen to measure this change?
- Has enough time passed?
- Do other factors offset the improvement?
The metric got worse:
- Is it the novelty effect (it should pass within a month)?
- Are there new problems that outweigh the improvements?
- Was the hypothesis correctly formulated?
How to present a bad result:
Honestly, with an analysis of the causes and a plan for the next steps. "We expected X, we got Y. By our analysis, the cause is Z. The next step is [a specific action]."
It's professional. And that teaches the team better than silence.
AI and Proof of Redesign Results
Prompt: Select the right metrics before launch
We are planning a redesign [of what exactly].
Our hypothesis is that if we [change], then [the expected effect].
Product: [Description]
Key business objectives: [Description]
Help select metrics to measure the result:
1. UX metrics (which will change quickly)
2. Product metrics (which we check in a month)
3. Business Metrics (which will prove to management)
4. Guardrail metrics (which cannot be degraded)
For each: how to measure it and how long it takes to see the result.
Prompt: Analyze data after redesign
The redesign of [what] launched on [date]. Time since launch: [how long].
Data before:
[metrics]
Data after:
[metrics]
Parallel changes during this period: [what has changed]
Analyze the results:
1. Which metrics improved?
2. What is the probability that the improvement is due to the redesign (and not other factors)?
3. Are there any unexpected results (better or worse than expectations)?
4. What do you recommend as the next step?
Prompt: Write a report on the results
Help write a report on the results of the redesign.
Data:
[all figures before and after]
Target audience: [CEO/whole team/investors]
Write:
1. Executive summary (3-5 sentences)
2. Detailed analysis of each metric
3. Financial impact (help calculate it from the data)
4. Conclusion and next step
Tone: direct, data first, no design jargon.
Prompt: Explain a neutral or negative result
The redesign of [what] did not produce the expected result.
Expected: [hypothesis and numbers]
Received: [Real data]
Parallel factors: [what else happened]
Help:
1. Find possible explanations for the result
2. Separate “redesign didn’t work” from “we measured wrong”
3. Formulate the next step
4. Write honest communication for the team and management
Specifics of measuring different types of redesign
Not all redesigns are measured the same. The type of change determines which metrics to choose and when to view the result.
Landing redesign
**What to measure:** conversion rate (visitor → registration / application / purchase), bounce rate, time on page, scroll depth.
**When to watch:** 2-3 weeks for the data to stabilize. A landing page reacts quickly: there is no retention lag.
**Method:** an A/B test is required if traffic permits. Without it, it is difficult to separate the redesign effect from a change in traffic.
**What is often overlooked:** lead quality. A new landing page can bring more registrations that convert worse into paying customers. Look beyond the top of the funnel.
Onboarding redesign
**What to measure:** onboarding completion rate, time to activation, D7 retention.
**When to watch:** completion rate after 1-2 weeks, D7 retention after 5-6 weeks (the cohorts need time to reach day 7).
**What is often overlooked:** D7 retention improved, but the quality of activated users changed. If the new users are non-target, D30 retention will drop later.
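The remark about giving cohorts time to reach day 7 matters in the calculation itself: users who signed up less than 7 days ago cannot have returned on day 7 yet, and counting them drags the number down. A minimal sketch, assuming each user is represented as a hypothetical `(signup_date, returned_on_day_7)` pair:

```python
from datetime import date, timedelta
from typing import Optional


def d7_retention(users: list[tuple[date, bool]], today: date) -> Optional[float]:
    """D7 retention over users who signed up at least 7 days before `today`."""
    mature = [
        returned
        for signup, returned in users
        if signup + timedelta(days=7) <= today
    ]
    if not mature:
        return None  # no cohort has reached day 7 yet
    return sum(mature) / len(mature)


users = [
    (date(2024, 3, 1), True),
    (date(2024, 3, 2), False),
    (date(2024, 3, 10), False),  # too recent: excluded from the calculation
]
retention = d7_retention(users, today=date(2024, 3, 12))
print(retention)
```

The same filter with 30 days applies to D30 retention, which is why that metric arrives so much later.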
Redesign of key flow (not onboarding)
**What to measure:** task completion rate, task completion time, error rate, drop-off steps.
**When to watch:** 2 weeks for task metrics, 4-6 weeks for the business effect.
**What is often overlooked:** how a change in one flow affects adjacent ones. Example: task creation was simplified, users began creating more tasks, the load on the task list grew, and new problems appeared.
UI/design system redesign (without changing UX)
This is the most difficult case to measure. Visual changes without changing the structure of the interaction.
What to measure: NPS/CSAT (subjective perception), SUS test (assessment of usability before and after), brand perception (if there are surveys).
**What is hard to measure:** the long-term brand effect. A beautiful and consistent interface affects trust and perceived quality, but it is hard to attribute directly to conversion.
Honest answer: For a purely visual redesign without changing UX, it’s hard to show direct ROI. An indirect argument: Decreasing design debt = accelerating the team.
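For the SUS test mentioned above, the score follows the standard SUS scoring rule: ten answers on a 1-5 scale, odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5. A short sketch:

```python
def sus_score(responses: list[int]) -> float:
    """Score one SUS questionnaire: 10 answers on a 1-5 scale, result on a 0-100 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs exactly 10 answers, each between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        # Items 1, 3, 5, ... are positively worded; items 2, 4, 6, ... are negatively worded.
        total += (answer - 1) if index % 2 == 0 else (5 - answer)
    return total * 2.5


neutral = sus_score([3] * 10)
best = sus_score([5, 1] * 5)
print(neutral, best)
```

Average the per-user scores before and after the redesign. With only 5 users, treat the difference as a qualitative signal rather than a statistic.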
Segmentation as a tool for understanding results
An aggregated result often hides the important details. Segmentation reveals them:
Mobile vs. Desktop
The redesign improved conversion on desktop but did not change it on mobile: mobile-specific problems probably remain unresolved. This is the direction for the next iteration.
New vs. Returning
If only new users react to the change, it confirms that the redesign affects the first impression. If only returning users react, the redesign affects deep usage.
Acquisition channels
Users from organic search and from advertising behave differently. A landing page redesign can convert "warm" organic traffic very well and work worse for "cold" advertising traffic.
Cohorts by time
If the results of the first week are very different from the fourth, this is a novelty effect. Look at stabilization.
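A segment breakdown like the mobile vs. desktop case is a simple grouping over exported rows. The sketch below assumes a hypothetical row shape of `(segment, period, visitors, conversions)`; the numbers are invented to show a lift that exists only on desktop:

```python
def conversion_by_segment(
    rows: list[tuple[str, str, int, int]],
) -> dict[str, dict[str, float]]:
    """Return {segment: {period: conversion rate}} from (segment, period, visitors, conversions) rows."""
    rates: dict[str, dict[str, float]] = {}
    for segment, period, visitors, conversions in rows:
        rates.setdefault(segment, {})[period] = conversions / visitors
    return rates


rows = [
    ("desktop", "before", 5000, 150),
    ("desktop", "after", 5000, 225),
    ("mobile", "before", 5000, 100),
    ("mobile", "after", 5000, 100),
]
rates = conversion_by_segment(rows)
for segment, by_period in rates.items():
    print(f"{segment}: {by_period['before']:.1%} -> {by_period['after']:.1%}")
```

The same function works for new vs. returning users or for acquisition channels: only the segment label changes.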
How to Create a Culture of Measurement in a Design Team
Measuring results is not a one-time action, but a skill that needs to be built in a team.
Definition of Done includes metrics
A task is not done when the design is approved and development is finished. It is done when:
- Success metrics are defined (before launch)
- The baseline is recorded (before launch)
- The measurement window is set (how long we observe)
- The result is recorded (after launch)
This adds 2-3 hours to each task. It's worth doing.
Metrics retrospective
Once a quarter, take all the projects from the previous quarter and look at which achieved the expected result and which did not. Analyze why. This is the best way to learn to make better predictions.
Publicity of results
Results, including disappointing ones, are open to the whole team. Not to praise or scold, but to learn. "This is what we expected, this is what we got, this is what we think about it" is a culture of honest data discussion.
When Metrics Lie: The Traps in Measuring Results
Before/after data can show an improvement that isn’t there — or hide an improvement that is there. Here are the classic traps.
The seasonality trap
The redesign launched in September. In October-November, conversions increased. Is it a redesign or a seasonal increase in demand?
**How to protect yourself:** Compare with the same period last year, not just with the period before launch. Use an A/B test to neutralize the seasonal effect.
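When an A/B test is not possible, last year's data allows a rough correction. The sketch below is one simple approach and rests on a strong assumption of mine: seasonality is multiplicative and repeats from year to year. All numbers are invented:

```python
def seasonally_adjusted_lift(
    before: float,
    after: float,
    last_year_before: float,
    last_year_after: float,
) -> float:
    """Estimate the lift left over after removing last year's seasonal change."""
    seasonal_ratio = last_year_after / last_year_before
    expected_after = before * seasonal_ratio  # what seasonality alone would predict
    return after / expected_after - 1


# Hypothetical conversion rates, in percent.
raw_lift = 2.6 / 2.0 - 1
adjusted_lift = seasonally_adjusted_lift(2.0, 2.6, last_year_before=2.0, last_year_after=2.4)
print(f"raw: {raw_lift:+.0%}, seasonally adjusted: {adjusted_lift:+.0%}")
```

This is an estimate, not proof: last year may have had its own campaigns or product changes, so state that caveat in the report.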
Traffic trap
Conversion is up. But the traffic has become higher quality: a new advertising channel with a more targeted audience was launched. The redesign has nothing to do with it.
**How to protect yourself:** Look at conversion for each channel separately. If the growth comes only from the new channel, it is not the result of the redesign.
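The traffic trap is easy to reproduce with numbers: overall conversion rises while every channel that existed before launch stays flat. A sketch with invented data, assuming each period is a hypothetical mapping of channel to `(visits, conversions)`:

```python
Channels = dict[str, tuple[int, int]]  # channel -> (visits, conversions)


def overall_rate(data: Channels) -> float:
    visits = sum(v for v, _ in data.values())
    conversions = sum(c for _, c in data.values())
    return conversions / visits


def per_channel_change(before: Channels, after: Channels) -> dict[str, float]:
    """Change in conversion rate for channels present in both periods."""
    shared = sorted(before.keys() & after.keys())
    return {
        ch: after[ch][1] / after[ch][0] - before[ch][1] / before[ch][0]
        for ch in shared
    }


before = {"organic": (1000, 30), "ads": (1000, 10)}
after = {"organic": (1000, 30), "ads": (1000, 10), "new_channel": (1000, 60)}

overall_delta = overall_rate(after) - overall_rate(before)
channel_deltas = per_channel_change(before, after)
print(f"overall: {overall_delta:+.2%}")
print(channel_deltas)
```

Here the overall rate goes up purely because of the traffic mix. If the redesign had worked, the shared channels would show a change too.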
Novelty effect trap
Users behave differently simply because something has changed. The first 7-10 days are not reliable.
**How to protect yourself:** Wait at least 2 full weeks before drawing conclusions. Look at where the trend stabilizes, not at the peak values of the first days.
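One way to apply this rule mechanically is to drop the first two weeks from the average. A minimal sketch, assuming a list of daily values of the metric starting from launch day; the series below is invented to show an early spike that fades:

```python
from statistics import mean


def stabilized_average(daily_values: list[float], skip_days: int = 14) -> float:
    """Average a daily metric after discarding the first `skip_days` days after launch."""
    stable = daily_values[skip_days:]
    if not stable:
        raise ValueError("not enough data after the skipped period")
    return mean(stable)


# Hypothetical daily conversion, in percent: spike, decline, plateau.
daily = [6.0] * 7 + [5.0] * 7 + [4.0] * 14
naive = mean(daily)
stable = stabilized_average(daily)
print(f"naive average: {naive:.2f}, after 2 weeks: {stable:.2f}")
```

The naive average is inflated by the first days. The stabilized value is the one to compare against the baseline.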
Survivor bias trap
You measure only users who reached the end of the flow. But the redesign could attract new users (or scare them off) before that point.
**How to protect yourself:** Measure the whole funnel from the first touch, not just the final conversion rate.
Hawthorne effect trap
Users know that they are being watched (participate in the test) and behave differently. Especially relevant for usability tests, less for product data.
**How to protect yourself:** For product data (not tests) the effect is minimal. For usability tests, discount the first 10 minutes of the session, while the user is still getting accustomed.
Quick wins vs. long-term effects: how to balance
Some metrics react quickly – and it’s tempting to optimize for them. But quick wins don’t always mean long-term results.
Example of conflict
An aggressive popup when the user leaves the landing page ("Wait! 20% off!"):
- Short term: conversion rate up 15%
- Long term: NPS down, retention down, brand perception down
The "conversion" metric showed victory. Business lost.
How to find balance
Guardrail metrics are insurance. For any optimization test, decide in advance which metrics must not get worse. NPS, retention, time to value: these must remain stable or grow.
Time horizons. Look not just at 2-week results. Cohorts of users who have gone through the new design - how do they behave in 60 and 90 days?
Template: knowledge base of redesign results
A good team accumulates knowledge. The bad one solves the same problems again.
The structure of the knowledge base
Notion, Confluence or any wiki. One page for each measured redesign:
REDESIGN: [Name]
Date: [launch]
Author: [designer]
Problem
[What was wrong, data before]
Hypothesis
[If we [change], then [the expected result], because [the cause]]
Decision
[What changed, link to Figma]
RESULTS (measured: [date])
Metric 1: [was X → became Y]
Metric 2: [was X → became Y]
Financial impact: [calculation]
CONCLUSION
[The hypothesis was confirmed / partially / not confirmed]
[Explanation]
NEXT STEP
[What we do based on this result]
What would we do differently?
[Honest retrospective]
In a year, such a knowledge base becomes a valuable asset. New designers get up to speed in days instead of months. The rationale for new redesigns rests on real precedents. The team doesn't repeat mistakes.