
50 Interviews per Hour: How AI Synthesis Changes UX without Losing Quality

An hour ago, you finished a series of 50 user interviews. Six weeks of team work: recruiting, transcription, coding, synthesis of insights. Today, the product manager comes in and asks, “What do we have on the B2B admin segment? We urgently need to understand their invite flow.” And you realize the report has no answer: the wrong slice, the wrong question.

This is a typical pain of qualitative research: synthesis takes longer than the interviews themselves, and as soon as the business needs a new angle, everything starts over. AI tools promise to break this cycle. Sometimes they promise too much. This article is about where synthesis really does speed up tenfold without loss of quality, and where AI quietly replaces the narrative with pseudo-insights without the team noticing.

Running the interviews is the easy part. The hard part is turning 50 hours of conversation into a decision the team will actually make.

In the classic flow, synthesis looks like this: transcription, manual coding of quotes, thematic clusters, an affinity diagram, a report with recommendations. Each step is subjective and laborious. One researcher can handle 8-12 interviews a week and still do it well. Past that, the corner-cutting begins: fatigue, haste, merging similar wordings where the difference matters.

At the review with the product manager, a second layer of problems surfaces:

  • Half of the “insights” are a retelling of what you already knew
  • Quotes from the report cannot be quickly found in the originals
  • To the question “in which interviews did this come up?” the answer is “well, about a third”
  • After a month, the entire corpus is effectively dead: no one will go back to it

This is where AI synthesis delivers its most honest win. Not in the interview itself (that needs a live person), not in formulating hypotheses (that needs product context), but in the mechanical part: transcription, tagging, search across the corpus, first-pass clustering.

What "not losing quality" means

Before reaching for the tools, let's agree on the criteria. Quality synthesis is not a “beautiful report.” It is synthesis that passes three tests.

Traceability. Any statement in the report can be traced to a specific quote in a specific interview within a minute. Not “users often say X” but “here are 7 snippets, at these timecodes.”

Reproducibility. Another person running the same corpus through the same procedure will get comparable themes. Not identical, but recognizably the same.

Sensitivity to rare signals. Synthesis must not average everything out. If 3 out of 50 respondents said something important and unexpected, it should surface, not drown in the majority.

An AI pipeline that fails any of these checks saves time at the expense of quality, not alongside it. I will keep coming back to these three criteria.

Where AI really speeds things up and where it only seems to

It is useful to divide the tasks of synthesis into two groups.

Where acceleration is fair

  • Transcription. Modern models handle Russian at acceptable quality in minutes instead of hours, with speaker labels and timecodes.
  • Corpus search. "Show all the places where invites and irritation are mentioned" is a semantic search over embeddings: seconds instead of manual combing.
  • First-pass coding. AI tags fragments against a predefined scheme. Not perfect, but as a draft it saves days.
  • A short summary with key pains and quotes. The model does this well, especially if you specify the structure.
  • Recasting for the audience. The same insight reworded for the product manager, the designer, the C-level.

Where acceleration is dangerous

  • The model willingly generates statements like “users want simplicity and speed.” That is not an insight, it is noise.
  • Prioritizing problems. AI does not know which problem is cheap to fix and important to the business. That call is yours.
  • Clusters that the model builds “blind” are often beautiful but useless: they group people by words, not behavior.
  • "Who else to interview." If you ask the model to fill out the sample, it will do so plausibly and incorrectly.

The simple rule: AI is good where the source material already exists and needs to be processed. AI is bad where you have to make a decision or come up with something that is not in the texts.

Working Pipeline: What It Looks Like in a Team

Strip away everything nonessential, and meaningful AI synthesis in a product team comes down to six steps. I will describe them as they actually play out, including the places where people cut corners and the checks that catch those shortcuts.

1. Preparing the corpus

Before feeding anything to a model, put the corpus in order. Files are named according to a scheme (role, segment, date, id), and each interview has a card: who, when, what hypothesis was tested, what guide was used.

It sounds boring, but without it, two weeks from now you will not be able to answer the question “which study is this quote even from?” Traceability begins here, not at the report stage.
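As a rough illustration of the naming scheme and the interview card, here is a minimal Python sketch. The class name `InterviewCard`, the underscore separator, the field names, and the sample values are all assumptions made for the example; the only thing taken from the text is the set of fields (role, segment, date, id, hypothesis, guide).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InterviewCard:
    """One card per interview: who, when, which hypothesis, which guide."""
    interview_id: str
    role: str
    segment: str
    recorded_on: date
    hypothesis: str
    guide: str

    def filename(self, ext: str = "txt") -> str:
        # Assumed layout: role_segment_date_id.ext
        # (field values must not contain underscores).
        return (f"{self.role}_{self.segment}_"
                f"{self.recorded_on.isoformat()}_{self.interview_id}.{ext}")


def parse_filename(name: str) -> dict:
    """Split 'role_segment_YYYY-MM-DD_id.ext' back into its parts."""
    stem = name.rsplit(".", 1)[0]
    role, segment, day, interview_id = stem.split("_")
    return {"role": role, "segment": segment,
            "recorded_on": date.fromisoformat(day),
            "interview_id": interview_id}


# Made-up sample card, for illustration only.
card = InterviewCard("i017", "admin", "b2b", date(2024, 3, 5),
                     hypothesis="Admins abandon the invite flow at role selection",
                     guide="invite-flow-v2")
print(card.filename())  # admin_b2b_2024-03-05_i017.txt
```

Because the file name can be parsed back into metadata, any quote that carries its file name can be traced to a segment and a date without opening the card.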

2. Transcription with markup

The model transcribes and adds speaker labels and timecodes. Next comes a mandatory step that almost everyone skips: a quick pass through the first 5 minutes of each recording. Not for proofreading, but to catch systematic errors: confused speakers, clipped beginnings, misrecognized product terms.

If the model persistently hears "leads" as "ices," you correct it once in the glossary and move on. Left uncorrected, this noise leaks into every tag and search.
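A glossary correction of this kind could be sketched as follows. The `GLOSSARY` mapping, the segment structure, and the sample sentences are hypothetical; the sketch only shows the idea of fixing a known mishearing once, on whole words, while keeping speakers and timecodes intact.

```python
import re

# Hypothetical glossary: misheard form -> correct product term.
GLOSSARY = {"ices": "leads"}


def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace whole-word mishearings, ignoring case."""
    for wrong, right in glossary.items():
        pattern = rf"\b{re.escape(wrong)}\b"
        text = re.sub(pattern, right, text, flags=re.IGNORECASE)
    return text


# Made-up transcript segments with speaker labels and timecodes.
segments = [
    {"speaker": "R1", "start": "00:03:12", "text": "We lose ices after the first invite."},
    {"speaker": "R1", "start": "00:03:40", "text": "Prices are fine, though."},
]

# Only the text changes; speaker and timecode stay attached to each segment.
fixed = [{**s, "text": apply_glossary(s["text"], GLOSSARY)} for s in segments]
print(fixed[0]["text"])  # We lose leads after the first invite.
```

The word-boundary match matters: without it, "Prices" in the second segment would be corrupted too.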

3. Coding against a scheme

The main thing here is not to let the model invent tags on its own. The code scheme is prepared in advance: pains, tasks, objections, competitor mentions, emotional markers, context of use. 15-30 categories, no more.

The model goes through the corpus and tags the fragments. The output is a table: "fragment / timecode / interview / tags". This is the base artifact that everything else is tied to.
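One way to keep the model from inventing tags is to validate whatever it proposes against the fixed scheme before anything reaches the table. The sketch below assumes a tiny scheme with two categories; `CODE_SCHEME`, `validate_tags`, and the tag values are illustrative names, not part of any particular tool.

```python
# Hypothetical code scheme: category -> allowed values.
# A real scheme would hold the 15-30 categories agreed in advance.
CODE_SCHEME = {
    "pain": {"onboarding", "invites", "pricing"},
    "emotion": {"irritation", "confusion", "skepticism", "curiosity"},
}


def validate_tags(proposed: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split model-proposed tags into accepted and rejected (invented) ones."""
    accepted, rejected = {}, {}
    for category, value in proposed.items():
        if value in CODE_SCHEME.get(category, set()):
            accepted[category] = value
        else:
            rejected[category] = value
    return accepted, rejected


# Pretend the model returned these tags for one fragment.
accepted, rejected = validate_tags(
    {"pain": "invites", "emotion": "irritation", "theme": "simplicity"}
)

# One row of the "fragment / timecode / interview / tags" table.
row = {"interview": "i017", "timecode": "00:12:40",
       "fragment": "I never know if the invite actually went out.",
       "tags": accepted}
print(rejected)  # {'theme': 'simplicity'}
```

Rejected tags are worth logging rather than silently dropping: a tag that keeps being rejected may point to a gap in the scheme.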

4. Clustering by hand, not by model

A common mistake is to ask the model to “find the main topics.” You get beautiful names and lose control. It is better to open the table of tags, sort by frequency, look at the intersections, and assemble the clusters yourself. The model works here as a search engine: “Show all the fragments where pain = onboarding and emotion = irritation.”
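The "search engine" role can be as simple as filtering the tag table. A minimal sketch, assuming rows shaped like the table from step 3; the sample rows and the helper names `select` and `tag_frequency` are made up for illustration.

```python
from collections import Counter

# Made-up rows from the tag table.
rows = [
    {"interview": "i01", "timecode": "00:04:10", "text": "Where do I even start?",
     "tags": {"pain": "onboarding", "emotion": "irritation"}},
    {"interview": "i01", "timecode": "00:09:55", "text": "I wasn't sure what this button does.",
     "tags": {"pain": "onboarding", "emotion": "confusion"}},
    {"interview": "i02", "timecode": "00:02:31", "text": "Setup took me the whole morning.",
     "tags": {"pain": "onboarding", "emotion": "irritation"}},
    {"interview": "i03", "timecode": "00:15:02", "text": "The plans page is a maze.",
     "tags": {"pain": "pricing", "emotion": "irritation"}},
]


def select(rows, **criteria):
    """Return rows whose tags match every given category=value pair."""
    return [r for r in rows
            if all(r["tags"].get(k) == v for k, v in criteria.items())]


def tag_frequency(rows, category):
    """Count distinct interviews per tag value, not raw fragments,
    so one talkative respondent does not dominate the ranking."""
    seen = {(r["interview"], r["tags"][category])
            for r in rows if category in r["tags"]}
    return Counter(value for _, value in seen)


hits = select(rows, pain="onboarding", emotion="irritation")
print([h["interview"] for h in hits])  # ['i01', 'i02']
print(tag_frequency(rows, "pain").most_common())
```

Counting interviews rather than fragments is a design choice of this sketch; counting fragments instead would measure how much people talked about a pain, not how many people had it.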

5. Checking for rare signals

A separate pass through the corpus with the question "what came up only 2-3 times but sounds important?" This step is almost always dropped, and that is a mistake. This is where the future hypotheses sit, the ones that are invisible in the top 5 pains.
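The counting half of this step is mechanical and can be sketched in a few lines; deciding what "sounds important" stays with a person. The function name `rare_signals`, the thresholds as parameters, and the sample rows are assumptions of this example.

```python
from collections import defaultdict


def rare_signals(rows, category, min_interviews=2, max_interviews=3):
    """List tag values mentioned in only a few distinct interviews.

    The result is a reading list of candidates, not a verdict.
    """
    interviews_by_value = defaultdict(set)
    for r in rows:
        value = r["tags"].get(category)
        if value is not None:
            interviews_by_value[value].add(r["interview"])
    return {value: sorted(ids)
            for value, ids in interviews_by_value.items()
            if min_interviews <= len(ids) <= max_interviews}


# Made-up rows: "onboarding" in 5 interviews, "sso" in 2, "export" in 1.
rows = [{"interview": f"i{n:02d}", "tags": {"pain": "onboarding"}} for n in range(1, 6)]
rows += [
    {"interview": "i02", "tags": {"pain": "sso"}},
    {"interview": "i07", "tags": {"pain": "sso"}},
    {"interview": "i04", "tags": {"pain": "export"}},
]

candidates = rare_signals(rows, "pain")
print(candidates)  # {'sso': ['i02', 'i07']}
```

Single mentions are excluded here to match the "2-3 times" wording; lowering `min_interviews` to 1 would widen the list at the cost of more reading.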

6. Assembling the report with references

Every statement in the report comes with a clickable link to the source fragments. Not "many users," but "8 of 47, see snippets." If nobody could be bothered to add the links, the report is not ready.
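A small helper can make the "N of M, see snippets" form the only way a statement gets into the report. This is a sketch under assumptions: the `interview@timecode` reference format, the function name `evidence_line`, and the sample claim are invented for the example, and real links would point at wherever the transcripts live.

```python
def evidence_line(claim, fragments, corpus_size):
    """Render a claim as 'N of M interviews' plus fragment references.

    Raises if the claim has no supporting fragments at all.
    """
    if not fragments:
        raise ValueError(f"No supporting fragments for claim: {claim!r}")
    interviews = {frag["interview"] for frag in fragments}
    refs = ", ".join(f"{frag['interview']}@{frag['timecode']}" for frag in fragments)
    return f"{claim} ({len(interviews)} of {corpus_size} interviews; see {refs})"


# Made-up fragments: two from the same interview, one from another.
fragments = [
    {"interview": "i03", "timecode": "00:12:40"},
    {"interview": "i03", "timecode": "00:31:05"},
    {"interview": "i11", "timecode": "00:07:02"},
]

line = evidence_line("Admins cannot find where to resend an invite", fragments, 47)
print(line)
```

Note that three fragments from two interviews are reported as "2 of 47": the count is over people, the references are over quotes.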

Diagnosis: the pipeline is broken if...

A few symptoms that AI synthesis has turned into a beautiful wrapper with no content.

  • All insights sound like landing-page headlines. "Users want simplicity," "speed is important," "lack of transparency." The model made this up on its own.
  • No one on the team opens the sources. The report lives separately from the corpus. A month later nothing can be verified.
  • The question "why this conclusion?" is met with a pause. If the author of the synthesis cannot show 3-5 fragments behind a statement within 30 seconds, the statement is unsupported.
  • Themes from different studies are suspiciously similar. These are not “stable patterns”; they are the model's stylistic fingerprints.
  • Tags do not match on rerun. You ran the same corpus through the same model a week later and got different clusters. No reproducibility.

Typical mistakes of a designer who has just joined

Trusting summaries instead of quotes

Interview summaries are convenient for a brief but dangerous as a basis for a decision. A summary loses the exact wording, and the wording is precisely the fuel for interface copy, empathy, and prioritization. The rule is simple: quotes go into the mockup, not retellings.

Pulling "the average user said" into the design

If the report says “users get confused at the pricing-plan step,” that is not yet a design task. You need to open the snippets: who exactly gets confused, about what, and what they say a second before the click. Often it turns out that the “confusion” is two different problems in two segments, and they need to be fixed with different mockups.

Treating AI clusters as segments

Clusters built from what people say are not segments. Segments are defined by behavior, context, and JTBD. If you draw personas on top of AI clusters, you get beautiful cards on which no decision can be based.

How to apply this in a mockup tomorrow

The most honest way to pull AI synthesis into a design is to tie it to specific screens.

Scenario: reworking an empty state

  1. Pull out all the fragments that mention the first use of the feature.
  2. Group them by emotion: confusion, skepticism, curiosity, irritation.
  3. For each group, put 2-3 verbatim quotes on the board next to the mockup.
  4. Write the screen copy in response to specific quotes, not to an abstract user.

Scenario: Priority dispute

The product manager says, "Let's fix X first"; you think Y is more important. Instead of arguing, open the corpus and count how many interviews mention X and how many mention Y, in what contexts, and with what emotional weight. It is not the final argument, but it is a conversation about data, not taste.

Questions for reviewing a mockup based on research

  • Which specific interview snippets was this screen built on?
  • Which quotes contradict the current decision, and do any exist at all?
  • If you take away the research, will the mockup change? If not, the research was decorative.
  • Which rare signal from the corpus is deliberately left out of the mockup, and why?

Section summary

AI speeds up the mechanics of synthesis, but it does not make a single decision for you. The designer who pulls quotes directly into the mockup and keeps references to the source wins twice: in speed and in defending decisions at review. A designer who trusts beautiful summaries gets a fast report and a slow product.

Advanced scenarios: when AI synthesis really pays off

The basic pipeline covers single studies. But the real value shows up where synthesis simply was not done before because it was too expensive. A few applied cases.

Cross-study search

The corpus has been accumulating for six months: onboarding, pricing, churn, B2B demos. The product manager asks, “Has anyone ever mentioned integrating with Slack?” That used to cost a researcher a day. Now you query a vector index over the whole corpus, and in a minute you have 14 snippets from 6 different studies, with context and speaker.

The condition for this to work is a unified metadata scheme: segment, pricing plan, interview duration, product, interviewer. Without it you get a dump of quotes with no way to weigh them.
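To make the role of metadata concrete, here is a toy sketch of a vector search with metadata filters. Everything in it is an assumption for illustration: the three-dimensional vectors stand in for real embeddings from whatever embedding model the team uses, and `search`, the index layout, and the ids are hypothetical. A production setup would use a proper vector store.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vec, top_k=3, **filters):
    """Filter by metadata first, then rank the rest by similarity."""
    candidates = [e for e in index
                  if all(e["meta"].get(k) == v for k, v in filters.items())]
    ranked = sorted(candidates,
                    key=lambda e: cosine(e["vec"], query_vec), reverse=True)
    return ranked[:top_k]


# Toy index: placeholder vectors, made-up ids and metadata.
index = [
    {"id": "onb-i04@00:10:02", "vec": [0.9, 0.1, 0.0],
     "meta": {"study": "onboarding", "segment": "b2b"}},
    {"id": "prc-i11@00:22:15", "vec": [0.8, 0.2, 0.1],
     "meta": {"study": "pricing", "segment": "self-serve"}},
    {"id": "chr-i02@00:05:40", "vec": [0.0, 0.1, 0.9],
     "meta": {"study": "churn", "segment": "b2b"}},
]
query = [1.0, 0.0, 0.0]  # placeholder for the embedded question

all_hits = search(index, query, top_k=2)
b2b_hits = search(index, query, top_k=2, segment="b2b")
print([h["id"] for h in all_hits])
print([h["id"] for h in b2b_hits])
```

The two result lists differ: without the segment filter, a self-serve quote ranks second; with it, only B2B fragments are weighed. That is the practical meaning of "without metadata you get a dump of quotes."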

Longitudinal view by segment

Take all the interviews with one type of user over a 12-month period and ask the model how the wording of the pains has changed. This is the kind of work nobody does by hand: too tedious. The AI is not “finding the truth” but highlighting candidates: “Here X was discussed in one tone in March and in another in October.” After that, you go and read the sources yourself.

Tagging for a specific hypothesis

Not “tag all the topics” but “find all the places where the user mentions an alternative tool and the reason for the switch.” The model solves narrow tasks far more accurately than abstract thematic clustering. Use this.

How it fits into the team's work

AI synthesis is not a designer’s personal toy. If a researcher, a product manager, and an analyst sit next to you, you need to agree on who does what.

Who is responsible for what

  • Researcher: quality of the corpus, guides, transcripts, metadata scheme.
  • Designer: quotes in the mockup, checking fragments for contested decisions.
  • Product manager: prioritizes by frequency and segment rather than by the loudness of a single interview.
  • Analyst: links qualitative signals to quantitative ones, showing where the pain from the interviews is visible in the funnel.

If each of these roles drags its own AI digests into its own documents, a month later the team lives in four parallel realities. The fix: one corpus, one index, different slices.

Team anti-patterns with AI synthesis

  • Everyone runs their own prompt. Ten people get ten different lists of themes from one corpus. Then comes an argument about taste.
  • A report without an author. "The AI said so" is not an argument. The synthesis must have a person who is responsible for the conclusions.
  • Synthesis replaces conversation with the user. Six months later, no one on the team remembers what a live customer sounds like. This is a bad sign, even if the metrics are rising.
  • Quotes are taken out of context. An enterprise customer's complaint and a self-serve user's complaint look the same in a digest, but they are two different product tasks.

Quality check: how to know that synthesis can be trusted

A simple ritual worth building into the process.

Checklist before the report goes to the team

  • Each statement is backed by at least 3 fragments with direct links.
  • The report states how many interviews are in the corpus and how many support each claim.
  • Segments are visible: it is clear whether a claim holds for everyone or only for one group.
  • There is a section on "what in the corpus contradicts the conclusions." If it is missing, the authors did not look.
  • Prompts and the model version are recorded, so the synthesis can be reproduced.
  • At least 2-3 key interviews have been read in full, not just in fragments.

Blind stability test

Take the same corpus and run it through the same model with the same prompt a week later. Compare the top topics. If the order and wording drift, the synthesis is not stable, and the “top 5” cannot be relied on. You can rely only on what repeats consistently across several runs.
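One simple way to put a number on "drift" is set overlap between the topic lists of two runs. The sketch below uses Jaccard similarity as a stand-in metric, which is my choice for illustration and not something the procedure above prescribes. It also assumes topic names have already been normalized to identical strings; in practice wordings vary between runs and need matching by hand first. The topic lists are made up.

```python
def jaccard(a, b):
    """Share of topics common to both runs, out of all topics seen."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def stable_topics(runs):
    """Topics that appear in every run: the only ones safe to rely on."""
    return set.intersection(*(set(r) for r in runs))


# Made-up top-5 lists from two runs a week apart.
run_week1 = ["invite flow", "role permissions", "pricing page", "sso setup", "export"]
run_week2 = ["invite flow", "pricing page", "role permissions", "notifications", "billing"]

overlap = jaccard(run_week1, run_week2)   # 3 shared topics out of 7 distinct
stable = stable_topics([run_week1, run_week2])
print(round(overlap, 2), sorted(stable))
```

Set overlap ignores order, so a reshuffled top 5 scores as fully stable. If ranking matters for the decision, compare positions as well.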

Manual sampling

Once a quarter, take 5 random interviews from the corpus and tag them by hand without looking at the AI markup. Then compare. If the discrepancies are systematic, for example the model consistently misses one type of pain, that can be fixed with a prompt change or a different model, but you have to know about it first.
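The comparison itself can be reduced to a per-tag miss rate: for every tag a human applied, how often did the AI markup lack it on the same fragment? A minimal sketch, with hypothetical fragment ids and tag names; `missed_by_model` is an illustrative helper, and it treats the manual markup as ground truth, which is itself an assumption.

```python
from collections import Counter


def missed_by_model(manual, ai):
    """Per-tag share of manually tagged fragments the AI markup missed.

    `manual` and `ai` map fragment id -> set of tags.
    """
    total, missed = Counter(), Counter()
    for frag_id, manual_tags in manual.items():
        ai_tags = ai.get(frag_id, set())
        for tag in manual_tags:
            total[tag] += 1
            if tag not in ai_tags:
                missed[tag] += 1
    return {tag: missed[tag] / total[tag] for tag in total}


# Made-up markup for four fragments.
manual = {"f1": {"onboarding"}, "f2": {"onboarding", "trust"},
          "f3": {"trust"}, "f4": {"pricing"}}
ai = {"f1": {"onboarding"}, "f2": {"onboarding"},
      "f3": set(), "f4": {"pricing"}}

miss_rate = missed_by_model(manual, ai)
print(miss_rate)  # "trust" is missed every time: a systematic gap
```

A miss rate near 1.0 on one tag and near 0.0 on the rest is the "systematic" pattern the text describes; scattered misses across all tags point to general noise instead.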

How to explain the decision to the team

The most common problem with AI synthesis is not technological but communicational. The decision has been made, but the team does not understand exactly how it relates to the interviews.

Structure of the review argument

  1. The problem in one sentence, as the user puts it: a verbatim quote.
  2. How many times and in which segments it occurs in the corpus.
  3. What we tried before and why it didn’t work (if there’s a story).
  4. The current decision and which specific fragments it addresses.
  5. Which signals from the corpus it deliberately ignores, and why.

The fifth point is critical. A designer who honestly says, “We are not addressing these 3 complaints in this release because they are about a different segment” looks stronger than someone who pretends to have taken everything into account.

Questions to be prepared for

  • Is this the opinion of one high-profile customer or a pattern?
  • What segment did we see this in and does it match the target segment?
  • What in analytics data confirms or disproves this qualitative signal?
  • If we do the opposite, what specific quotes would disprove that?

If these questions have short answers with links, the decision is defended. If not, you’re selling a beautiful report, not a design.

Section summary

AI synthesis becomes the team’s infrastructure, not the designer’s personal accelerator, only when there is a single corpus, shared metadata, and a person accountable for the conclusions. Everything else is parallel hallucinations in beautiful documents.

Starting checklist: what should be ready before the first corpus

Teams most often fail at AI synthesis not on the prompts but on the preparation. Half of the “the model hallucinates” problems are really “we have no decent transcripts and no agreed vocabulary.”

Minimum stack before touching the model

  • Transcripts with timecodes and speaker identifiers, not one solid block of text.
  • A single interview metadata template: segment, role, pricing plan, length of product use, acquisition channel.
  • A shared vocabulary of pains and tasks: “onboarding”, “activation”, “migration” mean the same thing to the researcher, the designer, and the product manager.
  • A shared understanding of what counts as an insight and what is an observation. Without it, half of the reports are retellings.
  • A fixed list of prohibitions for the model: do not invent quotes, do not combine different segments, do not smooth out contradictions.
  • Clear responsibility: who owns the corpus, who writes the prompts, who signs the conclusions.

If three or more of these items are not covered, AI synthesis will look fast, but basing decisions on it is dangerous.

Checklist for every run

  • The model version and temperature are recorded.
  • The prompt lives in the repository, not in someone's private messages.
  • The corpus slice is specified: which interviews are included, which are excluded and why.
  • The result is stored as a dated artifact, not written over the previous one.
  • Differences from the previous run are described explicitly: "Themes 1-3 are stable, theme 4 appears for the first time."
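The per-run checklist maps naturally onto a small manifest stored next to each result. The sketch below is one possible shape, not a standard: the field names, the `model-x-2024-05` identifier, the file-name pattern, and the sample topics are all hypothetical. Hashing the prompt is an extra assumption of this example; it makes it easy to confirm that two runs used the same prompt text.

```python
import hashlib
import json
from datetime import date


def run_manifest(model, temperature, prompt, included, excluded, topics, run_date):
    """Describe one synthesis run so it can be repeated and compared later."""
    return {
        "date": run_date.isoformat(),
        "model": model,
        "temperature": temperature,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "included": sorted(included),
        "excluded": excluded,  # interview id -> reason for exclusion
        "topics": topics,
    }


def diff_topics(previous, current):
    """Spell out what is stable, new, or dropped relative to the last run."""
    prev, cur = set(previous["topics"]), set(current["topics"])
    return {"stable": sorted(prev & cur),
            "new": sorted(cur - prev),
            "dropped": sorted(prev - cur)}


prompt = "Tag each fragment using the code scheme. Do not invent quotes."
included = ["i01", "i02", "i03"]
excluded = {"i04": "pilot interview, old guide"}

run1 = run_manifest("model-x-2024-05", 0.0, prompt, included, excluded,
                    ["invite flow", "role permissions", "pricing page"],
                    date(2024, 6, 3))
run2 = run_manifest("model-x-2024-05", 0.0, prompt, included, excluded,
                    ["invite flow", "role permissions", "pricing page", "sso setup"],
                    date(2024, 6, 10))

changes = diff_topics(run1, run2)
# A dated name means a new run never overwrites the previous artifact.
artifact_name = f"synthesis_{run2['date']}.json"
print(artifact_name)
print(json.dumps(changes, indent=2))
```

The diff output is the machine-readable version of "Themes 1-3 are stable, theme 4 appears for the first time."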

Anti-Patterns at the Process Level

We have covered the team-level anti-patterns. There is another group, less noticeable but more harmful in the long run.

"We'll catch up by hand" and never catching up

The team introduces AI synthesis for new interviews and leaves the old archive out. Six months later the corpus is a patchwork: some of it tagged, some not, some tagged under the old segment scheme. Any retrospective question turns into a small separate study.

Synthesis without product feedback

Reports come out regularly, but no one keeps track of which findings actually became features, experiments, or rejections. A year later it is impossible to say whether AI synthesis helped make a single decision that would not have been made anyway.

One prompt for all tasks

A universal "pull out the insights" prompt works only in a demo. A real task needs different slices: pains by segment, contradictions with the hypothesis, quotes for a specific screen. A team that has one prompt for everything inevitably slides into flat reports.

The researcher becomes a model operator

The quietest anti-pattern. The researcher stops going to interviews and listening to live people, and turns into an editor of AI digests. Prompt quality goes up; sensitivity to nuance goes down.

Questions for report review

A short list worth asking before the report goes into work.

About the data

  • Which corpus is the conclusion based on, and how fresh is it?
  • Which segments are represented and which are not?
  • Which conclusions rest on one or two interviews, and which on a stable pattern?

About the method

  • Where does the model's work end and human interpretation begin?
  • What alternative explanations have we considered and discarded?
  • What in the corpus contradicts the main conclusion and how do we explain it?

About the decision

  • What specific action follows from the report?
  • What will we stop doing if we believe these conclusions?
  • What signal in production or in the next interviews will show that we are wrong?

The last question is the most useful. A report that does not let you formulate a falsifiable expectation does not move the product.

Practical outcome

AI synthesis does not replace UX research and does not make it cheaper in the long run; it redistributes effort. Less time goes into manual tagging, more into corpus quality, prompt precision, and honest interpretation.

A working practice rests on three things: one corpus with decent metadata, reproducible runs with recorded artifacts, and a person who signs off on the conclusions and is ready to answer questions with links. Everything else (speed, templates, dashboards) is built on top.

And most importantly, stay close to live interviews. The model summarizes well what is already in the corpus. What is new comes into the product from a person who heard something unexpected in a conversation.
