Tabbi.

Peer-reviewed qualitative analysis using the GATOS methodology. Defensible insights for high-stakes decisions.

Capabilities

  • Thematic Analysis
  • Decision Network Mapping
  • Causal Inference
  • Generative Reporting

© 2026 Tabbi Research. All rights reserved.

Our Methodology

The GATOS workflow

Generative AI‑enabled Theme Organization and Structuring — a peer‑reviewed approach to qualitative analysis with full traceability.

Every insight traces back to source data. No black boxes.

Methodology Guarantees

Traceability by design

  • Every theme maps back to specific participant evidence
  • Extract‑based workflow preserves nuance and context
  • Audit‑ready deliverables you can defend internally

The Difference

Why traceability changes everything

Without traceability

  • Black box outputs: AI generates "insights" with no way to verify if they came from your data or the model's imagination.
  • Undetectable hallucination: Plausible-sounding themes that aren't grounded in participant responses. No audit trail to catch them.
  • Unverifiable claims: "67% mentioned X", but can you see which responses? Can a reviewer check your numbers?
  • Lost voices: Summarization strips nuance. Themes become abstractions disconnected from the people who expressed them.

With the GATOS workflow

  • Full evidence chain: Every theme traces through codes, clusters, and extracts back to specific participant utterances.
  • Hallucination prevention: Constrained code generation reads nearest-neighbor extracts first. The model cannot invent categories unsupported by data.
  • Audit-ready quantification: Every prevalence count links to the actual extracts. Reviewers can drill down to verify any finding.
  • Preserved participant voice: Extract-based workflow captures each idea in the participant's own framing. Nothing is lost to summarization.

The Workflow

From raw data to traceable themes

GATOS maintains a chain of custody at every stage. Each step feeds the next with full provenance intact.

1

Raw Utterances

Original participant feedback enters the pipeline — interviews, surveys, open-ended responses. Every source is tagged and preserved.

2

Extract Creation

Each utterance is distilled into discrete information points, capturing a single idea in the participant's own framing. One response may yield multiple extracts.

3

Semantic Clustering

Extracts are embedded into vector space and grouped using PCA, UMAP, and agglomerative clustering. Similar ideas from different participants converge naturally — no predefined categories.

4

Codebook Development

Codes are generated through constrained nearest-neighbor retrieval. The model must read existing extracts before proposing new codes — preventing hallucination by design.

5

Theme Synthesis

Codes are organized into themes with every connection preserved. Any theme can be traced back through codes → clusters → extracts → original utterances.

The Key Innovation

Every theme can be traced back through codes → clusters → extracts → original utterances. You can verify exactly which participant voices contributed to each insight.

Interactive Traceability Chain

Explore how insights connect back to source data

Theme
Workflow Friction Drives Frustration
High-level pattern identified across multiple codes

Key: Every theme traces back through codes, clusters, and extracts to specific participant utterances, so each claim can be checked against its source.

Deep Dive

How each step works

1

Extract Creation

Raw participant utterances are distilled into discrete information points — each capturing a single idea in the participant's own framing.

Participant utterance

“I waited forever in the ER and nobody told me what was happening. The nurse was nice though.”

Extracted information points

“Long wait time in emergency department”

“Lack of communication during wait”

“Positive interaction with nursing staff”
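One way to picture how an extract keeps its link to the source utterance is the minimal sketch below. It is not the GATOS implementation: the class names, ID scheme, and `make_extracts` helper are hypothetical, and the information points are assumed to have been distilled already (the distillation step itself is not shown).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    utterance_id: str      # hypothetical ID scheme
    participant_id: str
    text: str


@dataclass(frozen=True)
class Extract:
    extract_id: str
    utterance_id: str      # provenance link back to the source utterance
    text: str


def make_extracts(utterance, points):
    """Attach provenance to information points already distilled from one utterance."""
    return [
        Extract(f"{utterance.utterance_id}-e{i}", utterance.utterance_id, point)
        for i, point in enumerate(points, start=1)
    ]


utterance = Utterance(
    "u-0001",
    "participant-example",  # placeholder, not a real participant
    "I waited forever in the ER and nobody told me what was happening. "
    "The nurse was nice though.",
)
points = [
    "Long wait time in emergency department",
    "Lack of communication during wait",
    "Positive interaction with nursing staff",
]
extracts = make_extracts(utterance, points)
for e in extracts:
    print(e.extract_id, "<-", e.utterance_id, "|", e.text)
```

Because each extract stores the ID of its utterance, one response can yield several extracts without losing track of who said it.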

2

Semantic Clustering

Extracts are embedded into vector space and clustered using PCA, UMAP, and agglomerative clustering. Similar ideas from different participants converge naturally.

Extracts from different participants converge

Patient 12

“Long wait time in emergency department”

Patient 89

“Waited over two hours with no updates”

Patient 204

“ER wait was unreasonable”

Patient 317

“Nobody communicated expected wait time”

Cluster #23

“Emergency wait times & communication”

47 extracts from 31 participants
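The grouping step can be illustrated with a small, dependency-free sketch of agglomerative clustering. Several things here are assumptions rather than details from the page: the 2-D vectors are hand-made stand-ins for real embeddings, the PCA and UMAP reduction stages are omitted entirely, and the average linkage, cosine distance, and 0.2 threshold are illustrative choices.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def agglomerative(vectors, threshold):
    """Average-linkage clustering: repeatedly merge the closest pair of
    clusters until no pair is closer than `threshold`.
    Returns a list of clusters, each a sorted list of vector indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (a, b), dist = min(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy vectors standing in for embeddings (assumed values, not model output).
texts = [
    "Long wait time in emergency department",
    "Waited over two hours with no updates",
    "ER wait was unreasonable",
    "Positive interaction with nursing staff",
]
vectors = [(0.9, 0.1), (0.8, 0.3), (0.95, 0.05), (0.1, 0.9)]

groups = agglomerative(vectors, threshold=0.2)
for group in groups:
    print([texts[i] for i in group])
```

With these toy vectors the three wait-related extracts end up in one group and the nursing comment stays on its own; no category labels are supplied in advance.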

3

Constrained Codebook Development

Codes are generated through nearest-neighbor retrieval, ensuring new codes are grounded in existing patterns. The model cannot invent categories not supported by the data.

Nearest-neighbor extracts retrieved

“Long wait time in emergency department”

“Waited over two hours with no updates”

“Nobody communicated expected wait time”

Generated code

“Excessive wait time with inadequate communication”
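The retrieval-then-generate pattern in this example can be sketched as follows. This is an illustration under stated assumptions, not the GATOS prompt or code: the vectors are hand-made stand-ins for embeddings, the function names and prompt wording are hypothetical, and the call to a language model is deliberately left out.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_extracts(query_vec, indexed, k=3):
    """Return the texts of the k extracts most similar to query_vec.
    `indexed` is a list of (text, vector) pairs."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_code_prompt(neighbors):
    """Build a prompt that shows the retrieved extracts before asking for a code."""
    lines = [
        "Read the extracts below before proposing a code.",
        "Propose one code that is supported by these extracts only.",
        "Do not introduce ideas that are absent from them.",
        "",
        "Extracts:",
    ]
    lines += [f"- {text}" for text in neighbors]
    return "\n".join(lines)


# Toy index: (extract text, assumed 2-D vector).
indexed = [
    ("Long wait time in emergency department", (0.9, 0.1)),
    ("Waited over two hours with no updates", (0.8, 0.3)),
    ("Nobody communicated expected wait time", (0.6, 0.5)),
    ("Positive interaction with nursing staff", (0.1, 0.9)),
]

neighbors = nearest_extracts((0.85, 0.25), indexed, k=3)
prompt = build_code_prompt(neighbors)
# Hypothetical request settings; a real model client would receive these with the prompt.
request_settings = {"temperature": 0}
print(prompt)
```

The point of the design is that the model only ever sees text retrieved from the data, so a reviewer can compare any generated code against the exact extracts that were in the prompt.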

Safeguards

  • Temperature = 0 for determinism
  • Must read nearest neighbors first
  • Explicit anti-hallucination instructions

Quality criteria

  • Parsimony (minimal redundancy)
  • Consistent abstraction level
  • Non-overlapping categories

4

Theme Synthesis with Full Traceability

Codes are organized into themes, with every connection preserved. Ask about any theme and trace it back to the specific participant utterances that contributed to it.

Full evidence chain

Theme

“Communication Gaps During Care”

Code

“Uncertainty about wait status”

Cluster

847 extracts about waiting + communication

Sample extracts

“No updates during wait”, “Didn't know if forgotten”

Source utterances

Patient 142, Patient 891, and 845 others
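The evidence chain above amounts to a set of lookups from theme down to participant. The sketch below shows one possible shape for it; the IDs, dictionary layout, and function names are hypothetical, and only two of the sample extracts are included, so the counts are for illustration and do not reproduce the figures in the example.

```python
# Hypothetical records: each level stores a pointer to the level above or below it.
utterance_participant = {"u-142": "Patient 142", "u-891": "Patient 891"}
extracts = {
    "e1": {"text": "No updates during wait", "utterance": "u-142", "cluster": "c1"},
    "e2": {"text": "Didn't know if forgotten", "utterance": "u-891", "cluster": "c1"},
}
cluster_code = {"c1": "Uncertainty about wait status"}
code_theme = {"Uncertainty about wait status": "Communication Gaps During Care"}


def trace_theme(theme):
    """Return (code, cluster, extract text, participant) rows behind a theme."""
    rows = []
    for _, extract in sorted(extracts.items()):
        code = cluster_code[extract["cluster"]]
        if code_theme[code] == theme:
            participant = utterance_participant[extract["utterance"]]
            rows.append((code, extract["cluster"], extract["text"], participant))
    return rows


def prevalence(theme):
    """Count supporting extracts and distinct participants for a theme."""
    rows = trace_theme(theme)
    return {
        "extracts": len(rows),
        "participants": len({participant for *_, participant in rows}),
    }


for row in trace_theme("Communication Gaps During Care"):
    print(" -> ".join(row))
print(prevalence("Communication Gaps During Care"))
```

Because the prevalence numbers are computed from the same rows that the trace returns, any count a reviewer questions can be expanded into the extracts and participants behind it.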

Published Research

Peer‑reviewed and validated

The GATOS methodology is documented in peer‑reviewed research.

Nature Portfolio · 2026

GATOS: Generative AI-enabled Theme Organization and Structuring for Qualitative Research

SAGE · 2025

Leveraging Generative Text Models and NLP to Perform Traditional Thematic Data Analysis

SAGE · 2024

Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching


Get Started

Ready to experience traceable analysis?

We share anonymized portfolio examples during a consultation to show exactly how GATOS works on real data.
