Research Methodology — How We Collect and Analyze Tarot Data

Transparent documentation of our data collection process, anonymization, AI provider attribution, sample limitations, and update cadence. n=1,370 readings, 69 registered users, 7 languages.

Tomasz Fiedoruk

This page documents how we collect, anonymize, and analyze the AI tarot reading data we publish on this site. We update it whenever the methodology changes.

Last updated: 2026-05-06.

Sample composition

Our current dataset:

  • 1,370 readings total
  • ~750 unique participants, composed of:
    • 69 registered users (identified by user_id and strictly deduplicated; 24% of readings)
    • ~680 anonymous guest sessions (by IP fingerprint; 76% of readings)
  • 7 languages (EN 90.7%, PL 3.6%, PT 2.9%, FR 1.2%, ES 0.9%, DE 0.4%, IT 0.2%)
  • Time window: 2026-01-01 to 2026-05-02
  • 1,261 readings with question text (the rest are "draw without question" requests)

Important caveat: guest IP fingerprints are an imperfect proxy for people. They undercount unique participants when several users share an IP (household, university, corporate NAT) and overcount them when one returning user appears under several IPs (one person across mobile, home, and work IPs counts as three). Treat ~750 as a rough order-of-magnitude estimate, not a precise number. The count of 69 registered users is exact.

The dataset grows continuously. We publish quarterly snapshots with full statistics, so real-time stats can run up to one quarter ahead of the latest published snapshot.

What we collect

For each reading, our application logs:

Field             | Type              | Purpose
------------------|-------------------|------------------------------------------
Reading ID        | UUID              | Unique identifier
User ID hash      | SHA-256           | Anonymized user grouping
Spread type       | enum              | Which spread (3-card, Celtic, etc.)
Cards drawn       | array of card IDs | Order matters (positions)
Reversed flags    | array of bool     | Per card
Question text     | text (optional)   | If user provided
Question category | enum              | Auto-categorized: future, love, work, money, health, family, uncategorized
Language          | ISO 639-1         | UI language at time of reading
Timestamp         | UTC               | Date + time
AI model          | enum              | gpt-5.4 / claude-sonnet-4.6 / gemini-2.5-flash / nvidia-llama-3.3
User rating       | 1-5 (optional)    | Post-reading feedback if given

What we don't log: raw IP addresses (we keep only a SHA-256 hash, for security), email, name, physical location beyond the country code derived from IP geolocation, browser fingerprints, or any other personally identifying data.
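As an illustration of the logged fields, here is a minimal Python sketch of one reading record. The class and attribute names, the integer card IDs, and the validation rules are assumptions made for this example; only the field list and the enum values come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
import uuid


class QuestionCategory(Enum):
    FUTURE = "future"
    LOVE = "love"
    WORK = "work"
    MONEY = "money"
    HEALTH = "health"
    FAMILY = "family"
    UNCATEGORIZED = "uncategorized"


class AIModel(Enum):
    # Values as listed in the field table; any further model values
    # are not documented there and are therefore not guessed here.
    GPT_5_4 = "gpt-5.4"
    CLAUDE_SONNET_4_6 = "claude-sonnet-4.6"
    GEMINI_2_5_FLASH = "gemini-2.5-flash"
    NVIDIA_LLAMA_3_3 = "nvidia-llama-3.3"


@dataclass(frozen=True)
class ReadingRecord:
    reading_id: uuid.UUID
    user_id_hash: str                 # hex-encoded SHA-256 digest (64 characters)
    spread_type: str                  # e.g. "3-card"; the real enum values are not listed
    cards_drawn: tuple[int, ...]      # order matters: index = position in the spread
    reversed_flags: tuple[bool, ...]  # one flag per drawn card
    language: str                     # ISO 639-1 code, e.g. "en"
    timestamp: datetime               # timezone-aware, UTC
    ai_model: AIModel
    question_category: QuestionCategory = QuestionCategory.UNCATEGORIZED
    question_text: Optional[str] = None
    user_rating: Optional[int] = None  # 1-5 if the user left feedback

    def __post_init__(self) -> None:
        if len(self.user_id_hash) != 64:
            raise ValueError("user_id_hash must be a 64-character hex digest")
        if len(self.cards_drawn) != len(self.reversed_flags):
            raise ValueError("need exactly one reversed flag per card")
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware (UTC)")
        if self.user_rating is not None and not 1 <= self.user_rating <= 5:
            raise ValueError("user_rating must be between 1 and 5")
```

Keeping the optional fields at the end with defaults mirrors the table: a "draw without question" reading simply leaves question_text unset.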

Anonymization process

User IDs in published statistics are SHA-256 hashes with a per-snapshot salt. The probability of a hash collision is negligible (2^256 possible hashes, 69 users).
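A minimal sketch of salted hashing with Python's standard library follows. The function names, the 16-byte salt length, and the choice to prepend the salt to the user ID are assumptions for illustration; the source only states that SHA-256 is used with a per-snapshot salt.

```python
import hashlib
import secrets


def new_snapshot_salt() -> bytes:
    """Generate a fresh random salt; one would be drawn per snapshot."""
    return secrets.token_bytes(16)


def pseudonymize_user_id(user_id: str, salt: bytes) -> str:
    """Return the hex SHA-256 digest of salt + user_id (assumed layout)."""
    return hashlib.sha256(salt + user_id.encode("utf-8")).hexdigest()


def collision_upper_bound(n_users: int, bits: int = 256) -> float:
    """Birthday-style upper bound on the probability of any collision:
    number of user pairs divided by the size of the hash space."""
    pairs = n_users * (n_users - 1) // 2
    return pairs / 2**bits


salt = new_snapshot_salt()
print(pseudonymize_user_id("user-123", salt))
print(collision_upper_bound(69))  # on the order of 1e-74
```

Because the salt changes with each snapshot, the same user gets a different hash in each release, which keeps hashes from being linked across snapshots.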

For published per-card statistics, we apply k-anonymity with k=5:

  • Combinations of (language + spread_type + week) with fewer than 5 observations are aggregated to higher-level groupings before publication
  • Individual reading IDs never appear in public datasets
  • Question text is published only in aggregated category counts, never verbatim
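The first rule above can be sketched as follows. The coarsening order (drop the week first, then the spread type, then the language), the "*" wildcard, and the week label format are assumptions; the source does not say which higher-level groupings are used.

```python
from collections import Counter

K = 5
WILDCARD = "*"


def coarsen(key: tuple, level: int) -> tuple:
    """Generalize a (language, spread_type, week) key by one more step per level."""
    language, spread_type, _week = key
    if level == 1:
        return (language, spread_type, WILDCARD)
    if level == 2:
        return (language, WILDCARD, WILDCARD)
    return (WILDCARD, WILDCARD, WILDCARD)


def k_anonymize(counts: Counter, k: int = K) -> dict:
    """Publish cells with at least k observations; roll smaller cells up
    into coarser groupings. Anything still below k at the top is suppressed."""
    published: dict = {}
    pending = Counter(counts)
    for level in (1, 2, 3):
        rolled: Counter = Counter()
        for key, n in pending.items():
            if n >= k:
                published[key] = published.get(key, 0) + n
            else:
                rolled[coarsen(key, level)] += n
        pending = rolled
    for key, n in pending.items():
        if n >= k:
            published[key] = published.get(key, 0) + n
    return published


counts = Counter({
    ("EN", "3-card", "2026-W01"): 12,
    ("EN", "3-card", "2026-W02"): 3,
    ("EN", "3-card", "2026-W03"): 4,
    ("DE", "celtic", "2026-W01"): 2,
})
print(k_anonymize(counts))
```

In this example the two small EN cells merge into one ("EN", "3-card", "*") cell of 7, while the lone DE cell never reaches 5 at any level and is left out. A rolled-up cell here holds only the small cells it absorbed, not the full group total.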

The full anonymization audit is performed before each quarterly publication. Audit notes are included in the dataset download.

AI provider attribution

Readings are generated using one of five models, served through three providers (NVIDIA, OpenRouter, Anthropic), depending on user tier and queue status:

  • NVIDIA Llama 3.3 70B — free tier fallback (last resort)
  • OpenRouter Gemini 2.5 Flash — primary free tier (≥90% of free readings)
  • OpenRouter Qwen3-235B — secondary free tier
  • OpenRouter GPT-5.4 — paid Tier 1 ("Seeker") readings
  • Anthropic Claude Sonnet 4.6 — paid Tier 2 ("Mystic") dual-oracle readings

Per-reading AI provider attribution is included in the dataset for researchers wanting to compare AI behavior across providers.

Statistical limitations

Three limitations matter:

Sample size. 1,370 readings is enough to detect strong effects (a deviation of 50% or more from random, for instance) but not enough for fine-grained per-card significance testing. To claim a specific card appears more often than chance, we'd need approximately 6,000 readings per the standard chi-square sample size calculation for a 78-category distribution. At 1,370 readings, we're at roughly a quarter of that.
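For readers who want to run this kind of check on the published counts, here is a minimal chi-square goodness-of-fit sketch against a uniform 78-card distribution, using only the standard library. The function names are hypothetical, and the critical value uses the Wilson-Hilferty approximation rather than an exact chi-square quantile. It tests the overall distribution, not a single card, and it does not reproduce the ~6,000 figure.

```python
from statistics import NormalDist

N_CARDS = 78  # cards in a standard tarot deck


def chi_square_statistic(observed: list) -> float:
    """Pearson chi-square statistic against a uniform expectation."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_critical(df: int, alpha: float = 0.05) -> float:
    """Approximate upper-tail critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)
    h = 2 / (9 * df)
    return df * (1 - h + z * h ** 0.5) ** 3


def deviates_from_uniform(observed: list, alpha: float = 0.05) -> bool:
    """True if the counts differ from uniform at significance level alpha."""
    df = len(observed) - 1
    return chi_square_statistic(observed) > chi_square_critical(df, alpha)


# Toy data, not taken from the dataset: 77 cards drawn 50 times, one drawn 150 times.
skewed = [50] * (N_CARDS - 1) + [150]
print(deviates_from_uniform([50] * N_CARDS))  # False
print(deviates_from_uniform(skewed))          # True
```

With 78 categories there are 77 degrees of freedom, so at alpha = 0.05 the statistic must exceed roughly 98.5 before the uniform hypothesis is rejected.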

Selection bias. Our users are not a representative sample of all tarot users globally. They are people who:

  • Found aimag.me through search, social, or referral
  • Speak one of our supported languages
  • Were comfortable using a web-based AI tarot tool
  • Self-selected into our funnel

Generalization to "all tarot users" is not warranted from this dataset.

Observational, not experimental. We don't randomize and we have no control group, so we can't establish causation. We can describe patterns. We can't claim to explain them.

Update cadence

  • Quarterly snapshots: January, April, July, October. Published as a versioned dataset with anonymization audit notes.
  • Real-time aggregate stats: updated daily on this site (live counters, top cards, day-of-week distribution).
  • Per-reading data: never published in real-time. Always batched into quarterly anonymized snapshots.

Conflict of interest

The author of this research operates aimag.me, the AI tarot tool from which this data is collected. This is disclosed on every page. We have a financial interest in users finding tarot useful enough to subscribe to paid tiers.

To minimize bias from this conflict:

  • We publish data even when it's unflattering to AI tarot (e.g., the Major:Minor randomness finding directly undermines mystical claims)
  • We commit to publishing all quarterly snapshots regardless of what they show
  • We document and explain methodology changes whenever they happen
  • The dataset itself is open under a Creative Commons license (CC BY-SA 4.0), so anyone can run their own analysis and disagree with our interpretations

License

Published statistics on this site are released under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

Citation format:

aimag.me Tarot Reading Dataset (n=1,370). Collected 2026-01-01 to 2026-05-02. Anonymized open dataset. Available at aimag.me/research.

Questions

For methodology questions, dataset access requests, or replication queries: [email protected].

For RODO/GDPR-related data subject requests, see our Privacy Policy.

Cite this research

If you use this in research, journalism, or analysis:

Fiedoruk, T. (2026). Research Methodology — How We Collect and Analyze Tarot Data. aimag.me Research. Retrieved from https://aimag.me/research/methodology

License: CC BY-SA 4.0. Dataset: /research/dataset
