Methodology · 8 min read · April 26, 2026
How We Calculate AI Visibility Metrics
Every number on your Arenza dashboard has a formula. This page is the receipt — exact computation, exact data source, every edge case we’ve hit. If a vendor (us included) won’t show you their math, that vendor’s number is marketing, not measurement.
TL;DR
- Answer-side metrics (Visibility Score, Share of Voice, Citation Share, Sentiment, Themes) follow Profound’s vocabulary. Same words, same formulas, with one exception: Sentiment is computed from business impact rather than tone (see section 4).
- Traffic-side metric (AI Sessions) comes from our own SDK on your website — Profound doesn’t track this; we do.
- We deliberately do NOT track conversions, orders, or revenue. That data lives in your existing analytics; we join it on the portal side, not via SDK pollution.
1. Visibility Score
What it answers: “Out of every question a customer might ask AI about my category, what % of the AI answers actually mention my brand?”
Formula
Visibility Score = (questions where brand is mentioned in AI answer)
/ (total questions scanned across all platforms)
× 100
Data source: our scan-batch pipeline runs 25 questions × 3 AI platforms (ChatGPT, Gemini, Perplexity) per brand per scan. For each question we save the full AI answer text. For Visibility Score we case-insensitively check whether the brand name string appears anywhere in that answer.
Honest edge cases:
- Substring match. “Anker” matches both “Anker” and “AnkerWork”. Brands with sub-brands (Anker / SOLIX / AnkerWork) get a true count of mentions but cannot distinguish parent vs sub-brand at this layer — sub-brand attribution is in the “Sub-brand breakdown” card downstream.
- Pronoun / paraphrase mentions don’t count. If AI says “the company” instead of repeating the brand name, we miss it. This biases the metric slightly downward, especially for less-famous brands.
- The denominator is questions, not platforms. A question where all 3 platforms mention you counts as 1 mentioned question, not 3.
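The matching rules above can be sketched in a few lines. This is an illustration, not the code in backend/src/sdk/visibility.ts: the ScanAnswer shape and the function names are hypothetical, and the sketch assumes a question counts as mentioned when at least one platform's answer contains the brand name.

```typescript
// Hypothetical shape: one saved AI answer per (question, platform) pair.
interface ScanAnswer {
  questionId: string;
  platform: string;
  answerText: string;
}

// Case-insensitive substring check, so "Anker" also matches "AnkerWork".
function mentionsBrand(answerText: string, brand: string): boolean {
  return answerText.toLowerCase().includes(brand.toLowerCase());
}

// Assumption: a question is "mentioned" if any platform's answer names the brand.
// The denominator is distinct questions, not (question, platform) pairs.
function visibilityScore(answers: ScanAnswer[], brand: string): number {
  const allQuestions = new Set<string>();
  const mentionedQuestions = new Set<string>();
  for (const answer of answers) {
    allQuestions.add(answer.questionId);
    if (mentionsBrand(answer.answerText, brand)) {
      mentionedQuestions.add(answer.questionId);
    }
  }
  if (allQuestions.size === 0) return 0;
  return (mentionedQuestions.size / allQuestions.size) * 100;
}
```

Because the check is a plain substring match, a paraphrase such as "the company" contributes nothing, which is the downward bias described in the edge cases.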
2. Share of Voice
What it answers: “Out of every brand mention in my industry’s AI answers, what fraction belongs to me vs my competitors?”
Formula
Share of Voice = my_mentions / (my_mentions + sum(peer_mentions)) × 100
where peer set = brands in same industry per industry-library
Data source: we maintain an industry-library that maps each brand slug to an industry (e-bike, robot vacuum, portable power station, etc.). For your tenant we look up the industry, find peer brands, and re-run the same mention-count logic on each peer’s most recent scan. Sum the mentions, divide.
Honest edge cases:
- Peer scans are point-in-time. If we scanned Lectric two weeks ago and Engwe today, we’re comparing data of different freshness. We re-run all peer brands monthly to keep the gap within 30 days.
- Peer set is curated, not exhaustive. We track 7–15 reference brands per category. Long-tail competitors aren’t in the SOV denominator, which biases the number slightly upward.
- Each brand-name match counts once per question, not once per occurrence. AI saying “Lectric is great, Lectric ships fast” counts as 1 Lectric mention.
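A small sketch of the counting and division described above. The input shape (answer texts grouped by question id) and both function names are assumptions made for illustration. Peer lookup from the industry-library is left out.

```typescript
// Count questions whose answers mention the brand.
// Each brand counts once per question, not once per occurrence.
function mentionCount(
  answersByQuestion: Record<string, string[]>,
  brand: string
): number {
  const needle = brand.toLowerCase();
  return Object.values(answersByQuestion).filter((answers) =>
    answers.some((text) => text.toLowerCase().includes(needle))
  ).length;
}

// Share of Voice = my_mentions / (my_mentions + sum(peer_mentions)) × 100.
function shareOfVoice(myMentions: number, peerMentions: number[]): number {
  const total = myMentions + peerMentions.reduce((sum, n) => sum + n, 0);
  return total === 0 ? 0 : (myMentions / total) * 100;
}
```

For example, 6 mentions against peers with 10 and 4 gives 6 / 20, or 30%. Adding a long-tail competitor to the peer list can only grow the denominator, which is why a curated peer set biases the number upward.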
3. Citation Share
What it answers: “When AI links to a source while answering questions about my industry, how often does it link to *my* domain?”
Formula
Citation Share = (questions where AI answer linked to brand domain)
/ (questions where AI answer contained ANY link)
× 100
Data source: for every AI answer we extract URLs via regex (https?://[^\s<>]+), normalize host to lowercase + strip www., and check whether any URL host matches the brand’s primary domain (or a subdomain of it).
Honest edge cases:
- The denominator excludes link-less answers. ChatGPT in particular often summarizes without citing — those questions don’t bring the metric down. This is the same convention Profound uses.
- Brand domain inference is heuristic: brand.toLowerCase().replace(/[^a-z0-9]/g, '') + '.com'. Customers with non-.com primary domains (e.g. brand.io, brand.uk) should declare their domain on Sites; otherwise we count zero. Override is at app.arenza.ai/sites.
- Citations to your competitors show up in “Top Citation Domains” below the tile. Your number is just the share that’s yours.
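The extraction and matching steps can be sketched as below. The URL regex and the domain heuristic come from the text. The function names, the trailing-dot cleanup, and returning null when no answer contains a link are assumptions of this sketch. It also assumes one answer text per question.

```typescript
const URL_PATTERN = /https?:\/\/[^\s<>]+/g;
const HOST_PATTERN = /^https?:\/\/([a-z0-9.-]+)/i;

// Extract hosts from an answer: lowercased, leading "www." stripped.
function extractHosts(answerText: string): string[] {
  const hosts: string[] = [];
  for (const url of answerText.match(URL_PATTERN) ?? []) {
    const m = HOST_PATTERN.exec(url);
    if (m) {
      // Trailing dots come from sentence punctuation glued to the URL.
      hosts.push(m[1].toLowerCase().replace(/\.+$/, "").replace(/^www\./, ""));
    }
  }
  return hosts;
}

// Heuristic from the text: strip non-alphanumerics and append ".com".
function inferBrandDomain(brand: string): string {
  return brand.toLowerCase().replace(/[^a-z0-9]/g, "") + ".com";
}

// The brand domain itself or any subdomain of it.
function isBrandHost(host: string, brandDomain: string): boolean {
  return host === brandDomain || host.endsWith("." + brandDomain);
}

// Link-less answers are skipped, so they never enter the denominator.
function citationShare(answers: string[], brandDomain: string): number | null {
  let withAnyLink = 0;
  let withBrandLink = 0;
  for (const text of answers) {
    const hosts = extractHosts(text);
    if (hosts.length === 0) continue;
    withAnyLink++;
    if (hosts.some((h) => isBrandHost(h, brandDomain))) withBrandLink++;
  }
  // Assumption: report "no data" rather than 0 when nothing was linked at all.
  return withAnyLink === 0 ? null : (withBrandLink / withAnyLink) * 100;
}
```

Note how the heuristic fails for a brand whose site is brand.io: inferBrandDomain yields brand.com, no host matches, and the share reads zero until the domain is declared.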
4. Sentiment Breakdown
What it answers: “When AI talks about my brand, is it being positive, neutral, or negative?”
Per-question classification
for each question:
  worst impact = max(issues for this question, by impact rank)
  if no issues → POSITIVE
  if purchase_blocker → NEGATIVE (e.g. AI told users not to buy)
  if factual_drift → NEGATIVE (e.g. AI cited wrong specs)
  if visibility_gap → NEGATIVE (e.g. AI didn't mention us at all)
  if reputation_risk → NEUTRAL (e.g. AI mentioned a known issue)
  if competitive_disadvantage → NEUTRAL (e.g. AI ranked us below peers)
Data source: our judge model evaluates each AI answer against ground truth and assigns a businessImpact classification per detected issue. We bucket those into VP-readable sentiment categories.
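The per-question classification can be expressed as a short function. The impact labels come from the pseudocode above. The type and function names are hypothetical, and the sketch assumes the three NEGATIVE impacts all outrank the two NEUTRAL ones, so any NEGATIVE-mapped issue decides the bucket.

```typescript
type BusinessImpact =
  | "purchase_blocker"
  | "factual_drift"
  | "visibility_gap"
  | "reputation_risk"
  | "competitive_disadvantage";

type Sentiment = "POSITIVE" | "NEUTRAL" | "NEGATIVE";

// Impacts that map to NEGATIVE; everything else maps to NEUTRAL.
const NEGATIVE_IMPACTS: BusinessImpact[] = [
  "purchase_blocker",
  "factual_drift",
  "visibility_gap",
];

// Classify one question from the issues the judge model attached to it.
function classifyQuestion(issues: BusinessImpact[]): Sentiment {
  if (issues.length === 0) return "POSITIVE";
  const hasNegative = issues.some((i) => NEGATIVE_IMPACTS.includes(i));
  return hasNegative ? "NEGATIVE" : "NEUTRAL";
}

// Bucket every question in a scan into the three sentiment categories.
function sentimentBreakdown(
  issuesPerQuestion: BusinessImpact[][]
): Record<Sentiment, number> {
  const counts: Record<Sentiment, number> = { POSITIVE: 0, NEUTRAL: 0, NEGATIVE: 0 };
  for (const issues of issuesPerQuestion) {
    counts[classifyQuestion(issues)]++;
  }
  return counts;
}
```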
Why this isn’t pure NLP sentiment: running a separate sentiment classifier (positive/negative tone) on the AI answer text would tell you the AI’s tone, not its impact on your business. Profound and most competitors run NLP sentiment. We chose business-impact sentiment because a politely-worded answer that tells customers your product is dangerous is “positive tone, severely negative impact” — you need to know about it. If you want raw NLP sentiment, ask us in support and we’ll add a toggle.
5. Themes
What it answers: “Which topics is my brand getting hit on?”
Data source: our scan covers 4 dimensions (D1 Visibility, D2 Factual Accuracy, D3 Positioning, D4 Competitive). The Themes card shows the issue count per dimension, ranked. The dimension with the most issues is your top theme; usually that points the GEO team to where to invest first.
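As a rough illustration of the ranking, assuming each detected issue carries one of the four dimension codes (the Dimension type and rankThemes name are hypothetical):

```typescript
type Dimension = "D1" | "D2" | "D3" | "D4";

interface ThemeCount {
  dimension: Dimension;
  count: number;
}

// Count issues per dimension and rank from most to fewest issues.
function rankThemes(issueDimensions: Dimension[]): ThemeCount[] {
  const counts: Record<Dimension, number> = { D1: 0, D2: 0, D3: 0, D4: 0 };
  for (const d of issueDimensions) {
    counts[d]++;
  }
  return (Object.keys(counts) as Dimension[])
    .map((dimension) => ({ dimension, count: counts[dimension] }))
    .sort((a, b) => b.count - a.count);
}
```

The first element of the result is the top theme.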
6. Top Citation Domains
What it answers: “Where is AI getting its information about my industry?”
The Top Citation Domains card aggregates every URL extracted from every AI answer in your scan, groups by host, and shows the most-cited 10 with a per-platform breakdown. This is how you find out that AI is citing Reddit threads more than your own pages, or that Wikipedia is the dominant source about you, or that a specific competitor’s blog post is being treated as authoritative.
Same regex extraction as Citation Share. No deduplication of identical URLs within a single answer: if AI cites the same URL three times in one paragraph, that adds 3 to the count, on the assumption that AI repeating itself reflects emphasis.
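A sketch of the aggregation, under the same host-normalization rules as Citation Share. The PlatformAnswer and DomainCount shapes and the function name are assumptions for illustration.

```typescript
interface PlatformAnswer {
  platform: string;
  answerText: string;
}

interface DomainCount {
  host: string;
  total: number;
  byPlatform: Record<string, number>;
}

// Group every extracted URL by host and return the most-cited hosts.
// Repeated URLs inside one answer are counted each time (no deduplication).
function topCitationDomains(answers: PlatformAnswer[], limit = 10): DomainCount[] {
  const byHost = new Map<string, DomainCount>();
  for (const { platform, answerText } of answers) {
    for (const url of answerText.match(/https?:\/\/[^\s<>]+/g) ?? []) {
      const m = /^https?:\/\/([a-z0-9.-]+)/i.exec(url);
      if (!m) continue;
      const host = m[1].toLowerCase().replace(/\.+$/, "").replace(/^www\./, "");
      let entry = byHost.get(host);
      if (!entry) {
        entry = { host, total: 0, byPlatform: {} };
        byHost.set(host, entry);
      }
      entry.total++;
      entry.byPlatform[platform] = (entry.byPlatform[platform] ?? 0) + 1;
    }
  }
  return Array.from(byHost.values())
    .sort((a, b) => b.total - a.total)
    .slice(0, limit);
}
```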
7. AI Sessions (Arenza exclusive)
What it answers: “How many actual visitors did AI platforms send to my website?”
This is the one metric Profound doesn’t have. Profound looks at AI answers; we add the second half: did those answers actually drive someone to click through to you?
Pipeline
browser visits your page
→ Arenza SDK reads document.referrer
→ identify(referrer) classifies as AI / non-AI
    chatgpt.com → ai_platform=chatgpt, confidence=high
    perplexity.ai → ai_platform=perplexity, confidence=high
    claude.ai → ai_platform=claude, confidence=high
    (full mapping in identify.ts, 6 platforms today)
→ POST /api/sdk/events (page_url + referrer + UTC timestamp + anon session_id)
→ backend dedupes by session_id, aggregates daily
AI Sessions count = unique session_ids in window where is_ai=true
Data source: 4KB JS SDK on your website (arenza.ai/sdk.js). Install guide: contact us or check the customer onboarding doc.
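To make the classification step concrete, here is a sketch of what a referrer classifier along these lines might look like. It is not the contents of identify.ts: the return shape is hypothetical, only the three hosts listed in the pipeline are included, and matching subdomains is an assumption.

```typescript
interface ReferrerClassification {
  isAi: boolean;
  aiPlatform: string | null;
  confidence: "high" | null;
}

// Only the three mappings shown in the pipeline; the full list is not reproduced here.
const AI_REFERRER_HOSTS: Record<string, string> = {
  "chatgpt.com": "chatgpt",
  "perplexity.ai": "perplexity",
  "claude.ai": "claude",
};

const NOT_AI: ReferrerClassification = { isAi: false, aiPlatform: null, confidence: null };

// Classify a document.referrer string as AI / non-AI.
function identify(referrer: string): ReferrerClassification {
  const m = /^https?:\/\/([a-z0-9.-]+)/i.exec(referrer);
  if (!m) return NOT_AI; // empty or stripped referrer: no attribution possible
  const host = m[1].toLowerCase().replace(/^www\./, "");
  for (const domain of Object.keys(AI_REFERRER_HOSTS)) {
    // Assumption: subdomains of a known AI host count as that platform.
    if (host === domain || host.endsWith("." + domain)) {
      return { isAi: true, aiPlatform: AI_REFERRER_HOSTS[domain], confidence: "high" };
    }
  }
  return NOT_AI;
}
```

An empty referrer falls through to the non-AI result, which is exactly the attribution loss described under referrer stripping.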
Honest edge cases:
- Referrer stripping: Cloudflare, ad blockers, and some browser privacy modes blank document.referrer. Lost referrer = lost AI attribution. Industry-wide problem; not solvable client-side.
- Direct visits: AI often summarizes without a clickable link. Users who later type your URL or Google your brand do not show up as AI sessions, even though AI was the catalyst. This biases AI Sessions downward; treat it as a floor.
- Session = 30 min sticky. Same visitor refreshing 5 times = 1 session. Same visitor coming back 31 minutes later = 2 sessions. Standard GA4 convention.
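The counting rule and the 30-minute window can be sketched as follows. The SdkEvent shape and both function names are hypothetical, and the sketch assumes the timeout is measured as inactivity since the visitor's previous hit, which is how GA4 defines it.

```typescript
interface SdkEvent {
  sessionId: string;
  isAi: boolean;
  timestampMs: number; // UTC epoch milliseconds
}

// AI Sessions = unique session ids in [windowStartMs, windowEndMs) where isAi is true.
function countAiSessions(
  events: SdkEvent[],
  windowStartMs: number,
  windowEndMs: number
): number {
  const ids = new Set<string>();
  for (const e of events) {
    if (e.isAi && e.timestampMs >= windowStartMs && e.timestampMs < windowEndMs) {
      ids.add(e.sessionId);
    }
  }
  return ids.size;
}

const SESSION_TIMEOUT_MS = 30 * 60 * 1000;

// Count sessions for ONE visitor: a new session starts when the gap
// since the previous hit exceeds the 30-minute timeout.
function countSessions(hitTimesMs: number[]): number {
  const sorted = hitTimesMs.slice().sort((a, b) => a - b);
  let sessions = 0;
  let lastHit = -Infinity;
  for (const t of sorted) {
    if (t - lastHit > SESSION_TIMEOUT_MS) sessions++;
    lastHit = t;
  }
  return sessions;
}
```

Five refreshes within a few minutes collapse into one session, while a return visit 31 minutes after the last hit opens a second one.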
What we do NOT compute
Several metrics other vendors show, we deliberately omit:
- Conversions / revenue / AOV. Out of scope for the SDK. Use your existing analytics (GA4 / Shopify / Mixpanel) to measure conversions; we join the two datasets on the portal side via OAuth or CSV upload, not by polluting the SDK.
- Average Position (where in the AI answer your brand appears: 1st mention vs Nth). Profound shows this. Our judge model doesn’t yet record position. On the Q3 roadmap.
- Sub-brand rollups. We can detect Anker vs AnkerWork mentions inside one scan, but the rollup logic (Anker = parent of AnkerWork+SOLIX+Eufy) is bespoke per customer. Available on request, not in the default tile.
- Sentiment trend over time. Each scan stores a snapshot, not a time series. Coming when we ship the daily-snapshot index, ETA Q3 2026.
Why publish this
Three reasons.
First, customers paying us ≥$10k/year deserve to know what they’re paying for. A dashboard that shows you a 14% Visibility Score without telling you what that 14% is computed from is not a dashboard, it’s a slot machine.
Second, we want analysts and other vendors to be able to reproduce our numbers and call BS on us when we get them wrong. The fastest way to lose trust in this category is to invent metrics nobody can audit.
Third, this also keeps us honest internally. If a metric ships without a paragraph that fits in this page, the metric is over-engineered or underspecified.
Disagree with any of the formulas? research@arenza.ai. We answer.
Source code references: backend/src/sdk/visibility.ts (computation), backend/src/sdk/identify.ts (AI platform detection), backend/industry-library/library.json (peer set), sdk/arenza-js/src/index.ts (SDK). Repo access on request.