How we calculate the Queraid Score
Transparency builds trust. Here's exactly how we measure AI brand visibility. The methodology is open. The dataset behind the scores is our proprietary asset.
1. Which AI models we query
Every audit queries 5 major AI models in parallel. Using multiple providers reduces single-model bias and gives a consensus view of how "AI" as a whole, not just one chatbot, sees your brand.
OpenAI
GPT-4o-mini
Anthropic
Claude Haiku 4.5
Perplexity
Sonar
Google
Gemini 2.0 Flash
xAI
Grok 3 Mini Fast
All models are queried via their official APIs with a 12-second timeout per LLM. If a model fails, we retry once before marking it as unavailable.
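The timeout-and-retry behavior described above can be sketched as follows. This is a minimal illustration, not our production code: `query_with_retry`, `query_all`, and the `clients` mapping of async callables are hypothetical names, and each callable stands in for a real provider API call.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

TIMEOUT_SECONDS = 12  # per-LLM timeout stated above
MAX_ATTEMPTS = 2      # first attempt plus one retry

# Hypothetical type: an async function that sends a prompt to one provider.
ModelCall = Callable[[str], Awaitable[str]]


async def query_with_retry(
    call: ModelCall, prompt: str, timeout: float = TIMEOUT_SECONDS
) -> Optional[str]:
    """Return the model's reply, or None if it is still failing after one retry."""
    for _ in range(MAX_ATTEMPTS):
        try:
            return await asyncio.wait_for(call(prompt), timeout=timeout)
        except Exception:
            # Timeouts and provider errors are treated the same way: try again.
            continue
    return None  # marked as unavailable


async def query_all(
    clients: Dict[str, ModelCall], prompt: str, timeout: float = TIMEOUT_SECONDS
) -> Dict[str, Optional[str]]:
    """Send one prompt to every model in parallel and collect the replies."""
    names = list(clients)
    replies = await asyncio.gather(
        *(query_with_retry(clients[name], prompt, timeout) for name in names)
    )
    return dict(zip(names, replies))
```

A model that returns `None` here is the one that would be reported as unavailable for that prompt.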
2. What prompts we use
We first auto-detect the brand's industry using a single LLM call, then generate 6 industry-aware prompts. Each prompt is sent to all 5 models, producing 30 data points per audit.
Identity
"What is {brand} and what does it do?"
Best in class
"What are the best {industry} companies or tools?"
Recommendation
"I need a recommendation for {industry}. What should I use?"
Comparison
"Compare {brand} with its competitors in {industry}."
Reputation
"What do people think about {brand}?"
Use case
"Is {brand} a good choice for {industry}? Why or why not?"
Prompts are also available in Italian for multilingual audits.
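As a rough sketch, the six English templates above can be filled in like this. The template strings are the ones listed in this section; the names `PROMPT_TEMPLATES` and `build_prompts` are illustrative only.

```python
from typing import Dict

# The six prompt templates listed above, keyed by their label.
PROMPT_TEMPLATES: Dict[str, str] = {
    "Identity": "What is {brand} and what does it do?",
    "Best in class": "What are the best {industry} companies or tools?",
    "Recommendation": "I need a recommendation for {industry}. What should I use?",
    "Comparison": "Compare {brand} with its competitors in {industry}.",
    "Reputation": "What do people think about {brand}?",
    "Use case": "Is {brand} a good choice for {industry}? Why or why not?",
}


def build_prompts(brand: str, industry: str) -> Dict[str, str]:
    """Fill every template with the brand and the detected industry."""
    return {
        label: template.format(brand=brand, industry=industry)
        for label, template in PROMPT_TEMPLATES.items()
    }


prompts = build_prompts("Stripe", "online payments")
datapoints = len(prompts) * 5  # 6 prompts x 5 models = 30 data points
```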
3. How we detect your industry
Before generating prompts, we make a single LLM call (GPT-4o-mini) asking: "What industry does this brand operate in? Reply with just the industry name." The detected industry is used to make all 6 prompts contextually relevant.
This ensures prompts like "best in class" and "recommendation" reference the correct competitive landscape, which makes the audit more accurate than generic, industry-agnostic prompts would.
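The detection step might look roughly like the sketch below. The question text is the one quoted above; everything else is an assumption made for illustration: the `complete` callable stands in for the single LLM call, the way the brand name is attached to the question is a guess, and the reply cleanup and the `"general"` fallback are not part of the documented method.

```python
from typing import Callable

INDUSTRY_QUESTION = (
    "What industry does this brand operate in? Reply with just the industry name."
)


def detect_industry(
    brand: str, complete: Callable[[str], str], fallback: str = "general"
) -> str:
    """Ask one model for the brand's industry and tidy up the reply."""
    # Assumed prompt layout: brand name on one line, question on the next.
    reply = complete(f"Brand: {brand}\n{INDUSTRY_QUESTION}")
    lines = reply.strip().splitlines()
    first_line = lines[0] if lines else ""
    # Drop surrounding quotes and a trailing period (assumed cleanup).
    cleaned = first_line.strip().strip("\"'.").strip()
    return cleaned or fallback  # assumed fallback when the reply is empty
```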
4. Scoring breakdown
The Queraid Score ranges from 0 to 100 and is composed of four weighted components:
Mention Rate
40 pts. How often does the brand appear in LLM responses?
mention_score = (mentions / 30 datapoints) × 40
Sentiment
30 pts. How positively does AI describe the brand? Sentiment is analyzed per-response and averaged.
sentiment_score = normalize(avg_sentiment, -1..+1) × 30
Competitor Parity
20 pts. How does the brand's visibility compare to competitors mentioned in the same responses?
competitor_score = (brand_mentions / total_competitor_mentions) × 20
Cross-LLM Consistency
10 pts. Are mention rates consistent across all 5 LLMs? Low variance means strong, uniform AI presence.
consistency_score = (1 - variance / 0.25) × 10
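The four formulas above can be combined as in the sketch below. Several details are assumptions rather than documented behavior and are flagged in the comments: `normalize` is taken to be a linear map from -1..+1 to 0..1, the competitor ratio is capped at 1 so the component never exceeds 20 points, a brand with no competitor mentions at all gets full parity, and the variance is the population variance of the per-model mention rates (whose maximum for rates between 0 and 1 is 0.25).

```python
from statistics import pvariance
from typing import Sequence


def mention_score(mentions: int, datapoints: int = 30) -> float:
    """Share of responses that mention the brand, scaled to 40 points."""
    return mentions / datapoints * 40


def sentiment_score(avg_sentiment: float) -> float:
    """Average sentiment in -1..+1, scaled to 30 points."""
    normalized = (avg_sentiment + 1) / 2  # assumed linear normalization
    return min(max(normalized, 0.0), 1.0) * 30


def competitor_score(brand_mentions: int, total_competitor_mentions: int) -> float:
    """Brand mentions relative to competitor mentions, scaled to 20 points."""
    if total_competitor_mentions == 0:
        # Assumption: no competitors mentioned means full parity if the brand appears.
        return 20.0 if brand_mentions > 0 else 0.0
    ratio = brand_mentions / total_competitor_mentions
    return min(ratio, 1.0) * 20  # assumed cap at 20 points


def consistency_score(per_model_rates: Sequence[float]) -> float:
    """Low variance of per-model mention rates earns up to 10 points."""
    variance = pvariance(per_model_rates)  # assumed: population variance
    return max(0.0, 1 - variance / 0.25) * 10


def queraid_score(
    mentions: int,
    avg_sentiment: float,
    brand_mentions: int,
    total_competitor_mentions: int,
    per_model_rates: Sequence[float],
) -> float:
    """Sum of the four weighted components, between 0 and 100."""
    return (
        mention_score(mentions)
        + sentiment_score(avg_sentiment)
        + competitor_score(brand_mentions, total_competitor_mentions)
        + consistency_score(per_model_rates)
    )
```

For example, a brand mentioned in 18 of 30 responses, with an average sentiment of 0.4, 18 brand mentions against 45 competitor mentions, and an identical mention rate of 0.6 on every model would score 24 + 21 + 8 + 10 = 63 under these assumptions.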
60-100: Good
30-59: Needs Work
0-29: Critical
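Mapping a score to its band is a simple threshold check. The sketch below assumes the thresholds sit at 60 and 30, so a fractional score such as 59.5 would fall in the lower band; how fractional scores are rounded before banding is not specified here, and `score_band` is an illustrative name.

```python
def score_band(score: float) -> str:
    """Return the label for a Queraid Score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 60:
        return "Good"
    if score >= 30:
        return "Needs Work"
    return "Critical"
```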
5. Caching
Results are cached for 24 hours per brand (normalized). Repeat checks within this window return the same score instantly without additional LLM calls. This keeps costs low and results consistent within a day.
Cache is based on the normalized brand name (lowercase, stripped of trailing dots/slashes). "Stripe.com", "stripe.com", and "STRIPE.COM" all resolve to the same cached result.
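A minimal in-memory sketch of this caching behavior is shown below. The normalization follows the rule stated above (lowercase, trailing dots and slashes removed); trimming surrounding whitespace, the `AuditCache` class, and the injectable clock are assumptions added for illustration and do not describe the actual storage layer.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours


def normalize_brand(name: str) -> str:
    """Lowercase the name and strip trailing dots and slashes."""
    # Trimming surrounding whitespace is an added assumption.
    return name.strip().lower().rstrip("./")


class AuditCache:
    """Hypothetical in-memory cache keyed by normalized brand name."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, brand: str) -> Optional[Any]:
        key = normalize_brand(brand)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= CACHE_TTL_SECONDS:
            del self._entries[key]  # expired: the next audit recomputes the score
            return None
        return result

    def put(self, brand: str, result: Any) -> None:
        self._entries[normalize_brand(brand)] = (self._clock(), result)
```

With this sketch, `cache.put("Stripe.com", result)` followed by `cache.get("STRIPE.COM/")` within 24 hours returns the same stored result.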
6. Limitations
Point-in-time snapshot. Scores reflect LLM outputs at the time of the audit. As models are updated and retrained, responses — and scores — may change.
Not an endorsement. Queraid measures visibility, not quality. A high score means AI models know about and mention your brand, not that your product is better.
AI-generated data. LLM responses may contain inaccuracies. The score aggregates many data points to smooth out individual errors, but is not a guarantee of factual accuracy.
Model availability. If an LLM provider is down during an audit, we retry once and then score with the remaining models. Scores from 4 models are still valid but may differ slightly from 5-model scores.
This methodology is open. We believe transparency builds trust and encourages adoption of a common standard.
The data behind the scores is our proprietary dataset.