Web Cited Research

13 of 600.

We tested 10 funded B2B developer-tools brands for AI search visibility. Across 4 LLMs and 600 buyer-research responses, only 13 mentioned any of the 10 brands. Seven scored zero. Three broke through.

Published May 15, 2026 · Developer Tools & Infrastructure category
13 / 600: LLM responses that mentioned any of the 10 target brands across 5 standardized buyer prompts. Three brands (LlamaIndex, MacStadium, Qase) account for all 13; the other seven scored zero.

The finding in one paragraph

We picked 10 funded B2B developer-tools and infrastructure companies, all with strong category positions in their sub-niche. We asked four leading LLMs (ChatGPT, Claude, Gemini, Perplexity) five buyer-research questions any engineering leader or platform team lead would plausibly type into an AI assistant. We ran each question 3 times per engine for variance. That produced 600 distinct LLM responses. Only 13 of those responses mentioned any of the 10 target brands by domain or by name: 8 mentioned LlamaIndex, 3 mentioned MacStadium, 2 mentioned Qase, and the remaining seven brands appeared zero times. The category is being defined by a different set of brands entirely.
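The headline arithmetic can be reproduced in a few lines. This is a minimal sketch using only the counts reported above; the variable names are illustrative and not part of the study's tooling.

```python
# Mention counts as reported in this study; the seven brands not listed scored zero.
mentions = {"LlamaIndex": 8, "MacStadium": 3, "Qase": 2}

BRANDS = 10
RESPONSES_PER_BRAND = 4 * 5 * 3  # engines x prompts x trials = 60

total_responses = BRANDS * RESPONSES_PER_BRAND
total_mentions = sum(mentions.values())

# Overall share of responses that mentioned any target brand.
overall_rate = total_mentions / total_responses

# Per-brand share, measured against that brand's own 60 responses.
per_brand_rate = {brand: n / RESPONSES_PER_BRAND for brand, n in mentions.items()}

print(f"{total_mentions} / {total_responses} = {overall_rate:.1%}")
for brand, rate in per_brand_rate.items():
    print(f"{brand}: {rate:.1%}")
```

Run as written, this prints an overall rate of 2.2% and per-brand rates of 13.3%, 5.0%, and 3.3%.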

Where each brand landed

Seven of ten scored 0 / 60. Three broke through. The leader (LlamaIndex) earned 8 mentions out of 60 LLM responses. Even the top brand was cited less than 14% of the time.

Brand | Domain | Category | Score | Homepage fixes
LlamaIndex | llamaindex.ai | LLM framework and RAG infrastructure | 8 / 60 | n/a
MacStadium | macstadium.com | Mac cloud + CI infrastructure | 3 / 60 | 9
Qase | qase.io | QA test management | 2 / 60 | 7
ActiveState | activestate.com | Open-source supply chain security | 0 / 60 | 5
HumanSignal | humansignal.com | Data labeling for ML teams | 0 / 60 | 10
FusionAuth | fusionauth.io | Authentication and identity for developers | 0 / 60 | 7
Liquibase | liquibase.com | Database change management and CI/CD | 0 / 60 | 10
Network to Code | networktocode.com | Network automation | 0 / 60 | 5
RapidCanvas | rapidcanvas.ai | AI/ML platform and applications | 0 / 60 | 7
Synergis | synergis.com | Engineering document management | 0 / 60 | 10

"Homepage fixes" is the per-brand count of fix recommendations detectable from a single homepage fetch and the brand's robots.txt, llms.txt, and sitemap.xml endpoints. Roughly 25 distinct check categories: server-side rendering, structured-data JSON-LD presence (Organization, Product/Service, FAQPage, BreadcrumbList), OpenGraph + Twitter Card completeness, canonical + meta description + title + h1 health, alt-text coverage, mobile viewport, html lang attribute, robots.txt allow-list for AI crawlers (GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended, anthropic-ai), and llms.txt + sitemap presence. The full Web Cited audit identifies additional fixes from deeper page-by-page analysis, schema validation, citation-footprint checks, and accessibility scans. "n/a" indicates the brand's homepage blocked our discovery user-agent (a fix in itself: AI crawlers may be blocked too).

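One of the checks listed above, the robots.txt allow-list for AI crawlers, can be approximated with Python's standard library. This is a rough sketch, not the Web Cited audit itself: the function name, the sample robots.txt, and the example.com URL are assumptions made for illustration. Only the crawler names come from the list above.

```python
from urllib.robotparser import RobotFileParser

# Crawler user-agents named in the audit description above.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "PerplexityBot",
    "OAI-SearchBot", "Google-Extended", "anthropic-ai",
]

def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI crawlers that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]

# Hypothetical robots.txt that blocks GPTBot and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_crawlers(sample))  # ['GPTBot']
```

A real check would first fetch the site's robots.txt over HTTP. Parsing a string keeps the example self-contained.
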
Who LLMs cite instead

The same 600 responses cite the brands below by name, repeatedly. These are the companies that have entered the AI-search conversation for devtools buyer prompts. Even brands at the top of this list have remaining homepage fixes available; AI search visibility is rarely saturated.

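  1. LangChain: 82 mentions, 10 homepage fixes
  2. TestRail: 75 mentions, 8 homepage fixes
  3. Auth0: 55 mentions, 6 homepage fixes
  4. Okta: 55 mentions, 8 homepage fixes
  5. GitHub Actions: 51 mentions, 10 homepage fixes
  6. CircleCI: 49 mentions, 6 homepage fixes
  7. Xcode Cloud: 49 mentions, 11 homepage fixes
  8. Jenkins: 40 mentions, 11 homepage fixes
  9. Jira: 37 mentions, 9 homepage fixes
  10. Zephyr: 36 mentions, 9 homepage fixes
  11. PropelAuth: 36 mentions, homepage fixes n/a
  12. Haystack: 36 mentions, 9 homepage fixes
  13. Bitrise: 35 mentions, 4 homepage fixes
  14. PractiTest: 35 mentions, 6 homepage fixes
  15. Fastlane: 31 mentions, 13 homepage fixes
  16. Zephyr Scale: 31 mentions, homepage fixes n/a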

Mention counts are bolded-product-name references across all 600 LLM responses. "Homepage fixes" is the per-brand count of fix recommendations from the discovery audit described above. "n/a" indicates the brand's homepage blocked our discovery user-agent.
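Counting bolded product names could look roughly like the sketch below. It makes two assumptions the study does not state: that the LLM responses use Markdown-style `**bold**` markers, and that every bolded occurrence counts as one mention. The function name and sample responses are hypothetical.

```python
import re
from collections import Counter

# Assumption: responses mark product names with Markdown bold, e.g. **LangChain**.
BOLD = re.compile(r"\*\*(.+?)\*\*")

def count_bolded_names(responses):
    """Tally every bolded span across all responses (one count per occurrence)."""
    counts = Counter()
    for text in responses:
        for name in BOLD.findall(text):
            counts[name.strip()] += 1
    return counts

# Hypothetical responses, for illustration only.
sample = [
    "Popular options include **LangChain** and **Haystack**.",
    "**LangChain** is widely used for RAG pipelines.",
]

print(count_bolded_names(sample).most_common())
# [('LangChain', 2), ('Haystack', 1)]
```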

Methodology

This study was designed to be replicable. The exact prompts, models, settings, and matching logic are documented below. Anyone with API access to the four engines can re-run it.

Engines tested
  OpenAI ChatGPT (gpt-4o-mini)
  Anthropic Claude (claude-haiku-4-5-20251001)
  Google Gemini (gemini-2.5-flash-lite)
  Perplexity (sonar)

Settings (all engines)
  temperature 0.2
  max_tokens 800
  N = 3 trials per (engine, prompt)

Total LLM calls
  600
  10 brands × 4 engines × 5 prompts × 3 trials

Match criteria
  case-insensitive substring match of brand domain OR brand name anywhere in the LLM response, including any cited URLs
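The match criteria can be expressed as a short function. This is a minimal sketch of the rule as described, not the study's actual code. The function names and the sample responses are made up for illustration.

```python
def mentions_brand(response: str, name: str, domain: str) -> bool:
    """True if the brand name or domain appears anywhere in the response text."""
    haystack = response.casefold()  # case-insensitive comparison
    return name.casefold() in haystack or domain.casefold() in haystack

def count_mentions(responses, brands):
    """Count, per brand, how many responses contain its name or domain."""
    return {
        name: sum(mentions_brand(r, name, domain) for r in responses)
        for name, domain in brands.items()
    }

brands = {"LlamaIndex": "llamaindex.ai", "Qase": "qase.io"}

# Hypothetical responses; cited URLs are treated as part of the text.
responses = [
    "For production RAG, teams often pair LangChain with LLAMAINDEX.",
    "See https://www.llamaindex.ai/ for details.",
    "TestRail and Zephyr are common choices.",
]

print(count_mentions(responses, brands))  # {'LlamaIndex': 2, 'Qase': 0}
```

Each response counts at most once per brand, which matches the way the study reports scores out of 60 responses.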

The 5 standardized buyer prompts

  1. What is the best macOS CI/CD infrastructure for iOS and macOS app development teams?
  2. How do enterprises secure their open-source software supply chain in 2026?
  3. What is the best authentication and identity platform for B2B SaaS developers?
  4. What tools do QA teams use for test management and release coverage at scale?
  5. What is the best LLM framework for building production RAG applications?

The same 5 prompts were applied to every brand audit. This is the only way to make a category comparison defensible: if each brand got its own custom prompts, the rankings would reflect the prompt selection, not the brand visibility.
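The full run plan, with the same five prompts applied to every brand audit, can be enumerated as below. This sketch only builds the list of calls and makes no API requests. The dictionary layout and variable names are assumptions. The models, settings, prompts, and domains are taken from this page.

```python
from itertools import product

ENGINES = ["gpt-4o-mini", "claude-haiku-4-5-20251001", "gemini-2.5-flash-lite", "sonar"]
SETTINGS = {"temperature": 0.2, "max_tokens": 800}
TRIALS = 3

PROMPTS = [
    "What is the best macOS CI/CD infrastructure for iOS and macOS app development teams?",
    "How do enterprises secure their open-source software supply chain in 2026?",
    "What is the best authentication and identity platform for B2B SaaS developers?",
    "What tools do QA teams use for test management and release coverage at scale?",
    "What is the best LLM framework for building production RAG applications?",
]

BRANDS = [
    "llamaindex.ai", "macstadium.com", "qase.io", "activestate.com", "humansignal.com",
    "fusionauth.io", "liquibase.com", "networktocode.com", "rapidcanvas.ai", "synergis.com",
]

# One planned call per (brand audit, engine, prompt, trial).
calls = [
    {"brand": brand, "engine": engine, "prompt": prompt, "trial": trial, **SETTINGS}
    for brand, engine, prompt, trial in product(BRANDS, ENGINES, PROMPTS, range(1, TRIALS + 1))
]

print(len(calls))                # 600
print(len(calls) // len(BRANDS)) # 60 responses scored per brand
```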

Known limitations of this methodology. Substring matching can miss paraphrased mentions where the LLM references a product without naming the company or domain. LLM responses vary between calls; N = 3 trials per prompt reduces but does not eliminate variance. Buyer prompts evolve; this set reflects May 2026 category language. LLM training data has a knowledge cutoff, so it may not include recently launched brands or recent rebrands.

What this means if your devtools brand is on the invisible list

Most funded B2B devtools vendors have not entered the AI-search conversation. They built websites optimized for SEO and developer-doc indexing, not for how LLMs synthesize answers. When a platform engineer or CTO asks an AI assistant a category question, the answer comes back with the brands that show up in Stack Overflow answers, in GitHub READMEs that LLMs ingested, in YouTube engineering content, and in long-form analyst writeups. The dominant set is not necessarily the best tool. It is the set the LLMs have read about most often.

The fix is mechanical. LLM-friendly content follows specific patterns: structured FAQ content matching exact buyer-prompt phrasing, schema-marked product pages, citation footprints on third-party developer sites the LLMs already index, and content that answers comparison queries directly. LlamaIndex did some combination of these things, intentionally or not, and earned a 13% citation share where the median brand earned 0%.
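As one illustration of "structured FAQ content matching exact buyer-prompt phrasing", the sketch below generates schema.org FAQPage JSON-LD from question and answer pairs. The helper name and the placeholder answer are hypothetical, and the example shows only the markup format. It makes no claim about how much any single change moves citation rates.

```python
import json

def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# The question reuses one of the study's buyer prompts; the answer is a placeholder.
pairs = [
    (
        "What is the best LLM framework for building production RAG applications?",
        "Placeholder answer: describe your product's fit for this use case here.",
    ),
]

# Embed the output in a <script type="application/ld+json"> tag on the page.
print(json.dumps(faq_jsonld(pairs), indent=2))
```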

Want to know if your devtools brand is invisible?

Web Cited runs the same measurement against your domain, your buyer prompts, and the engines your developer audience actually uses. Citation Monitor re-runs those prompts every week and delivers a click-to-copy Playbook your engineers can ship from in the next sprint, from $49/mo.
