AI & Commerce

AI Traffic in GA4: The Attribution Playbook for Ecommerce

Most AI referral traffic is misattributed as direct. Here is how to build a proper AI-assistant channel grouping in GA4, tag outbound links, and recover attribution.

Quick Answer

AI assistants pass referrer headers inconsistently. ChatGPT and Perplexity usually pass them, Gemini does so partially, Claude often does not, and Grok is variable. The result: 20 to 40 percent of AI traffic lands in direct / none in GA4, and the rest is buried inside the generic referral channel. The fix is a custom channel grouping that unions AI hostnames, paired with UTM tagging on outbound links you control and a dedicated AI-assistants segment for analysis.

Why GA4 default attribution misses AI traffic

Google Analytics 4 assigns channels based on referrer hostname, campaign parameters, and landing page behavior. When a user clicks a link in ChatGPT or Perplexity, the browser either passes the referrer (Perplexity does, ChatGPT usually does) or strips it (varies by browser, extension, and privacy setting).

Traffic with a valid referrer lands in the referral channel. Traffic with stripped referrer lands in direct / none, indistinguishable from true direct traffic. In tracked accounts, this cross-contamination inflates the direct channel by 10 to 25 percent and hides the true volume of AI-origin traffic.

The practical impact: operators see direct traffic growing, assume it is branded search or returning visitors, and fail to invest in the content that is actually driving the growth. Misattributed data produces mis-optimized behavior.

Building the AI-assistants custom channel grouping

In GA4, navigate to Admin > Data display > Channel groups > Create custom channel group. Name it 'Stan Marketing' or similar (not the default). Add a new channel called 'AI Assistants'.

Define the rule: Source matches regex '(chat\.openai\.com|chatgpt\.com|perplexity\.ai|gemini\.google\.com|bard\.google\.com|claude\.ai|grok\.com|x\.ai)'. Place this channel above Organic Search and Referral in the rule order, because GA4 applies rules in order and matches the first hit.
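The first-match behavior is easier to see in a small sketch. The Python below is an illustration only, not GA4's implementation: the hostname pattern is copied from the rule above, while `SEARCH_SOURCE`, `RULES`, and `assign_channel` are hypothetical names, and the search-engine pattern is a deliberately crude stand-in for GA4's real Organic Search rule.

```python
import re

# Hostname pattern copied from the rule above.
AI_SOURCE = re.compile(
    r"(chat\.openai\.com|chatgpt\.com|perplexity\.ai|gemini\.google\.com|"
    r"bard\.google\.com|claude\.ai|grok\.com|x\.ai)"
)

# Hypothetical, simplified stand-in for a search-engine rule.
SEARCH_SOURCE = re.compile(r"(google|bing|duckduckgo)")

# Rules are evaluated top to bottom; the first one that matches wins.
RULES = [
    ("AI Assistants", AI_SOURCE),
    ("Organic Search", SEARCH_SOURCE),
]


def assign_channel(source: str, rules=RULES) -> str:
    """Return the first channel whose pattern is found in the source."""
    for channel, pattern in rules:
        if pattern.search(source):
            return channel
    # Nothing matched: an empty source is treated as direct in this sketch.
    return "Referral" if source else "Direct"


# Same source, two rule orders, two different channels.
ai_first = assign_channel("gemini.google.com")
search_first = assign_channel("gemini.google.com", list(reversed(RULES)))
```

With the AI rule on top, `gemini.google.com` is assigned to AI Assistants. With the order reversed, the broad search pattern claims it first, which is the failure the rule order is meant to prevent.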

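Once the custom channel grouping is created, set it as the default for your property so reports break out AI traffic. A custom channel group is a reporting rule evaluated against the stored source data, so sessions that arrived with an AI referrer are regrouped in historical reports as well. Sessions whose referrer was stripped were stored as direct / none and stay there; no grouping can recover them.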

UTM tagging for recovered attribution

You cannot tag links the AI produces from live retrieval. But you can tag links you publish on blog content, social posts, partner sites, and anywhere else you control that an AI might ingest during training or retrieval.

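Pattern: every outbound link from your blog to a product or service page carries utm_source set to the publishing context, utm_medium set to content, and utm_campaign identifying the article. This applies when the blog sits on a different domain from the store; links that stay within one domain should remain untagged, because internal UTMs corrupt channel attribution. When the AI cites your blog and the user clicks through, attribution is preserved through the UTM chain.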
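A minimal sketch of that tagging pattern, using only Python's standard library. The function name `add_utm`, the example domain, and the campaign value are illustrative assumptions, not part of any GA4 tooling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return the URL with utm_source, utm_medium and utm_campaign set.

    Existing query parameters are kept; existing UTM values are overwritten.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(
        {
            "utm_source": source,
            "utm_medium": medium,
            "utm_campaign": campaign,
        }
    )
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical product link published in a blog article.
tagged = add_utm(
    "https://store.example/products/widget?variant=2",
    source="blog",
    medium="content",
    campaign="ga4-ai-attribution",
)
```

Generating tagged links from one helper keeps parameter names and casing consistent, which matters because GA4 treats `Blog` and `blog` as different sources.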

Second pattern: branded partnership mentions. If a product review on a major publication links to your store, request the link include a specific UTM. When AI assistants retrieve that review as a source and users click through, attribution carries.

The dedicated AI-assistants segment for analysis

Beyond the channel grouping, build an Explore segment in GA4 that isolates AI-origin users. Use the same regex from the channel grouping rule. Apply the segment to Explore reports to see conversion rate, AOV, session duration, pages per session, and assisted conversions.

The segment answers operational questions the channel grouping alone does not: which pages do AI visitors land on first? Which pages do they convert on? What is the assisted conversion contribution? Is the visitor pattern different by AI platform (ChatGPT vs Perplexity vs Gemini)?
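The per-platform question can also be answered outside GA4 from exported session rows. The sketch below assumes a hypothetical export of `(source, converted)` pairs; the hostname-to-platform mapping follows the hostnames in the channel rule, and the function names are illustrative.

```python
from collections import defaultdict

# Hostnames from the channel rule, mapped to the platform that owns them.
PLATFORM_BY_HOST = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "bard.google.com": "Gemini",
    "claude.ai": "Claude",
    "grok.com": "Grok",
    "x.ai": "Grok",
}


def platform_for(source: str):
    """Map a session source to an AI platform, or None if it is not one."""
    host = source.lower()
    for known, platform in PLATFORM_BY_HOST.items():
        # Accept the bare hostname and any subdomain of it (www., etc.).
        if host == known or host.endswith("." + known):
            return platform
    return None


def conversion_by_platform(sessions):
    """sessions: iterable of (source, converted) pairs from a hypothetical export."""
    totals = defaultdict(lambda: [0, 0])  # platform -> [sessions, conversions]
    for source, converted in sessions:
        platform = platform_for(source)
        if platform is None:
            continue  # not AI-origin, ignore in this breakdown
        totals[platform][0] += 1
        totals[platform][1] += int(converted)
    return {name: conv / count for name, (count, conv) in totals.items()}
```

The same grouping key can feed AOV or landing-page breakdowns by swapping the accumulated value.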

Report monthly. Look for AOV and conversion rate trends versus organic search baseline. If AOV is materially higher (typical), the channel is producing high-quality traffic and deserves more investment.

Tracking AI-specific on-site behavior

Beyond referrer tracking, capture AI-specific events that correlate with AI-origin traffic. Time-on-page above 90 seconds is a strong AI-origin signal. Pages per session at 2+ with a product or collection first page is another.

Create a GA4 audience based on these behavioral triggers. Compare audience overlap with the AI-assistants segment. If 60 to 75 percent of the behavioral audience also appears in the AI segment, the behavioral signal is reliable enough to use for lookalike modeling and retargeting.
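The overlap check is a set calculation. This sketch assumes hypothetical session records with `user`, `max_time_on_page_s`, `pages`, and `landing_type` fields, and it assumes either trigger alone qualifies a session; neither the field names nor the either-or rule come from GA4.

```python
def is_behavioral_match(session: dict) -> bool:
    """Apply the two behavioral triggers described above."""
    long_read = session["max_time_on_page_s"] > 90
    deep_visit = (
        session["pages"] >= 2
        and session["landing_type"] in {"product", "collection"}
    )
    # Assumption: either trigger alone is enough to qualify.
    return long_read or deep_visit


def behavioral_users(sessions) -> set:
    """Users with at least one session that matches a trigger."""
    return {s["user"] for s in sessions if is_behavioral_match(s)}


def overlap_share(behavioral, ai_segment) -> float:
    """Share of the behavioral audience that also appears in the AI segment."""
    behavioral = set(behavioral)
    if not behavioral:
        return 0.0
    return len(behavioral & set(ai_segment)) / len(behavioral)


def signal_is_reliable(share: float) -> bool:
    """The 60 to 75 percent band from the text, treated as a minimum of 60."""
    return share >= 0.60
```

Note that the denominator is the behavioral audience, not the AI segment: the question is how often the behavioral signal points at a known AI visitor.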

This is particularly useful for stores where referrer stripping is severe. The behavioral signal becomes a proxy for the attribution the header did not preserve.

Common attribution mistakes that destroy AI channel data

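Placing the AI Assistants channel below Organic Search in the rule order: Organic Search can then claim a source such as gemini.google.com before the AI rule is evaluated, so the session is reported as Organic even though it came from Gemini. Put the AI rule above Organic. Note that the regex above lists no Copilot hostname, so Copilot clicks that arrive with a bing.com referrer still fall into Organic regardless of order.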

Using overly narrow regex that misses subdomain variations: chatgpt.com but not chat.openai.com, or perplexity.ai but not www.perplexity.ai. Test the regex against actual source reports before locking it in.
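One way to run that test offline is to export the distinct source values and check which ones a candidate pattern misses. The sketch below assumes the condition must match the whole source string, which is worth verifying against the match type you select in GA4; the sample sources and helper names are hypothetical.

```python
import re

HOSTS = (
    r"(chat\.openai\.com|chatgpt\.com|perplexity\.ai|gemini\.google\.com|"
    r"bard\.google\.com|claude\.ai|grok\.com|x\.ai)"
)

# Too narrow: two bare hostnames, no subdomain tolerance.
NARROW = re.compile(r"chatgpt\.com|perplexity\.ai")

# Broader: optional subdomain prefix in front of any listed hostname.
BROAD = re.compile(r"(.*\.)?" + HOSTS)


def missed_sources(pattern, sources):
    """Return the sources the pattern fails to match in full."""
    return [s for s in sources if not pattern.fullmatch(s)]


# Hypothetical values as they might appear in a source report.
observed = ["chatgpt.com", "chat.openai.com", "perplexity.ai", "www.perplexity.ai"]

narrow_misses = missed_sources(NARROW, observed)
broad_misses = missed_sources(BROAD, observed)
```

Requiring a dot before the listed hostname also keeps lookalike domains that merely end in the same characters from matching the short `x.ai` entry.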

Tagging internal links with UTMs: internal traffic carrying utm_source corrupts channel attribution. UTMs belong on external outbound links only. Audit every link carrying a utm_source parameter; if it points to your own domain, remove the UTM or use cross-domain tracking instead.
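The audit can be scripted once you have a list of link targets from a crawl or CMS export. This is a sketch under stated assumptions: `internal_utm_links` is a hypothetical helper, subdomains of your domain count as internal, and relative links (no hostname) are treated as internal too.

```python
from urllib.parse import parse_qs, urlsplit


def internal_utm_links(links, own_domain: str):
    """Return links that point at your own site and carry utm_source."""
    own_domain = own_domain.lower()
    flagged = []
    for link in links:
        parts = urlsplit(link)
        host = (parts.hostname or "").lower()
        is_internal = (
            host == ""  # relative link, stays on the current site
            or host == own_domain
            or host.endswith("." + own_domain)
        )
        if is_internal and "utm_source" in parse_qs(parts.query):
            flagged.append(link)
    return flagged
```

Anything the function returns is a candidate for having its UTM parameters removed.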

5-Platform comparison: how each AI treats Shopify

A quick reference across ChatGPT, Perplexity, Gemini, Claude, and Grok. For the full 11-dimension deep comparison with optimization cost and decision framework, see the AI Platforms for Ecommerce comparison.

| Platform | Source mechanism | What it rewards | Traffic profile |
| --- | --- | --- | --- |
| ChatGPT | Training corpus + Bing live retrieval + OpenAI Shopping partners | Complete schema, authority signals, named specifications, editorial coverage | Highest volume. 1.5-3x conversion. +20-40% AOV. Longer sessions. |
| Perplexity | Live web retrieval only, inline citations on every answer | Heading-structure query match, freshness, clean crawlability, clean schema | Fastest-growing. 1.3-2.5x conversion. +10-30% AOV. High click-through. |
| Gemini | Google Search + Merchant Center feeds + Google Shopping ads (blended) | Top-10 organic ranking, feed health, Shopping ad Quality Score, structured data | Variable. 1.1-1.8x conversion. Patterns blend with organic search. |
| Claude | Training corpus, conservative live retrieval in some interfaces | High-authority editorial coverage, declarative framework language, trusted sources | Lower volume. 2-4x conversion when cited. +25-50% AOV. High quality. |
| Grok | X/Twitter public data + web retrieval, real-time bias | Active X presence, recent public mentions, timely offers, trending topics | Newest, unstable. Category-specific (gaming, tech, collectibles). |

Common Questions

Why does AI traffic show up as direct in GA4?

AI assistants differ in whether they pass a referrer. Perplexity and ChatGPT usually do, but browser settings, privacy extensions, and some AI interfaces strip the referrer before the click reaches your site. Traffic with a stripped referrer lands in direct / none, indistinguishable from true direct. The fix is a custom channel grouping in GA4 plus UTM tagging on every outbound link you publish, so that when AI assistants cite your content the click-through preserves attribution.

What regex should I use for AI-assistants channel grouping?

Source matches regex with these hostnames separated by pipes: chat.openai.com, chatgpt.com, perplexity.ai, gemini.google.com, bard.google.com, claude.ai, grok.com, x.ai. Escape periods as backslash-dot. Place the rule above Organic Search in the channel group rule order so Gemini traffic through google.com gets assigned to AI rather than Organic.

Can I recover historical AI traffic that was misattributed?

Only partly. A custom channel grouping is applied at report time, so historical sessions that arrived with an AI referrer appear under the new channel once the grouping exists. Sessions whose referrer was stripped were stored as direct / none and stay there. You can approximate the full historical volume by filtering on AI hostnames and adding an estimate of direct contamination, but it will not be precise. Start tracking correctly going forward and move on.
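The approximation can be worked as simple arithmetic. If a share p of AI clicks lose their referrer, the sessions you can see are the remaining 1 - p, so total is roughly observed / (1 - p). The sketch below applies the 20 to 40 percent range quoted earlier; the observed count of 600 is a made-up example, not measured data.

```python
def estimate_total_ai_sessions(observed_ai_sessions: int, stripped_share: float) -> int:
    """Scale observed AI-referred sessions up to an estimated true total.

    stripped_share is the assumed fraction of AI clicks that arrive
    without a referrer and are therefore counted as direct.
    """
    if not 0 <= stripped_share < 1:
        raise ValueError("stripped_share must be in [0, 1)")
    return round(observed_ai_sessions / (1 - stripped_share))


observed = 600  # hypothetical sessions with an AI hostname as source

low = estimate_total_ai_sessions(observed, 0.20)
high = estimate_total_ai_sessions(observed, 0.40)

# Sessions presumed hidden inside direct / none at each end of the range.
hidden_low = low - observed
hidden_high = high - observed
```

For 600 observed sessions the estimate spans 750 to 1,000 total, which puts 150 to 400 sessions inside direct. Treat the output as a range to report, not a figure to reconcile against.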

Should I build a separate GA4 property for AI traffic?

No. Use the same property with a custom channel grouping and a dedicated audience or segment for AI users. Separating properties creates cross-domain tracking headaches and fragments your data. The single-property model with proper segments handles AI analysis cleanly while preserving all other analytics capabilities.

How do I know if my AI traffic tracking is accurate?

Cross-reference three data points. First, check your GA4 channel report after the custom grouping is live; AI should show non-zero sessions. Second, check the AI-assistants segment conversion rate; it should be 1.5 to 3x your organic baseline (AI traffic converts well). Third, check AOV; it should be 10 to 40 percent above organic. If all three match expected patterns, tracking is working. If AOV is at parity with organic, you are probably missing traffic in direct.
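Those three checks can be expressed as a small function over exported totals. The dictionary keys and the function name are assumptions for illustration; the thresholds are the ranges stated in the answer above.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def tracking_check(ai: dict, organic: dict) -> dict:
    """Compare AI-segment totals with the organic baseline.

    Each dict needs 'sessions', 'orders' and 'revenue' totals.
    """
    ai_cr = _ratio(ai["orders"], ai["sessions"])
    organic_cr = _ratio(organic["orders"], organic["sessions"])
    ai_aov = _ratio(ai["revenue"], ai["orders"])
    organic_aov = _ratio(organic["revenue"], organic["orders"])

    cr_ratio = _ratio(ai_cr, organic_cr)
    aov_lift = _ratio(ai_aov, organic_aov) - 1 if organic_aov else 0.0

    return {
        "has_sessions": ai["sessions"] > 0,
        "conversion_in_range": 1.5 <= cr_ratio <= 3.0,
        "aov_in_range": 0.10 <= aov_lift <= 0.40,
    }
```

A result with `aov_in_range` false and AOV near parity is the case the answer flags: AI sessions are probably still sitting in direct.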

Does UTM tagging on blog links really recover attribution?

Partially. When AI assistants cite your blog content and users click through, the blog-page URL loads with UTM parameters preserved. If you then link from that blog page to a product with UTM-preserving behavior, the product visit carries the original source. This recovers some of the otherwise-lost AI-origin traffic. It does not recover traffic from AI answers that cite your store directly without going through your blog first.

What metrics should I compare between AI and organic traffic?

Conversion rate is the primary metric. AI traffic typically converts at 1.5 to 3x organic search rates because the user arrived pre-qualified by the AI's answer. AOV is secondary; expect 10 to 40 percent above organic. Session duration is informational; AI visitors often spend 60 to 120 seconds versus 40 seconds organic median. Pages per session matters less. Track assisted conversions to capture AI traffic that influences later direct-return purchases.

Stan Tscherenkow, Principal Consultant, Stan Consulting LLC

Twenty years of paid advertising team experience across US, European, and Asian markets. MBA, Universität Trier. Marketing, Loughborough University. Founded Stan Consulting LLC in 2019 in Roseville, California.
