Stan Consulting · Marketing Atlas · Reference · Attribution
Data-Driven Attribution
Google's machine-learning attribution model that assigns fractional conversion credit across touchpoints based on observed contribution. The default GA4 model since 2023.
Section 02 · Quick definition
Data-driven attribution is the machine-learning attribution model used by Google Analytics 4 and Google Ads to assign fractional credit to each touchpoint on a conversion path. It uses the property's own historical conversion data to learn which interactions contribute most to outcomes and distributes credit accordingly. It became the default GA4 model in 2023, replacing last-click. The model output looks like attribution, but it is a learned distribution, not a measured cause. The operator sees credited revenue per channel and not the math behind the credit.
Section 03 · Why it matters
Data-driven attribution is closer to the truth than last-click for any business with a multi-touch buyer journey, and it is harder to defend in front of a CFO who wants a clean attribution narrative. The CFO sees credit shifting between channels quarter to quarter and asks why. The model retrained. The conversion volume changed. The path mix shifted. None of those answers settle a budget conversation. The model is an improvement in fidelity at the cost of a story.
The metric matters because it sits underneath every Smart Bidding decision, every channel performance dashboard, and every conversation about which marketing investments are working. An operator who does not understand what data-driven attribution is doing cannot read the dashboard and cannot question what the bidding algorithm is optimizing toward.
The practical stake is that data-driven attribution rewards the channels with the strongest learned contribution and penalizes channels with sparse data. The penalty is not always deserved, and the reward is not always proportional.
Section 04 · How it works
Data-driven attribution trains on the property's observed conversion paths and the paths of users who did not convert. It learns the lift each touchpoint contributed by comparing converting paths to non-converting paths with similar structure. The output is a fractional credit per touchpoint that sums to one across the path. The model retrains on a rolling basis using recent conversion data, so the credit assigned today is not necessarily the credit the same path would have received six weeks ago.
The model consumes the user's sequence of touchpoints on the way to a conversion: source, medium, campaign, and event timestamps. Paths can include up to 30 touchpoints across the lookback window. Untagged or direct sessions are included as touchpoints in the path.
The model compares paths that converted to similar paths that did not. The lift attributed to a touchpoint is roughly the difference in conversion probability between paths that included it and paths that did not. The math is approximate; the spirit is causal.
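The comparison described above can be sketched in plain Python. This is an illustrative approximation, not Google's actual model (the real math is unpublished): credit each touchpoint by the difference in conversion rate between paths that include it and paths that do not, then normalize per path so credit sums to one. All channel names and path data here are made up.

```python
# Toy conversion paths: (touchpoint sequence, converted?).
# Channel names and outcomes are illustrative.
paths = [
    (("display", "organic", "paid_search"), True),
    (("display", "paid_search"), True),
    (("organic", "paid_search"), False),
    (("paid_search",), False),
    (("display", "organic"), True),
    (("organic",), False),
]

def touchpoint_lift(paths):
    """Approximate lift: conversion rate of paths containing a
    touchpoint minus conversion rate of paths without it."""
    channels = {c for path, _ in paths for c in path}
    lift = {}
    for c in channels:
        with_c = [conv for path, conv in paths if c in path]
        without_c = [conv for path, conv in paths if c not in path]
        rate_with = sum(with_c) / len(with_c) if with_c else 0.0
        rate_without = sum(without_c) / len(without_c) if without_c else 0.0
        lift[c] = max(rate_with - rate_without, 0.0)  # clamp negative lift
    return lift

def fractional_credit(path, lift):
    """Distribute one unit of conversion credit across a path,
    proportional to each touchpoint's learned lift."""
    weights = [lift.get(c, 0.0) for c in path]
    total = sum(weights)
    if total == 0:  # no signal at all: fall back to an equal split
        return {c: 1 / len(path) for c in path}
    return {c: w / total for c, w in zip(path, weights)}

lift = touchpoint_lift(paths)
credit = fractional_credit(("display", "organic", "paid_search"), lift)
# Per-path credit always sums to one, whatever the learned weights.
assert abs(sum(credit.values()) - 1.0) < 1e-9
```

In this toy data every path containing display converts and every path without it does not, so display absorbs all the credit on the example path. Real properties produce far murkier weights, which is exactly why the credit moves when the model retrains.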
The model needs conversion volume to train. Google publishes a guideline of approximately 3,000 conversions per property over 30 days for stable output. Below the threshold, GA4 falls back to a less-trained model and the credit assignments become noisy.
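The volume guideline the text cites can be checked with a trailing window over daily conversion counts. A minimal sketch, assuming the ~3,000-per-30-days figure above; the daily numbers are illustrative.

```python
def stable_dda_volume(daily_conversions, threshold=3000, window=30):
    """Return True if every trailing `window`-day total meets the
    conversion-volume guideline for stable data-driven attribution."""
    if len(daily_conversions) < window:
        return False  # not enough history to evaluate a full window
    totals = [
        sum(daily_conversions[i - window:i])
        for i in range(window, len(daily_conversions) + 1)
    ]
    return min(totals) >= threshold

# Illustrative: ~110 conversions/day clears the 3,000/30-day bar;
# ~80/day does not.
assert stable_dda_volume([110] * 90) is True
assert stable_dda_volume([80] * 90) is False
```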
The model retrains regularly to reflect recent buyer behavior. The retrain shifts credit assignments without any change in the buyer journey. This is the single largest reason CFOs see channel-credit drift quarter to quarter and ask why.
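The drift itself is easy to quantify, even though its cause is not. A minimal sketch, using made-up channel shares, that computes the percentage-point shift in credited revenue per channel between two periods; the drift number alone cannot separate a retrain from a real behavior change.

```python
def credit_drift(prev_share, curr_share):
    """Percentage-point change in credited revenue share per channel
    between two reporting periods (e.g. before/after a retrain)."""
    channels = set(prev_share) | set(curr_share)
    return {
        c: round(curr_share.get(c, 0.0) - prev_share.get(c, 0.0), 4)
        for c in channels
    }

# Illustrative shares of credited revenue by channel.
q1 = {"paid_search": 0.32, "display": 0.18, "organic": 0.30, "direct": 0.20}
q2 = {"paid_search": 0.24, "display": 0.22, "organic": 0.30, "direct": 0.24}

drift = credit_drift(q1, q2)
# paid_search lost 8 points; the drift alone cannot say whether the
# cause was a model retrain or a real change in buyer behavior.
```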
These four steps (path ingestion, counterfactual comparison, volume gating, and periodic retraining) run continuously per property. The model output is a number; the math behind the number is not exposed in any GA4 report.
Section 05 · Common misunderstandings
“Data-driven attribution is causal. The model knows what drove the sale.”
The model is correlational with a counterfactual flavor. It does not run experiments and cannot establish causation. It learns which touchpoint patterns predict conversions on this specific property and assigns credit accordingly. That is useful, but it is not the same as knowing what caused what.
“If we have a small store, data-driven attribution still works for us.”
Below approximately 3,000 conversions per 30 days, the model is undertrained and assigns credit with high variance. The reported numbers will look authoritative; they are noisy. Stores below the threshold are better served by a position-based model with a known structure.
“Data-driven attribution is fair to all channels because it's machine learning.”
The model is structurally biased toward channels with strong observability and frequent visibility on the path. Branded search and direct traffic both inherit credit that originated upstream in display, social, or organic. The bias is built into the data the model trains on, not the algorithm itself.
“The model output is the same in GA4 and Google Ads, so the numbers should match.”
GA4 and Google Ads each run their own data-driven attribution model on different scope: GA4 sees web and app sessions; Ads sees ad impressions and clicks. The two models train on different data and produce different credit. Differences are normal and structural.
“If we switch from data-driven to last-click, we'll see what really drives conversions.”
Last-click does not show what drove conversions. It shows which channel happened to be visited last. For brands where branded search typically closes the path, last-click systematically over-credits paid search. For brands where email closes, it over-credits email. Neither model shows truth. Both show different stories.
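The "different stories" claim can be demonstrated on the same toy paths. This sketch compares last-click against an equal-weight fractional split, which stands in for a learned model; channel names and paths are illustrative.

```python
from collections import Counter

# Toy converting paths (channel names illustrative).
converting_paths = [
    ("display", "organic", "branded_search"),
    ("social", "email"),
    ("display", "branded_search"),
]

# Last-click: the final touchpoint gets 100% of the credit.
last_click = Counter(path[-1] for path in converting_paths)

# Equal-weight fractional credit, standing in for a learned model:
# every touchpoint on the path shares one unit of credit.
fractional = Counter()
for path in converting_paths:
    for channel in path:
        fractional[channel] += 1 / len(path)

# Same paths, two stories: last_click credits only branded_search
# and email; fractional credit surfaces display and social as well.
```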
Section 06 · Diagnostic questions
Does the GA4 property have enough conversion volume for stable data-driven attribution training, or is it below the threshold?
How has channel-level credit shifted in the last four quarters, and how much of that shift is model retrain versus underlying behavior change?
Which channels are gaining credit that was previously elsewhere, and which are losing it?
Is data-driven attribution being used to import conversions into Google Ads for Smart Bidding, and how does the imported credit compare to the platform-native count?
How has the share of direct traffic moved alongside the model retrain, and is direct absorbing credit that should belong to a tagged channel?
Are the data-driven attribution numbers being read against a benchmark or against the operator's own historical baseline at fixed mix?
Are conversion paths longer or shorter than they were 12 months ago, and does the model handle the new path length correctly?
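The last diagnostic can be answered directly from exported path data. A minimal sketch with made-up path samples; in practice the samples would come from GA4 path exploration exports.

```python
def avg_path_length(paths):
    """Mean number of touchpoints per conversion path."""
    return sum(len(p) for p in paths) / len(paths)

# Illustrative path samples from two comparison periods.
paths_last_year = [("organic", "paid"), ("display", "organic", "paid")]
paths_this_year = [
    ("social", "display", "organic", "email", "paid"),
    ("display", "organic", "paid"),
]

# A lengthening average path means more touchpoints splitting the
# same unit of credit, which by itself dilutes per-channel numbers.
assert avg_path_length(paths_last_year) == 2.5
assert avg_path_length(paths_this_year) == 4.0
```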
Section 08 · Five Cents
Data-driven attribution is closer to truth than last-click and harder to defend in front of a CFO. I have lost time in budget meetings explaining why a channel that drove 32% of revenue last quarter drove 24% this quarter without anything underneath the channel changing. The answer is the model retrained on the property's recent path data and reweighted credit. The CFO does not want that answer. The CFO wants a number that does not move because the math moved. There isn't one. The honest read is that data-driven attribution is the best learned credit available, and learned credit is not the same thing as a bank statement, and the operator who pretends otherwise is going to lose budget conversations they should win.