
UTMs to known contacts: the attribution gap that breaks every Looker-to-HubSpot dashboard

Most B2B attribution problems are not multi-touch problems. They are session-stitching problems, and the fix is a property, not a bigger tool.

When a marketing leader comes to us frustrated about attribution, the conversation usually starts with the dashboards. The Looker dashboard says inbound came from organic. The HubSpot dashboard says inbound came from referral. Both are technically correct at their respective layers, and the fact that they disagree is not a tooling problem; it is a session-stitching problem. UTMs live on anonymous sessions, contacts live in HubSpot, and nobody owns the join in between. That is the gap this post is about.

Why this matters now

Sales and marketing still operate on disconnected data in most B2B SaaS organisations, and that is the failure mode underneath nearly every attribution argument. The Harvard Business Review piece on linking sales and marketing put it directly: companies are inhibited by siloed customer data, and a digital customer hub (one identifier, one record, one journey) is the structural fix (hbr.org). For SaaS teams running a warehouse alongside HubSpot, the session-to-contact stitch is where that hub either holds or quietly stops being trustworthy.

Why the form-fill UTM capture isn't enough

The standard pattern is to capture the latest UTM parameters on the form submission and write them to hidden fields on the contact: utm_source, utm_medium, utm_campaign. HubSpot does this natively, and it works for the simplest case: a visitor lands on a UTM-tagged URL, clicks a CTA, fills in a form, and becomes a contact. One session, one contact, one set of UTMs.

That simple case almost never holds. A more typical journey: a visitor clicks through from a UTM-tagged email on day one, browses three pages, and leaves. Two days later they come back via a Google search, read the pricing page, and leave again. Two days after that they paste the homepage URL into a browser and fill in the demo form. The form-fill UTM capture writes direct / none to the contact. The warehouse session table, meanwhile, has all three sessions and the email-source UTM that actually drove the journey. Two systems, two truths, and the one that wins is whichever dashboard the CFO opens first.

The session-token property pattern

The fix is not a bigger attribution tool. It is a single property on the contact record, call it session_token, that holds the same identifier the front-end writes to its session log. On first page load, the front-end generates the token (a UUID, a hashed cookie, anything stable across sessions), drops it into a first-party cookie, and tags every page event with it. When the visitor fills a form, the same token writes to a hidden field and lands on the contact.
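
The get-or-create logic can be sketched in a few lines. The sketch below is in Python for readability; in production this logic would normally run in the browser or at the edge. The cookie name, the one-year lifetime, and the helper names are all assumptions made for illustration, not part of any HubSpot or Looker API.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Dict, Tuple

COOKIE_NAME = "session_token"          # hypothetical cookie name
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed cookie lifetime


def get_or_create_token(cookie_header: str) -> Tuple[str, bool]:
    """Return (token, is_new): reuse the cookie's token or mint a UUID."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if COOKIE_NAME in cookies and cookies[COOKIE_NAME].value:
        return cookies[COOKIE_NAME].value, False
    return str(uuid.uuid4()), True


def build_set_cookie(token: str) -> str:
    """Build the Set-Cookie value that persists the token for this browser."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = token
    cookie[COOKIE_NAME]["path"] = "/"
    cookie[COOKIE_NAME]["max-age"] = ONE_YEAR_SECONDS
    cookie[COOKIE_NAME]["samesite"] = "Lax"
    return cookie[COOKIE_NAME].OutputString()


def tag_event(token: str, page: str, utms: Dict[str, str]) -> Dict[str, str]:
    """Attach the token to a page event before it is sent to the session log."""
    event = {"session_token": token, "page": page}
    event.update(utms)
    return event


# First visit: no cookie yet, so a token is minted.
token, is_new = get_or_create_token("")
# Return visit: the same token comes back from the cookie.
same_token, is_new_again = get_or_create_token(f"other=1; {COOKIE_NAME}={token}")
event = tag_event(token, "/pricing", {"utm_source": "newsletter"})
```

The important property is that the return visit yields the same token as the first visit, which is what lets later sessions and the eventual form fill share one key.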

From there, attribution stops being a guess and becomes a join. The contact has a token; the session table has full UTM history keyed to that token, including every prior anonymous session the same browser owned. The contact-level UTM properties in HubSpot become the summary; the warehouse join is the source of truth.
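
As a minimal sketch of that join, the example below uses an in-memory SQLite database in place of the warehouse. The table layout, column names, token value, and dates are hypothetical, and the session table is simplified to one row per session rather than one row per page view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contacts (
        email TEXT, session_token TEXT,
        form_utm_source TEXT, form_utm_medium TEXT
    );
    CREATE TABLE sessions (
        session_token TEXT, started_at TEXT,
        utm_source TEXT, utm_medium TEXT
    );
    -- The form fill only saw the converting session: direct / none.
    INSERT INTO contacts VALUES ('buyer@example.com', 'tok-123', 'direct', 'none');
    -- The session log saw the whole journey for the same browser token.
    INSERT INTO sessions VALUES
        ('tok-123', '2024-03-01T09:00:00', 'newsletter', 'email'),
        ('tok-123', '2024-03-03T14:00:00', 'google', 'organic'),
        ('tok-123', '2024-03-05T11:00:00', 'direct', 'none');
    """
)


def journey_for(connection, email):
    """Return every session for a contact, oldest first, via the token join."""
    return connection.execute(
        """
        SELECT s.started_at, s.utm_source, s.utm_medium
        FROM contacts AS c
        JOIN sessions AS s ON s.session_token = c.session_token
        WHERE c.email = ?
        ORDER BY s.started_at
        """,
        (email,),
    ).fetchall()


journey = journey_for(conn, "buyer@example.com")
first_touch_source = journey[0][1]       # the email that started the journey
converting_touch_source = journey[-1][1]  # what the form fill alone would report
```

Read top to bottom, the joined rows recover the email first touch that the contact-level form capture never sees.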

What lives where

On the contact: session_token, first_session_utm_source, first_session_utm_medium, first_session_utm_campaign, last_session_utm_source. These are summary fields, populated by the workflow on form submission, so the SDR can read the contact card without opening the warehouse.

Inside the warehouse: the full session table, keyed by session_token, with one row per page view and full UTM context. The contact record points at the session table; the session table does not need to know about the contact until the join runs.

Backfill workflow: matching anonymous sessions to converted contacts

The form fill captures the current session token. It does not, by itself, capture the prior anonymous sessions from the same browser. That is what the backfill workflow is for.

The first-party cookie holding the token persists across sessions: the same UUID on every visit until the cookie expires or the user clears it. The warehouse session table inherits that token, so all of that browser's anonymous sessions are already keyed to one identifier. The contact record only learns the token at form-fill time. The backfill is the moment you walk that token backward through the session table and pull every prior session into the contact's attribution view.

In practice this is a scheduled job, not a real-time workflow. Every night, a Looker query finds new contacts created in the last 24 hours, joins on session_token, and writes the rolled-up first-touch and converting-touch UTMs back to HubSpot via the API. Real-time is a nice-to-have; nightly is the version that actually ships and stays running.
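
The roll-up step of such a job might look like the sketch below. It assumes contacts and sessions have already been pulled into memory as dictionaries with the hypothetical keys shown; the property names follow the summary fields described earlier in this post. The actual write-back to HubSpot is deliberately left out, and the payload shape is an assumption to check against the API version you use.

```python
from datetime import datetime, timedelta


def rollup_touches(sessions):
    """Summarise a session history into first-touch and converting-touch UTMs."""
    ordered = sorted(sessions, key=lambda s: s["started_at"])
    first, last = ordered[0], ordered[-1]
    return {
        "first_session_utm_source": first["utm_source"],
        "first_session_utm_medium": first["utm_medium"],
        "first_session_utm_campaign": first["utm_campaign"],
        "last_session_utm_source": last["utm_source"],
    }


def build_backfill_updates(contacts, sessions, now, window=timedelta(hours=24)):
    """Build one update per recently created contact that has session history."""
    by_token = {}
    for session in sessions:
        by_token.setdefault(session["session_token"], []).append(session)

    updates = []
    for contact in contacts:
        if not (now - window <= contact["created_at"] <= now):
            continue  # outside the nightly window
        # Only sessions up to the form fill count toward the converting touch.
        history = [
            s for s in by_token.get(contact["session_token"], [])
            if s["started_at"] <= contact["created_at"]
        ]
        if not history:
            continue  # broken token chain: nothing to backfill
        updates.append(
            {"contact_id": contact["id"], "properties": rollup_touches(history)}
        )
    return updates


def _session(token, started_at, source, medium, campaign):
    return {"session_token": token, "started_at": started_at,
            "utm_source": source, "utm_medium": medium, "utm_campaign": campaign}


sessions = [
    _session("tok-123", datetime(2024, 3, 1, 9, 0), "newsletter", "email", "march-launch"),
    _session("tok-123", datetime(2024, 3, 3, 14, 0), "google", "organic", ""),
    _session("tok-123", datetime(2024, 3, 5, 11, 0), "direct", "none", ""),
]
contacts = [
    {"id": "101", "session_token": "tok-123", "created_at": datetime(2024, 3, 5, 11, 20)},
    {"id": "102", "session_token": "tok-999", "created_at": datetime(2024, 3, 5, 12, 0)},
    {"id": "103", "session_token": "tok-123", "created_at": datetime(2024, 3, 1, 0, 0)},
]
updates = build_backfill_updates(contacts, sessions, now=datetime(2024, 3, 6, 2, 0))
```

In this sample only contact 101 produces an update: 102 has no matching sessions, and 103 was created outside the 24-hour window. Contacts that are skipped keep whatever the form fill wrote, which is the partial picture the next section deals with.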

The free-email-domain problem

About 40% of inbound contacts on most B2B SaaS sites we work with arrive on a free email domain such as Gmail, Outlook, or Yahoo. Some are real buyers using a personal address; many are noise. Either way, a contact on a personal email is harder to resolve to a company, and the session token only helps if the same browser and cookie actually belonged to the buyer. A shared laptop, cleared cookies, an incognito form fill: any of those breaks the token chain, and the backfill produces a partial picture.

There is no clean fix at the contact level. What works is to accept the directional truth: in our experience, free-email-domain contacts resolve cleanly roughly 60% of the time and business-email-domain contacts closer to 90%, and the aggregate report stays honest if you segment by email type. The dashboard that pretends every contact has full attribution is the dashboard that quietly lies to the CMO.
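
One way to make that segmentation concrete is sketched below. The domain list is a deliberately short, illustrative sample, and the has_session_history flag is a hypothetical field standing in for "the backfill found prior sessions for this contact".

```python
# Illustrative sample only; a real list of free email providers is far longer.
FREE_EMAIL_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}


def email_segment(email):
    """Classify a contact as 'free' or 'business' by its email domain."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return "free" if domain in FREE_EMAIL_DOMAINS else "business"


def resolution_rate_by_segment(contacts):
    """Share of contacts per segment whose token resolved to session history."""
    totals, resolved = {}, {}
    for contact in contacts:
        segment = email_segment(contact["email"])
        totals[segment] = totals.get(segment, 0) + 1
        if contact["has_session_history"]:
            resolved[segment] = resolved.get(segment, 0) + 1
    return {seg: resolved.get(seg, 0) / totals[seg] for seg in totals}


# Made-up sample contacts, purely to show the shape of the report.
sample = [
    {"email": "ana@gmail.com", "has_session_history": True},
    {"email": "raj@Yahoo.com", "has_session_history": False},
    {"email": "lee@acme.example", "has_session_history": True},
    {"email": "kim@acme.example", "has_session_history": True},
]
rates = resolution_rate_by_segment(sample)
```

Reporting the two rates side by side keeps the noisier segment visible instead of averaging it into the headline number.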

On the one hand, on the other hand

On the one hand, the session-token join gives credit to the email campaign that actually drove the journey, even when the converting form fill says direct / none. On the other hand, this needs to be taken with a grain of salt at the contact level, especially at the free-email-domain edge. Directionally, the join is the fix the bigger attribution tool was supposed to be. At the individual-contact level, it is one signal among several. Both can be true.

Pattern from the field

A Series A B2B SaaS team in DACH came to us last quarter with a familiar complaint: the warehouse dashboard said paid social was their best inbound channel, while the HubSpot contact-level dashboard said paid social was barely registering. The team had been arguing about which number to put in the board deck for two months. The real problem was that the front end was writing a session ID to the warehouse but never to the contact, so HubSpot only ever saw the converting session's UTMs, which, given the typical read-then-return buyer journey, almost always landed on direct or organic. We added a session_token property, wired the form to capture it, and ran a nightly backfill against the existing session history. Inside three weeks the dashboards started telling the same story: paid social was driving discovery, direct was closing.

Resolution: the session-stitch playbook

  1. Generate a stable session token on the front end. One UUID per browser, persisted in a first-party cookie, attached to every page event flowing into Looker.
  2. Add a session_token contact property in HubSpot. Single-line text, hidden from the contact card if you want, but populated on every form fill via a hidden field.
  3. Write the token to the form. The form's hidden field reads from the cookie at submission time and writes the token to the contact record. This is the only piece that has to be real-time.
  4. Build the nightly backfill query. A Looker job that joins new contacts on session_token, rolls up first-touch and converting-touch UTMs from the session table, and pushes the result back to HubSpot via the API.
  5. Segment your reporting by email type. Free-email-domain contacts get a separate tab in the attribution dashboard. The aggregate is honest only if you let the noisy segment be visibly noisy.
  6. Reconcile the two dashboards monthly. Looker and HubSpot will not match exactly, because they are summarising different things, but the directional story should agree. When it does not, suspect the session stitch before you suspect the data.
  7. Document the join. A one-page note on which property lives where, which job runs when, and what to check first when a number looks wrong. The session-token pattern is the kind of plumbing that breaks silently if no one knows it exists.
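
The monthly reconciliation in step 6 can be as simple as comparing channel shares rather than raw counts. The sketch below assumes each dashboard can export contact counts per channel; the channel names, the counts, and the 20-point tolerance are made up for illustration.

```python
def channel_shares(counts):
    """Convert raw contact counts per channel into shares of the total."""
    total = sum(counts.values())
    return {channel: n / total for channel, n in counts.items()}


def disagreements(warehouse_counts, hubspot_counts, tolerance=0.20):
    """Return channels whose share differs by more than `tolerance`."""
    warehouse = channel_shares(warehouse_counts)
    hubspot = channel_shares(hubspot_counts)
    flagged = {}
    for channel in set(warehouse) | set(hubspot):
        gap = abs(warehouse.get(channel, 0.0) - hubspot.get(channel, 0.0))
        if gap > tolerance:
            flagged[channel] = gap
    return flagged


# Hypothetical monthly exports from the two dashboards.
warehouse_counts = {"paid_social": 50, "direct": 30, "organic": 20}
hubspot_counts = {"paid_social": 5, "direct": 60, "organic": 35}

flagged = disagreements(warehouse_counts, hubspot_counts)
flagged_channels = sorted(flagged)
```

A flagged channel is a prompt to check that the token is still being written to the form and that the nightly job is still running, before anyone debates which dashboard is right.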

Where Checkpoint comes in

Most attribution work we do at Checkpoint is not building a new model; it is fixing the session stitch the previous model assumed was there. If your marketing operations team is fighting two dashboards that disagree, the join is almost always the broken layer, and a property plus a nightly query is almost always the fix. We pair that with the broader revenue operations work on what attribution is actually for in your funnel, and with the HubSpot implementation work that wires the properties, forms, and workflows so the stitch survives the next portal change. That is why the right place to start is not the attribution tool but the contact record.

Carolina Decastri
GTM & Partnerships

Five years across sales, project management, and venture capital, focused on supporting early-stage startups from zero to one. Built a Founder Resources Platform serving 200+ founders and 100+ partnerships. Founded the START and Platform Crew communities. HubSpot Sales and Marketing Hubs certified.