HubSpot lead scoring: fit vs. engagement, why one score never works

When a team asks us to build lead scoring in HubSpot, the first question we ask back is whether they are trying to score who the lead is, or what the lead is doing. Those are two different jobs. Most instances we audit have one model trying to do both, which is why nobody on the sales floor takes the score seriously. The fix is not a better algorithm. It is splitting the score into two contact-level properties, each owned by the team that uses it.

Why this matters now

Buying behavior at the contact level has gotten harder to read, not easier. Buying groups are larger, individual touchpoints are noisier, and the signal-to-noise ratio on a composite grade has fallen further than most ops teams want to admit. Harvard Business Review's analysis of the modern B2B buying journey laid out the structural reason: customers spend most of their journey in self-serve research and surface inside the CRM with partial intent, presenting a profile that looks contradictory until you separate who they are from what they are doing. A combined grade flattens that distinction at exactly the moment it matters most. [1]

What's the difference between fit scoring and engagement scoring?

Checkpoint GTM splits HubSpot scoring into two contact-level properties: a fit score answers who the lead is, built from static demographics, while an engagement score answers what they are doing, built from recency-weighted behavior. Across the instances Checkpoint GTM audits, one combined grade averages both signals, so neither the inbound nor outbound team trusts it.

When a revenue team asks for a lead score, they are usually asking for one number that ranks the queue. That instinct is the trap. Ranking implies a single ordering, and a single ordering forces you to average two signals that should not be averaged. The fit score answers a structural question about the account. The engagement score answers a behavioral question about timing. Averaging them produces a middle number that is high enough to clutter the inbound queue and low enough to lose the outbound rep's confidence on the cold list. The better question is: which property does each team weight when they decide what to work next?
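
The arithmetic behind that trap is easy to see with two made-up contacts. The sketch below is illustrative only: the 0-100 scale, the `Contact` class, and the plain average in `combined_grade` are assumptions for the example, not how any particular HubSpot instance computes its grade.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    label: str
    fit: int         # 0-100, who they are (hypothetical scale)
    engagement: int  # 0-100, what they are doing (hypothetical scale)


def combined_grade(contact: Contact) -> float:
    """The single-number approach: a plain average of both signals."""
    return (contact.fit + contact.engagement) / 2


# Two contacts that call for opposite next actions.
dormant_ideal = Contact("perfect fit, silent for months", fit=90, engagement=10)
active_misfit = Contact("outside the ICP, very active", fit=10, engagement=90)

for contact in (dormant_ideal, active_misfit):
    print(contact.label, combined_grade(contact))
```

Both contacts come out with the same combined grade, so the one number cannot tell the outbound rep which of them passes the ICP filter or tell the inbound rep which of them is active today. Kept as two properties, the difference is obvious.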

Fit score: who they are

Checkpoint GTM builds the fit score purely from demographics (country, revenue band, employee band, industry, job title, seniority, and ICP filters), so it barely moves until the company changes. Checkpoint GTM validates it by comparing last quarter's closed-won deals against closed-lost-no-decision deals; if the fit score does not visibly separate them, the inputs are wrong, not the math.

The fit score is the answer to whether this is the kind of company the team sells to. It is built from contact and company demographics: country, revenue band, employee band, industry, job title, seniority, and any ICP filter the GTM team has agreed on. The data is mostly static or slow-moving. Once a contact's company is enriched, the fit score barely moves until the company itself changes.
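
As a minimal sketch of a demographics-only score, the function below sums points from lookup tables. Every property name, band label, and point value in `FIT_POINTS` is hypothetical and would come from the team's own ICP definition; the point is only that no behavioral field is ever read.

```python
# Hypothetical point tables; band names and weights are illustrative only.
FIT_POINTS = {
    "country": {"DE": 15, "AT": 15, "CH": 15},
    "employee_band": {"51-200": 20, "201-1000": 25},
    "revenue_band": {"10-50M": 20, "50-250M": 25},
    "industry": {"software": 15},
    "seniority": {"director": 10, "vp": 15, "c_level": 20},
}


def fit_score(contact: dict) -> int:
    """Sum points per demographic attribute; missing or unknown values add 0."""
    total = 0
    for prop, table in FIT_POINTS.items():
        total += table.get(contact.get(prop), 0)
    return min(total, 100)


contact = {
    "country": "DE",
    "employee_band": "51-200",
    "revenue_band": "10-50M",
    "industry": "software",
    "seniority": "vp",
}

# Behavioral data on the record is ignored, so the score does not decay.
quiet_contact = {**contact, "days_since_last_open": 270}
print(fit_score(contact), fit_score(quiet_contact))
```

Because `fit_score` only looks up the keys listed in `FIT_POINTS`, adding or changing an engagement field leaves the result untouched, which is the property the text asks for.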

The diagnostic test for a fit score is simple. Pull a list of last quarter's closed-won deals, then pull a list of last quarter's closed-lost-no-decision deals. The fit score should separate them visibly. If the distributions overlap, the inputs are wrong, not the math.
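
One way to put a number on "separates them visibly" is sketched below, assuming the fit scores for both deal lists have already been exported. The `separation` helper and the sample scores are made up for illustration; the measure is the share of won/lost pairs in which the won deal has the higher score.

```python
from itertools import product


def separation(won_scores, lost_scores):
    """Share of (won, lost) pairs where the won deal scores higher.

    Ties count as half a pair. 0.5 means the score does not separate
    the two groups at all; 1.0 means every won deal outscores every
    lost deal.
    """
    pairs = list(product(won_scores, lost_scores))
    if not pairs:
        raise ValueError("need at least one deal in each list")
    points = sum(1.0 if w > l else 0.5 if w == l else 0.0 for w, l in pairs)
    return points / len(pairs)


# Made-up fit scores for last quarter's deals.
closed_won = [85, 70, 90, 65]
closed_lost_no_decision = [40, 55, 70, 30]

print(separation(closed_won, closed_lost_no_decision))
```

A result near 0.5 is the overlapping-distributions case: the inputs need rework. What counts as high enough is a judgment call for the team, and with only a handful of deals per quarter the number should be read as directional.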

What the fit score does not tell you

The fit score will not tell you whether the contact is paying attention this week. A perfect-fit contact who has not opened an email in nine months is still a perfect-fit contact, and a fit score that decays based on engagement is no longer a fit score. It is a confused composite. Resist the temptation to bake recency into it.

Engagement score: what they are doing

Checkpoint GTM builds the engagement score purely from behavior (opens, page views, demo requests, content downloads, pricing-page visits, and meeting holds), each input carrying a recency weight in the usual thirty-to-ninety-day band. It moves daily, so the inbound team queues from it; a contact who fired every signal in Q2 decays back down as attention fades.

The engagement score is the answer to whether the contact is paying attention right now. It is built from behavior: opens, page views, demo requests, content downloads, pricing-page visits, and meeting holds. Every behavioral input has a recency weight. A demo view from yesterday counts more than three pricing-page visits from June, and a contact who fired all their signals in Q2 should fall back down the queue as the score decays.

The engagement score is the property the inbound team queues from. It moves daily and gives the rep a reason this contact surfaced today. The fit score decides whether the lead is worth the next motion at all; the engagement score decides whether today is the day.
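
Recency weighting can be sketched as exponential decay. This is one possible reading of a recency weight, not a description of HubSpot's own scoring tool: the event names, the point values in `EVENT_POINTS`, and the thirty-day half-life are all assumptions for the example.

```python
from datetime import date

# Hypothetical base points per event type.
EVENT_POINTS = {
    "email_open": 1,
    "page_view": 2,
    "content_download": 5,
    "pricing_page_visit": 8,
    "meeting_held": 15,
    "demo_request": 20,
}
HALF_LIFE_DAYS = 30  # assumption: a signal loses half its weight every 30 days


def engagement_score(events, today):
    """Sum event points, halving each one for every HALF_LIFE_DAYS of age."""
    score = 0.0
    for event_type, event_date in events:
        age_days = max((today - event_date).days, 0)
        weight = 0.5 ** (age_days / HALF_LIFE_DAYS)
        score += EVENT_POINTS.get(event_type, 0) * weight
    return round(score, 2)


today = date(2024, 9, 13)
demo_yesterday = [("demo_request", date(2024, 9, 12))]
pricing_in_june = [("pricing_page_visit", date(2024, 6, 15))] * 3  # 90 days old

print(engagement_score(demo_yesterday, today))
print(engagement_score(pricing_in_june, today))
```

With these assumed numbers, the three June pricing-page visits have decayed through three half-lives and together score well below the single demo request from yesterday, so the older contact falls down the queue without anyone resetting the property by hand.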

Why the inbound team uses one and the outbound team uses the other

Checkpoint GTM gives each team a primary property: inbound queues from the engagement score, recency-weighted, because fit is already settled by the form; outbound filters on the fit score as a hard ICP cutoff, since engagement is sparse on a cold list. A single combined grade compromises both, which is why neither team trusts it.

The inbound team is working a queue of leads who have already raised a hand. The fit question is largely settled by the form they filled in, or by the routing layer. What inbound needs is a sequence: who raised their hand most recently, with what intensity, on what page. That is the engagement score, weighted heavily on recency.

The outbound team, by contrast, is building a list from a colder universe. The engagement signal is sparse by definition. What outbound needs is a filter that removes the segment of the universe that does not match the ICP before any rep spends a minute on a sequence. That is the fit score, used as a hard cutoff rather than a ranking. This is why a single combined lead grade disappoints both teams: inbound would rather the score moved with behavior; outbound would rather the score did not move at all once the company is enriched. A combined grade compromises both.
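
The two usage patterns translate into two different operations on the same contact list: a sort for inbound and a filter for outbound. The sketch below uses plain dictionaries; the field names and the `FIT_CUTOFF` value are hypothetical.

```python
FIT_CUTOFF = 60  # hypothetical ICP threshold agreed by the GTM team


def inbound_queue(contacts):
    """Inbound: order by engagement, most active first. Fit does not rank."""
    return sorted(contacts, key=lambda c: c["engagement"], reverse=True)


def outbound_list(contacts, cutoff=FIT_CUTOFF):
    """Outbound: drop everything below the fit cutoff, then use engagement
    only to order the contacts that remain."""
    eligible = [c for c in contacts if c["fit"] >= cutoff]
    return sorted(eligible, key=lambda c: c["engagement"], reverse=True)


contacts = [
    {"name": "A", "fit": 90, "engagement": 5},
    {"name": "B", "fit": 30, "engagement": 80},
    {"name": "C", "fit": 75, "engagement": 40},
]

print([c["name"] for c in inbound_queue(contacts)])
print([c["name"] for c in outbound_list(contacts)])
```

Contact B leads the inbound queue and never appears on the outbound list, while contact A is the reverse case: last for inbound, eligible for outbound. A single averaged grade would have to place both somewhere in the middle.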

Pattern from the field

A Series A B2B SaaS team in DACH came to us with one HubSpot lead grade that the inbound rep had stopped checking and the outbound rep had never trusted. The grade was a weighted average of inputs spanning both fit and behavior. The team's instinct was to retune the weights. The actual fix was structural: split the property in two, define a fit score from a demographic stack with no behavioral inputs, define an engagement score from a behavioral stack with a thirty-day recency decay, and let each team weight whichever property mapped to their actual next action. The data was already in HubSpot. It was just being averaged together at the wrong layer.

Resolution: a fit-and-engagement playbook

If you are about to build or rebuild HubSpot lead scoring, the steps below survive every variant of this project we have seen:

1. Decide the two properties first. Before any inputs or weights, agree that there will be a fit score and an engagement score as separate contact-level properties, not a combined grade. Name them clearly. The naming convention is the contract.
2. Build the fit score from demographic inputs only. Country, revenue band, employee band, industry, job title, seniority, ICP-flag from enrichment. No behavior. The fit score should change when the company changes, not when the contact opens an email.
3. Build the engagement score from behavioral inputs only, with recency decay. Opens, page views, demo requests, content downloads, pricing-page visits, meeting holds. Apply a recency weight that pulls older signals down automatically; thirty to ninety days is the usual band.
4. Validate against last quarter's outcomes. Pull closed-won and closed-lost-no-decision; check that the fit score separates them. Pull SQL-converted vs. SQL-disqualified inbound; check that the engagement score separates them. If a property does not separate its target outcome, the inputs are wrong.
5. Assign each team a primary property. Inbound queues from engagement, with fit as a sanity check. Outbound filters from fit, with engagement as a tiebreaker on the eligible list. Document this in the team's playbook so the score has a single owner of next-action.
6. Publish a distribution preview. Show the team how many contacts land in each band of each property. A score where ninety percent of contacts cluster in one band is not a score; it is a label. Tune the band cutoffs against the distribution before turning the property loose in routing.
7. Treat the score as directional. No scoring model is perfect at the contact level, especially at scale where free-email-domain contacts and partial enrichment will skew the lower bands. Publish the caveat alongside the score so the team uses it as a queue input, not a verdict.
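
The distribution preview step above can be sketched in a few lines, assuming the scores have been exported as a list of numbers. The band cutoffs in `BANDS` and the helper names are hypothetical; the ninety percent threshold is the one named in the step itself.

```python
from collections import Counter

# Hypothetical band floors; tune these against the real distribution.
BANDS = [(0, "low"), (40, "medium"), (70, "high")]


def band_for(score, bands=BANDS):
    """Return the name of the highest band whose floor the score reaches."""
    label = bands[0][1]
    for floor, name in bands:
        if score >= floor:
            label = name
    return label


def distribution(scores, bands=BANDS):
    """Share of contacts that land in each band, in band order."""
    if not scores:
        raise ValueError("no scores to preview")
    counts = Counter(band_for(s, bands) for s in scores)
    return {name: counts.get(name, 0) / len(scores) for _, name in bands}


def is_label_not_score(shares, threshold=0.9):
    """Flag a property where one band holds at least `threshold` of contacts."""
    return max(shares.values()) >= threshold


clustered = [10] * 18 + [50, 80]  # made-up scores: 18 of 20 contacts score low
shares = distribution(clustered)
print(shares, is_label_not_score(shares))
```

Running the preview on each property before it goes into routing shows whether the cutoffs need to move. If a property is flagged, adjust the band floors (or the inputs) and rerun until the contacts spread across the bands in a way the team can act on.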

If you follow these seven steps, the score becomes the property the team actually queues from. If you skip them and ship one combined grade, you will be back in HubSpot's scoring tool within a quarter, retuning weights that were never the problem.

Where Checkpoint comes in

Score design is also stage-dependent. A seed-stage team working a hand-picked target list does not need a fit score; the list is the score. A late Series B team running blended inbound and outbound needs both properties working hard, with team-specific weighting. Harvard Business Review made the same structural point in 2006, when it argued that sales-force design has to evolve with the business life cycle rather than stay fixed at the shape that worked at the previous stage. [2] Score design is the same problem one layer down.

Splitting fit and engagement is one of the cleanest leverage points inside any HubSpot instance, but the work that gets the score used is upstream: the ICP definition, the enrichment stack, the routing rules, and the playbook the inbound rep opens at the start of the day. We do this kind of HubSpot operating-layer work as embedded RevOps inside client teams. If your lead grade has stopped meaning anything, that is usually the project. Talk to us about HubSpot lead scoring.

Sources

[1] Harvard Business Review. “The New B2B Sales Imperative.” March 2017. hbr.org
[2] Harvard Business Review. “Match Your Sales Force Structure to Your Business Life Cycle.” July 2006. hbr.org