What is an identity graph and how does it work?

An identity graph is a database that maps all known identifiers for a given person, including email addresses, phone numbers, device IDs, postal addresses, and digital tokens like RampIDs or UID 2.0, to a single persistent profile. The graph answers one question: are these different identifiers the same human being? Connections can be deterministic (exact match on a shared identifier) or probabilistic (statistically inferred from behavioral and household signals), and a scored system like BIGDBM's attaches a 0-100 confidence index to every link.

How are identity graphs evolving beyond cookie-based tracking?

Browser restrictions on third-party cookies and Apple's App Tracking Transparency framework pushed the industry toward durable, consent-based identifiers: hashed emails, phone numbers, postal addresses, and authentication-based IDs like RampID and UID 2.0. Modern identity graphs are being rebuilt around these persistent first-party signals rather than browser cookies, which makes them more stable across devices and more defensible under privacy regulations.

What is scored identity resolution and why does it matter?

Scored identity resolution assigns a confidence index from 0 to 100 to every identity match, rather than a binary accept or reject. This lets teams calibrate precision to the stakes of the use case: for example, using only high-confidence matches (85+) for suppression where a false positive means contacting an opt-out, and wider thresholds (65+) for prospecting where reach matters more. The exact cutoffs are a per-team choice, not fixed values. BIGDBM's Scored Identity Resolution is built specifically for threshold-based workflows that make the match quality visible rather than treating match rate as a single opaque number.
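
As a rough illustration of a threshold-based workflow, the sketch below filters scored matches by use case. The `Match` class, the `THRESHOLDS` table, and `select_matches` are hypothetical names invented for this example, and the cutoffs simply reuse the 85 and 65 figures mentioned above; none of this reflects an actual BIGDBM API.

```python
from dataclasses import dataclass

# Hypothetical cutoffs, borrowed from the examples in the text.
THRESHOLDS = {"suppression": 85, "prospecting": 65}


@dataclass(frozen=True)
class Match:
    record_id: str
    profile_id: str
    score: int  # confidence index, 0-100


def select_matches(matches, use_case):
    """Keep only matches whose score meets the cutoff for this use case."""
    minimum = THRESHOLDS[use_case]
    return [m for m in matches if m.score >= minimum]


matches = [
    Match("r1", "p1", 97),
    Match("r2", "p1", 85),
    Match("r3", "p2", 72),
    Match("r4", "p3", 40),
]
suppression = select_matches(matches, "suppression")
prospecting = select_matches(matches, "prospecting")
print([m.record_id for m in suppression])
print([m.record_id for m in prospecting])
```

Keeping the cutoffs in one table, rather than scattered through campaign logic, makes it easier to review and adjust the precision-versus-reach tradeoff per use case.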

How does privacy compliance affect identity graph technology?

CCPA, CPRA, and other US state privacy laws require that consumers can access, delete, and correct the data held about them. At the identity graph level, this means a deletion request must propagate across every link associated with that consumer: all connected email addresses, device IDs, and downstream activations. The only sustainable approach is building privacy-by-design from the start, with consent management, opt-out suppression lists baked into every data request, and audit trails that can reconstruct the chain of consent on demand.

What does next-generation identity resolution look like?

Next-generation identity resolution combines AI-powered match scoring with privacy-safe infrastructure and real-time refresh. Instead of static batch files updated monthly, modern graphs refresh continuously as identifiers change. AI-driven confidence scoring replaces rule-based thresholds, improving accuracy on thin-signal populations. And privacy engineering is built in from the start, not bolted on: consent propagation, deletion workflows, and permitted-use controls are infrastructure-level capabilities, not compliance afterthoughts.

The Future of the Identity Graph in 2026

When a consumer clicks an ad on their phone, browses a product page on their laptop, and then walks into a store three days later, they leave behind three separate breadcrumb trails. Without identity resolution, those trails belong to three different strangers. With it, they belong to the same person, and your marketing can reflect that.

Identity resolution is the technology and process of connecting disparate data points (email addresses, device IDs, postal addresses, phone numbers, browser cookies) to a single, persistent consumer profile. It sounds straightforward, but doing it accurately, at scale, and in a privacy-compliant way is one of the hardest problems in modern marketing.

Why Identity Resolution Matters Now More Than Ever

Third-party cookies are blocked or restricted by default in Safari and Firefox. Apple's App Tracking Transparency (ATT) framework makes the iOS advertising ID available only when a user opts in, so mobile device IDs are no longer a reliable join key. Meanwhile, consumer journeys have fragmented across more channels than ever: connected TV, retail media, social, search, email, SMS, in-store.

The brands that thrive in this environment are those with a first-party identity foundation: a clean, persistent, consented view of their customers that doesn't depend on browser cookies or platform intermediaries. Identity resolution is how you build that foundation.

The Three Layers of Identity Resolution

Deterministic matching connects records based on exact shared identifiers: if the same email address appears in two systems, both records are treated as the same person. This is the most accurate method, but it requires consumers to have actively shared the same identifier across touchpoints.
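
One way to picture deterministic matching is as a union-find (disjoint-set) pass over records, where any exact shared identifier merges two records into one profile. This is a minimal sketch under assumptions: the record layout, the `normalize_email` rule, and the `resolve_deterministic` function are illustrative, not a description of how any particular provider implements it.

```python
from collections import defaultdict


def normalize_email(email):
    # Assumed normalization: trim whitespace and lowercase.
    return email.strip().lower()


def resolve_deterministic(records):
    """Group record IDs that share an exact email or phone identifier."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for rec in records:
        rec_node = ("record", rec["id"])
        find(rec_node)  # register records that carry no identifier
        if rec.get("email"):
            union(rec_node, ("email", normalize_email(rec["email"])))
        if rec.get("phone"):
            union(rec_node, ("phone", rec["phone"]))

    groups = defaultdict(set)
    for node in list(parent):
        if node[0] == "record":
            groups[find(node)].add(node[1])
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records from three systems.
records = [
    {"id": "crm-1", "email": "Ana@Example.com"},
    {"id": "web-7", "email": " ana@example.com", "phone": "+15555550100"},
    {"id": "pos-3", "phone": "+15555550100"},
    {"id": "crm-2", "email": "bo@example.com"},
]
profiles = resolve_deterministic(records)
print(profiles)
```

Note that `pos-3` never shares an email with `crm-1`; the two end up in one profile only because `web-7` carries both identifiers. That transitive behavior is why a single bad link can merge unrelated people, and why lineage matters.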

Probabilistic matching uses statistical inference: if a device at 42 Oak Street, in the same household as a known customer, visits your site at 9 PM, that's likely the same person or household. Probabilistic matches enable scale that deterministic alone cannot reach, but they introduce a confidence variable that needs to be managed.
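
To make the "confidence variable" concrete, here is a toy scorer in the style of Fellegi-Sunter record linkage (equivalently, naive Bayes over agreement signals). It is a swapped-in textbook technique, not BIGDBM's method. The signal names, the m/u probabilities, and the prior are all made-up assumptions, and the signals are treated as conditionally independent, which real data rarely satisfies.

```python
import math

# Hypothetical per-signal probabilities:
#   m = P(signal agrees | same person), u = P(signal agrees | different people)
SIGNALS = {
    "same_household_address": (0.90, 0.01),
    "same_ip_evening": (0.70, 0.05),
    "same_surname": (0.95, 0.02),
}


def match_probability(agreements, prior=0.001):
    """Combine per-signal evidence into a posterior match probability."""
    log_odds = math.log(prior / (1 - prior))
    for name, agrees in agreements.items():
        m, u = SIGNALS[name]
        if agrees:
            log_odds += math.log(m / u)  # agreement raises the odds
        else:
            log_odds += math.log((1 - m) / (1 - u))  # disagreement lowers them
    return 1 / (1 + math.exp(-log_odds))


def confidence_index(agreements, prior=0.001):
    """Scale the posterior probability to a 0-100 index."""
    return round(100 * match_probability(agreements, prior))


strong = confidence_index({
    "same_household_address": True,
    "same_ip_evening": True,
    "same_surname": True,
})
weak = confidence_index({
    "same_household_address": True,
    "same_ip_evening": True,
    "same_surname": False,
})
print(strong, weak)
```

The point of the sketch is the shape of the calculation: each signal moves the odds up or down, and one disagreeing signal can pull an otherwise plausible match well below any reasonable activation threshold.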

Scored resolution, the approach BIGDBM uses, assigns a confidence index to every identity link. Instead of a binary "match / no match," you get a number from 0 to 100 that tells you how confident the system is. This lets downstream teams choose their own precision-vs-scale tradeoff: for example, use only 90+ scores for high-stakes suppression and open up to 70+ for broad prospecting.

What Lives in an Identity Graph

An identity graph is the persistent store of all those resolved connections. A well-built graph links:

• People: name, DOB, gender
• Contacts: verified email addresses, phone numbers (mobile and landline)
• Households: address history, household composition
• Devices: mobile ad IDs, hashed device fingerprints
• Digital identifiers: hashed emails (SHA-256, MD5), RampIDs, UID 2.0 tokens
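
For the hashed emails in the last bullet, a minimal sketch using Python's standard `hashlib` looks like this. The normalization rule (trim and lowercase) is an assumption; partners often publish their own rules, and both sides must apply the same one or the hashes will not match.

```python
import hashlib


def normalize_email(email):
    # Assumed normalization: trim whitespace and lowercase.
    return email.strip().lower()


def hash_email(email, algorithm="sha256"):
    """Return the hex digest of the normalized email address."""
    data = normalize_email(email).encode("utf-8")
    return hashlib.new(algorithm, data).hexdigest()


sha = hash_email(" Ana@Example.com ")
md5 = hash_email(" Ana@Example.com ", algorithm="md5")
print(len(sha), len(md5))
```

MD5 appears here only because some legacy systems still exchange it; it is cryptographically broken, and a hashed email of either kind is a pseudonymous identifier rather than an anonymous one.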

The quality of the graph is only as good as the freshness and sourcing of its inputs. Stale addresses, recycled phone numbers, and inferred rather than observed connections all degrade match quality. Audit-friendly lineage, knowing exactly where each link came from, is what separates a defensible graph from a black box.
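
A sketch of what link-level lineage could look like as a data structure: each link carries its source, whether it was observed or inferred, and when it was last seen. The `Link` fields, the source names, and the 180-day freshness cutoff are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Link:
    profile_id: str
    identifier: str
    source: str      # where the link came from (hypothetical source names)
    observed: bool   # True if directly observed, False if inferred
    last_seen: date


def is_defensible(link, today, max_age_days=180):
    """A link passes if it was observed and seen recently enough."""
    fresh = (today - link.last_seen) <= timedelta(days=max_age_days)
    return link.observed and fresh


today = date(2026, 1, 15)
links = [
    Link("p1", "email:ana@example.com", "crm_signup", True, date(2025, 12, 1)),
    Link("p1", "phone:+15555550100", "modeled_household", False, date(2025, 12, 1)),
    Link("p1", "addr:42 Oak Street", "loyalty_form", True, date(2023, 3, 10)),
]
kept = [link.identifier for link in links if is_defensible(link, today)]
print(kept)
```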

Privacy Compliance Is Not Optional

CCPA, GDPR, and a growing patchwork of US state privacy laws require that consumers can find out what data is held about them and request deletion. This is operationally hard at the identity graph level: if a consumer's email appears in 14 records connected to three device IDs, a deletion request means finding and suppressing all 14 records along with every link to those three devices.
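
The propagation problem can be sketched as a graph traversal: start from the identifier named in the request and collect everything reachable from it. The node naming scheme and the `connected_nodes` helper below are hypothetical, and the data simply mirrors the 14-record, three-device example.

```python
from collections import deque


def connected_nodes(edges, start):
    """Return every node reachable from `start` via breadth-first search."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical graph: one email, 14 records, 3 devices, plus an unrelated consumer.
email = "email:ana@example.com"
edges = [(email, f"record:{i}") for i in range(1, 15)]
edges += [("record:1", "device:a"), ("record:5", "device:b"), ("record:9", "device:c")]
edges += [("email:bo@example.com", "record:99")]

to_suppress = connected_nodes(edges, email)
records = sorted(n for n in to_suppress if n.startswith("record:"))
devices = sorted(n for n in to_suppress if n.startswith("device:"))
print(len(records), "records and", len(devices), "devices to suppress")
```

A naive traversal like this would over-delete in a real graph, because shared nodes such as a household device connect more than one person. A production workflow would likely need to stop at person boundaries and remove only the requester's links to shared nodes.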

The only sustainable approach is building privacy-by-design from the start: consent management, opt-out suppression lists baked into every data request, permitted-use controls that restrict data to allowed purposes, and audit trails that can reconstruct the chain of consent on demand.
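
A minimal sketch of what "baked into every data request" could mean in code: a single gateway that checks permitted use, applies opt-out suppression, and logs the outcome before any data leaves. The `DataGateway` class, dataset names, and purposes are hypothetical. The log here records requests only; a real consent trail would also need to store the underlying consent records.

```python
from datetime import datetime, timezone


class DataGateway:
    """Hypothetical gateway that every data request must pass through."""

    def __init__(self, opt_outs, permitted_uses):
        self.opt_outs = set(opt_outs)
        self.permitted_uses = permitted_uses  # dataset name -> allowed purposes
        self.audit_log = []

    def _log(self, dataset, purpose, outcome, released, suppressed):
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "dataset": dataset,
            "purpose": purpose,
            "outcome": outcome,
            "released": released,
            "suppressed": suppressed,
        })

    def request(self, dataset, purpose, profile_ids):
        # Permitted-use control: refuse purposes the dataset was not cleared for.
        if purpose not in self.permitted_uses.get(dataset, set()):
            self._log(dataset, purpose, "denied", 0, 0)
            raise PermissionError(f"{purpose!r} is not a permitted use of {dataset!r}")
        # Opt-out suppression runs on every request, not as a separate step.
        released = [p for p in profile_ids if p not in self.opt_outs]
        self._log(dataset, purpose, "released", len(released),
                  len(profile_ids) - len(released))
        return released


gateway = DataGateway(
    opt_outs={"p2"},
    permitted_uses={"crm_audience": {"email_marketing"}},
)
released = gateway.request("crm_audience", "email_marketing", ["p1", "p2", "p3"])
print(released)
```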

How to Evaluate an Identity Provider

When assessing a partner, ask these five questions:

1. What is the match rate on my own first-party file? Run a test with a known segment. Match rate tells you coverage; confidence score distribution tells you quality.

2. How often is the graph refreshed? Phone numbers and addresses change constantly. A graph that refreshes quarterly will degrade fast in high-churn segments like renters or mobile-heavy demographics.

3. Can you show lineage? Every link should trace back to an observed source, not just an inference. If the provider can't explain a match, you can't defend it.

4. How are opt-outs handled end-to-end? Suppression needs to propagate from the graph through all downstream activations in near-real-time, not in a monthly batch.

5. What happens when the data is wrong? Errors happen. A mature provider has a dispute resolution process and a commitment to correcting confirmed mistakes promptly.
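
For question 1, the difference between coverage and quality can be made concrete with a small report over a test file. The `match_report` function and the band boundaries are assumptions chosen for illustration; the scores are made up.

```python
def match_report(total_records, scores):
    """Return (match_rate, counts per confidence band) for a test file."""
    bands = (("90-100", 90), ("70-89", 70), ("0-69", 0))  # hypothetical bands
    counts = {label: 0 for label, _ in bands}
    for score in scores:
        for label, floor in bands:  # bands are ordered highest floor first
            if score >= floor:
                counts[label] += 1
                break
    match_rate = len(scores) / total_records if total_records else 0.0
    return match_rate, counts


# Hypothetical test: 10 records submitted, 7 matched with these scores.
rate, counts = match_report(10, [95, 92, 88, 75, 71, 60, 30])
print(f"match rate {rate:.0%}", counts)
```

Two providers can report the same match rate while one delivers most of its matches in the lowest band, which is why the distribution is worth asking for alongside the headline number.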

The Bottom Line

Identity resolution isn't a feature you buy once and forget. It's an ongoing capability that requires high-quality source data, transparent methodology, privacy engineering, and a partner willing to be accountable for the accuracy of every connection.

Done right, it turns fragmented signals into a coherent customer view, one that survives the deprecation of cookies, respects consumer preferences, and makes every downstream marketing dollar work harder.