Enigma Data
The Enigma data platform delivers intelligence on every U.S. business—from sole proprietorships to multinationals—powered by graph-model-1, a knowledge graph with 2.4 billion+ nodes mapping the U.S. business landscape.
Jump to:
- Data Model—how businesses are represented
- Data Sources—where the data comes from
- Data Quality—how accuracy is validated
- Data Delivery—how to access the data
Data Model
What exactly is a business? It's not as simple as it seems. Beneath familiar names and storefronts lies a web of identities, relationships, and operations that defy simple definitions:
- What is a business's name? (Legal names, trade names, and more.)
- How do franchises or multi-brand corporations fit into a single definition?
- What about businesses sharing addresses or operating multiple locations?
The Enigma data model is built upon four Core Entity Types:
- Brands: How a business presents itself to customers.
- Legal Entities: How a business is recognized by the government.
- Operating Locations: Sites where a business conducts its activities.
- Persons: The individuals linked to a business—owners, officers, registered agents, and other key personnel.
These are connected to one another through multiple Relationships. Entities and relationships are created when Enigma observes activity from the business based on real-world records.
Brands
A brand is the face of a business as seen by its customers. It includes trade names, logos, and marketing identities. Brands may:
- Operate across multiple locations.
- Be owned by one or more legal entities.
- Coexist with other brands under a shared corporate structure.
Example: Starbucks is a global brand known for coffee shops. Each store represents the same brand but is tied to distinct operating locations and potentially different legal entities in each country.
Legal Entities
Legal entities are how businesses are recognized by governments and regulatory bodies.
They are responsible for taxation, compliance, and legal accountability.
Two distinct types of Legal Entities exist in the Enigma data model: Registered Entity and Person. (In object-oriented terms, Legal Entity can be thought of as a common abstract superclass of Person and Registered Entity.)
In the Enigma data model every Person or Registered Entity is related to one Legal Entity. This is because every Person or Registered Entity is itself a Legal Entity.
We represent both Registered Entities and Persons as Legal Entities because they share common traits. For example, they can each own property, are subject to legal liability and are required to pay taxes. Even though we don't agree that "Corporations are people", in the Enigma data model both corporations and people are Legal Entities.
You can distinguish between the two by inspecting the legalEntityType field on a LegalEntityName node—it will be "Person" for natural persons and a registration type (e.g., "LLC", "Corporation") for registered entities.
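As a minimal sketch of that check, assuming a LegalEntityName node has already been retrieved and is available as a plain dictionary: the legalEntityType field comes from the description above, while the dictionary shape, the name key, and the helper function are hypothetical and used only for illustration.

```python
from typing import Any, Mapping

# Value described above for natural persons; any other value is treated
# here as a registration type such as "LLC" or "Corporation".
PERSON_TYPE = "Person"


def is_natural_person(legal_entity_name: Mapping[str, Any]) -> bool:
    """Return True when the node's legalEntityType marks a natural person."""
    return legal_entity_name.get("legalEntityType") == PERSON_TYPE


# Illustrative nodes only; the "name" key is an assumption, not a documented field.
nodes = [
    {"name": "Starbucks Corporation", "legalEntityType": "Corporation"},
    {"name": "Brian Niccol", "legalEntityType": "Person"},
]

for node in nodes:
    kind = "Person" if is_natural_person(node) else "Registered Entity"
    print(f"{node['name']}: {kind}")
```

A node with no legalEntityType at all is treated as not a person in this sketch; a real integration may prefer to flag such records for review instead.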
Registered Entities
Registered Entities are a class of Legal Entities that are registered with the government. LLCs and Corporations are typical examples of Registered Entities. Each Registered Entity is associated with one or more Registrations. A Registration is a filing of the entity with a Secretary of State for recognition.
Example: Starbucks Corporation is the registered entity that owns the Starbucks brand and in which shareholders own equity.
Persons and Roles
Persons are Legal Entities that represent the flesh-and-blood individuals that fill the roles that allow a business to function. As this description suggests, Person is tightly coupled to roles. Typically a Person is not directly associated with a particular Brand, Operating Location or Registered Entity but is related through a Role. A Role indicates what kind of responsibilities a Person performs for a business.
A single Person may have multiple roles within the same business and may have roles with multiple businesses. For example, we would consider a person who is the CEO and Founder of a business to have two distinct Roles:
- CEO
- Founder
Even if a new CEO took over, the Person would retain their Founder role.
Example: Brian Niccol is the CEO of Starbucks Corporation. In the Enigma data model Brian Niccol is the Person, Starbucks is the Brand and Brian Niccol has the Role of CEO at the Brand Starbucks.
Example: Jack Dorsey is the Founder of Twitter and the Founder and CEO of Block, Inc. In the Enigma data model this is represented as three distinct roles: Founder -> Twitter, Founder -> Block, CEO -> Block.
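The idea that each Role is its own record, independent of other roles held by the same Person, can be sketched as follows. This is an illustrative in-memory model, not the Enigma schema: the Role class, its field names, and the helper functions are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """One Person holding one title at one business (hypothetical shape)."""
    person: str
    title: str
    business: str


# The Jack Dorsey example from the text, expressed as three distinct roles.
roles = [
    Role("Jack Dorsey", "Founder", "Twitter"),
    Role("Jack Dorsey", "Founder", "Block, Inc."),
    Role("Jack Dorsey", "CEO", "Block, Inc."),
]


def titles_at(person: str, business: str, all_roles: list[Role]) -> list[str]:
    """List the titles a person holds at a given business."""
    return [r.title for r in all_roles
            if r.person == person and r.business == business]


def end_role(all_roles: list[Role], ended: Role) -> list[Role]:
    """Return a new list without the ended role; other roles are untouched."""
    return [r for r in all_roles if r != ended]


print(titles_at("Jack Dorsey", "Block, Inc.", roles))

# Ending the CEO role leaves the Founder role in place.
remaining = end_role(roles, Role("Jack Dorsey", "CEO", "Block, Inc."))
print(titles_at("Jack Dorsey", "Block, Inc.", remaining))
```

Keeping one record per role, rather than a list of titles on the Person, is what lets one role end without affecting the others.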
Sole Proprietorships
Many businesses are structured as sole proprietorships or partnerships, so no Registered Entity is involved. Sole proprietorships and partnerships still involve Legal Entities, though, because the Person(s) who own the business are themselves Legal Entities. Compare the ownership paths for sole proprietorships and registered businesses:
- Sole Prop: Person -> Legal Entity -> Brand
- Registered Business: Person -> Legal Entity -> Role -> Registered Entity -> Legal Entity -> Brand
This distinction matters in practice: when you need to identify the owners or principals of a business, you are looking for the persons who hold roles at a legal entity, as surfaced through Secretary of State (SoS) registration filings. See Finding the Principals of a Business for a complete walkthrough.
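The two ownership paths can be illustrated with a small graph traversal. This is a sketch over a toy adjacency map, assuming node identifiers of the form "Type:name"; the businesses, people, and identifier format are all made up for illustration and do not reflect how the Enigma graph is stored or queried.

```python
from collections import deque
from typing import Optional

# Toy graph: each key points to the nodes it connects to. All names are fictional.
EDGES = {
    # Sole proprietorship: Person -> Legal Entity -> Brand
    "Person:Ana": ["LegalEntity:ana"],
    "LegalEntity:ana": ["Brand:Ana's Hair Studio"],
    # Registered business:
    # Person -> Legal Entity -> Role -> Registered Entity -> Legal Entity -> Brand
    "Person:Sam": ["LegalEntity:sam"],
    "LegalEntity:sam": ["Role:CEO"],
    "Role:CEO": ["RegisteredEntity:Acme LLC"],
    "RegisteredEntity:Acme LLC": ["LegalEntity:acme"],
    "LegalEntity:acme": ["Brand:Acme Coffee"],
}


def ownership_path(start: str, edges: dict[str, list[str]]) -> Optional[list[str]]:
    """Breadth-first search from a Person to the first Brand reached.

    Returns the entity types along the path, or None if no Brand is reachable.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node.startswith("Brand:"):
            return [n.split(":", 1)[0] for n in path]
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(" -> ".join(ownership_path("Person:Ana", EDGES)))
print(" -> ".join(ownership_path("Person:Sam", EDGES)))
```

The presence of a Role node on the path is what separates a registered business from a sole proprietorship in this sketch.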
Operating Locations
Operating locations represent the physical or virtual spaces where a business interacts with customers or conducts activities. Locations connect brands to legal entities, grounding abstract concepts in specific places.
Example: a single Starbucks store at 123 Main St. is an operating location. It's tied to the Starbucks brand and operates under a local legal entity.
Relationships Between Entities
The Enigma data model captures the relationships that connect these core entities, such as:
- Brand-to-Location: Which brands operate at which locations.
- Brand-to-Legal Entity: Which legal entities own or manage a brand.
- Location-to-Legal Entity: Which legal entities are responsible for specific operating locations.
- Person-to-Legal Entity (person_is_instance_of_legal_entity): A natural person who is an instance of a legal entity; used to identify sole proprietors and individuals with legal standing in relation to a business.
- RegisteredEntity-to-LegalEntity (registered_entity_is_instance_of_legal_entity): A formally registered business organization that is an instance of a legal entity.
- Registration-to-Role: Secretary of State filings name specific individuals or entities in roles (owner, officer, registered agent). This is the authoritative path for identifying who controls or operates a business.
The Complex Nature of Businesses
Businesses often include:
- Multiple Legal Entities: A company may establish separate legal entities for different locations or functions.
- Multiple Brands: Corporations like Gap Inc. operate distinct brands like Old Navy and Banana Republic.
- Affiliated Brands: Dealers (for example, Curry Honda) or co-locations (for example, Sephora at JCPenney).
- Franchises: Independent operators under a shared brand, such as McDonald's franchisees.
- Agents and Professionals: Individuals operating under umbrella brands (for example, "James Lavelle, State Farm Agent").
- Medical Providers: Patients often seek specific doctors who work within practices owned by larger health systems, blending individual and institutional branding.
- Persons as Brands: In services like hairstyling or therapy, individuals are the brand.
- Legal Entities as Brands: Some businesses use their legal name as their brand.
Data Sources
Enigma data is built on four primary source types that together power ground truth business identity.
Government sources form the authoritative foundation. Corporate registrations, Secretary of State filings, franchise disclosures, and professional licenses establish the legal identity map of U.S. businesses. Because every record originates from a government body, provenance is clear—making this layer essential for KYB, compliance, and due diligence workflows.
Online sources capture the operational reality of a business—websites, directories, review platforms, and social profiles. This data is inherently dynamic and often reflects changes in hours, location, or status before they appear in any official filing. That makes online sources a critical early signal layer, keeping entity profiles current between government record updates.
Third-party providers extend coverage with targeted insights such as enriched firmographics and contact information. Rather than being treated as ground truth, third-party data is weighted and reconciled against other sources to extend attribute breadth without compromising data integrity.
Card panel data is sourced from a consortium of issuer banks aggregating actual card transaction data at the merchant level. This delivers real revenue trends, transaction volumes, customer counts, and growth rates derived from what businesses actually earn—across 750M+ credit and debit cards in the U.S.
Data Quality
Enigma validates data quality continuously against labeled ground truth data. Key benchmarks:
| Area | Metric |
|---|---|
| Entity linking—brands to legal entities | 95% precision |
| Entity linking—brands to operating locations | 94% of brands have complete location links |
| Industry classification—NAICS code assignments | 98% precision |
| Location data—operating location status and addresses | 95% precision |
| Card revenue estimates—within ±30% of actual values | 67% of brands |
For a detailed walkthrough of the Enigma data quality methodology, see the Enigma Data Quality whitepaper. For card revenue accuracy specifics, see Understanding Card Revenue Data.
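To make the "within ±30% of actual values" row concrete, the sketch below shows one way such a share could be computed from pairs of estimated and actual revenue. It is an assumption about how the metric is defined, not Enigma's published methodology; the function name and the sample figures are invented for illustration.

```python
def share_within_tolerance(pairs, tolerance=0.30):
    """Fraction of (estimate, actual) pairs where the estimate falls
    within +/- tolerance of the actual value."""
    if not pairs:
        raise ValueError("at least one (estimate, actual) pair is required")
    hits = sum(
        1 for estimate, actual in pairs
        if abs(estimate - actual) <= tolerance * abs(actual)
    )
    return hits / len(pairs)


# Made-up revenue figures (estimate, actual), for illustration only.
sample = [
    (100.0, 120.0),  # off by 20, allowed 36  -> within tolerance
    (50.0, 100.0),   # off by 50, allowed 30  -> outside tolerance
    (210.0, 200.0),  # off by 10, allowed 60  -> within tolerance
]

print(f"{share_within_tolerance(sample):.0%} of sample brands within ±30%")
```

Note that the tolerance is taken relative to the actual value, so the same absolute error counts for more on a small business than on a large one.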
Data Delivery
The output of the Enigma data pipeline is available through three delivery channels:
- Bulk file delivery: for customers who want to work with the full dataset. Files are delivered in CSV or Parquet format to the Enigma Console, your S3 bucket, or an SFTP server. Best for list generation, market analysis, and offline enrichment workflows.
- Enigma API: for real-time query and enrichment use cases. The GraphQL API lets you search and retrieve entity data programmatically, record by record, and integrate it directly into your applications or underwriting workflows. See the GraphQL API guide.
- Agentic workflows: for automated decisioning and orchestration. The Enigma MCP server exposes entity data as tools that AI agents can call directly, enabling natural language queries and automated enrichment pipelines. See the AI & MCP integrations guide.