In the high-stakes worlds of law enforcement, intelligence, and financial crime prevention, information is power. But what happens when that information is buried in millions of pages of unstructured text, like emails, scanned reports, and PDFs, and a constant stream of new data arrives every week? The real bottleneck is not ingestion — it's the time it takes to transform messy text into actionable, contextualized intelligence. Investigators don’t just need facts — they need relationships, timelines, and meaning. And they need it now.
Traditional approaches to analyzing unstructured data face two critical challenges: scale and the inability to extract context. Most legacy tools rely on basic indexing and keyword search. These techniques can help locate a specific word or entity type, but they miss the bigger picture: the nuance of language, the relationships between entities of any type (people, places, documents, or accounts), and the reasons those relationships matter in context. This limitation becomes a serious bottleneck at the data volumes organizations handle today. On very large datasets, even more advanced methods can break down, leaving analysts to piece together fragments of information manually, a time-consuming and error-prone process.
This is why advanced AI, specifically designed for contextual entity and relationship extraction, is critical for financial crime compliance, anti-money laundering (AML) operations, and fraud prevention, as well as for intelligence and law enforcement agencies.
DataWalk goes beyond simple text analysis. It’s a comprehensive solution that swiftly transforms raw, unstructured data into an ontology-driven knowledge graph ready for immediate insights and analysis.
Figure 1: Entities and relationships extracted and shown in DataWalk’s Universe Viewer.
Being ontology-driven means that DataWalk uses a structured framework—a predefined map of concepts, entities (like people, locations, and companies), and their relationships. DataWalk’s AI aligns its extraction process with this map, understanding the meaning and context behind the data rather than simply pulling out keywords. This allows new intelligence to be seamlessly added to an existing knowledge model without requiring it to be rebuilt from scratch.
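To make the idea of an ontology as a "predefined map" concrete, here is a minimal sketch in Python. It is not DataWalk's API or data model; the names ENTITY_TYPES, RELATIONSHIP_TYPES, Entity, Relationship, and conforms are all hypothetical and chosen only for illustration. The sketch shows how an extracted relationship could be checked against the entity and relationship types an ontology allows.

```python
from dataclasses import dataclass

# Hypothetical ontology: the entity types it knows about, and which
# relationship types are allowed between which entity types.
ENTITY_TYPES = {"Person", "Company", "Location"}
RELATIONSHIP_TYPES = {
    # relationship name: (source entity type, target entity type)
    "CEO_OF": ("Person", "Company"),
    "LOCATED_IN": ("Company", "Location"),
}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str


@dataclass(frozen=True)
class Relationship:
    source: Entity
    name: str
    target: Entity


def conforms(rel: Relationship) -> bool:
    """Return True if the relationship fits the ontology defined above."""
    if rel.source.type not in ENTITY_TYPES or rel.target.type not in ENTITY_TYPES:
        return False
    allowed = RELATIONSHIP_TYPES.get(rel.name)
    if allowed is None:
        return False  # relationship type is not part of the ontology
    return (rel.source.type, rel.target.type) == allowed


john = Entity("John Doe", "Person")
acme = Entity("Acme Corp", "Company")

print(conforms(Relationship(john, "CEO_OF", acme)))  # True
print(conforms(Relationship(acme, "CEO_OF", john)))  # False: direction is wrong
```

Because the ontology is plain data in this sketch, adding a new entity or relationship type means adding an entry rather than rebuilding the model, which is the property the paragraph above describes.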
Here’s a closer look at the unique processes that set DataWalk apart for unstructured data analysis:
Context and Relationship Extraction:
DataWalk’s AI processes the context surrounding entities to automatically identify and extract relationships. For example, it can determine not only that "John Doe" and "Acme Corp" are mentioned, but also that "John Doe is the CEO of Acme Corp."
Figure 2: Example of relationships extracted by DataWalk’s AI
Entity Resolution and Matching:
As data is processed, DataWalk not only resolves and matches entities using standard techniques like name and attribute matching, but also applies inferencing to connect less obvious cases. For example, even if "John Doe" and "J. Doe" don’t share exact identifiers, the system may infer they are the same person if both are linked to the same address, organization, and phone number. By combining direct matching with intelligent inference, DataWalk creates a single, unified view of each entity—even when the connections are subtle or incomplete.
Figure 3: DataWalk inferencing
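The "John Doe" and "J. Doe" example can be sketched as a simple rule: treat two records as the same person when their names are compatible and they share enough supporting attributes. This is an illustrative assumption, not a description of how DataWalk resolves entities; Record, name_compatible, shared_attributes, likely_same, and the min_shared threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    address: str
    organization: str
    phone: str


def name_compatible(a: str, b: str) -> bool:
    """Same surname and same first initial, e.g. 'John Doe' vs 'J. Doe'."""
    parts_a = a.replace(".", "").split()
    parts_b = b.replace(".", "").split()
    if not parts_a or not parts_b:
        return False
    same_surname = parts_a[-1].lower() == parts_b[-1].lower()
    same_initial = parts_a[0][0].lower() == parts_b[0][0].lower()
    return same_surname and same_initial


def shared_attributes(a: Record, b: Record) -> int:
    """Count non-empty attributes that match after simple normalization."""
    count = 0
    for field_name in ("address", "organization", "phone"):
        value_a = getattr(a, field_name).strip().lower()
        value_b = getattr(b, field_name).strip().lower()
        if value_a and value_a == value_b:
            count += 1
    return count


def likely_same(a: Record, b: Record, min_shared: int = 3) -> bool:
    """Infer a match from a compatible name plus enough shared attributes."""
    return name_compatible(a.name, b.name) and shared_attributes(a, b) >= min_shared


r1 = Record("John Doe", "1 Main St", "Acme Corp", "555-0100")
r2 = Record("J. Doe", "1 main st", "Acme Corp", "555-0100")
r3 = Record("Jane Doe", "9 Elm St", "Acme Corp", "555-0199")

print(likely_same(r1, r2))  # True: compatible name, three shared attributes
print(likely_same(r1, r3))  # False: only the organization is shared
```

A production system would likely weigh attributes differently and use fuzzier normalization; the fixed threshold here only illustrates combining name matching with inference from shared context.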
Inferencing and Pattern-Based Risk Scoring:
Beyond resolving entities, DataWalk’s inferencing engine automatically uncovers hidden patterns and relationships that are not explicitly stated in the source data. For instance, if two individuals are linked to the same physical address, work history, or IP activity, the system may infer a possible association—even if it's not directly mentioned. These inferred connections can then trigger pattern detection rules that identify suspicious behaviors and dynamically adjust risk scores. This enables a continuous view of emerging threats and complex criminal structures, empowering analysts to act before issues escalate.
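As a rough illustration of this flow, the sketch below infers links between people who share an attribute value and then raises the risk score of anyone directly linked to an already flagged person. The data, the function names infer_links and update_risk, and the fixed bump of 10 points are hypothetical assumptions, not DataWalk's actual rules or scoring model.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical facts: (person, attribute kind, attribute value)
facts = [
    ("Alice", "address", "12 Harbor Rd"),
    ("Bob", "address", "12 Harbor Rd"),
    ("Bob", "ip", "203.0.113.7"),
    ("Carol", "ip", "203.0.113.7"),
    ("Dave", "address", "4 Hill Ln"),
]


def infer_links(facts):
    """Link every pair of people who share the same attribute value."""
    people_by_value = defaultdict(set)
    for person, kind, value in facts:
        people_by_value[(kind, value)].add(person)
    links = set()
    for (kind, _value), people in people_by_value.items():
        for a, b in combinations(sorted(people), 2):
            links.add((a, b, f"shared_{kind}"))
    return links


def update_risk(base_scores, links, flagged, bump=10):
    """Raise the score of anyone directly linked to a flagged person."""
    scores = dict(base_scores)
    for a, b, _reason in links:
        if a in flagged and b not in flagged:
            scores[b] = scores.get(b, 0) + bump
        if b in flagged and a not in flagged:
            scores[a] = scores.get(a, 0) + bump
    return scores


links = infer_links(facts)
scores = update_risk(
    {"Alice": 80, "Bob": 5, "Carol": 5, "Dave": 5}, links, flagged={"Alice"}
)
print(sorted(links))
print(scores)
```

In this sketch only Bob's score rises, because he is one hop from the flagged person; Carol is linked to Bob but not to Alice. Whether and how far risk should propagate is a design choice that a real rule set would need to define.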
Full Lineage:
Every step of the process is logged, creating a full audit trail. This is crucial for regulatory compliance and investigations, providing full transparency on how a piece of information was derived.
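One way to picture such an audit trail is a log in which each derived fact records the step that produced it and the inputs it was derived from, so any fact can be traced back to its sources. The sketch below is a simplified assumption for illustration only; LineageEntry, LineageLog, and the identifiers used are hypothetical and do not reflect DataWalk's internal logging.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageEntry:
    fact_id: str
    step: str             # e.g. "extraction" or "inference"
    derived_from: tuple   # ids of source documents or earlier facts
    timestamp: str


class LineageLog:
    """Keeps one lineage entry per fact and answers 'where did this come from?'."""

    def __init__(self):
        self._entries = {}

    def record(self, fact_id, step, derived_from):
        entry = LineageEntry(
            fact_id,
            step,
            tuple(derived_from),
            datetime.now(timezone.utc).isoformat(),
        )
        self._entries[fact_id] = entry
        return entry

    def trace(self, fact_id):
        """Return the entries for a fact and everything it was derived from."""
        chain, stack, seen = [], [fact_id], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            entry = self._entries.get(current)
            if entry is None:
                continue  # a raw source, such as a document id
            chain.append(entry)
            stack.extend(entry.derived_from)
        return chain


log = LineageLog()
log.record("f1", "extraction", ["doc-17"])
log.record("f2", "extraction", ["doc-42"])
log.record("f3", "inference", ["f1", "f2"])

for entry in log.trace("f3"):
    print(entry.fact_id, entry.step, entry.derived_from)
```

Tracing the inferred fact f3 returns its own entry plus the two extraction entries it depends on, each pointing at a source document.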
One European intelligence agency processes hundreds of new text-based data sources every week, often losing valuable context in the process.
With constantly changing and highly diverse information, manually building and updating ontologies proved impractical and unsustainable. By using DataWalk to automatically transform raw, unstructured documents into a structured knowledge graph aligned with their existing ontology, the agency was able to preserve critical context, ensure consistency, and dramatically accelerate time-to-analysis. This shift enabled analysts to detect patterns earlier, connect intelligence faster, and stay ahead of emerging threats.
Figure 4: DataWalk’s real-world results in a nutshell
The agency put DataWalk AI to the test. A high-volume dataset containing 146 million characters was processed in just 34 minutes, yielding 948,314 entities such as people, organizations, transactions, and events. Crucially, DataWalk AI also extracted 691,078 relationships, providing a rich, contextual understanding of the data that was immediately ready for analysts to explore. This rapid, end-to-end process enabled the customer to maintain a timely and comprehensive view of their intelligence landscape, solidifying DataWalk as a mission-critical platform for turning raw data into real intelligence.
Figure 5: Entity and relationships displayed on a link chart
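For a sense of scale, the reported figures can be converted into approximate throughput. The short calculation below uses only the numbers quoted in the case study and assumes that "146 million characters" is exact and that the 34 minutes cover the whole run; the variable names are illustrative.

```python
# Figures quoted in the case study; "146 million" is treated as exact here.
characters = 146_000_000
entities = 948_314
relationships = 691_078
seconds = 34 * 60  # 34 minutes

chars_per_second = characters / seconds
entities_per_second = entities / seconds
relationships_per_second = relationships / seconds
relationships_per_entity = relationships / entities

print(f"{chars_per_second:,.0f} characters per second")
print(f"{entities_per_second:,.0f} entities per second")
print(f"{relationships_per_second:,.0f} relationships per second")
print(f"{relationships_per_entity:.2f} relationships per entity")
```

Under those assumptions, this works out to roughly 71,600 characters, 465 entities, and 339 relationships per second, or about 0.73 relationships per extracted entity.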
DataWalk’s inferencing engine automatically discovered new connections that were never explicitly stated in the source text, such as inferring a relationship between two people who work at the same address. These insights, combined with automated risk scoring that updated dynamically, gave the agency a comprehensive view of potential threats—a capability they couldn't achieve with their previous methods.
DataWalk delivered significant, quantifiable benefits when it came to unstructured data analysis:
About DataWalk: Transform Your Raw Data into Actionable Intelligence with AI
DataWalk is a scalable Graph & AI platform that transforms complex, disparate data into actionable insights for the most demanding analyses and investigations, including the analysis of unstructured data. DataWalk enables you to create a central, unified knowledge base through ontology and knowledge graphs, linking diverse data sources into a logical structure and enabling the automatic discovery of new relationships. With its high-speed data processing, low latency, and Composite AI capabilities, DataWalk enables organizations to analyze unstructured data comprehensively across massive datasets, uncover hidden connections, and make faster, more informed decisions. DataWalk also provides military-grade security, logging all activities for full traceability and audit trails and supporting compliance with regulatory requirements.
Markus Hartmann is an expert in leveraging advanced AI and graph technologies to transform complex unstructured data into actionable intelligence. His expertise lies in developing solutions that uncover hidden relationships and accelerate critical analysis for high-stakes applications like financial crime prevention and law enforcement.