Knowledge graphs have transformed how we represent complex real-world relationships. By providing an interconnected view of entities and their relationships, they improve our ability to understand and use data. Knowledge graphs are versatile across many domains, with the potential to organize, connect, and leverage information for improved decision-making, discovery, and user experiences. They are not a new category, but they have recently transformed areas such as crime detection, supply chain analysis, and entity resolution. In addition, they are being used alongside large language models (LLMs) to get more accurate and deterministic results.
Not all knowledge graphs are the same. There are two basic types: property graphs and Semantic Web graphs. In this paper, we provide a simple introduction to the key capabilities associated with each, and we also introduce a unique approach provided by DataWalk.
As implied by the name, a knowledge graph provides a representation of an organization’s knowledge. A knowledge graph organizes and visualizes an organization’s data in an intuitive “graph” structure based on entities and the relationships between them.
A knowledge graph aggregates data into an interconnected web to enhance understanding and discovery. Knowledge graphs provide a more flexible framework for data modeling than traditional relational databases. Relational databases represent technical references between tables, whereas knowledge graphs (KGs) go further and capture knowledge in the connections themselves. These include cause-and-effect relationships that are not typically captured in traditional databases. Knowledge graphs can be created using various approaches, including property graphs and the Semantic Web.
Typically, working with knowledge graphs requires technical skills (e.g., programming), though there are some exceptions.
A property graph consists of nodes, links, and properties. Both nodes and links are first-class entities that can carry their own properties. A property graph doesn’t force a strict ontology and allows you to create attributes dynamically. The accompanying figure illustrates an example of a simple property graph describing Alice and her role as a programmer at Google.
Generally speaking, a property graph is an efficient, high-performance representation. It is ideal for running graph algorithms at scale and is human-readable.
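As a rough illustration of the structure described above, the Alice example could be sketched in plain Python as follows. The class names, field names, the `WORKS_AT` label, and the `nickname` attribute are assumptions made for this sketch. They do not come from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, with free-form properties."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Link:
    """A directed relationship between two nodes, with its own properties."""
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


# The Alice example: both the nodes and the link carry properties.
nodes = {
    "alice": Node("alice", "Person", {"name": "Alice"}),
    "google": Node("google", "Company", {"name": "Google"}),
}
links = [Link("alice", "google", "WORKS_AT", {"role": "programmer"})]

# No fixed schema: a new attribute can be added at any time.
nodes["alice"].properties["nickname"] = "Al"  # hypothetical attribute


def describe(link: Link) -> str:
    """Render a link as a readable sentence."""
    src = nodes[link.source].properties["name"]
    dst = nodes[link.target].properties["name"]
    return f"{src} {link.label} {dst} as {link.properties['role']}"


print(describe(links[0]))
```

The role sits on the link itself rather than on either node, which is what keeps this representation compact and easy to read.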
The other key type of knowledge graph is based on Semantic Web. This is a World Wide Web Consortium standard for structuring data to be machine-readable using RDF (Resource Description Framework), OWL (Web Ontology Language), and SPARQL. All data is represented in triples, which consist of a subject, predicate, and object. The subject is the entity being described; the predicate describes the relationship between the subject and the object, and the object is the related entity. The figure illustrates the same example of Alice and her role as a programmer at Google, but now using Semantic Web.
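To make the triple structure concrete, here is a minimal sketch that stores the Alice example as plain (subject, predicate, object) tuples. It is only one possible modeling, and the figure may use different terms. The identifiers (`job1`, `heldBy`, `employer`, and so on) are assumptions for illustration. Real RDF would use IRIs and would typically be queried with SPARQL rather than with the small `match` helper shown here.

```python
# Each fact is one (subject, predicate, object) triple.
# The identifiers below are illustrative strings, not real RDF IRIs.
triples = {
    ("alice", "type", "Person"),
    ("google", "type", "Company"),
    ("job1", "type", "Job"),
    ("job1", "heldBy", "alice"),
    ("job1", "employer", "google"),
    ("job1", "title", "programmer"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "What do we know about job1?"
for triple in match(s="job1"):
    print(triple)
```

Note that a single property-graph link with one attribute has already become several triples here.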
The Semantic Web is highly flexible: the model can change as knowledge evolves. This type of knowledge graph requires a strict ontology, which makes it particularly adept at exchanging data between systems or combining data from different systems, since the definitions of the data elements are explicit. Graph algorithms can be run on Semantic Web graphs, but they don’t scale as well because the models can rapidly grow in size and complexity.
The differences between a property graph and the Semantic Web become clearer when we show how each model handles changes. Continuing our previous example of Alice at Google, we now want to indicate that Alice had two different managers at Google, Bob and Jane, and we want to include these managers in each of the models.
In the property graph, we could add one manager to the attributes in the link, but since we have no concept of time, we could only incorporate one manager. If we link each manager to Alice, we won’t know how they relate to Alice’s job. We would need to redesign the data model to incorporate the change. The lack of flexibility in the property graph model prevents us from efficiently representing the additional information.
Now, let’s add the information about Alice’s managers using Semantic Web. We can add triples for Bob and his job, Jane and her job, and the timing and reporting structure for each job. We could also add the missing triples for Jane and Bob’s Head of Engineering titles. As the figure illustrates, the Semantic Web is far more flexible than the property graph and easily accommodates the additional information into our model. However, triples quickly proliferate, which means Semantic Web KGs have lower performance than property graphs when handling large, complex datasets.
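The change described above can be sketched with plain tuples standing in for triples. The identifiers (`report1`, `period-A`, and so on) are hypothetical placeholders. The source gives no dates or ordering for the two managers, so none are assumed here.

```python
# Existing facts about Alice's job (illustrative identifiers, not real IRIs).
triples = {
    ("job1", "heldBy", "alice"),
    ("job1", "employer", "google"),
    ("job1", "title", "programmer"),
}

# New knowledge is added as extra triples; existing ones are untouched.
new_facts = {
    ("report1", "job", "job1"),
    ("report1", "manager", "bob"),
    ("report1", "period", "period-A"),  # placeholder, not a real date
    ("report2", "job", "job1"),
    ("report2", "manager", "jane"),
    ("report2", "period", "period-B"),  # placeholder, not a real date
    ("bob", "title", "Head of Engineering"),
    ("jane", "title", "Head of Engineering"),
}
before = len(triples)
triples |= new_facts


def managers_of(job):
    """Find managers via the reporting-structure nodes attached to a job."""
    reports = {s for s, p, o in triples if p == "job" and o == job}
    return sorted(o for s, p, o in triples if p == "manager" and s in reports)


print(managers_of("job1"))
print(before, "->", len(triples))
```

No existing triple had to be rewritten, but the triple count grows from 3 to 11 for two managers. That is the proliferation the paragraph above refers to.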
DataWalk combines the best characteristics of property graphs and the Semantic Web while eliminating the disadvantages of each. Like a property graph, DataWalk entities and links both have attributes. DataWalk also shares the efficient data representation of a property graph. In addition, DataWalk’s Semantic Web characteristics provide a flexible foundation that easily accommodates additional data and model evolution. The combination of flexibility and best-in-class performance is unique to DataWalk.
At first glance, the DataWalk knowledge graph looks like a property graph. But links can be treated as objects and can be linked like entities. So, Alice’s relationship with Google as a programmer can be linked to her manager. We can look at an example to see how the DataWalk knowledge graph, like the Semantic Web, can easily accommodate change.
Adding the information about Alice’s managers to DataWalk’s model shows how well it accommodates this change.
As the figure illustrates, we can link Bob, as Alice’s manager, directly to the link representing Alice’s job as a programmer. We can then link Jane’s role as Alice’s manager in the same way. As with the Semantic Web, the new information fits into the existing model. By contrast, a property graph does not offer this level of flexibility.
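The idea of a link that can itself be the endpoint of another link can be sketched generically as follows. This is not DataWalk’s API or internal data model. The `Item` class and the `MANAGES` and `WORKS_AT` labels are assumptions made purely to illustrate the concept in plain Python.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Item:
    """An entity or a link; a link is an Item with a source and a target."""
    label: str
    properties: dict = field(default_factory=dict)
    source: "Item | None" = None
    target: "Item | None" = None


alice = Item("Person", {"name": "Alice"})
google = Item("Company", {"name": "Google"})
bob = Item("Person", {"name": "Bob"})
jane = Item("Person", {"name": "Jane"})

# Alice's job is a link with its own attribute, as in a property graph.
job = Item("WORKS_AT", {"role": "programmer"}, source=alice, target=google)

# The target of each MANAGES link is the job link itself, not Alice.
manages = [
    Item("MANAGES", source=bob, target=job),
    Item("MANAGES", source=jane, target=job),
]


def managers_for(link):
    """Names of everyone whose MANAGES link points at the given link."""
    return sorted(m.source.properties["name"] for m in manages if m.target is link)


print(managers_for(job))
```

The original `WORKS_AT` link is left unchanged. The two managers are attached to it with two extra links, so the job context is preserved without redesigning the model.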
DataWalk’s approach includes other capabilities associated with the Semantic Web, including inference and adherence to an ontology if desired. Using a unique link-generating capability, DataWalk can apply inference rules. DataWalk can also use an ontology compatible with OWL so that DataWalk can exchange information with other systems using an OWL definition, either inside or outside your organization.
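As a generic illustration of rule-based link generation (not DataWalk’s actual mechanism, which the text does not detail), an inference rule can be thought of as a function that derives new links from existing ones. The rule, the labels, and the `job1` identifier below are hypothetical.

```python
# Facts as (source, label, target); "job1" stands for Alice's job link.
facts = {
    ("alice", "HOLDS", "job1"),
    ("bob", "MANAGES", "job1"),
    ("jane", "MANAGES", "job1"),
}


def infer_supervises(facts):
    """Rule: X MANAGES J and Y HOLDS J  =>  X SUPERVISES Y."""
    holders = {(j, y) for y, label, j in facts if label == "HOLDS"}
    return {
        (x, "SUPERVISES", y)
        for x, label, j in facts if label == "MANAGES"
        for j2, y in holders if j2 == j
    }


inferred = infer_supervises(facts)
facts |= inferred  # generated links become part of the graph
print(sorted(inferred))
```

Running the rule adds direct `SUPERVISES` links from Bob and from Jane to Alice, even though neither link was stated explicitly.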
The DataWalk model retains the best scaling characteristics of the property graph as well. DataWalk has multiple benchmarks demonstrating its linear scalability for datasets with billions of entities, even those with hundreds of complex joins and multiple traversals. If you’re seeking the flexibility of a data model that will support changes and maintain human-readability over time, along with powerful graph analytics and unmatched scalability, then DataWalk’s knowledge graph is an ideal solution.
The graphic below compares the capabilities of the various knowledge graph approaches.
The DataWalk system is a robust, full-stack data analysis platform. It provides the fastest way to connect vast amounts of siloed data and discover trends, patterns, networks, and connections. The platform utilizes patented technologies to ensure that complex queries complete quickly, even with vast amounts of data. DataWalk’s unique no-code knowledge graph technology provides a flexible and highly performant foundation, but it is only one element of the DataWalk data analytics platform, as reflected in the system diagram below.