Technology

 

DataWalk Abstraction

A fundamental requirement of enterprise-class analytic tools is the ability to scale to large volumes of data. DataWalk achieves this with a horizontally scalable architecture for storing and processing data. DataWalk technology also automatically addresses three major problems associated with horizontal scalability, regardless of the business model or data mapping involved:

  1. Even distribution of data across multiple nodes
  2. No data rebalancing needed to execute queries
  3. Maximum information-join on stored content
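
As a generic illustration of the first point, the sketch below assigns records to nodes with a stable hash and counts how evenly they spread. This is not DataWalk's actual placement scheme, which is not described here; the function names, the `record-N` keys, and the four-node cluster are assumptions made for the example.

```python
import hashlib
from collections import Counter


def node_for_key(key: str, node_count: int) -> int:
    """Map a record key to a node index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is stable across runs.
    return int.from_bytes(digest[:8], "big") % node_count


def distribution(keys, node_count: int) -> Counter:
    """Count how many keys land on each node."""
    return Counter(node_for_key(key, node_count) for key in keys)


if __name__ == "__main__":
    # Assumed workload: 100,000 synthetic record keys on a 4-node cluster.
    counts = distribution((f"record-{i}" for i in range(100_000)), node_count=4)
    for node, count in sorted(counts.items()):
        print(f"node {node}: {count} records")
```

Because a key's node depends only on the key itself, any node can locate a record without consulting a central directory, which is one common way to avoid moving data at query time.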

These unique capabilities clearly differentiate DataWalk from other offerings in the marketplace.

Our commercial-grade data storage solution provides flexible information management with the high efficiency required to deploy enterprise-class analytical environments. DataWalk technology allows users to ask questions in business terms, without using SQL or other programming languages. It delivers fast, complex, multi-dimensional analyses on large, multi-billion-record data sets.

Integrating data of different types and structures, from many sources, into one cohesive picture reflects how people naturally perceive information, and makes DataWalk an easy-to-use system for performing complex analytics.

 
 

DataWalk Flexible Data Representation

Underlying data remains unchanged, regardless of the type of implementation, business logic, or objects involved. The logical data structure can be modified on the fly, with no need to change the physical model or disturb system operation. The DataWalk structure is highly standardized, and data is evenly distributed across many compute nodes so that answers are obtained rapidly. Changing the logical structure is so easy and inexpensive that you can experiment with the logical model and freely modify it in real time. For example, you can create new connections, edit existing ones, or add new sources and object descriptions.

 
 

DataWalk API – Communication

The DataWalk system is built in Java. Internally, each component of the application exposes its own REST service, and external access is similarly supported through APIs. Data and analyses produced in DataWalk are easily made available to other programs. For example, data can be exported to the R programming language for statistical analysis, predictive analytics, or machine learning.

DataWalk technology readily connects to enterprise data stores. To move information to or from platforms such as Oracle, DB2, Hadoop, or other commercial systems, you can use RESTful access and JDBC/ODBC.

 
 

DataWalk Universe Viewer

The patented DataWalk Universe Viewer combines all imported data sources in a single graphical interface that shows how the data is interconnected. The Universe Viewer lets users run complex analytical queries (analyses, hypotheses) directly, without requiring technical expertise. It allows precise identification of complex relationships in the data, as well as immediate filtering of large data sets that are directly or indirectly connected. Combining business modeling methods with data discovery and data blending creates simple, reproducible structures and analyses. As a result, the integration process is often several hundred times faster than with traditional systems.

 
 
 
 

DataWalk Link Generator

The DataWalk Link Generator allows complicated analyses to be executed efficiently, based on advanced connection rules. Instead of fixed tables and predesigned analytical flows, DataWalk supports flexible data connections in a logical layer. When a connection changes, the entire analytic process updates without the need for programming or interrupting system operation.

A link calculates and stores information on relationships between objects. Links can be generated from simple rules (e.g., Field A = Field B) or from more advanced business rules that connect data even in the absence of a primary key/foreign key relationship. Links are aggregated on the fly to generate fast and accurate results.
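
To make the simple-rule case concrete, here is a minimal sketch of generating links where Field A = Field B and then aggregating them. It only illustrates the idea and is not the Link Generator's implementation; the `generate_links` function, the `customers` and `calls` sets, and their field names are hypothetical.

```python
from collections import Counter, defaultdict


def generate_links(left_rows, right_rows, left_field, right_field):
    """Return (left_index, right_index) pairs where left_field equals right_field."""
    # Index the right-hand set by field value so each left row is matched in one lookup.
    index = defaultdict(list)
    for j, row in enumerate(right_rows):
        value = row.get(right_field)
        if value is not None:
            index[value].append(j)

    links = []
    for i, row in enumerate(left_rows):
        for j in index.get(row.get(left_field), []):
            links.append((i, j))
    return links


if __name__ == "__main__":
    # Hypothetical sets with no primary key/foreign key relationship between them.
    customers = [
        {"name": "Ann", "phone": "555-0100"},
        {"name": "Bob", "phone": "555-0199"},
    ]
    calls = [
        {"caller": "555-0100"},
        {"caller": "555-0100"},
        {"caller": "555-0000"},
    ]
    links = generate_links(customers, calls, "phone", "caller")
    # Aggregate the stored links: number of calls linked to each customer.
    calls_per_customer = Counter(i for i, _ in links)
    for i, customer in enumerate(customers):
        print(customer["name"], calls_per_customer[i])
```

Storing the links as pairs, separately from the rows, means a rule can be changed and the links regenerated without touching the underlying data.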

 

 
 

DataWalk Drop Folders

DataWalk uses a flexible, generic method for importing content. Each time a new structure is designed in DataWalk, the system generates “Drop Folders,” which accept CSV files (including CSV derived from API, JSON, or JDBC output) and MS Excel files. When a CSV file appears in a newly created folder, DataWalk automatically maps its headings to the related structure, imports the data, and calculates all relevant connections. The system also has a built-in “Add Excel” function that adds new data from .XLSX files and connects it to the analytical environment in just a few seconds. With “Add Excel” it is possible to carry out an analysis or add new filters or conditions to existing analyses.
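
The heading-mapping step can be pictured with a short sketch like the one below. It assumes headings are matched to attribute names after ignoring case, spaces, and punctuation; DataWalk's real matching rules are not described here, and `map_headings`, `import_csv`, and the sample attributes are hypothetical. Folder watching is left out.

```python
import csv
import io


def normalize(name: str) -> str:
    """Lower-case a name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def map_headings(headings, attributes):
    """Map each CSV heading to the structure attribute with the same normalized name."""
    by_normalized = {normalize(attr): attr for attr in attributes}
    return {
        heading: by_normalized[normalize(heading)]
        for heading in headings
        if normalize(heading) in by_normalized
    }


def import_csv(text: str, attributes):
    """Parse CSV text and keep only the columns that map to known attributes."""
    reader = csv.DictReader(io.StringIO(text))
    mapping = map_headings(reader.fieldnames or [], attributes)
    return [
        {mapping[heading]: value for heading, value in row.items() if heading in mapping}
        for row in reader
    ]


if __name__ == "__main__":
    # Hypothetical file contents; "Notes" has no matching attribute and is skipped.
    sample = "Customer ID,Full Name,Notes\n1,Ann,vip\n2,Bob,new\n"
    for record in import_csv(sample, ["customer_id", "full_name"]):
        print(record)
```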

 
 

DataWalk User Permissions & Access Control

A major challenge with analysis of sensitive data is guaranteeing that data and the results of system processing are consistent with user privileges. DataWalk explicitly addresses this challenge with three levels of privileges:

  • Access to sets of objects per user
  • Access to an attribute of an object per user
  • Access to an object using access filters per user

Access filters

The system administrator defines the filters, per dataset, for a given user (or group). The filters are applied transparently each time the system is queried by the user. The added value of filters is supported by the following features:

  • Access rights are inexpensive to enforce: the system applies them while performing a query, not after the query has been processed, which increases efficiency
  • Access rights are manageable and do not affect system efficiency. Filters set on calculated columns automatically adjust the visible objects as the data changes, and filters can be set on columns that the user cannot access.
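
The following sketch shows one way to apply an access filter during a query rather than after it, including a filter on a column the user cannot see. It is an assumption-laden illustration, not DataWalk's implementation: `AccessPolicy`, `run_query`, and the sample rows are hypothetical names and data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Row = Dict[str, object]


@dataclass(frozen=True)
class AccessPolicy:
    """Per-user policy (hypothetical): which rows are visible and which attributes are hidden."""
    row_filter: Callable[[Row], bool]
    hidden_attributes: FrozenSet[str]


def run_query(rows: List[Row], predicate: Callable[[Row], bool], policy: AccessPolicy) -> List[Row]:
    """Evaluate the user's query and the access filter in the same pass over the data."""
    results = []
    for row in rows:
        # The access filter is checked first, so rows the user may not see
        # are never evaluated by the query or placed in the result.
        if policy.row_filter(row) and predicate(row):
            results.append({k: v for k, v in row.items() if k not in policy.hidden_attributes})
    return results


if __name__ == "__main__":
    rows = [
        {"id": 1, "region": "EU", "amount": 250},
        {"id": 2, "region": "US", "amount": 900},
        {"id": 3, "region": "EU", "amount": 40},
    ]
    # The analyst is limited to EU rows and cannot see the "region" column,
    # yet the filter is still defined on that column.
    analyst = AccessPolicy(
        row_filter=lambda row: row["region"] == "EU",
        hidden_attributes=frozenset({"region"}),
    )
    print(run_query(rows, lambda row: row["amount"] > 100, analyst))
```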
 
 

DataWalk LDI – Real Time Data Adds

DataWalk Live Data Insertion (LDI) allows the system to acquire data on the fly, without impacting performance. The system can handle processes that require continuous data uploads without interrupting operations, so there is no need to take the system offline or schedule maintenance windows.

 
 

DataWalk Object Search – A Search Engine

DataWalk enables you to quickly find objects of interest across various datasets. The object could be a customer, a contract, a facility, an insurance policy, or an incident. The results show which set(s) contain the searched element and are presented in per-set tabs, sorted by how closely each result matches the primary search term. You can move directly from the object search to the analysis of the chosen object.
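
As a rough picture of per-set results ranked by match quality, the sketch below scores names with Python's standard `difflib.SequenceMatcher`. The scoring method, the 0.5 threshold, the `search_objects` function, and the sample sets are all assumptions for illustration; DataWalk's own ranking is not described here.

```python
from difflib import SequenceMatcher


def search_objects(term, datasets, threshold=0.5):
    """Return {set_name: [(object_name, score), ...]}, best matches first per set."""
    results = {}
    query = term.lower()
    for set_name, names in datasets.items():
        scored = []
        for name in names:
            # ratio() is a similarity score between 0.0 and 1.0.
            score = SequenceMatcher(None, query, name.lower()).ratio()
            if score >= threshold:
                scored.append((name, score))
        if scored:
            # Sort by descending score; sets with no hits are omitted.
            results[set_name] = sorted(scored, key=lambda item: item[1], reverse=True)
    return results


if __name__ == "__main__":
    # Hypothetical sets; each key plays the role of one result tab.
    datasets = {
        "customers": ["Acme Corporation", "Acme Corp", "Zenith Ltd"],
        "contracts": ["Acme Corp supply contract", "Zenith lease"],
        "facilities": ["North Depot"],
    }
    for set_name, hits in search_objects("Acme Corp", datasets).items():
        print(set_name, [name for name, _ in hits])
```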

 
