DataWalk is a full-stack analytics platform architected to scale to very large volumes of data, regardless of the shape of your data or the questions you want to ask of it:
- Ingest without extensive preparation. Use either passive sinks (e.g., DataWalk drop folders), or active tools such as NiFi (integrated in DataWalk) or any ETL tool. No need to worry about how you will want to later use or analyze the data. High performance, full security, and no need to map extensive ontologies.
- Transform data while it's in the system, so that it takes the shape that you want. DataWalk can automatically repeat any transformation steps with new data.
- Store data with full compression on a highly scalable (scale-out), secure storage infrastructure. No need to worry about segmentation, queries, permissions, or load, as DataWalk automatically ensures maximum performance regardless of the shape of the data or the questions you want to ask. You can treat storage as either a long-term cache or entity storage.
- Query data visually through the DataWalk Universe Viewer. Ask any question of your data without limitations; there is no query that is too big.
- Visualize your result, no matter how large it is.
- Automate all of the above through the DataWalk API as part of your automated workflow.
- Work through a user interface for all of the above, which eliminates the cost and delays associated with relying on data scientists and scripting.
A fundamental requirement of enterprise-class analytic tools is the ability to scale to large volumes of data. DataWalk achieves this with a horizontally scalable architecture for storing and processing data. In addition, DataWalk technology automatically solves three major problems associated with horizontal scalability, regardless of the business model or data mapping performed:
- Even distribution of data across multiple nodes
- No data rebalancing needed to execute queries
- Maximum information-join on stored content
These unique capabilities clearly differentiate DataWalk from other offerings in the marketplace.
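The source does not describe how DataWalk distributes data, so the following is only a generic illustration of the first point: hash partitioning is one common way to spread records evenly across nodes. The function `node_for`, the key format, and the node count are hypothetical and are not part of any DataWalk API.

```python
import hashlib
from collections import Counter


def node_for(key: str, num_nodes: int) -> int:
    """Map a record key to a node index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_nodes


# Distribute 10,000 hypothetical record keys over 4 nodes and count per node.
counts = Counter(node_for(f"record-{i}", 4) for i in range(10_000))
for node in sorted(counts):
    print(f"node {node}: {counts[node]} records")
```

Because the hash is stable, the same key always lands on the same node, and a well-mixed hash keeps the per-node counts close to one another.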
Our unique, commercial-grade data storage solution provides flexible information management with the high efficiency required for deploying enterprise-class analytical environments. DataWalk technology allows users to ask questions in business terminology, without using SQL or other programming languages. This technology delivers fast, complex, multi-dimensional analyses on large, multi-billion-record data sets.
Integrating data of different types and structures, from many sources, into one cohesive picture reflects the natural human perception of information and makes DataWalk an easy-to-use system for performing complex analytics.
DataWalk Flexible Data Representation
Underlying data remains unchanged, regardless of the type of implementation, business logic, or objects involved. The logical data structure can be modified on the fly, with no need to change the physical model or disturb system operation. The DataWalk structure is highly standardized; data is evenly distributed across many compute nodes so that answers are obtained rapidly. Changing the logical structure is so inexpensive and easy that you can experiment with the logical model and modify it freely in real time. For example, you can create new connections, edit existing ones, or add new sources and object descriptions.
DataWalk API – Communication
The DataWalk system is built in Java. Internally, each component of the application exposes its own REST service, and external access is similarly supported through APIs. Data and analyses produced in DataWalk are easily made available to other programs. For example, data can be exported to the R programming language for statistical analysis, predictive analytics, or machine learning.
DataWalk technology readily accommodates connections to enterprise data stores. To move information to or from platforms such as Oracle, DB2, Hadoop, or other commercial systems, you can take advantage of RESTful access and JDBC/ODBC.
DataWalk Universe Viewer
The patented DataWalk Universe Viewer combines all imported data sources into a single graphical view that shows how the data is interconnected. The Universe Viewer allows users to perform complex analytical queries (analyses, hypotheses) directly, without requiring technical expertise. It allows precise identification of complex relationships between data, as well as rapid filtering of large data sets that are directly or indirectly connected. Combining business modeling methods with data discovery and data blending creates simple, reproducible structures and analyses. As a result, the integration process is often several hundred times faster than with traditional systems.
DataWalk Link Generator
The DataWalk Link Generator permits complicated analyses to be efficiently executed, based on advanced connection rules. Instead of fixed tables and preprogrammed, predesigned analytical flows, DataWalk supports flexible data connections in a logical layer. When something is changed, the entire analytic process updates without the need for programming or interrupting system operation.
A link calculates and stores information on relationships between objects. Links can be generated from simple rules (e.g., Field A = Field B) or from more advanced business rules that connect data even in the absence of a primary key/foreign key relationship. Links are aggregated on the fly to generate fast and accurate results.
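To make the simple rule "Field A = Field B" concrete, here is a minimal sketch of rule-based link generation between two sets. It is an illustration only: `generate_links`, the record layout, and the field names are hypothetical, and the source does not describe how the Link Generator is implemented.

```python
from collections import defaultdict


def generate_links(left, right, left_field, right_field):
    """Return (left_id, right_id) pairs where left_field equals right_field."""
    # Index the right-hand set by the join field so each lookup is O(1).
    index = defaultdict(list)
    for row in right:
        index[row[right_field]].append(row["id"])

    links = []
    for row in left:
        for right_id in index.get(row[left_field], []):
            links.append((row["id"], right_id))
    return links


# Hypothetical sets with no primary key/foreign key relationship between them.
customers = [
    {"id": "c1", "phone": "555-0100"},
    {"id": "c2", "phone": "555-0199"},
]
claims = [
    {"id": "k1", "contact_phone": "555-0100"},
    {"id": "k2", "contact_phone": "555-0100"},
    {"id": "k3", "contact_phone": "555-0142"},
]

links = generate_links(customers, claims, "phone", "contact_phone")
print(links)  # [('c1', 'k1'), ('c1', 'k2')]
```

The links are stored as plain pairs, so the rule can be changed and the links regenerated without altering either underlying set.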
DataWalk Drop Folders
DataWalk uses a flexible, generic method for importing content. Each time a new structure is designed in DataWalk, the system generates “Drop Folders,” which accept CSV files (including CSV derived from API, JSON, or JDBC output) and MS Excel files. When a CSV file appears in a newly created folder, DataWalk automatically maps its headings to the related structure, imports the data, and calculates all relevant connections. The system also has a built-in “Add Excel” function that adds new data from .XLSX files and connects it to the analytical environment in just a few seconds. With “Add Excel” it is possible to carry out an analysis or add new filters or conditions to existing analyses.
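The heading-to-structure mapping described above can be sketched roughly as follows. This is an assumption-laden illustration, not DataWalk's import logic: the `STRUCTURE` mapping, the normalization rule, and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical target structure: attribute names keyed by normalized heading.
STRUCTURE = {"customer id": "customer_id", "full name": "name", "city": "city"}


def normalize(heading: str) -> str:
    """Lowercase a heading, treat underscores as spaces, collapse whitespace."""
    return " ".join(heading.strip().lower().replace("_", " ").split())


def import_csv(text: str, structure: dict) -> list:
    """Map each CSV heading to a structure attribute and build records."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {}
        for heading, value in row.items():
            if heading is None:  # surplus cells beyond the header row
                continue
            attribute = structure.get(normalize(heading))
            if attribute is not None:  # headings with no match are skipped
                record[attribute] = value
        records.append(record)
    return records


sample = "Customer_ID,Full Name,City,Notes\n17,Ada Lovelace,London,vip\n"
records = import_csv(sample, STRUCTURE)
print(records)  # [{'customer_id': '17', 'name': 'Ada Lovelace', 'city': 'London'}]
```

In this sketch, a column with no matching attribute (`Notes`) is ignored rather than rejected; a real importer might instead report it or create a new attribute.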
DataWalk User Permissions & Access Control
A major challenge with analysis of sensitive data is guaranteeing that data and the results of system processing are consistent with user privileges. DataWalk explicitly addresses this challenge with three levels of privileges:
- Access to sets of objects per user
- Access to an attribute of an object per user
- Access to an object using access filters per user
The system administrator defines the filters, per dataset, for a given user (or group). The filters are applied transparently each time the system is queried by the user. The added value of filters is supported by the following features:
- Access rights add little overhead: they are applied by the system while a query is being performed, not after the query has been processed, which increases efficiency
- Access rights are manageable and do not affect system efficiency. Filters set on calculated columns automatically adjust which objects are accessible as the data changes; filters can also be set on columns that the user cannot access.
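The idea of applying access filters during a query, including a filter on a column the user cannot see, can be sketched as below. This is a generic illustration under stated assumptions: `AccessPolicy`, `run_query`, and the sample data are hypothetical and do not represent DataWalk's permission model or API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AccessPolicy:
    """Hypothetical per-user policy: a row filter plus hidden attributes."""
    row_filter: Callable[[dict], bool]
    hidden_columns: set = field(default_factory=set)


def run_query(rows, predicate, policy):
    """Evaluate the access filter in the same pass as the user's predicate."""
    result = []
    for row in rows:
        # The access filter may read columns that the user cannot see.
        if not policy.row_filter(row):
            continue
        if predicate(row):
            visible = {k: v for k, v in row.items()
                       if k not in policy.hidden_columns}
            result.append(visible)
    return result


accounts = [
    {"id": 1, "region": "EU", "balance": 5000},
    {"id": 2, "region": "US", "balance": 9000},
    {"id": 3, "region": "EU", "balance": 200},
]

# The user may only see EU accounts, and the "region" column itself is hidden.
policy = AccessPolicy(row_filter=lambda r: r["region"] == "EU",
                      hidden_columns={"region"})

result = run_query(accounts, lambda r: r["balance"] > 1000, policy)
print(result)  # [{'id': 1, 'balance': 5000}]
```

Because rows are rejected before the user's predicate runs, no work is spent on objects the user is not allowed to see, and nothing needs to be removed from the result afterwards.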
DataWalk LDI – Real Time Data Adds
DataWalk Live Data Insertion (LDI) allows the system to acquire data on the fly, without impacting performance. The system can handle processes that require a constant data upload without interrupting system operations. There is no need for taking the system offline or scheduling maintenance windows.
DataWalk Object Search – A Search Engine
DataWalk enables you to quickly find objects of interest across various datasets. The object could be a customer, a contract, a facility, an insurance policy, or an incident. The results show in which set(s) a searched element is found, and are presented in tabs, one per set, sorted by how closely each result matches the primary search term. You can move directly from the object search to the analysis of the chosen object.
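A rough sketch of a search that groups results by set and sorts them by degree of match is shown below. The source does not say how DataWalk scores matches; this example assumes a simple string-similarity ratio from Python's standard `difflib` module, and the function name, threshold, and sample data are hypothetical.

```python
from difflib import SequenceMatcher


def search(term, datasets, threshold=0.6):
    """Return {set_name: [(score, object_name), ...]}, best match first."""
    results = {}
    for set_name, names in datasets.items():
        scored = []
        for name in names:
            # Similarity ratio in [0, 1]; an assumed stand-in for match degree.
            score = SequenceMatcher(None, term.lower(), name.lower()).ratio()
            if score >= threshold:
                scored.append((round(score, 2), name))
        if scored:  # only sets that contain a match get a "tab"
            results[set_name] = sorted(scored, reverse=True)
    return results


datasets = {
    "customers": ["John Smith", "Jon Smith", "Maria Garcia"],
    "contracts": ["Smith Holdings lease", "Garcia supply contract"],
    "incidents": ["John Smith claim 2019"],
}

hits = search("John Smith", datasets)
for set_name, matches in hits.items():
    print(set_name, matches)
```

Each key in `hits` corresponds to one tab in the description above, and sets with no sufficiently close match are omitted.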