Knowledge cubes — A proposal for scalable and semantically-guided management of Big Data


[Download Slides]

A Knowledge Cube, or cube for short, is an intelligent and adaptive database instance capable of storing, analyzing, and searching data. Each cube is established based on semantic aspects, e.g., (1) Topical, (2) Contextual, (3) Spatial, or (4) Temporal. A cube specializes in handling data that is only relevant to the cube’s semantics. Knowledge cubes are inspired by two prime architectures: (1) Dataspaces that provides an abstraction for data management where heterogeneous data sources can co-exist and it requires no prespecified unifying schema, and (2) Linked Data that provides best practices for publishing and interlinking structured data on the web. A knowledge cube uses Linked Data as its main building block for its data layer and encompasses some of the data integration abstractions defined by Dataspaces. In this paper, knowledge cubes are proposed as a semantically-guided data management architecture, where data management is influenced by the data semantics rather than by a predefined scheme. Knowledge cubes support the five pillars of Big Data also known as the five V’s, namely Volume, Velocity, Veracity, Variety, and Value. Interesting opportunities can be leveraged when learning the semantics of the data. This paper highlights these opportunities and proposes a strawman design for knowledge cubes along with the research challenges that arise when realizing them.