Arguments for Using the RDF Data Cube Vocabulary
From www.b-kaempgen.de
On this page we describe the advantages of using the RDF Data Cube Vocabulary for representing multidimensional datasets.
Arguments for using QB
QB fulfills the requirements for sharing data cubes [HB03] (under construction):
• The format has to support a multidimensional data model. We have shown in previous work [KH11] that QB can be mapped to a multidimensional data model. We provide more detail in Chapter 6.
• The conceptual distinction between the description of schema, master or dimension data, and transaction or fact data has to be supported. QB distinguishes qb:DataStructureDefinition with qb:DimensionProperty for the schema from qb:DataSet with qb:Observation for fact data, which in Linked Data can also be published at different locations.
• The format has to be transportable over a network, primarily over the Internet. As a Linked Data vocabulary, QB fulfills this requirement automatically.
• To achieve a high level of flexibility and reuse, the format has to support linking and inclusion concepts. XCube allows the linkage of schema, dimension and fact data and highlights the development of standardised reference dimensions. Beyond that, Linked Data and QB allow reuse of any single multidimensional element.
• The format should be extensible so that it can adapt to different data models or introduce new concepts. As shown by Etcheverry and Vaisman [EV12b, EV12a], QB can be extended.
• The format must be easily convertible to and from various data sources and formats. RDF/XML representations of QB data allow XSLT transformations. However, the structure of RDF/XML is more complex than that of XCube. In several case studies we have shown that QB can be used to represent heterogeneous data sources and formats, e.g., SDMX, XBRL, and sensor data [KC13]. Converting QB data to other RDF schemas, and from there to other data formats, is also possible using SPARQL 1.1 CONSTRUCT or SELECT queries.
• The format could possibly allow online analytical processing (OLAP) to reduce the amount of data to be transferred over the network. Etcheverry and Vaisman [EV12b, EV12a] have shown how to use OLAP operations to merge locally stored datasets and Web-distributed datasets that are represented with an extension of QB. In previous work [KOH12] we have shown how to map OLAP operations to SPARQL queries over QB datasets; we provide details in Chapter 7.
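To make the last point more concrete, the following Python sketch shows how a simple roll-up (aggregating one measure, grouped by a subset of dimensions) could be written as a SPARQL 1.1 query over a QB dataset. It is a minimal illustration under stated assumptions, not the mapping described in [KOH12]: the function name rollup_query, the result variable ?agg and all example.org URIs are hypothetical, and the sketch assumes that every qb:Observation carries its dimension and measure values directly as properties and is linked to its dataset via qb:dataSet.

```python
from typing import Sequence

# Namespace of the RDF Data Cube Vocabulary (QB).
QB = "http://purl.org/linked-data/cube#"

# Aggregate functions of SPARQL 1.1 that this sketch accepts.
ALLOWED_AGGREGATES = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


def rollup_query(dataset: str, dimensions: Sequence[str], measure: str,
                 aggregate: str = "SUM") -> str:
    """Build a SPARQL 1.1 SELECT query that aggregates one measure of a
    QB dataset, grouped by the given dimension properties.

    Hypothetical helper: all URIs are passed in as plain strings and are
    assumed to be valid, absolute IRIs.
    """
    aggregate = aggregate.upper()
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")

    # One query variable per dimension that is kept in the result.
    dim_vars = [f"?d{i}" for i in range(len(dimensions))]

    # Graph pattern: one observation of the dataset with its values.
    patterns = [
        "?obs a qb:Observation ;",
        f"     qb:dataSet <{dataset}> ;",
    ]
    for var, dim in zip(dim_vars, dimensions):
        patterns.append(f"     <{dim}> {var} ;")
    patterns.append(f"     <{measure}> ?value .")

    select = " ".join(dim_vars + [f"({aggregate}(?value) AS ?agg)"])
    lines = [f"PREFIX qb: <{QB}>", f"SELECT {select}", "WHERE {"]
    lines += ["  " + p for p in patterns]
    lines.append("}")
    # Without grouping dimensions the whole dataset collapses to one value.
    if dim_vars:
        lines.append("GROUP BY " + " ".join(dim_vars))
    return "\n".join(lines)


# Usage: total revenue per year; all other dimensions are rolled up.
print(rollup_query(
    dataset="http://example.org/ds/sales",
    dimensions=["http://example.org/dim/year"],
    measure="http://example.org/measure/revenue",
))
```

Because the aggregation is evaluated by the SPARQL endpoint, only the grouped result rows, rather than all observations, would need to be transferred to the client.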