OLAP4LD Demo at ESWC 2014
From www.b-kaempgen.de
Latest revision as of 13:24, 4 January 2023
On this page, we collect information about our demonstration at ESWC 2014.
In our demonstration, we will show how changes in modelling are propagated to LDCX by modifying a published QB dataset live. We also show common modelling errors in existing QB datasets, such as dimensions missing an rdfs:range or qb:CodeList, and observations that do not adhere to their data structure definition.
Demonstrating the three-step interface
What will the audience learn? How to explore one dataset.
- Select dataset -> Explore dataset...
- Select measures
- Select dimensions on rows and columns -> Update table...
FAQ
- What does the query look like? It is an MDX query, since datasets are represented as data cubes. Why not a SPARQL query directly? Because OLAP application designers typically do not know SPARQL. MDX is specifically designed for analytical queries over multidimensional datasets (cubes, measures, dimensions).
SELECT /* $session: 2e72789e-08d7-d14d-2450-c9f4004b04c1 */ NON EMPTY CrossJoin({[httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2FOLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23lo_quantity],[httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2FOLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23lo_revenue]}, {Members([httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2FOLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23lo_suppkeyCodeList])}) ON COLUMNS , NON EMPTY CrossJoin({Members([httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2FOLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23lo_custkeyCodeList])}, {Members([httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2FOLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23lo_orderdateCodeList])}) ON ROWS FROM [httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2FOLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23ds]
- How to drill-down? By adding dimensions.
- What is done in the backend? 1. Loading the data cubes (running the normalisation algorithm, checking integrity constraints, ...). 2. Executing the MDX query over the data cubes.
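The identifiers in the MDX query above appear to be dataset, measure and code list URIs in an escaped form: the URI looks percent-encoded, with "%" written as XXX, "." as YYY and "-" as ZZZ. This mapping is inferred from the example query only and is not confirmed by the OLAP4LD documentation; the function name `decode_mdx_identifier` is hypothetical. A minimal sketch for turning such an identifier back into a readable URI, under that assumption:

```python
from urllib.parse import unquote


def decode_mdx_identifier(identifier: str) -> str:
    """Decode an escaped MDX identifier back into a URI.

    Assumed mapping (inferred from the example query, not from OLAP4LD docs):
    XXX -> "%", YYY -> ".", ZZZ -> "-", followed by percent-decoding.
    """
    # Drop the surrounding MDX brackets, if present.
    name = identifier.strip("[]")
    percent_encoded = (
        name.replace("XXX", "%").replace("YYY", ".").replace("ZZZ", "-")
    )
    return unquote(percent_encoded)


cube = (
    "[httpXXX3AXXX2FXXX2Folap4ldYYYgooglecodeYYYcomXXX2FgitXXX2F"
    "OLAP4LDZZZtrunkXXX2FtestsXXX2Fssb001XXX2FttlXXX2FexampleYYYttlXXX23ds]"
)
print(decode_mdx_identifier(cube))
# http://olap4ld.googlecode.com/git/OLAP4LD-trunk/tests/ssb001/ttl/example.ttl#ds
```

Note that this simple replacement would be ambiguous for a URI that itself contains the character sequences XXX, YYY or ZZZ.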
Live modifying a published QB dataset
What will the audience learn?
- Pastebin Example Star Schema Dataset: http://pastebin.com/raw.php?i=839G2u72#ds
- Pastebin: http://pastebin.com/839G2u72
- Example modifications:
- Change label of dataset.
- Change discount of first observation.
- Add a new dimension so that an error is thrown. ("Failed specification check: IC-4. Dimensions have range. Every dimension declared in a qb:DataStructureDefinition must have a declared rdfs:range.", "Failed specification check: IC-11. All dimensions required. Every qb:Observation has a value for each dimension declared in its associated qb:DataStructureDefinition.")
[ qb:dimension :lo_superkey ]
- Remove rdfs:range or qb:CodeList for skos:Concept dimensions.
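The two failed checks quoted above (IC-4 and IC-11) can be illustrated with a small, self-contained sketch. It is not how OLAP4LD implements the checks: triples are plain Python tuples of prefixed names, a single data structure definition is assumed, and the link from observation to dataset to structure is ignored. The names `check_ic4`, `check_ic11`, `:dsd`, `:o1` and `:lo_custkey` are hypothetical; `:lo_superkey` is the dimension added in the example modification.

```python
def declared_dimensions(triples):
    """Dimensions attached to a component via qb:dimension."""
    return {o for (s, p, o) in triples if p == "qb:dimension"}


def check_ic4(triples):
    """IC-4 (simplified): return declared dimensions without an rdfs:range."""
    with_range = {s for (s, p, o) in triples if p == "rdfs:range"}
    return sorted(declared_dimensions(triples) - with_range)


def check_ic11(triples):
    """IC-11 (simplified): return (observation, dimension) pairs where the
    observation has no value for a declared dimension."""
    dimensions = declared_dimensions(triples)
    observations = {
        s for (s, p, o) in triples if p == "rdf:type" and o == "qb:Observation"
    }
    missing = []
    for obs in sorted(observations):
        used = {p for (s, p, o) in triples if s == obs}
        missing.extend((obs, dim) for dim in sorted(dimensions - used))
    return missing


# Toy graph: :lo_superkey was added to the DSD without a range and without values.
triples = {
    (":dsd", "qb:component", "_:c1"),
    ("_:c1", "qb:dimension", ":lo_custkey"),
    (":dsd", "qb:component", "_:c2"),
    ("_:c2", "qb:dimension", ":lo_superkey"),
    (":lo_custkey", "rdfs:range", "skos:Concept"),
    (":o1", "rdf:type", "qb:Observation"),
    (":o1", ":lo_custkey", ":customer1"),
}

print(check_ic4(triples))   # [':lo_superkey']
print(check_ic11(triples))  # [(':o1', ':lo_superkey')]
```

In the RDF Data Cube specification, both constraints are defined as SPARQL ASK queries; the set operations here only mirror their intent on a toy graph.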
Drill-Across Queries
What will the audience learn?
- If you want to query over several datasets, you can use a comma-separated list of datasets.
- http://estatwrap.ontologycentral.com/id/tsdcc310#ds,http://estatwrap.ontologycentral.com/id/t2020_rd310#ds,http://estatwrap.ontologycentral.com/id/tsdec360#ds,http://estatwrap.ontologycentral.com/id/t2020_rd300#ds,http://estatwrap.ontologycentral.com/id/t2020_31#ds,http://estatwrap.ontologycentral.com/id/t2020_50#ds,http://estatwrap.ontologycentral.com/id/t2020_51#ds,http://estatwrap.ontologycentral.com/id/t2020_52#ds,http://estatwrap.ontologycentral.com/id/t2020_53#ds
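A comma-separated dataset list like the one above can be split into individual dataset URIs before each cube is loaded. The following is a minimal sketch, assuming that dataset URIs never contain commas themselves; the function name `split_dataset_list` is hypothetical and not part of OLAP4LD.

```python
from urllib.parse import urlsplit


def split_dataset_list(value: str) -> list:
    """Split a comma-separated list of dataset URIs and validate each entry."""
    uris = [part.strip() for part in value.split(",") if part.strip()]
    for uri in uris:
        parts = urlsplit(uri)
        # Only absolute HTTP(S) URIs can be dereferenced as Linked Data.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute HTTP URI: {uri!r}")
    return uris


datasets = split_dataset_list(
    "http://estatwrap.ontologycentral.com/id/tsdcc310#ds,"
    "http://estatwrap.ontologycentral.com/id/t2020_rd310#ds"
)
for uri in datasets:
    print(uri)
```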
Example datasets:
- Energy dependence: http://estatwrap.ontologycentral.com/id/tsdcc310 (2001-2012)
- Energy productivity: http://estatwrap.ontologycentral.com/id/t2020_rd310 (2000-2012)
- Energy intensity: http://estatwrap.ontologycentral.com/id/tsdec360 (2001-2012)
- Greenhouse gas emissions per capita: http://estatwrap.ontologycentral.com/id/t2020_rd300 (2000-2011)
- Share of renewable energy: http://estatwrap.ontologycentral.com/id/t2020_31 (2004-2012)
- People at risk of poverty or social exclusion: http://estatwrap.ontologycentral.com/id/t2020_50 (2004-2012)
- People living in households with very low work intensity: http://estatwrap.ontologycentral.com/id/t2020_51 (2004-2012)
- People at risk of poverty after social transfers: http://estatwrap.ontologycentral.com/id/t2020_52 (2003-2012)
- Severely materially deprived people: http://estatwrap.ontologycentral.com/id/t2020_53 (2003-2012)
Common modelling errors
What will the audience learn?
Missing range
- Transparency International Linked Data - Corruption Perceptions Index 2011: http://transparency.270a.info/dataset/CPI2011
- Missing rdfs:range for dimension "source" [1]: "Failed specification check: IC-4. Dimensions have range. Every dimension declared in a qb:DataStructureDefinition must have a declared rdfs:range."
No resolvable URIs
- COINS - 2006-2007 dataset: http://source.data.gov.uk/dataset/coins/coins_fact_table_2006_2007 and http://finance.data.gov.uk/coins/coins_fact_table_2006_2007
- For more information, see PlanetData and COINS.
No DataStructureDefinition
- Average annual producer price indices of industrial products, CA 1996 (previous year = 100): http://elpo.stat.gov.rs/lod2/RS-DATA/Prices/Annual_producer_price_indices_of_industrial_products_CA_1996/data_2011_12_06
- DSD: http://elpo.stat.gov.rs/lod2/RS-DATA/Prices/dsd#Annual_producer_price_indices_of_industrial_products_CA_1996
No properly modelled cube
- Reused Eurostat Linked Data Wrapper (http://estatwrap.ontologycentral.com/) to rdfize Eurostat datasets (http://epp.eurostat.ec.europa.eu/): http://eurostat.linked-statistics.org/data/tgs00003
- The DSD declares dcterms:date, but the observations use timePeriod; in addition, timePeriod has a resource as its range rather than a literal.