Publications - Paper

Chimera: A Bridge Between Big Data Analytics and Semantic Technologies

In recent decades, enhanced Knowledge Graph (KG) analysis has been used to extract advanced insights from data.

Various companies have integrated legacy relational databases with semantic technologies using Ontology-Based Data Access (OBDA). In practice, this approach lets analysts write SPARQL queries over both KGs and relational SQL data sources while keeping most implementation details transparent to them. However, data volumes keep growing, and more and more companies are adopting distributed storage platforms and distributed computing engines. This leaves a gap between big data and semantic technologies: for instance, one of the reference OBDA systems is limited to legacy relational databases and still lacks compatibility with the big data analysis engine Apache Spark. This paper introduces Chimera, an open-source software suite designed to bridge this gap. Chimera enables a new type of bidirectional data science pipeline: data scientists can query data stored in a data lake using SPARQL through Ontop and SparkSQL, then save the semantic results of the analysis back into the data lake. In this way, the pipeline semantically enriches Spark data before writing it back.
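The bidirectional idea in the abstract can be illustrated with a toy sketch in plain Python. It does not use Ontop, Spark, or Chimera's actual API, none of which are described here. The table contents, the `http://example.org/` namespace, and the helper names `rows_to_triples`, `match`, and `triples_to_rows` are all hypothetical, and a single triple pattern stands in for a real SPARQL query.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical rows standing in for a table stored in a data lake.
ROWS: List[Dict[str, object]] = [
    {"id": 1, "name": "Alice", "dept": "R&D"},
    {"id": 2, "name": "Bob", "dept": "Sales"},
]

EX = "http://example.org/"  # hypothetical namespace, for illustration only


def rows_to_triples(rows: List[Dict[str, object]]) -> List[Triple]:
    """Expose each row as triples, mimicking what an OBDA mapping declares."""
    triples: List[Triple] = []
    for row in rows:
        subject = f"{EX}employee/{row['id']}"
        triples.append((subject, EX + "name", str(row["name"])))
        triples.append((subject, EX + "worksIn", str(row["dept"])))
    return triples


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching one pattern; None plays the role of a variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def triples_to_rows(triples: List[Triple]) -> List[Dict[str, str]]:
    """Flatten triples into records that could be written back as a table."""
    return [{"subject": s, "predicate": p, "object": o} for s, p, o in triples]


# Forward direction: tabular data viewed as a graph and queried by pattern.
triples = rows_to_triples(ROWS)
rd_staff = match(triples, p=EX + "worksIn", o="R&D")

# Backward direction: semantic results flattened for storage alongside the source.
enriched = triples_to_rows(rd_staff)
```

In a real OBDA setting the triples are typically not materialized as done here; the SPARQL query is instead rewritten into SQL over the underlying source. The sketch only shows the two directions of the pipeline: tables seen as a graph, and graph results returned as tables.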