Adaptive RDF triple partitioning for distributed SPARQL query processing

Shrivastava, Yash

2018

Abstract

The Resource Description Framework (RDF) has been used extensively in recent years to represent data for the Semantic Web. Because RDF datasets are often very large, it is difficult to store one on a single system and query it using SPARQL. Instead, the data can be partitioned into subsets and queried using federated SPARQL queries. Distributed querying raises many challenges: for instance, the processing time for a query increases in proportion to the number of distributed joins. We present a study of the impact of query-adaptive partitioning of RDF data. We present a system called RePart that shuffles the data among the nodes of a cluster according to the incoming query workload, reducing the number of distributed joins needed at query time. Our evaluation on several benchmarks demonstrates that the performance of federated queries improves after the triples are repartitioned according to the query workload.
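
The idea of counting distributed joins and moving data to reduce them can be illustrated with a small sketch. This is not RePart's algorithm, which the abstract does not describe. It is a toy model under stated assumptions: triples are placed on nodes by predicate, every triple pattern has a constant predicate, two patterns join when they share a variable, and a join is "distributed" when the two predicates live on different nodes. The function names, the example predicates, and the greedy strategy are all hypothetical, and the sketch ignores load balancing, so it will happily move everything onto one node.

```python
from itertools import combinations


def is_var(term):
    # SPARQL variables are written with a leading "?"
    return term.startswith("?")


def distributed_joins(query, placement):
    """Count pattern pairs that share a variable but whose
    predicates are stored on different nodes (toy definition)."""
    count = 0
    for (s1, p1, o1), (s2, p2, o2) in combinations(query, 2):
        vars1 = {t for t in (s1, o1) if is_var(t)}
        vars2 = {t for t in (s2, o2) if is_var(t)}
        if vars1 & vars2 and placement[p1] != placement[p2]:
            count += 1
    return count


def workload_cost(workload, placement):
    # Total number of distributed joins over all queries in the workload
    return sum(distributed_joins(q, placement) for q in workload)


def greedy_repartition(workload, placement, nodes):
    """Hypothetical greedy heuristic: move one predicate at a time to the
    node that lowers the workload cost most; stop when no move helps."""
    placement = dict(placement)
    improved = True
    while improved:
        improved = False
        for pred in sorted(placement):
            best_node = placement[pred]
            best_cost = workload_cost(workload, placement)
            for node in nodes:
                cost = workload_cost(workload, {**placement, pred: node})
                if cost < best_cost:
                    best_node, best_cost = node, cost
            if best_node != placement[pred]:
                placement[pred] = best_node
                improved = True
    return placement


# Illustrative workload: two queries, each a list of (s, p, o) patterns.
workload = [
    [("?x", "worksFor", "?y"), ("?y", "locatedIn", "?z")],
    [("?a", "worksFor", "?b"), ("?a", "hasName", "?n")],
]
initial = {"worksFor": 0, "locatedIn": 1, "hasName": 2}
adapted = greedy_repartition(workload, initial, nodes=[0, 1, 2])

print(workload_cost(workload, initial), workload_cost(workload, adapted))
```

In this toy example each query starts with one join that crosses nodes, and the greedy pass co-locates the joined predicates so that neither query needs a cross-node join. A realistic partitioner would also need to bound how much data each node holds and account for the cost of moving triples.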

Details

Record ID

13420

Record Created

2024-12-05

Title

Adaptive RDF triple partitioning for distributed SPARQL query processing

Author

Shrivastava, Yash

Contributor

Kochut, Krzysztof J. (Advisor)
Arabnia, Hamid R. (Committee Member)
Arpinar, Ismailcem Budak (Committee Member)

College or School

Computer Sciences

Date

2018

Publisher

University of Georgia

Content Type

Thesis

Language

English

Dissertation/Thesis Note

Graduate

Degree Type

Master of Science (MS)

Name of Granting Institution

University of Georgia, Summer 2018

Year Degree Granted

2018

Keywords

RDF; RDF Partitioning; Workload Adaptive Partitioning; Ontologies; Federated Query

Record Appears in

Electronic Theses and Dissertations > Graduate Thesis
All Resources

System Control Number

9949334100102959
