Files
Abstract
Reachability analysis is a fundamental operation in many applications that work with graphs as underlying data structure. It has been extensively studied from the context of static graphs. In most modern computing domains, these graphs are no longer static (they evolve over time) and they are massive. Existing platforms and algorithms built for answering reachability queries in static graphs results in huge performance costs when applied on evolving graphs. The large size of graphs and their evolving nature calls for new, scalable and efficient algorithms/systems for answering reachability queries. In this research, we propose a generic, scalable, time-and space- efficient frameworks for reachability analysis in massively evolving graph structured data. We leverage our framework for answering queries in large evolving XML documents.We have created frameworks that can efficiently answer snapshot-specific reachability queries as well as continuous reachability queries in evolving graphs, where a snapshot-specific reacha- bility query seeks to find out whether a vertex w is reachable from another vertex v in a given snapshot of a structure and a continuous reachability queries are issued only once and then logi- cally run continuously over the evolving data set to find the status(reachable/unreachable) of given set of queries.We introduce SCISSOR as a generic indexing and querying framework for answering snapshot- specific queries in large time-evolving hierarchies. The central idea is to maintain indexes for an interspersed subset of snapshots. An reachability query on a non-indexed snapshot will be pro- cessed by first answering the same query on the temporally-closest indexed-snapshot, and then progressively modifying the solution to reflect the effects of the changes occurring between the queried snapshot and indexed snapshot. We have also designed a highly efficient, scalable and provably correct algorithm for analyzing the effects of changes between the queried and the near- est indexed snapshot. we leverage this framework to propose a tunable, time and space efficient algorithms to evaluate XPath expressions on continuously evolving XML document (CEXML) repositories. We introduce a new class of XPath expression queries for CEXML document called version-specific XPath queries.We introduce CoUPE as a generic indexing and querying engine for answering continuous queries in evolving graphs. We have designed an efficient algorithm for updating the indices of graph by analyzing the changes happening to the graph. These indices combined with a novel heuristic helps us figuring out the status of the queries.