A fast algorithm for subgraph pattern matching on large labeled graphs

Saltz, Matthew Wyatt

2013

Abstract

In recent years, the robustness of graphs for representing complex data has led to a proliferation of research on graph databases and analytics. One important topic in the field is graph pattern matching, which can be used, for example, to process queries in graph databases. Though fast querying is highly desirable, pattern matching algorithms are hindered by the NP-completeness of the subgraph isomorphism problem. This paper presents a conceptually simple, memory-efficient, pruning-based algorithm for the subgraph isomorphism problem that outperforms commonly used algorithms by orders of magnitude on large labeled graphs. This speedup is due in large part to the effectiveness of the pruning algorithm, known as dual simulation, which in many cases removes a large percentage of the vertices not found in isomorphic matches. In this paper, the runtime of the algorithm is tested on synthetic graphs of up to 10 million vertices and 250 million edges and on two real-life datasets, comparing when possible to an adjacency-list version of Ullmann's algorithm and to the VF2 algorithm. To the best of our knowledge, this is the first paper to test a centralized subgraph isomorphism algorithm on graphs of this magnitude. The algorithm is tested extensively to determine the effects of label density, edge density, data graph size, degree distribution, and query graph size and type on runtime. The effectiveness of the algorithm is then demonstrated on two large real-life graphs. The algorithm is easily extendable to graphs with multiple attributes on vertices and edges, making it an ideal candidate to serve as the backbone of a query processing engine for a graph database.
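
The abstract names dual simulation as the pruning step but does not show it. The following is a minimal sketch of the general dual simulation idea on directed, vertex-labeled graphs, not the thesis's implementation. The function name `dual_simulation`, its parameters, and the edge-list and label-dictionary representation are all assumptions made for illustration. Each query vertex starts with every data vertex that shares its label as a candidate, and candidates are removed until every query edge is supported in both directions (a matching child for the source, a matching parent for the target).

```python
from collections import defaultdict


def dual_simulation(q_edges, q_labels, g_edges, g_labels):
    """Return, for each query vertex, the data vertices that survive pruning.

    Hypothetical interface: edges are (source, target) pairs and labels are
    dicts mapping vertex id to label. An all-empty result means no match.
    """
    # Adjacency in both directions for the data graph.
    children = defaultdict(set)
    parents = defaultdict(set)
    for a, b in g_edges:
        children[a].add(b)
        parents[b].add(a)

    # Initial candidates: data vertices whose label equals the query label.
    sim = {
        u: {v for v, lab in g_labels.items() if lab == q_labels[u]}
        for u in q_labels
    }

    # Prune to a fixpoint: repeat until no candidate set changes.
    changed = True
    while changed:
        changed = False
        for u, u2 in q_edges:
            # Child condition: a candidate for u needs a child among the
            # candidates for u2.
            keep = {v for v in sim[u] if children[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True
            # Parent condition: a candidate for u2 needs a parent among the
            # candidates for u.
            keep2 = {v for v in sim[u2] if parents[v] & sim[u]}
            if keep2 != sim[u2]:
                sim[u2] = keep2
                changed = True

    # If any query vertex has no candidates left, the query cannot match.
    if any(not s for s in sim.values()):
        return {u: set() for u in sim}
    return sim


# Query: an A-labeled vertex pointing to a B-labeled vertex.
q_labels = {0: "A", 1: "B"}
q_edges = [(0, 1)]
# Data graph: 10(A) -> 11(B), isolated 12(A), and 14(C) -> 13(B).
g_labels = {10: "A", 11: "B", 12: "A", 13: "B", 14: "C"}
g_edges = [(10, 11), (14, 13)]

result = dual_simulation(q_edges, q_labels, g_edges, g_labels)
print(result)  # {0: {10}, 1: {11}}
```

In this toy input, vertex 12 is dropped because it has no B-labeled child, and vertex 13 is dropped because its only parent is labeled C. The surviving candidate sets are a superset of the vertices in any isomorphic match, so a subgraph isomorphism search would still need to run over the pruned candidates.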

Details

Record ID

15757

Record Created

2024-12-05

Title

A fast algorithm for subgraph pattern matching on large labeled graphs

Author

Saltz, Matthew Wyatt

Contributor

Miller, John A. Advisor
Potter, Walter D. Committee Member
Ramaswamy, Lakshmish Committee Member

College or School

Computer Sciences

Date

2013

Publisher

University of Georgia

Content Type

Thesis

Language

English

Dissertation/Thesis Note

Graduate

Degree Type

Master of Science (MS)

Name of Granting Institution

University of Georgia, Summer 2013

Year Degree Granted

2013

Keywords

pattern matching; graph; subgraph isomorphism; query processing

Record Appears in

Electronic Theses and Dissertations > Graduate Thesis
All Resources

System Control Number

9949333896902959
