Abstract
Graph datasets are important because they allow for efficient analysis of interconnected
data. They also enable the exploration of relationships between entities,
facilitating pattern recognition, decision making, classification, clustering, and
statistical analysis, especially for large datasets. Graph databases are also well-suited for
handling complex data changes with minimal manual input and offer techniques for
data integration and sharing. Graphs are widely used across different fields, such as
social network analysis, recommendation systems, fraud detection, supply chain
management, and bioinformatics. To benefit from the information in
graph datasets, one has to use different analysis methods, including query
languages such as Cypher and SQL, graph neural networks (GNNs), or machine learning
methods for tasks such as community detection, link prediction, or node classification. As
already mentioned, graphs are often large, so in many cases it is more efficient
to reduce the size of the graph before applying machine learning models
to it. To reduce the size of the graph, we can use an embedding method.
Embedding translates high-dimensional data into a low-dimensional representation
while preserving the important information. After applying the embedding method, we
can use a machine learning model to obtain the information we need from the graph. In
this research, we focus on graph analysis and how to use information stored in graphs.
First, we introduce a novel method for embedding graphs. While the resulting
embedding can be used for various purposes, we use it for the link prediction task.
Then, we discuss several recent works on graph embedding, their advantages and
disadvantages, and compare the model we developed to them. Finally, we
introduce another method for graph analysis in which we use LLMs to generate multi-hop
Cypher queries and run them on a Neo4j database.
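
To make the term "multi-hop Cypher query" concrete, the following is a minimal sketch of the kind of query text such a pipeline would be expected to produce. It is not the method developed in this work: the helper `multi_hop_query`, the `Person` label, the `KNOWS` relationship type, and the `name` property are all hypothetical names chosen for illustration, and no LLM or Neo4j connection is involved.

```python
def multi_hop_query(start_label, rel_type, max_hops):
    """Build a parameterized Cypher query following rel_type for 1..max_hops hops.

    Hypothetical helper for illustration only; the schema names are assumptions.
    """
    if max_hops < 1:
        raise ValueError("max_hops must be at least 1")
    # Labels and relationship types cannot be passed as Cypher parameters,
    # so they are interpolated here; only plain identifiers are accepted.
    for name in (start_label, rel_type):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"MATCH (a:{start_label} {{name: $name}})"
        f"-[:{rel_type}*1..{max_hops}]->(b) "
        "RETURN DISTINCT b.name AS name"
    )


# Example: everyone reachable from a given person within two KNOWS hops.
query = multi_hop_query("Person", "KNOWS", 2)
print(query)
```

The variable-length pattern `*1..2` is what makes the query multi-hop: it matches paths of one or two relationships. The start node's name is left as the parameter `$name`, to be supplied when the query is run.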