Ranking documents using degrees of separation analysis from a large dataset of semantic relationships

Howell, Matthew Raymond

2010

Abstract

This paper presents a mechanism that ranks documents by their Semantic Association Similarity, defined as the closeness (measured in degrees of separation) of the associations between the entities found in each document. A large semantic knowledge base with over 1.6 million entities and 24 million associations serves as the backend dataset for comparison. Multiple ranking techniques are evaluated, and speed concerns are addressed: Bloom filters are used to improve ranking speed at the cost of a small percentage of false positives. A real-world example, spam page identification, is investigated.
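
The abstract does not spell out the scoring formula or the data layout, so the following is only a minimal sketch of the two ideas it names, not the thesis's implementation. It assumes the knowledge base fits in an in-memory adjacency dictionary, uses a hypothetical score of 1/(1+d) averaged over entity pairs (where d is the degrees of separation found by breadth-first search), and uses a small hand-rolled Bloom filter for fast, approximate "are these two entities directly associated?" checks. The names `degrees_of_separation`, `association_similarity`, and `BloomFilter`, the `max_depth` cutoff, and the `"a|b"` key encoding are all assumptions made for illustration.

```python
import hashlib
from collections import deque


def degrees_of_separation(graph, source, target, max_depth=3):
    """Breadth-first search: hop count from source to target,
    or None if target is not reachable within max_depth hops."""
    if source == target:
        return 0
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the cutoff
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return None


def association_similarity(graph, entities_a, entities_b, max_depth=3):
    """Hypothetical score (an assumption, not the thesis's formula):
    mean of 1/(1+d) over all entity pairs; unreachable pairs count as 0."""
    pairs = [(a, b) for a in entities_a for b in entities_b]
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        d = degrees_of_separation(graph, a, b, max_depth)
        if d is not None:
            total += 1.0 / (1 + d)
    return total / len(pairs)


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Toy chain a - b - c - d standing in for the knowledge base.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

# Record every direct association in the filter under a hypothetical "x|y" key.
direct = BloomFilter()
for node, neighbors in graph.items():
    for neighbor in neighbors:
        direct.add(f"{node}|{neighbor}")
```

In this sketch, a membership test such as `"a|b" in direct` answers the one-hop case without touching the graph. A positive answer may occasionally be wrong, which matches the small false-positive rate the abstract mentions, while a negative answer is always correct.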

Details

Record ID

8679

Record Created

2024-12-05

Title

Ranking documents using degrees of separation analysis from a large dataset of semantic relationships

Author

Howell, Matthew Raymond

Contributor

Li, Kang (Advisor)
Doshi, Prashant (Committee Member)
Ramaswamy, Lakshmish (Committee Member)

College or School

College of Engineering

Department

School of Computing

Date

2010

Publisher

University of Georgia

Content Type

Thesis

Language

English

Dissertation/Thesis Note

Graduate

Degree Type

Master of Science (MS)

Name of Granting Institution

University of Georgia, Winter 2010

Year Degree Granted

2010

System Control Number

9949334604702959
