Efficient K-nearest neighbor queries using clustering with caching

Ahmed, Jaim

2009

Abstract

We introduce a new algorithm for K-nearest neighbor queries that uses clustering and caching to improve performance. The main idea is to reduce the cost of computing distances between the query point and the points in the data set. We use a divide-and-conquer approach. First, we divide the training data into clusters of points that are similar in terms of Euclidean distance. Next, we linearize each cluster for faster lookup: its data points are sorted by their Euclidean distance to the cluster center. A fast search structure such as the B-tree can then store the data points keyed by their distance from the cluster center and support fast search, including range search. We achieve a further performance boost with B-tree based data caching. In this work we provide details of the algorithm, an implementation, and experimental results on a robot navigation task.
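The clustering and distance-to-center ordering described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: a sorted Python list searched with `bisect` stands in for the B-tree, a small hand-written k-means stands in for whatever clustering method the thesis uses, and the names `kmeans`, `build_index`, and `knn` are hypothetical. Caching is left out because the abstract does not describe its design. The pruning step relies on the triangle inequality: a point at distance r from a cluster center cannot be closer to the query than |d(q, center) - r|.

```python
import bisect
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.dist(a, b)


def kmeans(points, n_clusters, iters=10, seed=0):
    """Tiny k-means (assumed clustering step); needs n_clusters <= len(points)."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[i].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def build_index(points, n_clusters):
    """Cluster the points, then sort each cluster by distance to its center."""
    centers = kmeans(points, n_clusters)
    members = [[] for _ in centers]
    for p in points:
        i = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
        members[i].append((dist(p, centers[i]), p))
    clusters = []
    for m in members:
        m.sort()  # the one-dimensional key is the distance to the center
        clusters.append(([r for r, _ in m], [p for _, p in m]))
    return centers, clusters


def knn(index, q, k):
    """Exact k nearest neighbors of q, scanning only a key range per cluster."""
    centers, clusters = index
    heap = []  # max-heap of the best k so far, stored as (-distance, point)
    order = sorted(range(len(centers)), key=lambda i: dist(q, centers[i]))
    for i in order:
        keys, pts = clusters[i]
        dq = dist(q, centers[i])
        radius = -heap[0][0] if len(heap) == k else math.inf
        # Only keys in [dq - radius, dq + radius] can beat the current k-th best.
        lo = bisect.bisect_left(keys, dq - radius)
        hi = bisect.bisect_right(keys, dq + radius)
        for p in pts[lo:hi]:
            d = dist(q, p)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return [p for _, p in sorted((-nd, p) for nd, p in heap)]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.random(), rng.random()) for _ in range(1000)]
    index = build_index(data, n_clusters=8)
    print(knn(index, (0.5, 0.5), k=3))
```

The key range scanned in each cluster narrows as better candidates are found, which is where the savings in distance computations come from. A B-tree keyed on the same distance values would support the same range lookups on data too large for an in-memory list.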

Details

Record ID

21125

Record Created

2024-12-05

Title

Efficient K-nearest neighbor queries using clustering with caching

Author

Ahmed, Jaim

Contributor

Hybinette, Maria (Advisor)
Kraemer, Eileen T. (Committee Member)
Rasheed, Khaled (Committee Member)

College or School

College of Engineering

Department

School of Computing

Date

2009

Publisher

University of Georgia

Content Type

Thesis

Language

English

Dissertation/Thesis Note

Graduate

Degree Type

Master of Science (MS)

Name of Granting Institution

University of Georgia, Spring 2009

Year Degree Granted

2009

Keywords

K-Nearest Neighbors; Execution; Caching

System Control Number

9949332933502959
