Abstract
The wide adoption of smart devices and Internet-of-Things (IoT) sensors has led to massive growth in data generation at the edge of the Internet over the past decade. Intelligent real-time analysis of such a high volume of data, particularly leveraging highly accurate Deep Learning (DL) models, often requires the data to be processed as close as possible to the data sources (i.e., at the edge of the Internet) to minimize network and processing latency. The advent of specialized, low-cost, and power-efficient edge devices has greatly facilitated DL inference tasks at the edge. However, limited research has been done on improving inference throughput (i.e., the number of inferences per second) by exploiting various system techniques. This study investigates system techniques that enhance the overall inference throughput on edge devices running DL models for image classification tasks. We present various approaches, such as batched inferencing and multi-tenancy, to utilize edge devices' system resources (CPUs and GPUs) and AI accelerators (e.g., Tensor Processing Units, TPUs). The evaluation results show that batched inferencing yields up to 4× more inferences per second on devices equipped with high-performance GPUs like the Jetson Xavier NX. Moreover, with multi-tenancy approaches, e.g., concurrent model executions and dynamic model placements, a throughput of nearly 340 inferences per second can be achieved, which is 6× higher than the maximum throughput when running the models in a single-tenant setting. Furthermore, a detailed analysis of the hardware and software factors that affect system throughput is presented, thereby shedding light on areas that could be further improved to achieve high-performance DL inference at the edge.
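The batched-inferencing idea summarized above can be illustrated with a minimal sketch. This is not the paper's implementation: the "model" below is a hypothetical single dense layer (NumPy matrix multiply standing in for a DL forward pass), and all names (`infer_one`, `infer_batch`, `weights`) are illustrative. The point is that one batched call amortizes per-call overhead over many inputs while producing the same outputs as per-sample calls, which is the mechanism by which batching raises inferences per second.

```python
import time
import numpy as np

# Hypothetical stand-in for a DL model: one dense layer (illustrative only,
# not the paper's model). 1024-d input features, 10 output classes.
rng = np.random.default_rng(0)
weights = rng.standard_normal((1024, 10))

def infer_one(x):
    """Single-sample inference: one forward pass per input."""
    return x @ weights

def infer_batch(xs):
    """Batched inference: one forward pass for the whole batch."""
    return xs @ weights

inputs = rng.standard_normal((256, 1024))

# Per-sample inference loop.
t0 = time.perf_counter()
single = np.stack([infer_one(x) for x in inputs])
t_single = time.perf_counter() - t0

# One batched call over the same inputs.
t0 = time.perf_counter()
batched = infer_batch(inputs)
t_batch = time.perf_counter() - t0

# Outputs are identical; only the throughput (inferences/second) differs.
assert np.allclose(single, batched)
print(f"per-sample: {len(inputs) / t_single:.0f} inf/s, "
      f"batched: {len(inputs) / t_batch:.0f} inf/s")
```

On accelerator-backed devices the gap is typically much larger than in this CPU toy, since a GPU or TPU can process all samples in a batch in parallel, which is consistent with the up-to-4× gains reported above.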