Abstract
In today’s rapidly digitalizing world, the integration of diverse sensors has become increasingly prevalent. Within the transportation sector, state Departments of Transportation have
extensively deployed traffic sensors to support a wide range of engineering applications. Accurate
and reliable sensor data is crucial for efficiently monitoring and managing large-scale
transportation networks. However, these sensors inevitably suffer from issues such as data loss,
random noise, biases, and drift, often caused by sensor aging, defects, or environmental factors.
Therefore, detecting these faults is imperative to maintain data integrity. This dissertation
introduces two distinct deep learning frameworks designed to enhance traffic sensor data quality,
with a focus on fault detection. The first framework evaluates data from individual sensor stations;
it is therefore context-insensitive and requires less data. The second framework incorporates
geospatial context by modeling spatiotemporal correlations among neighboring stations, which
improves fault detection accuracy but demands more data.
Specifically, the first framework leverages symmetric contrastive learning within a triplet network
architecture, enhanced by a cross-attention loss function to improve fault detection. The Continuous
Wavelet Transform (CWT) is first applied to convert traffic data into time-frequency wavelet
images, which are used to pretrain a triplet encoder. A novel symmetric contrastive sampling
strategy is employed to improve training efficiency by using a normal day’s data as an anchor,
from which both positive and negative examples are generated based on domain knowledge. This
approach strengthens contrastive signals, enabling faster and more stable training. The second
framework leverages graph neural networks (GNNs) to capture spatial dependencies within
clusters of sensor stations. These clusters are formed in a reduced-dimensional latent space,
constructed using a dual-encoding attention graph auto-encoder (DAGAE) that embeds both node
and edge features. A cluster-guided denoising graph auto-encoder (CG-DGAE) is then trained
using subgraphs generated from these clusters to reconstruct traffic data from corrupted inputs.
Finally, a fault score function compares the observed and reconstructed data sequences,
identifying discrepancies indicative of sensor faults. Together, these frameworks provide
intelligent and practical solutions for fault detection and data quality control. Their implementation
has the potential to transform the maintenance and operation of transportation systems,
contributing to more reliable and resilient infrastructure.
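
The final step of the second framework, comparing observed and reconstructed sequences with a fault score, can be illustrated with a minimal sketch. The abstract does not define the score itself, so the scaled mean absolute error used here, the names `fault_score` and `flag_faulty`, the threshold of 0.1, and the toy flow values are all assumptions made for illustration, not the dissertation's actual method.

```python
from statistics import mean


def fault_score(observed, reconstructed, eps=1e-6):
    """Hypothetical fault score (an assumption, not the dissertation's definition):
    mean absolute reconstruction error, scaled by the mean observed magnitude."""
    if len(observed) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    abs_err = mean(abs(o - r) for o, r in zip(observed, reconstructed))
    scale = mean(abs(o) for o in observed)
    return abs_err / (scale + eps)


def flag_faulty(scores, threshold=0.1):
    """Return the stations whose score exceeds an assumed threshold."""
    return [station for station, score in scores.items() if score > threshold]


# Toy hourly flow values (made up). In the framework described above, the
# reconstruction would come from the trained denoising graph auto-encoder.
reconstructed = [100.0, 120.0, 140.0, 130.0]
healthy = [102.0, 118.0, 141.0, 129.0]     # tracks the reconstruction closely
drifting = [130.0, 155.0, 180.0, 170.0]    # reads consistently high, as under drift

scores = {
    "station_A": fault_score(healthy, reconstructed),
    "station_B": fault_score(drifting, reconstructed),
}
faulty = flag_faulty(scores)
print(scores)
print(faulty)
```

In practice the threshold would likely be calibrated on data known to be fault-free rather than fixed by hand, and the score could be computed over sliding windows so that intermittent faults are not averaged away.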