Document Type

Conference Proceeding

Publication Date

September 2025 (Fall)

Abstract

As organizations scale model training across large clusters and clouds, data poisoning has emerged as a significant practical threat. Most existing research studies data poisoning in single-node environments; far fewer studies compare attack effectiveness across parallel training strategies, where factors such as gradient aggregation and distributed memory ceilings fundamentally alter attack detectability and impact. To address this gap, we introduce AdversaGuard, a reproducible benchmark and accompanying application designed specifically to protect AI training pipelines in distributed settings. This research makes five key contributions: (1) a comprehensive Distributed Data Poisoning (DDP) benchmark spanning seven distributed systems (leveraging data, model, and hybrid parallelism) and a non-distributed baseline; (2) a standardized attack suite implementing eight poisoning methods with consistent budgets for fair comparison; (3) an analysis of how different parallel strategies modulate attack impact, demonstrating that design choices can either mask or amplify vulnerabilities; (4) a publicly available interactive application for live testing and method comparison; and (5) a novel evaluation metric, the AdversaGuard Efficiency Index (AEI), which provides a composite score for DDP robustness that accounts for accuracy, attack success, model size, and computational overhead.

To evaluate AdversaGuard as an HPC-oriented framework for benchmarking adversarial robustness and training efficiency in DDP settings, we experimented with eight system configurations (seven distributed regimes and the non-distributed baseline) across three food-domain image datasets and four model families, ranging from compact CNNs to large Vision Transformers. Within the AdversaGuard framework, we implemented eight common adversarial attacks, including FGSM, PGD, DeepFool, and Carlini-Wagner, and analyzed how data, model, and hybrid parallelism affect scalability, memory consumption, and vulnerability. Our key findings are that data parallelism's gradient averaging can mask low-budget perturbations, while larger models consistently exhibit greater susceptibility to poisoning. The AEI quantifies these trade-offs in a single parallelism-aware robustness score. A companion AdversaGuard application, available in our GitHub repository, enables live testing, with a live demo on YouTube. This work underscores the critical need for scalable, adaptive defenses in modern, distributed AI training pipelines.
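As a reference point for the attack suite, the following is a minimal PyTorch sketch of FGSM in its textbook form (Goodfellow et al.), not AdversaGuard's implementation: each pixel is perturbed by epsilon in the direction of the sign of the loss gradient.

    # Minimal textbook FGSM sketch; illustrative, not AdversaGuard's own code.
    import torch
    import torch.nn.functional as F

    def fgsm(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
             eps: float = 8 / 255) -> torch.Tensor:
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        # Step each pixel by eps in the direction that increases the loss.
        x_adv = x + eps * x.grad.sign()
        # Keep the adversarial image in the valid pixel range.
        return x_adv.clamp(0.0, 1.0).detach()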
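To make the gradient-averaging finding concrete, this NumPy sketch (our illustration, not code from the paper) shows how synchronous data-parallel averaging scales a single compromised worker's gradient contribution by roughly 1/N, which is why low-budget perturbations can be masked at larger worker counts.

    # Illustrative sketch: dilution of one poisoned worker's gradient
    # under synchronous data-parallel averaging.
    import numpy as np

    rng = np.random.default_rng(0)

    def averaged_gradient(num_workers: int, poison_scale: float) -> np.ndarray:
        # Clean workers produce roughly zero-mean gradients; one
        # compromised worker adds a biased "poison" direction.
        clean = [rng.normal(0.0, 1.0, size=8) for _ in range(num_workers - 1)]
        poisoned = rng.normal(0.0, 1.0, size=8) + poison_scale * np.ones(8)
        # All-reduce mean, as in synchronous data parallelism.
        return np.mean(clean + [poisoned], axis=0)

    for n in (1, 4, 16, 64):
        g = averaged_gradient(n, poison_scale=5.0)
        # The poisoned component shrinks roughly as poison_scale / n.
        print(f"workers={n:3d}  mean gradient component {g.mean():+.3f}")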
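The abstract does not give the AEI's closed form, so the following is only a hypothetical sketch of how a composite index could combine the four named factors; the weighting, normalization, and the reference parameter count are our assumptions, not the paper's definition.

    # Hypothetical composite index over the four factors the abstract names
    # (accuracy, attack success, model size, computational overhead).
    # The functional form below is an assumption, NOT the paper's AEI.
    def composite_index(clean_accuracy: float,      # in [0, 1]
                        attack_success_rate: float, # in [0, 1]; lower is better
                        model_params: float,        # parameter count
                        overhead: float,            # relative compute cost, >= 1
                        ref_params: float = 25e6) -> float:
        robustness = clean_accuracy * (1.0 - attack_success_rate)
        # Penalize larger models and higher computational overhead.
        size_penalty = 1.0 + model_params / ref_params
        return robustness / (size_penalty * overhead)

    # Example: a compact CNN vs. a large ViT under the same attack budget.
    print(composite_index(0.91, 0.12, 5e6, 1.0))    # compact, more robust
    print(composite_index(0.94, 0.35, 300e6, 2.5))  # large, more susceptible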

Publication Title

SC25 - HPC Ignite
