Date of Award

9-13-2023

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical Engineering and Computer Science

Advisor(s)

Pramod Varshney

Second Advisor

Venkata Gandikota

Keywords

Byzantines;Clustering;Distributed Optimization;Learning Distributions;Robust Machine Learning;Robust Statistics

Abstract

In today's highly connected world, the number of smart devices has increased exponentially. These devices generate huge amounts of real-time data, perform complicated computational tasks, and provide actionable information. Over the past decade, numerous machine learning approaches have been widely adopted to infer hidden information from this massive and complex data. In some crucial application domains, however, accuracy alone is not enough: safety and reliability guarantees on the underlying learning models are critical requirements as well. This in turn necessitates that the learned models be robust to corrupted data. Data can be corrupted by adversarial attacks in which samples take arbitrary values that degrade the performance of the algorithm; an adversary can replace samples with erroneous or malicious ones, such as false labels or arbitrary inputs. In this dissertation, we refer to this type of attack as an attack on data.

Moreover, with the rapid increase in the volume of data, storing and processing all of it at a central location becomes computationally expensive, so distributing tasks across multiple machines (known as distributed learning) is warranted. In distributed learning, improving the computational and communication efficiency of the optimization algorithms while maintaining high accuracy is critical. However, an adversary can replace the data transmitted by the machines in the system with arbitrary values that degrade the performance of the learning task. We refer to this as an attack on devices. Both attack scenarios can significantly reduce the accuracy of the results and thereby the quality of the learned model. Hence, a new generation of systems that are robust to such adversarial attacks and provide provable performance guarantees is warranted. The goal of this dissertation is to develop learning algorithms that are robust to such attacks under two frameworks: 1) supervised learning, where the true labels of the samples are known; and 2) unsupervised learning, where the labels are not known.

Although neural networks have achieved widespread success, theoretical understanding of their performance is lacking. Therefore, in the first part of the dissertation (Chapter 2), we seek to understand the inner workings of a neural network by learning its parameters, and we generalize the estimation procedure to account for robustness in the presence of adversarial attacks (attack on data). We devise a learning algorithm to estimate the parameters (weight matrix and bias vector) of a single-layer neural network with rectified linear unit (ReLU) activation in the unsupervised learning framework, where each output sample can potentially be an arbitrary outlier with a fixed probability. Our estimation algorithm uses gradient descent together with a median-based filter to mitigate the effect of the outliers. We further determine the number of samples required to estimate the parameters of the network in the presence of outliers.
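As a concrete illustration of this estimation procedure, the following minimal Python sketch fits the weight matrix and bias vector of a single-layer ReLU network by gradient descent, discarding suspected outliers with a simple median-based residual filter. The model y = ReLU(Wx + b), the filtering rule, and all function names here are illustrative assumptions, not the dissertation's actual algorithm.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def robust_relu_fit(X, Y, n_iters=500, lr=0.1, seed=0):
    """Estimate (W, b) of y = ReLU(W @ x + b) from (X, Y) pairs in which
    each output row may be an arbitrary outlier. Illustrative sketch:
    before each gradient step, samples whose residual norm exceeds the
    median residual norm are filtered out (a simple median-based filter,
    not the dissertation's exact rule)."""
    rng = np.random.default_rng(seed)
    d_in, d_out = X.shape[1], Y.shape[1]
    W = rng.normal(scale=0.1, size=(d_out, d_in))
    b = np.zeros(d_out)
    for _ in range(n_iters):
        resid = relu(X @ W.T + b) - Y              # (n, d_out) residuals
        norms = np.linalg.norm(resid, axis=1)
        keep = norms <= np.median(norms)           # median-based filter
        Xk, Rk = X[keep], resid[keep]
        active = (Xk @ W.T + b) > 0                # ReLU subgradient mask
        W -= lr * (Rk * active).T @ Xk / len(Xk)
        b -= lr * (Rk * active).mean(axis=0)
    return W, b
```

Keeping only the samples whose residual norm falls below the median is a crude stand-in for the dissertation's filter; it illustrates the key idea that outlier outputs, being arbitrary, tend to produce large residuals and can be screened out before each gradient step.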
Combining the use of distributed systems to solve large-scale problems with the recent success of deep learning, there has been a surge of development in the field of distributed learning, further catalyzed by the advent of federated learning. Despite extensive research in this area, distributed learning still faces the challenge of training a high-dimensional model in a distributed manner while remaining robust to adversarial attacks. Hence, in the second part of the dissertation (Chapters 3 and 4), we study distributed learning in the presence of adversarial nodes (attack on devices). Specifically, we consider the worker-server architecture for minimizing a global loss function under both learning frameworks in the presence of adversarial nodes (Byzantines). Each honest node performs some computation based only on its own local data and then communicates with the central server, which performs aggregation; an adversarial node, however, may send arbitrary information to the central server. In Chapter 3, we consider robust distributed learning under the supervised learning framework. We propose a novel algorithm that combines variance reduction with a filtering technique based on the vector median to mitigate the effect of the Byzantines, and we prove convergence of the approach to a first-order stationary point. In Chapter 4, we consider robust distributed learning under the unsupervised learning framework (robust clustering). We propose a novel algorithm that combines redundant data assignment with the paradigm of distributed clustering, and we show that our proposed approaches obtain constant-factor approximate solutions in the presence of adversarial nodes.
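The server-side aggregation in Chapter 3 can be illustrated with a short sketch. The dissertation's filter is based on a vector median; the coordinate-wise median below is a common, simpler stand-in, and all names and signatures are assumptions for illustration.

```python
import numpy as np

def coordinate_wise_median(grads):
    """Aggregate the workers' gradient vectors by taking the median in
    each coordinate: arbitrary vectors sent by a minority of Byzantine
    workers cannot pull any coordinate far from the honest values."""
    return np.median(np.stack(grads, axis=0), axis=0)

def robust_server_step(theta, worker_grad_fns, lr=0.01):
    """One round of the worker-server loop: every worker reports a local
    gradient (a Byzantine worker may report anything), and the server
    updates the model with the robust aggregate."""
    grads = [g(theta) for g in worker_grad_fns]
    return theta - lr * coordinate_wise_median(grads)
```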
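The redundant-data-assignment idea behind Chapter 4 can likewise be sketched: each point is replicated to r workers, so the server holds r independently computed copies of any per-point quantity and can filter out a Byzantine minority by taking a median across the copies. The functions and data layout below are illustrative assumptions, not the dissertation's algorithm.

```python
import numpy as np

def replicate_data(n_points, n_workers, r, seed=0):
    """Redundant data assignment: place each point on r distinct workers,
    so the server later holds r independently computed copies of any
    per-point quantity."""
    rng = np.random.default_rng(seed)
    placement = [rng.choice(n_workers, size=r, replace=False)
                 for _ in range(n_points)]
    shards = [[] for _ in range(n_workers)]
    for i, workers in enumerate(placement):
        for w in workers:
            shards[w].append(i)
    return placement, shards

def robust_point_costs(reports, placement):
    """Server-side filtering: reports[w][i] is worker w's reported cost
    for point i (e.g., distance to its local cluster centers). Taking the
    median over the r copies of each point's cost neutralizes a Byzantine
    minority among the workers holding that point."""
    return np.array([np.median([reports[w][i] for w in workers])
                     for i, workers in enumerate(placement)])
```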

Access

Open Access
