Date of Award

Spring 5-15-2022

Degree Type


Degree Name

Doctor of Philosophy (PhD)


Electrical Engineering and Computer Science


Gursoy, M. Cenk

Second Advisor

Gursoy, Senem Velipasalar

Subject Categories

Electrical and Computer Engineering | Engineering


Fueled by emerging applications and exponential increase in data traffic, wireless networks have recently grown significantly and become more complex. In such large-scale complex wireless networks, it is challenging and, oftentimes, infeasible for conventional optimization methods to quickly solve critical decision-making problems. With this motivation, in this thesis, machine learning methods are developed and utilized for obtaining optimal/near-optimal solutions for timely decision making in wireless networks.

Content caching at the edge nodes is a promising technique to reduce the data traffic in next-generation wireless networks. In this context, we in the first part of the thesis study content caching at the wireless network edge using a deep reinforcement learning framework with Wolpertinger architecture. Initially, we develop a learning-based caching policy for a single base station aiming at maximizing the long-term cache hit rate. Then, we extend this study to a wireless communication network with multiple edge nodes. In particular, we propose deep actor-critic reinforcement learning based policies for both centralized and decentralized content caching.

Next, with the purpose of making efficient use of limited spectral resources, we develop a deep actor-critic reinforcement learning based framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. In the single-user model, in order to evaluate the performance of the proposed channel access policy and the framework's tolerance against uncertainty, we explore different channel switching patterns and different switching probabilities. In the case of multiple users, we analyze the probabilities of each user accessing channels with favorable channel conditions and the probability of collision.

Following the analysis of the proposed learning-based dynamic multichannel access policy, we consider adversarial attacks on it. In particular, we propose two adversarial policies, one based on feed-forward neural networks and the other based on deep reinforcement learning policies. Both attack strategies aim at minimizing the accuracy of a deep reinforcement learning based dynamic channel access agent, and we demonstrate and compare their performances.

Next, anomaly detection as an active hypothesis test problem is studied. Specifically, we study deep reinforcement learning based active sequential testing for anomaly detection. We assume that there is an unknown number of abnormal processes at a time and the agent can only check with one sensor in each sampling step. To maximize the confidence level of the decision and minimize the stopping time concurrently, we propose a deep actor-critic reinforcement learning framework that can dynamically select the sensor based on the posterior probabilities. Separately, we also regard the detection of threshold crossing as an anomaly detection problem, and analyze it via hierarchical generative adversarial networks (GANs).

In the final part of the thesis, to address state estimation and detection problems in the presence of noisy sensor observations and probing costs, we develop a soft actor-critic deep reinforcement learning framework. Moreover, considering Byzantine attacks, we design a GAN-based framework to identify the Byzantine sensors. To evaluate the proposed framework, we measure the performance in terms of detection accuracy, stopping time, and the total probing cost needed for detection.


Open Access