This letter investigates a channel assignment problem in uplink wireless communication systems. Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints. A convex-optimization-based algorithm is provided to obtain the optimal channel assignment, with a closed-form solution obtained in each step. Because this algorithm has high computational complexity, machine learning approaches are employed to obtain computationally efficient solutions. More specifically, the training data are generated using the convex-optimization-based algorithm, and the original problem is converted to a regression problem that is addressed by integrating convolutional neural networks (CNNs), feed-forward neural networks (FNNs), random forests and gated recurrent unit networks (GRUs). The results demonstrate that the machine learning method greatly reduces computation time at a slight cost in prediction accuracy.
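To make the objective concrete, the following is a minimal sketch of sum-rate maximization under an integer assignment constraint, solved by exhaustive search rather than by the convex-optimization-based algorithm described above. The rate model log2(1 + p*g/noise), the one-channel-per-user constraint, and all names (`best_assignment`, `gain`, `power`, `noise`) are assumptions made for illustration.

```python
import itertools
import math

def best_assignment(gain, power=1.0, noise=1.0):
    """Exhaustively search one-channel-per-user assignments for the max sum rate.

    gain[u][c] is a hypothetical channel gain of user u on channel c.
    Assumes at least as many channels as users and no channel sharing.
    """
    n_users = len(gain)
    n_channels = len(gain[0])
    best_rate, best_perm = -math.inf, None
    # Each tuple gives user u the channel perm[u]; channels are never reused.
    for perm in itertools.permutations(range(n_channels), n_users):
        rate = sum(
            math.log2(1.0 + power * gain[u][perm[u]] / noise)
            for u in range(n_users)
        )
        if rate > best_rate:
            best_rate, best_perm = rate, perm
    return best_perm, best_rate

# Two users, two channels (illustrative gains only).
gain = [[1.0, 3.0], [7.0, 1.0]]
assignment, sum_rate = best_assignment(gain)
```

The search visits every feasible assignment, so its cost grows factorially with the number of channels. That growth is the kind of computational burden that motivates learning a fast approximate predictor from solved instances.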
Can we incorporate discrete optimization algorithms within modern machine learning models? For example, is it possible to use in deep architectures a layer whose output is the minimal cut of a parametrized graph? Given that these models are trained end-to-end by leveraging gradient information, the introduction of such layers seems very challenging due to their non-continuous output. In this paper we focus on the problem of submodular minimization, for which we show that such layers are indeed possible. The key idea is that we can continuously relax the output without sacrificing guarantees.
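One standard continuous relaxation of a submodular set function is the Lovász extension; the abstract does not state which relaxation is used, so the sketch below is only an illustration of the general idea, using a weighted graph cut as the set function. The helper names `cut_value` and `lovasz_extension` and the toy graph are assumptions.

```python
def cut_value(edges, subset):
    """Weight of edges with exactly one endpoint in `subset`."""
    return sum(w for (u, v, w) in edges if (u in subset) != (v in subset))

def lovasz_extension(set_fn, x):
    """Evaluate the Lovasz extension of `set_fn` at the real vector `x`.

    Assumes set_fn(empty set) == 0. Elements are added in order of
    decreasing x[i], and each is weighted by its marginal gain.
    """
    order = sorted(range(len(x)), key=lambda i: -x[i])
    value, chosen, prev = 0.0, set(), set_fn(set())
    for i in order:
        chosen.add(i)
        cur = set_fn(chosen)
        value += x[i] * (cur - prev)
        prev = cur
    return value

# Path graph 0 - 1 - 2 with illustrative edge weights.
edges = [(0, 1, 2.0), (1, 2, 1.0)]
cut = lambda s: cut_value(edges, s)

on_vertex = lovasz_extension(cut, [1.0, 0.0, 0.0])   # indicator of {0}
interior = lovasz_extension(cut, [0.5, 0.2, 0.9])    # fractional point
```

On indicator vectors the extension agrees with the original set function, and for cut functions it reduces to the weighted sum of |x_u - x_v| over edges, which is piecewise linear and convex in `x`.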
Recent advancements in quantum computing have driven the scientific community's quest to solve a certain class of complex problems for which quantum computers would be better suited than traditional supercomputers. To improve the efficiency with which quantum computers can solve these problems, scientists are investigating the use of artificial intelligence approaches. In a new study, scientists at the U.S. Department of Energy's (DOE) Argonne National Laboratory have developed a new algorithm based on reinforcement learning to find the optimal parameters for the Quantum Approximate Optimization Algorithm (QAOA), which allows a quantum computer to solve certain combinatorial problems such as those that arise in materials design, chemistry and wireless communications. "Combinatorial optimization problems are those for which the solution space gets exponentially larger as you expand the number of decision variables," said Argonne computer scientist Prasanna Balaprakash. "In one traditional example, you can find the shortest route for a salesman who needs to visit a few cities once by enumerating all possible routes, but given a couple thousand cities, the number of possible routes far exceeds the number of stars in the universe; even the fastest supercomputers cannot find the shortest route in a reasonable time."
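The enumeration approach mentioned in the quote can be sketched for a handful of cities as follows. This is a brute-force illustration only and is unrelated to QAOA or to the reinforcement learning method in the study; the distance matrix and the function names are made up for the example.

```python
import itertools
import math

def shortest_tour(dist):
    """Return the shortest closed tour by enumerating every ordering.

    dist[i][j] is the distance between cities i and j. City 0 is fixed
    as the start so that rotations of the same tour are not recounted.
    """
    n = len(dist)
    best_len, best_tour = math.inf, None
    for order in itertools.permutations(range(1, n)):
        tour = (0,) + order
        length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_tour, best_len

def count_tours(n):
    """Number of distinct closed tours over n cities with symmetric distances."""
    return math.factorial(n - 1) // 2

dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
tour, length = shortest_tour(dist)
```

Because `count_tours(n)` grows as (n - 1)!/2, enumeration is practical only for very small `n`, which is the point the quote makes about a couple thousand cities.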
We propose new continuous-time formulations for first-order stochastic optimization algorithms such as mini-batch gradient descent and variance-reduced methods. We exploit these continuous-time models, together with simple Lyapunov analysis as well as tools from stochastic calculus, in order to derive convergence bounds for various types of non-convex functions. Guided by such analysis, we show that the same Lyapunov arguments hold in discrete-time, leading to matching rates. In addition, we use these models and Ito calculus to infer novel insights on the dynamics of SGD, proving that a decreasing learning rate acts as time warping or, equivalently, as landscape stretching. Papers published at the Neural Information Processing Systems Conference.
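For readers unfamiliar with the discrete-time algorithm being modeled, here is a minimal sketch of mini-batch gradient descent with a decreasing learning rate on a toy one-dimensional objective. It does not implement the continuous-time models or the Lyapunov analysis; the objective f(x) = mean((x - z)^2)/2, the schedule lr0/(k + 1), and all names are assumptions chosen for illustration.

```python
import random

def minibatch_sgd(data, batch_size, steps, lr0=1.0, seed=0):
    """Minimize f(x) = mean((x - z)^2) / 2 over the points z in `data`.

    Each step draws a mini-batch without replacement, averages the
    per-sample gradients (x - z), and uses the step size lr0 / (k + 1).
    """
    rng = random.Random(seed)
    x = 0.0
    for k in range(steps):
        batch = rng.sample(data, batch_size)
        grad = sum(x - z for z in batch) / batch_size
        lr = lr0 / (k + 1)  # decreasing learning rate
        x -= lr * grad
    return x

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
full_batch = minibatch_sgd(data, batch_size=6, steps=5)
stochastic = minibatch_sgd(data, batch_size=2, steps=2000)
```

With the full batch the gradient is exact and the iterate reaches the minimizer (the mean of `data`, 3.5) after the first step. With mini-batches the gradient is noisy, and the shrinking step size progressively damps that noise.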
Federated learning (FL) enables on-device training over distributed networks consisting of a massive number of modern smart devices, such as smartphones and IoT devices. However, the leading optimization algorithm in such settings, federated averaging (FedAvg), suffers from heavy communication cost and an inevitable performance drop, especially when the local data are distributed in a non-IID way. To alleviate this problem, we propose two potential solutions that introduce additional mechanisms to the on-device training. The first (FedMMD) trains a two-stream model with a Maximum Mean Discrepancy (MMD) constraint on devices, instead of the single model used in vanilla FedAvg. Experiments show that the proposed method outperforms baselines, especially in non-IID FL settings, with a reduction of more than 20% in required communication rounds. The second is FL with feature fusion (FedFusion). By aggregating the features from both the local and global models, we achieve higher accuracy at lower communication cost. Furthermore, the feature fusion modules offer better initialization for newly incoming clients and thus speed up convergence. Experiments in popular FL scenarios show that FedFusion outperforms baselines in both accuracy and generalization ability while reducing the number of required communication rounds by more than 60%.
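The two building blocks named above can be sketched in a few lines. The first function is the usual FedAvg server step, a weighted average of client parameters by local sample count. The second is a squared MMD estimate with a linear kernel, which reduces to the squared distance between feature means. The kernel choice, the flat-list parameter representation, and the function names are assumptions; the abstract does not specify how FedMMD or FedFusion implement these details.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]

def linear_mmd2(feats_a, feats_b):
    """Squared MMD with a linear kernel: ||mean(feats_a) - mean(feats_b)||^2."""
    dim = len(feats_a[0])
    mean_a = [sum(f[j] for f in feats_a) / len(feats_a) for j in range(dim)]
    mean_b = [sum(f[j] for f in feats_b) / len(feats_b) for j in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))

# Two clients holding 1 and 3 samples (illustrative values only).
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])

# Features from a hypothetical local stream and global stream.
penalty = linear_mmd2([[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [3.0, 3.0]])
```

In a two-stream setup, a term like `penalty` would typically be added to the local training loss to keep the local features close to those of the global model; a nonlinear kernel such as a Gaussian is a common alternative to the linear one used here.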