Kelowna
Translating Natural Language Queries to SQL Using the T5 Model
Wong, Albert, Pham, Lien, Lee, Young, Chan, Shek, Sadaya, Razel, Khmelevsky, Youry, Clement, Mathias, Cheng, Florence Wing Yau, Mahony, Joe, Ferri, Michael
This paper presents the development process of a natural language to SQL model using the T5 model as the basis. The models, developed in August 2022 for an online transaction processing system and a data warehouse, have a 73% and 84% exact match accuracy, respectively. These models, in conjunction with other work completed in the research project, were implemented for several companies and used successfully on a daily basis. The approach used in the model development could be implemented in a similar fashion for other database environments and with a more powerful pre-trained language model.
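The exact match accuracy quoted above can be illustrated with a minimal sketch. The normalization used here (lowercasing, collapsing whitespace, dropping a trailing semicolon) is an assumption: the abstract does not say how predicted and reference queries were compared, and the function names `normalize_sql` and `exact_match_accuracy` are hypothetical.

```python
def normalize_sql(query: str) -> str:
    """Assumed normalization: trim, drop a trailing semicolon, lowercase,
    and collapse runs of whitespace. Note that lowercasing also changes
    string literals, so a real evaluation may need something stricter."""
    return " ".join(query.strip().rstrip(";").lower().split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize_sql(p) == normalize_sql(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


if __name__ == "__main__":
    preds = ["SELECT name FROM customers;", "select id from orders where total > 10"]
    refs = ["select name  from customers", "SELECT id FROM orders WHERE total >= 10"]
    # The first pair matches after normalization; the second differs (> vs >=).
    print(exact_match_accuracy(preds, refs))  # 0.5
```

Exact match is a strict metric: two queries that return the same rows but are written differently count as a miss.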
zkFDL: An efficient and privacy-preserving decentralized federated learning with zero knowledge proof
Ahmadi, Mojtaba, Nourmohammadi, Reza
Federated learning (FL) has been frequently used in various fields of study and business. Traditional centralized FL systems suffer from serious issues. To address these concerns, decentralized federated learning (DFL) systems have been introduced in recent years; with the help of blockchains, they aim to achieve greater integrity and efficiency. Privacy preservation, however, remains an unaddressed part of these systems. To address this, and to scale the blockchain-based computations, we propose a zero knowledge proof (ZKP) based aggregator (zkDFL) that allows clients to share their large-scale model parameters with a trusted centralized server without revealing their individual data to other clients. We utilize blockchain technology to manage the aggregation algorithm via smart contracts. The server performs a ZKP algorithm to prove to the clients that the aggregation is done according to the accepted algorithm. The server can also prove that all inputs of clients have been used. We evaluate our method on a public dataset about the wearable Internet of Things. As demonstrated by numerical evaluations, zkDFL introduces verifiability of the correctness of the aggregation process and enhances the privacy protection and scalability of DFL systems, while significantly reducing the gas cost.
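The aggregation step whose correctness the server proves can be pictured as a weighted average of client parameters. The sketch below assumes federated averaging (FedAvg) weighted by local sample counts; the abstract does not name the aggregation rule. The function name `federated_average` is hypothetical, and the sketch contains no zero knowledge proof or smart contract logic.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of client parameter vectors.

    Assumed rule: FedAvg, where each client's vector is weighted by the
    number of local samples it trained on.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must send vectors of equal length")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Average each coordinate across clients, weighted by sample count.
    return [
        sum(n * p[i] for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    params = [[1.0, 2.0], [3.0, 4.0]]
    sizes = [1, 3]
    print(federated_average(params, sizes))  # [2.5, 3.5]
```

Because every client vector enters the sum, a proof over this computation would also have to cover the claim that no input was dropped.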
Student Mastery or AI Deception? Analyzing ChatGPT's Assessment Proficiency and Evaluating Detection Strategies
Wang, Kevin, Akins, Seth, Mohammed, Abdallah, Lawrence, Ramon
Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment. Computer science requires practice to develop skills in problem solving and programming that are traditionally developed using assignments. Generative AI has the capability of completing these assignments for students with high accuracy, which dramatically increases the potential for academic integrity issues and students not achieving desired learning outcomes. This work investigates the performance of ChatGPT by evaluating it across three courses (CS1, CS2, databases). ChatGPT completes almost all introductory assessments perfectly. Existing detection methods, such as MOSS and JPlag (based on similarity metrics) and GPTzero (AI detection), have mixed success in identifying AI solutions. An evaluation of instructors and teaching assistants who use heuristics to distinguish between student and AI code shows that their detection is not sufficiently accurate. These observations emphasize the need to adapt assessments and improve detection methods.
Bag of Views: An Appearance-based Approach to Next-Best-View Planning for 3D Reconstruction
Gazani, Sara Hatami, Tucsok, Matthew, Mantegh, Iraj, Najjaran, Homayoun
UAV-based intelligent data acquisition for 3D reconstruction and monitoring of infrastructure has experienced an increasing surge of interest due to recent advancements in image processing and deep learning-based techniques. View planning is an essential part of this task that dictates the information capture strategy and heavily impacts the quality of the 3D model generated from the captured data. Recent methods have used prior knowledge or partial reconstruction of the target to accomplish view planning for active reconstruction; the former approach poses a challenge for complex or newly identified targets while the latter is computationally expensive. In this work, we present Bag-of-Views (BoV), a fully appearance-based model used to assign utility to the captured views for both offline dataset refinement and online next-best-view (NBV) planning applications targeting the task of 3D reconstruction. With this contribution, we also developed the View Planning Toolbox (VPT), a lightweight package for training and testing machine learning-based view planning frameworks, custom view dataset generation of arbitrary 3D scenes, and 3D reconstruction. Through experiments which pair a BoV-based reinforcement learning model with VPT, we demonstrate the efficacy of our model in reducing the number of required views for high-quality reconstructions in dataset refinement and NBV planning.
Approximately Equivariant Graph Networks
Huang, Ningyuan, Levie, Ron, Villar, Soledad
Graph neural networks (GNNs) are commonly described as being permutation equivariant with respect to node relabeling in the graph. This symmetry of GNNs is often compared to the translation equivariance of Euclidean convolutional neural networks (CNNs). However, these two symmetries are fundamentally different: the translation equivariance of CNNs corresponds to symmetries of the fixed domain acting on the image signals (sometimes known as active symmetries), whereas in GNNs any permutation acts on both the graph signals and the graph domain (sometimes described as passive symmetries). In this work, we focus on the active symmetries of GNNs, by considering a learning setting where signals are supported on a fixed graph. In this case, the natural symmetries of GNNs are the automorphisms of the graph. Since real-world graphs tend to be asymmetric, we relax the notion of symmetries by formalizing approximate symmetries via graph coarsening. We present a bias-variance formula that quantifies the tradeoff between the loss in expressivity and the gain in the regularity of the learned estimator, depending on the chosen symmetry group. To illustrate our approach, we conduct extensive experiments on image inpainting, traffic flow prediction, and human pose estimation with different choices of symmetries. We show theoretically and empirically that the best generalization performance can be achieved by choosing a suitably larger group than the graph automorphism, but smaller than the permutation group.
Fine-grained Expressivity of Graph Neural Networks
Böker, Jan, Levie, Ron, Huang, Ningyuan, Villar, Soledad, Morris, Christopher
Numerous recent works have analyzed the expressive power of message-passing graph neural networks (MPNNs), primarily utilizing combinatorial techniques such as the $1$-dimensional Weisfeiler-Leman test ($1$-WL) for the graph isomorphism problem. However, the graph isomorphism objective is inherently binary, not giving insights into the degree of similarity between two given graphs. This work resolves this issue by considering continuous extensions of both $1$-WL and MPNNs to graphons. Concretely, we show that the continuous variant of $1$-WL delivers an accurate topological characterization of the expressive power of MPNNs on graphons, revealing which graphs these networks can distinguish and the level of difficulty in separating them. We identify the finest topology where MPNNs separate points and prove a universal approximation theorem. Consequently, we provide a theoretical framework for graph and graphon similarity combining various topological variants of classical characterizations of the $1$-WL. In particular, we characterize the expressive power of MPNNs in terms of the tree distance, which is a graph distance based on the concept of fractional isomorphisms, and substructure counts via tree homomorphisms, showing that these concepts have the same expressive power as the $1$-WL and MPNNs on graphons. Empirically, we validate our theoretical findings by showing that randomly initialized MPNNs, without training, exhibit competitive performance compared to their trained counterparts. Moreover, we evaluate different MPNN architectures based on their ability to preserve graph distances, highlighting the significance of our continuous $1$-WL test in understanding MPNNs' expressivity.
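For readers unfamiliar with the $1$-WL test, the following is a minimal sketch of the classical, discrete color refinement procedure on finite graphs. It is not the continuous graphon variant studied in the paper. The adjacency-dictionary input format and the function names `wl_histograms` and `wl_distinguishes` are assumptions made for illustration.

```python
from collections import Counter


def wl_histograms(graphs, rounds):
    """Refine colors jointly on several graphs given as {node: [neighbors]}.

    Returns one color histogram (Counter) per graph.
    """
    # Every node starts with the same color 0.
    colors = [{v: 0 for v in adj} for adj in graphs]
    for _ in range(rounds):
        # Signature = own color plus the sorted multiset of neighbor colors.
        sigs = [
            {v: (col[v], tuple(sorted(col[u] for u in adj[v]))) for v in adj}
            for adj, col in zip(graphs, colors)
        ]
        # One shared palette keeps color ids comparable across graphs.
        distinct = sorted({s for g in sigs for s in g.values()})
        palette = {s: i for i, s in enumerate(distinct)}
        colors = [{v: palette[s] for v, s in g.items()} for g in sigs]
    return [Counter(col.values()) for col in colors]


def wl_distinguishes(adj_a, adj_b):
    """True if 1-WL color histograms differ, i.e. the graphs are told apart."""
    # This many rounds is enough for the joint refinement to stabilize.
    rounds = len(adj_a) + len(adj_b)
    hist_a, hist_b = wl_histograms([adj_a, adj_b], rounds)
    return hist_a != hist_b


if __name__ == "__main__":
    six_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                     3: [4, 5], 4: [3, 5], 5: [3, 4]}
    path = {0: [1], 1: [0, 2], 2: [1]}
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    print(wl_distinguishes(six_cycle, two_triangles))  # False
    print(wl_distinguishes(path, triangle))            # True
```

The answer is a yes or no, which is the binary limitation the abstract refers to: the test says nothing about how similar two graphs are.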
Problem-Solving Guide: Predicting the Algorithm Tags and Difficulty for Competitive Programming Problems
Kim, Juntae, Cho, Eunjung, Kim, Dongwoo, Na, Dongbin
In recent years, the software development industry has required problem-solving abilities of engineers, especially application developers. However, AI-based education systems that help solve computer algorithm problems have not yet attracted attention, even though most big tech companies, including Google, Meta, and Amazon, require the ability to solve algorithm problems. The most useful guide to solving an algorithm problem might be guessing its category (tag). Therefore, our study addresses the task of predicting the algorithm tag as a useful tool for engineers and developers. Moreover, we also consider predicting the difficulty levels of algorithm problems, which can serve as guidance for estimating the time required to solve a problem. In this paper, we present a real-world algorithm problem multi-task dataset, AMT, collected mainly from Codeforces, the most famous large competitive programming website. To the best of our knowledge, our proposed dataset is the largest dataset for predicting algorithm tags compared to previous studies. Moreover, our work is the first to address predicting the difficulty levels of algorithm problems. We present a novel deep learning-based method for simultaneously predicting the algorithm tags and the difficulty level of a given algorithm problem. All datasets and source code are available at https://github.com/sronger/PSG_Predicting_Algorithm_Tags_and_Difficulty.
A Sequential Meta-Transfer (SMT) Learning to Combat Complexities of Physics-Informed Neural Networks: Application to Composites Autoclave Processing
Ramezankhani, Milad, Milani, Abbas S.
Physics-Informed Neural Networks (PINNs) have gained popularity in solving nonlinear partial differential equations (PDEs) via integrating physical laws into the training of neural networks, making them superior in many scientific and engineering applications. However, conventional PINNs still fall short in accurately approximating the solution of complex systems with strong nonlinearity, especially in long temporal domains. Besides, since PINNs are designed to approximate a specific realization of a given PDE system, they lack the necessary generalizability to efficiently adapt to new system configurations. This entails computationally expensive re-training from scratch for any new change in the system. To address these shortfalls, in this work a novel sequential meta-transfer (SMT) learning framework is proposed, offering a unified solution for both fast training and efficient adaptation of PINNs in highly nonlinear systems with long temporal domains. Specifically, the framework decomposes the PDE's time domain into smaller time segments to create "easier" PDE problems for PINN training. Then for each time interval, a meta-learner is assigned and trained to achieve an optimal initial state for rapid adaptation to a range of related tasks. Transfer learning principles are then leveraged across time intervals to further reduce the computational cost. Through a composites autoclave processing case study, it is shown that SMT is clearly able to enhance the adaptability of PINNs while significantly reducing computational cost, by a factor of 100.
Learning to Team-Based Navigation: A Review of Deep Reinforcement Learning Techniques for Multi-Agent Pathfinding
Chung, Jaehoon, Fayyad, Jamil, Younes, Younes Al, Najjaran, Homayoun
Multi-agent pathfinding (MAPF) is a critical field in many large-scale robotic applications, often being the fundamental step in multi-agent systems. The increasing complexity of MAPF in complex and crowded environments, however, critically diminishes the effectiveness of existing solutions. In contrast to other studies that have either presented a general overview of the recent advancements in MAPF or extensively reviewed Deep Reinforcement Learning (DRL) within multi-agent system settings independently, our work presented in this review paper focuses on highlighting the integration of DRL-based approaches in MAPF. Moreover, we aim to bridge the current gap in evaluating MAPF solutions by addressing the lack of unified evaluation metrics and providing comprehensive clarification on these metrics. Finally, our paper discusses the potential of model-based DRL as a promising future direction and provides its required foundational understanding to address current challenges in MAPF. Our objective is to assist readers in gaining insight into the current research direction, providing unified metrics for comparing different MAPF algorithms and expanding their knowledge of model-based DRL to address the existing challenges in MAPF.
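As a concrete illustration of what evaluation metrics for MAPF can look like, the sketch below computes two measures that are commonly used in the MAPF literature, makespan and sum of costs, together with a simple vertex-conflict check. The abstract does not list the metrics the review covers, so this selection is an assumption, as are the function names and the path format (one list of grid cells per agent, one cell per time step).

```python
def makespan(paths):
    """Number of time steps until the last agent reaches its goal."""
    return max(len(p) - 1 for p in paths)


def sum_of_costs(paths):
    """Total number of time steps taken by all agents."""
    return sum(len(p) - 1 for p in paths)


def has_vertex_conflict(paths):
    """True if two agents occupy the same cell at the same time step.

    Assumption: an agent that has finished waits at its goal cell.
    Edge (swap) conflicts are not checked in this sketch.
    """
    horizon = max(len(p) for p in paths)
    for t in range(horizon):
        occupied = set()
        for p in paths:
            cell = p[min(t, len(p) - 1)]
            if cell in occupied:
                return True
            occupied.add(cell)
    return False


if __name__ == "__main__":
    paths = [
        [(0, 0), (0, 1), (0, 2)],  # agent 0: two moves
        [(1, 0), (0, 0)],          # agent 1: one move, then waits
    ]
    print(makespan(paths))             # 2
    print(sum_of_costs(paths))         # 3
    print(has_vertex_conflict(paths))  # False
```

Makespan rewards finishing the whole team quickly, while sum of costs rewards short individual paths, so two algorithms can rank differently depending on which one is reported.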
Facilitating Sim-to-real by Intrinsic Stochasticity of Real-Time Simulation in Reinforcement Learning for Robot Manipulation
Dershan, Ram, Enayati, Amir M. Soufi, Zhang, Zengjie, Richert, Dean, Najjaran, Homayoun
Simulation is essential to reinforcement learning (RL) before implementation in the real world, especially for safety-critical applications like robot manipulation. Conventionally, RL agents are sensitive to the discrepancies between the simulation and the real world, known as the sim-to-real gap. The application of domain randomization, a technique used to fill this gap, is limited to the imposition of heuristic-randomized models. We investigate the properties of intrinsic stochasticity of real-time simulation (RT-IS) of off-the-shelf simulation software and its potential to improve RL performance. This improvement includes a higher tolerance to noise and model imprecision and superiority to conventional domain randomization in terms of ease of use and automation. Firstly, we conduct analytical studies to measure the correlation of RT-IS with the utilization of computer hardware and validate its comparability with the natural stochasticity of a physical robot. Then, we exploit the RT-IS feature in the training of an RL agent. The simulation and physical experiment results verify the feasibility and applicability of RT-IS to robust agent training for robot manipulation tasks. The RT-IS-powered RL agent outperforms conventional agents on robots with modeling uncertainties. RT-IS requires less heuristic randomization, is not task-dependent, and achieves better generalizability than the conventional domain-randomization-powered agents. Our findings provide a new perspective on the sim-to-real problem in practical applications like robot manipulation tasks.