data unit
SPOR: A Comprehensive and Practical Evaluation Method for Compositional Generalization in Data-to-Text Generation
Compositional generalization is an important ability of language models and has many different manifestations. For data-to-text generation, previous research on this ability is limited to a single manifestation called Systematicity and does not consider large language models (LLMs), so it cannot fully cover practical application scenarios. In this work, we propose SPOR, a comprehensive and practical evaluation method for compositional generalization in data-to-text generation. SPOR covers four manifestations (Systematicity, Productivity, Order invariance, and Rule learnability) and allows high-quality evaluation based on existing datasets without additional manual annotations. We demonstrate SPOR on two different datasets and evaluate several existing language models, including LLMs. We find that the models are deficient in various aspects of the evaluation and need further improvement. Our work shows the necessity for comprehensive research on different manifestations of compositional generalization in data-to-text generation and provides a framework for evaluation.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.05)
- North America > United States > Texas > Travis County > Austin (0.04)
- Consumer Products & Services > Restaurants (0.69)
- Government (0.68)
Observations on Building RAG Systems for Technical Documents
Soman, Sumit, Roychowdhury, Sujoy
Retrieval augmented generation (RAG) for technical documents is challenging because embeddings often fail to capture domain information. We review prior art for important factors affecting RAG and perform experiments to highlight best practices and potential challenges in building RAG systems for technical documents.
- Research Report (0.50)
- Overview (0.34)
Quantum computing & artificial intelligence: 10 things you should know
In recent years, emerging technologies have become prominent. Among them, quantum computing has singular potential to change our world. Quantum computing has shown promising evidence that it can substantially speed up heuristic computations. Thus, applying quantum computing within complex solutions to address problems in pharmaceuticals and materials discovery, finance, autonomous vehicle applications, artificial intelligence, and other areas will have a significant impact on our lives. In particular, quantum computing has the potential to magnify the effects (both positive and negative) of many AI applications.
- Health & Medicine (0.70)
- Information Technology > Security & Privacy (0.31)
- Information Technology > Hardware (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.48)
Energy Efficient Edge Computing: When Lyapunov Meets Distributed Reinforcement Learning
Sana, Mohamed, Merluzzi, Mattia, di Pietro, Nicola, Strinati, Emilio Calvanese
In this work, we study the problem of energy-efficient computation offloading enabled by edge computing. In the considered scenario, multiple users simultaneously compete for limited radio and edge computing resources to get offloaded tasks processed under a delay constraint, with the possibility of exploiting low-power sleep modes at all network nodes. The radio resource allocation takes into account inter- and intra-cell interference, and the duty cycles of the radio and computing equipment have to be jointly optimized to minimize the overall energy consumption. To address this issue, we formulate the underlying problem as a dynamic long-term optimization. Then, based on Lyapunov stochastic optimization tools, we decouple the formulated problem into a CPU scheduling problem and a radio resource allocation problem to be solved on a per-slot basis. Whereas the first one can be optimally and efficiently solved using a fast iterative algorithm, the second one is solved using distributed multi-agent reinforcement learning due to its non-convexity and NP-hardness. The resulting framework achieves up to 96.5% of the performance of the optimal strategy based on exhaustive search, while drastically reducing complexity. The proposed solution also increases the network's energy efficiency compared to a benchmark heuristic approach.
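As a rough illustration of how Lyapunov stochastic optimization turns a long-term problem into per-slot decisions, the sketch below applies the generic drift-plus-penalty rule to a single task queue. It is not the authors' formulation: the action set `ACTIONS`, the arrival process, and the trade-off parameter `V` are hypothetical values chosen only for the example.

```python
import random

# Hypothetical action set: (tasks served per slot, energy spent per slot).
ACTIONS = [(0, 0.0), (1, 1.0), (2, 3.0)]


def drift_plus_penalty_action(queue, actions, V):
    """Pick the action minimizing V * energy - queue * service for this slot.

    A large V favors low energy; a large backlog favors high service.
    """
    return min(actions, key=lambda a: V * a[1] - queue * a[0])


def simulate(num_slots=1000, V=10.0, seed=0):
    """Run the per-slot rule and return (final backlog, average energy)."""
    rng = random.Random(seed)
    queue = 0.0
    total_energy = 0.0
    for _ in range(num_slots):
        service, energy = drift_plus_penalty_action(queue, ACTIONS, V)
        arrivals = rng.choice([0, 1])  # hypothetical arrival process
        # Standard queue update: serve first, then admit new arrivals.
        queue = max(queue - service, 0.0) + arrivals
        total_energy += energy
    return queue, total_energy / num_slots


if __name__ == "__main__":
    backlog, avg_energy = simulate()
    print(f"final backlog: {backlog}, average energy per slot: {avg_energy:.3f}")
```

Increasing `V` in this toy setting trades a larger backlog (and hence delay) for lower average energy, which is the usual energy-delay trade-off that drift-plus-penalty exposes.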
The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations
Duncan, Jeremiah, Fallas, Fabian, Gropp, Chris, Herron, Emily, Mahbub, Maria, Olaya, Paula, Ponce, Eduardo, Samuel, Tabitha K., Schultz, Daniel, Srinivasan, Sudarshan, Tang, Maofeng, Zenkov, Viktor, Zhou, Quan, Begoli, Edmon
Authorship analysis is an important subject in the field of natural language processing. It allows the detection of the most likely writer of articles, news, books, or messages. This technique has multiple uses in tasks related to authorship attribution, detection of plagiarism, style analysis, sources of misinformation, etc. The focus of this paper is to explore the limitations and sensitivity of established approaches to adversarial manipulations of inputs. To this end, and using those established techniques, we first developed an experimental framework for author detection and input perturbations. Next, we experimentally evaluated the performance of the authorship detection model against a collection of semantic-preserving adversarial perturbations of input narratives. Finally, we compare and analyze the effects of different perturbation strategies and of input and model configurations on the author detection model.
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
DeepStreet: A deep learning powered urban street network generation module
Fang, Zhou, Yang, Tianren, Jin, Ying
In countries experiencing unprecedented waves of urbanization, there is a need for rapid and high-quality urban street design. Our study presents a novel deep learning powered approach, DeepStreet (DS), for automatic street network generation that can be applied to urban street design with local characteristics. DS is driven by a Convolutional Neural Network (CNN) that enables the interpolation of streets based on the areas in the immediate vicinity. Specifically, the CNN is first trained to detect, recognize and capture the local features as well as the patterns of the existing street network sourced from OpenStreetMap. With the trained CNN, DS is able to predict a street network's future expansion patterns within a predefined region, conditioned on its surrounding street networks. To test the performance of DS, we apply it to an area in and around the Eixample district of Barcelona, a well-known example in the fields of urban and transport planning with iconic grid-like street networks in the centre and irregular road alignments farther afield. The results show that DS can (1) detect and self-cluster different types of complex street patterns in Barcelona; (2) predict both gridiron and irregular street and road networks. DS shows great potential as a novel tool for designers to efficiently design urban street networks that maintain consistency between the existing and newly generated street network. Furthermore, the generated networks can serve as a benchmark to guide local plan-making, especially in rapidly developing cities.
Keywords: Urban street network, machine learning, deep learning, Convolutional Neural Network (CNN), Generative Adversarial Network (GAN), image completion, image inpainting
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
- North America > United States > Pennsylvania (0.04)
- Asia (0.04)
- Africa (0.04)
- Information Technology (0.68)
- Transportation > Infrastructure & Services (0.40)
- Transportation > Ground > Road (0.40)
Resource Management for Blockchain-enabled Federated Learning: A Deep Reinforcement Learning Approach
Hieu, Nguyen Quang, Anh, Tran The, Luong, Nguyen Cong, Niyato, Dusit, Kim, Dong In, Elmroth, Erik
Blockchain-enabled Federated Learning (BFL) enables model updates of Federated Learning (FL) to be stored in the blockchain in a secure and reliable manner. However, one issue with BFL is that the training latency may increase due to the blockchain mining process. Another issue is that mobile devices in BFL have energy and CPU constraints that may reduce the system lifetime and training efficiency. To address these issues, the Machine Learning Model Owner (MLMO) needs to (i) decide how much data and energy the mobile devices use for training and (ii) determine the mining difficulty, so as to minimize the training latency and energy consumption while achieving the target model accuracy. Under the uncertainty of the BFL environment, it is challenging for the MLMO to determine the optimal decisions. We propose to use Deep Reinforcement Learning (DRL) to derive the optimal decisions for the MLMO.
- Energy (0.68)
- Information Technology > Security & Privacy (0.47)
- Materials > Metals & Mining (0.36)
Towards automatic construction of multi-network models for heterogeneous multi-task learning
Garciarena, Unai, Mendiburu, Alexander, Santana, Roberto
Multi-task learning, as it is understood nowadays, consists of using a single model to carry out several similar tasks. From classifying hand-written characters of different alphabets to figuring out how to play several Atari games using reinforcement learning, multi-task models have been able to widen their performance range across different tasks, although these tasks are usually of a similar nature. In this work, we attempt to widen this range even further by including heterogeneous tasks in a single learning procedure. To do so, we first formally define a multi-network model, identifying the components and characteristics necessary to allow different adaptations of said model depending on the tasks it is required to fulfill. Second, employing the formal definition as a starting point, we develop an illustrative model example covering three different tasks (classification, regression and data sampling). The performance of this model implementation is then analyzed, showing its capabilities. Motivated by the results of the analysis, we enumerate a set of open challenges and future research lines through which the full potential of the proposed model definition can be exploited.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Spain > Basque Country (0.05)
- North America > Cuba > La Habana Province > Havana (0.04)