Overview
A Reparameterized Discrete Diffusion Model for Text Generation
Zheng, Lin, Yuan, Jianbo, Yu, Lei, Kong, Lingpeng
We derive an alternative yet equivalent formulation of the sampling from discrete diffusion processes and leverage this insight to develop a family of reparameterized discrete diffusion models. The derived generic framework is highly flexible, offers a fresh perspective of the generation process in discrete diffusion models, and features more effective training and decoding techniques. We conduct extensive experiments to evaluate the text generation capability of our model, demonstrating significant improvements over existing diffusion models.
However, there are noticeably fewer success cases in employing diffusion models for large-scale text generation tasks. This is possibly due to the discrete nature of natural languages, while most conventional diffusion models focus on continuous-valued contents. To bridge the discrepancy, recent work aims at conducting the diffusion process over token embeddings so that the continuous diffusion models can be applied to discrete texts (Li et al., 2022; Gong et al., 2022; Strudel et al., 2022; Dieleman et al., 2022) or logits (Han et al., 2022; Richemond et al., 2022). Nevertheless, these approaches often require designing a well-crafted rounding [...]
Bayesian Optimization of ESG Financial Investments
Garrido-Merchán, Eduardo C., Piris, Gabriel González, Vaca, Maria Coronado
Financial experts and analysts seek to predict the variability of financial markets, since a correct prediction of this variability ensures successful investments for investors. In recent years, however, ESG criteria have become a major trend in finance. ESG (Environmental, Social and Governance) criteria have gained significance because of the growing importance of socially responsible investment and because of the financial impact that companies suffer when they fail to comply. Consequently, building a stock portfolio should take into account not only its performance but also its compliance with ESG criteria. This paper therefore combines mathematical modelling with ESG and finance. In more detail, we use Bayesian optimization (BO), a state-of-the-art sequential design strategy for optimizing black boxes whose analytical expressions are unknown and costly to compute, to maximize the performance of a stock portfolio under ESG criteria that enter the objective function as soft constraints. In an illustrative experiment, we use the Sharpe ratio, which takes into account the portfolio returns and their variance; in other words, it balances the trade-off between maximizing returns and minimizing risk. In the present work, ESG criteria are divided into fourteen independent categories that are linearly combined to estimate a firm's total ESG score. Most importantly, the presented approach would scale to alternative black-box methods of estimating the performance and ESG compliance of the stock portfolio. This research opens the door to new research lines, as it shows that a portfolio can be optimized with BO while taking into account both financial performance and the fulfilment of ESG criteria.
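The objective described in this abstract can be sketched as a black-box function that a Bayesian optimizer would query. This is a minimal illustration, not the authors' implementation: the function names, the additive form of the ESG term, and the `esg_weight` parameter are assumptions, and the Sharpe ratio is computed in its usual form (mean excess return divided by the standard deviation of returns).

```python
import statistics


def portfolio_returns(weights, asset_returns):
    """Per-period portfolio return; asset_returns is a list of periods,
    each period being a list of per-asset returns."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return divided by its sample standard deviation.
    Raises ZeroDivisionError if all returns are identical."""
    excess = [r - risk_free_rate for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


def esg_score(category_scores, category_weights):
    """Linear combination of per-category scores (fourteen in the paper)."""
    return sum(w * s for w, s in zip(category_weights, category_scores))


def objective(weights, asset_returns, asset_esg, esg_weight=0.5):
    """Hypothetical soft-constrained objective: Sharpe ratio plus a weighted
    portfolio ESG score. The additive form is an assumption of this sketch."""
    sharpe = sharpe_ratio(portfolio_returns(weights, asset_returns))
    portfolio_esg = sum(w * e for w, e in zip(weights, asset_esg))
    return sharpe + esg_weight * portfolio_esg
```

A BO library would treat `objective` as the function to maximize over `weights`; the optimizer itself is left out of this sketch.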
PDSum: Prototype-driven Continuous Summarization of Evolving Multi-document Sets Stream
Yoon, Susik, Chan, Hou Pong, Han, Jiawei
Summarizing text-rich documents has long been studied in the literature, but most existing efforts summarize a static and predefined multi-document set. With the rapid development of online platforms for generating and distributing text-rich documents, there arises an urgent need for continuously summarizing dynamically evolving multi-document sets where the composition of documents and sets changes over time. This is especially challenging as the summarization should be not only effective in incorporating relevant, novel, and distinctive information from each concurrent multi-document set, but also efficient in serving online applications. In this work, we propose a new summarization problem, Evolving Multi-Document sets stream Summarization (EMDS), and introduce a novel unsupervised algorithm, PDSum, built on the idea of prototype-driven continuous summarization. PDSum builds a lightweight prototype of each multi-document set and exploits it to adapt to new documents while preserving accumulated knowledge from previous documents. To update the summaries, the most representative sentences for each multi-document set are extracted by measuring their similarities to the prototypes. A thorough evaluation with real multi-document set streams demonstrates that PDSum outperforms state-of-the-art unsupervised multi-document summarization algorithms in EMDS in terms of relevance, novelty, and distinctiveness and is also robust to various evaluation settings.
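The prototype idea in this abstract can be illustrated with a small sketch. It is a simplification under stated assumptions rather than the PDSum algorithm: documents and sentences are assumed to be already embedded as plain lists of floats, the prototype is a running blend of mean vectors, and the `decay` parameter and all function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def update_prototype(prototype, new_vectors, decay=0.5):
    """Blend the old prototype with the mean of newly arrived documents.
    Exponential blending is an assumption standing in for 'adapting to new
    documents while preserving accumulated knowledge'."""
    new_mean = mean_vector(new_vectors)
    if prototype is None:
        return new_mean
    return [decay * p + (1 - decay) * m for p, m in zip(prototype, new_mean)]


def top_sentences(prototype, sentence_vectors, k=1):
    """Indices of the k sentences most similar to the prototype."""
    ranked = sorted(
        range(len(sentence_vectors)),
        key=lambda i: cosine(prototype, sentence_vectors[i]),
        reverse=True,
    )
    return ranked[:k]
```

Calling `update_prototype` once per incoming batch and then `top_sentences` on that set's candidate sentences gives one updated extractive summary per multi-document set.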
NapSS: Paragraph-level Medical Text Simplification via Narrative Prompting and Sentence-matching Summarization
Lu, Junru, Li, Jiazheng, Wallace, Byron C., He, Yulan, Pergola, Gabriele
Accessing medical literature is difficult for laypeople as the content is written for specialists and contains medical jargon. Automated text simplification methods offer a potential means to address this issue. In this work, we propose a summarize-then-simplify two-stage strategy, which we call NapSS, identifying the relevant content to simplify while ensuring that the original narrative flow is preserved. In this approach, we first generate reference summaries via sentence matching between the original and the simplified abstracts. These summaries are then used to train an extractive summarizer, learning the most relevant content to be simplified. Then, to ensure the narrative consistency of the simplified text, we synthesize auxiliary narrative prompts combining key phrases derived from the syntactical analyses of the original text. Our model achieves results significantly better than the seq2seq baseline on an English medical corpus, yielding 3% to 4% absolute improvements in terms of lexical similarity, and providing a further 1.1% improvement in SARI score when combined with the baseline. We also highlight shortcomings of existing evaluation methods, and introduce new metrics that take into account both lexical and high-level semantic similarity. A human evaluation conducted on a random sample of the test set further establishes the effectiveness of the proposed approach. Code and models are released here: https://github.com/LuJunru/NapSS.
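The first stage, building reference summaries by sentence matching, might look roughly like the following. This is a sketch and not the code from the NapSS repository: the Jaccard token overlap, the `threshold` value, and the function names are all assumptions, since the abstract does not say which similarity measure is used.

```python
def tokens(sentence):
    """Lowercased whitespace tokens as a set."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Token-set overlap between two sentences, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match_reference_summary(original_sentences, simplified_sentences, threshold=0.2):
    """For each simplified sentence, keep the best-matching original sentence
    if its overlap reaches the (hypothetical) threshold. The kept sentences,
    in original order, form the extractive reference summary."""
    if not original_sentences:
        return []
    selected = set()
    for simple in simplified_sentences:
        best = max(
            range(len(original_sentences)),
            key=lambda i: jaccard(original_sentences[i], simple),
        )
        if jaccard(original_sentences[best], simple) >= threshold:
            selected.add(best)
    return [original_sentences[i] for i in sorted(selected)]
```

The resulting sentence subsets would then serve as training targets for the extractive summarizer described in the abstract.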
Explainable Artificial Intelligence: Precepts, Methods, and Opportunities for Research in Construction
Love, Peter ED, Fang, Weili, Matthews, Jane, Porter, Stuart, Luo, Hanbin, Ding, Lieyun
Explainable artificial intelligence (XAI) has received limited attention in construction despite its growing importance in various other industrial sectors. In this paper, we provide a narrative review of XAI to raise awareness about its potential in construction. Our review develops a taxonomy of the XAI literature comprising its precepts and approaches. Opportunities for future XAI research focusing on stakeholder desiderata and data and information fusion are identified and discussed. We hope the opportunities we suggest stimulate new lines of inquiry to help alleviate the scepticism and hesitancy toward AI adoption and integration in construction.
How Virtual Entertainment Is Merging Cutting-Edge Technology With Legacy Techniques
[Photo: A visitor enjoys a virtual reality experience at the SK telecom ... booth on day 1 of the GSMA Mobile World Congress on February 28, 2022 in Barcelona, Spain.]
A pattern we see in many industries is that technology changes alongside culture and society, and this remains true for entertainment. The most obvious change for consumers has been the shift from terrestrial television to streaming services like Netflix and Amazon Prime. The greater flexibility allowed consumers to decouple themselves from an entertainment schedule dictated by broadcast times. OTT platforms are aligned with the changing patterns of casualized work and study that focus on individual freedom.
Physics-informed machine learning
Despite great progress in simulating multiphysics problems using the numerical discretization of partial differential equations (PDEs), one still cannot seamlessly incorporate noisy data into existing algorithms, mesh generation remains complex, and high-dimensional problems governed by parameterized PDEs cannot be tackled. Moreover, solving inverse problems with hidden physics is often prohibitively expensive and requires different formulations and elaborate computer codes. Machine learning has emerged as a promising alternative, but training deep neural networks requires big data, not always available for scientific problems. Instead, such networks can be trained from additional information obtained by enforcing the physical laws (for example, at random points in the continuous space-time domain). Such physics-informed learning integrates (noisy) data and mathematical models, and implements them through neural networks or other kernel-based regression networks. Moreover, it may be possible to design specialized network architectures that automatically satisfy some of the physical invariants for better accuracy, faster training and improved generalization. Here, we review some of the prevailing trends in embedding physics into machine learning, present some of the current capabilities and limitations and discuss diverse applications of physics-informed learning both for forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems. The rapidly developing field of physics-informed learning integrates data and mathematical models seamlessly, enabling accurate inference of realistic and high-dimensional multiphysics problems. This Review discusses the methodology and provides diverse examples and an outlook for further developments.
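The idea of training on data plus a physics residual evaluated at random points can be made concrete with a toy loss function. The sketch below is an assumption-laden illustration, not a method from the review: the model is a polynomial rather than a neural network, the physical law is the ODE du/dt + u = 0, and the names and the `weight` parameter are hypothetical.

```python
import random


def poly(coeffs, t):
    """Model u(t) = sum_k coeffs[k] * t**k (stands in for a network)."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


def poly_derivative(coeffs, t):
    """Exact derivative du/dt of the polynomial model."""
    return sum(k * c * t ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


def physics_informed_loss(coeffs, data, collocation_points, weight=1.0):
    """Mean squared data misfit plus mean squared residual of du/dt + u = 0
    at the collocation points."""
    data_loss = sum((poly(coeffs, t) - u) ** 2 for t, u in data) / len(data)
    residuals = [poly_derivative(coeffs, t) + poly(coeffs, t) for t in collocation_points]
    physics_loss = sum(r * r for r in residuals) / len(residuals)
    return data_loss + weight * physics_loss


# Random collocation points in the time domain [0, 1]; only one data point
# (the initial condition u(0) = 1) is observed.
rng = random.Random(0)
points = [rng.random() for _ in range(50)]
observations = [(0.0, 1.0)]
```

With these inputs, the truncated Taylor coefficients of exp(-t) score a far lower loss than a curve such as u(t) = 1 + t, even though both fit the single observation exactly; the physics term is what separates them.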
ChemVise: Maximizing Out-of-Distribution Chemical Detection with the Novel Application of Zero-Shot Learning
Moore, Alexander M., Paffenroth, Randy C., Ngo, Ken T., Uzarski, Joshua R.
Accurate chemical sensors are vital in medical, military, and home safety applications. Training machine learning models to be accurate on real-world chemical sensor data requires performing many diverse, costly experiments in controlled laboratory settings to create a data set. In practice, even expensive, large data sets may be insufficient for generalization of a trained model to a real-world testing distribution. Rather than perform greater numbers of experiments requiring exhaustive mixtures of chemical analytes, this research proposes learning approximations of complex exposures from training sets of simple ones by using single-analyte exposure signals as building blocks of a multiple-analyte space. We demonstrate that this approach to synthetic sensor responses surprisingly improves the detection of out-of-distribution obscured chemical analytes. Further, we pair these synthetic signals to targets in an information-dense representation space utilizing a large corpus of chemistry knowledge. By using a semantically meaningful analyte representation space along with synthetic targets, we achieve rapid analyte classification in the presence of obscurants without corresponding obscured-analyte training data. Transfer learning for supervised learning with molecular representations makes assumptions about the input data. Instead, we borrow from the natural language and natural image processing literature for a novel approach to chemical sensor signal classification using molecular semantics for arbitrary chemical sensor hardware designs.
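The zero-shot step, classifying in a representation space rather than over a fixed label set, can be sketched as a nearest-neighbour lookup. This is a generic zero-shot illustration, not the ChemVise pipeline: the model that maps a sensor signal to an embedding is assumed to exist and is omitted, and the analyte names and two-dimensional embeddings are made-up placeholders.

```python
import math


def nearest_analyte(predicted_embedding, analyte_embeddings):
    """Return the analyte whose embedding is closest (Euclidean distance)
    to the embedding predicted from a sensor signal."""
    return min(
        analyte_embeddings,
        key=lambda name: math.dist(predicted_embedding, analyte_embeddings[name]),
    )


# Hypothetical embedding table. An analyte can be listed here even if no
# sensor data for it was seen in training, which is what makes the lookup
# zero-shot.
analyte_embeddings = {
    "ethanol": [1.0, 0.0],
    "acetone": [0.0, 1.0],
    "ammonia": [-1.0, 0.0],
}
```

For example, a predicted embedding of `[-0.8, 0.1]` resolves to `"ammonia"` under this table.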
Robotics in Elderly Healthcare: A Review of 20 Recent Research Projects
Khaksar, Weria, Saplacan, Diana, Bygrave, Lee Andrew, Torresen, Jim
Studies show a dramatic increase in the elderly population of Western Europe over the next few decades, which will put pressure on healthcare systems. Measures must be taken to meet these social challenges. Healthcare robots are being investigated as a way to facilitate independent living for the elderly. This paper reviews recent projects in robotics for healthcare from 2008 to 2021. We provide an overview of the focus in this area and a roadmap for upcoming research. Our study was initiated with a literature search using three digital databases. Searches were performed for articles, including research projects, containing the words "elderly care", "assisted aging", "health monitoring", or "elderly health", and any word including the root word "robot". The resulting 20 recent research projects are described and categorized in this paper, and then analyzed using thematic analysis. Our findings can be summarized in common themes: most projects have a strong focus on care robots' functionalities; robots are often seen as products in care settings; there is an emphasis on robots as commercial products; and there is some limited focus on the design and ethical aspects of care robots. The paper concludes with five key points representing a roadmap for future research on robotics for elderly people.
A General Mobile Manipulator Automation Framework for Flexible Manufacturing in Hostile Industrial Environments
Pu, Can, Yang, Chuanyu, Pu, Jinnian, Fisher, Robert B.
Enabling a mobile manipulator to perform human tasks from a single teaching demonstration is vital to flexible manufacturing. We call our proposed method MMPA (Mobile Manipulator Process Automation with One-shot Teaching). Currently, there is no effective and robust MMPA framework that is unaffected by harsh industrial environments and the mobile base's parking precision. The proposed MMPA framework consists of two stages: a teaching stage, which collects data (the mobile base's location, environment information, and the end-effector's path) for robot learning; and an automation stage, in which the end-effector repeats nearly the same path as the reference path in the world frame to reproduce the work. More specifically, in the automation stage, the robot navigates to the specified location without needing to park precisely. Then, based on colored point cloud registration, the proposed IPE (Iterative Pose Estimation by Eye & Hand) algorithm estimates the accurate 6D relative parking pose of the robot arm base without any marker. Finally, the robot learns the error compensation from the parking pose's bias and modifies the end-effector's path so that it repeats nearly the same path in the world coordinate system as recorded in the teaching stage. Hundreds of trials have been conducted with a real mobile manipulator to show the superior robustness of the system and the accuracy of the process automation regardless of harsh industrial conditions and parking precision. For the released code, please contact marketing@amigaga.com.
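The path compensation step can be illustrated with a planar toy version. This is a sketch under simplifying assumptions, not the MMPA or IPE implementation: poses are reduced from 6D to 2D (x, y, heading), the current base pose is taken as already estimated, and all function names are hypothetical.

```python
import math


def to_world(base_pose, point):
    """Map a point from the arm-base frame to the world frame.
    base_pose is (x, y, theta) of the base in the world frame."""
    x, y, theta = base_pose
    px, py = point
    return (
        x + math.cos(theta) * px - math.sin(theta) * py,
        y + math.sin(theta) * px + math.cos(theta) * py,
    )


def to_base(base_pose, world_point):
    """Inverse of to_world: map a world point into the arm-base frame."""
    x, y, theta = base_pose
    dx, dy = world_point[0] - x, world_point[1] - y
    return (
        math.cos(theta) * dx + math.sin(theta) * dy,
        -math.sin(theta) * dx + math.cos(theta) * dy,
    )


def compensate_path(taught_base_pose, current_base_pose, taught_path):
    """Re-express a path taught in the old base frame in the current base
    frame, so that the end-effector traces the same path in the world."""
    return [
        to_base(current_base_pose, to_world(taught_base_pose, p))
        for p in taught_path
    ]
```

If the base parks 0.1 m further along x than during teaching, with the same heading, every path point shifts by 0.1 m in the opposite direction in the base frame, which leaves the world-frame path unchanged.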