Document-aware Positional Encoding and Linguistic-guided Encoding for Abstractive Multi-document Summarization

arXiv.org Artificial Intelligence

One key challenge in multi-document summarization is capturing the relations among input documents, which is what distinguishes multi-document summarization (MDS) from single document summarization (SDS). Few existing MDS works address this issue. One effective way is to encode document positional information to assist models in capturing cross-document relations. However, existing MDS models, such as Transformer-based models, only consider token-level positional information. Moreover, these models fail to capture sentences' linguistic structure, which inevitably causes confusion in the generated summaries. Therefore, in this paper, we propose document-aware positional encoding and linguistic-guided encoding that can be fused with the Transformer architecture for MDS. For document-aware positional encoding, we introduce a general protocol to guide the selection of document encoding functions. For linguistic-guided encoding, we propose to embed syntactic dependency relations into the dependency relation mask with a simple but effective non-linear encoding learner for feature learning. Extensive experiments show the proposed model can generate summaries with high quality.
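
The idea of adding a document-level position signal on top of token-level positions can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the encoding functions or how they are fused, so the sinusoidal document encoding, the additive combination, and the names `sinusoid`, `document_aware_encoding` and `doc_scale` are all hypothetical.

```python
import math

def sinusoid(position, dim):
    """Standard Transformer-style sinusoidal encoding of one integer position."""
    vec = []
    for i in range(dim):
        angle = position / (10000 ** (2 * (i // 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

def document_aware_encoding(doc_lengths, dim, doc_scale=1.0):
    """Return one encoding vector per token of the concatenated documents.

    Assumption (not from the paper): the token-level and document-level
    encodings are both sinusoidal and are simply added together.
    """
    encodings = []
    token_pos = 0
    for doc_index, length in enumerate(doc_lengths):
        doc_vec = sinusoid(doc_index, dim)  # shared by every token of this document
        for _ in range(length):
            tok_vec = sinusoid(token_pos, dim)
            encodings.append([t + doc_scale * d for t, d in zip(tok_vec, doc_vec)])
            token_pos += 1
    return encodings

# Two toy documents with 3 and 2 tokens, encoded in 8 dimensions.
enc = document_aware_encoding([3, 2], dim=8)
```

Here every token in the same document shares the same document component, so a model could in principle tell which source document a token came from.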


Remote Build Engineer openings near you - Updated September 11, 2022 - Remote Tech Jobs

#artificialintelligence

Exygy seeks an enthusiastic, experienced, and creative Full Stack Engineer who is passionate about making a difference in the world with technology. Join our tight-knit and growing team to build a wide variety of high-impact projects with our civic and health sector clients. This is a full-time remote position. As a senior engineer, you'll spend most of your time on multiple client project teams, building new web applications, while supporting and expanding existing projects. You'll work with our cross-functional core team, and our network of remote contributors to define, design, and deliver high-quality web software.


Cloud labs and remote research aren't the future of science – they're here

The Guardian

It's 1am on the west coast of America, but the Emerald Cloud Lab, just south of San Francisco, is still busy. I'm "visiting" via the camera on a chest-high telepresence robot, being driven round the 1,400 sq metre (15,000 sq ft) lab by Emerald's CEO, Brian Frezza, who is also sitting at home. There are no actual scientists anywhere, just a few staff in blue coats quietly following instructions from screens on their trolleys, ensuring the instruments are loaded with reagents and samples. Cloud labs mean anybody, anywhere can conduct experiments by remote control, using nothing more than their web browser. Experiments are programmed through a subscription-based online interface – software then coordinates robots and automated scientific instruments to perform the experiment and process the data.


Knowledge-based Deep Learning for Modeling Chaotic Systems

arXiv.org Machine Learning

Deep Learning has received increased attention due to its unbeatable success in many fields, such as computer vision, natural language processing, recommendation systems, and most recently in simulating multiphysics problems and predicting nonlinear dynamical systems. However, modeling and forecasting the dynamics of chaotic systems remains an open research problem since training deep learning models requires big data, which is not always available in many cases. Such deep learners can be trained from additional information obtained from simulated results and by enforcing the physical laws of the chaotic systems. This paper considers extreme events and their dynamics and proposes elegant models based on deep neural networks, called knowledge-based deep learning (KDL). Our proposed KDL can learn the complex patterns governing chaotic systems by jointly training on real and simulated data directly from the dynamics and their differential equations. This knowledge is transferred to model and forecast real-world chaotic events exhibiting extreme behavior. We validate the efficiency of our model by assessing it on three real-world benchmark datasets: El Niño sea surface temperature, San Juan Dengue viral infection, and Bjørnøya daily precipitation, all governed by extreme events' dynamics. Using prior knowledge of extreme events and physics-based loss functions to lead the neural network learning, we ensure physically consistent, generalizable, and accurate forecasting, even in a small data regime.
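
A physics-based loss of the general kind mentioned above can be sketched as a data misfit plus a penalty on how badly the predictions violate a differential equation. This is a minimal sketch, not the paper's KDL loss: the toy dynamics dx/dt = -2x, the finite-difference residual, and the names `physics_informed_loss`, `rhs` and `weight` are assumptions made for illustration.

```python
import math

def physics_informed_loss(times, predicted, observed, rhs, weight=1.0):
    """Mean squared data misfit plus mean squared residual of dx/dt = rhs(x)."""
    data_term = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    residuals = []
    for k in range(1, len(times) - 1):
        # Central finite difference approximates dx/dt at interior points.
        dxdt = (predicted[k + 1] - predicted[k - 1]) / (times[k + 1] - times[k - 1])
        residuals.append((dxdt - rhs(predicted[k])) ** 2)
    physics_term = sum(residuals) / len(residuals)
    return data_term + weight * physics_term

def decay(x):
    """Hypothetical toy dynamics dx/dt = -2x, standing in for a real system."""
    return -2.0 * x

times = [0.01 * k for k in range(101)]
consistent = [math.exp(-2.0 * t) for t in times]   # exact solution of the toy ODE
inconsistent = [1.0 - t for t in times]            # violates the toy ODE

loss_good = physics_informed_loss(times, consistent, consistent, decay)
loss_bad = physics_informed_loss(times, inconsistent, consistent, decay)
```

In this toy setting the prediction that satisfies the assumed dynamics receives a much smaller loss than the one that does not, which is the property a physics-based penalty is meant to provide when observations are scarce.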


FathomNet: A global image database for enabling artificial intelligence in the ocean

arXiv.org Artificial Intelligence

The ocean is experiencing unprecedented rapid change, and visually monitoring marine biota at the spatiotemporal scales needed for responsible stewardship is a formidable task. As baselines are sought by the research community, the volume and rate of this required data collection rapidly outpace our abilities to process and analyze them. Recent advances in machine learning enable fast, sophisticated analysis of visual data, but have had limited success in the ocean due to lack of data standardization, insufficient formatting, and demand for large, labeled datasets. To address this need, we built FathomNet, an open-source image database that standardizes and aggregates expertly curated labeled data. FathomNet has been seeded with existing iconic and non-iconic imagery of marine animals, underwater equipment, debris, and other concepts, and allows for future contributions from distributed data sources. We demonstrate how FathomNet data can be used to train and deploy models on other institutional video to reduce annotation effort, and enable automated tracking of underwater concepts when integrated with robotic vehicles. As FathomNet continues to grow and incorporate more labeled data from the community, we can accelerate the processing of visual data to achieve a healthy and sustainable global ocean.


Large Graph Signal Denoising with Application to Differential Privacy

arXiv.org Machine Learning

Over the last decade, signal processing on graphs has become a very active area of research. Specifically, the number of applications, for instance in statistical or deep learning, using frames built from graphs, such as wavelets on graphs, has increased significantly. We consider in particular the case of signal denoising on graphs via a data-driven wavelet tight frame methodology. This adaptive approach is based on a threshold calibrated using Stein's unbiased risk estimate (SURE) adapted to a tight-frame representation. We make it scalable to large graphs using Chebyshev-Jackson polynomial approximations, which allow fast computation of the wavelet coefficients, without the need to compute the Laplacian eigendecomposition. However, the overcomplete nature of the tight frame transforms white noise into correlated noise. As a result, the covariance of the transformed noise appears in the divergence term of the SURE, thus requiring the computation and storage of the frame, which leads to an impractical calculation for large graphs. To estimate such covariance, we develop and analyze a Monte-Carlo strategy, based on the fast transformation of zero mean and unit variance random variables. This new data-driven denoising methodology finds a natural application in differential privacy. A comprehensive performance analysis is carried out on graphs of varying size, from real and simulated data.
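
The Monte-Carlo idea described above, pushing zero-mean, unit-variance random vectors through the transform instead of storing the frame, can be sketched as below. This is a toy illustration under assumptions: it estimates only the per-coefficient variances (the diagonal of the covariance), and the small random matrix `frame` stands in for the transform; it is neither a tight frame nor the Chebyshev-Jackson approximation used in the paper. The function names are hypothetical.

```python
import random

def apply_frame(frame, signal):
    """Analysis operator: one coefficient per frame vector (row of `frame`)."""
    return [sum(w * x for w, x in zip(row, signal)) for row in frame]

def monte_carlo_coeff_variance(transform, n, num_samples, rng):
    """Estimate the variance of each transformed coefficient for unit white noise.

    Only `transform` is called, so the frame itself never has to be stored.
    """
    sums = None
    for _ in range(num_samples):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        coeffs = transform(noise)
        if sums is None:
            sums = [0.0] * len(coeffs)
        for i, c in enumerate(coeffs):
            sums[i] += c * c  # coefficients have zero mean, so E[c^2] is the variance
    return [s / num_samples for s in sums]

rng = random.Random(0)
n, m = 4, 8  # signal length and (larger) number of frame vectors: overcomplete
frame = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# For unit white noise, the exact variance of coefficient i is the squared norm of row i.
exact = [sum(w * w for w in row) for row in frame]
estimate = monte_carlo_coeff_variance(lambda x: apply_frame(frame, x), n, 20000, rng)
```

With enough samples the estimates approach the exact squared row norms, and the cost depends only on how fast the transform can be applied.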


HAGCN : Network Decentralization Attention Based Heterogeneity-Aware Spatiotemporal Graph Convolution Network for Traffic Signal Forecasting

arXiv.org Artificial Intelligence

The construction of spatiotemporal networks using graph convolution networks (GCNs) has become one of the most popular methods for predicting traffic signals. When using a GCN for traffic speed prediction, the conventional approach generally assumes the relationship between the sensors as a homogeneous graph and learns an adjacency matrix using the data accumulated by the sensors. However, the spatial correlation between sensors is not specified as one but defined differently from various viewpoints. To this end, we aim to study the heterogeneous characteristics inherent in traffic signal data to learn the hidden relationships between sensors in various ways. Specifically, we designed a method to construct a heterogeneous graph for each module by dividing the spatial relationship between sensors into static and dynamic modules. We propose a network decentralization attention based heterogeneity-aware graph convolution network (HAGCN) method that aggregates the hidden states of adjacent nodes by considering the importance of each channel in a heterogeneous graph. Experimental results on real traffic datasets verified the effectiveness of the proposed method, achieving a 6.35% improvement over the existing model and realizing state-of-the-art prediction performance.


Evaluating Short-Term Forecasting of Multiple Time Series in IoT Environments

arXiv.org Artificial Intelligence

Modern Internet of Things (IoT) environments are monitored via a large number of IoT enabled sensing devices, with the data acquisition and processing infrastructure setting restrictions in terms of computational power and energy resources. To alleviate this issue, sensors are often configured to operate at relatively low sampling frequencies, yielding a reduced set of observations. Nevertheless, this can dramatically hamper subsequent decision-making, such as forecasting. To address this problem, in this work we evaluate short-term forecasting in highly underdetermined cases, i.e., the number of sensor streams is much higher than the number of observations. Several statistical, machine learning and neural network-based models are thoroughly examined with respect to the resulting forecasting accuracy on five different real-world datasets. The focus is on a unified experimental protocol especially designed for short-term prediction of multiple time series at the IoT edge. The proposed framework can be considered as an important step towards establishing a solid forecasting strategy in resource constrained IoT applications.


Ubisoft confirms 'Assassin's Creed Mirage,' a stand-alone title in the Middle East

Engadget

After plenty of leaks, Ubisoft has confirmed that Assassin's Creed Mirage is the next entry in its long-running series. More details are expected to drop during the Ubisoft Forward event September 10th, but for now we can glean some tidbits from the announcement image. It shows Basim Ibn Ishaq, a character from the recent Assassin's Creed Valhalla, leaping with his hidden blade in front of the Palace of the Golden Gate in Baghdad (via Polygon). That lines up with previous leaks around the game's setting, which also indicated that Mirage would be a return to stealth gameplay for the series. The new title was originally intended to be DLC for Valhalla, but Bloomberg reports that it was later transformed into a standalone experience to fill out Ubisoft's release schedule. No matter its conception, it's nice to see the series return to its Middle Eastern roots.