AITopics

2110.10007

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Greece > Central Macedonia > Thessaloniki (0.04)
South America > Brazil > São Paulo (0.04)
(19 more...)

Genre:

Research Report (1.00)
Workflow (0.93)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial IntelligenceOct-19-2021

HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization

Liu, Ye, Zhang, Jian-Guo, Wan, Yao, Xia, Congying, He, Lifang, Yu, Philip S.

To capture the semantic graph structure from raw text, most existing summarization approaches are built on GNNs with a pre-trained model. However, these methods suffer from cumbersome procedures and inefficient computations for long-text documents. To mitigate these issues, this paper proposes HETFORMER, a Transformer-based pre-trained model with multi-granularity sparse attentions for long-text extractive summarization. Specifically, we model different types of semantic nodes in raw text as a potential heterogeneous graph and directly learn heterogeneous relationships (edges) among nodes by Transformer. Extensive experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in Rouge F1 while using less memory and fewer parameters.

node, proceedings, summarization, (14 more...)

2110.06388

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Pennsylvania > Northampton County > Bethlehem (0.04)
Asia > China (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Edelman, Benjamin L., Goel, Surbhi, Kakade, Sham, Zhang, Cyril

Inductive Biases and Variable Creation in Self-Attention Mechanisms

arXiv.org Machine LearningOct-19-2021

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond. This work provides a theoretical analysis of the inductive biases of self-attention modules, where our focus is to rigorously establish which functions and long-range dependencies self-attention blocks prefer to represent. Our main result shows that bounded-norm Transformer layers create sparse variables: they can represent sparse functions of the input sequence, with sample complexity scaling only logarithmically with the context length. Furthermore, we propose new experimental protocols to support this analysis and to guide the practice of training Transformers, built around the large body of work on provably learning sparse Boolean functions.

arxiv preprint arxiv, softmax, transformer, (14 more...)

arXiv.org Machine Learning

2110.1009

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligenceOct-18-2021, 03:25:13 GMT

Defying the odds!

The phrase "overcoming the odds" is an understatement for 24-year-old Joshua Burgess. Though born with congenital rubella syndrome, which has caused him to suffer from a number of health challenges over the years, he continues to break barriers. On September 28, Burgess participated in the prestigious UNESCO Information for All Programme's (IFAP) Second Artificial Intelligence for Information Accessibility (AI4IA) Conference, where he spoke about'Openness and Inclusivity for the Disabled Community in a New Era'. "My presentation reflected my views as a young, blind Jamaican also living with chronic hearing loss. It was important for me to note that, while I have benefited from artificial intelligence's (AI) ability to help me integrate into society, it is also important for us to recognise that it is not a one-size-fits-all. We must collaborate with key stakeholders to ensure openness, inclusivity, fairness, and accessibility for everyone," said Burgess.

burgess, unesco information, university, (10 more...)

#artificialintelligence

Country:

North America > Canada > Alberta (0.16)
North America > Jamaica (0.07)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.05)
(4 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

Natural Language Processing for Smart Healthcare

Zhou, Binggui, Yang, Guanghua, Shi, Zheng, Ma, Shaodan

Smart healthcare has achieved significant progress in recent years. Emerging artificial intelligence (AI) technologies enable various smart applications across various healthcare scenarios. As an essential technology powered by AI, natural language processing (NLP) plays a key role in smart healthcare due to its capability of analysing and understanding human language. In this work we review existing studies that concern NLP for smart healthcare from the perspectives of technique and application. We focus on feature extraction and modelling for various NLP tasks encountered in smart healthcare from a technical point of view. In the context of smart healthcare applications employing NLP techniques, the elaboration largely attends to representative smart healthcare scenarios, including clinical practice, hospital management, personal care, public health, and drug development. We further discuss the limitations of current works and identify the directions for future works.

application, healthcare, smart healthcare, (14 more...)

2110.15803

Country:

Asia > Macao (0.14)
Asia > China > Guangdong Province > Zhuhai (0.04)
North America > United States > New York > New York County > New York City (0.04)
(18 more...)

Genre:

Overview (0.87)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(4 more...)

Corry, Frances, Sridharan, Hamsini, Luccioni, Alexandra Sasha, Ananny, Mike, Schultz, Jason, Crawford, Kate

The Problem of Zombie Datasets:A Framework For Deprecating Datasets

What happens when a machine learning dataset is deprecated for legal, ethical, or technical reasons, but continues to be widely used? In this paper, we examine the public afterlives of several prominent deprecated or redacted datasets, including ImageNet, 80 Million Tiny Images, MS-Celeb-1M, Duke MTMC, Brainwash, and HRT Transgender, in order to inform a framework for more consistent, ethical, and accountable dataset deprecation. Building on prior research, we find that there is a lack of consistency, transparency, and centralized sourcing of information on the deprecation of datasets, and as such, these datasets and their derivatives continue to be cited in papers and circulate online. These datasets that never die -- which we term "zombie datasets" -- continue to inform the design of production-level systems, causing technical, legal, and ethical challenges; in so doing, they risk perpetuating the harms that prompted their supposed withdrawal, including concerns around bias, discrimination, and privacy. Based on this analysis, we propose a Dataset Deprecation Framework that includes considerations of risk, mitigation of impact, appeal mechanisms, timeline, post-deprecation protocol, and publication checks that can be adapted and implemented by the machine learning community. Drawing on work on datasheets and checklists, we further offer two sample dataset deprecation sheets and propose a centralized repository that tracks which datasets have been deprecated and could be incorporated into the publication protocols of venues like NeurIPS.

dataset, deprecation, repository, (13 more...)

2111.04424

Country:

Oceania > Australia (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > France (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law > Litigation (0.93)
Government (0.93)
Law > Civil Rights & Constitutional Law (0.69)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning

Chen, Yi-Chen, Yang, Shu-wen, Lee, Cheng-Kuang, See, Simon, Lee, Hung-yi

Speech representation learning plays a vital role in speech processing. Among them, self-supervised learning (SSL) has become an important research direction. It has been shown that an SSL pretraining model can achieve excellent performance in various downstream tasks of speech processing. On the other hand, supervised multi-task learning (MTL) is another representation learning paradigm, which has been proven effective in computer vision (CV) and natural language processing (NLP). However, there is no systematic research on the general representation learning model trained by supervised MTL in speech processing. In this paper, we show that MTL finetuning can further improve SSL pretraining. We analyze the generalizability of supervised MTL finetuning to examine if the speech representation learned by MTL finetuning can generalize to unseen new tasks.

mtl, representation, scenario, (12 more...)

2110.0993

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.40)

Industry: Education (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)

Ensemble ALBERT on SQuAD 2.0

Li, Shilun, Li, Renee, Peng, Veronica

Machine question answering is an essential yet challenging task in natural language processing. Recently, Pre-trained Contextual Embeddings (PCE) models like Bidirectional Encoder Representations from Transformers (BERT) and A Lite BERT (ALBERT) have attracted lots of attention due to their great performance in a wide range of NLP tasks. In our Paper, we utilized the fine-tuned ALBERT models and implemented combinations of additional layers (e.g. attention layer, RNN layer) on top of them to improve model performance on Stanford Question Answering Dataset (SQuAD 2.0). We implemented four different models with different layers on top of ALBERT-base model, and two other models based on ALBERT-xlarge and ALBERT-xxlarge. We compared their performance to our baseline model ALBERT-base-v2 + ALBERT-SQuAD-out with details. Our best-performing individual model is ALBERT-xxlarge + ALBERT-SQuAD-out, which achieved an F1 score of 88.435 on the dev set. Furthermore, we have implemented three different ensemble algorithms to boost overall performance. By passing in several best-performing models' results into our weighted voting ensemble algorithm, our final result ranks first on the Stanford CS224N Test PCE SQuAD Leaderboard with F1 = 90.123.

albert-squad-out, albert-xxlarge albert-squad-out, paragraph, (17 more...)

2110.09665

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Africa > Middle East > Egypt (0.05)
South America > Colombia (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

León, Borja G., Shanahan, Murray, Belardinelli, Francesco

In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications

We address the problem of building agents whose goal is to satisfy out-of distribution (OOD) multi-task instructions expressed in temporal logic (TL) by using deep reinforcement learning (DRL). Recent works provided evidence that the deep learning architecture is a key feature when teaching a DRL agent to solve OOD tasks in TL. Yet, the studies on their performance are still limited. In this work, we analyse various state-of-the-art (SOTA) architectures that include generalisation mechanisms such as relational layers, the soft-attention mechanism, or hierarchical configurations, when generalising safety-aware tasks expressed in TL. Most importantly, we present a novel deep learning architecture that induces agents to generate latent representations of their current goal given both the human instruction and the current observation from the environment. We find that applying our proposed configuration to SOTA architectures yields significantly stronger performance when executing new tasks in OOD environments.

agent, architecture, instruction, (16 more...)

2110.09461

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient

Yang, Shanchao, Ma, Kaili, Wang, Baoxiang, Zha, Hongyuan

Improving the resilience of a network protects the system from natural disasters and malicious attacks. This is typically achieved by introducing new edges, which however may reach beyond the maximum number of connections a node could sustain. Many studies then resort to the degree-preserving operation of rewiring, which swaps existing edges $AC, BD$ to new edges $AB, CD$. A significant line of studies focuses on this technique for theoretical and practical results while leaving three limitations: network utility loss, local optimality, and transductivity. In this paper, we propose ResiNet, a reinforcement learning (RL)-based framework to discover resilient network topologies against various disasters and attacks. ResiNet is objective agnostic which allows the utility to be balanced by incorporating it into the objective function. The local optimality, typically seen in greedy algorithms, is addressed by casting the cumulative resilience gain into a sequential decision process of step-wise rewiring. The transductivity, which refers to the necessity to run a computationally intensive optimization for each input graph, is lifted by our variant of RL with auto-regressive permutation-invariant variable action space. ResiNet is armed by our technical innovation, Filtration enhanced GNN (FireGNN), which distinguishes graphs with minor differences. It is thus possible for ResiNet to capture local structure changes and adapt its decision among consecutive graphs, which is known to be infeasible for GNN. Extensive experiments demonstrate that with a small number of rewiring operations, ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing the utility, with a large margin compared to existing approaches.

graph, node, resinet, (17 more...)

2110.09035

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Hong Kong (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(4 more...)