AITopics

Public conversations on Twitter comprise many pertinent topics including disasters, protests, politics, propaganda, sports, climate change, epidemics/pandemic outbreaks, etc., that can have both regional and global aspects. Spatial discourse analysis relies on geographical data. However, today less than 1% of tweets are geotagged, whether with a point location or with bounding place information. A major issue with tweets is that Twitter users can be at location A and exchange conversations specific to location B, which we call the Location A/B problem. The problem is considered solved if location entities can be classified as either origin locations (Location As) or non-origin locations (Location Bs). In this work, we propose a simple yet effective framework, the True Origin Model, to address the problem; it uses machine-level natural language understanding to identify tweets that conceivably contain their origin location information. The model achieves promising accuracy at country (80%), state (67%), city (58%), county (56%) and district (64%) levels with support from a Location Extraction Model as basic as the CoNLL-2003-based RoBERTa. We employ a tweet contextualizer (locBERT), which is one of the core components of the proposed model, to investigate multiple tweets' distributions for understanding Twitter users' tweeting behavior in terms of mentioning origin and non-origin locations. We also highlight a major concern with the currently regarded gold standard test set (ground truth) methodology, introduce a new data set, and identify further research avenues for advancing the area.

artificial intelligence, natural language, tweet, (18 more...)

doi: 10.1109/BigData55660.2022.10020460

2211.16506

Country:

Oceania > Australia > Victoria > Melbourne (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
(15 more...)

Genre:

Research Report (1.00)
Personal > Interview (0.40)

Industry:

Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.96)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors

Wu, Wen, Zhang, Chao, Wu, Xixin, Woodland, Philip C.

Emotion recognition is a key attribute for artificial intelligence systems that need to naturally interact with humans. However, the task definition is still an open problem due to the inherent ambiguity of emotions. In this paper, a novel Bayesian training loss based on per-utterance Dirichlet prior distributions is proposed for verbal emotion recognition, which models the uncertainty in one-hot labels created when human annotators assign the same utterance to different emotion classes. An additional metric is used to evaluate the performance by detecting test utterances with high labelling uncertainty. This removes a major limitation that emotion classification systems only consider utterances with labels where the majority of annotators agree on the emotion class. Furthermore, a frequentist approach is studied to leverage the continuous-valued "soft" labels obtained by averaging the one-hot labels. We propose a two-branch model structure for emotion classification on a per-utterance basis, which achieves state-of-the-art classification results on the widely used IEMOCAP dataset. Based on this, uncertainty estimation experiments were performed. The best performance in terms of the area under the precision-recall curve when detecting utterances with high uncertainty was achieved by interpolating the Bayesian training loss with the Kullback-Leibler divergence training loss for the soft labels. The generality of the proposed approach was verified using the MSP-Podcast dataset, which yielded the same pattern of results.
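
The soft-label idea in this abstract can be illustrated with a minimal sketch. It only shows how one-hot annotator votes average into a soft label and how a Kullback-Leibler divergence against a model prediction could be computed; it does not reproduce the paper's Bayesian loss or its two-branch model. The function names, the direction KL(soft label || prediction), and the use of entropy as a rough proxy for labelling uncertainty are all assumptions made for illustration.

```python
import math


def soft_label(annotations, num_classes):
    """Average one-hot annotator votes into a probability vector.

    `annotations` is a list of class indices, one per annotator (hypothetical format).
    """
    counts = [0] * num_classes
    for cls in annotations:
        counts[cls] += 1
    total = len(annotations)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def label_entropy(p):
    """Shannon entropy of a soft label; higher values mean more annotator disagreement."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


# Three annotators: two choose class 0, one chooses class 1, out of four classes.
target = soft_label([0, 0, 1], num_classes=4)
prediction = [0.7, 0.2, 0.05, 0.05]  # made-up model output for illustration
loss = kl_divergence(target, prediction)
uncertainty = label_entropy(target)
```

An utterance on which all annotators agree yields a one-hot soft label with zero entropy, which is the case that majority-vote-only systems are restricted to.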

artificial intelligence, machine learning, utterance, (19 more...)

doi: 10.1109/TAFFC.2022.3221801

2203.04443

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Asia > China > Hong Kong (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(28 more...)

Genre:

Personal (0.93)
Research Report > New Finding (0.67)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Chia, Yew Ken, Bing, Lidong, Aljunied, Sharifah Mahani, Si, Luo, Poria, Soujanya

A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

Relation extraction has the potential for large-scale knowledge graph construction, but current methods do not consider the qualifier attributes for each relation triplet, such as time, quantity or location. The qualifiers form hyper-relational facts which better capture the rich and complex knowledge graph structure. For example, the relation triplet (Leonard Parker, Educated At, Harvard University) can be factually enriched by including the qualifier (End Time, 1967). Hence, we propose the task of hyper-relational extraction to extract more specific and complete facts from text. To support the task, we construct HyperRED, a large-scale and general-purpose dataset. Existing models cannot perform hyper-relational extraction as it requires a model to consider the interaction between three entities. Hence, we propose CubeRE, a cube-filling model that is inspired by table-filling approaches and explicitly considers the interaction between relation triplets and qualifiers. To improve model scalability and reduce negative class imbalance, we further propose a cube-pruning method. Our experiments show that CubeRE outperforms strong baselines and reveal possible directions for future research. Our code and data are available at github.com/declare-lab/HyperRED.
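
As a rough illustration of what a hyper-relational fact looks like as data, the sketch below models the example from the abstract. The class and field names (`Qualifier`, `HyperRelationalFact`, `head`, `tail`) are hypothetical and are not taken from the HyperRED data format or the CubeRE code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Qualifier:
    """A (label, value) attribute attached to a relation triplet."""
    label: str
    value: str


@dataclass(frozen=True)
class HyperRelationalFact:
    """A relation triplet plus zero or more qualifiers (hypothetical schema)."""
    head: str
    relation: str
    tail: str
    qualifiers: Tuple[Qualifier, ...] = ()

    def triplet(self):
        """Return the plain triplet that ordinary relation extraction would produce."""
        return (self.head, self.relation, self.tail)


# The example from the abstract: a triplet enriched with an End Time qualifier.
fact = HyperRelationalFact(
    head="Leonard Parker",
    relation="Educated At",
    tail="Harvard University",
    qualifiers=(Qualifier(label="End Time", value="1967"),),
)
```

The qualifier value is the third entity that has to be linked to the head and tail, which is why the abstract describes the task as involving an interaction between three entities.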

data mining, machine learning, natural language, (18 more...)

2211.10018

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Ohio (0.05)
North America > United States > New York > New York County > New York City (0.04)
(17 more...)

Genre:

Research Report (0.63)
Personal (0.46)

Industry:

Leisure & Entertainment (1.00)
Media (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Perez, Fábio, Ribeiro, Ian

Ignore Previous Prompt: Attack Techniques For Language Models

Transformer-based large language models (LLMs) provide a powerful foundation for natural language tasks in large-scale customer-facing applications. However, studies that explore their vulnerabilities emerging from malicious user interaction are scarce. By proposing PromptInject, a prosaic alignment framework for mask-based iterative adversarial prompt composition, we examine how GPT-3, the most widely deployed language model in production, can be easily misaligned by simple handcrafted inputs. In particular, we investigate two types of attacks -- goal hijacking and prompt leaking -- and demonstrate that even low-aptitude, but sufficiently ill-intentioned agents, can easily exploit GPT-3's stochastic nature, creating long-tail risks. The code for PromptInject is available at https://github.com/agencyenterprise/PromptInject.

large language model, machine learning, natural language, (14 more...)

2211.09527

Country:

Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report (1.00)
Personal > Interview (0.46)

Industry:

Government (0.93)
Information Technology > Security & Privacy (0.93)
Law Enforcement & Public Safety (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence, Nov-15-2022, 20:23:51 GMT

Could AI Ever Pass the Van Gogh Test?

That is, the Van Gogh Test for sheer creativity. This past Thursday night, Discovery Institute's tech summit COSM 2022 presented a live, in-person interview with Federico Faggin, the Italian physicist and computer engineer who co-won the prestigious Kyoto Prize in 1997 for helping develop the Intel 4004 chip. Faggin was interviewed by technology reporter Maria Teresa Cometto, who asked him to regale the audience with tales about helping to design early microchips. Eventually Faggin recounted a time when he was "studying neuroscience and biology, trying to understand how the brain works," and came upon a startling realization:

And at one point I asked myself, "But wait a second, I mean these books, all this talk about electrical signals, biochemical signals, but when I taste some chocolate, I mean I have a taste. A computer, does it taste this? Does it have a sensation or a feeling for the signals that he has in his memory or in his CPU? So where are sensations and feelings coming from?" … And so I discovered what was later called the hard problem of consciousness.

computer, faggin, van gogh test, (11 more...)

Country: Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.25)

Genre: Personal (0.77)

Technology:

Information Technology > Artificial Intelligence > Issues (0.37)
Information Technology > Artificial Intelligence > Cognitive Science (0.36)

arXiv.org Artificial Intelligence, Nov-15-2022

Large Language Models and the Reverse Turing Test

Sejnowski, Terrence

Large Language Models (LLMs) have been transformative. They are pre-trained foundational models that are self-supervised and can be adapted with fine tuning to a wide range of natural language tasks, each of which previously would have required a separate network model. This is one step closer to the extraordinary versatility of human language. GPT-3 and more recently LaMDA can carry on dialogs with humans on many topics after minimal priming with a few examples. However, there has been a wide range of reactions and debate on whether these LLMs understand what they are saying or exhibit signs of intelligence. This high variance is exhibited in three interviews with LLMs reaching wildly different conclusions. A new possibility was uncovered that could explain this divergence. What appears to be intelligence in LLMs may in fact be a mirror that reflects the intelligence of the interviewer, a remarkable twist that could be considered a Reverse Turing Test. If so, then by studying interviews we may be learning more about the intelligence and beliefs of the interviewer than the intelligence of the LLMs. As LLMs become more capable they may transform the way we interact with machines and how they interact with each other. Increasingly, LLMs are being coupled with sensorimotor devices. LLMs can talk the talk, but can they walk the walk? A road map for achieving artificial general autonomy is outlined with seven major improvements inspired by brain systems. LLMs could be used to uncover new insights into brain function by downloading brain data during natural behaviors.

large language model, machine learning, natural language, (21 more...)

doi: 10.1162/neco_a_01563

2207.14382

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.05)
Africa > Middle East > Egypt (0.05)
Atlantic Ocean > North Atlantic Ocean > English Channel (0.04)
(7 more...)

Genre:

Personal > Interview (0.46)
Personal > Honors (0.46)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Education > Educational Setting (1.00)
Media (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-15-2022

Generative Long-form Question Answering: Relevance, Faithfulness and Succinctness

Su, Dan

In this thesis, we investigated the relevance, faithfulness, and succinctness aspects of Long Form Question Answering (LFQA). LFQA aims to generate an in-depth, paragraph-length answer for a given question, to help bridge the gap between real scenarios and the existing open-domain QA models, which can only extract short-span answers. LFQA is quite challenging and under-explored. Little work has been done to build an effective LFQA system. It is even more challenging to generate a good-quality long-form answer that is relevant to the query and faithful to facts, since the retrieved documents contain a considerable amount of redundant, complementary, or contradictory information. Moreover, no prior work has investigated generating succinct answers. We are among the first to research the LFQA task. We pioneered the research direction to improve the answer quality in terms of 1) query-relevance, 2) answer faithfulness, and 3) answer succinctness.

large language model, machine learning, question answering, (21 more...)

2211.08386

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Hong Kong (0.04)
North America > United States > New York (0.04)
(20 more...)

Genre:

Personal (0.92)
Research Report > New Finding (0.67)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Education (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(2 more...)

Robohub, Nov-14-2022, 09:00:14 GMT

#IROS2022 best paper awards

Did you have the chance to attend the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022) in Kyoto? Here we bring you the papers that received an award this year in case you missed them.

iros2022 best paper award, learning efficient bimanual folding, torsten kroeger, (7 more...)

Country: Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.27)

Genre: Personal > Honors (0.73)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Suzgun, Mirac, Melas-Kyriazi, Luke, Jurafsky, Dan

Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding

arXiv.org Artificial Intelligence, Nov-14-2022

In open-ended natural-language generation, existing text decoding methods typically struggle to produce text which is both diverse and high-quality. Greedy and beam search are known to suffer from text degeneration and linguistic diversity issues, while temperature, top-k, and nucleus sampling often yield diverse but low-quality outputs. In this work, we present crowd sampling, a family of decoding methods based on Bayesian risk minimization, to address this diversity-quality trade-off. Inspired by the principle of "the wisdom of the crowd," crowd sampling seeks to select a candidate from a pool of candidates that has the least expected risk (i.e., highest expected reward) under a generative model according to a given utility function. Crowd sampling can be seen as a generalization of numerous existing methods, including majority voting, and in practice, it can be used as a drop-in replacement for existing sampling methods. Extensive experiments show that crowd sampling delivers improvements of 3-7 ROUGE and BLEU points across a wide range of tasks, including summarization, data-to-text, translation, and textual style transfer, while achieving new state-of-the-art results on WebNLG and WMT'16.
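
The selection step described in this abstract (pick the candidate with the highest expected utility against the rest of the pool) can be sketched as follows. This is a minimal illustration of minimum Bayes risk selection over a fixed candidate pool, not the authors' implementation: the function names are hypothetical, the pool is treated as equally weighted samples, and the token-overlap F1 utility is an assumed stand-in for whatever utility function a real system would use.

```python
from collections import Counter


def token_f1(hyp, ref):
    """Token-overlap F1 between two whitespace-tokenized strings (assumed utility)."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(h)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def exact_match(hyp, ref):
    """Utility that is 1 only for identical strings."""
    return 1.0 if hyp == ref else 0.0


def mbr_select(candidates, utility=token_f1):
    """Return the candidate with the highest average utility against the others.

    Each candidate is scored against every other candidate in the pool, which acts
    as a Monte Carlo estimate of expected utility under the generating model.
    Ties are broken in favor of the earliest candidate.
    """
    if not candidates:
        raise ValueError("candidates must be non-empty")

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


pool = ["the cat sat on the mat", "the cat sat on a mat", "a dog barked loudly"]
choice = mbr_select(pool)
```

With the `exact_match` utility the same function reduces to picking the most frequent candidate, which is the sense in which the abstract describes majority voting as a special case.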

computational linguistic, large language model, machine learning, (18 more...)

2211.07634

Country:

North America > United States > Wisconsin > Outagamie County > Appleton (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Nepal > Bagmati Province > Kathmandu District > Kathmandu (0.04)
(35 more...)

Genre:

Research Report > New Finding (0.68)
Personal > Obituary (0.46)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Consumer Products & Services (0.68)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
(2 more...)

arXiv.org Artificial Intelligence, Nov-14-2022

Gradient Imitation Reinforcement Learning for General Low-Resource Information Extraction

Hu, Xuming, Meng, Shiao, Zhang, Chenwei, Yang, Xiangli, Wen, Lijie, King, Irwin, Yu, Philip S.

Information Extraction (IE) aims to extract structured information from heterogeneous sources. IE from natural language texts includes sub-tasks such as Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE). Most IE systems require comprehensive understanding of sentence structure, implied semantics, and domain knowledge to perform well; thus, IE tasks always need adequate external resources and annotations. However, it takes time and effort to obtain more human annotations. Low-Resource Information Extraction (LRIE) strives to use unsupervised data, reducing the required resources and human annotation. In practice, existing systems either utilize self-training schemes to generate pseudo labels, which cause the gradual drift problem, or leverage consistency regularization methods, which inevitably suffer from confirmation bias. To alleviate confirmation bias due to the lack of feedback loops in existing LRIE learning paradigms, we develop a Gradient Imitation Reinforcement Learning (GIRL) method to encourage pseudo-labeled data to imitate the gradient descent direction on labeled data, which can force pseudo-labeled data to achieve better optimization capabilities similar to labeled data. Based on how well the pseudo-labeled data imitates the instructive gradient descent direction obtained from labeled data, we design a reward to quantify the imitation process and bootstrap the optimization capability of pseudo-labeled data through trial and error. In addition to learning paradigms, GIRL is not limited to specific sub-tasks, and we leverage GIRL to solve all IE sub-tasks (named entity recognition, relation extraction, and event extraction) in low-resource settings (semi-supervised IE and few-shot IE).

machine learning, natural language, vanilla model, (14 more...)

2211.06014

Country:

Europe > United Kingdom (0.14)
North America > United States > Florida (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)
(11 more...)

Genre:

Personal (0.67)
Research Report (0.50)

Industry:

Education (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)