AITopics | Pacific Ocean

Pacific Ocean

ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems

Saad-Falcon, Jon, Khattab, Omar, Potts, Christopher, Zaharia, Matei

arXiv.org Artificial Intelligence, Nov-15-2023

Evaluating retrieval-augmented generation (RAG) systems traditionally relies on hand annotations for input queries, passages to retrieve, and responses to generate. We introduce ARES, an Automated RAG Evaluation System, for evaluating RAG systems along the dimensions of context relevance, answer faithfulness, and answer relevance. Using synthetic training data, ARES finetunes lightweight LM judges to assess the quality of individual RAG components. To mitigate potential prediction errors, ARES utilizes a small set of human-annotated datapoints for prediction-powered inference (PPI). Across six different knowledge-intensive tasks in KILT and SuperGLUE, ARES accurately evaluates RAG systems while using a few hundred human annotations during evaluation. Furthermore, ARES judges remain effective across domain shifts, proving accurate even after changing the type of queries and/or documents used in the evaluated RAG systems. We make our datasets and code for replication and deployment available at https://github.com/stanford-futuredata/ARES.
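The prediction-powered inference step mentioned in the abstract can be illustrated with the standard PPI estimate of a mean: average the judge's predictions over the large unlabeled set, then correct that average using the judge's measured error on the small human-annotated set. This is a minimal sketch under stated assumptions, not the ARES implementation; the function name `ppi_mean` and the toy 0/1 scores are hypothetical, and ARES's actual procedure (including any confidence intervals) may differ.

```python
import numpy as np

def ppi_mean(preds_unlabeled, preds_labeled, labels):
    """Prediction-powered point estimate of a mean score (hypothetical helper).

    preds_unlabeled: judge predictions on examples without human labels
    preds_labeled:   judge predictions on the human-annotated examples
    labels:          human annotations for those same examples
    """
    preds_unlabeled = np.asarray(preds_unlabeled, dtype=float)
    preds_labeled = np.asarray(preds_labeled, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Rectifier: average gap between human labels and judge predictions.
    rectifier = np.mean(labels - preds_labeled)
    # Judge-only estimate, shifted by the measured judge bias.
    return np.mean(preds_unlabeled) + rectifier

# Toy data (made up for illustration): the judge marks 3 of 4 unlabeled
# answers as relevant, but overpredicts on one of the 4 annotated examples.
estimate = ppi_mean(
    preds_unlabeled=[1, 1, 1, 0],
    preds_labeled=[1, 1, 0, 0],
    labels=[1, 0, 0, 0],
)
print(estimate)
```

The correction term is what lets a small annotated set offset systematic errors in the judge's predictions.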

context relevance, rag system, relevance, (13 more...)

arXiv.org Artificial Intelligence

2311.09476

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Alternatives to the Scaled Dot Product for Attention in the Transformer Neural Network Architecture

Bernhard, James

arXiv.org Artificial Intelligence, Nov-15-2023

The transformer neural network architecture uses a form of attention in which the dot product of query and key is divided by the square root of the key dimension before applying softmax. This scaling of the dot product is designed to avoid the absolute value of the dot products becoming so large that applying softmax leads to vanishing gradients. In this paper, we propose some alternative scalings, including dividing the dot product instead by the sum of the key lengths before applying softmax. We use simulated keys and queries to show that in many situations this appears to be more effective at avoiding regions where applying softmax leads to vanishing gradients. Attention plays a prominent role in the transformer neural network architecture, as indicated by the title of the landmark paper introducing the architecture, "Attention Is All You Need" [1], by Vaswani et al.
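To make the two scalings concrete, the sketch below computes attention weights for one query against a set of keys, first with the standard division by the square root of the key dimension and then with division by the sum of the key lengths. It is an illustration only, not the paper's code: reading "key lengths" as Euclidean norms of the key vectors is an assumption, and the function names and simulated Gaussian inputs are hypothetical.

```python
import numpy as np

def softmax(x):
    # Subtract the maximum for numerical stability before exponentiating.
    shifted = x - np.max(x)
    e = np.exp(shifted)
    return e / e.sum()

def scaled_dot_weights(q, K):
    """Standard scaling: divide q.k by sqrt(d_k), then apply softmax."""
    d_k = K.shape[1]
    return softmax(K @ q / np.sqrt(d_k))

def key_length_sum_weights(q, K):
    """Alternative scaling (assumed reading): divide q.k by the sum of
    the Euclidean norms of all keys, then apply softmax."""
    total_length = np.linalg.norm(K, axis=1).sum()
    return softmax(K @ q / total_length)

# Simulated keys and a query (standard normal entries, chosen arbitrarily).
rng = np.random.default_rng(0)
K = rng.normal(size=(8, 64))   # 8 keys of dimension 64
q = rng.normal(size=64)

w_std = scaled_dot_weights(q, K)
w_alt = key_length_sum_weights(q, K)
print(w_std.round(3))
print(w_alt.round(3))
```

Under this reading, the Cauchy-Schwarz inequality bounds each scaled score in absolute value by the norm of the query, which keeps the softmax inputs from growing with the size of the keys.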

attention function, scalar attention function, softmax, (14 more...)

arXiv.org Artificial Intelligence

2311.09406

Country:

Europe > Austria > Vienna (0.14)
Pacific Ocean > North Pacific Ocean > Puget Sound (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

When does In-context Learning Fall Short and Why? A Study on Specification-Heavy Tasks

Peng, Hao, Wang, Xiaozhi, Chen, Jianhui, Li, Weikai, Qi, Yunjia, Wang, Zimu, Wu, Zhili, Zeng, Kaisheng, Xu, Bin, Hou, Lei, Li, Juanzi

arXiv.org Artificial Intelligence, Nov-15-2023

In-context learning (ICL) has become the default method for using large language models (LLMs), making the exploration of its limitations and understanding the underlying causes crucial. In this paper, we find that ICL falls short of handling specification-heavy tasks, which are tasks with complicated and extensive task specifications, requiring several hours for ordinary humans to master, such as traditional information extraction tasks. The performance of ICL on these tasks mostly cannot reach half of the state-of-the-art results. To explore the reasons behind this failure, we conduct comprehensive experiments on 18 specification-heavy tasks with various LLMs and identify three primary reasons: inability to specifically understand context, misalignment in task schema comprehension with humans, and inadequate long-text understanding ability. Furthermore, we demonstrate that through fine-tuning, LLMs can achieve decent performance on these tasks, indicating that the failure of ICL is not an inherent flaw of LLMs, but rather a drawback of existing alignment methods that renders LLMs incapable of handling complicated specification-heavy tasks via ICL. To substantiate this, we perform dedicated instruction tuning on LLMs for these tasks and observe a notable improvement. We hope the analyses in this paper could facilitate advancements in alignment methods enabling LLMs to meet more sophisticated human demands.

arxiv preprint arxiv, proceedings, specification-heavy task, (13 more...)

arXiv.org Artificial Intelligence

2311.08993

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Philippines (0.04)
North America > United States > Texas (0.04)
(15 more...)

Genre: Research Report (0.86)

Industry:

Transportation (0.46)
Government (0.46)
Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Estimating Appearance Models for Image Segmentation via Tensor Factorization

Neto, Jeova Farias Sales Rocha

arXiv.org Machine Learning, Nov-15-2023

Image segmentation is one of the core tasks in computer vision, and solving it often depends on modeling the image appearance data via the color distributions of each of its constituent regions. Whereas many segmentation algorithms handle the dependence on appearance models using alternation or implicit methods, we propose here a new approach to directly estimate them from the image without prior information on the underlying segmentation. Our method uses local high-order color statistics from the image as input to a tensor factorization-based estimator for latent variable models. This approach is able to estimate models in multiregion images and automatically output the region proportions without prior user interaction, overcoming the drawbacks of a prior attempt at this problem. We also demonstrate the performance of our proposed method in many challenging synthetic and real imaging scenarios and show that it leads to an efficient segmentation algorithm.

artificial intelligence, machine learning, segmentation, (16 more...)

arXiv.org Machine Learning

2208.07853

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Czechia > Prague (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Biden and Xi look to put floor under plummeting U.S.-China ties

The Japan Times, Nov-14-2023, 08:23:00 GMT

Nearly a year to the date since their last meeting, U.S. President Joe Biden and Chinese leader Xi Jinping will sit down Wednesday in the San Francisco Bay Area to try and put a floor under ties that have plummeted to fresh lows in recent months. When Biden and Xi meet on the sidelines of the Asia-Pacific Economic Cooperation forum in San Francisco, both will have a laundry list of concerns to discuss. From military-to-military lines of communication, Taiwan, and the South and East China Seas to tough U.S. semiconductor export controls, the manufacture and export of fentanyl, and artificial intelligence threats -- all will be on the table during several hours of discussions. But don't expect the talks -- the pair's seventh interaction since the start of the Biden administration but just the second in-person meeting -- to yield any dramatic breakthroughs.

biden and xi

The Japan Times

Country:

Asia > China (1.00)
North America > United States > California > San Francisco County > San Francisco (0.57)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.31)
(2 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Asia Government > China Government (1.00)

Technology: Information Technology > Artificial Intelligence (0.67)

Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering

Cho, Sukmin, Seo, Jeongyeon, Jeong, Soyeong, Park, Jong C.

arXiv.org Artificial Intelligence, Nov-14-2023

Large language models (LLMs) enable zero-shot approaches in open-domain question answering (ODQA), yet the reader has seen limited advances compared with the retriever. This study examines the feasibility of a zero-shot reader that addresses the challenges of computational cost and the need for labeled data. We find that, when used as zero-shot readers, LLMs are distracted by irrelevant documents in the retrieved set and by overconfidence in their generated answers. To tackle these problems, we mitigate the impact of such documents via Distraction-aware Answer Selection (DAS) with a negation-based instruction and score adjustment for proper answer selection. Experimental results show that our approach successfully handles distraction across diverse scenarios, enhancing the performance of zero-shot readers. Furthermore, unlike supervised readers struggling with unseen data, zero-shot readers demonstrate outstanding transferability without any training.

computational linguistic, llm, zero-shot reader, (15 more...)

arXiv.org Artificial Intelligence

2310.1749

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(16 more...)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

The US Wants China to Start Talking About AI Weapons

WIRED, Nov-13-2023, 16:30:00 GMT

When US President Joe Biden meets with his Chinese counterpart Xi Jinping in the San Francisco Bay Area this week, the pair will have a long list of matters to discuss, including the Israel-Hamas war and Russia's ongoing invasion of Ukraine. Behind the scenes at the APEC summit, however, US officials hope to strike up a dialogue with China about placing guardrails around military use of artificial intelligence, with the ultimate goal of lessening the potential risks that rapid adoption--and reckless use--of the technology might bring. "We have a collective interest in reducing the potential risks from the deployment of unreliable AI applications," because of risks of unintended escalation, says a senior State Department official familiar with recent efforts to broach the issue, who spoke on condition of anonymity. "We very much hope to have a further conversation with China on this issue." Biden's meeting with Xi this week may provide momentum for more military dialogue.

china, declaration, military ai, (5 more...)

WIRED

Country:

Asia > China (1.00)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.26)
North America > United States > California > San Francisco County > San Francisco (0.26)
(5 more...)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Applied AI (0.38)

Consistency Analysis of ChatGPT

Jang, Myeongjun Erik, Lukasiewicz, Thomas

arXiv.org Artificial Intelligence, Nov-13-2023

ChatGPT has gained huge popularity since its introduction. Its positive aspects have been reported through many media platforms, and some analyses even showed that ChatGPT achieved a decent grade in professional exams, adding extra support to the claim that AI can now assist and even replace humans in industrial fields. Others, however, doubt its reliability and trustworthiness. This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour, focusing specifically on semantic consistency and on negation, symmetric, and transitive consistency. Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions. We also ascertain via experiments that prompt design, few-shot learning, and employing larger large language models (LLMs) are unlikely to be the ultimate solution to the inconsistency issue of LLMs.

chatgpt, computational linguistic, consistency, (16 more...)

arXiv.org Artificial Intelligence

2303.06273

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > Canada > Ontario > Toronto (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry: Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DynaConF: Dynamic Forecasting of Non-Stationary Time-Series

Liu, Siqi, Lehrmann, Andreas

arXiv.org Machine Learning, Nov-13-2023

Deep learning has shown impressive results in a variety of time series forecasting tasks, where modeling the conditional distribution of the future given the past is the essence. However, when this conditional distribution is non-stationary, it poses challenges for these models to learn consistently and to predict accurately. In this work, we propose a new method to model non-stationary conditional distributions over time by clearly decoupling stationary conditional distribution modeling from non-stationary dynamics modeling. Our method is based on a Bayesian dynamic model that can adapt to conditional distribution changes and a deep conditional distribution model that handles multivariate time series using a factorized output space. Our experimental results on synthetic and real-world datasets show that our model can adapt to non-stationary time series better than state-of-the-art deep learning solutions.

artificial intelligence, conditional distribution, machine learning, (13 more...)

arXiv.org Machine Learning

2209.08411

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Energy (0.68)
Education (0.54)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Abandoned America: AI imagines what famous US cities would look like after 100 years - if they were deserted by humans

Daily Mail - Science & tech, Nov-12-2023, 12:25:57 GMT

What would American cities look like 100 years after human beings have left, with the streets devoid of human life - and beginning to be reclaimed by nature? While the chatbot put our future world in text, the AI photo generator Midjourney painted pictures of these abandoned metropolises, showing the concrete jungles transforming into jungles. Kieron Connolly, author of Abandoned Places and Abandoned Civilizations, says that visions of abandoned cities have a unique power. 'This isn't what city life is supposed to look like. Nature is allowed to reclaim the land,' Connolly said. ChatGPT writes, 'In the year 2123, the once-thriving metropolis of Chicago stands as a haunting testament to the passage of time and the resilience of nature.

abandoned america, chatgpt write, landscape, (13 more...)

Daily Mail - Science & tech

Country:

North America > United States > Illinois > Cook County > Chicago (0.26)
North America > United States > New York (0.07)
North America > United States > California > Los Angeles County > Los Angeles (0.07)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)
