AITopics | Bridgeport

Bridgeport

Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models

arXiv.org Artificial Intelligence, Feb-24-2025

Transformer models typically calculate attention matrices using dot products, which have limitations when capturing nonlinear relationships between embedding vectors. We propose Neural Attention, a technique that replaces dot products with feed-forward networks, enabling a more expressive representation of relationships between tokens. This approach modifies only the attention matrix calculation while preserving the matrix dimensions, making it easily adaptable to existing transformer-based architectures. We provide a detailed mathematical justification for why Neural Attention increases representational capacity and conduct controlled experiments to validate this claim. When comparing Neural Attention and Dot-Product Attention, NLP experiments on WikiText-103 show a reduction in perplexity of over 5 percent. Similarly, experiments on CIFAR-10 and CIFAR-100 show comparable improvements for image classification tasks. While Neural Attention introduces higher computational demands, we develop techniques to mitigate these challenges, ensuring practical usability without sacrificing the increased expressivity it provides. This work establishes Neural Attention as an effective means of enhancing the predictive capabilities of transformer models across a variety of applications.
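As a rough illustration of the idea in this abstract, the sketch below computes an attention score matrix two ways: with scaled dot products, and with a small feed-forward network applied to each query-key pair. The abstract does not say how the network receives its inputs, so the concatenation of query and key, the single ReLU hidden layer, the random untrained weights, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_scores(queries, keys):
    """Standard scaled dot-product scores: one scalar per (query, key) pair."""
    d = len(queries[0])
    return [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        for q in queries
    ]


def make_pair_mlp(d, hidden, seed=0):
    """Build a tiny feed-forward scorer (hypothetical design).

    Assumption: the network sees the concatenation [q; k] and returns a
    scalar. Weights are random here; in a real model they would be learned.
    """
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2 * d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

    def score(q, k):
        x = q + k  # list concatenation: [q; k]
        h = [
            max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU layer
            for row, b in zip(w1, b1)
        ]
        return sum(w * v for w, v in zip(w2, h))

    return score


def neural_scores(queries, keys, pair_mlp):
    """Replace each dot product with a feed-forward evaluation."""
    return [[pair_mlp(q, k) for k in keys] for q in queries]


def attention_weights(scores):
    """Row-wise softmax turns scores into attention weights."""
    return [softmax(row) for row in scores]
```

Both score functions return one scalar per token pair, so the resulting matrix has the same shape in either case, which matches the abstract's point that only the attention matrix calculation changes. The per-pair network evaluation also makes the higher computational cost mentioned in the abstract easy to see.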

dot-product attention, matrix, neural attention, (14 more...)

arXiv.org Artificial Intelligence

2502.17206

Country: North America > United States > Connecticut > Fairfield County > Bridgeport (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models

Li, Jiatao, Hu, Xinyu, Yin, Xunjian, Wan, Xiaojun

arXiv.org Artificial Intelligence, Dec-14-2024

The integration of documents generated by LLMs themselves (Self-Docs) alongside retrieved documents has emerged as a promising strategy for retrieval-augmented generation systems. However, previous research primarily focuses on optimizing the use of Self-Docs, with their inherent properties remaining underexplored. To bridge this gap, we first investigate the overall effectiveness of Self-Docs, identifying key factors that shape their contribution to RAG performance (RQ1). Building on these insights, we develop a taxonomy grounded in Systemic Functional Linguistics to compare the influence of various Self-Docs categories (RQ2) and explore strategies for combining them with external sources (RQ3). Our findings reveal which types of Self-Docs are most beneficial and offer practical guidelines for leveraging them to achieve significant improvements in knowledge-intensive question answering tasks.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2410.13192

Country:

Asia > Russia (1.00)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Eastern Europe (0.04)
(14 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Transportation > Air (1.00)
Media > Film (1.00)
Leisure & Entertainment (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

xLSTMTime: Long-term Time Series Forecasting With xLSTM

Alharthi, Musleh, Mahmood, Ausif

arXiv.org Artificial Intelligence, Jul-14-2024

In recent years, transformer-based models have gained prominence in multivariate long-term time series forecasting (LTSF), demonstrating significant advancements despite facing challenges such as high computational demands, difficulty in capturing temporal dynamics, and managing long-term dependencies. The emergence of LTSF-Linear, with its straightforward linear architecture, has notably outperformed transformer-based counterparts, prompting a reevaluation of the transformer's utility in time series forecasting. In response, this paper presents an adaptation of a recent architecture termed extended LSTM (xLSTM) for LTSF. xLSTM incorporates exponential gating and a revised memory structure with higher capacity that has good potential for LTSF. Our adopted architecture for LTSF, termed xLSTMTime, surpasses current approaches. We compare xLSTMTime's performance against various state-of-the-art models across multiple real-world datasets, demonstrating superior forecasting capabilities. Our findings suggest that refined recurrent architectures can offer competitive alternatives to transformer-based models in LTSF tasks, potentially redefining the landscape of time series forecasting.

dataset, forecasting, time series forecasting, (12 more...)

arXiv.org Artificial Intelligence

2407.1024

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Oceania > Australia (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry: Energy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

TopicGPT: A Prompt-based Topic Modeling Framework

Pham, Chau Minh, Hoyle, Alexander, Sun, Simeng, Iyyer, Mohit

arXiv.org Artificial Intelligence, Nov-2-2023

Topic modeling is a well-established technique for exploring text corpora. Conventional topic models (e.g., LDA) represent topics as bags of words that often require "reading the tea leaves" to interpret; additionally, they offer users minimal semantic control over topics. To tackle these issues, we introduce TopicGPT, a prompt-based framework that uses large language models (LLMs) to uncover latent topics within a provided text collection. TopicGPT produces topics that align better with human categorizations compared to competing methods: for example, it achieves a harmonic mean purity of 0.74 against human-annotated Wikipedia topics compared to 0.64 for the strongest baseline. Its topics are also more interpretable, dispensing with ambiguous bags of words in favor of topics with natural language labels and associated free-form descriptions. Moreover, the framework is highly adaptable, allowing users to specify constraints and modify topics without the need for model retraining. TopicGPT can be further extended to hierarchical topical modeling, enabling users to explore topics at various levels of granularity. By streamlining access to high-quality and interpretable topics, TopicGPT represents a compelling, human-centered approach to topic modeling.
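The abstract reports a "harmonic mean purity" score without defining it. A minimal sketch follows, assuming the metric is the harmonic mean of cluster purity and inverse purity computed between predicted topic assignments and human-annotated labels. That definition, the function names, and the toy labels are assumptions for illustration and may differ from the paper's exact evaluation code.

```python
from collections import Counter


def purity(predicted, reference):
    """Fraction of items that belong to the majority reference label
    of their predicted cluster. Assumes non-empty, equal-length inputs."""
    clusters = {}
    for p, r in zip(predicted, reference):
        clusters.setdefault(p, Counter())[r] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(predicted)


def harmonic_mean_purity(predicted, reference):
    """Harmonic mean of purity and inverse purity (assumed definition).

    Inverse purity is purity with the roles of predicted and reference
    assignments swapped.
    """
    p = purity(predicted, reference)
    inv = purity(reference, predicted)
    return 2 * p * inv / (p + inv)


# Toy example with made-up labels: four documents, two predicted topics.
predicted = ["a", "a", "b", "b"]
reference = ["x", "x", "x", "y"]
score = harmonic_mean_purity(predicted, reference)  # 0.75 for this toy input
```

Purity alone rewards splitting a corpus into many tiny topics, and inverse purity alone rewards lumping everything into one topic, so combining the two penalizes both extremes.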

assignment, dataset, topicgpt, (15 more...)

arXiv.org Artificial Intelligence

2311.01449

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Minnesota (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Transportation (1.00)
Law (1.00)
Consumer Products & Services > Restaurants (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Can ChatGPT Plan Your Vacation?

#artificialintelligence, Mar-20-2023, 02:58:45 GMT

Powerful new artificial-intelligence software is already shaking up the travel industry, but it has a long way to go until it can plan a seamless trip. "I want to hit a history museum and an amusement park -- and then I'd like 7 p.m. dinner reservations near the hotel at a restaurant with vegan options and a great wine list." But for now, travelers using ChatGPT -- the powerful new A.I. software that is already offering creative cocktail recipes and writing college papers -- may have to temper their expectations. Oded Battat, the general manager at Traveland, a travel agency in Bridgeport, Conn., asked ChatGPT for outings he might offer his clients going to Tuscany to see if it could help him with his work. He got a list of 14 activities, including winery tours and museum visits, with a stop for gelato in the town square of the medieval hill town San Gimignano.

chatgpt, chatgpt plan, vacation, (2 more...)

#artificialintelligence

Country:

North America > United States > Connecticut > Fairfield County > Bridgeport (0.27)
Europe > Italy > Tuscany (0.27)
North America > United States > California > Los Angeles County > Los Angeles (0.07)

Industry: Consumer Products & Services > Travel (0.99)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

How ChatGPT and Generative AI Could Change the Way We Travel - The New York Times

#artificialintelligence, Mar-16-2023, 09:05:21 GMT

"I want to hit a history museum and an amusement park -- and then I'd like 7 p.m. dinner reservations near the hotel at a restaurant with vegan options and a great wine list." But for now, travelers using ChatGPT -- the powerful new A.I. software that is already offering creative cocktail recipes and writing college papers -- may have to temper their expectations. Oded Battat, the general manager at Traveland, a travel agency in Bridgeport, Conn., asked ChatGPT for outings he might offer his clients going to Tuscany to see if it could help him with his work. He got a list of 14 activities, including winery tours and museum visits, with a stop for gelato in the town square of the medieval hill town San Gimignano. "I knew of all these things," Mr. Battat said, but, he added, ChatGPT saved him the hassle of collecting all the information and delivered it in a format he was able to email to one of the clients.

battat, chatgpt and generative ai, new york time, (1 more...)

#artificialintelligence

Country:

North America > United States > Connecticut > Fairfield County > Bridgeport (0.27)
Europe > Italy > Tuscany (0.27)
North America > United States > California > Los Angeles County > Los Angeles (0.07)

Industry:

Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.60)
Consumer Products & Services > Travel (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?

Niklaus, Joel, Giofré, Daniele

arXiv.org Artificial Intelligence, Nov-30-2022

Pretrained transformer models have achieved state-of-the-art results in many tasks and benchmarks recently. Many state-of-the-art Language Models (LMs), however, do not scale well above the threshold of 512 input tokens. In specialized domains though (such as legal, scientific or biomedical), models often need to process very long text (sometimes well above 10000 tokens). Even though many efficient transformers have been proposed (such as Longformer, BigBird or FNet), so far, only very few such efficient models are available for specialized domains. Additionally, since the pretraining process is extremely costly in general - but even more so as the sequence length increases - it is often only in reach of large research labs. One way of making pretraining cheaper is the Replaced Token Detection (RTD) task, by providing more signal during training, since the loss can be computed over all tokens. In this work, we train Longformer models with the efficient RTD task on legal data to showcase that pretraining efficient LMs is possible using much less compute. We evaluate the trained models on challenging summarization tasks requiring the model to summarize long texts to show to what extent the models can achieve good performance on downstream tasks. We find that both the small and base models outperform their baselines on the in-domain BillSum and out-of-domain PubMed tasks in their respective parameter range. We publish our code and models for research purposes.
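To make the abstract's remark about Replaced Token Detection concrete, the sketch below shows why the loss covers every token: each position gets a binary label (original or replaced), and the binary cross-entropy is averaged over the whole sequence. The token strings, the probabilities, and the function names are invented for illustration. In an actual setup a generator model would propose the replacements and a discriminator would output the probabilities, and none of this is taken from the authors' code.

```python
import math


def rtd_labels(original, corrupted):
    """1 where the token was replaced, 0 where it was kept.
    Every position receives a label, so every position contributes to the loss."""
    return [int(o != c) for o, c in zip(original, corrupted)]


def rtd_loss(replaced_probs, labels, eps=1e-12):
    """Mean binary cross-entropy over all tokens.

    replaced_probs[i] is the (hypothetical) discriminator probability
    that token i was replaced.
    """
    losses = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        for p, y in zip(replaced_probs, labels)
    ]
    return sum(losses) / len(losses)


# Made-up example: one of four tokens has been swapped.
original = ["the", "court", "ruled", "today"]
corrupted = ["the", "judge", "ruled", "today"]
labels = rtd_labels(original, corrupted)          # [0, 1, 0, 0]
loss = rtd_loss([0.1, 0.9, 0.1, 0.1], labels)     # averaged over all 4 tokens
```

By contrast, a masked language modeling objective is typically computed only at the masked positions, which is the difference in training signal the abstract refers to.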

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2211.17135

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > Ontario (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Transportation > Air (1.00)
Law > Statutes (1.00)
Law > Intellectual Property & Technology Law (1.00)
(24 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Lawyers of the world: Robots aren't replacing you--yet

#artificialintelligence, Oct-25-2019, 21:14:07 GMT

Artificial intelligence (AI) may soon render many jobs obsolete. Remember how popular one-hour photo shops were in the 1980s and into the mid-1990s? That's just the tip of the tech iceberg, as AI now seems to be gunning to take over the legal world. The UK-based Law Society noted in a study earlier this year: "Over the longer term, the number of jobs in the legal services sector will be increasingly affected by automation of legal services functions. This could mean that by 2038 total employment in the sector could be 20% less than it would otherwise have been, with a loss of 78,000 jobs -- equal to 67,000 full-time equivalent jobs -- compared to if productivity growth continued at its current rate."

lawyer, legal tech solution, lillquist, (12 more...)

#artificialintelligence

Country:

Europe > United Kingdom (0.25)
North America > United States > Connecticut > Fairfield County > Bridgeport (0.05)
North America > United States > California > Los Angeles County > Los Angeles (0.05)
North America > United States > California > Los Angeles County > Beverly Hills (0.05)

Industry: Law (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (0.69)

California Inc.: Eclipse day is here, but be careful of some safety glasses

Los Angeles Times, Aug-21-2017, 12:35:36 GMT

Welcome to California Inc., the weekly newsletter of the L.A. Times Business Section. Stocks took a pounding last week as the political turbulence in Washington and terror attacks in Spain caught up with the market. But closer to home, employers statewide increased their payrolls by 82,600 jobs in July. Sectors that saw the most employment gains include government, which added 18,800 jobs; educational and health services, which saw an increase of 18,600; and leisure and hospitality, which was up 15,200 jobs. Dark day: The long-awaited solar eclipse sweeps across America on Monday.

artificial intelligence, california inc, social media, (14 more...)

Los Angeles Times

Country:

Europe > Spain (0.25)
North America > Mexico (0.16)
North America > Canada (0.16)
(10 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Real Estate (0.96)

Technology:

Information Technology > Communications > Social Media (0.50)
Information Technology > Artificial Intelligence (0.30)