AITopics

2412.00758

Country:

Europe > Netherlands > South Holland > Leiden (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Fujii, Akihiro, Ushiku, Yoshitaka, Shimizu, Koji, Lu, Anh Khoa Augustin, Watanabe, Satoshi

Adaptive Constraint Integration for Simultaneously Optimizing Crystal Structures with Multiple Targeted Properties

arXiv.org Artificial Intelligence, Dec-1-2024

In materials science, finding crystal structures that have targeted properties is crucial. While recent methodologies such as Bayesian optimization and deep generative models have made some advances on this problem, they often struggle to adaptively incorporate various constraints, such as electrical neutrality and the optimization of targeted properties, while preserving a desired crystal structure. To address these challenges, we have developed the Simultaneous Multi-property Optimization using Adaptive Crystal Synthesizer (SMOACS), which uses state-of-the-art property prediction models and their gradients to directly optimize input crystal structures for multiple targeted properties simultaneously. SMOACS enables the integration of adaptive constraints into the optimization process without model retraining. Thanks to this feature, SMOACS has succeeded in simultaneously optimizing targeted properties while maintaining perovskite structures, even with models trained on diverse crystal types. We have demonstrated band gap optimization under a challenging constraint: maintaining electrical neutrality in large atomic configurations of up to 135 atom sites, where verifying electrical neutrality is itself difficult. The properties of the most promising materials have been confirmed by density functional theory calculations.
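To illustrate why verifying electrical neutrality becomes harder as structures grow, here is a minimal sketch of a brute-force neutrality check. It is not part of SMOACS: the function name `can_be_neutral`, the table of allowed oxidation states, and the simplifying assumption that all atoms of one element share a single oxidation state are all hypothetical choices made for this example.

```python
from itertools import product

def can_be_neutral(counts, allowed_states):
    """Return True if some choice of oxidation states gives zero total charge.

    counts: element -> number of atoms in the cell.
    allowed_states: element -> list of candidate oxidation states (assumed table).
    Simplification: every atom of a given element takes the same state.
    """
    elements = list(counts)
    # Enumerate one oxidation state per element; the number of combinations
    # grows multiplicatively with the number of distinct elements.
    for combo in product(*(allowed_states[el] for el in elements)):
        total_charge = sum(counts[el] * q for el, q in zip(elements, combo))
        if total_charge == 0:
            return True
    return False

# Illustrative oxidation-state table (an assumption, not taken from the paper).
states = {"Sr": [2], "Ti": [2, 3, 4], "O": [-2]}

# SrTiO3, a well-known perovskite: Sr(+2) + Ti(+4) + 3 * O(-2) = 0.
print(can_be_neutral({"Sr": 1, "Ti": 1, "O": 3}, states))
# With four oxygens, Ti would need +6, which is not in the table.
print(can_be_neutral({"Sr": 1, "Ti": 1, "O": 4}, states))
```

If individual atoms of the same element were allowed to take different states, the search space would grow with the number of atom sites rather than the number of elements, which hints at why cells with many sites are demanding to verify.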

crystal structure, electrical neutrality, perovskite structure, (13 more...)

2410.08562

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bagga, Harleen Kaur, Bernard, Jasmine, Shaheen, Sahil, Arora, Sarthak

Was that Sarcasm?: A Literature Survey on Sarcasm Detection

arXiv.org Artificial Intelligence, Nov-30-2024

Sarcasm is hard for human beings to interpret. Given its complex nature, the ability to interpret sarcasm is often regarded as a sign of intelligence. Hence, it is an area of Natural Language Processing that remains difficult for computers to decipher. This literature survey delves into different aspects of sarcasm detection to build an understanding of the underlying problems faced during detection, the approaches used to solve them, and the different forms of datasets available for sarcasm detection.

large language model, machine learning, natural language, (17 more...)

2412.00425

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Dehghankar, Mohsen, Asudeh, Abolfazl

Rank It, Then Ask It: Input Reranking for Maximizing the Performance of LLMs on Symmetric Tasks

arXiv.org Artificial Intelligence, Nov-30-2024

Large language models (LLMs) have quickly emerged as practical and versatile tools that provide new solutions for a wide range of domains. In this paper, we consider the application of LLMs to symmetric tasks, where a query is asked on an (unordered) bag of elements. Examples of such tasks include answering aggregate queries on a database table. In general, when the bag contains a large number of elements, LLMs tend to overlook some elements, leading to challenges in generating accurate responses to the query. LLMs receive their inputs as ordered sequences. However, in this problem, we leverage the fact that the symmetric input is not ordered, and reordering should not affect the LLM's response. Observing that LLMs are less likely to miss elements at certain positions of the input, we introduce the problem of LLM input reranking: to find a ranking of the input that maximizes the LLM's accuracy for the given query without making explicit assumptions about the query. Finding the optimal ranking requires identifying (i) the relevance of each input element for answering the query and (ii) the importance of each rank position for the LLM's attention. We develop algorithms for estimating these values efficiently using a helper LLM. We conduct comprehensive experiments on different synthetic and real datasets to validate our proposal and to evaluate the effectiveness of our proposed algorithms. Our experiments confirm that our reranking approach improves the accuracy of LLMs on symmetric tasks, reaching up to 99% proximity to the optimum upper bound.
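As a hedged illustration of the final assignment step only: assuming per-element relevance scores and per-position importance weights have already been estimated (the abstract says this is done with a helper LLM; here they are hypothetical hard-coded numbers), one simple strategy is to place the most relevant elements at the positions the model attends to most. The function name `rerank` and this greedy pairing are assumptions for illustration, not the authors' algorithm.

```python
def rerank(elements, relevance, position_weight):
    """Place the most relevant elements at the most important positions.

    elements: list of input items (the unordered bag).
    relevance: relevance[i] is the assumed relevance of elements[i] to the query.
    position_weight: position_weight[p] is the assumed importance of slot p.
    """
    if not (len(elements) == len(relevance) == len(position_weight)):
        raise ValueError("all three inputs must have the same length")

    # Element indices from most to least relevant.
    by_relevance = sorted(range(len(elements)),
                          key=lambda i: relevance[i], reverse=True)
    # Slot indices from most to least important.
    by_weight = sorted(range(len(position_weight)),
                       key=lambda p: position_weight[p], reverse=True)

    ordered = [None] * len(elements)
    for elem_idx, slot in zip(by_relevance, by_weight):
        ordered[slot] = elements[elem_idx]
    return ordered

# Hypothetical scores: the first and last slots are assumed to matter most.
rows = ["a", "b", "c", "d"]
print(rerank(rows, relevance=[0.1, 0.9, 0.5, 0.3],
             position_weight=[0.8, 0.2, 0.3, 0.9]))
```

Pairing the two sorted lists maximizes the sum of relevance times position weight (by the rearrangement inequality), which is one plausible objective under these assumptions.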

large language model, machine learning, natural language, (19 more...)

2412.00546

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Meisenbacher, Stefan, Selzer, Silas Aaron, Dado, Mehdi, Beichter, Maximilian, Martin, Tim, Zdrallek, Markus, Bretschneider, Peter, Hagenmeyer, Veit, Mikut, Ralf

On autoregressive deep learning models for day-ahead wind power forecasting with irregular shutdowns due to redispatching

Renewable energies and their operation are becoming increasingly vital for the stability of electrical power grids since conventional power plants are progressively being displaced, and their contribution to redispatch interventions is thereby diminishing. To consider renewable energies like Wind Power (WP) as a substitute for such interventions, day-ahead forecasts are necessary to communicate their availability for redispatch planning. In this context, automated and scalable forecasting models are required for deployment to thousands of locally distributed onshore WP turbines. Furthermore, the irregular interventions into the WP generation capabilities due to redispatch shutdowns pose challenges in the design and operation of WP forecasting models. Since state-of-the-art forecasting methods consider past WP generation values alongside day-ahead weather forecasts, redispatch shutdowns may impact the forecast. Therefore, the present paper highlights these challenges and analyzes state-of-the-art forecasting methods on data sets with both regular and irregular shutdowns. Specifically, we compare the forecasting accuracy of three autoregressive Deep Learning (DL) methods to methods based on WP curve modeling. Interestingly, the latter achieve lower forecasting errors, require less data cleaning during modeling and operation, and are computationally more efficient, suggesting their advantages in practical applications.

artificial intelligence, data quality, machine learning, (21 more...)

2412.00423

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry:

Energy > Renewable > Wind (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AutoPQ: Automating Quantile estimation from Point forecasts in the context of sustainability

Meisenbacher, Stefan, Phipps, Kaleb, Taubert, Oskar, Weiel, Marie, Götz, Markus, Mikut, Ralf, Hagenmeyer, Veit

Optimizing smart grid operations relies on critical decision-making informed by uncertainty quantification, making probabilistic forecasting a vital tool. Designing such forecasting models involves three key challenges: accurate and unbiased uncertainty quantification, workload reduction for data scientists during the design process, and limitation of the environmental impact of model training. To address these challenges, we introduce AutoPQ, a novel method designed to automate and optimize probabilistic forecasting for smart grid applications. AutoPQ enhances forecast uncertainty quantification by generating quantile forecasts from an existing point forecast using a conditional Invertible Neural Network (cINN). AutoPQ also automates the selection of the underlying point forecasting method and the optimization of hyperparameters, ensuring that the best model and configuration are chosen for each application. For flexible adaptation to various performance needs and available computing power, AutoPQ comes with a default and an advanced configuration, making it suitable for a wide range of smart grid applications. We show that AutoPQ outperforms state-of-the-art probabilistic forecasting methods while effectively limiting computational effort and hence environmental impact. Additionally, in the context of sustainability, AutoPQ provides transparency by quantifying the electricity consumption required for performance improvements.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

2412.00419

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Europe > Switzerland (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Interval Estimation of Coefficients in Penalized Regression Models of Insurance Data

Manna, Alokesh, Huang, Zijian, Dey, Dipak K., Gu, Yuwen

The Tweedie exponential dispersion family is a popular choice for modeling insurance losses that consist of zero-inflated semicontinuous data. In such data, it is often important to obtain credibility (inference) of the most important features that describe the endogenous variables. Post-selection inference is the standard procedure in statistics for obtaining confidence intervals of model parameters after performing a feature extraction procedure. For a linear model, the lasso estimate often has non-negligible estimation bias for large coefficients corresponding to exogenous variables. To have valid inference on those coefficients, it is necessary to correct the bias of the lasso estimate. Traditional statistical methods, such as hypothesis testing or standard confidence interval construction, might lead to incorrect conclusions during post-selection, as they are generally too optimistic. Here we discuss a few methodologies for constructing confidence intervals of the coefficients after feature selection in the Generalized Linear Model (GLM) family, with application to insurance data.
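The lasso bias on large coefficients mentioned above can be seen in the simplest textbook setting. Assuming a linear model with an orthonormal design and the objective (1/2)||y - Xb||^2 + lam * ||b||_1 (an assumption for this sketch; the abstract does not specify the setup), each lasso coefficient is the soft-thresholded least-squares coefficient, so every surviving coefficient is pulled toward zero by exactly `lam`. The function name `soft_threshold` is chosen for this example.

```python
def soft_threshold(z, lam):
    """Lasso solution for one coefficient under an orthonormal design.

    z: the ordinary least-squares estimate of the coefficient.
    lam: the L1 penalty weight (lam >= 0).
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

lam = 1.0
for ols_estimate in (5.0, -3.0, 0.5):
    shrunk = soft_threshold(ols_estimate, lam)
    # Large coefficients survive selection but are shifted toward zero by lam;
    # small ones are set to zero.
    print(ols_estimate, "->", shrunk)
```

This constant shift is what debiasing procedures aim to correct before confidence intervals are built around the selected coefficients.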

artificial intelligence, bayesian inference, machine learning, (19 more...)

2410.01008

Country:

North America > United States > Connecticut (0.04)
South America > Ecuador (0.04)
North America > United States > Wisconsin (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Insurance (1.00)
Transportation > Ground > Road (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Predictive Inference With Fast Feature Conformal Prediction

Tang, Zihao, Wang, Boyuan, Wen, Chuan, Teng, Jiaye

Conformal prediction is widely adopted in uncertainty quantification, due to its post-hoc, distribution-free, and model-agnostic properties. In the realm of modern deep learning, researchers have proposed Feature Conformal Prediction (FCP), which deploys conformal prediction in a feature space, yielding reduced band lengths. However, the practical utility of FCP is limited due to the time-consuming non-linear operations required to transform confidence bands from feature space to output space. In this paper, we introduce Fast Feature Conformal Prediction (FFCP), which features a novel non-conformity score and is convenient for practical applications. FFCP serves as a fast version of FCP, in that it equivalently employs a Taylor expansion to approximate the aforementioned non-linear operations in FCP. Empirical validations showcase that FFCP performs comparably with FCP (both outperforming the vanilla version) while achieving a significant reduction in computational time by approximately 50x. The code is available at https://github.com/ElvisWang1111/FastFeatureCP
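For context, the "vanilla version" referred to above is commonly implemented as split conformal prediction in output space. The following is a minimal sketch of that baseline only, not of FCP or FFCP; the function names and the choice of absolute residuals as the non-conformity score are assumptions for illustration.

```python
import math

def split_conformal_radius(cal_residuals, alpha):
    """Half-width of a split conformal band at miscoverage level alpha.

    cal_residuals: absolute errors |y - prediction| on a held-out calibration set.
    Uses the ceil((n + 1) * (1 - alpha))-th smallest residual.
    """
    n = len(cal_residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return math.inf
    return sorted(cal_residuals)[k - 1]

def predict_interval(point_prediction, radius):
    """Symmetric band around a point prediction."""
    return (point_prediction - radius, point_prediction + radius)

# Hypothetical calibration residuals from some already-trained regressor.
residuals = [3, 1, 4, 10, 5, 9, 2, 6, 8, 7]
radius = split_conformal_radius(residuals, alpha=0.2)
print(predict_interval(20.0, radius))
```

The band width here is the same for every input; feature-space variants such as those described in the abstract aim to produce shorter bands.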

artificial intelligence, ffcp, machine learning, (15 more...)

2412.00653

Country:

Asia > Middle East > Jordan (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Tennessee (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-29-2024

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

Parker, Julian D, Smirnov, Anton, Pons, Jordi, Carr, CJ, Zukowski, Zack, Evans, Zach, Liu, Xubo

The tokenization of speech with neural audio codec models is a vital part of modern AI pipelines for the generation or understanding of speech, alone or in a multimodal context. Traditionally such tokenization models have concentrated on low parameter-count architectures using only components with strong inductive biases. In this work we show that by scaling a transformer architecture with large parameter count to this problem, and applying a flexible Finite Scalar Quantization (FSQ) based bottleneck, it is possible to reach state-of-the-art speech quality at extremely low bit-rates of 400 or 700 bits-per-second. The trained models strongly out-perform existing baselines in both objective and subjective tests. Compressed coding of audio and speech data in digital format has been an active area of research since the 1970s, and reached particular prominence in the late 1990s with the emergence of mp3 (Painter & Spanias, 2000).
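As a rough sketch of how a Finite Scalar Quantization bottleneck relates to bitrate: each latent dimension is bounded and rounded to a small number of levels, and the implied codebook size is the product of the levels. The code below assumes odd level counts and a tanh bound, and the level configuration and token rate are hypothetical; none of these values are taken from the paper.

```python
import math

def fsq_quantize(latent, levels):
    """Quantize each latent dimension to one of `levels[d]` integer values.

    Assumes every entry of `levels` is odd, so the grid is symmetric around 0.
    """
    codes = []
    for z, num_levels in zip(latent, levels):
        half = (num_levels - 1) / 2
        # Bound to (-half, half) with tanh, then snap to the nearest integer.
        codes.append(round(half * math.tanh(z)))
    return codes

def bits_per_token(levels):
    """Bits needed to index the implied codebook (product of the levels)."""
    return math.log2(math.prod(levels))

levels = [5, 5, 5]          # hypothetical configuration: 125 codes
tokens_per_second = 50      # hypothetical token rate
print(fsq_quantize([0.0, 10.0, -10.0], levels))
print(bits_per_token(levels) * tokens_per_second, "bits per second")
```

Under these assumptions, the bitrate is simply the token rate times the bits per token, so lowering either the number of levels or the token rate lowers the bitrate.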

large language model, machine learning, natural language, (21 more...)

2411.19842

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.93)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Speech (0.93)

arXiv.org Artificial Intelligence, Nov-29-2024

What fifty-one years of Linguistics and Artificial Intelligence research tell us about their correlation: A scientometric review

Shormani, Mohammed Q.

There is a strong correlation between linguistics and artificial intelligence (AI), best manifested by deep learning language models. This study provides a thorough scientometric analysis of this correlation, synthesizing the intellectual production over 51 years, from 1974 to 2024. It involves 5750 Web of Science-indexed articles published in 2124 journals, which are written by 20835 authors belonging to 13773 research centers in 794 countries. Two powerful software tools, viz. CiteSpace and VOSviewer, were used to generate mapping visualizations of the intellectual landscape, trending issues and (re)emerging hotspots. The results indicate that in the 1980s and 1990s, linguistics and AI research was not robust, characterized by unstable publication over time. It has, however, witnessed a remarkable increase in publication since then, reaching 1478 articles in 2023 and 546 articles in the January-March timespan of 2024, involving emerging issues and hotspots, addressing new horizons and new topics, and launching new applications and powerful deep learning language models, including ChatGPT.

large language model, machine learning, natural language, (19 more...)

2411.19858

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Spain (0.04)
Europe > United Kingdom > England (0.04)
(13 more...)

Genre: Research Report > Experimental Study (0.48)

Industry:

Health & Medicine (1.00)
Education (1.00)
Consumer Products & Services > Restaurants (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)