AITopics

With the advance of high-throughput genotyping and sequencing technologies, it becomes feasible to comprehensive evaluate the role of massive genetic predictors in disease prediction. There exists, therefore, a critical need for developing appropriate statistical measurements to access the combined effects of these genetic variants in disease prediction. Predictiveness curve is commonly used as a graphical tool to measure the predictive ability of a risk prediction model on a single continuous biomarker. Yet, for most complex diseases, risk prediciton models are formed on multiple genetic variants. We therefore propose a multi-marker predictiveness curve and provide a non-parametric method to construct the curve for case-control studies. We further introduce a global predictiveness U and a partial predictiveness U to summarize prediction curve across the whole population and sub-population of clinical interest, respectively. We also demonstrate the connections of predictiveness curve with ROC curve and Lorenz curve. Through simulation, we compared the performance of the predictiveness U to other three summary indices: R square, Total Gain, and Average Entropy, and showed that Predictiveness U outperformed the other three indexes in terms of unbiasedness and robustness. Moreover, we simulated a series of rare-variants disease model, found partial predictiveness U performed better than global predictiveness U. Finally, we conducted a real data analysis, using predictiveness curve and predictiveness U to evaluate a risk prediction model for Nicotine Dependence.

artificial intelligence, machine learning, predictiveness curve, (15 more...)

doi: 10.1177/0962280218819202

2504.00024

Country:

North America > United States > Michigan (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Indiana (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Gurung, Alexander, Lapata, Mirella

Learning to Reason for Long-Form Story Generation

Generating high-quality stories spanning thousands of tokens requires competency across a variety of skills, from tracking plot and character arcs to keeping a consistent and engaging style. Due to the difficulty of sourcing labeled datasets and precise quality measurements, most work using large language models (LLMs) for long-form story generation uses combinations of hand-designed prompting techniques to elicit author-like behavior. This is a manual process that is highly dependent on the specific story-generation task. Motivated by the recent success of applying RL with Verifiable Rewards to domains like math and coding, we propose a general story-generation task (Next-Chapter Prediction) and a reward formulation (Verified Rewards via Completion Likelihood Improvement) that allows us to use an unlabeled book dataset as a learning signal for reasoning. We learn to reason over a story's condensed information and generate a detailed plan for the next chapter. Our reasoning is evaluated via the chapters it helps a story-generator create, and compared against non-trained and supervised finetuning (SFT) baselines. Pairwise human judgments reveal the chapters our learned reasoning produces are preferred across almost all metrics, and the effect is more pronounced in Scifi and Fantasy genres.

large language model, machine learning, natural language, (17 more...)

2503.22828

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Law (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)

Törnberg, Petter, Chueri, Juliana

Elite Political Discourse has Become More Toxic in Western Countries

Toxic and uncivil politics is widely seen as a growing threat to democratic values and governance, yet our understanding of the drivers and evolution of political incivility remains limited. Leveraging a novel dataset of nearly 18 million Twitter messages from parliamentarians in 17 countries over five years, this paper systematically investigates whether politics internationally is becoming more uncivil, and what are the determinants of political incivility. Our analysis reveals a marked increase in toxic discourse among political elites, and that it is associated to radical-right parties and parties in opposition. Toxicity diminished markedly during the early phase of the COVID-19 pandemic and, surprisingly, during election campaigns. Furthermore, our results indicate that posts relating to ``culture war'' topics, such as migration and LGBTQ+ rights, are substantially more toxic than debates focused on welfare or economic issues. These findings underscore a troubling shift in international democracies toward an erosion of constructive democratic dialogue.

large language model, machine learning, natural language, (21 more...)

2503.22411

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Oceania > New Zealand (0.04)
(15 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Nematirad, Reza, Pahwa, Anil, Natarajan, Balasubramaniam

SPDNet: Seasonal-Periodic Decomposition Network for Advanced Residential Demand Forecasting

--Residential electricity demand forecasting is critical for efficient energy management and grid stability. Accurate predictions enable utility companies to optimize planning and operations. However, real-world residential electricity demand data often exhibit intricate temporal variability, including multiple seasonalities, periodicities, and abrupt fluctuations, which pose significant challenges for forecasting models. Previous models that rely on statistical methods, recurrent, convolutional neural networks, and transformers often struggle to capture these intricate temporal dynamics. T o address these challenges, we propose the Seasonal-Periodic Decomposition Network (SPDNet), a novel deep learning framework consisting of two main modules. The first is the Seasonal-Trend Decomposition Module (STDM), which decomposes the input data into trend, seasonal, and residual components. The second is the Periodical Decomposition Module (PDM), which employs the Fast Fourier Transform to identify the dominant periods. For each dominant period, 1D input data is reshaped into a 2D tensor, where rows represent periods and columns correspond to frequencies. The 2D representations are then processed through three submodules: a 1D convolution to capture sharp fluctuations, a transformer-based encoder to model global patterns, and a 2D convolution to capture interactions between periods. Extensive experiments conducted on real-world residential electricity load data demonstrate that SPDNet outperforms traditional and advanced models in both forecasting accuracy and computational efficiency. The code is available in this repository: https://github.com/Tims2D/SPDNet.

data mining, forecasting, machine learning, (20 more...)

2503.22485

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > Kansas (0.04)
Oceania > Australia > Queensland (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Energy > Power Industry (1.00)
Transportation > Ground > Road (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MFH: A Multi-faceted Heuristic Algorithm Selection Approach for Software Verification

Su, Jie, Deng, Liansai, Wen, Cheng, Wang, Rong, Ma, Zhi, Zhang, Nan, Tian, Cong, Duan, Zhenhua, Qin, Shengchao

Currently, many verification algorithms are available to improve the reliability of software systems. Selecting the appropriate verification algorithm typically demands domain expertise and non-trivial manpower. An automated algorithm selector is thus desired. However, existing selectors, either depend on machine-learned strategies or manually designed heuristics, encounter issues such as reliance on high-quality samples with algorithm labels and limited scalability. In this paper, an automated algorithm selection approach, namely MFH, is proposed for software verification. Our approach leverages the heuristics that verifiers producing correct results typically implement certain appropriate algorithms, and the supported algorithms by these verifiers indirectly reflect which ones are potentially applicable. Specifically, MFH embeds the code property graph (CPG) of a semantic-preserving transformed program to enhance the robustness of the prediction model. Furthermore, our approach decomposes the selection task into the sub-tasks of predicting potentially applicable algorithms and matching the most appropriate verifiers. Additionally, MFH also introduces a feedback loop on incorrect predictions to improve model prediction accuracy. We evaluate MFH on 20 verifiers and over 15,000 verification tasks. Experimental results demonstrate the effectiveness of MFH, achieving a prediction accuracy of 91.47% even without ground truth algorithm labels provided during the training phase. Moreover, the prediction accuracy decreases only by 0.84% when introducing 10 new verifiers, indicating the strong scalability of the proposed approach.

artificial intelligence, machine learning, natural language, (15 more...)

2503.22228

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > China > Guangdong Province > Guangzhou (0.05)
Europe > Switzerland (0.05)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Congenital Heart Disease Classification Using Phonocardiograms: A Scalable Screening Tool for Diverse Environments

Jabbar, Abdul, Grooby, Ethan, Crozier, Jack, Gallon, Alexander, Pham, Vivian, Ahmad, Khawza I, Hassanuzzaman, Md, Mostafa, Raqibul, Khandoker, Ahsan H., Marzbanrad, Faezeh

Congenital heart disease (CHD) is a critical condition that demands early detection, particularly in infancy and childhood. This study presents a deep learning model designed to detect CHD using phonocardiogram (PCG) signals, with a focus on its application in global health. We evaluated our model on several datasets, including the primary dataset from Bangladesh, achieving a high accuracy of 94.1%, sensitivity of 92.7%, specificity of 96.3%. The model also demonstrated robust performance on the public PhysioNet Challenge 2022 and 2016 datasets, underscoring its generalizability to diverse populations and data sources. We assessed the performance of the algorithm for single and multiple auscultation sites on the chest, demonstrating that the model maintains over 85% accuracy even when using a single location. Furthermore, our algorithm was able to achieve an accuracy of 80% on low-quality recordings, which cardiologists deemed non-diagnostic. This research suggests that an AI- driven digital stethoscope could serve as a cost-effective screening tool for CHD in resource-limited settings, enhancing clinical decision support and ultimately improving patient outcomes.

artificial intelligence, deep learning, machine learning, (17 more...)

2503.22773

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
South America > Brazil > Pernambuco (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tokenization of Gaze Data

Rolff, Tim, Karimian, Jurik, Hypki, Niklas, Schmidt, Susanne, Lappe, Markus, Steinicke, Frank

A considerable part of the performance of today's large language models (LLM's) and multimodal large language models (MLLM's) depends on their tokenization strategies. While tokenizers are extensively researched for textual and visual input, there is no research on tokenization strategies for gaze data due to its nature. However, a corresponding tokenization strategy would allow using the vision capabilities of pre-trained MLLM's for gaze data, for example, through fine-tuning. In this paper, we aim to close this research gap by analyzing five different tokenizers for gaze data on three different datasets for the forecasting and generation of gaze data through LLMs (cf.~\cref{fig:teaser}). We evaluate the tokenizers regarding their reconstruction and compression abilities. Further, we train an LLM for each tokenization strategy, measuring its generative and predictive performance. Overall, we found that a quantile tokenizer outperforms all others in predicting the gaze positions and k-means is best when predicting gaze velocities.

large language model, machine learning, natural language, (17 more...)

2503.22145

Country:

Europe > Germany > North Rhine-Westphalia > Münster Region > Münster (0.14)
Europe > Germany > Hamburg (0.04)
Oceania > New Zealand > South Island > Canterbury Region > Christchurch (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cooperative Hybrid Multi-Agent Pathfinding Based on Shared Exploration Maps

Liu, Ning, Shen, Sen, Kong, Xiangrui, Zhang, Hongtao, Bräunl, Thomas

Multi-Agent Pathfinding is used in areas including multi-robot formations, warehouse logistics, and intelligent vehicles. However, many environments are incomplete or frequently change, making it difficult for standard centralized planning or pure reinforcement learning to maintain both global solution quality and local flexibility. This paper introduces a hybrid framework that integrates D* Lite global search with multi-agent reinforcement learning, using a switching mechanism and a freeze-prevention strategy to handle dynamic conditions and crowded settings. We evaluate the framework in the discrete POGEMA environment and compare it with baseline methods. Experimental outcomes indicate that the proposed framework substantially improves success rate, collision rate, and path efficiency. The model is further tested on the EyeSim platform, where it maintains feasible Pathfinding under frequent changes and large-scale robot deployments.

agent, artificial intelligence, machine learning, (16 more...)

2503.22162

Country:

Asia > China > Hong Kong (0.04)
Oceania > Australia > Western Australia (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.94)

When Autonomy Breaks: The Hidden Existential Risk of AI

Krook, Joshua

AI risks are typically framed around physical threats to humanity, a loss of control or an accidental error causing humanity's extinction. However, I argue in line with the gradual disempowerment thesis, that there is an underappreciated risk in the slow and irrevocable decline of human autonomy. As AI starts to outcompete humans in various areas of life, a tipping point will be reached where it no longer makes sense to rely on human decision-making, creativity, social care or even leadership. What may follow is a process of gradual de-skilling, where we lose skills that we currently take for granted. Traditionally, it is argued that AI will gain human skills over time, and that these skills are innate and immutable in humans. By contrast, I argue that humans may lose such skills as critical thinking, decision-making and even social care in an AGI world. The biggest threat to humanity is therefore not that machines will become more like humans, but that humans will become more like machines.

artificial intelligence, machine learning, natural language, (17 more...)

2503.22151

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Media (1.00)
Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Diallo, Papa Abdou Karim Karou, Zouaq, Amal

FRASE: Structured Representations for Generalizable SPARQL Query Generation

Translating natural language questions into SPARQL queries enables Knowledge Base querying for factual and up-to-date responses. However, existing datasets for this task are predominantly template-based, leading models to learn superficial mappings between question and query templates rather than developing true generalization capabilities. As a result, models struggle when encountering naturally phrased, template-free questions. This paper introduces FRASE (FRAme-based Semantic Enhancement), a novel approach that leverages Frame Semantic Role Labeling (FSRL) to address this limitation. We also present LC-QuAD 3.0, a new dataset derived from LC-QuAD 2.0, in which each question is enriched using FRASE through frame detection and the mapping of frame-elements to their argument. We evaluate the impact of this approach through extensive experiments on recent large language models (LLMs) under different fine-tuning configurations. Our results demonstrate that integrating frame-based structured representations consistently improves SPARQL generation performance, particularly in challenging generalization scenarios when test questions feature unseen templates (unknown template splits) and when they are all naturally phrased (reformulated questions).

artificial intelligence, large language model, natural language, (14 more...)

2503.22144

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)