AITopics

Distribution shifts on graphs -- the discrepancies in data distribution between training and employing a graph machine learning model -- are ubiquitous and often unavoidable in real-world scenarios. These shifts may severely deteriorate model performance, posing significant challenges for reliable graph machine learning. Consequently, there has been a surge in research on graph machine learning under distribution shifts, aiming to train models to achieve satisfactory performance on out-of-distribution (OOD) test data. In our survey, we provide an up-to-date and forward-looking review of deep graph learning under distribution shifts. Specifically, we cover three primary scenarios: graph OOD generalization, training-time graph OOD adaptation, and test-time graph OOD adaptation. We begin by formally formulating the problems and discussing various types of distribution shifts that can affect graph learning, such as covariate shifts and concept shifts. To provide a better understanding of the literature, we systematically categorize the existing models based on our proposed taxonomy and examine the techniques they adopt. We also summarize commonly used datasets in this research area to facilitate further investigation. Finally, we point out promising research directions and the corresponding challenges to encourage further study in this vital domain. Additionally, we provide a continuously updated reading list at https://github.com/kaize0409/Awesome-Graph-OOD.
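
As a rough orientation for the two shift types named in the abstract, the usual textbook definitions can be written as follows. This is a sketch under the assumption that X denotes the input (here, a graph or a node with its neighbourhood) and Y the label; the survey's own formal definitions may differ in detail.

```latex
\begin{align*}
\text{Covariate shift:}\quad & P_{\mathrm{train}}(X) \neq P_{\mathrm{test}}(X), & P_{\mathrm{train}}(Y \mid X) &= P_{\mathrm{test}}(Y \mid X) \\
\text{Concept shift:}\quad & P_{\mathrm{train}}(Y \mid X) \neq P_{\mathrm{test}}(Y \mid X)
\end{align*}
```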

artificial intelligence, inductive learning, machine learning

2410.19265

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
South America > Brazil (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)

Genre: Overview (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)

Brandstätter, Stefan, Seeböck, Philipp, Fürböck, Christoph, Pochepnia, Svitlana, Prosch, Helmut, Langs, Georg

Rigid Single-Slice-in-Volume registration via rotation-equivariant 2D/3D feature matching

In medical imaging, the aim is often to place a 2D image in a 3D volumetric observation. Current approaches for rigid single-slice-in-volume registration are limited by requirements such as pose initialization, stacks of adjacent slices, or reliable anatomical landmarks. Here, we propose a self-supervised 2D/3D registration approach to match a single 2D slice to the corresponding 3D volume. The method works in data without anatomical priors such as images of tumors. It addresses the dimensionality disparity and establishes correspondences between 2D in-plane and 3D out-of-plane rotation-equivariant features by using group equivariant CNNs. These rotation-equivariant features are extracted from the 2D query slice and aligned with their 3D counterparts. Results demonstrate the robustness of the proposed slice-in-volume registration on the NSCLC-Radiomics CT and KIRBY21 MRI datasets, attaining an absolute median angle error of less than 2 degrees and a mean-matching feature accuracy of 89% at a tolerance of 3 pixels.

artificial intelligence, machine learning, registration

doi: 10.1007/978-3-031-73480-9_22

2410.18683

Country:

Europe > Austria > Vienna (0.14)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Papanikou, Vasiliki, Papadakos, Panagiotis, Karamanidou, Theodora, Stavropoulos, Thanos G., Pitoura, Evaggelia, Tsaparas, Panayiotis

Health Misinformation in Social Networks: A Survey of IT Approaches

The spread of misinformation online, most commonly known as fake news, is an important issue that has become more pronounced in the last two decades due to the prevalence of social media. Platforms like Twitter, Reddit, and Facebook have been commonly identified as the main channels for propagating misinformation and have been criticized for not addressing the conditions that permit the circulation and amplification of false information [32]. Such misinformation includes false claims and non-fact-checked news items that originate from sources of questionable credibility [113]. The problem of misinformation becomes critical when it pertains to healthcare and health issues, since it puts lives and public health at risk. One of the first cases of widely spread misinformation in the medical domain is the falsehood that the MMR vaccine (Measles, Mumps, Rubella) causes autism [109]. The falsehood originated from a fraudulent article titled "Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children" published in the prestigious Lancet journal in 1998 [171, 197]. This study turned tens of thousands of parents against the vaccine, and as a result, in 2020, many countries, including the United Kingdom, Greece, Venezuela, and Brazil, lost their measles elimination status. In 2010, twelve years after publishing this study, The Lancet retracted the paper [203].

information retrieval, large language model, machine learning

2410.18670

Country:

Europe > United Kingdom (0.47)
South America > Venezuela (0.24)
South America > Brazil (0.24)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)

Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions

Fu, Yujuan, Uzuner, Ozlem, Yetisgen, Meliha, Xia, Fei

Large language models (LLMs) have demonstrated great performance across various benchmarks, showing potential as general-purpose task solvers. However, as LLMs are typically trained on vast amounts of data, a significant concern in their evaluation is data contamination, where overlap between training data and evaluation datasets inflates performance assessments. While multiple approaches have been developed to identify data contamination, these approaches rely on specific assumptions that may not hold universally across different settings. To bridge this gap, we systematically review 47 papers on data contamination detection, categorize the underlying assumptions, and assess whether they have been rigorously validated. We identify and analyze eight categories of assumptions and test three of them as case studies. Our analysis reveals that when classifying instances used for pretraining LLMs, detection approaches based on these three assumptions perform close to random guessing, suggesting that current LLMs learn data distributions rather than memorizing individual instances. Overall, this work underscores the importance of approaches clearly stating their underlying assumptions and testing their validity across various scenarios.

large language model, machine learning, natural language

2410.18966

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Dubey, Harsh Vardhan, Lee, Ji Ah, Flaherty, Patrick

Maximum a Posteriori Inference for Factor Graphs via Benders' Decomposition

arXiv.org Machine Learning, Oct-24-2024

Many Bayesian statistical inference problems come down to computing a maximum a-posteriori (MAP) assignment of latent variables. Yet, standard methods for estimating the MAP assignment do not have a finite time guarantee that the algorithm has converged to a fixed point. Previous research has found that MAP inference can be represented in dual form as a linear programming problem with a non-polynomial number of constraints. A Lagrangian relaxation of the dual yields a statistical inference algorithm as a linear programming problem. However, the decision as to which constraints to remove in the relaxation is often heuristic. We present a method for maximum a-posteriori inference in general Bayesian factor models that sequentially adds constraints to the fully relaxed dual problem using Benders' decomposition. Our method enables the incorporation of expressive integer and logical constraints in clustering problems such as must-link, cannot-link, and a minimum number of whole samples allocated to each cluster. Using this approach, we derive MAP estimation algorithms for the Bayesian Gaussian mixture model and latent Dirichlet allocation. Empirical results show that our method produces a higher optimal posterior value compared to Gibbs sampling and variational Bayes methods for standard data sets and provides a certificate of convergence.

algorithm, bender, constraint

2410.19131

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Israel (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Łaniewski, Stanisław, Ślepaczuk, Robert

Enhancing literature review with LLM and NLP methods. Algorithmic trading case

This study utilizes machine learning algorithms to analyze and organize knowledge in the field of algorithmic trading. By filtering a dataset of 136 million research papers, we identified 14,342 relevant articles published between 1956 and Q1 2020. We compare traditional practices, such as keyword-based algorithms and embedding techniques, with state-of-the-art topic modeling methods that employ dimensionality reduction and clustering. This comparison allows us to assess the popularity and evolution of different approaches and themes within algorithmic trading. We demonstrate the usefulness of Natural Language Processing (NLP) in the automatic extraction of knowledge, highlighting the new possibilities created by the latest iterations of Large Language Models (LLMs) like ChatGPT. The rationale for focusing on this topic stems from our analysis, which reveals that research articles on algorithmic trading are increasing at a faster rate than the overall number of publications. While stocks and main indices comprise more than half of all assets considered, certain asset classes, such as cryptocurrencies, exhibit a much stronger growth trend. Machine learning models have become the most popular methods in recent years. The study demonstrates the efficacy of LLMs in refining datasets and addressing intricate questions about the analyzed articles, such as comparing the efficiency of different models. Our research shows that by decomposing tasks into smaller components and incorporating reasoning steps, we can effectively tackle complex questions supported by case analyses. This approach contributes to a deeper understanding of algorithmic trading methodologies and underscores the potential of advanced NLP techniques in literature reviews.

large language model, machine learning, natural language

2411.05013

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.93)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

da Silva, Flavio S. Correa, Sawhney, Simon

Population stratification for prediction of mortality in post-AKI patients

Acute kidney injury (AKI) is associated with increases in (1) post-discharge mortality risk, (2) length of hospital stay and (3) healthcare expenditures [19], as well as short-term unplanned re-admissions and mid-term progressive chronic conditions. Around 33% of AKI patients require unplanned re-admissions within 90 days after discharge and around 15% develop progressive chronic kidney disease over the first year after discharge [14, 16]. AKI is multi-factorial, and accurate follow-up planning is challenging. Machine learning has been viewed as promising to build tools to support decision making in clinical follow-up planning. Broadly speaking, recent initiatives can be structured along two alternatives: 1. Tools grounded on prior medical expert knowledge, which is used to stratify patients according to meaningful attributes, in such a way that specialised plans can be devised for each group of patients [5, 13, 15, 19, 23]. 2. Tools grounded on machine learning techniques, which take control of the planning process and build accurate decision procedures which, however, demand extreme care in selection of new patients, to ensure compliance with population definitions that are used during preparation of decision procedures [1, 2, 3, 7, 21, 22]. Compliance with ethical standards demands that such tools are fair, transparent, and optimised for the benefit of patients. Technical requirements to ensure ethical compliance must include algorithmic transparency to support fairness and transparency in decision making and optimised, goal-oriented patient stratification to ensure human-centred optimised performance. The research initiative presented in this article focused on the development of a tool to support clinical follow-up planning for post-AKI patients after hospital discharge, with particular attention to ethical compliance based on technical requirements.

artificial intelligence, machine learning, predictor

2410.17865

Country:

South America > Brazil > São Paulo (0.04)
Europe > United Kingdom > Scotland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area > Nephrology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Hsieh, Po-Yu, Hou, June-Hao

Visualization and Optimization of Continuum Robots: Integration of Lie Group Kinematics and Evolutionary Algorithm

Continuum robots, known for their high flexibility and adaptability, offer immense potential for applications such as medical surgery, confined-space inspections, and wearable devices. However, their non-linear elastic nature and complex kinematics present significant challenges in digital modeling and visualization. Identifying the modal shape coefficients of a specific robot configuration often requires many physical experiments, which is time-consuming and robot-specific. To address this issue, this research proposes a computational framework that utilizes an evolutionary algorithm (EA) to simplify the coefficient identification process. Our method starts by generating datasets using Lie group kinematics and physics-based simulations, defining both ideal configurations and models to be optimized. With the deployment of an EA solver, the deviations were iteratively minimized through two fitness objectives, the mean square error of shape deviation (\(\text{MSE}_1\)) and of tool center point (TCP) vector deviation (\(\text{MSE}_2\)), to align the robot's backbone curve with the desired configuration. Built on the Computer-Aided Design (CAD) platform Grasshopper, this framework provides real-time visualization suitable for development of continuum robots. Results show that this integrated method achieves precise alignment and effective identification. Overall, this research aims to reduce the modeling complexity of continuum robots, enabling precise, efficient virtual simulation before robot programming and implementation.
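
To make the two fitness objectives concrete, here is a minimal Python sketch of how such deviations might be computed. It rests on assumptions the abstract does not confirm: backbone curves are represented as equal-length lists of sampled 3D points, the shape deviation is the mean squared Euclidean distance between corresponding samples, and the TCP vector is taken to be the position of the last sample. The function names `mse_shape`, `tcp_vector`, and `mse_tcp` are hypothetical, and the evolutionary search itself is not shown.

```python
def mse_shape(ideal, candidate):
    """Assumed form of MSE_1: mean squared Euclidean distance
    between corresponding backbone samples."""
    if len(ideal) != len(candidate):
        raise ValueError("curves must have the same number of samples")
    total = sum(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(ideal, candidate)
    )
    return total / len(ideal)


def tcp_vector(curve):
    """Assumed TCP vector: the position of the last backbone sample."""
    return curve[-1]


def mse_tcp(ideal, candidate):
    """Assumed form of MSE_2: mean squared component-wise difference
    between the two TCP vectors."""
    t_ideal, t_cand = tcp_vector(ideal), tcp_vector(candidate)
    return sum((a - b) ** 2 for a, b in zip(t_ideal, t_cand)) / len(t_ideal)


# Toy data: a straight ideal backbone and a slightly bent candidate.
IDEAL = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)]
CANDIDATE = [(0.0, 0.0, 0.0), (0.1, 0.0, 1.0), (0.2, 0.0, 2.0)]

print("MSE_1 (shape):", mse_shape(IDEAL, CANDIDATE))
print("MSE_2 (TCP):  ", mse_tcp(IDEAL, CANDIDATE))
```

An optimizer would search over the model's coefficients to drive both values toward zero; how the two objectives are combined or traded off is left open here.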

artificial intelligence, evolutionary algorithm, machine learning

2410.14305

Country:

Asia > Taiwan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.54)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.76)

Hypergraphs as Weighted Directed Self-Looped Graphs: Spectral Properties, Clustering, Cheeger Inequality

Li, Zihao, Fu, Dongqi, Liu, Hengyu, He, Jingrui

Hypergraphs naturally arise when studying group relations and have been widely used in the field of machine learning. There has not been a unified formulation of hypergraphs, yet the recently proposed edge-dependent vertex weights (EDVW) modeling is one of the most generalized modeling methods of hypergraphs, i.e., most existing hypergraphs can be formulated as EDVW hypergraphs without any information loss to the best of our knowledge. However, the relevant algorithmic developments on EDVW hypergraphs remain nascent: compared to spectral graph theories, the formulations are incomplete, the spectral clustering algorithms are not well-developed, and one result regarding hypergraph Cheeger Inequality is even incorrect. To this end, deriving a unified random walk-based formulation, we propose our definitions of hypergraph Rayleigh Quotient, NCut, boundary/cut, volume, and conductance, which are consistent with the corresponding definitions on graphs. Then, we prove that the normalized hypergraph Laplacian is associated with the NCut value, which inspires our HyperClus-G algorithm for spectral clustering on EDVW hypergraphs. Finally, we prove that HyperClus-G can always find an approximately linearly optimal partitioning in terms of both NCut and conductance. Additionally, we provide extensive experiments to validate our theoretical findings from an empirical perspective.
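
For readers unfamiliar with the graph-side quantities the abstract refers to, the following Python sketch computes volume, cut, NCut, and conductance for a two-way partition of an ordinary weighted undirected graph. It illustrates only the standard graph definitions, not the paper's EDVW hypergraph formulation or HyperClus-G; the helper names and the toy graph (two triangles joined by a single edge) are illustrative assumptions.

```python
def build_weights(n, edges):
    """Symmetric weight matrix for an undirected graph with n nodes."""
    w = [[0.0] * n for _ in range(n)]
    for u, v, weight in edges:
        w[u][v] += weight
        w[v][u] += weight
    return w


def volume(w, nodes):
    """Sum of the weighted degrees of the given nodes."""
    return sum(sum(w[u]) for u in nodes)


def cut(w, part):
    """Total weight of edges between `part` and its complement."""
    rest = [v for v in range(len(w)) if v not in part]
    return sum(w[u][v] for u in part for v in rest)


def ncut(w, part):
    """Normalized cut: cut/vol(part) + cut/vol(complement)."""
    rest = {v for v in range(len(w)) if v not in part}
    c = cut(w, part)
    return c / volume(w, part) + c / volume(w, rest)


def conductance(w, part):
    """Cut weight divided by the smaller of the two side volumes."""
    rest = {v for v in range(len(w)) if v not in part}
    return cut(w, part) / min(volume(w, part), volume(w, rest))


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
EDGES = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
W = build_weights(6, EDGES)
LEFT = {0, 1, 2}

print("cut:", cut(W, LEFT))
print("NCut:", ncut(W, LEFT))
print("conductance:", conductance(W, LEFT))
```

Splitting the toy graph between the two triangles cuts a single unit-weight edge while each side has volume 7, so this partition has a low NCut and conductance, which is the kind of partition spectral clustering is meant to recover.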

artificial intelligence, data mining, machine learning

2411.03331

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

SPEED++: A Multilingual Event Extraction Framework for Epidemic Prediction and Preparedness

Parekh, Tanmay, Kwan, Jeffrey, Yu, Jiarui, Johri, Sparsh, Ahn, Hyosang, Muppalla, Sreya, Chang, Kai-Wei, Wang, Wei, Peng, Nanyun

Social media is often the first place where communities discuss the latest societal trends. Prior works have utilized this platform to extract epidemic-related information (e.g. infections, preventive measures) to provide early warnings for epidemic prediction. However, these works only focused on English posts, while epidemics can occur anywhere in the world, and early discussions are often in the local, non-English languages. In this work, we introduce the first multilingual Event Extraction (EE) framework SPEED++ for extracting epidemic event information for a wide range of diseases and languages. To this end, we extend a previous epidemic ontology with 20 argument roles; and curate our multilingual EE dataset SPEED++ comprising 5.1K tweets in four languages for four diseases. Annotating data in every language is infeasible; thus we develop zero-shot cross-lingual cross-disease models (i.e., training only on English COVID data) utilizing multilingual pre-training and show their efficacy in extracting epidemic-related events for 65 diverse languages across different diseases. Experiments demonstrate that our framework can provide epidemic warnings for COVID-19 in its earliest stages in Dec 2019 (3 weeks before global discussions) from Chinese Weibo posts without any training in Chinese. Furthermore, we exploit our framework's argument extraction capabilities to aggregate community epidemic discussions like symptoms and cure measures, aiding misinformation detection and public attention monitoring. Overall, we lay a strong foundation for multilingual epidemic preparedness.

large language model, machine learning, natural language

2410.18393

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)