AITopics

Recent AI advances have enabled multi-modal systems to model and translate diverse information spaces. Extending beyond text and vision, we introduce OneProt, a multi-modal AI for proteins that integrates structural, sequence, alignment, and binding site data. Using the ImageBind framework, OneProt aligns the latent spaces of modality encoders along protein sequences. It demonstrates strong performance in retrieval tasks and surpasses state-of-the-art methods in various downstream tasks, including metal ion binding classification, gene-ontology annotation, and enzyme function prediction. This work expands multi-modal capabilities in protein models, paving the way for applications in drug discovery, biocatalytic reaction planning, and protein engineering.

modality, oneprot, sequence, (15 more...)

2411.04863

Country:

North America > United States (0.46)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

A multi-purpose automatic editing system based on lecture semantics for remote education

Hu, Panwen, Huang, Rui

Remote teaching has become popular recently due to its convenience and safety, especially under extreme circumstances like a pandemic. However, online students usually have a poor experience since the information acquired from the views provided by the broadcast platforms is limited. One potential solution is to show more camera views simultaneously, but it is technically challenging and distracting for the viewers. Therefore, an automatic multi-camera directing/editing system, which aims at selecting the most concerned view at each time instance to guide the attention of online students, is in urgent demand. However, existing systems mostly make simple assumptions and focus on tracking the position of the speaker instead of the real lecture semantics, and therefore have limited capacities to deliver optimal information flow. To this end, this paper proposes an automatic multi-purpose editing system based on the lecture semantics, which can both direct the multiple video streams for real-time broadcasting and edit the optimal video offline for review purposes. Our system directs the views by semantically analyzing the class events while following the professional directing rules, mimicking a human director to capture the regions of interest from the viewpoint of the onsite students. We conduct both qualitative and quantitative analyses to verify the effectiveness of the proposed system and its components.

editing system, student, video, (17 more...)

2411.04859

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Human Computer Interaction (0.93)

DISCO: DISCovering Overfittings as Causal Rules for Text Classification Models

Zhang, Zijian, Setty, Vinay, Wang, Yumeng, Anand, Avishek

With the rapid advancement of neural language models, the deployment of over-parameterized models has surged, increasing the need for interpretable explanations comprehensible to human inspectors. Existing post-hoc interpretability methods, which often focus on unigram features of single input textual instances, fail to capture the models' decision-making process fully. Additionally, many methods do not differentiate between decisions based on spurious correlations and those based on a holistic understanding of the input. Our paper introduces DISCO, a novel method for discovering global, rule-based explanations by identifying causal n-gram associations with model predictions. This method employs a scalable sequence mining technique to extract relevant text spans from training data, associate them with model predictions, and conduct causality checks to distill robust rules that elucidate model behavior. These rules expose potential overfitting and provide insights into misleading feature combinations. We validate DISCO through extensive testing, demonstrating its superiority over existing methods in offering comprehensive insights into complex model behaviors. Our approach successfully identifies all shortcuts manually introduced into the training data (100% detection rate on the MultiRC dataset), resulting in an 18.8% regression in model performance -- a capability unmatched by any other method. Furthermore, DISCO supports interactive explanations, enabling human inspectors to distinguish spurious causes in the rule-based output. This alleviates the burden of abundant instance-wise explanations and helps assess the model's risk when encountering out-of-distribution (OOD) data.

dataset, prediction, shortcut, (16 more...)

2411.04649

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Netherlands > South Holland > Leiden (0.04)
(12 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Martinez-Seras, Aitor, Del Ser, Javier, Andres, Alain, Garcia-Bringas, Pablo

On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data

Robustness is a fundamental aspect for developing safe and trustworthy models, particularly when they are deployed in the open world. In this work we analyze the inherent capability of one-stage object detectors to robustly operate in the presence of out-of-distribution (OoD) data. Specifically, we propose a novel detection algorithm for detecting unknown objects in image data, which leverages the features extracted by the model from each sample. Differently from other recent approaches in the literature, our proposal does not require retraining the object detector, thereby allowing for the use of pretrained models. Our proposed OoD detector exploits the application of supervised dimensionality reduction techniques to mitigate the effects of the curse of dimensionality on the features extracted by the model. Furthermore, it utilizes high-resolution feature maps to identify potential unknown objects in an unsupervised fashion. Our experiments analyze the Pareto trade-off between the performance detecting known and unknown objects resulting from different algorithmic configurations and inference confidence thresholds. We also compare the performance of our proposed algorithm to that of logits-based post-hoc OoD methods, as well as possible fusion strategies. Finally, we discuss on the competitiveness of all tested methods against state-of-the-art OoD approaches for object detection models over the recently published Unknown Object Detection benchmark. The obtained results verify that the performance of avant-garde post-hoc OoD detectors can be further improved when combined with our proposed algorithm.

configuration, detection, detector, (16 more...)

2411.04586

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
Europe > Monaco (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Dahan, Noam, Stanovsky, Gabriel

The State and Fate of Summarization Datasets

Automatic summarization has consistently attracted attention, due to its versatility and wide application in various downstream tasks. Despite its popularity, we find that annotation efforts have largely been disjointed, and have lacked common terminology. Consequently, it is challenging to discover existing resources or identify coherent research directions. To address this, we survey a large body of work spanning 133 datasets in over 100 languages, creating a novel ontology covering sample properties, collection methods and distribution. With this ontology we make key observations, including the lack in accessible high-quality datasets for low-resource languages, and the field's over-reliance on the news domain and on automatically collected distant supervision. Finally, we make available a web interface that allows users to interact and explore our ontology and dataset collection, as well as a template for a summarization data card, which can be used to streamline future research into a more coherent body of work.

dataset, summarization, varab and schluter, (11 more...)

2411.04585

Country:

North America > Canada (0.14)
Asia > Middle East > Iran (0.04)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
(9 more...)

Genre: Overview (1.00)

Industry:

Law (0.93)
Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media (0.68)

Multi-Agents are Social Groups: Investigating Social Influence of Multiple Agents in Human-Agent Interactions

Song, Tianqi, Tan, Yugin, Zhu, Zicheng, Feng, Yibin, Lee, Yi-Chieh

Multi-agent systems - systems with multiple independent AI agents working together to achieve a common goal - are becoming increasingly prevalent in daily life. Drawing inspiration from the phenomenon of human group social influence, we investigate whether a group of AI agents can create social pressure on users to agree with them, potentially changing their stance on a topic. We conducted a study in which participants discussed social issues with either a single or multiple AI agents, and where the agents either agreed or disagreed with the user's stance on the topic. We found that conversing with multiple agents (holding conversation content constant) increased the social pressure felt by participants, and caused a greater shift in opinion towards the agents' stances on each topic. Our study shows the potential advantages of multi-agent systems over single-agent platforms in causing opinion change. We discuss design implications for possible multi-agent systems that promote social good, as well as the potential for malicious actors to use these systems to manipulate public opinion.

agent, participant, social influence, (12 more...)

2411.04578

Country:

Asia > Singapore > Central Region > Singapore (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > Virginia (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study > Negative Result (0.68)

Industry:

Education (1.00)
Health & Medicine > Consumer Health (0.94)
Information Technology (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.31)

Pillai, Leena G, Manohar, Kavya, Raju, Basil K, Sherly, Elizabeth

Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages

This paper presents a novel multistage fine-tuning strategy designed to enhance automatic speech recognition (ASR) performance in low-resource languages using OpenAI's Whisper model. In this approach we aim to build ASR model for languages with limited digital resources by sequentially adapting the model across linguistically similar languages. We experimented this on the Malasar language, a Dravidian language spoken by approximately ten thousand people in the Western Ghats of South India. Malasar language faces critical challenges for technological intervention due to its lack of a native script and absence of digital or spoken data resources. Working in collaboration with Wycliffe India and Malasar community members, we created a spoken Malasar corpus paired with transcription in Tamil script, a closely related major language. In our approach to build ASR model for Malasar, we first build an intermediate Tamil ASR, leveraging higher data availability for Tamil annotated speech. This intermediate model is subsequently fine-tuned on Malasar data, allowing for more effective ASR adaptation despite limited resources. The multistage fine-tuning strategy demonstrated significant improvements over direct fine-tuning on Malasar data alone, achieving a word error rate (WER) of 51.9%, which is 4.5% absolute reduction when compared to the direct fine-tuning method. Further a WER reduction to 47.3% was achieved through punctuation removal in post-processing, which addresses formatting inconsistencies that impact evaluation. Our results underscore the effectiveness of sequential multistage fine-tuning combined with targeted post-processing as a scalable strategy for ASR system development in low-resource languages, especially where linguistic similarities can be leveraged to bridge gaps in training data.

automatic speech recognition, fine-tuning, malasar, (12 more...)

2411.04573

Country:

Asia > India > Kerala > Thiruvananthapuram (0.05)
Asia > Nepal (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Sulimov, Pavel, Lehmann, Claude, Stockinger, Kurt

GenJoin: Conditional Generative Plan-to-Plan Query Optimizer that Learns from Subplan Hints

Query optimization has become a research area where classical algorithms are being challenged by machine learning algorithms. At the same time, recent trends in learned query optimizers have shown that it is prudent to take advantage of decades of database research and augment classical query optimizers by shrinking the plan search space through different types of hints (e.g. by specifying the join type, scan type or the order of joins) rather than completely replacing the classical query optimizer with machine learning models. It is especially relevant for cases when classical optimizers cannot fully enumerate all logical and physical plans and, as an alternative, need to rely on less robust approaches like genetic algorithms. However, even symbiotically learned query optimizers are hampered by the need for vast amounts of training data, slow plan generation during inference and unstable results across various workload conditions. In this paper, we present GenJoin - a novel learned query optimizer that considers the query optimization problem as a generative task and is capable of learning from a random set of subplan hints to produce query plans that outperform the classical optimizer. GenJoin is the first learned query optimizer that significantly and consistently outperforms PostgreSQL as well as state-of-the-art methods on two well-known real-world benchmarks across a variety of workloads using rigorous machine learning evaluations.

execution time, postgresql, query, (12 more...)

2411.04525

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
South America > Brazil (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Avilov, Fedor S., Eremin, Roman A., Budennyy, Semen A., Humonen, Innokentiy S.

On the Robustness of Machine Learning Models in Predicting Thermodynamic Properties: a Case of Searching for New Quasicrystal Approximants

Despite an artificial intelligence-assisted modeling of disordered crystals is a widely used and well-tried method of new materials design, the issues of its robustness, reliability, and stability are still not resolved and even not discussed enough. To highlight it, in this work we composed a series of nested intermetallic approximants of quasicrystals datasets and trained various machine learning models on them correspondingly. Our qualitative and, what is more important, quantitative assessment of the difference in the predictions clearly shows that different reasonable changes in the training sample can lead to the completely different set of the predicted potentially new materials. We also showed the advantage of pre-training and proposed a simple yet effective trick of sequential training to increase stability.

neural network, prediction, thermodynamic property, (15 more...)

2410.13873

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
South America > Suriname > Marowijne District > Albina (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

On the Rigour of Scientific Writing: Criteria, Analysis, and Insights

James, Joseph, Xiao, Chenghao, Li, Yucheng, Lin, Chenghua

Rigour is crucial for scientific research as it ensures the reproducibility and validity of results and findings. Despite its importance, little work exists on modelling rigour computationally, and there is a lack of analysis on whether these criteria can effectively signal or measure the rigour of scientific papers in practice. In this paper, we introduce a bottom-up, data-driven framework to automatically identify and define rigour criteria and assess their relevance in scientific writing. Our framework includes rigour keyword extraction, detailed rigour definition generation, and salient criteria identification. Furthermore, our framework is domain-agnostic and can be tailored to the evaluation of scientific rigour for different areas, accommodating the distinct salient criteria across fields. We conducted comprehensive experiments based on datasets collected from two high impact venues for Machine Learning and NLP (i.e., ICLR and ACL) to demonstrate the effectiveness of our framework in modelling rigour. In addition, we analyse linguistic patterns of rigour, revealing that framing certainty is crucial for enhancing the perception of scientific rigour, while suggestion certainty and probability uncertainty diminish it.

criteria, rigour, rigour criteria, (16 more...)

2410.04981

Country:

North America > Canada > Ontario > Toronto (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Dominican Republic (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)