AITopics

2504.05881

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States (0.04)
Europe > United Kingdom > Wales (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Consumer Products & Services > Retirement (1.00)
Banking & Finance > Insurance (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

FOX NewsApr-7-2025, 08:00:52 GMT

Northern Border 'quiet crisis' brews as expert floats unconventional solution to combat human smuggling

A "quiet crisis" is emerging at the U.S.-Canada border, as one expert proposes an unconventional solution to fight human smuggling: leveraging advanced technologies like artificial intelligence. While national attention is largely fixed on the southern border, an increasingly concerning situation is unfolding along the country's northern border, said Jon Brewton, the founder and CEO of Data2 and a U.S. Air Force Veteran. "U.S. Customs and Border Patrol has seen a fairly alarming increase in illegal crossings, drug trafficking, and even encountering individuals on the terrorist watch list," he told Fox News Digital. "And as difficult as securing the southern border has been, the northern border is twice as long." While the vast majority of illegal crossings happen at the southern border, officials have been warning for years that the northern line has seen an increase.

border, northern border, southern border, (12 more...)

FOX News

Country:

North America > United States > Vermont (0.06)
North America > United States > New York (0.06)
South America (0.05)
(7 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Immigration & Customs (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

arXiv.org Machine LearningApr-7-2025

Topological Schr\"odinger Bridge Matching

Yang, Maosheng

Given two boundary distributions, the Schr\"odinger Bridge (SB) problem seeks the ``most likely`` random evolution between them with respect to a reference process. It has revealed rich connections to recent machine learning methods for generative modeling and distribution matching. While these methods perform well in Euclidean domains, they are not directly applicable to topological domains such as graphs and simplicial complexes, which are crucial for data defined over network entities, such as node signals and edge flows. In this work, we propose the Topological Schr\"odinger Bridge problem (TSBP) for matching signal distributions on a topological domain. We set the reference process to follow some linear tractable topology-aware stochastic dynamics such as topological heat diffusion. For the case of Gaussian boundary distributions, we derive a closed-form topological SB (TSB) in terms of its time-marginal and stochastic differential. In the general case, leveraging the well-known result, we show that the optimal process follows the forward-backward topological dynamics governed by some unknowns. Building on these results, we develop TSB-based models for matching topological signals by parameterizing the unknowns in the optimal process as (topological) neural networks and learning them through likelihood training. We validate the theoretical results and demonstrate the practical applications of TSB-based models on both synthetic and real-world networks, emphasizing the role of topology. Additionally, we discuss the connections of TSB-based models to other emerging models, and outline future directions for topological signal matching.

artificial intelligence, conference paper, machine learning, (19 more...)

2504.04799

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Asia > Middle East > Jordan (0.04)
Oceania > Australia (0.04)
(4 more...)

Genre:

Research Report (0.63)
Overview (0.45)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningApr-7-2025

Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds

Zuo, Qian, He, Fengxiang

This paper studies constrained Markov decision processes (CMDPs) with constraints against stochastic thresholds, aiming at safety of reinforcement learning in unknown and uncertain environments. We leverage a Growing-Window estimator sampling from interactions with the uncertain and dynamic environment to estimate the thresholds, based on which we design Stochastic Pessimistic-Optimistic Thresholding (SPOT), a novel model-based primal-dual algorithm for multiple constraints against stochastic thresholds. SPOT enables reinforcement learning under both pessimistic and optimistic threshold settings. We prove that our algorithm achieves sublinear regret and constraint violation; i.e., a reward regret of $\tilde{\mathcal{O}}(\sqrt{T})$ while allowing an $\tilde{\mathcal{O}}(\sqrt{T})$ constraint violation over $T$ episodes. The theoretical guarantees show that our algorithm achieves performance comparable to that of an approach relying on fixed and clear thresholds. To the best of our knowledge, SPOT is the first reinforcement learning algorithm that realises theoretical guaranteed performance in an uncertain environment where even thresholds are unknown.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2504.04973

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Souza, J. V. S., Vieira, C. B., Cavalcanti, G. D. C., Cruz, R. M. O.

Imbalanced malware classification: an approach based on dynamic classifier selection

arXiv.org Artificial IntelligenceApr-5-2025

In recent years, the rise of cyber threats has emphasized the need for robust malware detection systems, especially on mobile devices. Malware, which targets vulnerabilities in devices and user data, represents a substantial security risk. A significant challenge in malware detection is the imbalance in datasets, where most applications are benign, with only a small fraction posing a threat. This study addresses the often-overlooked issue of class imbalance in malware detection by evaluating various machine learning strategies for detecting malware in Android applications. We assess monolithic classifiers and ensemble methods, focusing on dynamic selection algorithms, which have shown superior performance compared to traditional approaches. In contrast to balancing strategies performed on the whole dataset, we propose a balancing procedure that works individually for each classifier in the pool. Our empirical analysis demonstrates that the KNOP algorithm obtained the best results using a pool of Random Forest. Additionally, an instance hardness assessment revealed that balancing reduces the difficulty of the minority class and enhances the detection of the minority class (malware). The code used for the experiments is available at https://github.com/jvss2/Machine-Learning-Empirical-Evaluation.

artificial intelligence, classifier, machine learning, (13 more...)

2504.00041

Country:

South America > Brazil > Pernambuco > Recife (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.36)

arXiv.org Artificial IntelligenceApr-5-2025

Opioid Named Entity Recognition (ONER-2025) from Reddit

Ahmad, Muhammad, Farid, Humaira, Ameer, Iqra, Amjad, Maaz, Muzamil, Muhammad, Hamza, Ameer, Jalal, Muhammad, Batyrshin, Ildar, Sidorov, Grigori

The opioid overdose epidemic remains a critical public health crisis, particularly in the United States, leading to significant mortality and societal costs. Social media platforms like Reddit provide vast amounts of unstructured data that offer insights into public perceptions, discussions, and experiences related to opioid use. This study leverages Natural Language Processing (NLP), specifically Opioid Named Entity Recognition (ONER-2025), to extract actionable information from these platforms. Our research makes four key contributions. First, we created a unique, manually annotated dataset sourced from Reddit, where users share self-reported experiences of opioid use via different administration routes. This dataset contains 331,285 tokens and includes eight major opioid entity categories. Second, we detail our annotation process and guidelines while discussing the challenges of labeling the ONER-2025 dataset. Third, we analyze key linguistic challenges, including slang, ambiguity, fragmented sentences, and emotionally charged language, in opioid discussions. Fourth, we propose a real-time monitoring system to process streaming data from social media, healthcare records, and emergency services to identify overdose events. Using 5-fold cross-validation in 11 experiments, our system integrates machine learning, deep learning, and transformer-based language models with advanced contextual embeddings to enhance understanding. Our transformer-based models (bert-base-NER and roberta-base) achieved 97% accuracy and F1-score, outperforming baselines by 10.23% (RF=0.88).

information retrieval, large language model, machine learning, (19 more...)

2504.00027

Country:

North America > Mexico > Mexico City > Mexico City (0.04)
South America (0.04)
North America > United States > Texas > Lubbock County > Lubbock (0.04)
(6 more...)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Public Health (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

arXiv.org Artificial IntelligenceApr-3-2025

RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety

Dumitriu, Andrei, Tatui, Florin, Miron, Florin, Ralhan, Aakash, Ionescu, Radu Tudor, Timofte, Radu

Rip currents are strong, localized and narrow currents of water that flow outwards into the sea, causing numerous beach-related injuries and fatalities worldwide. Accurate identification of rip currents remains challenging due to their amorphous nature and the lack of annotated data, which often requires expert knowledge. To address these issues, we present RipVIS, a large-scale video instance segmentation benchmark explicitly designed for rip current segmentation. RipVIS is an order of magnitude larger than previous datasets, featuring $184$ videos ($212,328$ frames), of which $150$ videos ($163,528$ frames) are with rip currents, collected from various sources, including drones, mobile phones, and fixed beach cameras. Our dataset encompasses diverse visual contexts, such as wave-breaking patterns, sediment flows, and water color variations, across multiple global locations, including USA, Mexico, Costa Rica, Portugal, Italy, Greece, Romania, Sri Lanka, Australia and New Zealand. Most videos are annotated at $5$ FPS to ensure accuracy in dynamic scenarios, supplemented by an additional $34$ videos ($48,800$ frames) without rip currents. We conduct comprehensive experiments with Mask R-CNN, Cascade Mask R-CNN, SparseInst and YOLO11, fine-tuning these models for the task of rip current segmentation. Results are reported in terms of multiple metrics, with a particular focus on the $F_2$ score to prioritize recall and reduce false negatives. To enhance segmentation performance, we introduce a novel post-processing step based on Temporal Confidence Aggregation (TCA). RipVIS aims to set a new standard for rip current segmentation, contributing towards safer beach environments. We offer a benchmark website to share data, models, and results with the research community, encouraging ongoing collaboration and future contributions, at https://ripvis.ai.

artificial intelligence, deep learning, machine learning, (17 more...)

2504.01128

Country:

Oceania > New Zealand (0.24)
Oceania > Australia (0.24)
North America > Mexico (0.24)
(14 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)
(2 more...)

Faria, Gonçalo, Smith, Noah A.

Sample, Don't Search: Rethinking Test-Time Alignment for Language Models

arXiv.org Machine LearningApr-3-2025

Increasing test-time computation has emerged as a promising direction for improving language model performance, particularly in scenarios where model finetuning is impractical or impossible due to computational constraints or private model weights. However, existing test-time search methods using a reward model (RM) often degrade in quality as compute scales, due to the over-optimization of what are inherently imperfect reward proxies. We introduce QAlign, a new test-time alignment approach. As we scale test-time compute, QAlign converges to sampling from the optimal aligned distribution for each individual prompt. By adopting recent advances in Markov chain Monte Carlo for text generation, our method enables better-aligned outputs without modifying the underlying model or even requiring logit access. We demonstrate the effectiveness of QAlign on mathematical reasoning benchmarks (GSM8K and GSM-Symbolic) using a task-specific RM, showing consistent improvements over existing test-time compute methods like best-of-n and majority voting. Furthermore, when applied with more realistic RMs trained on the Tulu 3 preference dataset, QAlign outperforms direct preference optimization (DPO), best-of-n, majority voting, and weighted majority voting on a diverse range of datasets (GSM8K, MATH500, IFEval, MMLU-Redux, and TruthfulQA). A practical solution to aligning language models at test time using additional computation without degradation, our approach expands the limits of the capability that can be obtained from off-the-shelf language models without further training.

large language model, machine learning, natural language, (19 more...)

2504.0379

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(3 more...)

Genre: Research Report (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
(2 more...)

arXiv.org Artificial IntelligenceApr-3-2025

Efficient Annotator Reliability Assessment with EffiARA

Cook, Owen, Vasilakes, Jake, Roberts, Ian, Song, Xingyi

Data annotation is an essential component of the machine learning pipeline; it is also a costly and time-consuming process. With the introduction of transformer-based models, annotation at the document level is increasingly popular; however, there is no standard framework for structuring such tasks. The EffiARA annotation framework is, to our knowledge, the first project to support the whole annotation pipeline, from understanding the resources required for an annotation task to compiling the annotated dataset and gaining insights into the reliability of individual annotators as well as the dataset as a whole. The framework's efficacy is supported by two previous studies: one improving classification performance through annotator-reliability-based soft label aggregation and sample weighting, and the other increasing the overall agreement among annotators through removing identifying and replacing an unreliable annotator. This work introduces the EffiARA Python package and its accompanying webtool, which provides an accessible graphical user interface for the system. We open-source the EffiARA Python package at https://github.com/MiniEggz/EffiARA and the webtool is publicly accessible at https://effiara.gate.ac.uk.

annotator, large language model, machine learning, (20 more...)

2504.00589

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Italy (0.04)

Genre:

Research Report (0.50)
Workflow (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

arXiv.org Machine LearningApr-2-2025

Comparative Analysis of Deepfake Detection Models: New Approaches and Perspectives

Batista, Matheus Martins

The growing threat posed by deepfake videos, capable of manipulating realities and disseminating misinformation, drives the urgent need for effective detection methods. This work investigates and compares different approaches for identifying deepfakes, focusing on the GenConViT model and its performance relative to other architectures present in the DeepfakeBenchmark. To contextualize the research, the social and legal impacts of deepfakes are addressed, as well as the technical fundamentals of their creation and detection, including digital image processing, machine learning, and artificial neural networks, with emphasis on Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), and Transformers. The performance evaluation of the models was conducted using relevant metrics and new datasets established in the literature, such as WildDeep-fake and DeepSpeak, aiming to identify the most effective tools in the battle against misinformation and media manipulation. The obtained results indicated that GenConViT, after fine-tuning, exhibited superior performance in terms of accuracy (93.82%) and generalization capacity, surpassing other architectures in the DeepfakeBenchmark on the DeepSpeak dataset. This study contributes to the advancement of deepfake detection techniques, offering contributions to the development of more robust and effective solutions against the dissemination of false information.

artificial intelligence, machine learning, modelo, (19 more...)

2504.029

Country:

North America > United States (0.14)
South America > Brazil > Minas Gerais > Itajubá (0.04)
South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Media (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)