AITopics

2504.10527

Country:

Europe (0.46)
South America > Brazil (0.28)
North America > United States (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

Lima, João Alberto de Oliveira

Poly-Vector Retrieval: Reference and Content Embeddings for Legal Documents

arXiv.org Artificial Intelligence, Apr-16-2025

Retrieval-Augmented Generation (RAG) has emerged as an effective paradigm for generating contextually accurate answers by integrating Large Language Models (LLMs) with retrieval mechanisms. However, in legal contexts, users frequently reference norms by their labels or nicknames (e.g., Article 5 of the Constitution or Consumer Defense Code (CDC)) rather than by their content. Furthermore, legal texts themselves rely heavily on explicit cross-references (e.g., "pursuant to Article 34") that function as pointers. Both scenarios pose challenges for traditional RAG approaches that rely solely on semantic embeddings of text, which often fail to retrieve the necessary referenced content. This paper introduces Poly-Vector Retrieval, a method assigning multiple distinct embeddings to each legal provision: one embedding captures the content (the full text), another captures the label (the identifier or proper name), and optionally additional embeddings capture alternative denominations. Inspired by Frege's distinction between Sense and Reference, this poly-vector retrieval approach treats labels, identifiers and reference markers as rigid designators and content embeddings as carriers of semantic substance. Experiments on the Brazilian Federal Constitution demonstrate that Poly-Vector Retrieval significantly improves retrieval accuracy for label-centric queries and shows potential to resolve internal and external cross-references, without compromising performance on purely semantic queries. The study discusses philosophical and practical implications of explicitly separating reference from content in vector embeddings and proposes future research directions for applying this approach to broader legal datasets and other domains characterized by explicit reference identifiers.
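The idea of giving each provision separate content, label, and alias vectors can be illustrated with a minimal sketch. This is not the paper's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the two provisions are paraphrased placeholders rather than official text, and every name here (`PROVISIONS`, `build_index`, `search`) is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical provisions; the content strings are placeholders, not legal text.
PROVISIONS = {
    "art5": {
        "label": "Article 5 of the Constitution",
        "aliases": [],
        "content": "All persons are equal before the law, without distinction of any kind.",
    },
    "cdc": {
        "label": "Consumer Defense Code",
        "aliases": ["CDC"],
        "content": "Rules protecting buyers of goods and services against abusive commercial practices.",
    },
}


def build_index(provisions):
    """Store several vectors per provision: content, label, and each alias."""
    index = []
    for pid, p in provisions.items():
        index.append((pid, "content", embed(p["content"])))
        index.append((pid, "label", embed(p["label"])))
        for alias in p["aliases"]:
            index.append((pid, "alias", embed(alias)))
    return index


def search(index, query):
    """Rank provisions by their best-matching vector of any kind."""
    q = embed(query)
    best = {}
    for pid, kind, vec in index:
        score = cosine(q, vec)
        if score > best.get(pid, (0.0, None))[0]:
            best[pid] = (score, kind)
    ranked = [(pid, kind, score) for pid, (score, kind) in best.items()]
    return sorted(ranked, key=lambda r: r[2], reverse=True)


index = build_index(PROVISIONS)
for query in ("What does the CDC say?", "protecting buyers against abusive practices"):
    print(query, "->", search(index, query)[0])
```

In this sketch a label-centric query is matched through the alias vector, while a descriptive query is matched through the content vector, so both resolve to the same provision without one embedding having to serve both purposes.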

crfb, large language model, machine learning

2504.10508

Country: South America > Brazil (0.48)

Genre: Research Report > New Finding (0.92)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial Intelligence, Apr-16-2025

EthosGPT: Mapping Human Value Diversity to Advance Sustainable Development Goals (SDGs)

Zhang, Luyao

Large language models (LLMs) are transforming global decision-making and societal systems by processing diverse data at unprecedented scales. However, their potential to homogenize human values poses critical risks, similar to biodiversity loss undermining ecological resilience. Rooted in the ancient Greek concept of ethos, meaning both individual character and the shared moral fabric of communities, EthosGPT draws on a tradition that spans from Aristotle's virtue ethics to Adam Smith's moral sentiments as the ethical foundation of economic cooperation. These traditions underscore the vital role of value diversity in fostering social trust, institutional legitimacy, and long-term prosperity. EthosGPT addresses the challenge of value homogenization by introducing an open-source framework for mapping and evaluating LLMs within a global scale of human values. Using international survey data on cultural indices, prompt-based assessments, and comparative statistical analyses, EthosGPT reveals both the adaptability and biases of LLMs across regions and cultures. It offers actionable insights for developing inclusive LLMs, such as diversifying training data and preserving endangered cultural heritage to ensure representation in AI systems. These contributions align with the United Nations Sustainable Development Goals (SDGs), especially SDG 10 (Reduced Inequalities), SDG 11.4 (Cultural Heritage Preservation), and SDG 16 (Peace, Justice and Strong Institutions). Through interdisciplinary collaboration, EthosGPT promotes AI systems that are both technically robust and ethically inclusive, advancing value plurality as a cornerstone for sustainable and equitable futures.

large language model, machine learning, natural language

2504.09861

Country:

South America (1.00)
North America (1.00)
Asia > Middle East (1.00)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.46)

Industry: Social Sector (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Engadget, Apr-15-2025, 18:15:11 GMT

Anthropic's Claude can now read your emails

Anthropic announced that its Claude AI can integrate with Google Workspace. This tie-in allows the AI assistant to access any information in Gmail, Google Documents and Google Calendar. Enterprise-level customers even get a special cataloguing option for Documents that aims to offer even better speed and accuracy when retrieving information. This update could make Claude more helpful when it comes to using the chatbot for scheduling or accessing information within the Google ecosystem. The blog post with the announcement specified that the Enterprise option comes with special security controls for confidentiality, but doesn't detail if or how other users might be able to keep Claude from accessing sensitive information that might be stored in an email or document.

anthropic, claude, information

Country:

South America > Brazil (0.08)
North America > United States (0.08)
Asia > Japan (0.08)

Industry: Information Technology > Security & Privacy (0.62)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)

Nishida, Naoto, Ishiguro, Yoshio, Rekimoto, Jun, Yamashita, Naomi

Dynamik: Syntactically-Driven Dynamic Font Sizing for Emphasis of Key Information

In today's globalized world, there are increasing opportunities for individuals to communicate using a common non-native language (lingua franca). Non-native speakers often have opportunities to listen to foreign languages, but may not comprehend them as fully as native speakers do. To aid real-time comprehension, live transcription of subtitles is frequently used in everyday life (e.g., during Zoom conversations, watching YouTube videos, or on social networking sites). However, simultaneously reading subtitles while listening can increase cognitive load. In this study, we propose Dynamik, a system that reduces cognitive load during reading by decreasing the size of less important words and enlarging important ones, thereby enhancing sentence contrast. Our results indicate that Dynamik can reduce certain aspects of cognitive load, specifically, participants' perceived performance and effort among individuals with low proficiency in English, as well as enhance the users' sense of comprehension, especially among people with low English ability. We further discuss our methods' applicability to other languages and potential improvements and further research directions.

artificial intelligence, machine learning, social media

doi: 10.1145/3708359.3712115

2504.09734

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Italy > Sardinia > Cagliari (0.06)
North America > United States > New York > New York County > New York City (0.06)

Genre: Research Report > New Finding (1.00)

Industry:

Media (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning, Apr-15-2025

TransST: Transfer Learning Embedded Spatial Factor Modeling of Spatial Transcriptomics Data

Liu, Shuo Shuo, Wang, Shikun, Chen, Yuxuan, Rustgi, Anil K., Yuan, Ming, Hu, Jianhua

Background: Spatial transcriptomics has emerged as a powerful tool in biomedical research because of its ability to capture both the spatial context and the abundance of the complete RNA transcript profile in organs of interest. However, limitations of the technology, such as its relatively low resolution and comparatively insufficient sequencing depth, make it difficult to reliably extract real biological signals from these data. To alleviate this challenge, we propose a novel transfer learning framework, referred to as TransST, to adaptively leverage the cell-labeled information from external sources in inferring cell-level heterogeneity of target spatial transcriptomics data. Results: Applications in several real studies as well as a number of simulation settings show that our approach significantly improves on existing techniques. For example, in the breast cancer study, TransST successfully identifies five biologically meaningful cell clusters, including the two subgroups of cancer in situ and invasive cancer; in addition, only TransST is able to separate the adipose tissues from the connective tissues among all the studied methods. Conclusions: In summary, the proposed method TransST is both effective and robust in identifying cell subclusters and detecting corresponding driving biomarkers in spatial transcriptomics data.

artificial intelligence, machine learning, transst

2504.12353

Country:

North America > United States (0.28)
Europe > Netherlands > South Holland > Leiden (0.04)
South America > Argentina (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Dermatology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.50)
Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.63)

RoboCup Rescue 2025 Team Description Paper UruBots

Farias, Kevin, Moraes, Pablo, Nunes, Igor, Deniz, Juan, Barcelona, Sebastian, Sodre, Hiago, Moraes, William, Rodriguez, Monica, Mazondo, Ahilen, Sandin, Vincent, da Silva, Gabriel, Saravia, Victoria, Melgar, Vinicio, Fernandez, Santiago, Grando, Ricardo

This paper describes the approach used by Team UruBots for participation in the 2025 RoboCup Rescue Robot League competition. Our team aims to participate for the first time in this competition at RoboCup, using experience learned from previous competitions and research. We present our vehicle and our approach to tackle the task of detecting and finding victims in search and rescue environments. Our approach draws on well-known topics in robotics, such as ROS, SLAM, Human-Robot Interaction, segmentation, and perception. Our proposed approach is open source, available to the RoboCup Rescue community, where we aim to learn and contribute to the league.

artificial intelligence, deep learning, machine learning

2504.09778

Country: South America (0.16)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Soccer Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Cardoso, Lucas, Santos, Vitor, Ribeiro, José, Kawasaki, Regiane, Prudêncio, Ricardo, Alves, Ronnie

Enhancing Classifier Evaluation: A Fairer Benchmarking Strategy Based on Ability and Robustness

Benchmarking is a fundamental practice in machine learning (ML) for comparing the performance of classification algorithms. However, traditional evaluation methods often overlook a critical aspect: the joint consideration of dataset complexity and an algorithm's ability to generalize. Without this dual perspective, assessments may favor models that perform well on easy instances while failing to capture their true robustness. To address this limitation, this study introduces a novel evaluation methodology that combines Item Response Theory (IRT) with the Glicko-2 rating system, originally developed to measure player strength in competitive games. IRT assesses classifier ability based on performance over difficult instances, while Glicko-2 updates performance metrics - such as rating, deviation, and volatility - via simulated tournaments between classifiers. This combined approach provides a fairer and more nuanced measure of algorithm capability. A case study using the OpenML-CC18 benchmark showed that only 15% of the datasets are truly challenging and that a reduced subset with 50% of the original datasets offers comparable evaluation power. Among the algorithms tested, Random Forest achieved the highest ability score. The results highlight the importance of improving benchmark design by focusing on dataset quality and adopting evaluation strategies that reflect both difficulty and classifier proficiency.
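For readers unfamiliar with IRT, a common logistic form is shown below as a reference point. The abstract does not say which IRT variant the study uses, so the two-parameter model and the symbols here are assumptions: $\theta_j$ is the ability of classifier $j$, $b_i$ the difficulty of instance $i$, and $a_i$ its discrimination.

```latex
P(\text{classifier } j \text{ is correct on instance } i)
  = \frac{1}{1 + e^{-a_i(\theta_j - b_i)}}
```

Under this form, a correct answer on an instance with high $b_i$ is stronger evidence of high ability than a correct answer on an easy one, which matches the abstract's point that ability is assessed over difficult instances.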

artificial intelligence, decision tree learning, machine learning

2504.09759

Country:

South America > Brazil (0.28)
North America > United States (0.28)

Genre:

Research Report (1.00)
Workflow (0.68)

Industry:

Education (0.67)
Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.34)

UruBots RoboCup Work Team Description Paper

Sodre, Hiago, Deniz, Juan, Moraes, Pablo, Moraes, William, Nunes, Igor, Sandin, Vincent, Mazondo, Ahilen, Fernandez, Santiago, da Silva, Gabriel, Rodriguez, Monica, Barcelona, Sebastian, Grando, Ricardo

This work presents a team description paper for the RoboCup @Work League. Our team, UruBots, has been developing robots and projects for research and competitions for the last three years, attending robotics competitions in Uruguay and around the world. In this instance, we aim to participate and contribute to the RoboCup @Work category, hopefully making our debut in this prestigious competition. For that, we present an approach based on the Limo robot, whose main characteristic is its hybrid locomotion system with wheels and tracks, with some extras added by the team to complement the robot's functionalities. Overall, our approach allows the robot to efficiently and autonomously navigate an @Work scenario, with the ability to manipulate objects, perform autonomous navigation, and engage in a simulated industrial environment.

artificial intelligence, machine learning, robot

2504.09755

Country: South America > Uruguay (0.36)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports > Soccer (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots > Soccer Robots (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Borne--Pons, Paul, Czerkawski, Mikolaj, Martin, Rosalie, Rouffet, Romain

MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data

Terrain modeling is a complex and time-consuming task, particularly when it involves large-scale landscapes, which are becoming more common with the current boom in popularity of open-world games. The current state-of-the-art (SOTA) in terrain modeling relies mainly on procedural and simulation methods [8], which rarely scale well beyond a certain point (computationally expensive or lacking realism) and can easily fail to capture the variety of landscapes the world offers. The recent advances in generative machine learning, especially in the area of diffusion models, have paved the way for models that can learn a representation of Earth's landscapes directly from real terrain data. By abstracting the complexity of the underlying physical processes, generative models can learn to reproduce patterns and mutual dependencies between visual features, which can lead to high levels of perceptual realism. This work explores the potential of following a similar data-centric methodology for a joint domain of terrain surface model and optical reflectance.

artificial intelligence, deep learning, machine learning

2504.0721

Country:

South America (0.28)
Europe (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)