Goto

Collaborating Authors

 Personal


DoTA-RAG: Dynamic of Thought Aggregation RAG

arXiv.org Artificial Intelligence

In this paper, we introduce DoTA-RAG (Dynamic-of-Thought Aggregation RAG), a retrieval-augmented generation system optimized for high-throughput, large-scale web knowledge indexes. Traditional RAG pipelines often suffer from high latency and limited accuracy over massive, diverse datasets. DoTA-RAG addresses these challenges with a three-stage pipeline: query rewriting, dynamic routing to specialized sub-indexes, and multi-stage retrieval and ranking. We further enhance retrieval by evaluating and selecting a superior embedding model, re-embedding the large FineWeb-10BT corpus. Moreover, we create a diverse Q&A dataset of 500 questions generated via the DataMorgana setup across a broad range of WebOrganizer topics and formats. DoTA-RAG improves the answer correctness score from 0.752 (baseline, using LiveRAG pre-built vector store) to 1.478 while maintaining low latency, and it achieves a 0.929 correctness score on the Live Challenge Day. These results highlight DoTA-RAG's potential for practical deployment in domains requiring fast, reliable access to large and evolving knowledge sources.


"I Hadn't Thought About That": Creators of Human-like AI Weigh in on Ethics And Neurodivergence

arXiv.org Artificial Intelligence

Human-like AI agents such as robots and chatbots are becoming increasingly popular, but they present a variety of ethical concerns. The first concern is in how we define humanness, and how our definition impacts communities historically dehumanized by scientific research. Autistic people in particular have been dehumanized by being compared to robots, making it even more important to ensure this marginalization is not reproduced by AI that may promote neuronormative social behaviors. Second, the ubiquitous use of these agents raises concerns surrounding model biases and accessibility. In our work, we investigate the experiences of the people who build and design these technologies to gain insights into their understanding and acceptance of neurodivergence, and the challenges in making their work more accessible to users with diverse needs. Even though neurodivergent individuals are often marginalized for their unique communication styles, nearly all participants overlooked the conclusions their end-users and other AI system makers may draw about communication norms from the implementation and interpretation of humanness applied in participants' work. This highlights a major gap in their broader ethical considerations, compounded by some participants' neuronormative assumptions about the behaviors and traits that distinguish "humans" from "bots" and the replication of these assumptions in their work. We examine the impact this may have on autism inclusion in society and provide recommendations for additional systemic changes towards more ethical research directions.


Q&A: How anacondas, chickens, and locals may be able to coexist in the Amazon

Popular Science

Breakthroughs, discoveries, and DIY tips sent every weekday. South America's lush Amazon region is a biodiversity hotspot, which means that every living thing must find a way to co-exist. Even some of the most feared snakes on the planet–anacondas. In a paper published June 16 in the journal Frontiers in Amphibian and Reptile Science, conservation biologists Beatriz Cosendey and Juarez Carlos Brito Pezzuti from the Federal University of Pará's Center for Amazonian Studies in Brazil, analyze the key points behind the interactions between humans and the local anaconda populations. Ahead of the paper's publication, the team at Frontiers conducted this wide-ranging Q&A with Conesday.


RSCF: Relation-Semantics Consistent Filter for Entity Embedding of Knowledge Graph

arXiv.org Artificial Intelligence

In knowledge graph embedding, leveraging relation specific entity transformation has markedly enhanced performance. However, the consistency of embedding differences before and after transformation remains unaddressed, risking the loss of valuable inductive bias inherent in the embeddings. This inconsistency stems from two problems. First, transformation representations are specified for relations in a disconnected manner, allowing dissimilar transformations and corresponding entity embeddings for similar relations. Second, a generalized plug-in approach as a SFBR (Semantic Filter Based on Relations) disrupts this consistency through excessive concentration of entity embeddings under entity-based regularization, generating indistinguishable score distributions among relations. In this paper, we introduce a plug-in KGE method, Relation-Semantics Consistent Filter (RSCF). Its entity transformation has three features for enhancing semantic consistency: 1) shared affine transformation of relation embeddings across all relations, 2) rooted entity transformation that adds an entity embedding to its change represented by the transformed vector, and 3) normalization of the change to prevent scale reduction. To amplify the advantages of consistency that preserve semantics on embeddings, RSCF adds relation transformation and prediction modules for enhancing the semantics. In knowledge graph completion tasks with distance-based and tensor decomposition models, RSCF significantly outperforms state-of-the-art KGE methods, showing robustness across all relations and their frequencies.


"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services

arXiv.org Artificial Intelligence

Recent advances in artificial intelligence (AI) speech generation and voice cloning technologies have produced naturalistic speech and accurate voice replication, yet their influence on sociotechnical systems across diverse accents and linguistic traits is not fully understood. This study evaluates two synthetic AI voice services (Speechify and ElevenLabs) through a mixed methods approach using surveys and interviews to assess technical performance and uncover how users' lived experiences influence their perceptions of accent variations in these speech technologies. Our findings reveal technical performance disparities across five regional, English-language accents and demonstrate how current speech generation technologies may inadvertently reinforce linguistic privilege and accent-based discrimination, potentially creating new forms of digital exclusion. Overall, our study highlights the need for inclusive design and regulation by providing actionable insights for developers, policymakers, and organizations to ensure equitable and socially responsible AI speech technologies.


Jennie Garth claims ex-husband Peter Facinelli's dating app age range matched their daughter's

FOX News

Jennie Garth told Fox News Digital that once her youngest child graduates from high school, she will be moving out of California. Jennie Garth is bringing up ex-Peter Facinelli's dating past, specifically about his time on the exclusive celebrity dating app, Raya. During a podcast interview, Garth, 53, claimed that the actor's age range was close to their eldest daughter, Luca, who is now 27. "My ex-husband Peter, I was told, was on Raya, and his age, whatever range, that he was looking for was also the age range of his oldest daughter," Garth shared on the "I Do, Part 2" podcast with Jana Kramer and guest J.P. Rosenbaum. "So, she came across him on her thing."


The Newspaper That Hired ChatGPT

The Atlantic - Technology

For more than 20 years, print media has been a bit of a punching bag for digital-technology companies. Craigslist killed the paid classifieds, free websites led people to think newspapers and magazines were committing robbery when they charged for subscriptions, and the smartphone and social media turned reading full-length articles into a chore. Now generative AI is in the mix--and many publishers, desperate to avoid being left behind once more, are rushing to harness the technology themselves. Several major publications, including The Atlantic, have entered into corporate partnerships with OpenAI and other AI firms. Any number of experiments have ensued--publishers have used the software to help translate work into different languages, draft headlines, and write summaries or even articles.


Four science-based rules that will make your conversations flow

New Scientist

One of the four pillars of good conversation is levity. You needn't be a comedian, you can but have some fun Conversation lies at the heart of our relationships – yet many of us find it surprisingly hard to talk to others. We may feel anxious at the thought of making small talk with strangers and struggle to connect with the people who are closest to us. If that sounds familiar, Alison Wood Brooks hopes to help. She is a professor at Harvard Business School, where she teaches an oversubscribed course called "TALK: How to talk gooder in business and life", and the author of a new book, Talk: The science of conversation and the art of being ourselves.


Preparing for kick-off at RoboCup2025: an interview with General Chair Marco Simões

AIHub

The Salvador Convention Center, where RoboCup 2025 will take place. RoboCup is an international scientific initiative with the goal of advancing the state of the art of intelligent robots, AI and automation. The annual RoboCup event, where teams gather from across the globe to take part in competitions across a number of leagues, will this year take place in Brazil, from 15-21 July. We spoke to Marco Simões, one of the General Chairs of RoboCup 2025 and President of RoboCup Brazil, to find out what plans they have for the event, some new initiatives, and how RoboCup has grown in Brazil over the past ten years. RoboCup will be held in Salvador, Brazil.


A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

arXiv.org Artificial Intelligence

Recent advancements in Large Multimodal Models (LMMs) have significantly improved multimodal understanding and generation. However, these models still struggle to generate tightly interleaved image-text outputs, primarily due to the limited scale, quality and instructional richness of current training datasets. To address this, we introduce InterSyn, a large-scale multimodal dataset constructed using our Self-Evaluation with Iterative Refinement (SEIR) method. InterSyn features multi-turn, instruction-driven dialogues with tightly interleaved imagetext responses, providing rich object diversity and rigorous automated quality refinement, making it well-suited for training next-generation instruction-following LMMs. Furthermore, to address the lack of reliable evaluation tools capable of assessing interleaved multimodal outputs, we introduce SynJudge, an automatic evaluation model designed to quantitatively assess multimodal outputs along four dimensions: text content, image content, image quality, and image-text synergy. Experimental studies show that the SEIR method leads to substantially higher dataset quality compared to an otherwise identical process without refinement. Moreover, LMMs trained on InterSyn achieve uniform performance gains across all evaluation metrics, confirming InterSyn's utility for advancing multimodal systems.