UAE
Hidden city built 5,000 years ago by lost advanced civilization discovered underneath vast desert
For centuries, the Rub' al-Khali desert spanning Saudi Arabia and the UAE -- known as the Empty Quarter -- was dismissed as a lifeless sea of sand. In 2002, Sheikh Mohammed bin Rashid Al Maktoum, ruler of Dubai, spotted unusual dune formations and a large black deposit while flying over the desert. That led to the discovery of Saruq Al-Hadid, an archaeological site rich in remnants of copper and iron smelting, which is now believed to be part of a 5,000-year-old civilization buried beneath the sands. Researchers have now found traces of this ancient society approximately 10 feet beneath the desert surface, hidden in plain sight and long overlooked due to the harsh environment and shifting dunes of the Empty Quarter. This discovery brings fresh life to the legend of a mythical city known as 'Atlantis of the Sands.'
LIVE: Learnable In-Context Vector for Visual Question Answering
As language models continue to scale, Large Language Models (LLMs) have exhibited emerging capabilities in In-Context Learning (ICL), enabling them to solve language tasks by prefixing a few in-context demonstrations (ICDs) as context. Inspired by these advancements, researchers have extended these techniques to develop Large Multimodal Models (LMMs) with ICL capabilities. However, applying ICL usually faces two major challenges: 1) using more ICDs substantially increases the inference time, and 2) the performance is sensitive to the selection of ICDs. These challenges are further exacerbated in LMMs due to the integration of multiple data types and the combinatorial complexity of multimodal ICDs. Recently, to address these challenges, some NLP studies introduce non-learnable In-Context Vectors (ICVs) which extract useful task information from ICDs into a single vector and then insert it into the LLM to help solve the corresponding task. However, although useful in simple NLP tasks, these non-learnable methods fail to handle complex multimodal tasks like Visual Question Answering (VQA). In this study, we propose Learnable In-Context Vector (LIVE) to distill essential task information from demonstrations, improving ICL performance in LMMs. Experiments show that LIVE can significantly reduce computational costs while enhancing accuracy in VQA tasks compared to traditional ICL and other non-learnable ICV methods.
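To make the in-context vector idea concrete, the sketch below shows one generic way a per-layer vector distilled from demonstrations can be injected into a model's hidden states, so the model behaves as if the demonstrations were in context without paying their token cost. This is an illustrative assumption, not the authors' released LIVE implementation; the class name, `alpha` scaling, and normalization are hypothetical.

```python
import torch

class InContextVector(torch.nn.Module):
    """Minimal sketch of a learnable in-context vector (ICV)."""

    def __init__(self, num_layers: int, hidden_size: int, alpha: float = 0.1):
        super().__init__()
        # One learnable shift vector per transformer layer; non-learnable ICV
        # methods would instead derive these vectors directly from hidden-state
        # statistics of the demonstrations rather than training them.
        self.vectors = torch.nn.Parameter(torch.zeros(num_layers, hidden_size))
        self.alpha = alpha

    def apply(self, hidden_states: torch.Tensor, layer_idx: int) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) for one layer.
        shift = self.vectors[layer_idx]
        return hidden_states + self.alpha * shift / (shift.norm() + 1e-6)
```

In practice such a module would be hooked into each transformer layer's output during the forward pass on the query alone, replacing the long demonstration prefix that standard ICL requires.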
Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
Text-conditioned image generation models often generate incorrect associations between entities and their visual attributes. This reflects an impaired mapping between linguistic binding of entities and modifiers in the prompt and visual binding of the corresponding elements in the generated image. As one example, a query like "a pink sunflower and a yellow flamingo" may incorrectly produce an image of a yellow sunflower and a pink flamingo. To remedy this issue, we propose SynGen, an approach which first syntactically analyses the prompt to identify entities and their modifiers, and then uses a novel loss function that encourages the cross-attention maps to agree with the linguistic binding reflected by the syntax. Specifically, we encourage large overlap between attention maps of entities and their modifiers, and small overlap with other entities and modifier words. The loss is optimized during inference, without retraining or fine-tuning the model. Human evaluation on three datasets, including one new and challenging set, demonstrates significant improvements of SynGen compared with current state-of-the-art methods. This work highlights how making use of sentence structure during inference can efficiently and substantially improve the faithfulness of text-to-image generation.
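The sketch below illustrates the kind of attention-overlap objective the abstract describes: pull the cross-attention maps of an entity and its modifier together while pushing them away from maps of unrelated words. It is not the official SynGen code; the map shapes, the pairing inputs, and the particular distance measure are assumptions made for illustration.

```python
import torch

def attention_overlap_loss(attn_maps: torch.Tensor,
                           pairs: list[tuple[int, int]],
                           unrelated: list[tuple[int, int]]) -> torch.Tensor:
    """attn_maps: (num_tokens, H, W), one cross-attention map per prompt token.

    pairs: token-index pairs of entities and their modifiers (should overlap).
    unrelated: token-index pairs that should not overlap.
    """
    def distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Symmetric total-variation-style distance between normalized maps.
        a = a / (a.sum() + 1e-8)
        b = b / (b.sum() + 1e-8)
        return 0.5 * (a - b).abs().sum()

    pos = torch.stack([distance(attn_maps[i], attn_maps[j]) for i, j in pairs]).mean()
    neg = torch.stack([distance(attn_maps[i], attn_maps[j]) for i, j in unrelated]).mean()
    # Minimize distance within entity-modifier pairs, maximize it otherwise.
    return pos - neg
```

During inference-time optimization, a loss of this form would be backpropagated to the latent image at each denoising step, nudging the generation so that each attribute attends to the same spatial region as the entity it modifies.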
Trump hails growing ties with UAE on last leg of Gulf tour
President Donald Trump has hailed deepening ties between the United States and the United Arab Emirates and said that the latter will invest $1.4 trillion in the former's artificial intelligence sector over the next decade. "I have absolutely no doubt that the relationship will only get bigger and better," Trump said on Thursday at a meeting with UAE President Sheikh Mohamed bin Zayed Al Nahyan, on the final leg of his three-country tour of the Gulf region that saw him strike a series of lucrative tech, business and military deals that he said amounted to $10 trillion. Sheikh Mohamed said the UAE remained "committed to working with the United States to advance peace and stability in our region and globally". The deal with the UAE is expected to enable the Gulf country to build data centres vital to developing artificial intelligence models. The countries did not say which AI chips could be included in UAE data centres.
Trump's Computer Chip Deals With Saudi Arabia and UAE Divide US Government
Over the course of a three-day trip to the Middle East, President Trump and his emissaries from Silicon Valley have transformed the Persian Gulf from an artificial-intelligence neophyte into an A.I. power broker. They have reached an enormous deal with the United Arab Emirates to deliver hundreds of thousands of today's most advanced chips from Nvidia annually to build one of the world's largest data center hubs in the region, three people familiar with the talks said. The shipments would begin this year, and include roughly 100,000 chips for G42, an Emirati A.I. firm, with the rest going to U.S. cloud service providers. The administration revealed the agreement on Thursday in an announcement unveiling a new A.I. campus in Abu Dhabi supported by 5 gigawatts of electrical power. It would be the largest such project outside of the United States and would help U.S. companies serve customers in Africa, Europe and Asia, the administration said.
The Middle East Has Entered the AI Group Chat
Donald Trump's jaunt to the Middle East featured an entourage of billionaire tech bros, a fighter-jet escort, and business deals designed to reshape the global landscape of artificial intelligence. On the final stop of the tour in Abu Dhabi, the US President announced that unnamed US companies would partner with the United Arab Emirates to create the largest AI datacenter cluster outside of America. Trump said that the US companies will help G42, an Emirati company, build five gigawatts of AI computing capacity in the UAE. Sheikh Tahnoon bin Zayed Al Nahyan, who leads the UAE's Artificial Intelligence and Advanced Technology Council and is in charge of a $1.5 trillion fortune aimed at building AI capabilities, said the move will strengthen the UAE's position "as a hub for cutting-edge research and sustainable development, delivering transformative benefits for humanity." A few days earlier, as Trump arrived in Riyadh, Saudi Arabia announced Humain, an AI investment firm owned by the kingdom's Public Investment Fund.
Bacteria-inspired robot uses 12 spinning flagella to roam underwater
An underwater robot can delicately propel itself in any direction with its 12 flexible arms, inspired by the flagella of bacteria. Its creators claim it can carry out underwater inspections without endangering humans or wildlife, as propeller-driven robots would. Flagella are tiny, hair-like protrusions found on many bacteria that can spin clockwise or counterclockwise to create propulsion. "[Bacteria] have something called a biological motor, which rotates this elongated structure, and this elongated structure produces thrust, and that's how bacteria is propelled," says Anup Teejo Mathew at Khalifa University in Abu Dhabi…
UNITYAI-GUARD: Pioneering Toxicity Detection Across Low-Resource Indian Languages
Himanshu Beniwal, Reddybathuni Venkat, Rohit Kumar, Birudugadda Srivibhav, Daksh Jain, Pavan Doddi, Eshwar Dhande, Adithya Ananth, Kuldeep, Heer Kubadia, Pratham Sharda, Mayank Singh
This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an impressive average F1-score of 84.23% across seven languages, leveraging a dataset of 888k training instances and 35k manually verified test instances. To advance multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.
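For readers unfamiliar with the task, the underlying problem is standard binary sequence classification. Below is a hypothetical usage sketch with a multilingual encoder; the checkpoint name is a placeholder, not UnityAI-Guard's released model or API.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint name for illustration only.
MODEL_NAME = "some-org/multilingual-toxicity-classifier"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def is_toxic(text: str) -> bool:
    """Return True if the classifier predicts the toxic class."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Label 1 is assumed here to be the "toxic" class.
    return logits.argmax(dim=-1).item() == 1
```

A hosted API such as the one the authors describe would wrap a call like `is_toxic` behind an HTTP endpoint, so downstream moderation systems never need to load the model themselves.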
CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
Artificial intelligence has significantly impacted medical applications, particularly with the advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the future of automated and personalized healthcare. However, the trustworthiness of Med-LVLMs remains unverified, posing significant risks for future model deployment. In this paper, we introduce CARES and aim to Comprehensively evAluate the tRustworthinESs of Med-LVLMs across the medical domain. We assess the trustworthiness of Med-LVLMs across five dimensions, including trustfulness, fairness, safety, privacy, and robustness. CARES comprises about 41K question-answer pairs in both closed and open-ended formats, covering 16 medical image modalities and 27 anatomical regions. Our analysis reveals that the models consistently exhibit concerns regarding trustworthiness, often displaying factual inaccuracies and failing to maintain fairness across different demographic groups. Furthermore, they are vulnerable to attacks and demonstrate a lack of privacy awareness. We publicly release our benchmark and code at https://cares-ai.github.io/. WARNING: This paper contains model outputs that may be considered offensive.
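As a rough illustration of how a benchmark of this shape is consumed, the sketch below scores closed-ended question-answer pairs and reports accuracy per trust dimension. The JSON field names ("dimension", "question", "answer", "image_path") are assumptions for illustration, not the benchmark's actual schema.

```python
import json
from collections import defaultdict

def evaluate(model_answer_fn, path: str) -> dict[str, float]:
    """Per-dimension accuracy over a JSONL file of closed-ended QA items.

    model_answer_fn(question, image_path) -> predicted answer string.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    with open(path) as f:
        for line in f:
            item = json.loads(line)
            dim = item["dimension"]  # e.g. fairness, safety, privacy
            pred = model_answer_fn(item["question"], item.get("image_path"))
            totals[dim] += 1
            hits[dim] += int(pred.strip().lower() == item["answer"].strip().lower())
    return {dim: hits[dim] / totals[dim] for dim in totals}
```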
Divergences between Language Models and Human Brains
Do machines and humans process language in similar ways? Recent research has hinted at the affirmative, showing that human neural activity can be effectively predicted using the internal representations of language models (LMs). Although such results are thought to reflect shared computational principles between LMs and human brains, there are also clear differences in how LMs and humans represent and use language. In this work, we systematically explore the divergences between human and machine language processing by examining the differences between LM representations and human brain responses to language as measured by Magnetoencephalography (MEG) across two datasets in which subjects read and listened to narrative stories. Using an LLM-based data-driven approach, we identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense. We validate these findings with human behavioral experiments and hypothesize that the gap is due to insufficient representations of social/emotional and physical knowledge in LMs. Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
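The "effectively predicted" claim refers to the standard encoding-model analysis: fit a regularized linear map from LM hidden states to neural responses and measure held-out prediction quality. The sketch below shows that generic recipe with ridge regression; it is not the authors' pipeline, and the array shapes and train/test protocol are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

def encoding_model_score(lm_features: np.ndarray, meg: np.ndarray) -> float:
    """lm_features: (n_words, d_model) LM representations per word.
    meg: (n_words, n_sensors) MEG responses aligned to the same words.
    Returns mean held-out Pearson correlation across sensors."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        lm_features, meg, test_size=0.2, random_state=0
    )
    # Ridge regression with cross-validated regularization strength.
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    corrs = [np.corrcoef(pred[:, i], y_te[:, i])[0, 1] for i in range(meg.shape[1])]
    return float(np.mean(corrs))
```

Divergence analyses like the one described in the abstract then look at where such a model's predictions fail, e.g. on words or passages loaded with social/emotional or physical-commonsense content.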