 Indian Ocean


Self-attentive Transformer for Fast and Accurate Postprocessing of Temperature and Wind Speed Forecasts

arXiv.org Artificial Intelligence

Current postprocessing techniques often require separate models for each lead time and disregard possible inter-ensemble relationships by either correcting each member separately or by employing distributional approaches. In this work, we tackle these shortcomings with an innovative, fast and accurate Transformer which postprocesses each ensemble member individually while allowing information exchange across variables, spatial dimensions and lead times by means of multi-headed self-attention. Weather forecasts are postprocessed over 20 lead times simultaneously while including up to twelve meteorological predictors. We use the EUPPBench dataset for training, which contains ensemble predictions from the European Center for Medium-range Weather Forecasts' integrated forecasting system alongside corresponding observations. The work presented here is the first to postprocess the ten and one hundred-meter wind speed forecasts within this benchmark dataset, while also correcting the two-meter temperature. Our approach significantly improves the original forecasts, as measured by the CRPS, by 17.5% for two-meter temperature, nearly 5% for ten-meter wind speed and 5.3% for one hundred-meter wind speed, outperforming a classical member-by-member approach employed as a competitive benchmark. Furthermore, being up to 75 times faster, it fulfills the demand for rapid operational weather forecasts in various downstream applications, including renewable energy forecasting.
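The improvements above are reported in terms of the CRPS (continuous ranked probability score). As a rough illustration, the sketch below uses the standard ensemble estimator of the CRPS and a relative-improvement helper. The abstract does not say which estimator the authors use, so the formula choice, the function names and the sample numbers are all assumptions made for illustration.

```python
def ensemble_crps(members, observation):
    """Standard ensemble CRPS estimator (an assumption, not confirmed by the abstract):
    mean |x_i - y| minus half the mean pairwise distance |x_i - x_j|."""
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must contain at least one member")
    # Average absolute error of the members against the observation.
    accuracy_term = sum(abs(x - observation) for x in members) / m
    # Ensemble spread: average absolute difference over all member pairs, halved.
    spread_term = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return accuracy_term - spread_term


def relative_improvement(crps_raw, crps_post):
    """Percentage reduction in CRPS of the postprocessed forecast versus the raw one."""
    return 100.0 * (crps_raw - crps_post) / crps_raw


# Hypothetical numbers, chosen only to show how a figure like 17.5% arises.
raw_score = 2.0
post_score = 1.65
improvement = relative_improvement(raw_score, post_score)
```

Lower CRPS is better, so a positive relative improvement means the postprocessed ensemble is closer to the observations than the raw forecast.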


Satellite images spy Iranian 'mothership' linked to mysterious drones flying over New Jersey

Daily Mail - Science & tech

Satellite images have spotted the Iranian 'mothership' linked to the mysterious drones in New Jersey. The Shahid Bagheri drone carrier was last seen at its berth in the Iran Shipbuilding & Offshore Industries Complex on November 12, but an image taken 18 days later showed its docking station empty. That is around the same time New Jersey police started to be inundated with sightings of drones in the skies, flying in clusters and acting strangely. New Jersey Republican Rep Jeff Van Drew claimed this week there was 'circumstantial evidence' that Iran's ship was releasing the drones from America's East Coast. Van Drew said that Iran made a deal with China 'to purchase drones, a mothership and other technologies' for the drone attack on the US, a theory the Pentagon has dismissed.


What are the mystery drones flying over the US?

New Scientist

Mysterious drones have been swarming the night skies above New Jersey and other nearby states for a month. They've been spotted over several US military sites. They've been videoed over houses and apartment buildings. A swarm was seen following a US Coast Guard rescue boat at the same time that New Jersey police reported 50 drones arriving on land from the ocean. But no one seems to know who's piloting them, or whether it's a coordinated effort.


Pentagon announces new counter-drone strategy as unmanned attacks on US interests skyrocket

FOX News

Fox News' Stephanie Bennett reports the latest on the unidentified drones from London. The Pentagon unveiled a new counter-drone strategy after a spate of incursions near U.S. bases prompted concerns over a lack of an action plan for the increasing threat of unmanned aerial vehicles. Though much of the strategy remains classified, Defense Secretary Lloyd Austin will implement a new counter-drone office within the Pentagon – Joint Counter-Small UAS Office – and a new Warfighter Senior Integration Group, according to a new memo. The Pentagon will also begin work on a second Replicator initiative, but it will be up to the incoming Trump administration to decide whether to fund this plan. The first Replicator initiative worked to field inexpensive, dispensable drones to thwart drone attacks by adversarial groups across the Middle East and elsewhere.


Houthis claim attack on central Israel in response to Gaza 'massacres'

Al Jazeera

Yemen's Houthi group says it has carried out a drone attack in central Israel's Tel Aviv area in "a specific military operation" in support of Palestinians in Gaza. The Houthis said in a statement on Monday that their forces struck "a sensitive target of the Israeli enemy". An Israeli military statement said a drone hit a building in the city of Yavne after air defence systems failed to detect it and an investigation into the failure is under way. The Houthis said the operation "achieved its objective" without providing details. No injuries were reported in the attack, which caused damage to several apartments in the building, according to Israeli media reports.


Harnessing Transfer Learning from Swahili: Advancing Solutions for Comorian Dialects

arXiv.org Artificial Intelligence

While some African languages, such as Swahili, now have enough resources to develop high-performing Natural Language Processing (NLP) systems, many other languages spoken on the continent still lack such support. For these languages, still in their infancy, several possibilities exist to address this critical lack of data. Among them is Transfer Learning, which allows low-resource languages to benefit from the good representation of other languages that are similar to them. In this work, we adopt a similar approach, aiming to pioneer NLP technologies for Comorian, a group of four languages or dialects belonging to the Bantu family. Our approach is initially motivated by the hypothesis that if a human can understand a language different from their native language with little or no effort, it would be entirely possible to model this process on a machine. To achieve this, we consider ways to construct Comorian datasets mixed with Swahili. Note that, for the Swahili data, we focus only on the elements closest to Comorian, selected by calculating lexical distances between candidate and source data. We empirically test this hypothesis in two use cases: Automatic Speech Recognition (ASR) and Machine Translation (MT). Our MT model achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.6826, 0.42, and 0.6532, respectively, while our ASR system recorded a WER of 39.50% and a CER of 13.76%. This research is crucial for advancing NLP in underrepresented languages, with potential to preserve and promote Comorian linguistic heritage in the digital age.
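The ASR results above are given as WER (word error rate) and CER (character error rate). A minimal sketch of how these are commonly computed follows: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. The abstract does not describe the authors' scoring setup (tokenisation, normalisation), so the functions and the placeholder strings are assumptions for illustration only.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, 1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, 1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Placeholder strings (not Comorian data): one substitution and one deletion.
example_wer = wer("a b c d", "a x c")
example_cer = cer("abcd", "abed")
```

Both rates can exceed 1.0 when the hypothesis contains many insertions, which is why they are reported as error rates and not as accuracies.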


Monet: Mixture of Monosemantic Experts for Transformers

arXiv.org Artificial Intelligence

Understanding the internal computations of large language models (LLMs) is crucial for aligning them with human values and preventing undesirable behaviors like toxic content generation. However, mechanistic interpretability is hindered by polysemanticity, where individual neurons respond to multiple, unrelated concepts. While Sparse Autoencoders (SAEs) have attempted to disentangle these features through sparse dictionary learning, they have compromised LLM performance due to reliance on post-hoc reconstruction loss. To address this issue, we introduce the Mixture of Monosemantic Experts for Transformers (Monet) architecture, which incorporates sparse dictionary learning directly into end-to-end Mixture-of-Experts pretraining. Our novel expert decomposition method enables scaling the expert count to 262,144 per layer while total parameters scale proportionally to the square root of the number of experts. Our analyses demonstrate mutual exclusivity of knowledge across experts and showcase the parametric knowledge encapsulated within individual experts. Moreover, Monet allows knowledge manipulation over domains, languages, and toxicity mitigation without degrading general performance. Our pursuit of transparent LLMs highlights the potential of scaling expert counts to enhance mechanistic interpretability and directly resect the internal knowledge to fundamentally adjust model behavior. The source code and pretrained checkpoints are available at https://github.com/dmis-lab/Monet.


Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

arXiv.org Artificial Intelligence

Today's most advanced vision-language models (VLMs) remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling these closed VLMs into open ones. As a result, the community has been missing foundational knowledge about how to build performant VLMs from scratch. We present Molmo, a new family of VLMs that are state-of-the-art in their class of openness. Our key contribution is a collection of new datasets called PixMo, including a dataset of highly detailed image captions for pre-training, a free-form image Q&A dataset for fine-tuning, and an innovative 2D pointing dataset, all collected without the use of external VLMs. The success of our approach relies on careful modeling choices, a well-tuned training pipeline, and, most critically, the quality of our newly collected datasets. Our best-in-class 72B model not only outperforms others in the class of open weight and data models, but also outperforms larger proprietary models including Claude 3.5 Sonnet, and Gemini 1.5 Pro and Flash, second only to GPT-4o based on both academic benchmarks and on a large human evaluation. Our model weights, new datasets, and source code are available at https://molmo.allenai.org/blog.


CultureLLM: Incorporating Cultural Differences into Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) are reported to be partial to certain cultures owing to the dominance of English corpora in their training data. Since multilingual cultural data are often expensive to collect, existing efforts handle this by prompt engineering or culture-specific pre-training. However, they might overlook the knowledge deficiency of low-resource cultures and require extensive computing resources. In this paper, we propose CultureLLM, a cost-effective solution to incorporate cultural differences into LLMs. CultureLLM adopts the World Values Survey (WVS) as seed data and generates semantically equivalent training data via the proposed semantic data augmentation. Using only 50 seed samples from WVS with augmented data, we fine-tune culture-specific LLMs and one unified model (CultureLLM-One) for 9 cultures covering rich and low-resource languages. Extensive experiments on 60 culture-related datasets demonstrate that CultureLLM significantly outperforms various counterparts such as GPT-3.5 (by 8.1%) and Gemini Pro (by 9.5%), with performance comparable to or even better than GPT-4. Our human study shows that the generated samples are semantically equivalent to the original samples, providing an effective solution for LLM augmentation. Code is released at https://github.com/Scarelette/CultureLLM.


Artificial Intelligence Mangrove Monitoring System Based on Deep Learning and Sentinel-2 Satellite Data in the UAE (2017-2024)

arXiv.org Artificial Intelligence

Mangroves play a crucial role in maintaining coastal ecosystem health and protecting biodiversity. Therefore, continuous mapping of mangroves is essential for understanding their dynamics. Earth observation imagery typically provides a cost-effective way to monitor mangrove dynamics. However, there is a lack of regional studies on mangrove areas in the UAE. This study utilizes the UNet++ deep learning model combined with Sentinel-2 multispectral data and manually annotated labels to monitor the spatiotemporal dynamics of densely distributed mangroves (coverage greater than 70%) in the UAE from 2017 to 2024, achieving an mIoU of 87.8% on the validation set. Results show that the total mangrove area in the UAE in 2024 was approximately 9,142.21 hectares, an increase of 2,061.33 hectares compared to 2017, with carbon sequestration increasing by approximately 194,383.42 tons, equivalent to fixing about 713,367.36 tons of carbon dioxide. Abu Dhabi has the largest mangrove area and plays a dominant role in the UAE's mangrove growth, increasing by 1,855.6 hectares between 2017 and 2024, while other emirates have also contributed to mangrove expansion through stable and sustainable growth in mangrove areas. This comprehensive growth pattern reflects the collective efforts of all emirates in mangrove restoration.
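The segmentation quality above is reported as mIoU (mean intersection over union). The sketch below shows one common way to compute it for a two-class mask (background versus mangrove) on flattened per-pixel labels. How the authors average over classes or images is not stated in the abstract, so this per-class averaging, the function name and the toy masks are assumptions.

```python
def mean_iou(predicted, truth, num_classes=2):
    """Mean intersection-over-union across classes for flat per-pixel label lists.
    Classes absent from both prediction and truth are skipped (an assumed convention)."""
    if len(predicted) != len(truth):
        raise ValueError("prediction and ground truth must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, truth) if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, truth) if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Toy 4-pixel masks (hypothetical): 1 = mangrove, 0 = background.
predicted_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
score = mean_iou(predicted_mask, true_mask)
```

In this toy case the background IoU is 2/3 and the mangrove IoU is 1/2, so the mean is 7/12.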