AITopics | Atlantic Ocean

Multi-fidelity climate model parameterization for better generalization and extrapolation

Bhouri, Mohamed Aziz, Peng, Liran, Pritchard, Michael S., Gentine, Pierre

arXiv.org Artificial Intelligence, Sep-18-2023

Machine-learning-based parameterizations (i.e., representations of sub-grid processes) of global climate models or turbulent simulations have recently been proposed as a powerful alternative to physical, but empirical, representations, offering a lower computational cost and higher accuracy. Yet those approaches still suffer from a lack of generalization and extrapolation beyond the training data, which is nonetheless critical to projecting climate change or unobserved regimes of turbulence. Here we show that a multi-fidelity approach, which integrates datasets of different accuracy and abundance, can provide the best of both worlds: the capacity to extrapolate by leveraging the physically-based parameterization, and higher accuracy from the machine-learning-based parameterizations. In an application to climate modeling, the multi-fidelity framework yields more accurate climate projections without requiring a major increase in computational resources. Our multi-fidelity randomized prior networks (MF-RPNs) combine physical parameterization data as low-fidelity data and data from storm-resolving historical runs as high-fidelity data. To test extrapolation beyond the training data, the MF-RPNs are evaluated on high-fidelity data from a $+4K$ warming scenario. We show the MF-RPNs' capacity to return much more skillful predictions than models trained on only one regime, whether low-fidelity or high-fidelity (historical) simulations, while providing trustworthy uncertainty quantification across a wide range of scenarios. Our approach paves the way for the use of machine-learning-based methods that can optimally leverage historical observations or high-fidelity simulations and extrapolate to unseen regimes such as climate change.

parameterization, tendency, vertical level, (17 more...)

arXiv.org Artificial Intelligence

2309.10231

Country:

Atlantic Ocean > South Atlantic Ocean (0.04)
North America > United States > New York > New York County > New York City (0.04)
South America (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models

Leong, Wei Qi, Ngui, Jian Gang, Susanto, Yosephine, Rengarajan, Hamsawardhini, Sarveswaran, Kengatharaiyer, Tjhi, William Chandra

arXiv.org Artificial Intelligence, Sep-18-2023

The rapid development of Large Language Models (LLMs) and the emergence of novel abilities with scale have necessitated the construction of holistic, diverse and challenging benchmarks such as HELM and BIG-bench. However, at the moment, most of these benchmarks focus only on performance in English, and evaluations that include Southeast Asian (SEA) languages are few in number. We therefore propose BHASA, a holistic linguistic and cultural evaluation suite for LLMs in SEA languages. It comprises three components: (1) an NLP benchmark covering eight tasks across Natural Language Understanding (NLU), Generation (NLG) and Reasoning (NLR), (2) LINDSEA, a linguistic diagnostic toolkit that spans the gamut of linguistic phenomena including syntax, semantics and pragmatics, and (3) a cultural diagnostics dataset that probes for both cultural representation and sensitivity. For this preliminary effort, we implement the NLP benchmark only for Indonesian, Vietnamese, Thai and Tamil, and we only include Indonesian and Tamil for LINDSEA and the cultural diagnostics dataset. As GPT-4 is purportedly one of the best-performing multilingual LLMs at the moment, we use it as a yardstick to gauge the capabilities of LLMs in the context of SEA languages. Our initial experiments on GPT-4 with BHASA find it lacking in various aspects of linguistic capabilities, cultural representation and sensitivity in the targeted SEA languages. BHASA is a work in progress and will continue to be improved and expanded in the future.

abstractive summarization, computational linguistics, nlp benchmark component, (14 more...)

arXiv.org Artificial Intelligence

2309.06085

Country:

North America > United States > Montana (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
Asia > Singapore (0.04)
(50 more...)

Genre:

Overview (0.92)
Personal (0.92)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Government (1.00)
Education (0.92)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Unifying Perspective on Non-Stationary Kernels for Deeper Gaussian Processes

Noack, Marcus M., Luo, Hengrui, Risser, Mark D.

arXiv.org Machine Learning, Sep-18-2023

The Gaussian process (GP) is a popular statistical technique for stochastic function approximation and uncertainty quantification from data. GPs have been adopted into the realm of machine learning in the last two decades because of their superior prediction abilities, especially in data-sparse scenarios, and their inherent ability to provide robust uncertainty estimates. Even so, their performance highly depends on intricate customizations of the core methodology, which often leads to dissatisfaction among practitioners when standard setups and off-the-shelf software tools are being deployed. Arguably the most important building block of a GP is the kernel function which assumes the role of a covariance operator. Stationary kernels of the Matérn class are used in the vast majority of applied studies; poor prediction performance and unrealistic uncertainty quantification are often the consequences. Non-stationary kernels show improved performance but are rarely used due to their more complicated functional form and the associated effort and expertise needed to define and tune them optimally. In this perspective, we want to help ML practitioners make sense of some of the most common forms of non-stationarity for Gaussian processes. We show a variety of kernels in action using representative datasets, carefully study their properties, and compare their performances. Based on our findings, we propose a new kernel that combines some of the identified advantages of existing kernels.
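
The stationary versus non-stationary distinction in the abstract above can be illustrated with a minimal sketch in plain Python. It compares a stationary squared-exponential kernel with the Gibbs kernel, a standard non-stationary kernel whose length scale depends on the input. The Gibbs kernel is used here for illustration only and is not the new kernel the authors propose; `example_lengthscale` and `gram` are hypothetical helpers introduced for this example.

```python
import math


def rbf_kernel(x1, x2, lengthscale=1.0):
    """Stationary squared-exponential kernel: depends only on x1 - x2."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def gibbs_kernel(x1, x2, lengthscale_fn):
    """Non-stationary Gibbs kernel in 1D with an input-dependent length scale."""
    l1, l2 = lengthscale_fn(x1), lengthscale_fn(x2)
    denom = l1 * l1 + l2 * l2
    prefactor = math.sqrt(2.0 * l1 * l2 / denom)
    return prefactor * math.exp(-((x1 - x2) ** 2) / denom)


def example_lengthscale(x):
    """Hypothetical length scale (an assumption): grows away from the origin."""
    return 0.5 + 0.2 * abs(x)


def gram(kernel, xs):
    """Covariance (Gram) matrix of a kernel over a list of 1D inputs."""
    return [[kernel(a, b) for b in xs] for a in xs]


# Same separation (1.0) at two different locations.
# The stationary kernel gives identical covariances; the Gibbs kernel does not.
print(rbf_kernel(0.0, 1.0), rbf_kernel(2.0, 3.0))
print(gibbs_kernel(0.0, 1.0, example_lengthscale),
      gibbs_kernel(2.0, 3.0, example_lengthscale))
```

With a constant length scale the Gibbs kernel reduces to the squared-exponential kernel, which makes it a convenient way to see what input-dependent length scales add.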

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Machine Learning

2309.10068

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Atlantic Ocean > North Atlantic Ocean (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

Li, Ming, Zhang, Yong, Li, Zhitao, Chen, Jiuhai, Chen, Lichang, Cheng, Ning, Wang, Jianzong, Zhou, Tianyi, Xiao, Jing

arXiv.org Artificial Intelligence, Sep-15-2023

In the realm of Large Language Models, the balance between instruction data quality and quantity has become a focal point. Recognizing this, we introduce a self-guided methodology for LLMs to autonomously discern and select cherry samples from vast open-source datasets, effectively minimizing manual curation and the potential cost of instruction tuning an LLM. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal tool to identify discrepancies between a model's expected responses and its autonomous generation prowess. Through the adept application of IFD, cherry samples are pinpointed, leading to a marked uptick in model training efficiency. Empirical validations on renowned datasets like Alpaca and WizardLM underpin our findings; with a mere 10% of conventional data input, our strategy showcases improved results. This synthesis of self-guided cherry-picking and the IFD metric signifies a transformative leap in the optimization of LLMs, promising both efficiency and resource-conscious advancements. Code, data, and models are available: https://github.com/MingLiiii/Cherry_LLM
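
The abstract above does not state the IFD formula, so the following is only a sketch under a stated assumption: that the score is the ratio of the model's mean answer loss when conditioned on the instruction to its mean answer loss without the instruction. The function names, the per-token log-probability inputs, and the rule of keeping the highest-scoring fraction are all hypothetical choices for illustration, not the authors' released code.

```python
def mean_nll(token_logprobs):
    """Mean negative log-likelihood over the answer tokens."""
    if not token_logprobs:
        raise ValueError("need at least one answer token")
    return -sum(token_logprobs) / len(token_logprobs)


def ifd_score(logprobs_with_instruction, logprobs_without_instruction):
    """Assumed IFD: conditioned answer loss divided by direct answer loss.

    Both arguments are per-token log-probabilities of the same answer,
    scored by the same model with and without the instruction as context.
    """
    direct = mean_nll(logprobs_without_instruction)
    if direct == 0.0:
        raise ValueError("direct answer loss is zero; ratio undefined")
    return mean_nll(logprobs_with_instruction) / direct


def select_top_fraction(scored_samples, fraction=0.10):
    """Keep the highest-scoring fraction of (sample_id, score) pairs.

    Selecting the highest scores is an assumption made for this sketch.
    """
    ranked = sorted(scored_samples, key=lambda pair: pair[1], reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Toy numbers: the instruction halves the answer loss, so the ratio is 0.5.
print(ifd_score([-0.5, -0.5], [-1.0, -1.0]))
```

Under this assumed definition, a ratio near 1 means the instruction barely helps the model produce the answer, while a small ratio means the instruction makes the answer easy.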

cherry model, dataset, instruction, (15 more...)

arXiv.org Artificial Intelligence

2308.12032

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
(11 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Consumer Health (1.00)
Education > Health & Safety > School Nutrition (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Temporal-spatial model via Trend Filtering

Padilla, Carlos Misael Madrid, Padilla, Oscar Hernan Madrid, Wang, Daren

arXiv.org Machine Learning, Sep-12-2023

This research focuses on the estimation of a non-parametric regression function designed for data with simultaneous time and space dependencies. In such a context, we study Trend Filtering, a nonparametric estimator introduced by Mammen and van de Geer (1997) and Rudin, Osher and Fatemi (1992). For univariate settings, the signals we consider are assumed to have a $k$th weak derivative with bounded total variation, allowing for a general degree of smoothness. In the multivariate scenario, we study a $K$-Nearest Neighbor fused lasso estimator as in Padilla et al. (2018), employing an ADMM algorithm, suitable for signals with bounded variation that adhere to a piecewise Lipschitz continuity criterion. By aligning with lower bounds, the minimax optimality of our estimators is validated. A unique phase transition phenomenon, previously uncharted in Trend Filtering studies, emerges through our analysis. Both simulation studies and real data applications underscore the superior performance of our method when compared with established techniques in the existing literature.
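
To make the univariate penalty mentioned in the abstract above concrete, here is a minimal sketch in plain Python of the standard discrete trend filtering objective on an evenly spaced grid: a squared-error fit term plus `lam` times the absolute sum of the order `k + 1` differences of the fitted values. It only evaluates the objective; it is not the authors' temporal-spatial estimator or their ADMM solver, and the function names are hypothetical.

```python
def differences(values, order):
    """Repeated first differences: order 1 gives v[i+1] - v[i], and so on."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def trend_filtering_objective(y, theta, lam, k):
    """0.5 * sum (y_i - theta_i)^2 + lam * sum |order-(k+1) differences of theta|."""
    if len(y) != len(theta):
        raise ValueError("y and theta must have the same length")
    fit = 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, theta))
    penalty = sum(abs(d) for d in differences(theta, k + 1))
    return fit + lam * penalty


# A piecewise-linear fit with a single kink: only one second difference is non-zero.
theta = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
print(differences(theta, 2))
print(trend_filtering_objective(theta, theta, lam=0.5, k=1))
```

With `k = 0` the penalty is the total variation of the fitted values themselves (the fused lasso case); larger `k` penalizes changes in higher-order discrete derivatives, which is what allows piecewise-polynomial fits.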

artificial intelligence, inequality, machine learning, (18 more...)

arXiv.org Machine Learning

2308.16172

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.13)
Europe > Spain > Galicia > Madrid (0.05)
Asia > Japan > Honshū > Kansai > Wakayama Prefecture > Wakayama (0.04)
(6 more...)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Ukraine claims to retake Black Sea drilling rigs from Russian control

BBC News, Sep-11-2023, 19:40:55 GMT

Now it's Russia that appears to have most to worry about, as Ukrainian drones and commandos launch raids on the northwest corner of Crimea, damaging a radar base on the Tarkhankut Peninsula and even planting a Ukrainian flag during an operation to mark Independence Day, on 24 August.

artificial intelligence, retake black sea drilling rig

BBC News

Country:

Europe > Ukraine > Crimea (0.44)
Atlantic Ocean > Mediterranean Sea > Black Sea (0.40)

Industry: Energy > Oil & Gas > Upstream (0.40)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.44)

Can the UK's new ARIA science agency deliver 'moonshot' technologies?

New Scientist, Sep-11-2023, 00:01:30 GMT

The UK's Advanced Research and Invention Agency (ARIA) has chosen eight scientists who will each be given up to £50 million to allocate as they see fit, in the hopes that a high-risk, high-reward approach to research funding will deliver results that benefit UK society and fuel economic growth. ARIA is the brainchild of Dominic Cummings, an adviser to former UK prime minister Boris Johnson who has long wanted to shake up UK science funding. "A small group of people can make a huge breakthrough with little money but the right structure, the right ways of thinking," Cummings wrote in 2017. He was inspired by the US's Advanced Research Projects Agency (ARPA), which spurred computer science as a discipline and created a forerunner of the internet in the 1960s and 1970s. It did this, in the words of one of its leading scientists, by having "visions rather than goals" and because it "funded people, not projects".

flanagan, new aria science agency deliver, programme director, (8 more...)

New Scientist

Country:

North America > United States (1.00)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.05)
Europe > North Sea (0.05)
Atlantic Ocean > North Atlantic Ocean > North Sea (0.05)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology: Information Technology > Artificial Intelligence (0.32)

Ukraine, Russia report downing dozens of drones over Kyiv, Crimea

Al Jazeera, Sep-10-2023, 04:32:47 GMT

Ukraine has reported downing more than two dozen Russian drones over the country's capital, Kyiv, as Russia's defence ministry announced the destruction of eight Ukrainian drones near the annexed Crimean peninsula. The extent of the damage from the rival attacks early on Sunday was not immediately clear. Kyiv Mayor Vitali Klitschko said that at least one person was wounded in the city's historic Podil neighbourhood and a fire broke out near one of its parks. Debris from downed drones fell on the Darnytskyi, Solomianskyi, Shevchenkivskyi, Sviatoshynskyi and Podil districts, Klitschko and the city's military administration said. In the Shevchenkivskyi district, debris sparked a fire in an apartment, which was quickly extinguished.

drone, kyiv, ukraine, (12 more...)

Al Jazeera

Country:

Asia > Russia (1.00)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.96)
Europe > Russia (0.68)
(2 more...)

Industry:

Government > Military (1.00)
Government > Regional Government > Europe Government > Russia Government (0.41)
Government > Regional Government > Asia Government > Russia Government (0.41)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Detective McDavitt and the Curious Case of the Clown Wedgefish

Mother Jones, Sep-9-2023, 10:00:35 GMT

How do you find an elusive animal that most people have never even seen dead in a fish market? Matthew McDavitt knows how. This story was originally published by Hakai Magazine and is reproduced here as part of the Climate Desk collaboration. Peter Kyne sits down at his desk to write a eulogy for a fish he's never met. No scientist has seen signs of the critically endangered Rhynchobatus cooki, or clown wedgefish, since a dead one turned up at a fish market in 1996. Kyne, a conservation biologist at Charles Darwin University in Australia who studies wedgefish, has worked only with preserved specimens of the spotted sea creature. "This thing's dust," Kyne thinks, feeling defeated as he writes the somber news in a draft assessment of the global conservation status of wedgefish species for the International Union for Conservation of Nature. Wedgefish are a type of ray.

clown wedgefish, mcdavitt, wedgefish, (14 more...)

Mother Jones

Country:

Oceania > Australia (0.25)
Asia > Singapore (0.05)
Asia > Indonesia > Sumatra (0.05)
(5 more...)

Industry: Food & Agriculture > Fishing (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.95)

Analysis of Disinformation and Fake News Detection Using Fine-Tuned Large Language Model

Pavlyshenko, Bohdan M.

arXiv.org Artificial Intelligence, Sep-9-2023

The paper considers the possibility of fine-tuning the Llama 2 large language model (LLM) for disinformation analysis and fake news detection. For fine-tuning, a PEFT/LoRA-based approach was used. In the study, the model was fine-tuned for the following tasks: analysing a text to reveal disinformation and propaganda narratives, fact checking, fake news detection, manipulation analytics, and extracting named entities with their sentiments. The obtained results show that the fine-tuned Llama 2 model can perform a deep analysis of texts and reveal complex styles and narratives. Extracted sentiments for named entities can be considered as predictive features in supervised machine learning models.

narrative, sentiment, ukraine, (15 more...)

arXiv.org Artificial Intelligence

2309.04704

Country:

Asia > Russia (0.95)
Europe > Bulgaria (0.28)
Europe > Russia (0.06)
(15 more...)

Genre:

Personal (1.00)
Research Report > New Finding (0.34)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government > Russia Government (0.68)
Government > Regional Government > Asia Government > Russia Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
