AITopics

2402.12335

Country:

North America > United States > California (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals (0.48)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.86)

Daily Mail - Science & tech, Feb-18-2024, 16:55:53 GMT

Six cutting-edge technologies that could reverse global warming: From dumping WHALE POOP in the sea to engineering CLOUDS to block out sun

Around the world, ambitious projects are testing everything from seeding clouds with chemicals to pouring artificial whale excrement into the sea. The goal is to remove CO2 from the atmosphere via so-called 'geoengineering' and 'carbon capture' processes - and help to mitigate climate change. Geoengineering sees heat from the sun reflected back into space to limit climate change, while 'carbon capture' captures CO2 from the air, either directly or by capturing it in rain among other techniques. The White House cautiously supported further research into an idea straight out of science fiction - 'blocking the sun' to cool the atmosphere - in a report last year. The federally mandated report said that there is 'a compelling case for research to better understand both the potential benefits and risks'.

artificial intelligence, climate change, survey article, (13 more...)

Country: North America > United States (0.37)

Genre:

Overview > Innovation (0.40)
Research Report (0.35)

Industry:

Energy (1.00)
Materials > Chemicals (0.50)

Technology: Information Technology > Artificial Intelligence (0.35)

Gao, Wenhao, Raghavan, Priyanka, Shprints, Ron, Coley, Connor W.

Substrate Scope Contrastive Learning: Repurposing Human Bias to Learn Atomic Representations

arXiv.org Artificial Intelligence, Feb-18-2024

Learning molecular representation is a critical step in molecular machine learning that significantly influences modeling success, particularly in data-scarce situations. The concept of broadly pre-training neural networks has advanced fields such as computer vision, natural language processing, and protein engineering. However, similar approaches for small organic molecules have not achieved comparable success. In this work, we introduce a novel pre-training strategy, substrate scope contrastive learning, which learns atomic representations tailored to chemical reactivity. This method considers the grouping of substrates and their yields in published substrate scope tables as a measure of their similarity or dissimilarity in terms of chemical reactivity. We focus on 20,798 aryl halides in the CAS Content Collection spanning thousands of publications to learn a representation of aryl halide reactivity. We validate our pre-training approach through both intuitive visualizations and comparisons to traditional reactivity descriptors and physical organic chemistry principles. The versatility of these embeddings is further evidenced in their application to yield prediction, regioselectivity prediction, and the diverse selection of new substrates. This work not only presents a chemistry-tailored neural network pre-training strategy to learn reactivity-aligned atomic representations, but also marks a first-of-its-kind approach to benefit from the human bias in substrate scope design.

aryl halide, molecule, representation, (14 more...)

2402.16882

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Materials > Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Wettig, Alexander, Gupta, Aatmik, Malik, Saumya, Chen, Danqi

QuRating: Selecting High-Quality Data for Training Language Models

Selecting high-quality pre-training data is important for creating capable language models, but existing methods rely on simple heuristics. We introduce QuRating, a method for selecting pre-training data that captures the abstract qualities of texts which humans intuitively perceive. In this paper, we investigate four qualities - writing style, required expertise, facts & trivia, and educational value. We find that LLMs are able to discern these qualities and observe that they are better at making pairwise judgments of texts than at rating the quality of a text directly. We train a QuRater model to learn scalar ratings from pairwise judgments, and use it to annotate a 260B training corpus with quality ratings for each of the four criteria. In our experiments, we select 30B tokens according to the different quality ratings and train 1.3B-parameter language models on the selected data. We find that it is important to balance quality and diversity, as selecting only the highest-rated documents leads to poor results. When we sample using quality ratings as logits over documents, our models achieve lower perplexity and stronger in-context learning performance than baselines. Beyond data selection, we use the quality ratings to construct a training curriculum which improves performance without changing the training dataset. We extensively analyze the quality ratings and discuss their characteristics, biases, and wider implications.
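
The sampling step mentioned in the abstract above (quality ratings used as logits over documents) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `sample_by_quality`, the `temperature` parameter, and the choice to sample without replacement are all assumptions made for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_by_quality(ratings, k, temperature=1.0, seed=0):
    """Draw k distinct document indices, treating ratings as logits.

    Hypothetical helper: each draw picks one remaining document with
    probability softmax(rating / temperature), then removes it.
    """
    rng = random.Random(seed)
    remaining = list(range(len(ratings)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        probs = softmax([ratings[i] / temperature for i in remaining])
        pick = rng.choices(range(len(remaining)), weights=probs, k=1)[0]
        chosen.append(remaining.pop(pick))
    return chosen


if __name__ == "__main__":
    toy_ratings = [0.2, 1.5, -0.7, 3.0, 0.9]  # made-up scalar quality ratings
    print(sample_by_quality(toy_ratings, k=3, temperature=2.0))
```

In this sketch a larger `temperature` flattens the distribution toward uniform sampling, while a smaller one concentrates the draws on the highest-rated documents, which is one way to trade quality against diversity.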

large language model, machine learning, southern asia 10, (22 more...)

2402.09739

Country:

North America > United States > Texas (0.67)
Europe > Russia (0.67)
Asia > Middle East > Republic of Türkiye (0.67)
(41 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Personal (1.00)
Instructional Material (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Retail (1.00)
(40 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Symmetry-Breaking Augmentations for Ad Hoc Teamwork

Hammond, Ravi, Craggs, Dustin, Guo, Mingyu, Foerster, Jakob, Reid, Ian

In many collaborative settings, artificial intelligence (AI) agents must be able to adapt to new teammates that use unknown or previously unobserved strategies. While often simple for humans, this can be challenging for AI agents. For example, if an AI agent learns to drive alongside others (a training set) that only drive on one side of the road, it may struggle to adapt this experience to coordinate with drivers on the opposite side, even if their behaviours are simply flipped along the left-right symmetry. To address this, we introduce symmetry-breaking augmentations (SBA), which increase diversity in the behaviour of training teammates by applying a symmetry-flipping operation. By learning a best-response to the augmented set of teammates, our agent is exposed to a wider range of behavioural conventions, improving performance when deployed with novel teammates. We demonstrate this experimentally in two settings, and show that our approach improves upon previous ad hoc teamwork results in the challenging card game Hanabi. We also propose a general metric for estimating symmetry-dependency amongst a given set of policies.
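
The driving example in the abstract above can be illustrated with a toy sketch of a symmetry-flipping augmentation. Everything here is hypothetical and chosen for illustration only: policies are plain functions from a string observation to a string action, the only symmetry is a left/right swap, and the names `flip`, `augment`, and `keep_left` do not come from the paper.

```python
from typing import Callable, Dict, List

Policy = Callable[[str], str]

# Assumed symmetry for the toy setting: swap "left" and "right".
FLIP: Dict[str, str] = {"left": "right", "right": "left"}


def flip(symbol: str) -> str:
    """Apply the left/right swap; symbols outside the map are unchanged."""
    return FLIP.get(symbol, symbol)


def augment(policy: Policy) -> Policy:
    """Wrap a teammate policy so it acts in the mirrored world.

    The observation is flipped before the policy sees it and the action
    is flipped back afterwards. The swap is its own inverse.
    """
    def flipped(obs: str) -> str:
        return flip(policy(flip(obs)))
    return flipped


def keep_left(obs: str) -> str:
    """Hypothetical training teammate that always drives on the left."""
    return "left"


# Training pool: original teammates plus their mirrored counterparts.
teammates: List[Policy] = [keep_left]
pool: List[Policy] = teammates + [augment(p) for p in teammates]

if __name__ == "__main__":
    for p in pool:
        print(p("left"), p("right"))
```

An agent trained as a best response to `pool` rather than `teammates` would, in this toy setting, see both the keep-left and the keep-right convention during training.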

agent, artificial intelligence, machine learning, (11 more...)

2402.09984

Country:

Oceania > Australia (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Thailand (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Games (1.00)
Materials > Chemicals > Industrial Gases > Liquified Gas (0.46)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.46)
Energy > Oil & Gas > Midstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Berrisford, Liam J, Barbosa, Hugo, Menezes, Ronaldo

A Data-Driven Supervised Machine Learning Approach to Estimating Global Ambient Air Pollution Concentrations With Associated Prediction Intervals

Global ambient air pollution, a transboundary challenge, is typically addressed through interventions relying on data from spatially sparse and heterogeneously placed monitoring stations. These stations often encounter temporal data gaps due to issues such as power outages. In response, we have developed a scalable, data-driven, supervised machine learning framework. This model is designed to impute missing temporal and spatial measurements, thereby generating a comprehensive dataset for pollutants including NO$_2$, O$_3$, PM$_{10}$, PM$_{2.5}$, and SO$_2$. The dataset, with a fine granularity of 0.25$^{\circ}$ at hourly intervals and accompanied by prediction intervals for each estimate, caters to a wide range of stakeholders relying on outdoor air pollution data for downstream assessments. This enables more detailed studies. Additionally, the model's performance across various geographical locations is examined, providing insights and recommendations for strategic placement of future monitoring stations to further enhance the model's accuracy.

artificial intelligence, decision tree learning, machine learning, (13 more...)

2402.10248

Country:

Asia > China (0.68)
Oceania (0.14)
Europe > United Kingdom > England (0.14)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Materials (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

arXiv.org Machine Learning, Feb-15-2024

Mathematical Opportunities in Digital Twins (MATH-DT)

Antil, Harbir

The report describes the discussions from the Workshop on Mathematical Opportunities in Digital Twins (MATH-DT), held December 11-13, 2023, at George Mason University. It illustrates that foundational mathematical advances are required for Digital Twins (DTs) that are different from traditional approaches. A traditional model, in biology, physics, engineering or medicine, starts with a generic physical law (e.g., equations) and is often a simplification of reality. A DT starts with a specific ecosystem, object or person (e.g., personalized care) representing reality, requiring multi-scale, multi-physics modeling and coupling. Thus, these processes begin at opposite ends of the simulation and modeling pipeline, requiring different reliability criteria and uncertainty assessments. Additionally, unlike existing approaches, a DT assists humans to make decisions for the physical system, which (via sensors) in turn feeds data into the DT, and operates for the life of the physical system. While some of the foundational mathematical research can be done without a specific application context, one must also keep specific applications in mind for DTs. For example, modeling a bridge, a biological system (a patient), or a socio-technical system (a city) are very different tasks. The models range from differential equations (deterministic/uncertain) in engineering, to stochastic in biology, including agent-based. These are multi-scale hybrid models or large scale (multi-objective) optimization problems under uncertainty. There are no universal models or approaches. For example, Kalman filters for forecasting might work in engineering, but can fail in the biomedical domain. Ad hoc studies, with limited systematic work, have shown that AI/ML methods can fail for simple engineering systems and can work well for biomedical problems. A list of 'Mathematical Opportunities and Challenges' concludes the report.

artificial intelligence, machine learning, optimization problem, (16 more...)

2402.10326

Country:

North America > United States > Maryland (0.46)
Europe > Netherlands (0.28)
Europe > Germany (0.14)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Oil & Gas > Upstream (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Self-consistent Validation for Machine Learning Electronic Structure

Hu, Gengyuan, Wei, Gengchen, Lou, Zekun, Torr, Philip H. S., Ouyang, Wanli, Zhong, Han-sen, Lin, Chen

Shanghai Artificial Intelligence Laboratory and Department of Engineering, University of Oxford (Dated: February 16, 2024)

Machine learning has emerged as a significant approach to efficiently tackle electronic structure problems. Despite its potential, there is less guarantee that the model generalizes to unseen data, which hinders its application in real-world scenarios. To address this issue, a technique has been proposed to estimate the accuracy of the predictions. This method integrates machine learning with self-consistent field methods to achieve both low validation cost and interpretability. This, in turn, enables exploration of the model's ability with active learning and instills confidence in its integration into real-world studies.

dataset, matrix, self-diis error, (15 more...)

2402.10186

Country:

Asia > China > Shanghai > Shanghai (0.24)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.24)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Nastl, Vivian Y., Hardt, Moritz

Predictors from causal features do not generalize better to new domains

We study how well machine learning models trained on causal features generalize across domains. We consider 16 prediction tasks on tabular datasets covering applications in health, employment, education, social benefits, and politics. Each dataset comes with multiple domains, allowing us to test how well a model trained in one domain performs in another. For each prediction task, we select features that have a causal influence on the target of prediction. Our goal is to test the hypothesis that models trained on causal features generalize better across domains. Without exception, we find that predictors using all available features, regardless of causality, have better in-domain and out-of-domain accuracy than predictors using causal features. Moreover, even the absolute drop in accuracy from one domain to the other is no better for causal predictors than for models that use all features. If the goal is to generalize to new domains, practitioners might as well train the best possible model on all available features.

accuracy, causal feature, test 0, (12 more...)

2402.09891

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Alaska (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.65)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(9 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Information Management (0.92)

arXiv.org Artificial Intelligence, Feb-14-2024

How to Train Data-Efficient LLMs

Sachdeva, Noveen, Coleman, Benjamin, Kang, Wang-Cheng, Ni, Jianmo, Hong, Lichan, Chi, Ed H., Caverlee, James, McAuley, Julian, Cheng, Derek Zhiyuan

The training of large language models (LLMs) is expensive. In this paper, we study data-efficient approaches for pre-training LLMs, i.e., techniques that aim to optimize the Pareto frontier of model quality and training resource/data consumption. We seek to understand the tradeoffs associated with data selection routines based on (i) expensive-to-compute data-quality estimates, and (ii) maximization of coverage and diversity-based measures in the feature space. Our first technique, Ask-LLM, leverages the zero-shot reasoning capabilities of instruction-tuned LLMs to directly assess the quality of a training example. To target coverage, we propose Density sampling, which models the data distribution to select a diverse sample. In our comparison of 19 samplers, involving hundreds of evaluation tasks and pre-training runs, we find that Ask-LLM and Density are the best methods in their respective categories. Coverage sampling can recover the performance of the full data, while models trained on Ask-LLM data consistently outperform full-data training -- even when we reject 90% of the original dataset, while converging up to 70% faster.
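
The quality-based selection idea in the abstract above (score each training example, then keep only a small fraction) can be sketched as a generic top-fraction filter. This is only an illustration under stated assumptions: the abstract does not give the scoring prompt or the selection rule, so `select_top_fraction` and its parameters are hypothetical, and `toy_score` is a made-up stand-in for a call to an instruction-tuned LLM.

```python
from typing import Callable, List


def select_top_fraction(
    examples: List[str],
    score_fn: Callable[[str], float],
    keep_fraction: float = 0.1,
) -> List[str]:
    """Keep the highest-scoring fraction of examples (hypothetical helper).

    score_fn is assumed to return a quality estimate where higher is
    better. At least one example is kept when the input is non-empty.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(examples) * keep_fraction))
    ranked = sorted(examples, key=score_fn, reverse=True)
    return ranked[:n_keep]


def toy_score(text: str) -> float:
    """Placeholder scorer: fraction of alphabetic characters.

    This is NOT a real quality model. It only makes the sketch runnable.
    """
    if not text:
        return 0.0
    return sum(ch.isalpha() for ch in text) / len(text)


if __name__ == "__main__":
    docs = ["clean prose here", "$$$ 123 !!!", "mixed 50/50 text", "1234567890"]
    # keep_fraction=0.5 keeps the two highest-scoring documents
    print(select_top_fraction(docs, toy_score, keep_fraction=0.5))
```

With `keep_fraction=0.1` the filter discards 90% of the examples, matching the rejection rate mentioned in the abstract. The ranking itself depends entirely on the scorer supplied.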

perplexity, t5-large, token, (13 more...)

2402.09668

Country:

Asia > India > Tamil Nadu > Chennai (0.04)
North America > United States > South Carolina (0.04)
North America > United States > New Mexico (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.65)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)