Meta to report quarterly earnings amid tariff uncertainty and AI investment

The Guardian

Meta is set to report its first-quarter earnings on Wednesday after the bell, and investors will be looking for news on whether the company met its quarterly revenue guidance of between $39.5bn and $41.8bn. Wall Street is projecting the company will post $41.36bn in revenue on $5.21 in earnings per share. While Meta has repeatedly beaten Wall Street expectations in the past few quarters, analysts were disappointed by the first-quarter revenue outlook Meta chief executive Mark Zuckerberg shared at the end of 2024. The company is also planning on spending up to $65bn on AI infrastructure by the end of 2025. Uncertainty over Donald Trump's sweeping tariffs may yet roil ad markets, clouding the company's financial outlook for near-future quarters.


Meta rides AI boom to stellar quarterly earnings, but slightly less than expected

The Guardian

Meta's blowout year continues after the company reported another stellar financial quarter on Wednesday. But shares fell in after-hours trading after the company missed Wall Street expectations for daily active users. Wall Street analysts had high expectations for the Instagram and WhatsApp parent company, projecting an 18% jump in sales year over year. The company reported $40.6bn in sales, a 19% increase year over year that outpaced investor expectations of $40.19bn. Meta, which saw a 25% jump in its share price over the past two months, reported $6.03 in earnings per share (EPS), surpassing Wall Street's expectations of an EPS of $5.29.


Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks

Liu, Fanghui, Dadi, Leello, Cevher, Volkan

arXiv.org Machine Learning

Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space for modeling functions by neural networks, as the curse of dimensionality (CoD) cannot be evaded even when approximating a single ReLU neuron (Bach, 2017). In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms (e.g., the path norm, the Barron norm) from the perspective of sample complexity and generalization properties. First, we show that the path norm (as well as the Barron norm) yields width-independent sample complexity bounds, which allows for uniform convergence guarantees. Based on this result, we derive an improved metric entropy estimate for $\epsilon$-covering, up to $O(\epsilon^{-\frac{2d}{d+2}})$ (where $d$ is the input dimension and the constant depends at most linearly on $d$), via the convex hull technique; this demonstrates a separation from kernel methods, which require $\Omega(\epsilon^{-d})$ to learn a target function in a Barron space. Second, this metric entropy result allows us to build a sharper generalization bound under a general moment hypothesis, achieving the rate $O(n^{-\frac{d+2}{2d+2}})$. Our analysis is novel in that it offers a sharper, refined estimate of metric entropy with linear dimension dependence, and handles unbounded sampling in the estimation of the sample error and the output error.
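To make the stated separation and rates easier to compare, the exponents from the abstract can be set side by side (a restatement of the abstract's rates, not new analysis; the generalization exponent decreases from 3/4 at d = 1 toward the parametric 1/2 as d grows):

```latex
\underbrace{O\!\left(\epsilon^{-\frac{2d}{d+2}}\right)}_{\text{Barron-norm class}}
\quad \text{vs.} \quad
\underbrace{\Omega\!\left(\epsilon^{-d}\right)}_{\text{kernel methods}},
\qquad \frac{2d}{d+2} < d \ \text{ for all } d \ge 1,
\qquad
\frac{d+2}{2d+2}\bigg|_{d=1} = \frac{3}{4},
\qquad
\lim_{d\to\infty} \frac{d+2}{2d+2} = \frac{1}{2}.
```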


Compressed Online Learning of Conditional Mean Embedding

Hou, Boya, Sanjari, Sina, Koppel, Alec, Bose, Subhonmesh

arXiv.org Machine Learning

The conditional mean embedding (CME) encodes Markovian stochastic kernels through their actions on probability distributions embedded within reproducing kernel Hilbert spaces (RKHS). The CME plays a key role in several well-known machine learning tasks, such as reinforcement learning and the analysis of dynamical systems. We present an algorithm to learn the CME incrementally from data via an operator-valued stochastic gradient descent. As is well known, function learning in an RKHS suffers from scalability challenges with large datasets; we utilize a compression mechanism to counter this challenge. The core contribution of this paper is a finite-sample performance guarantee on the last iterate of the online compressed operator learning algorithm with fast-mixing Markovian samples, when the target CME may not be contained in the hypothesis space. We illustrate the efficacy of our algorithm by applying it to the analysis of an example dynamical system.
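As an illustration of the flavor of such an algorithm (not the authors' method): the sketch below approximates the RKHS with a fixed budget of random Fourier features - a crude stand-in for the paper's compression mechanism - and runs an operator-valued SGD on samples from a fast-mixing AR(1) Markov chain. All names and constants here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
D = 50                                    # random-feature budget (the "compression")
w_feat = rng.normal(size=D)
b_feat = rng.uniform(0.0, 2.0 * np.pi, size=D)

def phi(x):
    # Random Fourier features approximating a Gaussian-kernel RKHS embedding.
    return np.sqrt(2.0 / D) * np.cos(w_feat * x + b_feat)

W = np.zeros((D, D))                      # finite-rank surrogate for the CME operator
x, eta = 0.0, 0.05
losses = []
for t in range(2000):
    x_next = 0.5 * x + rng.normal(scale=0.5)   # fast-mixing AR(1) Markov chain
    f, f_next = phi(x), phi(x_next)
    resid = W @ f - f_next
    losses.append(float(resid @ resid))
    W -= eta * np.outer(resid, f)              # operator-valued SGD step
    x = x_next

early, late = np.mean(losses[:200]), np.mean(losses[-200:])
```

If the operator surrogate is learning anything, the streaming residual should shrink as samples accumulate (late average below early average); the fixed feature budget D is what keeps the per-step cost constant.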


Site-specific Deterministic Temperature and Humidity Forecasts with Explainable and Reliable Machine Learning

Han, MengMeng, Leeuwenburg, Tennessee, Murphy, Brad

arXiv.org Artificial Intelligence

Site-specific weather forecasts are essential to accurate prediction of power demand and are consequently of great interest to energy operators. However, weather forecasts from current numerical weather prediction (NWP) models lack the fine-scale detail needed to capture all important characteristics of localised real-world sites. Instead, they provide weather information representative of a rectangular gridbox (usually kilometres in size). Even after post-processing and bias correction, this area-averaged information is usually not optimal for specific sites. Prior work on site-optimised forecasts has focused on linear methods, weighted consensus averaging, time-series methods, and others. Recent developments in machine learning (ML) have prompted increasing interest in applying ML as a novel approach to this problem. In this study, we investigate the feasibility of optimising forecasts at sites using gradient-boosted decision trees, as implemented in the Python version of the XGBoost package. Regression trees were trained on historical NWP output and site observations, with the aim of predicting temperature and dew point at multiple site locations across Australia. We developed a working ML framework, named 'Multi-SiteBoost', and initial testing shows a significant improvement compared with gridded values from bias-corrected NWP models. The improvement from XGBoost is found to be comparable with non-ML methods reported in the literature. With the insights provided by SHapley Additive exPlanations (SHAP), this study also tests various approaches to interpreting the ML predictions and increasing the reliability of the forecasts generated by ML.
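A minimal sketch of the site-optimisation idea on synthetic data, using scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost (the features, the site bias, and the diurnal-like term are all invented for illustration, not taken from the paper):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for historical NWP features at one site:
# columns = gridbox temperature, humidity, wind speed, hour-of-day proxy.
X = rng.normal(size=(500, 4))
# "Observed" site temperature: gridbox value plus a site-specific bias and a
# diurnal-like effect that the area-averaged NWP field cannot resolve.
y = X[:, 0] + 0.5 * np.sin(X[:, 3]) + 0.3 + rng.normal(scale=0.1, size=500)

model = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                  learning_rate=0.05, random_state=0)
model.fit(X[:400], y[:400])
pred = model.predict(X[400:])

raw_mse = float(np.mean((X[400:, 0] - y[400:]) ** 2))  # use gridbox value directly
ml_mse = float(np.mean((pred - y[400:]) ** 2))         # site-optimised prediction
```

The boosted trees learn the site bias and the nonlinear diurnal term from (NWP feature, observation) pairs, so the held-out error should drop below that of the raw gridbox value - the same comparison the paper makes against bias-corrected NWP output.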


Distributed Multi-Task Learning for Stochastic Bandits with Context Distribution and Stage-wise Constraints

Lin, Jiabin, Moothedath, Shana

arXiv.org Artificial Intelligence

We study the problem of conservative distributed multi-task learning in stochastic linear contextual bandits with heterogeneous agents. This extends conservative linear bandits to a distributed setting where M agents tackle different but related tasks while adhering to stage-wise performance constraints. The exact context is unknown, and only a context distribution is available to the agents, as in many practical applications that involve a prediction mechanism to infer context, such as stock market prediction and weather forecasting. We propose a distributed upper confidence bound (UCB) algorithm, DiSC-UCB. Our algorithm constructs a pruned action set during each round to ensure the constraints are met. Additionally, it includes synchronized sharing of estimates among agents via a central server using well-structured synchronization steps. We prove regret and communication bounds for the algorithm. We then extend the problem to a setting where the agents are unaware of the baseline reward. For this setting, we provide a modified algorithm, DiSC-UCB2, and show that it achieves the same regret and communication bounds. We empirically validate the performance of our algorithms on synthetic data and real-world MovieLens-100K data.
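The stage-wise pruning idea can be sketched for a single agent (the paper's setting is distributed and multi-task; the known baseline reward, the confidence radius, and all constants below are illustrative assumptions): at each round, arms whose pessimistic lower-confidence estimate would violate the (1 - alpha) performance constraint are pruned, and the safe baseline action is played if no arm survives.

```python
import numpy as np

rng = np.random.default_rng(1)
d, T, alpha = 3, 300, 0.2                 # dimension, rounds, allowed shortfall
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)            # unknown true parameter
A, b = np.eye(d), np.zeros(d)             # ridge-regression statistics
baseline_reward = 0.5                     # known safe-baseline reward (assumption)
total, baseline_total = 0.0, 0.0

for t in range(1, T + 1):
    arms = rng.normal(size=(10, d))       # this round's candidate contexts
    theta_hat = np.linalg.solve(A, b)
    A_inv = np.linalg.inv(A)
    beta = 1.0 + np.sqrt(np.log(t + 1.0))                 # heuristic radius
    widths = np.sqrt(np.einsum('id,dk,ik->i', arms, A_inv, arms))
    ucb = arms @ theta_hat + beta * widths
    lcb = arms @ theta_hat - beta * widths
    # Stage-wise constraint: keep only arms whose pessimistic estimate
    # preserves at least (1 - alpha) of the baseline's cumulative reward.
    ok = total + lcb >= (1.0 - alpha) * (baseline_total + baseline_reward)
    if ok.any():
        i = int(np.argmax(np.where(ok, ucb, -np.inf)))
        x = arms[i]
        r = float(x @ theta) + rng.normal(scale=0.1)
        A += np.outer(x, x)
        b += r * x
        total += r
    else:
        total += baseline_reward          # no arm is provably safe: play baseline
    baseline_total += baseline_reward
```

Early on, wide confidence intervals force the baseline action; as the slack (1 - alpha) x cumulative baseline reward accumulates, UCB exploration becomes admissible.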


Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling

Bae, Wonho, Wang, Jing, Sutherland, Danica J.

arXiv.org Artificial Intelligence

Most meta-learning methods assume that the (very small) context set used to establish a new task at test time is passively provided. In some settings, however, it is feasible to actively select which points to label; the potential gain from a careful choice is substantial, but the setting requires major departures from typical active learning setups. We clarify the ways in which active meta-learning can be used to label a context set, depending on which parts of the meta-learning process use active learning. Within this framework, we propose a natural algorithm based on fitting Gaussian mixtures for selecting which points to label; though simple, the algorithm also has theoretical motivation. The proposed algorithm outperforms state-of-the-art active learning methods when used with various meta-learning algorithms across several benchmark datasets.

Meta-learning has gained significant prominence as a substitute for traditional "plain" supervised learning, with the aim of adapting or generalizing to new tasks given extremely limited data. There has been enormous success compared to learning "from scratch" on each new problem, but could we do even better, with even less data? One major way to improve data-efficiency in standard supervised learning settings is to move to an active learning paradigm, where typically a model can request a small number of labels from a pool of unlabeled data; these are collected, used to further train the model, and the process is repeated. Although each of these lines of research is quite developed, their combination - active meta-learning - has seen comparatively little research attention. Given that both focus on improving data efficiency, it seems very natural to investigate them together. How can a meta-learner exploit an active learning setup to learn the best model possible, using only a very small number of labels in its context sets?
We are aware of two previous attempts at active selection of context sets in meta-learning: Müller et al. (2022) do so at meta-training time for text classification, while Boney & Ilin (2017) do it at meta-test time in semi-supervised few-shot image classification with ProtoNet (Snell et al., 2017). "Active meta-learning" thus means very different things in their procedures; these approaches are also entirely different from work on active selection of tasks during meta-training (as in Kaddour et al., 2020; Nikoloska & Simeone, 2022; Kumar et al., 2022). Our first contribution is therefore to clarify the different ways in which active learning can be applied to meta-learning, for differing purposes.
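A toy sketch of the Gaussian-mixture selection idea (a guess at the spirit of the method, not the paper's algorithm): fit a mixture to the unlabeled pool and spend the labeling budget on the pool points nearest each component mean, so the labeled context set covers the modes of the input distribution.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Unlabeled pool of candidate context points for a new task (three blobs).
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
pool = np.vstack([rng.normal(loc=c, scale=0.3, size=(40, 2)) for c in centers])

k = 3                                     # labeling budget = mixture components
gmm = GaussianMixture(n_components=k, random_state=0).fit(pool)
# Spend each label on the pool point closest to a fitted component mean.
chosen = [int(np.argmin(np.linalg.norm(pool - mu, axis=1))) for mu in gmm.means_]
context_set = pool[chosen]
```

With well-separated blobs the three selected points land in distinct clusters, giving a context set far more informative than a uniform random draw of the same size.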


Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

Ye, Seonghyeon, Kim, Doyoung, Jang, Joel, Shin, Joongbo, Seo, Minjoon

arXiv.org Artificial Intelligence

Meta-training, which fine-tunes the language model (LM) on various downstream tasks by maximizing the likelihood of the target label given the task instruction and input instance, has improved zero-shot task generalization performance. However, meta-trained LMs still struggle to generalize to challenging tasks containing novel labels unseen during meta-training. In this paper, we propose Flipped Learning, an alternative method of meta-training which trains the LM to generate the task instruction given the input instance and label. During inference, the LM trained with Flipped Learning, referred to as Flipped, selects the label option that is most likely to generate the task instruction. On 14 tasks of the BIG-bench benchmark, the 11B-sized Flipped outperforms zero-shot T0-11B and even a 16-times-larger 3-shot GPT-3 (175B) on average by 8.4 and 9.7 percentage points, respectively. Flipped gives particularly large improvements on tasks with unseen labels, outperforming T0-11B by up to +20% average F1 score. This indicates that the strong task generalization of Flipped comes from improved generalization to novel labels. We release our code at https://github.com/seonghyeonye/Flipped-Learning.
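The Flipped inference rule - score each candidate label by how likely the task instruction is given (input, label), then pick the argmax - can be sketched with a toy scorer standing in for the LM log-likelihood (the scorer below is a fabricated proxy for illustration only):

```python
def instruction_loglik(instruction: str, text: str, label: str) -> float:
    # Stand-in scorer: a real system would return an LM's log-likelihood of
    # `instruction` conditioned on (text, label). This toy proxy rewards
    # label words appearing in the input text, with a mild length penalty.
    overlap = sum(word in text.lower() for word in label.lower().split())
    return overlap - 0.1 * len(label)

def flipped_predict(instruction: str, text: str, label_options):
    # Flipped inference: choose the label under which the task instruction
    # is most likely to have been generated.
    return max(label_options,
               key=lambda lab: instruction_loglik(instruction, text, lab))

pred = flipped_predict("Is the review positive or negative?",
                       "the movie was wonderful and positive throughout",
                       ["positive", "negative"])
```

The key point is that the label enters only through the conditioning side of the score, which is why novel label strings can still be ranked at inference time.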


Pushing the Efficiency-Regret Pareto Frontier for Online Learning of Portfolios and Quantum States

Zimmert, Julian, Agarwal, Naman, Kale, Satyen

arXiv.org Machine Learning

We revisit the classical online portfolio selection problem. It is widely assumed that a trade-off between computational complexity and regret is unavoidable, with Cover's Universal Portfolios algorithm, SOFT-BAYES and ADA-BARRONS currently constituting its state-of-the-art Pareto frontier. In this paper, we present the first efficient algorithm, BISONS, that obtains polylogarithmic regret with memory and per-step running time requirements that are polynomial in the dimension, displacing ADA-BARRONS from the Pareto frontier. Additionally, we resolve a COLT 2020 open problem by showing that a certain Follow-The-Regularized-Leader algorithm with log-barrier regularization suffers an exponentially larger dependence on the dimension than previously conjectured. Thus, we rule out this algorithm as a candidate for the Pareto frontier. We also extend our algorithm and analysis to a more general problem than online portfolio selection, viz. online learning of quantum states with log loss. This algorithm, called SCHRODINGER'S BISONS, is the first efficient algorithm with polylogarithmic regret for this more general problem.


Universal Online Learning: an Optimistically Universal Learning Rule

Blanchard, Moïse

arXiv.org Machine Learning

We study universal online learning with non-i.i.d. processes and bounded losses. The notion of a universally consistent learning rule was defined by Hanneke in an effort to study learning theory under minimal assumptions, where the objective is to obtain low long-run average loss for any target function. We are interested in characterizing the processes for which learning is possible, and in whether there exist learning rules guaranteed to be universally consistent under the sole assumption that such learning is possible. The case of unbounded losses is very restrictive: there, the learnable processes almost surely visit only a finite number of points, and as a result simple memorization is optimistically universal. We therefore focus on the bounded setting and give a complete characterization of the processes admitting strong and weak universal learning. We further show that the k-nearest neighbor algorithm (kNN) is not optimistically universal, and present a novel variant of 1NN which is optimistically universal for general input and value spaces in both the strong and weak settings. This closes all the COLT 2021 open problems posed by Hanneke on universal online learning.
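The observation that memorization suffices when a process visits only finitely many points can be checked with a small simulation (a 1NN learner that stores everything it sees; the process and target below are arbitrary illustrations, not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(3)
points = rng.normal(size=(5, 2))          # the finitely many points visited
target = {i: i % 2 for i in range(5)}     # an arbitrary binary target function
memory_X, memory_y, losses = [], [], []

for t in range(500):
    i = t % 5                             # a deterministic, non-i.i.d. process
    x, y = points[i], target[i]
    if memory_X:
        dists = np.linalg.norm(np.array(memory_X) - x, axis=1)
        pred = memory_y[int(np.argmin(dists))]
    else:
        pred = 0                          # arbitrary first guess
    losses.append(int(pred != y))
    memory_X.append(x)                    # memorize everything seen so far
    memory_y.append(y)

avg_recent_loss = float(np.mean(losses[-100:]))
```

After each point has been seen once, every query has an exact match in memory, so the long-run average loss is zero - the bounded-loss setting studied in the paper is precisely where this trivial strategy stops being enough.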