Constrained Meta Agnostic Reinforcement Learning
Karam Daaboul, Florian Kuhm, Tim Joseph, J. Marius Zoellner
Meta-Reinforcement Learning (Meta-RL) aims to acquire meta-knowledge for quick adaptation to diverse tasks. However, applying these policies in real-world environments presents a significant challenge: balancing rapid adaptability with adherence to environmental constraints. Our novel approach, Constraint Model Agnostic Meta Learning (C-MAML), merges meta-learning with constrained optimization to address this challenge. C-MAML enables rapid and efficient task adaptation by incorporating task-specific constraints directly into its meta-algorithm framework during the training phase. This fusion yields safer initial parameters for learning new tasks. We demonstrate the effectiveness of C-MAML on simulated wheeled-robot locomotion tasks of varying complexity, highlighting its practicality and robustness in dynamic environments.
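The constrained meta-update described above can be illustrated with a toy first-order sketch: task adaptation and the meta-step both descend a Lagrangian that folds a task cost into the loss. The quadratic tasks, the fixed multiplier `lam`, and all function names below are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Toy task family: minimize loss_t(theta) = ||theta - goal||^2 subject to
# cost(theta) = ||theta||^2 staying small, handled via a fixed-multiplier
# Lagrangian (illustrative only; not the authors' algorithm).

def loss(theta, goal):
    return np.sum((theta - goal) ** 2)

def cost(theta):
    return np.sum(theta ** 2)

def lagrangian_grad(theta, goal, lam):
    # Gradient of loss + lam * cost.
    return 2 * (theta - goal) + lam * 2 * theta

def inner_step(theta, goal, lam, lr=0.1):
    """One task-specific adaptation step on the constrained objective."""
    return theta - lr * lagrangian_grad(theta, goal, lam)

def meta_update(theta, goals, lam, meta_lr=0.05):
    """First-order meta-update: average the post-adaptation gradients."""
    grads = [lagrangian_grad(inner_step(theta, g, lam), g, lam) for g in goals]
    return theta - meta_lr * np.mean(grads, axis=0)

theta = np.zeros(2)
tasks = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for _ in range(100):
    theta = meta_update(theta, tasks, lam=0.5)
```

The meta-parameters settle between the task goals but are pulled toward the constraint-feasible region by the `lam * cost` term, which is the intuition behind "safer initial parameters."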
THaLLE: Text Hyperlocally Augmented Large Language Extension -- Technical Report
KBTG Labs, Danupat Khamnuansin, Atthakorn Petchsod, Anuruth Lertpiya, Pornchanan Balee, Thanawat Lodkaew, Tawunrat Chalothorn, Thadpong Pongthawornkamol, Monchai Lertsutthiwong
Large Language Models (LLMs) have emerged as leading tools in Natural Language Processing (NLP) due to their exceptional performance across various tasks. The advent of open-source models such as Llama [1] from Meta, Gemma [2] from Google, and Qwen [3] from Alibaba has significantly enhanced public access to advanced LLMs. Additionally, low-cost techniques for LLM fine-tuning, such as Low-rank Adaptation (LoRA) [4], have enabled the fine-tuning of these models on consumer-grade hardware, thereby accelerating their development and adoption. LLMs are now utilized in a wide array of applications, ranging from personal assistants, e.g., ChatGPT, to specialized tasks in diverse domains. In the financial sector, BloombergGPT [5], a proprietary LLM trained from the ground up with an infusion of financial data, has demonstrated superior performance on financial benchmarks compared to other models in the market.
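LoRA, mentioned above as the low-cost fine-tuning technique, adapts a frozen weight matrix W by learning only a low-rank update: W' = W + (alpha/r) * B @ A, with A of shape (r, d_in) and B of shape (d_out, r). A minimal numpy sketch of the forward pass (the shapes and alpha/r scaling follow the LoRA paper's convention; the specific dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8

W = rng.normal(0, 0.02, (d_out, d_in))   # frozen pretrained weight
A = rng.normal(0, 0.02, (r, d_in))       # trainable, small random init
B = np.zeros((d_out, r))                 # trainable, zero init

def lora_forward(x, W, A, B, alpha, r):
    """Base projection plus the scaled low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(0, 1, (2, d_in))
# With B = 0 at initialization, the adapted model exactly matches the
# frozen base model; only A and B (512 params vs. 4096 in W here) are trained.
```

The parameter saving is what makes consumer-grade fine-tuning feasible: only A and B receive gradients, so optimizer state scales with the rank r rather than the full weight matrix.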
Challenges for Reinforcement Learning in Quantum Computing
Philipp Altmann, Adelina Bärligea, Jonas Stein, Michael Kölle, Thomas Gabor, Thomy Phan, Claudia Linnhoff-Popien
Quantum computing (QC) in the current NISQ era is still limited. To gain early insights and advantages, hybrid applications are widely considered as a way to mitigate these shortcomings. Hybrid quantum machine learning (QML) comprises both the application of QC to improve machine learning (ML) and the application of ML to improve QC architectures. This work considers the latter, focusing on leveraging reinforcement learning (RL) to improve current QC approaches. We therefore introduce various generic challenges arising from quantum architecture search and quantum circuit optimization that RL algorithms need to solve to provide benefits for more complex applications and combinations thereof. Building upon these challenges, we propose a concrete framework, formalized as a Markov decision process, that enables learning policies capable of controlling a universal set of quantum gates. Furthermore, we provide benchmark results to assess the shortcomings and strengths of current state-of-the-art algorithms.
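A circuit-construction MDP of the kind described above can be sketched in a few lines: the agent appends gates from a universal single-qubit set to a circuit, the episode ends at a depth limit, and the reward is fidelity to a target state. This toy formulation (gate set {H, T}, sparse terminal reward, the `CircuitEnv` class) is our own illustration, not the paper's framework.

```python
import numpy as np

# Universal single-qubit gate set {H, T} acting on a statevector.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])
GATES = {0: H, 1: T}

class CircuitEnv:
    """MDP: state = current statevector, action = gate index,
    reward = fidelity to the target state at the depth limit."""

    def __init__(self, target, max_depth=4):
        self.target = target
        self.max_depth = max_depth
        self.reset()

    def reset(self):
        self.state = np.array([1.0 + 0j, 0.0])  # start in |0>
        self.depth = 0
        return self.state

    def step(self, action):
        self.state = GATES[action] @ self.state
        self.depth += 1
        done = self.depth >= self.max_depth
        fidelity = abs(np.vdot(self.target, self.state)) ** 2
        reward = fidelity if done else 0.0  # sparse terminal reward
        return self.state, reward, done

# Example: preparing |+> from |0> takes a single H gate.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
env = CircuitEnv(target=plus, max_depth=1)
env.reset()
_, r, done = env.step(0)
```

The sparse terminal reward and the rapidly growing action-sequence space are exactly the kinds of challenges the abstract argues RL algorithms must handle in circuit optimization.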
Scaling Laws for Imitation Learning in NetHack
Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade
Imitation Learning (IL) is one of the most widely used methods in machine learning. Yet, while powerful, many works find it is often unable to fully recover the underlying expert behavior [1-3], and none of these works deeply investigate the role of scaling up model and data size. Inspired by recent work in Natural Language Processing (NLP) [4, 5], where "scaling up" has resulted in increasingly capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting. To demonstrate our findings, we focus on the game of NetHack, a challenging environment featuring procedural generation, stochasticity, long-term dependencies, and partial observability. We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents with respect to model size and number of samples. We forecast and train several NetHack agents with IL and find they outperform the prior state of the art by at least 2x in all settings. Our work demonstrates both the scaling behavior of imitation learning in a challenging domain and the viability of scaling up current approaches for increasingly capable agents in NetHack, a game that remains elusively hard for current AI systems.
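A power law of the kind reported above, L(C) = a * C^(-b), is typically estimated by a linear fit in log-log space. The sketch below does this on synthetic data; the exponent 0.25 and coefficient 5.0 are illustrative placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (compute, loss) pairs following L(C) = 5.0 * C^-0.25
# with small multiplicative noise.
compute = np.logspace(15, 20, 12)  # compute budgets in FLOPs (synthetic)
loss = 5.0 * compute ** -0.25 * np.exp(rng.normal(0, 0.01, 12))

# Fit log(loss) = log(a) - b * log(C) by least squares.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
b_hat, a_hat = -slope, np.exp(intercept)
```

Once a and b are estimated from small runs, the fitted line is what allows forecasting the loss (and, via the reported loss-return correlation, the mean return) of larger compute-optimal agents before training them.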
DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training
Philipp Altmann, Thomy Phan, Fabian Ritz, Thomas Gabor, Claudia Linnhoff-Popien
We propose discriminative reward co-training (DIRECT) as an extension to deep reinforcement learning algorithms. Building upon the concept of self-imitation learning (SIL), we introduce an imitation buffer that stores beneficial trajectories generated by the policy, selected according to their return. A discriminator network is trained concurrently with the policy to distinguish between trajectories generated by the current policy and beneficial trajectories generated by previous policies. The discriminator's verdict is used to construct a reward signal for optimizing the policy. By interpolating prior experience, DIRECT is able to act as a surrogate, steering policy optimization towards more valuable regions of the reward landscape and thus learning an optimal policy. Our results show that DIRECT outperforms state-of-the-art algorithms in sparse- and shifting-reward environments by providing a surrogate reward to the policy and directing the optimization towards valuable areas.
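The discriminator-to-reward step described above can be sketched as follows: a logistic discriminator is trained to output 1 on buffer (beneficial) transitions and 0 on current-policy transitions, and its log-odds then serve as a dense surrogate reward that is positive for buffer-like behavior. The 1-D features, training loop, and reward form are our own illustrative choices, not the paper's exact recipe.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_discriminator(buffer_x, policy_x, steps=500, lr=0.1):
    """Logistic regression: label 1 = imitation buffer, 0 = current policy."""
    x = np.vstack([buffer_x, policy_x])
    y = np.concatenate([np.ones(len(buffer_x)), np.zeros(len(policy_x))])
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        g = p - y                      # gradient of the logistic loss
        w -= lr * x.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def surrogate_reward(x, w, b):
    # log D(x) - log(1 - D(x)): positive when x resembles buffer data.
    return x @ w + b

rng = np.random.default_rng(1)
buffer_x = rng.normal(2.0, 1.0, size=(200, 1))   # "beneficial" transitions
policy_x = rng.normal(-2.0, 1.0, size=(200, 1))  # current-policy transitions
w, b = train_discriminator(buffer_x, policy_x)
```

Because the discriminator is retrained as both the policy and the buffer evolve, the surrogate reward stays informative even when the environment reward is sparse or shifts over time.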
Deep Learning Statistical Arbitrage
Jorge Guijarro-Ordonez, Markus Pelger, Greg Zanotti
Statistical arbitrage exploits temporal price differences between similar assets. We develop a unifying conceptual framework for statistical arbitrage and a novel data-driven solution. First, we construct arbitrage portfolios of similar assets as residual portfolios from conditional latent asset pricing factors. Second, we extract their time-series signals with a powerful machine-learning time-series solution, a convolutional transformer. Lastly, we use these signals to form an optimal trading policy that maximizes risk-adjusted returns under constraints. Our comprehensive empirical study on daily US equities shows a high compensation for arbitrageurs to enforce the law of one price. Our arbitrage strategies obtain consistently high out-of-sample mean returns and Sharpe ratios, and substantially outperform all benchmark approaches.
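The first step above, residual portfolios from pricing factors, can be sketched as a time-series regression: each asset's return is regressed on factor returns, and the regression residual is the factor-neutral arbitrage-portfolio return. The single observed factor and the naive mean-reversion rule below are illustrative simplifications of the paper's conditional latent-factor and learned-policy machinery.

```python
import numpy as np

rng = np.random.default_rng(2)
n_days = 500
factor = rng.normal(0, 0.01, n_days)             # one pricing factor
beta_true = np.array([0.8, 1.2])
noise = rng.normal(0, 0.002, (n_days, 2))
returns = factor[:, None] * beta_true + noise    # two similar assets

# OLS beta per asset, then residual (factor-neutral) returns.
X = factor[:, None]
betas = np.linalg.lstsq(X, returns, rcond=None)[0].ravel()
residuals = returns - factor[:, None] * betas

# Toy mean-reversion signal on cumulative residuals: short when the
# cumulative residual sits above its mean, long when below.
cum = residuals.cumsum(axis=0)
signal = -np.sign(cum - cum.mean(axis=0))
```

In the paper this hand-made sign rule is replaced by a convolutional transformer that reads the residual time series, and the final weights come from a constrained risk-adjusted-return optimization rather than unit long/short positions.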