
Collaborating Authors

 Piękos, Piotr


SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention

arXiv.org Artificial Intelligence

The costly self-attention layers in modern Transformers require memory and compute quadratic in sequence length. Existing approximation methods usually underperform and fail to obtain significant speedups in practice. Here we present SwitchHead--a novel method that reduces both compute and memory requirements and achieves wall-clock speedup, while matching the language modeling performance of baseline Transformers with the same parameter budget. SwitchHead uses Mixture-of-Experts (MoE) layers for the value and output projections and requires 4 to 8 times fewer attention matrices than standard Transformers. Our novel attention can also be combined with MoE MLP layers, resulting in an efficient fully-MoE "SwitchAll" Transformer model. Large language models (LLMs) have shown remarkable capabilities (Radford et al., 2019; Brown et al., 2020; OpenAI, 2022; 2023) and great versatility (Bubeck et al., 2023). However, training enormous Transformers (Vaswani et al., 2017; Schmidhuber, 1992) requires a considerable amount of computing power and memory, which is not accessible to most researchers, academic institutions, and even companies. Even running them in inference mode, which is much less resource-intensive, requires significant engineering effort (Gerganov, 2023). Accelerating big Transformers remains an important open research question. However, in prior work on MoEs, parameter efficiency has not been studied; MoE models have typically been compared to dense baselines with the same number of FLOPs but many fewer parameters.
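
To make the idea of MoE value and output projections concrete, here is a minimal PyTorch-style sketch, not the paper's implementation: the sigmoid top-k gate, expert counts, single-head layout, and omission of causal masking are illustrative assumptions.

```python
# Illustrative sketch of MoE attention in the spirit of SwitchHead (not the authors' code).
# Assumption: a per-token top-k gate selects value/output projection experts, while
# queries and keys stay dense; causal masking is omitted for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEProjection(nn.Module):
    """Token-wise mixture of linear experts with a sigmoid top-k gate (illustrative)."""
    def __init__(self, d_in, d_out, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.Parameter(torch.randn(n_experts, d_in, d_out) * d_in ** -0.5)
        self.gate = nn.Linear(d_in, n_experts, bias=False)
        self.k = k

    def forward(self, x):                            # x: (batch, seq, d_in)
        scores = torch.sigmoid(self.gate(x))         # (batch, seq, n_experts)
        topv, topi = scores.topk(self.k, dim=-1)     # keep k experts per token
        out = 0.0
        for j in range(self.k):
            w = self.experts[topi[..., j]]           # (batch, seq, d_in, d_out)
            out = out + topv[..., j, None] * torch.einsum("bsi,bsio->bso", x, w)
        return out

class SwitchHeadLikeAttention(nn.Module):
    """Single attention head with MoE value and output projections (sketch)."""
    def __init__(self, d_model, d_head, n_experts=4, k=2):
        super().__init__()
        self.q = nn.Linear(d_model, d_head, bias=False)
        self.k_proj = nn.Linear(d_model, d_head, bias=False)
        self.v = MoEProjection(d_model, d_head, n_experts, k)
        self.o = MoEProjection(d_head, d_model, n_experts, k)

    def forward(self, x):
        q, k, v = self.q(x), self.k_proj(x), self.v(x)
        att = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.o(att @ v)                       # (batch, seq, d_model)
```

Because only the value/output side is expert-routed, one such head can stand in for several dense heads, which is how a model in this style can get away with fewer attention matrices.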


Mindstorms in Natural Language-Based Societies of Mind

arXiv.org Artificial Intelligence

Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of mind consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overcome the limitations of single LLMs, improving multimodal zero-shot reasoning. In these natural language-based societies of mind (NLSOMs), new agents -- all communicating through the same universal symbolic language -- are easily added in a modular fashion. To demonstrate the power of NLSOMs, we assemble and experiment with several of them (having up to 129 members), leveraging mindstorms in them to solve some practical AI tasks: visual question answering, image captioning, text-to-image synthesis, 3D generation, egocentric retrieval, embodied AI, and general language-based task solving. We view this as a starting point towards much larger NLSOMs with billions of agents, some of which may be humans. And with this emergence of great societies of heterogeneous minds, many new research questions have suddenly become paramount to the future of artificial intelligence. What should be the social structure of an NLSOM? What would be the (dis)advantages of having a monarchical rather than a democratic structure? How can principles of NN economies be used to maximize the total reward of a reinforcement learning NLSOM? In this work, we identify, discuss, and try to answer some of these questions.
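
As a purely illustrative sketch of the communication pattern described above, the snippet below runs a toy "mindstorm": named agents (trivial stand-in functions here, not the paper's LLM and vision experts; the `ask`-style interface is hypothetical) take turns appending natural-language replies to a shared transcript.

```python
# Toy mindstorm loop among natural-language agents (illustrative only).
from typing import Callable, Dict, List

Agent = Callable[[str], str]  # an agent maps a natural-language prompt to a reply

def mindstorm(agents: Dict[str, Agent], task: str, rounds: int = 2) -> List[str]:
    """Each agent sees the task plus the running transcript and adds its reply."""
    transcript: List[str] = [f"TASK: {task}"]
    for _ in range(rounds):
        for name, agent in agents.items():
            prompt = "\n".join(transcript)
            transcript.append(f"{name}: {agent(prompt)}")
    return transcript

# Usage with toy stand-ins for multimodal experts (hypothetical agent names):
society = {
    "captioner": lambda p: "I see a red cube on a table.",
    "reasoner": lambda p: "Given the caption, the answer is likely 'cube'.",
}
print("\n".join(mindstorm(society, "What object is in the image?")))
```

Because every agent speaks the same interface (text in, text out), adding a new expert to the society is just another dictionary entry, which mirrors the modularity the abstract emphasizes.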


Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

arXiv.org Artificial Intelligence

Complex reasoning problems contain states that vary in the computational cost required to determine a good action plan. Taking advantage of this property, we propose Adaptive Subgoal Search (AdaSubS), a search method that adaptively adjusts the planning horizon. To this end, AdaSubS generates diverse sets of subgoals at different distances. A verification mechanism is employed to filter out unreachable subgoals swiftly, allowing the search to focus on feasible subgoals farther ahead. In this way, AdaSubS benefits from the efficiency of planning with longer subgoals and the fine control of shorter ones, and thus scales well to difficult planning problems. We show that AdaSubS significantly surpasses hierarchical planning algorithms on three complex reasoning tasks: Sokoban, the Rubik's Cube, and the inequality-proving benchmark INT.
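
The adaptive-horizon loop described above can be outlined as follows; this is a hypothetical sketch, not the authors' code, and the subgoal generators, reachability verifier, and low-level planner are assumed stubs.

```python
# Sketch of an adaptive-subgoal step: try distant subgoals first, fall back to closer ones.
from typing import Callable, List, Optional, Sequence

State = str  # placeholder state type for illustration

def adaptive_subgoal_step(
    state: State,
    generators: Sequence[Callable[[State], List[State]]],      # ordered longest-horizon first
    is_reachable: Callable[[State, State], bool],               # fast learned verifier (assumed)
    plan_to: Callable[[State, State], Optional[List[State]]],   # low-level search (assumed)
) -> Optional[List[State]]:
    """Return a path to the first verified, plannable subgoal, preferring longer horizons."""
    for generate in generators:                     # longer horizons give faster progress
        for subgoal in generate(state):
            if not is_reachable(state, subgoal):    # cheap filter before expensive search
                continue
            path = plan_to(state, subgoal)          # fine-grained planning to the subgoal
            if path is not None:
                return path
    return None                                     # no feasible subgoal at any horizon
```

Ordering the generators from longest to shortest horizon is what lets the search spend little effort on easy states while still retaining fine control when only nearby subgoals are reachable.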