PaPaformer: Language Model from Pre-trained Parallel Paths

Tapaninaho, Joonas, Oussala, Mourad

arXiv.org Artificial Intelligence

The training of modern large language models requires an increasing amount of computation power and time. Even smaller variants, such as small language models (SLMs), take several days to train in the best-case scenarios, often requiring multiple GPUs. This paper explores methods to train and evaluate decoder-only transformer-based language models in hours instead of days or weeks. We introduce PaPaformer, a decoder-only transformer architecture variant whose lower-dimensional parallel paths are combined into a larger model. The paper shows that these lower-dimensional paths can be trained individually on different types of training data and then combined into one larger model. This method offers the option to reduce the total number of model parameters and the training time while increasing performance. Moreover, the parallel path structure opens interesting possibilities for customizing paths to accommodate specific task requirements.
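The combination step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the path here is a single projection plus nonlinearity standing in for a small transformer block, and all names (`make_path`, `d_path`, etc.) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_path(d_in, d_path):
    # One low-dimensional "path": a down-projection followed by a
    # nonlinearity, standing in for a small independently trained block.
    W = rng.standard_normal((d_in, d_path)) / np.sqrt(d_in)
    return lambda x: np.tanh(x @ W)

# Several paths trained separately, then combined: here the combination
# is simple concatenation of their outputs into a larger hidden state.
d_model, d_path, n_paths = 64, 16, 4
paths = [make_path(d_model, d_path) for _ in range(n_paths)]

x = rng.standard_normal((2, d_model))                    # batch of embeddings
hidden = np.concatenate([p(x) for p in paths], axis=-1)  # combined state
```

Because each path only sees a `d_path`-dimensional space, each can be trained (and specialized) independently before assembly, which is the property the abstract highlights.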


Identifying Sparsely Active Circuits Through Local Loss Landscape Decomposition

Chrisman, Brianna, Bushnaq, Lucius, Sharkey, Lee

arXiv.org Artificial Intelligence

Much of mechanistic interpretability has focused on understanding the activation spaces of large neural networks. However, activation-space approaches reveal little about the underlying circuitry used to compute features. To better understand the circuits employed by models, we introduce a new decomposition method called Local Loss Landscape Decomposition (L3D). L3D identifies a set of low-rank subnetworks: directions in parameter space, a subset of which can reconstruct the gradient of the loss between any sample's output and a reference output vector. We design a series of progressively more challenging toy models with well-defined subnetworks and show that L3D can nearly perfectly recover the associated subnetworks. Additionally, we investigate the extent to which perturbing the model in the direction of a given subnetwork affects only the relevant subset of samples. Finally, we apply L3D to a real-world transformer model and a convolutional neural network, demonstrating its potential to identify interpretable and relevant circuits in parameter space.
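The core idea, reconstructing a sample's loss gradient from a sparse subset of parameter-space directions, can be illustrated in a few lines. This is a toy sketch under assumed names (`directions`, `true_coeffs`), not the L3D algorithm itself, which must also discover the directions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: a model with p parameters and k candidate
# "subnetwork" directions in parameter space.
p, k = 20, 4
directions = rng.standard_normal((k, p))       # rows span a subspace

# Suppose a sample's loss gradient lies in the span of a subset of
# the directions, as L3D posits for sparsely active circuits.
true_coeffs = np.array([1.5, 0.0, -2.0, 0.0])  # only two directions active
grad = true_coeffs @ directions

# Reconstruct the gradient by least squares over all k directions.
coeffs, *_ = np.linalg.lstsq(directions.T, grad, rcond=None)
recon = coeffs @ directions
```

When the gradient truly lies in the span of a few directions, the reconstruction is exact; L3D's contribution is finding directions for which this holds across samples.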


Teaching Shortest Path Algorithms With a Robot and Overlaid Projections

Jolakoski, Pavel, Deja, Jordan Aiko, Pucihar, Klen Čopič, Kljun, Matjaž

arXiv.org Artificial Intelligence

Robots have the potential to enhance the teaching of advanced computer science topics, making abstract concepts more tangible and interactive. In this paper, we present Timmy, a GoPiGo robot augmented with projections to demonstrate shortest path algorithms in an interactive learning environment. We integrated a JavaScript-based application that is projected around the robot, allowing users to construct graphs and visualise three different shortest path algorithms with colour-coded edges and vertices. Animated graph exploration and traversal are augmented by robot movements. To evaluate Timmy, we conducted two user studies: an initial study (N = 10) to explore the feasibility of this type of teaching, in which participants only observed both the robot-synced and the on-screen-only visualisations, and a pilot study (N = 6) in which participants actively interacted with the system, constructed graphs, and selected desired algorithms. In both studies we investigated preferences towards the system rather than teaching outcomes. Initial findings suggest that robots offer an engaging tool for teaching advanced algorithmic concepts, but highlight the need for further methodological refinements and larger-scale studies to fully evaluate their effectiveness.
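The abstract does not name the three algorithms visualised, but Dijkstra's algorithm is the canonical example of the kind of shortest path traversal such a system animates. A compact reference version, assuming a graph given as an adjacency list with non-negative weights:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a graph given as
    {vertex: [(neighbour, weight), ...]} with non-negative weights."""
    dist = {source: 0}
    pq = [(0, source)]                    # priority queue of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                      # skip stale queue entries
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# A small example graph like one a learner might construct in the app.
g = {
    "A": [("B", 2), ("C", 5)],
    "B": [("C", 1), ("D", 4)],
    "C": [("D", 1)],
    "D": [],
}
```

The order in which `dist` entries are finalised is exactly the order a robot like Timmy would visit vertices when syncing its movement to the algorithm's exploration.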


Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations

Zhao, Yize, Behnia, Tina, Vakilian, Vala, Thrampoulidis, Christos

arXiv.org Artificial Intelligence

Next-token prediction (NTP) over large text corpora has become the go-to paradigm to train large language models. Yet, it remains unclear how NTP influences the mapping of linguistic patterns to geometric properties of the resulting model representations. We frame training of large language models as soft-label classification over sparse probabilistic label vectors, coupled with an analytical approximation that allows unrestricted generation of context embeddings. This approach links NTP training to rank-constrained, nuclear-norm regularized optimization in the logit domain, offering a framework for analyzing the geometry of word and context embeddings. In large embedding spaces, we find that NTP implicitly favors learning logits with a sparse plus low-rank structure. While the sparse component captures the co-occurrence frequency of context-word pairs, the orthogonal low-rank component, which becomes dominant as training progresses, depends solely on the sparsity pattern of the co-occurrence matrix. Consequently, when projected onto an appropriate subspace, representations of contexts that are followed by the same set of next-tokens collapse, a phenomenon we term subspace-collapse. We validate our findings on synthetic and small-scale real language datasets. Finally, we outline potential research directions aimed at deepening the understanding of NTP's influence on the learning of linguistic patterns and regularities.
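The framing of NTP as soft-label classification over sparse probabilistic label vectors can be made concrete with a toy vocabulary. The numbers and names below are illustrative, not from the paper; the objective is the standard soft-label cross-entropy the analysis starts from.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy vocabulary of V tokens and two contexts; each context's "label"
# is its empirical next-token distribution, sparse over the vocabulary.
V = 5
labels = np.array([
    [0.5, 0.5, 0.0, 0.0, 0.0],   # context 1: two possible next tokens
    [0.0, 0.0, 1.0, 0.0, 0.0],   # context 2: deterministic next token
])

logits = np.zeros((2, V))         # stand-in for W @ context_embedding

def ntp_loss(logits, labels):
    # Soft-label cross-entropy, summed only over each label's sparse support.
    logp = np.log(softmax(logits))
    mask = labels > 0
    return -(labels * np.where(mask, logp, 0.0)).sum(axis=-1).mean()
```

With uniform logits the loss equals log V for both contexts; the paper's analysis concerns how minimizing this objective shapes the logits into a sparse component (co-occurrence frequencies) plus a low-rank component (support pattern).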


Assessing Language Models' Worldview for Fiction Generation

Khatun, Aisha, Brown, Daniel G.

arXiv.org Artificial Intelligence

The use of Large Language Models (LLMs) has become ubiquitous, with abundant applications in computational creativity. One such application is fictional story generation. Fiction is a narrative that occurs in a story world slightly different from ours. With LLMs becoming writing partners, we question how suitable they are for generating fiction. This study investigates the ability of LLMs to maintain a state of the world essential to generating fiction. Through a series of questions to nine LLMs, we find that only two models exhibit a consistent worldview, while the rest are self-conflicting. Subsequent analysis of stories generated by four models revealed a strikingly uniform narrative pattern. This uniformity across models further suggests a lack of `state' necessary for fiction. We highlight the limitations of current LLMs in fiction writing and advocate for future research to test and create story worlds for LLMs to reside in. All code, the dataset, and the generated responses can be found at https://github.com/tanny411/llm-reliability-and-consistency-evaluation.


Lost in the Woods -- an AI Rescue Story!

#artificialintelligence

Timmy's first experience with a project-based learning activity was exciting. He loved the idea that he was able to help the homeless people in his neighborhood. As part of the experience, he was able to meet with and talk to business owners, local area experts, and the city mayor. His continued efforts resulted in renewed community support for the homeless problem and the building of two new homeless shelters in his neighborhood. When his science teacher indicated that his class was about to embark on a new project-based learning activity, he was excited once again.