AITopics | shape metric

Collaborating Authors

shape metric

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

Qing, Peijun, Gao, Chongyang, Zhou, Yefan, Diao, Xingjian, Yang, Yaoqing, Vosoughi, Soroush

arXiv.org Artificial IntelligenceOct-13-2024

Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), are known to enhance training efficiency in Large Language Models (LLMs). Due to the limited parameters of LoRA, recent studies seek to combine LoRA with Mixture-of-Experts (MoE) to boost performance across various tasks. However, inspired by the observed redundancy in traditional MoE structures, previous studies identify similar redundancy among LoRA experts within the MoE architecture, highlighting the necessity for non-uniform allocation of LoRA experts across different layers. In this paper, we leverage Heavy-Tailed Self-Regularization (HT-SR) Theory to design a fine-grained allocation strategy. Our analysis reveals that the number of experts per layer correlates with layer training quality, which exhibits significant variability across layers. Based on this, we introduce AlphaLoRA, a theoretically principled and training-free method for allocating LoRA experts to further mitigate redundancy. Experiments on three models across ten language processing and reasoning benchmarks demonstrate that AlphaLoRA achieves comparable or superior performance over all baselines. Our code is available at https://github.com/morelife2017/alphalora.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.10054

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Shape-Graph Matching Network (SGM-net): Registration for Statistical Shape Analysis

Liang, Shenyuan, Segundo, Mauricio Pamplona, Aakur, Sathyanarayanan N., Sarkar, Sudeep, Srivastava, Anuj

arXiv.org Machine LearningAug-13-2023

This paper focuses on the statistical analysis of shapes of data objects called shape graphs, a set of nodes connected by articulated curves with arbitrary shapes. A critical need here is a constrained registration of points (nodes to nodes, edges to edges) across objects. This, in turn, requires optimization over the permutation group, made challenging by differences in nodes (in terms of numbers, locations) and edges (in terms of shapes, placements, and sizes) across objects. This paper tackles this registration problem using a novel neural-network architecture and involves an unsupervised loss function developed using the elastic shape metric for curves. This architecture results in (1) state-of-the-art matching performance and (2) an order of magnitude reduction in the computational cost relative to baseline approaches. We demonstrate the effectiveness of the proposed approach using both simulated data and real-world 2D and 3D shape graphs. Code and data will be made publicly available after review to foster research.

graph, registration, shape graph, (12 more...)

arXiv.org Machine Learning

2308.06869

Country:

North America > United States > Florida (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Oklahoma > Payne County > Stillwater (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Add feedback

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data

Yang, Yaoqing, Theisen, Ryan, Hodgkinson, Liam, Gonzalez, Joseph E., Ramchandran, Kannan, Martin, Charles H., Mahoney, Michael W.

arXiv.org Artificial IntelligenceJun-4-2023

Selecting suitable architecture parameters and training hyperparameters is essential for enhancing machine learning (ML) model performance. Several recent empirical studies conduct large-scale correlational analysis on neural networks (NNs) to search for effective \emph{generalization metrics} that can guide this type of model selection. Effective metrics are typically expected to correlate strongly with test performance. In this paper, we expand on prior analyses by examining generalization-metric-based model selection with the following objectives: (i) focusing on natural language processing (NLP) tasks, as prior work primarily concentrates on computer vision (CV) tasks; (ii) considering metrics that directly predict \emph{test error} instead of the \emph{generalization gap}; (iii) exploring metrics that do not need access to data to compute. From these objectives, we are able to provide the first model selection results on large pretrained Transformers from Huggingface using generalization metrics. Our analyses consider (I) hundreds of Transformers trained in different settings, in which we systematically vary the amount of data, the model size and the optimization hyperparameters, (II) a total of 51 pretrained Transformers from eight families of Huggingface NLP models, including GPT2, BERT, etc., and (III) a total of 28 existing and novel generalization metrics. Despite their niche status, we find that metrics derived from the heavy-tail (HT) perspective are particularly useful in NLP tasks, exhibiting stronger correlations than other, more popular metrics. To further examine these metrics, we extend prior formulations relying on power law (PL) spectral distributions to exponential (EXP) and exponentially-truncated power law (E-TPL) families.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2202.02842

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Representational dissimilarity metric spaces for stochastic neural networks

Duong, Lyndon R., Zhou, Jingyang, Nassar, Josue, Berman, Jules, Olieslagers, Jeroen, Williams, Alex H.

arXiv.org Artificial IntelligenceFeb-3-2023

Quantifying similarity between neural representations -- e.g. hidden layer activation vectors -- is a perennial problem in deep learning and neuroscience research. Existing methods compare deterministic responses (e.g. artificial networks that lack stochastic layers) or averaged responses (e.g., trial-averaged firing rates in biological data). However, these measures of _deterministic_ representational similarity ignore the scale and geometric structure of noise, both of which play important roles in neural computation. To rectify this, we generalize previously proposed shape metrics (Williams et al. 2021) to quantify differences in _stochastic_ representations. These new distances satisfy the triangle inequality, and thus can be used as a rigorous basis for many supervised and unsupervised analyses. Leveraging this novel framework, we find that the stochastic geometries of neurobiological representations of oriented visual gratings and naturalistic scenes respectively resemble untrained and trained deep network representations. Further, we are able to more accurately predict certain network attributes (e.g. training hyperparameters) from its position in stochastic (versus deterministic) shape space.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Artificial Intelligence

2211.11665

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Add feedback

Neural Networks as Paths through the Space of Representations

Lange, Richard D., Kwok, Devin, Matelsky, Jordan, Wang, Xinyue, Rolnick, David S., Kording, Konrad P.

arXiv.org Artificial IntelligenceNov-27-2022

Deep neural networks implement a sequence of layer-by-layer operations that are each relatively easy to understand, but the resulting overall computation is generally difficult to understand. We consider a simple hypothesis for interpreting the layer-by-layer construction of useful representations: perhaps the role of each layer is to reformat information to reduce the "distance" to the desired outputs. With this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path through a high-dimensional representation space. We formalize this intuitive idea of a "path" by leveraging recent advances in *metric* representational similarity. We extend existing representational distance methods by computing geodesics, angles, and projections of representations, going beyond mere layer distances. We then demonstrate these tools by visualizing and comparing the paths taken by ResNet and VGG architectures on CIFAR-10. We conclude by sketching additional ways that this kind of representational geometry can be used to understand and interpret network training, and to describe novel kinds of similarities between different models.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2206.10999

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Shape Proportions and Sphericity in n Dimensions

Lamberti, William Franz

arXiv.org Machine LearningAug-12-2022

Shape metrics for objects in high dimensions remain sparse. Those that do exist, such as hyper-volume, remain limited to objects that are better understood such as Platonic solids and $n$-Cubes. Further, understanding objects of ill-defined shapes in higher dimensions is ambiguous at best. Past work does not provide a single number to give a qualitative understanding of an object. For example, the eigenvalues from principal component analysis results in $n$ metrics to describe the shape of an object. Therefore, we need a single number which can discriminate objects with different shape from one another. Previous work has developed shape metrics for specific dimensions such as two or three dimensions. However, there is an opportunity to develop metrics for any desired dimension. To that end, we present two new shape metrics for objects in a given number of dimensions: hyper-Sphericity and hyper-Shape Proportion (SP). We explore the proprieties of these metrics on a number of different shapes including $n$-balls. We then connect these metrics to applications of analyzing the shape of multidimensional data such as the popular Iris dataset.

artificial intelligence, machine learning, shape metric, (13 more...)

arXiv.org Machine Learning

2208.06292

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Add feedback

Design Patterns: The Bridge Pattern

#artificialintelligenceSep-29-2021, 08:20:22 GMT

The bridge pattern is a very popular patterns that is used often in object-oriented architectures. It is a pattern that is difficult to grasp initially and can often be difficult to identify. It…

calculatearea, implementation, interface, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.55)

Add feedback

Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics

Martin, Charles H., Mahoney, Michael W.

arXiv.org Machine LearningJun-1-2021

To understand better the causes of good generalization performance in state-of-the-art neural network (NN) models, we analyze of a corpus of models that was made publicly-available for a contest to predict the generalization accuracy of NNs. These models include a wide range of qualities and were trained with a range of architectures and regularization hyperparameters. We identify what amounts to a Simpson's paradox: where "scale" metrics (from traditional statistical learning theory) perform well overall but perform poorly on subpartitions of the data of a given depth, when regularization hyperparameters are varied; and where "shape" metrics (from Heavy-Tailed Self Regularization theory) perform well on subpartitions of the data, when hyperparameters are varied for models of a given depth, but perform poorly overall when models with varying depths are aggregated. Our results highlight the subtly of comparing models when both architectures and hyperparameters are varied, as well as the complementary role of implicit scale versus implicit shape parameters in understanding NN model quality. Our results also suggest caution when one tries to extract causal insight with a single metric applied to aggregate data, and they highlight the need to go beyond one-size-fits-all metrics based on upper bounds from generalization theory to describe the performance of state-of-the-art NN models. Based on these findings, we present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs, of a fixed architecture/depth, when varying solver hyperparameters.

accuracy, logspectralnorm, metric, (16 more...)

arXiv.org Machine Learning

2106.00734

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback