AITopics | Geisa, Ali

Plotting

Geisa, Ali

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Why do networks have inhibitory/negative connections?

Wang, Qingyang, Powell, Michael A., Geisa, Ali, Bridgeford, Eric, Priebe, Carey E., Vogelstein, Joshua T.

arXiv.org Artificial IntelligenceAug-17-2023

Why do brains have inhibitory connections? Why do deep networks have negative weights? We propose an answer from the perspective of representation capacity. We believe representing functions is the primary role of both (i) the brain in natural intelligence, and (ii) deep networks in artificial intelligence. Our answer to why there are inhibitory/negative weights is: to learn more functions. We prove that, in the absence of negative weights, neural networks with non-decreasing activation functions are not universal approximators. While this may be an intuitive result to some, to the best of our knowledge, there is no formal theory, in either machine learning or neuroscience, that demonstrates why negative weights are crucial in the context of representation capacity. Further, we provide insights on the geometric properties of the representation space that non-negative deep networks cannot represent. We expect these insights will yield a deeper understanding of more sophisticated inductive priors imposed on the distribution of weights that lead to more efficient biological and machine learning.

artificial intelligence, decision boundary, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2208.03211

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.89)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Polarity is all you need to learn and transfer faster

Wang, Qingyang, Powell, Michael A., Geisa, Ali, Bridgeford, Eric W., Vogelstein, Joshua T.

arXiv.org Artificial IntelligenceMay-30-2023

Natural intelligences (NIs) thrive in a dynamic world - they learn quickly, sometimes with only a few samples. In contrast, artificial intelligences (AIs) typically learn with a prohibitive number of training samples and computational power. What design principle difference between NI and AI could contribute to such a discrepancy? Here, we investigate the role of weight polarity: development processes initialize NIs with advantageous polarity configurations; as NIs grow and learn, synapse magnitudes update, yet polarities are largely kept unchanged. We demonstrate with simulation and image classification tasks that if weight polarities are adequately set a priori, then networks learn with less time and data. We also explicitly illustrate situations in which a priori setting the weight polarities is disadvantageous for networks. Our work illustrates the value of weight polarities from the perspective of statistical and computational efficiency during learning.

artificial intelligence, machine learning, polarity, (16 more...)

arXiv.org Artificial Intelligence

2303.17589

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Out-of-distribution and in-distribution posterior calibration using Kernel Density Polytopes

Dey, Jayanta, De Silva, Ashwin, LeVine, Will, Shin, Jong M., Xu, Haoyin, Geisa, Ali, Chu, Tiffany, Isik, Leyla, Vogelstein, Joshua T.

arXiv.org Machine LearningFeb-14-2022

Any reasonable machine learning (ML) model should not only interpolate efficiently in between the training samples provided (in-distribution region), but also approach the extrapolative or out-of-distribution (OOD) region without being overconfident. Our experiment on human subjects justifies the aforementioned properties for human intelligence as well. Many state-of-the-art algorithms have tried to fix the overconfidence problem of ML models in the OOD region. However, in doing so, they have often impaired the in-distribution performance of the model. Our key insight is that ML models partition the feature space into polytopes and learn constant (random forests) or affine (ReLU networks) functions over those polytopes. This leads to the OOD overconfidence problem for the polytopes which lie in the training data boundary and extend to infinity. To resolve this issue, we propose kernel density methods that fit Gaussian kernel over the polytopes, which are learned using ML models. Specifically, we introduce two variants of kernel density polytopes: Kernel Density Forest (KDF) and Kernel Density Network (KDN) based on random forests and deep networks, respectively. Studies on various simulation settings show that both KDF and KDN achieve uniform confidence over the classes in the OOD region while maintaining good in-distribution accuracy compared to that of their respective parent models.

artificial intelligence, machine learning, out-of-distribution and in-distribution posterior calibration, (1 more...)

arXiv.org Machine Learning

2201.13001

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Prospective Learning: Back to the Future

Vogelstein, Joshua T., Verstynen, Timothy, Kording, Konrad P., Isik, Leyla, Krakauer, John W., Etienne-Cummings, Ralph, Ogburn, Elizabeth L., Priebe, Carey E., Burns, Randal, Kutten, Kwame, Knierim, James J., Potash, James B., Hartung, Thomas, Smirnova, Lena, Worley, Paul, Savonenko, Alena, Phillips, Ian, Miller, Michael I., Vidal, Rene, Sulam, Jeremias, Charles, Adam, Cowan, Noah J., Bichuch, Maxim, Venkataraman, Archana, Li, Chen, Thakor, Nitish, Kebschull, Justus M, Albert, Marilyn, Xu, Jinchong, Shuler, Marshall Hussain, Caffo, Brian, Ratnanather, Tilak, Geisa, Ali, Roh, Seung-Eon, Yezerets, Eva, Madhyastha, Meghana, How, Javier J., Tomita, Tyler M., Dey, Jayanta, Ningyuan, null, Huang, null, Shin, Jong M., Kinfu, Kaleab Alemayehu, Chaudhari, Pratik, Baker, Ben, Schapiro, Anna, Jayaraman, Dinesh, Eaton, Eric, Platt, Michael, Ungar, Lyle, Wehbe, Leila, Kepecs, Adam, Christensen, Amy, Osuagwu, Onyema, Brunton, Bing, Mensh, Brett, Muotri, Alysson R., Silva, Gabriel, Puppo, Francesca, Engert, Florian, Hillman, Elizabeth, Brown, Julia, White, Chris, Yang, Weiwei

arXiv.org Artificial IntelligenceJan-18-2022

Research on both natural intelligence (NI) and artificial intelligence (AI) generally assumes that the future resembles the past: intelligent agents or systems (what we call 'intelligence') observe and act on the world, then use this experience to act on future experiences of the same kind. We call this 'retrospective learning'. For example, an intelligence may see a set of pictures of objects, along with their names, and learn to name them. A retrospective learning intelligence would merely be able to name more pictures of the same objects. We argue that this is not what true intelligence is about. In many real world problems, both NIs and AIs will have to learn for an uncertain future. Both must update their internal models to be useful for future tasks, such as naming fundamentally new objects and using these objects effectively in a new context or to achieve previously unencountered goals. This ability to learn for the future we call 'prospective learning'. We articulate four relevant factors that jointly define prospective learning. Continual learning enables intelligences to remember those aspects of the past which it believes will be most useful in the future. Prospective constraints (including biases and priors) facilitate the intelligence finding general solutions that will be applicable to future problems. Curiosity motivates taking actions that inform future decision making, including in previously unmet situations. Causal estimation enables learning the structure of relations that guide choosing actions for specific outcomes, even when the specific action-outcome contingencies have never been observed before. We argue that a paradigm shift from retrospective to prospective learning will enable the communities that study intelligence to unite and overcome existing bottlenecks to more effectively explain, augment, and engineer intelligences.

machine learning, natural language, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

2201.07372

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Texas (0.14)
North America > United States > Pennsylvania (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(4 more...)

Add feedback

Inducing a hierarchy for multi-class classification problems

Helm, Hayden S., Yang, Weiwei, Bharadwaj, Sujeeth, Lytvynets, Kate, Riva, Oriana, White, Christopher, Geisa, Ali, Priebe, Carey E.

arXiv.org Machine LearningFeb-20-2021

In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not. Unfortunately, the majority of classification datasets do not come pre-equipped with a hierarchical structure and classical "flat" classifiers must be employed. In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers. The class of methods follows the structure of first clustering the conditional distributions and subsequently using a hierarchical classifier with the induced hierarchy. We demonstrate the effectiveness of the class of methods both for discovering a latent hierarchy and for improving accuracy in principled simulation settings and three real data applications. Machine learning practitioners are often challenged with the task of classifying an object as one of tens or hundreds of classes. To address these problems, algorithms originally designed for binary or small multi-class problems are applied and naively deployed. In some instances the large set of labels comes pre-equipped with a hierarchical structure - that is, some labels are known to be mutually semantically similar to various degrees.

artificial intelligence, health & medicine, hierarchy, (20 more...)

arXiv.org Machine Learning

2102.10263

Country: Europe > Denmark (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Add feedback

A partition-based similarity for classification distributions

Helm, Hayden S., Mehta, Ronak D., Duderstadt, Brandon, Yang, Weiwei, White, Christoper M., Geisa, Ali, Vogelstein, Joshua T., Priebe, Carey E.

arXiv.org Machine LearningNov-12-2020

Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners. In particular, we propose a novel similarity on classification distributions, dubbed task similarity, that quantifies how an optimally-transformed optimal representation for a source distribution performs when applied to inference related to a target distribution. The definition of task similarity allows for natural definitions of adversarial and orthogonal distributions. We highlight limiting properties of representations induced by (universally) consistent decision rules and demonstrate in simulation that an empirical estimate of task similarity is a function of the decision rule deployed for inference. We demonstrate that for a given target distribution, both transfer efficiency and semantic similarity of candidate source distributions correlate with empirical task similarity.

artificial intelligence, neural network, similarity, (22 more...)

arXiv.org Machine Learning

2011.06557

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback