- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- (2 more...)
GlyphPattern: An Abstract Pattern Recognition for Vision-Language Models
Wu, Zixuan, Kim, Yoolim, Anderson, Carolyn Jane
Vision-Language Models (VLMs), building upon the foundation of powerful large language models, have made rapid progress in reasoning across visual and textual data. While VLMs perform well on the vision tasks they are trained on, our results highlight key challenges in abstract pattern recognition. We present GlyphPattern, a 954-item dataset that pairs 318 human-written descriptions of visual patterns from 40 writing systems with three visual presentation styles. GlyphPattern evaluates abstract pattern recognition in VLMs, requiring models to understand and judge natural language descriptions of visual patterns. GlyphPattern patterns are drawn from a large-scale cognitive science investigation of human writing systems; as a result, they are rich in spatial reference and compositionality. Our experiments show that GlyphPattern is challenging for state-of-the-art VLMs (GPT-4o achieves only 55% accuracy), with marginal gains from few-shot prompting. Our detailed error analysis reveals challenges at multiple levels, including visual processing, natural language understanding, and pattern generalization.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America > United States > Massachusetts > Norfolk County > Wellesley (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
CONFINE: Conformal Prediction for Interpretable Neural Networks
Huang, Linhui, Lala, Sayeri, Jha, Niraj K.
Deep neural networks exhibit remarkable performance, yet their black-box nature limits their utility in fields like healthcare where interpretability is crucial. Existing explainability approaches often sacrifice accuracy and lack quantifiable measures of prediction uncertainty. In this study, we introduce Conformal Prediction for Interpretable Neural Networks (CONFINE), a versatile framework that generates prediction sets with statistically robust uncertainty estimates instead of point predictions to enhance model transparency and reliability. CONFINE not only provides example-based explanations and confidence estimates for individual predictions but also boosts accuracy by up to 3.6%. We define a new metric, correct efficiency, to evaluate the fraction of prediction sets that contain precisely the correct label and show that CONFINE achieves correct efficiency of up to 3.3% higher than the original accuracy, matching or exceeding prior methods. CONFINE's marginal and class-conditional coverages attest to its validity across tasks spanning medical image classification to language understanding. Being adaptable to any pre-trained classifier, CONFINE marks a significant advance towards transparent and trustworthy deep learning applications in critical domains.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.66)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.88)
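The CONFINE abstract above rests on conformal prediction: instead of a single label, the model emits a set of labels calibrated to contain the true one with a target probability. A minimal split-conformal sketch in numpy (illustrative only; the function name, the simple 1-minus-probability nonconformity score, and the synthetic inputs are assumptions, not CONFINE's actual method, which also produces example-based explanations):

```python
import numpy as np

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction for classification: returns one
    prediction set per test point with ~(1 - alpha) marginal coverage."""
    n = len(cal_labels)
    # Nonconformity score: 1 minus the model's probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q_hat = np.quantile(scores, level, method="higher")
    # A label enters the set when its nonconformity is within the threshold.
    return [np.where(1.0 - p <= q_hat)[0] for p in test_probs]
```

The "correct efficiency" metric the abstract defines would then be the fraction of these sets that contain exactly the correct label and nothing else.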
How Far Are We from Intelligent Visual Deductive Reasoning?
Zhang, Yizhe, Bai, He, Zhang, Ruixiang, Gu, Jiatao, Zhai, Shuangfei, Susskind, Josh, Jaitly, Navdeep
Vision-Language Models (VLMs) such as GPT-4V have recently demonstrated incredible strides on diverse vision language tasks. We dig into vision-based deductive reasoning, a more sophisticated but less explored realm, and find previously unexposed blind spots in the current SOTA VLMs. Specifically, we leverage Raven's Progressive Matrices (RPMs) to assess VLMs' abilities to perform multi-hop relational and deductive reasoning relying solely on visual clues. We perform comprehensive evaluations of several popular VLMs employing standard strategies such as in-context learning, self-consistency, and Chain-of-Thought (CoT) on three diverse datasets, including the Mensa IQ test, IntelligenceTest, and RAVEN. The results reveal that despite the impressive capabilities of LLMs in text-based reasoning, we are still far from achieving comparable proficiency in visual deductive reasoning. We found that certain standard strategies that are effective when applied to LLMs do not seamlessly translate to the challenges presented by visual reasoning tasks. Moreover, a detailed analysis reveals that VLMs struggle to solve these tasks mainly because they are unable to perceive and comprehend multiple, confounding abstract patterns in RPM examples.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Forecasting VIX using Bayesian Deep Learning
Hortúa, Héctor J., Mora-Valencia, Andrés
Investors and regulators are concerned about financial market volatility and crashes. For this reason, the Volatility Index (VIX) was introduced in 1993 by the Chicago Board Options Exchange (CBOE) to assess expected financial market volatility in the short run, i.e. for the next 30 days, since it is calculated as an implied volatility from options on the S&P 500 index at this time-to-maturity [1]. The VIX has proven to be a good predictor of expected stock index shifts, and therefore an early warning for investor sentiment and financial market turbulence (see e.g. [1], and more recently [2]). Due to its importance for asset managers and regulators, it would be useful to foresee the values of the index; however, the VIX is very difficult to forecast [3]. The literature contains several proposals for predicting time series, classified as conventional and modern methods (see e.g. [4] and the references therein).
- North America > United States > Illinois > Cook County > Chicago (0.24)
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- (5 more...)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.95)
- Research Report > New Finding (0.40)
- Research Report > Experimental Study (0.40)
Expanding Holographic Embeddings for Knowledge Completion
Xue, Yexiang, Yuan, Yang, Xu, Zhitian, Sabharwal, Ashish
Neural models operating over structured spaces such as knowledge graphs require a continuous embedding of the discrete elements of this space (such as entities) as well as the relationships between them. Relational embeddings with high expressivity, however, have high model complexity, making them computationally difficult to train. We propose a new family of embeddings for knowledge graphs that interpolate between a method with high model complexity and one, namely Holographic embeddings (HolE), with low dimensionality and high training efficiency. This interpolation, termed HolEx, is achieved by concatenating several linearly perturbed copies of original HolE. We formally characterize the number of perturbed copies needed to provably recover the full entity-entity or entity-relation interaction matrix, leveraging ideas from Haar wavelets and compressed sensing. In practice, using just a handful of Haar-based or random perturbation vectors results in a much stronger knowledge completion system. On the Freebase FB15K dataset, HolEx outperforms originally reported HolE by 14.7% on the HITS@10 metric, and the current path-based state-of-the-art method, PTransE, by 4% (absolute).
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- (3 more...)
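The HolE method that HolEx extends scores a knowledge-graph triple via circular correlation of the entity embeddings, computable efficiently with FFTs. A minimal numpy sketch (function names and the simple sum-over-perturbed-copies form of `holex_score` are illustrative assumptions; the paper's Haar-based construction and recovery guarantees are not reproduced here):

```python
import numpy as np

def circular_correlation(a, b):
    # (a * b)_k = sum_i a_i * b_{(i + k) mod d}, computed in O(d log d) via FFT.
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def hole_score(e_s, e_o, r):
    # HolE scores a triple (s, r, o) as the relation embedding dotted
    # with the circular correlation of subject and object embeddings.
    return float(r @ circular_correlation(e_s, e_o))

def holex_score(e_s, e_o, rels, perturbs):
    # HolEx-style aggregation over linearly perturbed copies of the subject
    # embedding; with a single all-ones perturbation it reduces to plain HolE.
    return sum(hole_score(c * e_s, e_o, r) for r, c in zip(rels, perturbs))
```

The perturbation vectors `perturbs` play the role of the Haar-based or random vectors the abstract mentions: each elementwise perturbation exposes a different slice of the full interaction matrix.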
What is Deep Learning and How Does It Work? – Robotic Vision Resources Hub
Facebook automatically finds and tags friends in your photos. Google DeepMind's AlphaGo computer program trounced champions at the ancient game of Go last year. Skype translates spoken conversations in real time – and pretty accurately too. Behind all this is a type of artificial intelligence called deep learning. But what is deep learning and how does it work? Welcome to the world of machine learning and deep neural networks.
- North America > Canada > Ontario > Toronto (0.15)
- North America > United States > Maine (0.05)
- Leisure & Entertainment > Games > Go (0.55)
- Information Technology (0.50)