AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.70)

Neural Information Processing SystemsOct-10-2024, 11:22:31 GMT

First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces

From optimal transport to robust dimensionality reduction, many machine learning applicationscan be cast into the min-max optimization problems over Riemannian manifolds. Though manymin-max algorithms have been analyzed in the Euclidean setting, it has been elusive how theseresults translate to the Riemannian case. Zhang et al. (2022) have recently identified that geodesic convexconcave Riemannian problems admit always Sion's saddle point solutions. Immediately, an importantquestion that arises is if a performance gap between the Riemannian and the optimal Euclidean spaceconvex concave algorithms is necessary. Our work is the first to answer the question in the negative:We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate at alinear convergence rate at the geodesically strongly convex concave case, matching the euclidean one.Our results also extend to the stochastic or non-smooth case where RCEG & Riemanian gradientascent descent (RGDA) achieve respectively near-optimal convergence rates up to factors dependingon curvature of the manifold.

first-order algorithm, geodesic metric space, min-max optimization, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.40)

Neural Information Processing SystemsOct-10-2024, 06:21:18 GMT

Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Learning useful representations is a key ingredient to the success of modern machine learning. Currently, representation learning mostly relies on embedding data into Euclidean space. However, recent work has shown that data in some domains is better modeled by non-euclidean metric spaces, and inappropriate geometry can result in inferior performance. In this paper, we aim to eliminate the inductive bias imposed by the embedding space geometry. Namely, we propose to map data into more general non-vector metric spaces: a weighted graph with a shortest path distance.

compact data representation, differentiable weighted graph, vector space, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)

Neural Information Processing SystemsOct-8-2024, 10:09:06 GMT

Reviews: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

As clearly indicated in the title, this paper submission is an extension of the PointNet work of [19], to appear at CVPR 2017. The goal is to classify and segment (3D) point clouds. Novel contributions over [19] are the use of a hierarchical network, leveraging neighbourhoods at different scales, and a mechanism to deal with varying sampling densities, effectively generating receptive fields that vary in a data dependent manner. All this leads to state-of-the-art results. PointNet seems an important extension over PointNet, in that it allows to properly exploit local spatial information.

deep hierarchical feature learning, metric space, point set, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.40)

Neural Information Processing SystemsOct-4-2024, 07:17:24 GMT

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Charles Ruizhongtai Qi, Li Yi, Hao Su, Leonidas J. Guibas

Neural Information Processing Systems http://nips.cc/

neighborhood, point cloud, pointnet, (15 more...)

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial IntelligenceSep-25-2024

Numerical Approximation Capacity of Neural Networks with Bounded Parameters: Do Limits Exist, and How Can They Be Measured?

Liu, Li, Yu, Tengchao, Yong, Heng

The Universal Approximation Theorem posits that neural networks can theoretically possess unlimited approximation capacity with a suitable activation function and a freely chosen or trained set of parameters. However, a more practical scenario arises when these neural parameters, especially the nonlinear weights and biases, are bounded. This leads us to question: \textbf{Does the approximation capacity of a neural network remain universal, or does it have a limit when the parameters are practically bounded? And if it has a limit, how can it be measured?} Our theoretical study indicates that while universal approximation is theoretically feasible, in practical numerical scenarios, Deep Neural Networks (DNNs) with any analytic activation functions (such as Tanh and Sigmoid) can only be approximated by a finite-dimensional vector space under a bounded nonlinear parameter space (NP space), whether in a continuous or discrete sense. Based on this study, we introduce the concepts of \textit{$\epsilon$ outer measure} and \textit{Numerical Span Dimension (NSdim)} to quantify the approximation capacity limit of a family of networks both theoretically and practically. Furthermore, drawing on our new theoretical study and adopting a fresh perspective, we strive to understand the relationship between back-propagation neural networks and random parameter networks (such as the Extreme Learning Machine (ELM)) with both finite and infinite width. We also aim to provide fresh insights into regularization, the trade-off between width and depth, parameter space, width redundancy, condensation, and other related important issues.

neural network, nsdim, parameter space, (12 more...)

2409.16697

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Wang, Fang, Gacesa, Marko

Semi-strong Efficient Market of Bitcoin and Twitter: an Analysis of Semantic Vector Spaces of Extracted Keywords and Light Gradient Boosting Machine Models

arXiv.org Artificial IntelligenceSep-24-2024

This study extends the examination of the Efficient-Market Hypothesis in Bitcoin market during a five year fluctuation period, from September 1 2017 to September 1 2022, by analyzing 28,739,514 qualified tweets containing the targeted topic "Bitcoin". Unlike previous studies, we extracted fundamental keywords as an informative proxy for carrying out the study of the EMH in the Bitcoin market rather than focusing on sentiment analysis, information volume, or price data. We tested market efficiency in hourly, 4-hourly, and daily time periods to understand the speed and accuracy of market reactions towards the information within different thresholds. A sequence of machine learning methods and textual analyses were used, including measurements of distances of semantic vector spaces of information, keywords extraction and encoding model, and Light Gradient Boosting Machine (LGBM) classifiers. Our results suggest that 78.06% (83.08%), 84.63% (87.77%), and 94.03% (94.60%) of hourly, 4-hourly, and daily bullish (bearish) market movements can be attributed to public information within organic tweets.

information, market movement, tweet, (15 more...)

2409.15988

Country:

Asia > China (0.04)
Africa > Nigeria (0.04)
Oceania > Australia (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.71)
(4 more...)

arXiv.org Artificial IntelligenceSep-19-2024

Universal approximation theorem for neural networks with inputs from a topological vector space

Ismailov, Vugar

We study feedforward neural networks with inputs from a topological vector space (TVS-FNNs). Unlike traditional feedforward neural networks, TVS-FNNs can process a broader range of inputs, including sequences, matrices, functions and more. We prove a universal approximation theorem for TVS-FNNs, which demonstrates their capacity to approximate any continuous function defined on this expanded input space.

activation function, approximation theorem, neural network, (12 more...)

2409.12913

Country:

North America > United States > New York (0.04)
Asia > Azerbaijan > Baku Economic Region > Baku (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.62)

arXiv.org Artificial IntelligenceSep-3-2024

Decompose the model: Mechanistic interpretability in image models with Generalized Integrated Gradients (GIG)

Kim, Yearim, Han, Sangyu, Han, Sangbum, Kwak, Nojun

In the field of eXplainable AI (XAI) in language models, the progression from local explanations of individual decisions to global explanations with high-level concepts has laid the groundwork for mechanistic interpretability, which aims to decode the exact operations. However, this paradigm has not been adequately explored in image models, where existing methods have primarily focused on classspecific interpretations. This paper introduces a novel approach to systematically trace the entire pathway from input through all intermediate layers to the final output within the whole dataset. We utilize Pointwise Feature Vectors (PFVs) and Effective Receptive Fields (ERFs) to decompose model embeddings into interpretable Concept Vectors. Then, we calculate the relevance between concept vectors with our Generalized Integrated Gradients (GIG), enabling a comprehensive, dataset-wide analysis of model behavior. In the field of eXplainable AI (XAI), efforts have historically transitioned from Local explanation to Global explanation to Mechanistic Interpretability. While local explanation methods including Selvaraju et al. (2016); Montavon et al. (2017); Sundararajan et al. (2017); Han et al. (2024) have focused on explaining specific decisions for individual instances, global explanation methods seek to uncover overall patterns and behaviors applicable across the entire dataset (Wu et al., 2022; Xuanyuan et al., 2023; Singh et al., 2024).

attribution, concept vector, explanation, (13 more...)

2409.0161

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Artificial IntelligenceSep-2-2024

Efficient and Scalable Estimation of Tool Representations in Vector Space

Moon, Suhong, Jha, Siddharth, Erdogan, Lutfi Eren, Kim, Sehoon, Lim, Woosang, Keutzer, Kurt, Gholami, Amir

Recent advancements in function calling and tool use have significantly enhanced the capabilities of large language models (LLMs) by enabling them to interact with external information sources and execute complex tasks. However, the limited context window of LLMs presents challenges when a large number of tools are available, necessitating efficient methods to manage prompt length and maintain accuracy. Existing approaches, such as fine-tuning LLMs or leveraging their reasoning capabilities, either require frequent retraining or incur significant latency overhead. A more efficient solution involves training smaller models to retrieve the most relevant tools for a given query, although this requires high quality, domain-specific data. To address those challenges, we present a novel framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models. Empowered by LLMs, we create ToolBank, a new tool retrieval dataset that reflects real human user usages. For tool retrieval methodologies, we propose novel approaches: (1) Tool2Vec: usage-driven tool embedding generation for tool retrieval, (2) ToolRefiner: a staged retrieval method that iteratively improves the quality of retrieved tools, and (3) MLC: framing tool retrieval as a multi-label classification problem. With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank. Additionally, we present further experimental results to rigorously validate our methods. Our code is available at \url{https://github.com/SqueezeAILab/Tool2Vec}

efficient and scalable estimation, tool representation, vector space

2409.02141

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.40)