AITopics

2309.12226

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)

arXiv.org Artificial IntelligenceSep-20-2023

Black-box Generalization of Machine Teaching

Cao, Xiaofeng, Guo, Yaming, Tsang, Ivor W., Kwok, James T.

Hypothesis-pruning maximizes the hypothesis updates for active learning to find those desired unlabeled data. An inherent assumption is that this learning manner can derive those updates into the optimal hypothesis. However, its convergence may not be guaranteed well if those incremental updates are negative and disordered. In this paper, we introduce a black-box teaching hypothesis $h^\mathcal{T}$ employing a tighter slack term $\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$ to replace the typical $2\Delta_t$ for pruning. Theoretically, we prove that, under the guidance of this teaching hypothesis, the learner can converge into a tighter generalization error and label complexity bound than those non-educated learners who do not receive any guidance from a teacher:1) the generalization error upper bound can be reduced from $R(h^*)+4\Delta_{T-1}$ to approximately $R(h^{\mathcal{T}})+2\Delta_{T-1}$, and 2) the label complexity upper bound can be decreased from $4 \theta\left(TR(h^{*})+2O(\sqrt{T})\right)$ to approximately $2\theta\left(2TR(h^{\mathcal{T}})+3 O(\sqrt{T})\right)$. To be strict with our assumption, self-improvement of teaching is firstly proposed when $h^\mathcal{T}$ loosely approximates $h^*$. Against learning, we further consider two teaching scenarios: teaching a white-box and black-box learner. Experiments verify this idea and show better generalization performance than the fundamental active learning strategies, such as IWAL, IWAL-D, etc.

candidate hypothesis, hypothesis, learner, (13 more...)

2206.15205

Country:

North America > United States > New York (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Transportation > Air (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningSep-20-2023

Multiclass Learnability Does Not Imply Sample Compression

Pabbaraju, Chirag

A hypothesis class admits a sample compression scheme, if for every sample labeled by a hypothesis from the class, it is possible to retain only a small subsample, using which the labels on the entire sample can be inferred. The size of the compression scheme is an upper bound on the size of the subsample produced. Every learnable binary hypothesis class (which must necessarily have finite VC dimension) admits a sample compression scheme of size only a finite function of its VC dimension, independent of the sample size. For multiclass hypothesis classes, the analog of VC dimension is the DS dimension. We show that the analogous statement pertaining to sample compression is not true for multiclass hypothesis classes: every learnable multiclass hypothesis class, which must necessarily have finite DS dimension, does not admit a sample compression scheme of size only a finite function of its DS dimension.

artificial intelligence, compression scheme, machine learning, (17 more...)

2308.06424

Country:

North America > United States > Indiana > Jackson County > Seymour (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.90)

Golowich, Noah, Moitra, Ankur, Rohatgi, Dhruv

Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

arXiv.org Machine LearningSep-18-2023

The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $\phi(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation. But where do these features come from? In the absence of expert domain knowledge, a tempting strategy is to use the ``kitchen sink" approach and hope that the true features are included in a much larger set of potential features. In this paper we revisit linear MDPs from the perspective of feature selection. In a $k$-sparse linear MDP, there is an unknown subset $S \subset [d]$ of size $k$ containing all the relevant features, and the goal is to learn a near-optimal policy in only poly$(k,\log d)$ interactions with the environment. Our main result is the first polynomial-time algorithm for this problem. In contrast, earlier works either made prohibitively strong assumptions that obviated the need for exploration, or required solving computationally intractable optimization problems. Along the way we introduce the notion of an emulator: a succinct approximate representation of the transitions that suffices for computing certain Bellman backups. Since linear MDPs are a non-parametric model, it is not even obvious whether polynomial-sized emulators exist. We show that they do exist and can be computed efficiently via convex programming. As a corollary of our main result, we give an algorithm for learning a near-optimal policy in block MDPs whose decoding function is a low-depth decision tree; the algorithm runs in quasi-polynomial time and takes a polynomial number of samples. This can be seen as a reinforcement learning analogue of classic results in computational learning theory. Furthermore, it gives a natural model where improving the sample complexity via representation learning is computationally feasible.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

2309.09457

Country:

North America > United States (0.67)
Europe (0.14)

Genre: Workflow (0.93)

Industry:

Energy > Oil & Gas (0.67)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Talwai, Prem, Simchi-Levi, David

Optimal Learning Rates for Regularized Least-Squares with a Fourier Capacity Condition

arXiv.org Machine LearningSep-15-2023

We derive minimax adaptive rates for a new, broad class of Tikhonov-regularized learning problems in Hilbert scales under general source conditions. Our analysis does not require the regression function to be contained in the hypothesis class, and most notably does not employ the conventional \textit{a priori} assumptions on kernel eigendecay. Using the theory of interpolation, we demonstrate that the spectrum of the Mercer operator can be inferred in the presence of ``tight'' $L^{\infty}(\mathcal{X})$ embeddings of suitable Hilbert scales. Our analysis utilizes a new Fourier isocapacitary condition, which captures the interplay of the kernel Dirichlet capacities and small ball probabilities via the optimal Hilbert scale function.

artificial intelligence, fourier capacity condition, machine learning, (2 more...)

2204.07856

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Zhong, Ziliang Samuel, Ling, Shuyang

Improved theoretical guarantee for rank aggregation via spectral method

arXiv.org Machine LearningSep-10-2023

Given pairwise comparisons between multiple items, how to rank them so that the ranking matches the observations? This problem, known as rank aggregation, has found many applications in sports, recommendation systems, and other web applications. As it is generally NP-hard to find a global ranking that minimizes the mismatch (known as the Kemeny optimization), we focus on the Erd\"os-R\'enyi outliers (ERO) model for this ranking problem. Here, each pairwise comparison is a corrupted copy of the true score difference. We investigate spectral ranking algorithms that are based on unnormalized and normalized data matrices. The key is to understand their performance in recovering the underlying scores of each item from the observed data. This reduces to deriving an entry-wise perturbation error bound between the top eigenvectors of the unnormalized/normalized data matrix and its population counterpart. By using the leave-one-out technique, we provide a sharper $\ell_{\infty}$-norm perturbation bound of the eigenvectors and also derive an error bound on the maximum displacement for each item, with only $\Omega(n\log n)$ samples. Our theoretical analysis improves upon the state-of-the-art results in terms of sample complexity, and our numerical experiments confirm these theoretical findings.

artificial intelligence, machine learning, snr 1, (15 more...)

2309.03808

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

Hlavackova-Schindler, Katerina, Melnykova, Anna, Tubikanec, Irene

Granger Causal Inference in Multivariate Hawkes Processes by Minimum Message Length

arXiv.org Artificial IntelligenceSep-5-2023

Multivariate Hawkes processes (MHPs) are versatile probabilistic tools used to model various real-life phenomena: earthquakes, operations on stock markets, neuronal activity, virus propagation and many others. In this paper, we focus on MHPs with exponential decay kernels and estimate connectivity graphs, which represent the Granger causal relations between their components. We approach this inference problem by proposing an optimization criterion and model selection algorithm based on the minimum message length (MML) principle. MML compares Granger causal models using the Occam's razor principle in the following way: even when models have a comparable goodness-of-fit to the observed data, the one generating the most concise explanation of the data is preferred. While most of the state-of-art methods using lasso-type penalization tend to overfitting in scenarios with short time horizons, the proposed MML-based method achieves high F1 scores in these settings. We conduct a numerical study comparing the proposed algorithm to other related classical and state-of-art methods, where we achieve the highest F1 scores in specific sparse graph settings. We illustrate the proposed method also on G7 sovereign bond data and obtain causal connections, which are in agreement with the expert knowledge available in the literature.

connectivity graph, hawke process, message length, (15 more...)

2309.02027

Country:

Europe > Austria > Vienna (0.14)
North America > United States (0.14)
Europe > Germany (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

arXiv.org Artificial IntelligenceSep-3-2023

Towards Low-Barrier Cybersecurity Research and Education for Industrial Control Systems

McGuan, Colman, Yu, Chansu, Lin, Qin

The protection of Industrial Control Systems (ICS) that are employed in public critical infrastructures is of utmost importance due to catastrophic physical damages cyberattacks may cause. The research community requires testbeds for validation and comparing various intrusion detection algorithms to protect ICS. However, there exist high barriers to entry for research and education in the ICS cybersecurity domain due to expensive hardware, software, and inherent dangers of manipulating real-world systems. To close the gap, built upon recently developed 3D high-fidelity simulators, we further showcase our integrated framework to automatically launch cyberattacks, collect data, train machine learning models, and evaluate for practical chemical and manufacturing processes. On our testbed, we validate our proposed intrusion detection model called Minimal Threshold and Window SVM (MinTWin SVM) that utilizes unsupervised machine learning via a one-class SVM in combination with a sliding window and classification threshold. Results show that MinTWin SVM minimizes false positives and is responsive to physical process anomalies. Furthermore, we incorporate our framework with ICS cybersecurity education by using our dataset in an undergraduate machine learning course where students gain hands-on experience in practicing machine learning theory with a practical ICS dataset. All of our implementations have been open-sourced.

dataset, factory, mintwin svm, (13 more...)

2308.16769

Country:

North America > United States > Tennessee (0.04)
North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.54)

Mutreja, Saachi, Shafer, Jonathan

PAC Verification of Statistical Algorithms

arXiv.org Machine LearningSep-2-2023

Goldwasser et al. (2021) recently proposed the setting of PAC verification, where a hypothesis (machine learning model) that purportedly satisfies the agnostic PAC learning objective is verified using an interactive proof. In this paper we develop this notion further in a number of ways. First, we prove a lower bound of $\Omega\left(\sqrt{d}/\varepsilon^2\right)$ i.i.d.\ samples for PAC verification of hypothesis classes of VC dimension $d$. Second, we present a protocol for PAC verification of unions of intervals over $\mathbb{R}$ that improves upon their proposed protocol for that task, and matches our lower bound's dependence on $d$. Third, we introduce a natural generalization of their definition to verification of general statistical algorithms, which is applicable to a wider variety of settings beyond agnostic PAC learning. Showcasing our proposed definition, our final result is a protocol for the verification of statistical query algorithms that satisfy a combinatorial constraint on their queries.

algorithm, artificial intelligence, machine learning, (15 more...)

2211.17096

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Rad, Vahid Pashaei, Rad, Vahid Azimi, Sotubadi, Saleh Valizadeh

Spiking based Cellular Learning Automata (SCLA) algorithm for mobile robot motion formulation

arXiv.org Artificial IntelligenceSep-1-2023

In this paper a new method called SCLA which stands for Spiking based Cellular Learning Automata is proposed for a mobile robot to get to the target from any random initial point. The proposed method is a result of the integration of both cellular automata and spiking neural networks. The environment consists of multiple squares of the same size and the robot only observes the neighboring squares of its current square. It should be stated that the robot only moves either up and down or right and left. The environment returns feedback to the learning automata to optimize its decision making in the next steps resulting in cellular automata training. Simultaneously a spiking neural network is trained to implement long term improvements and reductions on the paths. The results show that the integration of both cellular automata and spiking neural network ends up in reinforcing the proper paths and training time reduction at the same time.

algorithm, cellular learning automata, mobile robot motion formulation, (2 more...)

2309.00241

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Learning Automata (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.60)