AITopics | bootstrapping

Neural Information Processing Systems, Dec-23-2025, 23:15:56 GMT

Bootstrapping the Error of Oja's Algorithm

We consider the problem of quantifying uncertainty for the estimation error of the leading eigenvector from Oja's algorithm for streaming principal component analysis, where the data are generated IID from some unknown distribution. By combining classical tools from the U-statistics literature with recent results on high-dimensional central limit theorems for quadratic forms of random vectors and concentration of matrix products, we establish a weighted $\chi^2$ approximation result for the $\sin^2$ error between the population eigenvector and the output of Oja's algorithm. Since estimating the covariance matrix associated with the approximating distribution requires knowledge of unknown model parameters, we propose a multiplier bootstrap algorithm that may be updated in an online manner. We establish conditions under which the bootstrap distribution is close to the corresponding sampling distribution with high probability, thereby establishing the bootstrap as a consistent inferential method in an appropriate asymptotic regime.
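The ideas in this abstract can be illustrated with a minimal sketch. It runs Oja's streaming update on synthetic Gaussian data whose leading eigenvector is known, and carries a set of bootstrap replicates that are updated online with random multipliers. The function names, the exponential mean-one multipliers, the step size, and the synthetic data are all assumptions made for illustration; the paper's exact bootstrap scheme may differ.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def oja_step(w, x, eta, weight=1.0):
    """One Oja update: w <- normalize(w + eta * weight * x * (x . w))."""
    proj = sum(xi * wi for xi, wi in zip(x, w))
    return normalize([wi + eta * weight * proj * xi for wi, xi in zip(w, x)])


def sin2_error(u, v):
    """sin^2 of the angle between two unit vectors (sign invariant)."""
    dot = sum(a * b for a, b in zip(u, v))
    return max(0.0, 1.0 - dot * dot)


def run(n=5000, d=5, eta=0.01, n_boot=20, seed=0):
    rng = random.Random(seed)
    # Synthetic data (assumption): independent coordinates, so the
    # covariance is diagonal and the leading eigenvector is e_1.
    scales = [3.0] + [1.0] * (d - 1)
    w = normalize([rng.gauss(0, 1) for _ in range(d)])
    boots = [list(w) for _ in range(n_boot)]
    for _ in range(n):
        x = [s * rng.gauss(0, 1) for s in scales]
        w = oja_step(w, x, eta)
        # Each replicate sees the same sample, reweighted by its own
        # mean-one random multiplier (illustrative choice, not the paper's).
        boots = [oja_step(b, x, eta, weight=rng.expovariate(1.0)) for b in boots]
    v1 = [1.0] + [0.0] * (d - 1)
    # True error of the estimate, and bootstrap errors measured against it.
    return sin2_error(w, v1), [sin2_error(b, w) for b in boots]


err, boot_errs = run()
print("sin^2 error of Oja estimate:", err)
print("bootstrap sin^2 errors (first 5):", boot_errs[:5])
```

Every replicate is updated in the same pass over the stream, so no data needs to be stored. The spread of `boot_errs` gives a rough picture of the variability of the estimate. It should not be read as a calibrated confidence statement without the conditions the paper establishes.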

bootstrapping, name change, oja, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.59)

Neural Information Processing Systems, Aug-15-2025, 18:27:35 GMT

Neural Bootstrapper

Shin, Minsuk

Bootstrapping has been a primary tool for ensemble and uncertainty quantification in machine learning and statistics.

artificial intelligence, machine learning, neuboot, (17 more...)

Country:

North America > United States > South Carolina (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning, Jul-15-2025

Information Must Flow: Recursive Bootstrapping for Information Bottleneck in Optimal Transport

Li, Xin

We present the Context-Content Uncertainty Principle (CCUP), a unified framework that models cognition as the directed flow of information between high-entropy context and low-entropy content. Inference emerges as a cycle of bidirectional interactions, bottom-up contextual disambiguation paired with top-down content reconstruction, which resolves the Information Bottleneck in Optimal Transport (iBOT). Implemented via Rao-Blackwellized variational entropy minimization, CCUP steers representations toward minimal joint uncertainty while preserving inferential directionality. Local cycle completion underpins temporal bootstrapping, chaining simulations to refine memory, and spatial bootstrapping, enabling compositional hierarchical inference. We prove a Delta Convergence Theorem showing that recursive entropy minimization yields delta-like attractors in latent space, stabilizing perceptual schemas and motor plans. Temporal bootstrapping through perception-action loops and sleep-wake consolidation further transforms episodic traces into semantic knowledge. Extending CCUP, each hierarchical level performs delta-seeded inference: low-entropy content seeds diffuse outward along goal-constrained paths shaped by top-down priors and external context, confining inference to task-relevant manifolds and circumventing the curse of dimensionality. Building on this, we propose that language emerges as a symbolic transport system, externalizing latent content to synchronize inference cycles across individuals. Together, these results establish iBOT as a foundational principle of information flow in both individual cognition and collective intelligence, positioning recursive inference as the structured conduit through which minds adapt, align, and extend.

artificial intelligence, inference, machine learning, (17 more...)

2507.10443

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New York > Albany County > Albany (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

arXiv.org Artificial Intelligence, Jun-16-2025

Bootstrapping your behavior: a new pretraining strategy for user behavior sequence data

Wu, Weichang, Zhang, Xiaolu, Zhou, Jun, Li, Yuchen, Xia, Wenwen

User Behavior Sequence (UBS) modeling is crucial in industrial applications. As data scale and task diversity grow, UBS pretraining methods have become increasingly pivotal. State-of-the-art UBS pretraining methods rely on predicting behavior distributions. The key step in these methods is constructing a selected behavior vocabulary. However, this manual step is labor-intensive and prone to bias. The limited vocabulary capacity also directly affects models' generalization ability. In this paper, we introduce Bootstrapping Your Behavior (BYB), a novel UBS pretraining strategy that predicts an automatically constructed supervision embedding summarizing all behaviors' information within a future time window, eliminating the manual behavior vocabulary selection. In implementation, we incorporate a student-teacher encoder scheme to construct the pretraining supervision effectively. Experiments on two real-world industrial datasets and eight downstream tasks demonstrate that BYB achieves an average improvement of 3.9% in AUC and 98.9% in training throughput. Notably, the model exhibits meaningful attention patterns and cluster representations during pretraining without any label supervision. In our online deployment over two months, the pretrained model improves the KS by about 2.7% and 7.1% over the baseline model for two financial overdue risk prediction tasks in the Alipay mobile application, which reduces bad debt risk by millions of dollars for Ant Group.

artificial intelligence, machine learning, natural language, (19 more...)

2506.11053

Country:

Asia > China (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology (0.68)
Banking & Finance (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.61)

Jacovella, Maxime, Keshavarzi, Ali, Angelini, Elsa

Curriculum Learning for Few-Shot Domain Adaptation in CT-based Airway Tree Segmentation

arXiv.org Artificial Intelligence, Nov-8-2024

Despite advances in deep learning (DL), automated airway segmentation from chest CT scans continues to face challenges in segmentation quality and generalization across cohorts. To address these, we propose integrating Curriculum Learning (CL) into airway segmentation networks, distributing the training set into batches according to ad-hoc complexity scores derived from CT scans and corresponding ground-truth tree features. We specifically investigate few-shot domain adaptation, targeting scenarios where manual annotation of a full fine-tuning dataset is prohibitively expensive. Results are reported on two large open cohorts (ATM22 and AIIB23), with high performance using CL for full training (Source domain) and few-shot fine-tuning (Target domain), along with some insights into potential detrimental effects when using a classic Bootstrapping scoring function or when not using proper scan sequencing.
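The batching step described in this abstract can be sketched in a few lines: rank the training scans by a complexity score and cut the ranked list into batches, so that training proceeds from easy to hard. The function name, the scan identifiers, and the scores below are hypothetical; the paper derives its scores from CT scans and ground-truth tree features, which this sketch does not attempt to reproduce.

```python
def curriculum_batches(samples, scores, batch_size):
    """Order samples from lowest to highest score, then split into batches."""
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Sort indices by score so the easiest samples come first.
    order = sorted(range(len(samples)), key=lambda i: scores[i])
    ranked = [samples[i] for i in order]
    return [ranked[i:i + batch_size] for i in range(0, len(ranked), batch_size)]


# Hypothetical scan identifiers and complexity scores, for illustration only.
scans = ["scan_a", "scan_b", "scan_c", "scan_d", "scan_e"]
complexity = [0.9, 0.2, 0.5, 0.7, 0.1]

batches = curriculum_batches(scans, complexity, batch_size=2)
print(batches)
# [['scan_e', 'scan_b'], ['scan_c', 'scan_d'], ['scan_a']]
```

The abstract notes that the choice of scoring function and scan sequencing can hurt results, so the ordering key is the part most worth varying in any experiment built on a sketch like this.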

artificial intelligence, machine learning, segmentation, (19 more...)

2411.05779

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.91)
Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Neural Information Processing Systems, Oct-9-2024, 23:31:11 GMT

Bootstrapping the Error of Oja's Algorithm

We consider the problem of quantifying uncertainty for the estimation error of the leading eigenvector from Oja's algorithm for streaming principal component analysis, where the data are generated IID from some unknown distribution. By combining classical tools from the U-statistics literature with recent results on high-dimensional central limit theorems for quadratic forms of random vectors and concentration of matrix products, we establish a weighted $\chi^2$ approximation result for the $\sin^2$ error between the population eigenvector and the output of Oja's algorithm. Since estimating the covariance matrix associated with the approximating distribution requires knowledge of unknown model parameters, we propose a multiplier bootstrap algorithm that may be updated in an online manner. We establish conditions under which the bootstrap distribution is close to the corresponding sampling distribution with high probability, thereby establishing the bootstrap as a consistent inferential method in an appropriate asymptotic regime.

algorithm, bootstrapping, oja

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Zubchenko, Anton, Middlebrooks, Danielle, Rasmussen, Torbjørn, Lausen, Lara, Kuemmeth, Ferdinand, Chatterjee, Anasua, Zwolak, Justyna P.

Autonomous Bootstrapping of Quantum Dot Devices

arXiv.org Artificial Intelligence, Jul-29-2024

Semiconductor quantum dots (QD) are a promising platform for several qubit implementations, all of which are voltage-controlled by programmable gate electrodes. However, as QD arrays grow in size and complexity, tuning procedures that can fully autonomously handle the increasing number of control parameters are becoming essential for enabling scalability. We propose a bootstrapping algorithm for initializing a depletion mode QD device in preparation for subsequent phases of tuning. During bootstrapping, the QD device functionality is validated, all gates are characterized, and the QD charge sensor is made operational. We demonstrate the bootstrapping protocol in conjunction with a coarse tuning module, showing that the combined algorithm can efficiently and reliably take a cooled-down QD device to a desired global state configuration in under 8 minutes with a success rate of 96%. Importantly, by following heuristic approaches to QD device initialization and combining the efficient ray-based measurement with the rapid radio-frequency reflectometry measurements, the proposed algorithm establishes a reference in terms of performance, reliability, and efficiency against which alternative algorithms can be benchmarked.

algorithm, module, voltage, (17 more...)

2407.20061

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Artificial Intelligence, Feb-12-2024

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

Wang, Haoyu, Ma, Guozheng, Meng, Ziqiao, Qin, Zeyu, Shen, Li, Zhang, Zhong, Wu, Bingzhe, Liu, Liu, Bian, Yatao, Xu, Tingyang, Wang, Xueqian, Zhao, Peilin

Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. However, most current methods complete the data collection and training steps in a single round, which may overlook the continuously improving ability of self-aligned models. This gives rise to a key question: What if we do multi-time bootstrapping self-alignment? Does this strategy enhance model performance or lead to rapid degradation? In this paper, we explore the impact of bootstrapping self-alignment on large language models. Our findings reveal that bootstrapping self-alignment markedly surpasses the single-round approach by guaranteeing data diversity from in-context learning. To further exploit the capabilities of bootstrapping, we investigate and adjust the training order of data, which yields improved model performance. Drawing on these findings, we propose Step-On-Feet Tuning (SOFT), which leverages the model's continuously enhanced few-shot ability to boost zero- or one-shot performance. Based on an easy-to-hard training recipe, we propose SOFT+, which further boosts self-alignment's performance. Our experiments demonstrate the efficiency of SOFT (SOFT+) across various classification and generation tasks, highlighting the potential of bootstrapping self-alignment for continually enhancing model alignment performance.

iclexample, internal thought, reliable assistant, (12 more...)

2402.0761

Country:

North America > Canada (0.14)
Asia > China (0.05)
Europe > Spain (0.04)
(9 more...)

Genre:

Personal (1.00)
Research Report > New Finding (0.87)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Health & Medicine > Consumer Health (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence, Dec-7-2023

NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and Bootstrapping

Ju, Jae Hyung, Park, Jaiyoung, Kim, Jongmin, Kim, Donghwan, Ahn, Jung Ho

Fully homomorphic encryption (FHE) is a promising cryptographic primitive for realizing private neural network inference (PI) services by allowing a client to fully offload the inference task to a cloud server while keeping the client data oblivious to the server. This work proposes NeuJeans, an FHE-based solution for the PI of deep convolutional neural networks (CNNs). NeuJeans tackles the critical problem of the enormous computational cost for the FHE evaluation of convolutional layers (conv2d), mainly due to the high cost of data reordering and bootstrapping. We first propose an encoding method introducing nested structures inside encoded vectors for FHE, which enables us to develop efficient conv2d algorithms with reduced data reordering costs. However, the new encoding method also introduces additional computations for conversion between encoding methods, which could negate its advantages. We discover that fusing conv2d with bootstrapping eliminates such computations while reducing the cost of bootstrapping. Then, we devise optimized execution flows for various types of conv2d and apply them to an end-to-end implementation of CNNs. NeuJeans accelerates conv2d by up to 5.68 times compared to state-of-the-art FHE-based PI work and performs the PI of a CNN at the scale of ImageNet (ResNet18) within just a few seconds.

ciphertext, multiplication, operation, (16 more...)

2312.04356

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)