Supplementary Material "Fast Bayesian Estimation of Point Process Intensity as Function of Covariates"
Current affiliation is Yokohama City University.

We detail the derivation of the predictive covariance shown in (19-20). We detail the derivation of the marginal likelihood, p(D), shown in (23). Finally, we obtain the marginal likelihood in a tractable form, log p(D) = log|Z| - (1/2) log|I + ...|. We detail the derivation of the functional determinant of the equivalent kernel, |H|, when the naive and degenerate approaches are applied.

S4.1 Naive Approach
The equivalent kernel is constructed under the naive approach as follows: h(y, y') = ... Mercer's theorem [5] states that a kernel function of finite rank M has a diagonal representation such that k(y, y') = ...

S5.1 Model Configuration
Augmented Permanental Process (APP): Let the number of samples for the quasi-Monte Carlo method be denoted by J, and the ranks of the approximate kernel function for the random feature map [6] and the Nyström approximation [8, 9] be denoted by M. We employed a popular gradient descent algorithm, Adam [4], to solve the minimization problem (see Section 2.2). B was set to 10 in the experiments.
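As a sketch of the rank-M kernel approximation referred to above, here is a minimal random Fourier feature map for a one-dimensional RBF kernel; the length scale, rank, and test points are illustrative assumptions, not values from the paper:

```python
import math
import random

random.seed(0)

M = 2000   # rank of the approximate kernel (illustrative value)
ell = 1.0  # RBF length scale (illustrative value)

# Random feature map for k(y, y') = exp(-(y - y')^2 / (2 ell^2)):
# z_m(y) = sqrt(2/M) * cos(w_m y + b_m), w_m ~ N(0, 1/ell^2), b_m ~ U[0, 2pi],
# so that the inner product of feature vectors approximates the kernel.
w = [random.gauss(0.0, 1.0 / ell) for _ in range(M)]
b = [random.uniform(0.0, 2.0 * math.pi) for _ in range(M)]

def features(y):
    return [math.sqrt(2.0 / M) * math.cos(w[m] * y + b[m]) for m in range(M)]

def k_approx(y1, y2):
    # Rank-M (diagonal, Mercer-style) approximation of the kernel.
    return sum(a * c for a, c in zip(features(y1), features(y2)))

def k_exact(y1, y2):
    return math.exp(-((y1 - y2) ** 2) / (2.0 * ell ** 2))
```

The approximation error decays like O(1/sqrt(M)), which is why the rank M trades accuracy against cost in both the random feature map and Nyström variants.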
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)
Edge-Based Speech Transcription and Synthesis for Kinyarwanda and Swahili Languages
Mbonimpa, Pacome Simon, Tuyizere, Diane, Biyabani, Azizuddin Ahmed, Tonguz, Ozan K.
Abstract--This paper presents a novel framework for speech transcription and synthesis, leveraging edge-cloud parallelism to enhance processing speed and accessibility for Kinyarwanda and Swahili speakers. It addresses the scarcity of powerful language processing tools for these widely spoken languages in East African countries with limited technological infrastructure. The framework utilizes the Whisper and SpeechT5 pre-trained models to enable speech-to-text (STT) and text-to-speech (TTS) translation. The architecture uses a cascading mechanism that distributes the model inference workload between the edge device and the cloud, thereby reducing latency and resource usage and benefiting both ends. On the edge device, our approach achieves a memory usage compression of 9.5% for the SpeechT5 model and 14% for the Whisper model, with a maximum memory usage of 149 MB. Experimental results indicate that on a 1.7 GHz CPU edge device with a 1 MB/s network bandwidth, the system can process a 270-character text in less than a minute for both speech-to-text and text-to-speech tasks. Using real-world survey data from Kenya, it is shown that the proposed cascaded edge-cloud architecture could easily serve as an excellent platform for STT and TTS with good accuracy and response time. I. INTRODUCTION In today's digital age, the need for accurate and efficient speech transcription and synthesis models has been increasing rapidly. These models play an important role in a variety of applications, such as learning new languages, accessibility tools for people with reading or hearing difficulties, and automated voice assistants [1]. Kinyarwanda and Swahili are two of the local languages spoken in East Africa. Swahili is the most widely spoken language in East Africa, with speaker estimates ranging from 60 million to over 150 million [2].
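The cascading edge-cloud decision the abstract describes can be sketched as a latency comparison between local inference and offloading. The toy model below is purely illustrative; all constants are assumptions, not measurements or code from the paper:

```python
# Toy latency model for deciding whether to run inference on the edge
# device or offload to the cloud. All constants are invented for
# illustration.

EDGE_SECONDS_PER_CHAR = 0.2    # assumed edge inference cost on a slow CPU
CLOUD_SECONDS_PER_CHAR = 0.02  # assumed cloud inference cost
BYTES_PER_CHAR = 120           # assumed payload size per character of text

def edge_latency(chars):
    # All work stays on the device: no transfer, slow compute.
    return chars * EDGE_SECONDS_PER_CHAR

def cloud_latency(chars, bandwidth_bytes_per_s):
    # Pay the network transfer up front, then fast compute in the cloud.
    transfer = chars * BYTES_PER_CHAR / bandwidth_bytes_per_s
    return transfer + chars * CLOUD_SECONDS_PER_CHAR

def route(chars, bandwidth_bytes_per_s):
    # Cascade: keep the request on the edge unless the network makes
    # the cloud round trip cheaper.
    if edge_latency(chars) <= cloud_latency(chars, bandwidth_bytes_per_s):
        return "edge"
    return "cloud"
```

For example, with a fast 1 MB/s link a long request is better offloaded, while on a very slow link the edge wins despite its weaker CPU.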
- Africa > East Africa (0.54)
- Africa > Kenya (0.26)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- (5 more...)
Large Scale Canonical Correlation Analysis with Iterative Least Squares
Canonical Correlation Analysis (CCA) is a widely used statistical tool with both well-established theory and favorable performance for a wide range of machine learning problems. However, computing CCA for huge datasets can be very slow, since it involves the QR decomposition or singular value decomposition of huge matrices. In this paper we introduce L-CCA, an iterative algorithm that computes CCA quickly on huge sparse datasets. Theory on both the asymptotic convergence and the finite-time accuracy of L-CCA is established. Experiments also show that L-CCA outperforms other fast CCA approximation schemes on two real datasets.
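As a sketch of the iterative least squares idea (not the paper's L-CCA algorithm itself): the leading canonical pair can be found by alternately regressing each view's current projection on the other view, a power-iteration-style scheme that avoids any full QR or SVD. A minimal pure-Python version on toy data, with all data sizes and the shared-latent construction invented for illustration:

```python
import math
import random

random.seed(0)

# Synthetic two-view data: a shared latent signal z drives the first
# column of each view; the second columns are pure noise.
n = 500
z = [random.gauss(0, 1) for _ in range(n)]
X = [[z[i] + 0.3 * random.gauss(0, 1), random.gauss(0, 1)] for i in range(n)]
Y = [[-z[i] + 0.3 * random.gauss(0, 1), random.gauss(0, 1)] for i in range(n)]

def proj(M, w):
    return [row[0] * w[0] + row[1] * w[1] for row in M]

def solve2(A, b):
    # 2x2 linear system by Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def lstsq2(M, t):
    # Least squares argmin_w ||M w - t||^2 via the normal equations.
    A = [[sum(r[a] * r[b] for r in M) for b in range(2)] for a in range(2)]
    b = [sum(M[i][a] * t[i] for i in range(n)) for a in range(2)]
    return solve2(A, b)

def normalize(w, M):
    # Scale w so the projection M w has unit second moment.
    s = proj(M, w)
    nrm = math.sqrt(sum(v * v for v in s) / n)
    return [w[0] / nrm, w[1] / nrm]

# Alternating least squares for the top canonical pair.
u, v = [1.0, 0.0], [1.0, 0.0]
for _ in range(20):
    v = normalize(lstsq2(Y, proj(X, u)), Y)
    u = normalize(lstsq2(X, proj(Y, v)), X)

def corr(a, b):
    ma, mb = sum(a) / n, sum(b) / n
    ca = [x - ma for x in a]
    cb = [x - mb for x in b]
    return sum(x * y for x, y in zip(ca, cb)) / math.sqrt(
        sum(x * x for x in ca) * sum(y * y for y in cb))

rho = abs(corr(proj(X, u), proj(Y, v)))  # estimated top canonical correlation
```

Each inner step is an ordinary least squares problem, which is why sparse iterative solvers make this approach scale to huge datasets.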
- Asia > Middle East > Jordan (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (3 more...)
A Unified Optimization Framework for Multiclass Classification with Structured Hyperplane Arrangements
Blanco, Víctor, Kothari, Harshit, Luedtke, James
In this paper, we propose a new mathematical optimization model for multiclass classification based on arrangements of hyperplanes. Our approach preserves the core support vector machine (SVM) paradigm of maximizing class separation while minimizing misclassification errors, and it is computationally more efficient than a previous formulation. We present a kernel-based extension that allows it to construct nonlinear decision boundaries. Furthermore, we show how the framework can naturally incorporate alternative geometric structures, including classification trees, $\ell_p$-SVMs, and models with discrete feature selection. To address large-scale instances, we develop a dynamic clustering matheuristic that leverages the proposed MIP formulation. Extensive computational experiments demonstrate the efficiency of the proposed model and dynamic clustering heuristic, and we report competitive classification performance on both synthetic datasets and real-world benchmarks from the UCI Machine Learning Repository, comparing our method with state-of-the-art implementations available in scikit-learn.
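To illustrate the underlying idea of classifying with an arrangement of hyperplanes: each input falls into a cell of the arrangement, identified by its vector of signs with respect to the hyperplanes, and each cell carries a class label. The hyperplanes and labels below are invented for illustration; they are not the paper's learned model:

```python
# Two hyperplanes in R^2, each given as (normal vector w, offset b).
hyperplanes = [((1.0, 0.0), 0.0),   # the line x = 0
               ((0.0, 1.0), 0.0)]   # the line y = 0

def sign_pattern(p):
    # Which side of each hyperplane the point p lies on.
    return tuple(1 if w[0] * p[0] + w[1] * p[1] + b >= 0 else -1
                 for w, b in hyperplanes)

# The arrangement has four cells (the quadrants); assign one class each.
cell_to_class = {(1, 1): "A", (-1, 1): "B", (-1, -1): "C", (1, -1): "D"}

def classify(p):
    return cell_to_class[sign_pattern(p)]
```

In the optimization model, the hyperplanes and the cell-to-class assignment are chosen jointly to maximize margins while minimizing misclassification, rather than fixed in advance as here.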
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
- Europe > Ireland (0.04)
- North America > United States > California (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
R2: Explain the identity $\inf_{c \in [0,1]} \frac{1}{2}\left(c + \frac{a}{c}\right) \le \sqrt{a} + a/2$.
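For reference, the infimum in R2, $\inf_{c \in [0,1]} \frac{1}{2}(c + a/c)$, has a simple closed form: the stationary point of $\frac{1}{2}(c + a/c)$ is $c = \sqrt{a}$, which lies in $[0,1]$ only when $a \le 1$, so the infimum is $\sqrt{a}$ for $a \le 1$ and $\frac{1+a}{2}$ (attained at $c = 1$) for $a > 1$. A quick numeric check of this closed form, as a sketch:

```python
import math

def f(c, a):
    return 0.5 * (c + a / c)

def grid_inf(a, n=20000):
    # Brute-force infimum over a fine grid of c in (0, 1].
    return min(f(i / n, a) for i in range(1, n + 1))

def closed_form(a):
    # Minimizer c = sqrt(a) stays inside [0,1] only when a <= 1;
    # otherwise f is decreasing on (0,1] and the infimum sits at c = 1.
    return math.sqrt(a) if a <= 1 else (1 + a) / 2
```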
We thank the reviewers for the feedback. Please find our responses to the other comments/queries below. "Please be specific about what part of [31] is being referenced in line 166." It is Theorem 6.7 on page 336 of [31]; we will be more precise in the final version. We described the gist of the algorithms in lines 66-69.
Accelerating Particle-based Energetic Variational Inference
Bao, Xuelian, Kang, Lulu, Liu, Chun, Wang, Yiwei
In this work, we propose a novel particle-based variational inference (ParVI) method that accelerates the EVI-Im, proposed in Ref. [41]. Inspired by energy quadratization (EQ) and operator splitting techniques for gradient flows, our approach efficiently drives particles towards the target distribution. Unlike EVI-Im, which employs the implicit Euler method to solve variational-preserving particle dynamics for minimizing the KL divergence, derived using a "discretize-then-variational" approach, the proposed algorithm avoids repeated evaluation of inter-particle interaction terms, significantly reducing computational cost. The framework is also extensible to other gradient-based sampling techniques. Through several numerical experiments, we demonstrate that our method outperforms existing ParVI approaches in efficiency, robustness, and accuracy.
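As background on particle-based updates of this kind, here is a generic SVGD-style ParVI step in pure Python (a standard baseline scheme, not the accelerated method proposed in the paper; the target, bandwidth, and step size are illustrative assumptions):

```python
import math
import random

random.seed(0)

# Target: standard normal, whose score (gradient of log density) is -x.
def score(x):
    return -x

# RBF kernel and its derivative in the first argument.
def k(x, y, h=1.0):
    return math.exp(-(x - y) ** 2 / (2 * h * h))

def dk(x, y, h=1.0):
    return -(x - y) / (h * h) * k(x, y, h)

# SVGD update: each particle moves along
# phi(x_i) = mean_j [ k(x_j, x_i) * score(x_j) + d/dx_j k(x_j, x_i) ],
# a kernel-weighted drift toward the target plus a repulsion term
# (the inter-particle interactions whose repeated evaluation the
# accelerated method avoids).
particles = [random.uniform(3, 5) for _ in range(50)]
step = 0.2
for _ in range(300):
    n = len(particles)
    new = []
    for xi in particles:
        phi = sum(k(xj, xi) * score(xj) + dk(xj, xi) for xj in particles) / n
        new.append(xi + step * phi)
    particles = new

mean = sum(particles) / len(particles)  # should approach the target mean 0
```

Note the O(n^2) interaction sum inside every iteration; reducing how often such terms are evaluated is exactly where splitting and quadratization techniques pay off.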
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > California > Riverside County > Riverside (0.14)
- (9 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Solving the Best Subset Selection Problem via Suboptimal Algorithms
Best subset selection in linear regression is well known to be nonconvex and computationally challenging to solve, as the number of possible subsets grows rapidly with the dimensionality of the problem. As a result, finding the globally optimal solution via an exact optimization method for a problem with dimension in the thousands may take an impractical amount of CPU time. This suggests the importance of finding suboptimal procedures that can provide good approximate solutions using much less computational effort than exact methods. In this work, we introduce a new procedure and compare it with other popular suboptimal algorithms for solving the best subset selection problem. Extensive computational experiments using synthetic and real data have been performed. The results provide insights into the performance of these methods in different data settings. The new procedure is observed to be a competitive suboptimal algorithm for solving the best subset selection problem for high-dimensional data.
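One classic suboptimal procedure of the kind compared here is forward stepwise selection, which grows the subset greedily instead of searching all of them. A minimal pure-Python sketch on synthetic data (the data, dimensions, and coefficients are invented for illustration; this is not the paper's new procedure):

```python
import random

random.seed(1)

# Synthetic data: only features 0 and 3 carry signal.
n, p = 200, 6
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[3] + random.gauss(0, 0.1) for row in X]

def rss(cols):
    # Residual sum of squares of least squares on the selected columns,
    # solving the normal equations by Gaussian elimination.
    k = len(cols)
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in cols] for a in cols]
    c = [sum(X[i][a] * y[i] for i in range(n)) for a in cols]
    M = [row[:] + [c[j]] for j, row in enumerate(A)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, k):
            f = M[r][col] / M[col][col]
            for cc in range(col, k + 1):
                M[r][cc] -= f * M[col][cc]
    w = [0.0] * k
    for r in range(k - 1, -1, -1):
        w[r] = (M[r][k] - sum(M[r][cc] * w[cc] for cc in range(r + 1, k))) / M[r][r]
    resid = [y[i] - sum(w[j] * X[i][cols[j]] for j in range(k)) for i in range(n)]
    return sum(e * e for e in resid)

def forward_stepwise(k_max):
    # Greedily add the feature that most reduces the RSS.
    selected = []
    for _ in range(k_max):
        best = min((j for j in range(p) if j not in selected),
                   key=lambda j: rss(selected + [j]))
        selected.append(best)
    return sorted(selected)
```

The greedy search fits only O(p * k) regressions instead of examining all C(p, k) subsets, which is the cost/quality trade-off that motivates comparing suboptimal procedures at all.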
- North America > United States > Oklahoma > Oklahoma County > Edmond (0.14)
- North America > United States > Alabama > Tuscaloosa County > Tuscaloosa (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)