AITopics | Oceania

An Asynchronous Decentralised Optimisation Algorithm for Nonconvex Problems

Mafakheri, Behnam, Manton, Jonathan H., Shames, Iman

arXiv.org Artificial Intelligence, Jul-31-2025

In this paper, we consider nonconvex decentralised optimisation and learning over a network of distributed agents. We develop an ADMM algorithm based on the Randomised Block Coordinate Douglas-Rachford splitting method, which enables agents in the network to compute a set of first-order stationary solutions of the problem in a distributed and asynchronous manner. To the best of our knowledge, this is the first decentralised and asynchronous algorithm for solving nonconvex optimisation problems with a convergence proof. The numerical examples demonstrate the efficiency of the proposed algorithm for distributed Phase Retrieval and sparse Principal Component Analysis problems.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.22311

Country: Oceania > Australia (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Fuzzing: Randomness? Reasoning! Efficient Directed Fuzzing via Large Language Models

Feng, Xiaotao, Zhu, Xiaogang, Hu, Kun, Wang, Jincheng, Cao, Yingjie, Gong, Guang, Pan, Jianfeng

arXiv.org Artificial Intelligence, Jul-31-2025

Fuzzing is highly effective in detecting bugs due to the key contribution of randomness. However, randomness significantly reduces the efficiency of fuzzing, so that exposing a bug can take days or weeks. Even though directed fuzzing reduces randomness by guiding fuzzing towards target buggy locations, the dilemma of randomness still challenges directed fuzzers. Two critical components, seeds and mutators, contain randomness and are closely tied to the conditions required for triggering bugs. Therefore, to address the challenge of randomness, we propose to use large language models (LLMs) to remove the randomness in seeds and reduce the randomness in mutators. With their strong reasoning and code generation capabilities, LLMs can be used to generate reachable seeds that target pre-determined locations and to construct bug-specific mutators tailored for specific bugs. We propose RandLuzz, which integrates LLMs and directed fuzzing, to improve the quality of seeds and mutators, resulting in efficient bug exposure. RandLuzz analyzes function call chains or functionality to guide LLMs in generating reachable seeds. To construct bug-specific mutators, RandLuzz uses LLMs to perform bug analysis, obtaining information such as bug causes and mutation suggestions, which further help generate code that performs bug-specific mutations. We evaluate RandLuzz by comparing it with four state-of-the-art directed fuzzers, AFLGo, Beacon, WindRanger, and SelectFuzz. With RandLuzz-generated seeds, the fuzzers achieve an average speedup ranging from 2.1$\times$ to 4.8$\times$ compared to using widely-used initial seeds. Additionally, when evaluated on individual bugs, RandLuzz achieves up to a 2.7$\times$ speedup compared to the second-fastest exposure. On 8 bugs, RandLuzz can even expose them within 60 seconds.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.22065

Country:

Asia (0.46)
North America > United States (0.46)
Oceania > Australia (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

ST-GDance: Long-Term and Collision-Free Group Choreography from Music

Xu, Jing, Wang, Weiqiang, Chen, Cunjian, Liu, Jun, Ke, Qiuhong

arXiv.org Artificial Intelligence, Jul-31-2025

Group dance generation from music has broad applications in film, gaming, and animation production. However, it requires synchronizing multiple dancers while maintaining spatial coordination. As the number of dancers and sequence length increase, this task faces higher computational complexity and a greater risk of motion collisions. Existing methods often struggle to model dense spatial-temporal interactions, leading to scalability issues and multi-dancer collisions. To address these challenges, we propose ST-GDance, a novel framework that decouples spatial and temporal dependencies to optimize long-term and collision-free group choreography. We employ lightweight graph convolutions for distance-aware spatial modeling and accelerated sparse attention for efficient temporal modeling. This design significantly reduces computational costs while ensuring smooth and collision-free interactions. Experiments on the AIOZ-GDance dataset demonstrate that ST-GDance outperforms state-of-the-art baselines, particularly in generating long and coherent group dance sequences. Project page: https://yilliajing.github.io/ST-GDance-Website/.

artificial intelligence, dance generation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.21518

Country: Oceania > Australia (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

New tech-focused MAHA initiatives will usher in 'new era of convenience,' improve health outcomes, Trump says

FOX News, Jul-30-2025, 19:48:20 GMT

The White House revealed new details Wednesday regarding the Trump administration's efforts to advance healthcare technology and partnerships with private-sector technology companies. The "Make Health Tech Great Again" event was expected to provide more details on how the administration is advancing a "next-generation digital health ecosystem," after securing partnerships with companies including Amazon, Anthropic, Apple, Google, and OpenAI to better share information between patients and providers within Medicare and Medicaid services. "For decades, bureaucrats and entrenched interests buried health data and blocked patients from taking control of their health," Department of Health and Human Services Secretary Robert F. Kennedy Jr. said in a statement Wednesday ahead of the event.

artificial intelligence, bioinformatics, president trump, (15 more...)

FOX News

Country:

Asia > Indonesia (0.06)
North America > United States > District of Columbia > Washington (0.05)
Oceania > Australia (0.05)

Industry:

Health & Medicine > Health Care Providers & Services > Reimbursement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Biomedical Informatics > Clinical Informatics (0.37)

VN-MTEB: Vietnamese Massive Text Embedding Benchmark

Pham, Loc, Luu, Tung, Vo, Thu, Nguyen, Minh, Hoang, Viet

arXiv.org Artificial Intelligence, Jul-30-2025

Vietnam ranks among the top countries in terms of both internet traffic and online toxicity. As a result, implementing embedding models for recommendation and content control tasks in applications is crucial. However, a lack of large-scale test datasets, both in volume and task diversity, makes it difficult for researchers to effectively evaluate AI models before deploying them in real-world, large-scale projects. To solve this important problem, we introduce a Vietnamese benchmark for embedding models, VN-MTEB, which we created by translating a large number of English samples from the Massive Text Embedding Benchmark using our new automated framework. We leverage the strengths of large language models (LLMs) and cutting-edge embedding models to conduct translation and filtering processes to retain high-quality samples, guaranteeing a natural flow of language and semantic fidelity while preserving named entities and code snippets. Our comprehensive benchmark consists of 41 datasets from six tasks specifically designed for Vietnamese text embeddings. In our analysis, we find that bigger and more complex models using Rotary Positional Embedding outperform those using Absolute Positional Embedding in embedding tasks. Datasets are available at HuggingFace: https://huggingface.co/collections/GreenNode/vn-mteb-68871433f0f7573b8e1a6686

huggingface, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2507.215

Country:

Asia > Vietnam (0.24)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Measuring Sample Quality with Copula Discrepancies

Aich, Agnideep, Aich, Ashit Baran, Wade, Bruce

arXiv.org Machine Learning, Jul-30-2025

The scalable Markov chain Monte Carlo (MCMC) algorithms that underpin modern Bayesian machine learning, such as Stochastic Gradient Langevin Dynamics (SGLD), sacrifice asymptotic exactness for computational speed, creating a critical diagnostic gap: traditional sample quality measures fail catastrophically when applied to biased samplers. While powerful Stein-based diagnostics can detect distributional mismatches, they provide no direct assessment of dependence structure, often the primary inferential target in multivariate problems. We introduce the Copula Discrepancy (CD), a principled and computationally efficient diagnostic that leverages Sklar's theorem to isolate and quantify the fidelity of a sample's dependence structure independent of its marginals. Our theoretical framework provides the first structure-aware diagnostic specifically designed for the era of approximate inference. Empirically, we demonstrate that a moment-based CD dramatically outperforms standard diagnostics like effective sample size for hyperparameter selection in biased MCMC, correctly identifying optimal configurations where traditional methods fail. Furthermore, our robust MLE-based variant can detect subtle but critical mismatches in tail dependence that remain invisible to rank correlation-based approaches, distinguishing between samples with identical Kendall's tau but fundamentally different extreme-event behavior. With computational overhead orders of magnitude lower than existing Stein discrepancies, the CD provides both immediate practical value for MCMC practitioners and a theoretical foundation for the next generation of structure-aware sample quality assessment.
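
To make "dependence structure independent of its marginals" concrete, here is a minimal Python sketch. It is not the authors' Copula Discrepancy; it only illustrates the idea behind working on the copula scale. A bivariate sample is mapped to rank-based pseudo-observations, and Kendall's tau is checked to be unchanged by strictly increasing transformations of each margin. The function names `pseudo_observations` and `kendall_tau`, the sample size, and the toy data are all assumptions made for illustration.

```python
import math
import random


def pseudo_observations(values):
    """Map a sample to (0, 1) via scaled ranks rank / (n + 1). Assumes no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return [r / (n + 1) for r in ranks]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau (tau-a) by direct pair counting; O(n^2), assumes no ties."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            # +1 for a concordant pair, -1 for a discordant pair.
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return total / (n * (n - 1) / 2)


# Toy positively dependent sample (illustrative data, not from the paper).
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [0.7 * xi + random.gauss(0.0, 1.0) for xi in x]

tau_raw = kendall_tau(x, y)
# Strictly increasing maps change the marginals but not the ordering of the sample.
tau_transformed = kendall_tau([math.exp(v) for v in x], [v ** 3 for v in y])
# The rank transform is the empirical analogue of moving to the copula scale.
tau_ranks = kendall_tau(pseudo_observations(x), pseudo_observations(y))
```

All three values agree because tau depends only on the ordering of the observations. That same property is why, as the abstract notes, two samples can share a Kendall's tau while differing in tail dependence: a single rank correlation summarises the copula without determining it.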

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2507.21434

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
(7 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions

Aich, Agnideep, Aich, Ashit Baran, Wade, Bruce

arXiv.org Machine Learning, Jul-30-2025

The convergence of gradient descent (GD) on the non-convex loss landscapes of deep neural networks (DNNs) presents a fundamental theoretical challenge. While recent work has established that GD converges to a stationary point at a sublinear rate within locally quasi-convex regions (LQCRs), this fails to explain the exponential convergence rates consistently observed in practice. In this paper, we resolve this discrepancy by proving that under a mild assumption on Neural Tangent Kernel (NTK) stability, these same regions satisfy a local Polyak-Lojasiewicz (PL) condition. We introduce the concept of a Locally Polyak-Lojasiewicz Region (LPLR), where the squared gradient norm is bounded below by a multiple of the suboptimality gap, prove that properly initialized finite-width networks admit such regions around initialization, and establish that GD achieves linear convergence within an LPLR, providing the first finite-width guarantee that matches empirically observed rates. We validate our theory across diverse settings, from controlled experiments on fully-connected networks to modern ResNet architectures trained with stochastic methods, demonstrating that LPLR structure emerges robustly in practical deep learning scenarios. By rigorously connecting local landscape geometry to fast optimization through the NTK framework, our work provides a definitive theoretical explanation for the remarkable efficiency of gradient-based optimization in deep learning.
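
For readers unfamiliar with the PL condition, the textbook global version and the linear rate it implies can be written as follows. This is a sketch of the standard argument, not the paper's local statement; the symbols $\mu$ (PL constant), $L$ (smoothness constant), and $f^{*}$ (minimum value) are assumed notation, and the paper's LPLR result would restrict the inequality to a region around initialization.

```latex
% PL condition for an L-smooth function f with minimum value f^*:
\frac{1}{2}\,\lVert \nabla f(x) \rVert^{2} \;\ge\; \mu \bigl( f(x) - f^{*} \bigr), \qquad \mu > 0 .

% Descent lemma for the gradient step x_{k+1} = x_k - \tfrac{1}{L}\,\nabla f(x_k):
f(x_{k+1}) \;\le\; f(x_k) - \frac{1}{2L}\,\lVert \nabla f(x_k) \rVert^{2} .

% Substituting the PL inequality into the descent lemma gives a linear rate:
f(x_{k+1}) - f^{*} \;\le\; \Bigl( 1 - \frac{\mu}{L} \Bigr) \bigl( f(x_k) - f^{*} \bigr)
\quad\Longrightarrow\quad
f(x_k) - f^{*} \;\le\; \Bigl( 1 - \frac{\mu}{L} \Bigr)^{k} \bigl( f(x_0) - f^{*} \bigr) .
```

The contraction factor $1 - \mu/L$ is what "linear" (that is, exponential in $k$) convergence refers to, in contrast to the sublinear $O(1/k)$-type rates typical of generic non-convex or merely convex analyses.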

artificial intelligence, convergence, machine learning, (18 more...)

arXiv.org Machine Learning

2507.21429

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

A DPI-PAC-Bayesian Framework for Generalization Bounds

Guan, Muhan, Farokhi, Farhad, Zhu, Jingge

arXiv.org Machine Learning, Jul-30-2025

We develop a unified Data Processing Inequality PAC-Bayesian framework -- abbreviated DPI-PAC-Bayesian -- for deriving generalization error bounds in the supervised learning setting. By embedding the Data Processing Inequality (DPI) into the change-of-measure technique, we obtain explicit bounds on the binary Kullback-Leibler generalization gap for both Rényi divergence and any $f$-divergence measured between a data-independent prior distribution and an algorithm-dependent posterior distribution. We present three bounds derived under our framework using Rényi, Hellinger $p$ and Chi-Squared divergences. Additionally, our framework also demonstrates a close connection with other well-known bounds. When the prior distribution is chosen to be uniform, our bounds recover the classical Occam's Razor bound and, crucially, eliminate the extraneous $\log(2\sqrt{n})/n$ slack present in the PAC-Bayes bound, thereby achieving tighter results. The framework thus bridges data-processing and PAC-Bayesian perspectives, providing a flexible, information-theoretic tool to construct generalization guarantees.
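
The $\log(2\sqrt{n})/n$ slack mentioned above can be seen by placing two classical bounds side by side. The forms below are the commonly stated ones, given here as a sketch with assumed notation ($\hat{L}_S$ for empirical risk on $n$ i.i.d. samples, $L$ for true risk, $P$ for the prior, $Q$ for the posterior, $\delta$ for the confidence parameter); the paper's own statements and constants may differ.

```latex
% Binary Kullback-Leibler divergence between Bernoulli parameters q and p:
\mathrm{kl}(q \,\|\, p) \;=\; q \ln\frac{q}{p} + (1-q) \ln\frac{1-q}{1-p} .

% PAC-Bayes-kl bound: with probability at least 1 - \delta, for all posteriors Q,
\mathrm{kl}\bigl( \hat{L}_S(Q) \,\big\|\, L(Q) \bigr)
\;\le\; \frac{ \mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta} }{ n } .

% Occam's Razor bound (countable hypothesis class): with probability at least 1 - \delta, for all h,
\mathrm{kl}\bigl( \hat{L}_S(h) \,\big\|\, L(h) \bigr)
\;\le\; \frac{ \ln\frac{1}{P(h)} + \ln\frac{1}{\delta} }{ n } .
```

Taking $Q$ to be a point mass on $h$ gives $\mathrm{KL}(Q \,\|\, P) = \ln\frac{1}{P(h)}$, so the PAC-Bayes-kl bound exceeds the Occam's Razor bound by exactly $\ln(2\sqrt{n})/n$; for a uniform prior on a finite class $\mathcal{H}$, $\ln\frac{1}{P(h)} = \ln\lvert\mathcal{H}\rvert$.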

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2507.14795

Country:

Oceania > Australia > Victoria (0.04)
North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

On Explaining Visual Captioning with Hybrid Markov Logic Networks

Shah, Monika, Sarkhel, Somdeb, Venugopal, Deepak

arXiv.org Artificial Intelligence, Jul-30-2025

Deep Neural Networks (DNNs) have made tremendous progress in multimodal tasks such as image captioning. However, explaining or interpreting how these models integrate visual information, language information, and knowledge representation to generate meaningful captions remains a challenging problem. Standard performance metrics typically rely on comparing generated captions with human-written ones, which may not give a user deep insight into this integration. In this work, we develop a novel, easily interpretable explanation framework based on Hybrid Markov Logic Networks (HMLNs), a language that can combine symbolic rules with real-valued functions, where we hypothesize how relevant examples from the training data could have influenced the generation of the observed caption. To do this, we learn an HMLN distribution over the training instances and infer the shift in distributions over these instances when we condition on the generated sample, which allows us to quantify which examples may have been a source of richer information to generate the observed caption. Our experiments on captions generated for several state-of-the-art captioning models using Amazon Mechanical Turk illustrate the interpretability of our explanations, and allow us to compare these models along the dimension of explainability.

artificial intelligence, caption, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.21246

Country: Oceania > Australia (0.16)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Uncovering Gradient Inversion Risks in Practical Language Model Training

Feng, Xinguo, Ma, Zhongkui, Wang, Zihan, Chegne, Eu Joe, Ma, Mengyao, Abuadbba, Alsharif, Bai, Guangdong

arXiv.org Artificial Intelligence, Jul-30-2025

The gradient inversion attack has been demonstrated as a significant privacy threat to federated learning (FL), particularly in continuous domains such as vision models. In contrast, it is often considered less effective or highly dependent on impractical training settings when applied to language models, due to the challenges posed by the discrete nature of tokens in text data. As a result, its potential privacy threats remain largely underestimated, despite FL being an emerging training method for language models. In this work, we propose a domain-specific gradient inversion attack named Grab (gradient inversion with hybrid optimization). Grab features two alternating optimization processes to address the challenges caused by practical training settings, including a simultaneous optimization on dropout masks between layers for improved token recovery and a discrete optimization for effective token sequencing. Grab can recover a significant portion (up to 92.9% recovery rate) of the private training data, outperforming the attack strategy of utilizing discrete optimization with an auxiliary model by notable improvements of up to 28.9% recovery rate in benchmark settings and 48.5% recovery rate in practical settings. Grab provides a valuable step forward in understanding this privacy threat in the emerging FL training mode of language models.

machine learning, natural language, optimization, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3658644.3690292

2507.21198

Country:

North America > United States (0.30)
Oceania > Australia > Queensland (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
