 South America


OpenAI bans Chinese accounts using ChatGPT to edit code for social media surveillance

Engadget

OpenAI has banned the accounts of a group of Chinese users who had attempted to use ChatGPT to debug and edit code for an AI social media surveillance tool, the company said Friday. The campaign, which OpenAI calls Peer Review, saw the group prompt ChatGPT to generate sales pitches for a program those documents suggest was designed to monitor anti-Chinese sentiment on X, Facebook, YouTube, Instagram and other platforms. The operation appears to have been particularly interested in spotting calls for protests against human rights violations in China, with the intent of sharing those insights with the country's authorities. "This network consisted of ChatGPT accounts that operated in a time pattern consistent with mainland Chinese business hours, prompted our models in Chinese, and used our tools with a volume and variety consistent with manual prompting, rather than automation," said OpenAI. "The operators used our models to proofread claims that their insights had been sent to Chinese embassies abroad, and to intelligence agents monitoring protests in countries including the United States, Germany and the United Kingdom."


China, Iran-based threat actors have found new ways to use American AI models for covert influence: Report

FOX News

Threat actors, some likely based in China and Iran, are formulating new ways to hijack and utilize American artificial intelligence (AI) models for malicious intent, including covert influence operations, according to a new report from OpenAI. The February report includes two disruptions involving threat actors that appear to have originated from China. According to the report, these actors have used, or at least attempted to use, models built by OpenAI and Meta. In one example, OpenAI banned a ChatGPT account that generated comments critical of Chinese dissident Cai Xia. The comments were posted on social media by accounts that claimed to be people based in India and the U.S.


Artificial Intelligence as Catalyst for Biodiversity Understanding

Communications of the ACM

Artificial intelligence (AI) is not a panacea for effortlessly solving the planet's environmental problems. AI still sparks passionate and dystopian predictions within some parts of the academic community, especially in the natural sciences. For some, the existence of AI tools means an existential threat to human creativity [10]. Concerns about the increasing environmental costs of carbon emissions [1] and water use demanded by information and communication technologies are also on the horizon. These viewpoints, however, overlook the advantages of employing AI in biodiversity research.


ChatGPT's AI agent Operator is now available for most Pro users

Engadget

Operator is now rolling out to Pro users in Australia, Brazil, Canada, India, Japan, Singapore, South Korea, the UK and most other places where ChatGPT is available, OpenAI has announced. The company launched Operator in the US back in January, introducing it as an "agent that can go to the web to perform tasks" for the user. Operator can handle various browser-based tasks, such as filling out forms, making restaurant reservations and ordering groceries. For now it remains an early-stage research preview with limitations, but the company said it hopes to roll out improvements based on user feedback.


SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

arXiv.org Artificial Intelligence

Modern deep reinforcement learning (DRL) methods have made significant advances in handling continuous action spaces. However, real-world control systems--especially those requiring precise and reliable performance--often demand formal stability, and existing DRL approaches typically lack explicit mechanisms to ensure or analyze stability. To address this limitation, we propose SALSA-RL (Stability Analysis in the Latent Space of Actions), a novel RL framework that models control actions as dynamic, time-dependent variables evolving within a latent space. By employing a pre-trained encoder-decoder and a state-dependent linear system, our approach enables both stability analysis and interpretability. We demonstrated that SALSA-RL can be deployed in a non-invasive manner for assessing the local stability of actions from pretrained RL agents without compromising on performance across diverse benchmark environments. By enabling a more interpretable analysis of action generation, SALSA-RL provides a powerful tool for advancing the design, analysis, and theoretical understanding of RL systems.


AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms

arXiv.org Artificial Intelligence

Transformers and large language models (LLMs) have revolutionized machine learning, with attention mechanisms at the core of their success. As the landscape of attention variants expands, so too do the challenges of optimizing their performance, particularly across different hardware platforms. Current optimization strategies are often narrowly focused, requiring extensive manual intervention to accommodate changes in model configurations or hardware environments. In this paper, we introduce AttentionEngine, a comprehensive framework designed to streamline the optimization of attention mechanisms across heterogeneous hardware backends. By decomposing attention computation into modular operations with customizable components, AttentionEngine enables flexible adaptation to diverse algorithmic requirements. The framework further automates kernel optimization through a combination of programmable templates and a robust cross-platform scheduling strategy. Empirical results reveal performance gains of up to 10x on configurations beyond the reach of existing methods. AttentionEngine offers a scalable, efficient foundation for developing and deploying attention mechanisms with minimal manual tuning. Our code has been open-sourced and is available at https://github.com/microsoft/AttentionEngine.
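To make the idea of "modular operations with customizable components" concrete, here is a minimal pure-Python sketch of attention split into a scoring step, an optional score modifier, and a normalization step. It is not AttentionEngine's API: the names `attention`, `score_mod`, `normalize`, and `causal_mask` are hypothetical, chosen only for illustration, and the code makes no attempt at the kernel-level optimization the abstract describes.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, score_mod=None, normalize=softmax):
    """Scaled dot-product attention with pluggable components.

    score_mod(score, q_idx, k_idx) and normalize(row) are hypothetical
    hooks for this sketch, not part of any real library.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for i, q in enumerate(queries):
        # Step 1: raw similarity scores between query i and every key.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        # Step 2: optional per-element modification (masking, biases, ...).
        if score_mod is not None:
            scores = [score_mod(s, i, j) for j, s in enumerate(scores)]
        # Step 3: normalize scores into weights.
        weights = normalize(scores)
        # Step 4: weighted sum of the value vectors.
        dim = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def causal_mask(score, q_idx, k_idx):
    """Example modifier: a query may only attend to keys at or before it."""
    return score if k_idx <= q_idx else float("-inf")
```

Swapping `causal_mask` for another modifier, or `softmax` for another normalizer, changes the attention variant without touching the main loop, which is the kind of flexibility the abstract attributes to a modular decomposition.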


Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

arXiv.org Artificial Intelligence

Machine learning methods are commonly used to solve inverse problems, wherein an unknown signal must be estimated from few measurements generated via a known acquisition procedure. In particular, neural networks perform well empirically but have limited theoretical guarantees. In this work, we study an underdetermined linear inverse problem that admits several possible solution mappings. A standard remedy (e.g., in compressed sensing) establishing uniqueness of the solution mapping is to assume knowledge of latent low-dimensional structure in the source signal. We ask the following question: do deep neural networks adapt to this low-dimensional structure when trained by gradient descent with weight decay regularization? We prove that mildly overparameterized deep linear networks trained in this manner converge to an approximate solution that accurately solves the inverse problem while implicitly encoding latent subspace structure. To our knowledge, this is the first result to rigorously show that deep linear networks trained with weight decay automatically adapt to latent subspace structure in the data under practical stepsize and weight initialization schemes. Our work highlights that regularization and overparameterization improve generalization, while overparameterization also accelerates convergence during training.


Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation

arXiv.org Artificial Intelligence

Model merging integrates the parameters of multiple models into a unified model, combining their diverse capabilities. Existing model merging methods are often constrained by fixed parameter merging ratios. In this study, we propose Mixup Model Merge (M$^3$), an innovative approach inspired by the Mixup data augmentation technique. This method merges the parameters of two large language models (LLMs) by randomly generating linear interpolation ratios, allowing for a more flexible and comprehensive exploration of the parameter space. Extensive experiments demonstrate the superiority of our proposed M$^3$ method in merging fine-tuned LLMs: (1) it significantly improves performance across multiple tasks, (2) it enhances LLMs' out-of-distribution (OOD) robustness and adversarial robustness, (3) it achieves superior results when combined with sparsification techniques such as DARE, and (4) it offers a simple yet efficient solution that does not require additional computational resources. In conclusion, M$^3$ is a simple yet effective model merging method that significantly enhances the performance of the merged model by randomly generating contribution ratios for two fine-tuned LLMs. The code is available at https://github.com/MLGroupJLU/MixupModelMerge.
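The core operation described above, merging two models with a randomly drawn linear interpolation ratio, can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: drawing the ratio from a Beta(alpha, alpha) distribution is an assumption borrowed from the original Mixup technique, and the function name `mixup_merge` and the dict-of-lists parameter layout are hypothetical.

```python
import random


def mixup_merge(params_a, params_b, alpha=1.0, rng=None):
    """Merge two parameter dicts with one randomly drawn interpolation ratio.

    Assumption: the ratio lam is sampled from Beta(alpha, alpha), as in
    Mixup data augmentation; the abstract only says ratios are random.
    """
    if params_a.keys() != params_b.keys():
        raise ValueError("both models must have the same parameter names")
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # lam lies in [0, 1]
    merged = {
        name: [
            lam * a + (1.0 - lam) * b
            for a, b in zip(params_a[name], params_b[name])
        ]
        for name in params_a
    }
    return merged, lam
```

Calling `mixup_merge` repeatedly yields different merged models, which is how a randomized ratio explores more of the parameter space than a single fixed ratio would.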


Integrating Personality into Digital Humans: A Review of LLM-Driven Approaches for Virtual Reality

arXiv.org Artificial Intelligence

The integration of large language models (LLMs) into virtual reality (VR) environments has opened new pathways for creating more immersive and interactive digital humans. By leveraging the generative capabilities of LLMs alongside multimodal outputs such as facial expressions and gestures, virtual agents can simulate human-like personalities and emotions, fostering richer and more engaging user experiences. This paper provides a comprehensive review of methods for enabling digital humans to adopt nuanced personality traits, exploring approaches such as zero-shot, few-shot, and fine-tuning. Additionally, it highlights the challenges of integrating LLM-driven personality traits into VR, including computational demands, latency issues, and the lack of standardized evaluation frameworks for multimodal interactions. By addressing these gaps, this work lays a foundation for advancing applications in education, therapy, and gaming, while fostering interdisciplinary collaboration to redefine human-computer interaction in VR.


Examining the Dynamics of Local and Transfer Passenger Share Patterns in Air Transportation

arXiv.org Artificial Intelligence

The air transportation local share, defined as the proportion of local passengers relative to total passengers, serves as a critical metric reflecting how economic growth, carrier strategies, and market forces jointly influence demand composition. This metric is particularly useful for examining industry structure changes and large-scale disruptive events such as the COVID-19 pandemic. This research offers an in-depth analysis of local share patterns on more than 3900 Origin and Destination (O&D) pairs across the U.S. air transportation system, revealing how economic expansion, the emergence of low-cost carriers (LCCs), and strategic shifts by legacy carriers have collectively elevated local share. To efficiently identify the local share characteristics of thousands of O&Ds and to categorize the O&Ds that have the same behavior, a range of time series clustering methods were used. Evaluation using visualization, performance metrics, and case-based examination highlighted distinct patterns and trends, from magnitude-based stratification to trend-based groupings. The analysis also identified pattern commonalities within O&D pairs, suggesting that macro-level forces (e.g., economic cycles, changing demographics, or disruptions such as COVID-19) can synchronize changes between disparate markets. These insights set the stage for predictive modeling of local share, guiding airline network planning and infrastructure investments. This study combines quantitative analysis with flexible clustering to help stakeholders anticipate market shifts, optimize resource allocation strategies, and strengthen the air transportation system's resilience and competitiveness.
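The local share metric defined at the start of this abstract is a simple ratio, sketched below. The function name `local_share` and the passenger counts in the usage note are hypothetical; the abstract does not give concrete figures.

```python
def local_share(local_passengers, total_passengers):
    """Return the proportion of local passengers relative to total passengers."""
    if total_passengers <= 0:
        raise ValueError("total_passengers must be positive")
    if not 0 <= local_passengers <= total_passengers:
        raise ValueError("local_passengers must lie between 0 and the total")
    return local_passengers / total_passengers
```

For a hypothetical O&D pair with 300 local passengers out of 1,000 total, `local_share(300, 1000)` gives a local share of 0.3. Computing this per period for each O&D pair would produce the kind of time series that the clustering methods mentioned in the abstract operate on.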