Goto

Collaborating Authors

 Generative AI


Large-Scale AI in Telecom: Charting the Roadmap for Innovation, Scalability, and Enhanced Digital Experiences

arXiv.org Artificial Intelligence

The rise of generative artificial intelligence (AI) as a novel frontier that uniquely merges advanced levels of intelligence with revolutionary user experiences is redefining the AI landscape for future cellular networks. In particular, the transition towards 6G systems has introduced a myriad of challenges inherent to their AI-native network design, requiring innovative solutions to enable real-time network orchestration, intelligent decision-making, and adaptive dynamic configurations. Meanwhile, the envisioned user experiences for 6G are growing increasingly complex, exceeding the capabilities offered by vintage wireless technologies and conventional AI solutions to satisfy their advanced demands. With its disruptive impact evident across diverse fields, generative AI possesses immense potential to tackle these challenges, leveraging its exceptional capabilities to manage complex tasks, operate autonomously, and adapt seamlessly to scenarios beyond its training domain. Remarkably, generative AI provides a transformative opportunity for telecom and cellular networks to bridge this defined gap in 6G systems, thereby shifting towards a new era with cutting-edge AI innovations across the different system and user levels.


InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference

arXiv.org Machine Learning

Inferring Gene Regulatory Networks (GRNs) from gene expression data is crucial for understanding biological processes. While supervised models are reported to achieve high performance for this task, they rely on costly ground truth (GT) labels and risk learning gene-specific biases, such as class imbalances of GT interactions, rather than true regulatory mechanisms. To address these issues, we introduce InfoSEM, an unsupervised generative model that leverages textual gene embeddings as informative priors, improving GRN inference without GT labels. InfoSEM can also integrate GT labels as an additional prior when available, avoiding biases and further enhancing performance. Additionally, we propose a biologically motivated benchmarking framework that better reflects real-world applications such as biomarker discovery and reveals learned biases of existing supervised methods. InfoSEM outperforms existing models by 38.5% across four datasets using textual embeddings prior and further boosts performance by 11.1% when integrating labeled data as priors.


LLMs' Reshaping of People, Processes, Products, and Society in Software Development: A Comprehensive Exploration with Early Adopters

arXiv.org Artificial Intelligence

Large language models (LLMs) like OpenAI ChatGPT, Google Gemini, and GitHub Copilot are rapidly gaining traction in the software industry, but their full impact on software engineering remains insufficiently explored. Despite their growing adoption, there is a notable lack of formal, qualitative assessments of how LLMs are applied in real-world software development contexts. To fill this gap, we conducted semi-structured interviews with sixteen early-adopter professional developers to explore their use of LLMs throughout various stages of the software development life cycle. Our investigation examines four dimensions: people - how LLMs affect individual developers and teams; process - how LLMs alter software engineering workflows; product - LLM impact on software quality and innovation; and society - the broader socioeconomic and ethical implications of LLM adoption. Thematic analysis of our data reveals that while LLMs have not fundamentally revolutionized the development process, they have substantially enhanced routine coding tasks, including code generation, refactoring, and debugging. Developers reported the most effective outcomes when providing LLMs with clear, well-defined problem statements, indicating that LLMs excel with decomposed problems and specific requirements. Furthermore, these early-adopters identified that LLMs offer significant value for personal and professional development, aiding in learning new languages and concepts. Early-adopters, highly skilled in software engineering and how LLMs work, identified early and persisting challenges for software engineering, such as inaccuracies in generated content and the need for careful manual review before integrating LLM outputs into production environments. Our study provides a nuanced understanding of how LLMs are shaping the landscape of software development, with their benefits, limitations, and ongoing implications.


Memory Is All You Need: Testing How Model Memory Affects LLM Performance in Annotation Tasks

arXiv.org Artificial Intelligence

Generative Large Language Models (LLMs) have shown promising results in text annotation using zero-shot and few-shot learning. Yet these approaches do not allow the model to retain information from previous annotations, making each response independent from the preceding ones. This raises the question of whether model memory -- the LLM having knowledge about its own previous annotations in the same task -- affects performance. In this article, using OpenAI's GPT-4o and Meta's Llama 3.1 on two political science datasets, we demonstrate that allowing the model to retain information about its own previous classifications yields significant performance improvements: between 5 and 25\% when compared to zero-shot and few-shot learning. Moreover, memory reinforcement, a novel approach we propose that combines model memory and reinforcement learning, yields additional performance gains in three out of our four tests. These findings have important implications for applied researchers looking to improve performance and efficiency in LLM annotation tasks.


Simulating the Real World: A Unified Survey of Multimodal Generative Models

arXiv.org Artificial Intelligence

Understanding and replicating the real world is a critical challenge in Artificial General Intelligence (AGI) research. To achieve this, many existing approaches, such as world models, aim to capture the fundamental principles governing the physical world, enabling more accurate simulations and meaningful interactions. However, current methods often treat different modalities, including 2D (images), videos, 3D, and 4D representations, as independent domains, overlooking their interdependencies. Additionally, these methods typically focus on isolated dimensions of reality without systematically integrating their connections. In this survey, we present a unified survey for multimodal generative models that investigate the progression of data dimensionality in real-world simulation. Specifically, this survey starts from 2D generation (appearance), then moves to video (appearance+dynamics) and 3D generation (appearance+geometry), and finally culminates in 4D generation that integrate all dimensions. To the best of our knowledge, this is the first attempt to systematically unify the study of 2D, video, 3D and 4D generation within a single framework. To guide future research, we provide a comprehensive review of datasets, evaluation metrics and future directions, and fostering insights for newcomers. This survey serves as a bridge to advance the study of multimodal generative models and real-world simulation within a unified framework.


Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks

arXiv.org Artificial Intelligence

Inference-Time Scaling has been critical to the success of recent models such as OpenAI o1 and DeepSeek R1. However, many techniques used to train models for inference-time scaling require tasks to have answers that can be verified, limiting their application to domains such as math, coding and logical reasoning. We take inspiration from how humans make first attempts, ask for detailed feedback from others and make improvements based on such feedback across a wide spectrum of open-ended endeavors. To this end, we collect data for and train dedicated Feedback and Edit Models that are capable of performing inference-time scaling for open-ended general-domain tasks. In our setup, one model generates an initial response, which are given feedback by a second model, that are then used by a third model to edit the response. We show that performance on Arena Hard, a benchmark strongly predictive of Chatbot Arena Elo can be boosted by scaling the number of initial response drafts, effective feedback and edited responses. When scaled optimally, our setup based on 70B models from the Llama 3 family can reach SoTA performance on Arena Hard at 92.7 as of 5 Mar 2025, surpassing OpenAI o1-preview-2024-09-12 with 90.4 and DeepSeek R1 with 92.3.


Prompt Programming: A Platform for Dialogue-based Computational Problem Solving with Generative AI Models

arXiv.org Artificial Intelligence

Computing students increasingly rely on generative AI tools for programming assistance, often without formal instruction or guidance. This highlights a need to teach students how to effectively interact with AI models, particularly through natural language prompts, to generate and critically evaluate code for solving computational tasks. To address this, we developed a novel platform for prompt programming that enables authentic dialogue-based interactions, supports problems involving multiple interdependent functions, and offers on-request execution of generated code. Data analysis from over 900 students in an introductory programming course revealed high engagement, with the majority of prompts occurring within multi-turn dialogues. Problems with multiple interdependent functions encouraged iterative refinement, with progression graphs highlighting several common strategies. Students were highly selective about the code they chose to test, suggesting that on-request execution of generated code promoted critical thinking. Given the growing importance of learning dialogue-based programming with AI, we provide this tool as a publicly accessible resource, accompanied by a corpus of programming problems for educational use.


Approaching the Limits to EFL Writing Enhancement with AI-generated Text and Diverse Learners

arXiv.org Artificial Intelligence

Generative artificial intelligence (AI) chatbots, such as ChatGPT, are reshaping how English as a foreign language (EFL) students write since students can compose texts by integrating their own words with AI-generated text. This study investigated how 59 Hong Kong secondary school students with varying levels of academic achievement interacted with AI-generated text to compose a feature article, exploring whether any interaction patterns benefited the overall quality of the article. Through content analysis, multiple linear regression and cluster analysis, we found the overall number of words -- whether AI- or human-generated -- is the main predictor of writing quality. However, the impact varies by students' competence to write independently, for instance, by using their own words accurately and coherently to compose a text, and to follow specific interaction patterns with AI-generated text. Therefore, although composing texts with human words and AI-generated text may become prevalent in EFL writing classrooms, without educators' careful attention to EFL writing pedagogy and AI literacy, high-achieving students stand to benefit more from using AI-generated text than low-achieving students.


Chatbots Are Cheating on Their Benchmark Tests

The Atlantic - Technology

Generative-AI companies have been selling a narrative of unprecedented, endless progress. Just last week, OpenAI introduced GPT-4.5 as its "largest and best model for chat yet." Earlier in February, Google called its latest version of Gemini "the world's best AI model." And in January, the Chinese company DeekSeek touted its R1 model as being just as powerful as OpenAI's o1 model--which Sam Altman had called "the smartest model in the world" the previous month. Yet there is growing evidence that progress is slowing down and that the LLM-powered chatbot may already be near its peak.


UK competition watchdog drops Microsoft-OpenAI probe

BBC News

Critics though say the decision is linked to the changed political environment the CMA is now operating in. The government has instructed the country's regulators to suggest ways of stimulating economic growth. In January, the government removed the then chair of the CMA, Marcus Bokkerink, because it was unhappy with his response to that call. He was replaced on an interim basis by Doug Gurr, former boss of Amazon UK. "The CMA has sat on this decision for over a year, yet within just a few weeks of a former Amazon boss being installed as chair, it has decided everything was absolutely fine all along, nothing to see here," said Foxglove co-executive director Rosa Curling. "This is a bad sign that Big Tech has successfully convinced the prime minister to defang our competition regulator and let Big Tech gobble up the current generation of cutting-edge tech – just like they did the last one," she told the BBC.