A Survey on Recent Advances in Self-Organizing Maps
Guérin, Axel, Chauvet, Pierre, Saubion, Frédéric
The Self-Organising Map (SOM) algorithm is a well-known approach for unsupervised learning, designed to distill a high-dimensional dataset into a more manageable, typically two-dimensional, representation. Consider a dataset of p variables measured across n observations. A Self-Organising Map groups similar observations together and displays the groups on a map. This model, also known as a Kohonen map or Kohonen network, was introduced by Teuvo Kohonen [Koh82, Koh97]. Unlike conventional neural networks, which rely on error correction, SOM training relies on competitive principles. Kohonen drew inspiration from biological paradigms, in particular the neural models [MP69] and Alan Turing's pioneering theories of morphogenesis [Tur52]. In essence, self-organising maps are tools for dissecting and visualising complex data, helping to reveal the structures and relationships within multidimensional datasets. Self-organising maps, like most artificial neural network architectures, operate in two distinct modes: training and mapping.
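The two modes mentioned above, competitive training and mapping, can be illustrated with a minimal sketch. This is not taken from the survey: the grid size, the linearly decaying learning rate and neighbourhood radius, the Gaussian neighbourhood function, and the names `train_som` and `best_matching_unit` are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Mapping mode: return the grid cell whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Training mode: competitive learning on a rows x cols grid (illustrative)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid cell, initialised uniformly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # assumed linear decay
        sigma = sigma0 * (1.0 - frac) + 0.1     # assumed shrinking neighbourhood
        for x in rng.sample(data, len(data)):   # visit observations in random order
            br, bc = best_matching_unit(weights, x)   # competition: the winner
            for (r, c), w in weights.items():
                grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                # Cooperation: the winner and its grid neighbours move toward x.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy data: n = 6 observations of p = 2 variables, forming two loose groups.
data = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [0.9, 1.0], [1.0, 0.9], [0.95, 0.95]]
som = train_som(data)
cells = [best_matching_unit(som, x) for x in data]  # map cell for each observation
```

No error signal is propagated here: each update is driven only by which cell wins the competition for an observation, which is the contrast with error-correction learning drawn in the text.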
The 50 greatest innovations of 2024
In 1988, we launched the Best of What's New Awards. The original list highlighted "the very things that make our lives more comfortable, more rewarding, more exciting, and more fun," to quote then-Publisher Grant A. Burnett. Now, in 2024, we continue our decades-old tradition of honoring big ideas. We even see hints of our original honorees in this year's list: Sea-Doo and Ford made both lists, 36 years apart. We're proud to bring you promising innovations--from things that make life at home easier to literal out-of-this-world explorations. This is the Best of What's New 2024. Had you asked me at the beginning of 2024 what our best gadgets list would look like, I'd have guessed it would be filled with quirky AI-driven devices like the rabbit R1 or the Humane Ai Pin. "Now with AI" is a phrase that has dominated consumer electronics in the 2020s. These devices promised unadulterated access to the power of neural networks in ways that would seamlessly integrate into our lives without relying on phones or smart fridges. Then, the devices came out. The software is slow and buggy, and the hardware is clunky. Maybe the stand-alone AI device will still have its year, and we'll look back and chuckle at these humble beginnings. In reality, 2024's big breakthrough came from Apple in the form of its long-rumored Vision Pro headset. The device has its own hurdles to clear, but after just a few minutes of using it, it was clear that it's something different, important, and honestly pretty amazing. The list also includes Sony's innovative pro-grade camera, the most accessible drone we've ever used, and a no-fun phone--no fun in a good way, of course. Credible rumors of Apple's VR bounced around the gadget blogs and tech sites for nearly a decade. It was consumer tech's sasquatch in that people claimed to have seen it, but no one knew if it even existed. Then, the Vision Pro emerged from the proverbial forest in February with a surprising design and a massive $3,500 price tag. It also came toting a new R-series chip and a dedicated OS meant for spatial computing.
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
Cooper, A. Feder, Choquette-Choo, Christopher A., Bogen, Miranda, Jagielski, Matthew, Filippova, Katja, Liu, Ken Ziyu, Chouldechova, Alexandra, Hayes, Jamie, Huang, Yangsibo, Mireshghallah, Niloofar, Shumailov, Ilia, Triantafillou, Eleni, Kairouz, Peter, Mitchell, Nicole, Liang, Percy, Ho, Daniel E., Choi, Yejin, Koyejo, Sanmi, Delgado, Fernando, Grimmelmann, James, Shmatikov, Vitaly, De Sa, Christopher, Barocas, Solon, Cyphert, Amy, Lemley, Mark, boyd, danah, Vaughan, Jennifer Wortman, Brundage, Miles, Bau, David, Neel, Seth, Jacobs, Abigail Z., Terzis, Andreas, Wallach, Hanna, Papernot, Nicolas, Lee, Katherine
We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model's parameters, e.g., a particular individual's personal data or in-copyright expression of Spiderman that was included in the model's training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual's data or reflect the concept of "Spiderman." Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.
Political-LLM: Large Language Models in Political Science
Li, Lincan, Li, Jiaqi, Chen, Catherine, Gui, Fred, Yang, Hongjia, Yu, Chenxiao, Wang, Zhengguang, Cai, Jianing, Zhou, Junlong Aaron, Shen, Bolin, Qian, Alex, Chen, Weixin, Xue, Zhongkai, Sun, Lichao, He, Lifang, Chen, Hanjie, Ding, Kaize, Du, Zijian, Mu, Fangzhou, Pei, Jiaxin, Zhao, Jieyu, Swayamdipta, Swabha, Neiswanger, Willie, Wei, Hua, Hu, Xiyang, Zhu, Shixiang, Chen, Tianlong, Lu, Yingzhou, Shi, Yang, Qin, Lianhui, Fu, Tianfan, Tu, Zhengzhong, Yang, Yuzhe, Yoo, Jaemin, Zhang, Jiaheng, Rossi, Ryan, Zhan, Liang, Zhao, Liang, Ferrara, Emilio, Liu, Yan, Huang, Furong, Zhang, Xiangliang, Rothenberg, Lawrence, Ji, Shuiwang, Yu, Philip S., Zhao, Yue, Dong, Yushun
In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field has also become urgent. In this work, we--a multidisciplinary team of researchers spanning computer science and political science--present the first principled framework, termed Political-LLM, to advance the comprehensive understanding of integrating LLMs into computational political science. Specifically, we first introduce a fundamental taxonomy classifying the existing explorations into two perspectives: political science and computational methodologies. In particular, from the political science perspective, we highlight the role of LLMs in automating predictive and generative tasks, simulating behavior dynamics, and improving causal inference through tools like counterfactual generation; from a computational perspective, we introduce advancements in data preparation, fine-tuning, and evaluation methods for LLMs that are tailored to political contexts. We identify key challenges and future directions, emphasizing the development of domain-specific datasets, addressing issues of bias and fairness, incorporating human expertise, and redefining evaluation criteria to align with the unique requirements of computational political science. Political-LLM seeks to serve as a guidebook for researchers to foster an informed, ethical, and impactful use of Artificial Intelligence in political science. Our online resource is available at: http://political-llm.org/.
Corresponding authors: Yushun Dong (yd24f@fsu.edu) is with the Department of Computer Science, Florida State University; Yue Zhao (yzhao010@usc.edu) is with the Department of Computer Science, University of Southern California; Fred Gui (pgui@lsu.edu) is with the Department of Political Science, Louisiana State University; Catherine Chen (catherinechen@lsu.edu) is with the Manship School of Mass Communication and the Department of Political Science, Louisiana State University.
Social Media Informatics for Sustainable Cities and Societies: An Overview of the Applications, associated Challenges, and Potential Solutions
Khan, Jebran, Ahmad, Kashif, Jagatheesaperumal, Senthil Kumar, Ahmad, Nasir, Sohn, Kyung-Ah
In the modern world, our cities and societies face several technological and societal challenges, such as rapid urbanization, global warming & climate change, the digital divide, and social inequalities, increasing the need for more sustainable cities and societies. Addressing these challenges requires a multifaceted approach involving all the stakeholders, sustainable planning, efficient resource management, innovative solutions, and modern technologies. Like other modern technologies, social media informatics also plays its part in developing more sustainable and resilient cities and societies. Despite its limitations, social media informatics has proven very effective in various sustainable cities and society applications. In this paper, we review and analyze the role of social media informatics in sustainable cities and society by providing a detailed overview of its applications, associated challenges, and potential solutions. This work is expected to provide a baseline for future research in the domain.
Generative Adversarial Reviews: When LLMs Become the Critic
Bougie, Nicolas, Watanabe, Narimasa
The peer review process is fundamental to scientific progress, determining which papers meet the quality standards for publication. Yet, the rapid growth of scholarly production and increasing specialization in knowledge areas strain traditional scientific feedback mechanisms. In light of this, we introduce Generative Agent Reviewers (GAR), leveraging LLM-empowered agents to simulate faithful peer reviewers. To enable generative reviewers, we design an architecture that extends a large language model with memory capabilities and equips agents with reviewer personas derived from historical data. Central to this approach is a graph-based representation of manuscripts, condensing content and logically organizing information - linking ideas with evidence and technical details. GAR's review process leverages external knowledge to evaluate paper novelty, followed by detailed assessment using the graph representation and multi-round assessment. Finally, a meta-reviewer aggregates individual reviews to predict the acceptance decision. Our experiments demonstrate that GAR performs comparably to human reviewers in providing detailed feedback and predicting paper outcomes. Beyond mere performance comparison, we conduct insightful experiments, such as evaluating the impact of reviewer expertise and examining fairness in reviews. By offering early expert-level feedback, typically restricted to a limited group of researchers, GAR democratizes access to transparent and in-depth evaluation.
SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
Chen, Zhaorun, Pinto, Francesco, Pan, Minzhou, Li, Bo
With the rise of generative AI and rapid growth of high-quality video generation, video guardrails have become more crucial than ever to ensure safety and security across platforms. Current video guardrails, however, fall into two camps: overly simplistic ones that rely on pure classification models trained on simple policies with limited unsafe categories and lack detailed explanations, and ones that prompt multimodal large language models (MLLMs) with long safety guidelines, which is inefficient and impractical for guardrailing real-world content. To bridge this gap, we propose SafeWatch, an efficient MLLM-based video guardrail model designed to follow customized safety policies and provide multi-label video guardrail outputs with content-specific explanations in a zero-shot manner. In particular, unlike traditional MLLM-based guardrails that encode all safety policies autoregressively, causing inefficiency and bias, SafeWatch uniquely encodes each policy chunk in parallel and eliminates their position bias such that all policies are attended simultaneously with equal importance. In addition, to improve efficiency and accuracy, SafeWatch incorporates a policy-aware visual token pruning algorithm that adaptively selects the most relevant video tokens for each policy, discarding noisy or irrelevant information. This allows for more focused, policy-compliant guardrailing with significantly reduced computational overhead. Considering the limitations of existing video guardrail benchmarks, we propose SafeWatch-Bench, a large-scale video guardrail benchmark comprising over 2M videos that span six safety categories and cover over 30 tasks, to ensure comprehensive coverage of potential safety scenarios. SafeWatch outperforms SOTA by 28.2% on SafeWatch-Bench, 13.6% on benchmarks, cuts costs by 10%, and delivers top-tier explanations validated by LLM and human reviews.
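The idea of keeping only the video tokens most relevant to a given policy can be pictured with a generic top-k selection. The sketch below is not SafeWatch's pruning algorithm, which the abstract does not specify: scoring by cosine similarity, the fixed `keep` budget, and the names `cosine` and `prune_tokens` are all hypothetical stand-ins.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def prune_tokens(policy_vec, token_vecs, keep):
    """Return indices (in original order) of the `keep` tokens most similar
    to the policy embedding. Hypothetical relevance rule, for illustration."""
    ranked = sorted(
        range(len(token_vecs)),
        key=lambda i: cosine(policy_vec, token_vecs[i]),
        reverse=True,
    )
    return sorted(ranked[:keep])


# Toy embeddings: one policy vector and four video-token vectors.
policy = [1.0, 0.0]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
kept = prune_tokens(policy, tokens, keep=2)  # tokens 0 and 2 score highest
```

Running the selection once per policy gives each policy its own reduced token set, which is the sense in which such pruning would be policy-aware.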
Monet: Mixture of Monosemantic Experts for Transformers
Park, Jungwoo, Ahn, Young Jin, Kim, Kee-Eung, Kang, Jaewoo
Understanding the internal computations of large language models (LLMs) is crucial for aligning them with human values and preventing undesirable behaviors like toxic content generation. However, mechanistic interpretability is hindered by polysemanticity -- where individual neurons respond to multiple, unrelated concepts. While Sparse Autoencoders (SAEs) have attempted to disentangle these features through sparse dictionary learning, they have compromised LLM performance due to reliance on post-hoc reconstruction loss. To address this issue, we introduce Mixture of Monosemantic Experts for Transformers (Monet) architecture, which incorporates sparse dictionary learning directly into end-to-end Mixture-of-Experts pretraining. Our novel expert decomposition method enables scaling the expert count to 262,144 per layer while total parameters scale proportionally to the square root of the number of experts. Our analyses demonstrate mutual exclusivity of knowledge across experts and showcase the parametric knowledge encapsulated within individual experts. Moreover, Monet allows knowledge manipulation over domains, languages, and toxicity mitigation without degrading general performance. Our pursuit of transparent LLMs highlights the potential of scaling expert counts to enhance mechanistic interpretability and directly resect the internal knowledge to fundamentally adjust model behavior. The source code and pretrained checkpoints are available at https://github.com/dmis-lab/Monet.
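The square-root scaling claim can be made concrete with a little arithmetic. The snippet assumes only the proportionality stated in the abstract, with an unspecified constant factor; `relative_param_growth` is a hypothetical helper, not part of the Monet code base.

```python
import math

num_experts = 262_144                    # expert count per layer quoted in the abstract
sqrt_experts = math.isqrt(num_experts)   # 512, since 512 * 512 == 262_144


def relative_param_growth(experts_before, experts_after):
    """Factor by which total parameters grow if they scale with sqrt(#experts)."""
    return math.sqrt(experts_after / experts_before)


# Quadrupling the expert count would only double the parameters under this rule.
growth = relative_param_growth(num_experts, 4 * num_experts)  # 2.0
```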
Creativity in AI: Progresses and Challenges
Ismayilzada, Mete, Paul, Debjit, Bosselut, Antoine, van der Plas, Lonneke
Creativity is the ability to produce novel, useful, and surprising ideas, and has been widely studied as a crucial aspect of human cognition. Machine creativity, on the other hand, has been a long-standing challenge. With the rise of advanced generative AI, there has been renewed interest and debate regarding AI's creative capabilities. Therefore, it is imperative to revisit the state of creativity in AI and identify key advances and remaining challenges. In this work, we survey leading works studying the creative capabilities of AI systems, focusing on creative problem-solving, linguistic, artistic, and scientific creativity. Our review suggests that while the latest AI models are largely capable of producing linguistically and artistically creative outputs such as poems, images, and musical pieces, they struggle with tasks that require creative problem-solving, abstract thinking, and compositionality, and their generations suffer from a lack of diversity and originality, long-range incoherence, and hallucinations. We also discuss key questions concerning copyright and authorship issues with generative models. Furthermore, we highlight the need for a comprehensive evaluation of creativity that is process-driven and considers several dimensions of creativity. Finally, we propose future research directions to improve the creativity of AI outputs, drawing inspiration from cognitive science and psychology.
What If We Had Used a Different App? Reliable Counterfactual KPI Analysis in Wireless Systems
Hou, Qiushuo, Park, Sangwoo, Zecchin, Matteo, Cai, Yunlong, Yu, Guanding, Simeone, Osvaldo
In modern wireless network architectures, such as Open Radio Access Network (O-RAN), the operation of the radio access network (RAN) is managed by applications, or apps for short, deployed at intelligent controllers. These apps are selected from a given catalog based on current contextual information. For instance, a scheduling app may be selected on the basis of current traffic and network conditions. Once an app is chosen and run, it is no longer possible to directly test the key performance indicators (KPIs) that would have been obtained with another app. In other words, we can never simultaneously observe both the actual KPI, obtained by the selected app, and the counterfactual KPI, which would have been attained with another app, for the same network condition, making individual-level counterfactual KPI analysis particularly challenging. This what-if analysis, however, would be valuable for monitoring and optimizing the network operation, e.g., for identifying suboptimal app selection strategies. This paper addresses the problem of estimating the values of KPIs that would have been obtained if a different app had been implemented by the RAN. To this end, we propose a conformal-prediction-based counterfactual analysis method for wireless systems that provides reliable error bars for the estimated KPIs, despite the inherent covariate shift between logged and test data. Experimental results for medium access control-layer apps and for physical-layer apps demonstrate the merits of the proposed method.
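For readers unfamiliar with how conformal prediction produces error bars, the following sketch shows plain split conformal prediction. It is an assumption-laden simplification and not the paper's method: it ignores the covariate shift between logged and test data that the abstract says the proposed method handles, and the predictor, the toy data, and the names `conformal_quantile` and `conformal_interval` are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Calibration quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for the requested coverage
    return sorted(scores)[k - 1]


def conformal_interval(predict, calibration, x_new, alpha=0.1):
    """Error bars around predict(x_new) from absolute residuals on held-out data."""
    scores = [abs(y - predict(x)) for x, y in calibration]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q


def kpi_model(x):
    """Hypothetical KPI predictor, standing in for any fitted model."""
    return 2 * x


# Toy calibration pairs (context, observed KPI); absolute residuals are 1..7.
calibration = [(1, 3), (2, 2), (3, 9), (4, 4), (5, 15), (6, 6), (7, 21)]
interval = conformal_interval(kpi_model, calibration, x_new=10, alpha=0.25)
# With n = 7 and alpha = 0.25, k = 6, so the interval is 20 +/- 6.
```

The coverage guarantee of this basic form relies on calibration and test points being exchangeable, which is exactly the assumption that covariate shift breaks.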