cristianini
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
Das, Amitava, Trivedy, Suranjana, Khanna, Danush, Roy, Rajarshi, Singh, Gurpreet, Ghosh, Basab, Narsupalli, Yaswanth, Jain, Vinija, Sharma, Vasu, Reganti, Aishwarya Naresh, Chadha, Aman
The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations. We propose DPO-Kernels, which integrates kernel methods to address these issues through four key contributions: (i) Kernelized Representations with polynomial, RBF, Mahalanobis, and spectral kernels for richer transformations, plus a hybrid loss combining embedding-based and probability-based objectives; (ii) Divergence Alternatives (Jensen-Shannon, Hellinger, Renyi, Bhattacharyya, Wasserstein, and f-divergences) for greater stability; (iii) Data-Driven Selection metrics that automatically choose the best kernel-divergence pair; and (iv) a Hierarchical Mixture of Kernels for both local precision and global modeling. Evaluations on 12 datasets demonstrate state-of-the-art performance in factuality, safety, reasoning, and instruction following. Grounded in Heavy-Tailed Self-Regularization, DPO-Kernels maintains robust generalization for LLMs, offering a comprehensive resource for further alignment research.
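As a rough illustration of the building blocks the abstract names, the sketch below gives the textbook definitions of the polynomial and RBF kernels and of the Jensen-Shannon divergence. The function names and the parameters `gamma`, `degree`, and `coef0` are assumptions chosen for this example. The abstract does not say how DPO-Kernels combines these pieces into its loss, so nothing here should be read as the authors' implementation.

```python
import numpy as np

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Standard polynomial kernel: (x . y + coef0) ** degree."""
    return (np.dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """Standard RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalise so inputs may be unnormalised weights
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing to the KL sum.
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical toy inputs: two small embedding vectors and two distributions.
x = np.array([1.0, 0.0, 2.0])
y = np.array([0.5, 1.0, 1.5])
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

k_poly = polynomial_kernel(x, y)
k_rbf = rbf_kernel(x, y, gamma=0.5)
js_pq = js_divergence(p, q)
```

Unlike the KL divergence, the Jensen-Shannon divergence is symmetric and bounded above by ln 2, which is one common reason it is considered as a more stable alternative.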
Compositional Fusion of Signals in Data Embedding
Guo, Zhijin, Xu, Zhaozhen, Lewis, Martha, Cristianini, Nello
Embeddings in AI convert symbolic structures into fixed-dimensional vectors, effectively fusing multiple signals. However, the nature of this fusion in real-world data is often unclear. To address this, we introduce two methods: (1) Correlation-based Fusion Detection, measuring correlation between known attributes and embeddings, and (2) Additive Fusion Detection, viewing embeddings as sums of individual vectors representing attributes. Applying these methods, word embeddings were found to combine semantic and morphological signals. BERT sentence embeddings were decomposed into individual word vectors of subject, verb and object. In the knowledge graph-based recommender system, user embeddings, even without training on demographic data, exhibited signals of demographics like age and gender. This study highlights that embeddings are fusions of multiple signals, from Word2Vec components to demographic hints in graph embeddings.
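One way to picture the additive view described above is as a least-squares problem: if each embedding is roughly the sum of one vector per attribute it carries, those attribute vectors can be estimated from an indicator matrix. The sketch below uses synthetic data. The function name, the matrix shapes, and the residual measure are assumptions made for illustration, not the procedure used in the paper.

```python
import numpy as np

def fit_additive_decomposition(embeddings, attribute_indicators):
    """Estimate one vector per attribute so that indicators @ vectors ~ embeddings.

    embeddings:           (n_items, dim) array
    attribute_indicators: (n_items, n_attrs) 0/1 array
    Returns (attribute_vectors, relative_residual).
    """
    vectors, *_ = np.linalg.lstsq(attribute_indicators, embeddings, rcond=None)
    residual = np.linalg.norm(attribute_indicators @ vectors - embeddings)
    return vectors, residual / np.linalg.norm(embeddings)

# Synthetic example (hypothetical data): embeddings built as noisy sums
# of the vectors of the attributes each item has.
rng = np.random.default_rng(0)
n_items, n_attrs, dim = 200, 6, 16
A = rng.integers(0, 2, size=(n_items, n_attrs)).astype(float)
true_V = rng.normal(size=(n_attrs, dim))
E = A @ true_V + 0.01 * rng.normal(size=(n_items, dim))

V_hat, rel_err = fit_additive_decomposition(E, A)
```

A small relative residual suggests the embeddings are well explained as sums of attribute vectors, while a large one suggests the signals are fused in some non-additive way.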
Listen: Google's music-writing AI bot that 'could trick exam setters'
Nello Cristianini, a professor of AI at the University of Bath, said AI has been associated with the creation of music as far back as 1980, but MusicLM is the "most advanced" yet. "It is clearly going to be used, and useful, and controversial too," Prof Cristianini said. "This technology is still unexplored, and we have not tested its legal ramifications." "If you're two musicians who are tasked with composing a piece, and they both happen to stumble across the same AI and happen to put in the same form of words, then presumably they're going to come up with the same product," he said. "I just don't know how you'd unpick AI versus AI?"
Shortcuts to artificial intelligence – a tale
The current paradigm of artificial intelligence emerged as the result of a series of cultural innovations, some technical and some social. Among them are seemingly small design decisions that led to a subtle reframing of some of the field's original goals and are now accepted as standard. They correspond to technical shortcuts, aimed at bypassing problems that were otherwise too complicated or too expensive to solve, while still delivering a viable version of AI. Far from being a series of separate problems, recent cases of unexpected effects of AI are the consequences of those very choices that enabled the field to succeed, and this is why it will be difficult to solve them. Research at the University of Bristol has considered three of these choices, investigating their connection to some of today's challenges in AI, including those relating to bias, value alignment, privacy and explainability.
The Anatomy of a Modular System for Media Content Analysis
Flaounas, Ilias, Lansdall-Welfare, Thomas, Antonakaki, Panagiota, Cristianini, Nello
Intelligent systems for the annotation of media content are increasingly being used for the automation of parts of social science research. In this domain the problem of integrating various Artificial Intelligence (AI) algorithms into a single intelligent system arises spontaneously. As part of our ongoing effort in automating media content analysis for the social sciences, we have built a modular system by combining multiple AI modules into a flexible framework in which they can cooperate in complex tasks. Our system combines data gathering, machine translation, topic classification, extraction and annotation of entities and social networks, as well as many other tasks that have been perfected over the past years of AI research. Over the last few years, it has allowed us to realise a series of scientific studies over a vast range of applications including comparative studies between news outlets and media content in different countries, modelling of user preferences, and monitoring public mood. The framework is flexible and allows the design and implementation of modular agents, where simple modules cooperate in the annotation of a large dataset without central coordination.
India Khabar Google AI defeats human Go champion
AlphaGo secured the victory after winning the second game in a three-part match. Following the defeat, Ke Jie told reporters: "I'm a little bit sad, it's a bit of a regret because I think I played pretty well." In Go, players take turns placing stones on a 19-by-19 grid, competing to take control of the most territory. It is considered to be one of the world's most complex games, and is much more challenging for computers than chess. AlphaGo has built up its expertise by studying older matches and playing thousands of games against itself.
The Machines are Coming: China's role in the future of artificial intelligence
Try typing "the machines" into Google and chances are that one of the top results the artificial intelligence-powered search engine will return is the phrase: "The Machines are Coming". After a 2016 filled with high-profile advances in artificial intelligence (AI), leading technologists say this could be a breakout year in the development of intelligent machines that emulate humans. Asia, until now lagging Silicon Valley in AI, will play a bigger role as the field cements itself at the pinnacle of the technology world in 2017, the experts say. AI – technically, a computing field that involves the analysis of large troves of data to predict outcomes and patterns – is as old as modern computers but its esoteric nature means it has long endured caricatures of its actual potential – think for example, the 1960s space age cartoon The Jetsons, which featured a sentient robot maid and automated flying cars (both of which we are still waiting for, even 50 years on). Now, a confluence of factors has given rise to hopes that computers with human-like cognitive ability may soon be a reality.
Here are the top moments in modern British history according to artificial intelligence
What historian has time to read tens of millions of news articles from more than a century of British history? So computer scientists and historians have taught computers how to do the job instead, analysing billions of words of news reports to take a new look at the 19th and early 20th centuries. The study, published in the journal PNAS, marks the early steps of the emerging field of "culturomics". Computers analysed a total of 28.6 billion words from 35 million British regional news stories published between 1800 and 1950, which made up about 14% of the total output of the regional press in that period. For comparison, the average adult has a reading speed of about 300 words per minute. At that rate, it would take someone about 180 solid years to do all that reading, not including a lunch break.
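The reading-time figure quoted above can be checked with a few lines of arithmetic. This is only a sanity check of the numbers in the text, assuming non-stop reading and 365-day years.

```python
# Figures taken from the text above.
total_words = 28.6e9          # 28.6 billion words
words_per_minute = 300        # average adult reading speed

# Assumption: reading continuously, with 365 days per year.
minutes = total_words / words_per_minute
years = minutes / 60 / 24 / 365

print(f"{years:.1f} years of continuous reading")
```

The result is a little over 181 years, consistent with the "about 180 solid years" quoted in the article.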