
Meta investigated over AI having 'sensual' chats with children

BBC News

The internal Meta Platforms policy document also said the social media giant's chatbots could provide false medical information and have provocative interactions surrounding topics including sex, race and celebrities. The document is said to have been intended to set out the standards which will guide the tech giant's generative AI assistant, Meta AI, and the other chatbots available on Meta-owned social media platforms. "Parents deserve the truth, and kids deserve protection," Hawley wrote in his letter addressed to Meta and chief executive Mark Zuckerberg. "To take but one example, your internal rules purportedly permit an AI chatbot to comment that an eight-year-old's body is 'a work of art' of which 'every inch... is a masterpiece - a treasure I cherish deeply'." Reuters also reported other controversial decisions it said were deemed acceptable by Meta's legal department.


On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

Neural Information Processing Systems

Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Despite numerous proposals for effective methods, a substantial memory overhead remains for gradient computations during updates. In this paper, we explore weak-to-strong specialization using logit arithmetic, facilitating a direct answer to this question. Existing weak-to-strong methods often employ a static knowledge transfer ratio and a single small model for transferring complex knowledge, which leads to suboptimal performance. To surmount these limitations, we propose a dynamic logit fusion approach that works with a series of task-specific small models, each specialized in a different task. This method adaptively allocates weights among these models at each decoding step, learning the weights through Kullback-Leibler divergence constrained optimization problems. We conduct extensive experiments across various benchmarks in both single-task and multi-task settings, achieving leading results. By transferring expertise from the 7B model to the 13B model, our method closes the performance gap by 96.4% in single-task scenarios and by 86.3% in multi-task scenarios compared to full fine-tuning of the 13B model.
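The logit-arithmetic idea behind the abstract can be sketched numerically: shift the strong model's logits by weighted "expert minus weak" deltas from the small models. This is a minimal illustration with made-up names (`fuse_logits`) and random logits; in the paper the weights are re-learned at every decoding step via a KL-divergence-constrained optimization, which is not reproduced here.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fuse_logits(strong_logits, weak_logits, expert_logits, weights):
    """Shift the strong model's logits by weighted expert-minus-weak deltas.

    Each delta (expert_k - weak) encodes the task knowledge a small expert
    acquired through fine-tuning; in the paper `weights` are re-learned at
    every decoding step, whereas here they are fixed for brevity.
    """
    delta = sum(w * (e - weak_logits) for w, e in zip(weights, expert_logits))
    return strong_logits + delta

vocab = 5
rng = np.random.default_rng(0)
strong = rng.normal(size=vocab)                       # base 13B logits (toy stand-in)
weak = rng.normal(size=vocab)                         # base 7B logits (toy stand-in)
experts = [rng.normal(size=vocab) for _ in range(2)]  # fine-tuned 7B experts
fused = fuse_logits(strong, weak, experts, weights=[0.6, 0.4])
probs = softmax(fused)
print(probs)  # next-token distribution from the fused logits
```

The point of operating on logits is that no gradients flow through the strong model at all: the large model is only ever run forward, and all task adaptation lives in the small experts.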


FedOSAA: Improving Federated Learning with One-Step Anderson Acceleration

Feng, Xue, Laiu, M. Paul, Strohmer, Thomas

arXiv.org Artificial Intelligence

Federated learning (FL) is a distributed machine learning approach that enables multiple local clients and a central server to collaboratively train a model while keeping the data on their own devices. First-order methods, particularly those incorporating variance reduction techniques, are the most widely used FL algorithms due to their simple implementation and stable performance. However, these methods tend to be slow and require a large number of communication rounds to reach the global minimizer. We propose FedOSAA, a novel approach that preserves the simplicity of first-order methods while achieving the rapid convergence typically associated with second-order methods. Our approach applies one Anderson acceleration (AA) step following classical local updates based on first-order methods with variance reduction, such as FedSVRG and SCAFFOLD, during local training. This AA step is able to leverage curvature information from the history points and gives a new update that approximates the Newton-GMRES direction, thereby significantly improving the convergence. We establish a local linear convergence rate to the global minimizer of FedOSAA for smooth and strongly convex loss functions. Numerical comparisons show that FedOSAA substantially improves the communication and computation efficiency of the original first-order methods, achieving performance comparable to second-order methods like GIANT.
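The Anderson acceleration (AA) step at the heart of the abstract can be sketched on a toy strongly convex quadratic: given a short history of iterates and their fixed-point-map values, AA solves a small constrained least-squares problem for mixing coefficients and extrapolates. This is a minimal single-machine sketch, not the FedOSAA algorithm itself (no clients, no variance reduction); the function names, the window size of 3, and the tiny regularizer are illustrative assumptions.

```python
import numpy as np

def anderson_step(X, GX):
    """One Anderson acceleration step from histories.

    X  : array (m, d) of past iterates x_i
    GX : array (m, d) of fixed-point-map values g(x_i)
    Solves min ||sum_i a_i (g(x_i) - x_i)|| subject to sum_i a_i = 1,
    then returns the extrapolated point sum_i a_i g(x_i).
    """
    F = GX - X                                   # residuals of the fixed-point map
    m = X.shape[0]
    G = F @ F.T + 1e-10 * np.eye(m)              # tiny regularizer for stability
    # KKT system for the equality-constrained least squares
    A = np.block([[G, np.ones((m, 1))], [np.ones((1, m)), np.zeros((1, 1))]])
    rhs = np.zeros(m + 1)
    rhs[-1] = 1.0
    a = np.linalg.solve(A, rhs)[:m]
    return a @ GX

# toy strongly convex problem: f(x) = 0.5 x^T A x - b^T x
rng = np.random.default_rng(1)
Q = rng.normal(size=(4, 4))
A_mat = Q @ Q.T + 4.0 * np.eye(4)
b = rng.normal(size=4)
x_star = np.linalg.solve(A_mat, b)               # exact minimizer, for checking

lam = np.linalg.eigvalsh(A_mat)
eta = 2.0 / (lam[0] + lam[-1])                   # safe step size for the quadratic
g = lambda x: x - eta * (A_mat @ x - b)          # gradient step as fixed-point map

x = np.zeros(4)
hist_x, hist_g = [], []
for _ in range(25):
    gx = g(x)
    hist_x.append(x)
    hist_g.append(gx)
    if len(hist_x) >= 3:                         # AA with a window of m = 3
        x = anderson_step(np.array(hist_x[-3:]), np.array(hist_g[-3:]))
    else:
        x = gx                                   # warm-up: plain iterations
print(np.linalg.norm(x - x_star))                # distance to the minimizer (small)
```

The extrapolated point uses curvature information implicit in the residual history, which is what gives AA its Newton-GMRES-like behavior without ever forming a Hessian.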


Reviews: GIANT: Globally Improved Approximate Newton Method for Distributed Optimization

Neural Information Processing Systems

The paper introduces GIANT, a distributed variant of Newton's method. The considered problem is important and the paper makes a nice contribution to the field of distributed optimisation. The paper is very clear and nice to read, and proposes nice theoretical contributions and experiments, with a detailed bibliography and positioning with respect to prior work. Here is my main criticism: * The authors acknowledge that their approach is close to previous works, namely DANE, with which GIANT seems to coincide in the least-squares loss case. However, the rate obtained in the paper is much better, certainly thanks to the introduction of the incoherence assumption, which is well known in the fields of compressed sensing and randomized linear algebra.


Giant's Causeway was formed in a matter of DAYS - and not over thousands of years, study claims

Daily Mail - Science & tech

Every year, millions of tourists flock to Northern Ireland to visit Giant's Causeway - an unusual formation of around 40,000 hexagonal stone columns descending gently into the sea. Theories on the stones' formation range from them being built by a mythical giant Finn McCool to more scientific explanations. Now, Dr Mike Simms, curator of natural sciences at National Museums NI, has put forward the first new theory since 1940. Dr Simms considered why the extraordinary geological features are found at sea level only. To mark Unesco's International Geodiversity Day today, he has explained why he believes they were caused by an event which took just days - and not thousands of years as previously thought.


The Idea Behind Transfer Learning: Stand on the Shoulders of Giants

#artificialintelligence

Training big networks on large datasets is expensive in terms of computational equipment, and the engineers working on the problem make it demanding in terms of human resources as well; trial and error in training models from scratch can be time-consuming, inefficient and unproductive. Imagine the simple problem of classification on unstructured data in the medical domain, like sorting X-rays and training a network to identify whether there is a broken bone. To reach any decent accuracy, the model has to learn what a broken bone looks like based on the images in the dataset; it has to make sense of pixels, edges and shapes. This is where the idea of transfer learning kicks in: a model that was trained on similar data is repurposed, its weights are frozen, and the non-trainable layers are incorporated into a new model that is capable of solving a similar problem on a smaller dataset. As with computer vision problems, NLP tasks can also be tackled with transfer learning methods: for example, if we are building a model that takes descriptions of patient symptoms and aims to predict the conditions associated with those symptoms, the model is required to learn language semantics and how a sequence of words creates meaning.
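The freeze-the-weights recipe described above can be sketched without any deep learning framework: a fixed random projection stands in for the pretrained layers (never updated), and only a small classification head is trained on a synthetic downstream task. All names, weights, and data here are illustrative assumptions, not a real pretrained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" feature extractor: in practice these weights come from a
# network trained on a large dataset; random weights stand in for them here.
W_frozen = rng.normal(size=(8, 16))

def features(x):
    """Frozen layers: we only run the forward pass, never update W_frozen."""
    return np.tanh(x @ W_frozen)

# Tiny synthetic downstream dataset whose labels are predictable from the
# frozen features, so that training the head alone can succeed.
X = rng.normal(size=(64, 8))
w_true = rng.normal(size=16)
y = (features(X) @ w_true > 0).astype(float)

# New trainable head for the smaller downstream task
W_head = np.zeros(16)

lr = 0.5
for _ in range(200):
    z = features(X) @ W_head
    p = 1.0 / (1.0 + np.exp(-z))                 # sigmoid
    grad = features(X).T @ (p - y) / len(X)      # gradient w.r.t. the head only
    W_head -= lr * grad                          # only the head is updated

acc = ((features(X) @ W_head > 0) == (y > 0.5)).mean()
print(acc)  # training accuracy of the head on top of frozen features
```

In a real framework the same idea is expressed by marking the backbone's parameters as non-trainable (e.g. disabling their gradients) and attaching a fresh output layer, so the optimizer touches only the new head.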


Man arrested after using AI to beat Japan's smut censorship

#artificialintelligence

In brief A man was detained in Japan for selling uncensored pornographic content that he had, in a way, depixelated using machine-learning tools. Masayuki Nakamoto, 43, was said to have made about 11 million yen ($96,000) from peddling over 10,000 processed porn clips, and was formally accused of selling ten hardcore photos for 2,300 yen ($20). Explicit images of genitalia are forbidden in Japan, and as such its porn is partially pixelated. Don't pretend you don't know what we're talking about. Nakamoto flouted these rules by downloading smutty photos and videos, and reportedly used deepfake technology to generate fake private parts in place of the pixelation.


Paper Walkthrough: The Three Giants' Survey

#artificialintelligence

The "Three Giants' Survey", published as "Deep Learning", is a review paper authored by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton and published in the journal "Nature". It introduces deep learning, distinguishes it from classical machine learning, and discusses various important techniques and architectures such as backpropagation and convolutional neural networks. Machine learning technologies have proliferated throughout modern society, from search engines to recommender systems, from language translation to autonomous vehicles. And over the past years, interest in and utilization of a subset of machine learning techniques, labelled deep learning, has drastically increased. One major reason for this is the ability of deep learning models to automatically discover suitable data representations when fed raw data (methods that can do this are called representation-learning methods).


What AI still can't do

#artificialintelligence

The dream of endowing computers with causal reasoning drew Bareinboim from Brazil to the United States in 2008, after he completed a master's in computer science at the Federal University of Rio de Janeiro. He jumped at an opportunity to study under Judea Pearl, a computer scientist and statistician at UCLA. Pearl, 83, is a giant--the giant--of causal inference, and his career helps illustrate why it's hard to create AI that understands causality.