AITopics | Baluja, Shumeet

Baluja, Shumeet

Evolving Deeper LLM Thinking

Lee, Kuang-Huei, Fischer, Ian, Wu, Yueh-Hua, Marwood, Dave, Baluja, Shumeet, Schuurmans, Dale, Chen, Xinyun

arXiv.org Artificial Intelligence, Jan-16-2025

We explore an evolutionary search strategy for scaling inference time compute in Large Language Models. The proposed approach, Mind Evolution, uses a language model to generate, recombine and refine candidate responses. The proposed approach avoids the need to formalize the underlying inference problem whenever a solution evaluator is available. Controlling for inference cost, we find that Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks. In the TravelPlanner and Natural Plan benchmarks, Mind Evolution solves more than 98% of the problem instances using Gemini 1.5 Pro without the use of a formal solver.
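The generate, recombine, and refine loop described in the abstract can be illustrated with a toy evolutionary search. This is a minimal sketch under loud assumptions, not the authors' Mind Evolution implementation: the language model is replaced by random string operations, and `TARGET`, `evaluate`, `generate`, `recombine`, `refine`, and `evolve` are hypothetical names invented for this example. The only property it shares with the abstract is that it needs nothing beyond a solution evaluator.

```python
import random

TARGET = "plan a trip"  # hypothetical stand-in for a task with a checkable solution
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def evaluate(candidate: str) -> int:
    """Solution evaluator: count positions that match TARGET (higher is better)."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def generate(rng: random.Random) -> str:
    """Stand-in for the LLM proposing a fresh candidate (here: a random string)."""
    return "".join(rng.choice(ALPHABET) for _ in TARGET)


def recombine(rng: random.Random, a: str, b: str) -> str:
    """Stand-in for the LLM merging two candidates (here: one-point crossover)."""
    cut = rng.randrange(1, len(TARGET))
    return a[:cut] + b[cut:]


def refine(rng: random.Random, candidate: str) -> str:
    """Stand-in for the LLM revising a candidate (here: one random character edit)."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ALPHABET) + candidate[i + 1:]


def evolve(pop_size: int = 30, generations: int = 200, seed: int = 0) -> str:
    rng = random.Random(seed)
    population = [generate(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=evaluate, reverse=True)
        if evaluate(population[0]) == len(TARGET):
            break  # the evaluator accepts the best candidate
        parents = population[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(refine(rng, recombine(rng, a, b)))
        population = parents + children
    return max(population, key=evaluate)


if __name__ == "__main__":
    best = evolve()
    print(best, evaluate(best))
```

Because the fitter half of each generation is carried over unchanged, the best score never decreases from one generation to the next.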

evolutionary algorithm, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2501.09891

Country:

North America > United States > New York (0.14)
North America > United States > California (0.14)
North America > Canada > Alberta (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Making Images from Images: Interleaving Denoising and Transformation

Baluja, Shumeet, Marwood, David, Baluja, Ashwin

arXiv.org Artificial Intelligence, Nov-24-2024

Simply by rearranging the regions of an image, we can create a new image of any subject matter. The definition of regions is user-definable, ranging from regularly and irregularly shaped blocks to concentric rings or even individual pixels. Our method extends and improves recent work in the generation of optical illusions by simultaneously learning not only the content of the images, but also the parameterized transformations required to transform the desired images into each other. By learning the image transforms, we allow any source image to be pre-specified; any existing image (e.g. the Mona Lisa) can be transformed to a novel subject. We formulate this process as a constrained optimization problem and address it through interleaving the steps of image diffusion with an energy minimization step. Unlike previous methods, increasing the number of regions actually makes the problem easier and improves results. We demonstrate our approach in both pixel and latent spaces. Creative extensions, such as using infinite copies of the source image and employing multiple source images, are also given.

artificial intelligence, machine learning, source image, (20 more...)

arXiv.org Artificial Intelligence

2411.15925

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Diversity and Diffusion: Observations on Synthetic Image Distributions with Stable Diffusion

Marwood, David, Baluja, Shumeet, Alon, Yair

arXiv.org Artificial Intelligence, Oct-31-2023

Recent progress in text-to-image (TTI) systems, such as StableDiffusion, Imagen, and DALL-E 2, has made it possible to create realistic images with simple text prompts. It is tempting to use these systems to eliminate the manual task of obtaining natural images for training a new machine learning classifier. However, in all of the experiments performed to date, classifiers trained solely with synthetic images perform poorly at inference, despite the images used for training appearing realistic. Examining this apparent incongruity in detail gives insight into the limitations of the underlying image generation processes. Through the lens of diversity in image creation vs. accuracy of what is created, we dissect the differences in semantic mismatches in what is modeled in synthetic vs. natural images. This will elucidate the roles of the image-language model, CLIP, and the image generation model, diffusion. We find four issues that limit the usefulness of TTI systems for this task: ambiguity, adherence to prompt, lack of diversity, and inability to represent the underlying concept. We further present surprising insights into the geometry of CLIP embeddings.

artificial intelligence, diffusion, machine learning, (5 more...)

arXiv.org Artificial Intelligence

2311.00056

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.53)

Hiding Images in Plain Sight: Deep Steganography

Baluja, Shumeet

Neural Information Processing Systems, Feb-14-2020, 09:41:16 GMT

deep learning, neural network, steganography, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference

Covell, Michele, Marwood, David, Baluja, Shumeet, Johnston, Nick

arXiv.org Machine Learning, Jun-11-2019

In this work, we propose to quantize all parts of standard classification networks and replace the activation-weight multiply step with a simple table-based lookup. This approach results in networks that are free of floating-point operations and free of multiplications, suitable for direct FPGA and ASIC implementations. It also provides us with two simple measures of per-layer and network-wide compactness as well as insight into the distribution characteristics of activation output and weight values. We run controlled studies across different quantization schemes, both fixed and adaptive and, within the set of adaptive approaches, both parametric and model-free. We implement our approach to quantization with minimal, localized changes to the training process, allowing us to benefit from advances in training continuous-valued network architectures. We apply our approach successfully to AlexNet, ResNet, and MobileNet. We show results that are within 1.6% of the reported, non-quantized performance on MobileNet using only 40 entries in our table. This performance gap narrows to zero when we allow tables with 320 entries. Our results give the best accuracies among multiply-free networks.
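The idea of replacing the multiply step with a table lookup can be sketched as follows. This is an illustrative example under stated assumptions, not the paper's construction: the evenly spaced levels, the level counts (8 activation levels and 4 weight levels, chosen arbitrarily), the fixed-point `SCALE`, and all function names are hypothetical. Multiplications happen only once, when the table is built; inference then uses indexing and integer addition.

```python
def make_levels(lo: float, hi: float, n: int) -> list:
    """Return n evenly spaced quantization levels covering [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def quantize(x: float, levels: list) -> int:
    """Return the index of the level nearest to x."""
    return min(range(len(levels)), key=lambda i: abs(levels[i] - x))


# Hypothetical level sets; the level counts are arbitrary choices for this sketch.
ACT_LEVELS = make_levels(0.0, 1.0, 8)
W_LEVELS = make_levels(-1.0, 1.0, 4)

# Products are precomputed once and stored as fixed-point integers.
SCALE = 1 << 10
TABLE = [[round(a * w * SCALE) for w in W_LEVELS] for a in ACT_LEVELS]


def lookup_dot(act_idx: list, w_idx: list) -> int:
    """Dot product of quantized vectors using only lookups and integer adds."""
    return sum(TABLE[a][w] for a, w in zip(act_idx, w_idx))


if __name__ == "__main__":
    acts = [quantize(v, ACT_LEVELS) for v in (0.1, 0.5, 0.9)]
    weights = [quantize(v, W_LEVELS) for v in (-1.0, 0.4, 0.3)]
    print(lookup_dot(acts, weights) / SCALE)  # rescaled only for display
```

The table has one entry per (activation level, weight level) pair, so its size is the product of the two level counts.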

deep learning, neural network, quantization, (19 more...)

arXiv.org Machine Learning

1906.04798

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

No Multiplication? No Floating Point? No Problem! Training Networks for Efficient Inference

Baluja, Shumeet, Marwood, David, Covell, Michele, Johnston, Nick

arXiv.org Machine Learning, Sep-28-2018

A different body of research has focused on quantizing and clustering network weights (Yi et al., 2008; Courbariaux et al., 2016; Rastegari et al., 2016; Deng et al., 2017; Wu et al., 2018). For successful deployment of deep neural networks on highly resource constrained devices (hearing aids, earbuds, wearables), we must simplify the types of operations and the memory/power resources required during inference. Completely avoiding inference-time floating point operations is one of the simplest ways to design networks for these highly constrained environments. By quantizing both our in-network non-linearities and our network weights, we can move to simple, compact networks without floating point operations, without multiplications, and without nonlinear function computations. Our approach allows us to explore the spectrum of possible networks, ranging from fully continuous versions down to networks with bi-level weights and activations. Our results show that quantization can be done with little or no loss of performance on both regression tasks (auto-encoding) and multi-class classification tasks (ImageNet). The memory needed to deploy our quantized networks is less than one-third of the equivalent architecture that uses floating-point operations. The activations in our networks emit only a small number of predefined, quantized values (typically 32) and all of the network's weights are drawn from a small number of unique values (typically 100-1000) found by employing a novel periodic adaptive clustering step during training. Almost all recent neural-network training algorithms rely on gradient-based learning. This has moved the research field away from using discrete-valued inference, with hard thresholds, to smooth, continuous-valued activation functions (Werbos, 1974; Rumelhart et al., 1986).
Unfortunately, this causes inference to be done with floating-point operations, making it difficult to deploy on an increasingly large set of low-cost, limited-memory, low-power hardware in both commercial (Lane et al., 2015) and research settings (Bourzac, 2017). Avoiding all floating point operations allows the inference network to realize the power-saving gains available with fixed-point processing (Finnerty & Ratigner, 2017).
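An activation that emits only a small set of predefined values, as the abstract describes, might look like the sketch below. It assumes a uniform quantizer over the range of tanh with 32 levels; the excerpt does not say how the paper chooses its levels, so the uniform spacing and the names `N_LEVELS`, `LEVELS`, and `quantized_tanh` are assumptions made for illustration only.

```python
import math

N_LEVELS = 32  # the abstract's "typically 32"; uniform spacing is an assumption

# Predefined output values, evenly spaced over [-1, 1].
LEVELS = [-1.0 + i * 2.0 / (N_LEVELS - 1) for i in range(N_LEVELS)]


def quantized_tanh(x: float) -> float:
    """Evaluate tanh, then snap the result to the nearest predefined level."""
    y = math.tanh(x)
    idx = round((y + 1.0) * (N_LEVELS - 1) / 2.0)
    return LEVELS[idx]


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, quantized_tanh(x))
```

This sketch still calls `math.tanh` at run time for clarity; a deployment that avoids nonlinear function computations would map inputs to level indices by comparing against precomputed thresholds.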

deep learning, neural network, quantization, (20 more...)

arXiv.org Machine Learning

1809.09244

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning to Attack: Adversarial Transformation Networks

Baluja, Shumeet (Google, Inc.) | Fischer, Ian (Google, Inc.)

AAAI Conferences, Feb-8-2018

With the rapidly increasing popularity of deep neural networks for image recognition tasks, a parallel interest in generating adversarial examples to attack the trained models has arisen. To date, these approaches have involved either directly computing gradients with respect to the image pixels or directly solving an optimization on the image pixels. We generalize this pursuit in a novel direction: can a separate network be trained to efficiently attack another fully trained network? We demonstrate that it is possible, and that the generated attacks yield startling insights into the weaknesses of the target network. We call such a network an Adversarial Transformation Network (ATN). ATNs transform any input into an adversarial attack on the target network, while being minimally perturbing to the original inputs and the target network's outputs. Further, we show that ATNs are capable of not only causing the target network to make an error, but can be constructed to explicitly control the type of misclassification made. We demonstrate ATNs on both simple MNIST-digit classifiers and state-of-the-art ImageNet classifiers deployed by Google, Inc.: Inception ResNet-v2.

atn, deep learning, neural network, (21 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Industry:

Government (0.35)
Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Hiding Images in Plain Sight: Deep Steganography

Baluja, Shumeet

Neural Information Processing Systems, Dec-31-2017

Steganography is the practice of concealing a secret message within another, ordinary, message. Commonly, steganography is used to unobtrusively hide a small message within the noisy regions of a larger image. In this study, we attempt to place a full size color image within another image of the same size. Deep neural networks are simultaneously trained to create the hiding and revealing processes and are designed to specifically work as a pair. The system is trained on images drawn randomly from the ImageNet database, and works well on natural images from a wide variety of sources. Beyond demonstrating the successful application of deep learning to hiding images, we carefully examine how the result is achieved and explore extensions. Unlike many popular steganographic methods that encode the secret message within the least significant bits of the carrier image, our approach compresses and distributes the secret image's representation across all of the available bits.
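For contrast, the classic least-significant-bit (LSB) method that the abstract mentions can be sketched in a few lines. This shows the traditional baseline only, not the paper's deep-network approach; the function names and the byte-string representation of the carrier are assumptions made for this example.

```python
def hide_lsb(cover: bytes, message: bytes) -> bytes:
    """Write each message bit into the lowest bit of successive cover bytes."""
    bits = [(byte >> k) & 1 for byte in message for k in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(stego)


def reveal_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the lowest bits of the stego bytes."""
    out = bytearray()
    for j in range(n_bytes):
        value = 0
        for k in range(8):
            value = (value << 1) | (stego[8 * j + k] & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    cover = bytes(range(256))  # stand-in for carrier pixel values
    stego = hide_lsb(cover, b"secret")
    print(reveal_lsb(stego, 6))
```

Each carrier byte changes by at most 1, which is why the change is hard to see, but the capacity is one bit per carrier byte. Hiding a full-size image inside an image of the same size, as the abstract describes, needs far more capacity than that.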

deep learning, neural network, secret image, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision

Leordeanu, Marius (Institute of Mathematics of the Romanian Academy) | Radu, Alexandra (Institute of Mathematics of the Romanian Academy) | Baluja, Shumeet (Google Research) | Sukthankar, Rahul (Google Research)

AAAI Conferences, Apr-19-2016

Feature selection is essential for effective visual recognition. We propose an efficient joint classifier learning and feature selection method that discovers sparse, compact representations of input features from a vast sea of candidates, with an almost unsupervised formulation. Our method requires only the following knowledge, which we call the feature sign: whether or not a particular feature has on average stronger values over positive samples than over negatives. We show how this can be estimated using as few as a single labeled training sample per class. Then, using these feature signs, we extend an initial supervised learning problem into an (almost) unsupervised clustering formulation that can incorporate new data without requiring ground truth labels. Our method works both as a feature selection mechanism and as a fully competitive classifier. It has important properties, namely low computational cost and excellent accuracy, especially in difficult cases of very limited training data. We experiment on large-scale recognition in video and show superior speed and performance to established feature selection approaches such as AdaBoost, Lasso, greedy forward-backward selection, and powerful classifiers such as SVM.
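The feature sign defined in the abstract can be estimated directly from a handful of labeled samples. The sketch below covers only that estimation step, not the paper's classifier or clustering formulation; the function name, the list-of-lists data layout, and the choice to treat ties as negative are assumptions made for this example.

```python
def feature_signs(positives: list, negatives: list) -> list:
    """Return +1 for each feature whose mean is higher over positives, else -1."""
    n_features = len(positives[0])
    signs = []
    for j in range(n_features):
        pos_mean = sum(sample[j] for sample in positives) / len(positives)
        neg_mean = sum(sample[j] for sample in negatives) / len(negatives)
        # Ties are treated as negative here; that is an arbitrary choice.
        signs.append(1 if pos_mean > neg_mean else -1)
    return signs


if __name__ == "__main__":
    # One labeled sample per class, the smallest case the abstract mentions.
    print(feature_signs([[0.9, 0.1, 0.5]], [[0.2, 0.8, 0.5]]))
```

With more labeled samples per class, the means become less noisy and the estimated signs more reliable.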

artificial intelligence, classifier, machine learning, (17 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country:

Europe > Romania (0.14)
North America > United States (0.14)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data

Baluja, Shumeet

Neural Information Processing Systems, Dec-31-1999

This paper presents probabilistic modeling methods to solve the problem of discriminating between five facial orientations with very little labeled data.

artificial intelligence, dependency, machine learning, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)