Collaborating Authors

 sarkar


Minimax Rates for Hyperbolic Hierarchical Learning

arXiv.org Machine Learning

We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard Lipschitz regularization. For depth-$R$ hierarchies with branching factor $m$, we first establish a geometric obstruction for Euclidean space: any bounded-radius embedding forces volumetric collapse, mapping exponentially many tree-distant points to nearby locations. This necessitates Lipschitz constants scaling as $\exp(\Omega(R))$ to realize even simple hierarchical targets, yielding exponential sample complexity under capacity control. We then show this obstruction vanishes in hyperbolic space: constant-distortion hyperbolic embeddings admit $O(1)$-Lipschitz realizability, enabling learning with $n = O(mR \log m)$ samples. A matching $\Omega(mR \log m)$ lower bound via Fano's inequality establishes that hyperbolic representations achieve the information-theoretic optimum. We also show a geometry-independent bottleneck: any rank-$k$ prediction space captures only $O(k)$ canonical hierarchical contrasts.
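As a rough illustration of why hyperbolic space has room for tree-like data, the sketch below computes the standard geodesic distance in the Poincaré ball model. The abstract does not say which model of hyperbolic space the paper uses, so the choice of the Poincaré ball and the function name `poincare_distance` are assumptions made for this example only.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points strictly inside the unit ball.

    Uses the standard Poincare ball formula:
    d(x, y) = arcosh(1 + 2 * |x - y|^2 / ((1 - |x|^2) * (1 - |y|^2)))
    """
    def sq_norm(v):
        return sum(c * c for c in v)

    diff = [a - b for a, b in zip(x, y)]
    denom = (1.0 - sq_norm(x)) * (1.0 - sq_norm(y))
    return math.acosh(1.0 + 2.0 * sq_norm(diff) / denom)


if __name__ == "__main__":
    origin = (0.0, 0.0)
    # Distance from the origin grows without bound as the point nears the boundary.
    for r in (0.5, 0.9, 0.99, 0.999):
        print(r, poincare_distance(origin, (r, 0.0)))
```

The distance from the origin to a point at Euclidean radius $r$ equals $2\,\mathrm{artanh}(r)$, which diverges as $r \to 1$. A bounded Euclidean region can therefore hold points that are far apart in the hyperbolic metric.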


TuneGenie: Reasoning-based LLM agents for preferential music generation

arXiv.org Artificial Intelligence

Recently, large language models (LLMs) have shown great promise across a diversity of tasks, ranging from generating images to reasoning spatially. Considering their remarkable (and growing) textual reasoning capabilities, we investigate LLMs' potency in conducting analyses of an individual's preferences in music (based on playlist metadata, personal write-ups, etc.) and producing effective prompts (based on these analyses) to be passed to Suno AI (a generative AI tool for music production). Our proposition of a novel LLM-based textual representation to music model (which we call TuneGenie) and the various methods we develop to evaluate & benchmark similar models add to the increasing (and increasingly controversial) corpus of research on the use of AI in generating art.


Adversarial Robustness without Adversarial Training: A Teacher-Guided Curriculum Learning Approach

Neural Information Processing Systems

Current SOTA adversarially robust models are mostly based on adversarial training (AT) and differ only in the regularizers applied at the inner maximization or outer minimization steps. Because the inner maximization step is iterative, these models take a long time to train. We propose a non-iterative method that enforces the following ideas during training. Attribution maps are more aligned to the actual object in the image for adversarially robust models compared to naturally trained models. Also, the allowed set of pixels to perturb an image (that changes model decision) should be restricted to the object pixels only, which reduces the attack strength by limiting the attack space.


The true story of the devastating 2015 Mariana dam disaster

The Guardian

Who is behind the most notorious "deepfake" app on the internet? Trying to answer that question these past few months, for a new Guardian podcast series, Black Box, has been like wandering through a hall of mirrors. The app, ClothOff, has hundreds of thousands of followers and has already been used in at least two cases to generate dozens of images of underage girls – pictures that have left the girls traumatised, their parents outraged and the police baffled at how to stop it. Producers Josh Kelly, Alex Atack and I have followed ClothOff's trail to nondescript addresses in central London that appear to be unoccupied. We have encountered sham businesses, distorted voices and photographs of fake employees.


Weisfeiler and Leman go Hyperbolic: Learning Distance Preserving Node Representations

arXiv.org Artificial Intelligence

In recent years, graph neural networks (GNNs) have emerged as a promising tool for solving machine learning problems on graphs. Most GNNs are members of the family of message passing neural networks (MPNNs). There is a close connection between these models and the Weisfeiler-Leman (WL) test of isomorphism, an algorithm that can successfully test isomorphism for a broad class of graphs. Recently, much research has focused on measuring the expressive power of GNNs. For instance, it has been shown that standard MPNNs are at most as powerful as WL in terms of distinguishing non-isomorphic graphs. However, these studies have largely ignored the distances between the representations of nodes/graphs which are of paramount importance for learning tasks. In this paper, we define a distance function between nodes which is based on the hierarchy produced by the WL algorithm, and propose a model that learns representations which preserve those distances between nodes. Since the emerging hierarchy corresponds to a tree, to learn these representations, we capitalize on recent advances in the field of hyperbolic neural networks. We empirically evaluate the proposed model on standard node and graph classification datasets where it achieves competitive performance with state-of-the-art models.
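For readers unfamiliar with the WL algorithm, here is a minimal sketch of 1-dimensional WL colour refinement and of one way to read a tree distance off the resulting colour hierarchy. The functions `wl_refine` and `wl_tree_distance` are hypothetical names, and the distance shown is an illustrative assumption, not the definition used in the paper.

```python
def wl_refine(adj, max_rounds=10):
    """Run 1-WL colour refinement on a graph given as {node: [neighbours]}.

    Returns a list of colourings, one per round, starting with the
    uniform colouring. Each later colouring refines the previous one,
    so the rounds form a hierarchy (a tree) over the nodes.
    """
    colors = {v: 0 for v in adj}
    history = [dict(colors)]
    for _ in range(max_rounds):
        # A node's signature is its own colour plus the sorted colours of its neighbours.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        history.append(new_colors)
        # Refinement only splits classes, so an unchanged class count means stability.
        stable = len(set(new_colors.values())) == len(set(colors.values()))
        colors = new_colors
        if stable:
            break
    return history


def wl_tree_distance(history, u, v):
    """Illustrative tree distance (an assumption, not the paper's definition).

    Counts the rounds in which u and v still share a colour, then returns
    the path length between their leaves in the colour hierarchy.
    """
    shared = 0
    for colors in history:
        if colors[u] != colors[v]:
            break
        shared += 1
    return 2 * (len(history) - shared)


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}  # path graph 0 - 1 - 2
    hist = wl_refine(path)
    print(hist[-1])
    print(wl_tree_distance(hist, 0, 2), wl_tree_distance(hist, 0, 1))
```

On the three-node path graph, the two end nodes keep the same colour in every round, while the middle node separates after the first refinement step.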


Tech companies tapping artificial intelligence to treat and predict mental health disorders

#artificialintelligence

Behavioural health tech provider Holmusk is banking on that, partnering with authorities in Singapore to develop a suite of digital tools for hospitals and clinics. One solution the firm is looking to introduce is a "smart pill" to track when patients forget or skip their medication. It works through a small, grain-sized biosensor embedded within the pill, and a sticky patch on the patient's body that can detect when the pill is ingested. The technology is approved in the United States. "Let's say schizophrenia, depression patients with some psychosis – not taking the pill for a few days can be bad enough to drive them off the cliff. And if you knew that they have stopped taking the pill two days in a row, you can intervene. You can catch them early," chief analytics officer of Holmusk, Joydeep Sarkar, told CNA.


A Biologically Inspired CMOS Image Sensor (Studies in Computational Intelligence, 461): Sarkar, Mukul, Theuwissen, Albert: 9783642349003: Amazon.com: Books

#artificialintelligence

The CMOS metal layer is used to create an embedded micro-polarizer able to sense polarization information. This polarization information is shown to be useful in applications like real time material classification and autonomous agent navigation. Further the sensor is equipped with in pixel analog and digital memories which allow variation of the dynamic range and in-pixel binarization in real time. The binary output of the pixel tries to replicate the flickering effect of the insect's eye to detect smallest possible motion based on the change in state. An inbuilt counter counts the changes in states for each row to estimate the direction of the motion.


Hands-On Transfer Learning with Python: Implement advanced deep learning and neural network models using TensorFlow and Keras: Sarkar, Dipanjan, Bali, Raghav, Ghosh, Tamoghna: 9781788831307: Amazon.com: Books

#artificialintelligence

Dipanjan Sarkar is a Data Scientist at Intel, on a mission to make the world more connected and productive. He primarily works on data science, analytics, business intelligence, application development, and building large-scale intelligent systems. He holds a master of technology degree in Information Technology with specializations in Data Science and Software Engineering from the International Institute of Information Technology, Bangalore. He is also an avid supporter of self-learning, especially Massive Open Online Courses and also holds a Data Science Specialization from Johns Hopkins University on Coursera. Dipanjan has been an analytics practitioner for several years now, specializing in statistical, predictive, and text analytics.


tile2tile: Learning Game Filters for Platformer Style Transfer

arXiv.org Artificial Intelligence

We present tile2tile, an approach for style transfer between levels of tile-based platformer games. Our method involves training models that translate levels from a lower-resolution sketch representation based on tile affordances to the original tile representation for a given game. This enables these models, which we refer to as filters, to translate level sketches into the style of a specific game. Moreover, by converting a level of one game into sketch form and then translating the resulting sketch into the tiles of another game, we obtain a method of style transfer between two games. We use Markov random fields and autoencoders for learning the game filters and apply them to demonstrate style transfer between levels of Super Mario Bros, Kid Icarus, Mega Man and Metroid.


How 'digital twin' AI will transform sustainable farming

#artificialintelligence

This story is part of Fix's What's Next Issue, which looks ahead to the ideas and innovations that will shape the climate conversation in 2022, and asks what it means to have hope now. Check out the full issue here. Imagine you're standing at the edge of a soybean field in Iowa. In the distance, a combine harvester guided by GPS rolls across a field that has been leveled with the aid of a laser, as the farmer at the wheel monitors weather data on her phone. These tools, part of an approach to agronomy called precision agriculture, promise to increase yields and reduce costs by maximizing efficiency.