 forehead


SCALEX: Scalable Concept and Latent Exploration for Diffusion Models

arXiv.org Artificial Intelligence

Image generation models frequently encode social biases, including stereotypes tied to gender, race, and profession. Existing methods for analyzing these biases in diffusion models either focus narrowly on predefined categories or depend on manual interpretation of latent directions. These constraints limit scalability and hinder the discovery of subtle or unanticipated patterns. We introduce SCALEX, a framework for scalable and automated exploration of diffusion model latent spaces. SCALEX extracts semantically meaningful directions from H-space using only natural language prompts, enabling zero-shot interpretation without retraining or labelling. This allows systematic comparison across arbitrary concepts and large-scale discovery of internal model associations. We show that SCALEX detects gender bias in profession prompts, ranks semantic alignment across identity descriptors, and reveals clustered conceptual structure without supervision. By linking prompts to latent directions directly, SCALEX makes bias analysis in diffusion models more scalable, interpretable, and extensible than prior approaches.


'We were all pretty privileged': Allison Williams on Girls, nepo babies and toxic momfluencers

The Guardian

If you had wandered the set of the film M3gan 2.0 last year, chances are you would have stumbled into M3gan, the terrifying humanoid doll, staring lifelessly while she waited to be called for her next scene. "Sometimes she would stand in the corner of the soundstage," says Allison Williams with a nervy laugh. "The dilemma is: do you turn her around so she's facing the wall, or do you let her face the room?" In the sequel to the sci-fi horror M3gan, Williams resumes her role as Gemma, a roboticist who has become a crusader against rampant and reckless AI development after her creation – developed for her orphaned niece – became murderous. Acting opposite M3gan was unsettling, says Williams, speaking over a video call from a hotel room in New York. Sometimes she was played by the 15-year-old dancer Amie Donald, but often she was a robotic doll, animated by a small team. "When she's been working for a while, her eyelids can get sticky," says Williams. M3gan's handlers would paint lubricant on to her eyeballs with a brush and Williams would have to catch herself: "She's not flinching and for a second you're like: 'Ugh.' Then you remember: this is not a live thing." Still best known for her first role as Marnie in Lena Dunham's landmark TV series Girls, Williams has gravitated towards comedy-tinged horror in recent years. Her first post-Girls film role was in the Oscar-winning dark comedy horror Get Out. It and M3gan were relatively low-budget projects that became cultural phenomena – Get Out for its commentary on racial politics, M3gan for what it says about the dangers of AI (as well as the uncanniness of M3gan herself). Williams has long been interested in AI – she knows Sam Altman, the co-founder and CEO of OpenAI, which created ChatGPT, who put her in touch with robotics experts when she was researching the role of Gemma.
The film raises questions not only about the danger of rogue AI, but about the ethical concerns – including how we should feel about the "rights" of devices. "It's easy to imbue anything that has AI in it with humanity.


Northern India's elusive snow leopards get their close up

Popular Science

Adapted to live in some of our planet's most inhospitable mountainous regions, snow leopards (Panthera uncia) are the ultimate mountain climbers and an iconic big cat. A recent camera trapping study found that India is home to the densest population of the black-and-white carnivores on Earth, and that most live in a remote northern region of the subcontinent. Here, they also appear to co-exist alongside rural communities, where they are respected by local human populations. The findings are detailed in a study published May 7 in the open-access journal PLOS One. Snow leopards are found in mountainous regions across 12 Asian countries: Afghanistan, Bhutan, China, India, Kazakhstan, Kyrgyz Republic, Mongolia, Nepal, Pakistan, Russia, Tajikistan, and Uzbekistan.


DotLumen's haptic headset could help blind people navigate

Engadget

DotLumen founder Cornel Amariei describes his product as a "self driving" system that gives blind and low-vision people a way to get around. It's essentially the electronic equivalent of a guide dog, helping users avoid obstacles when walking around. The Romanian company turned up to CES 2025 in Las Vegas armed with prototypes of its headset that it hopes will make blind people's lives a lot easier. The headset looks like a chunky piece of VR gear, with a front unit sitting on your forehead just above your eyes. There's a chunky power and processing pack on the rear that keeps the bulky device's weight balanced while walking around.


PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck

arXiv.org Artificial Intelligence

CLIP-based classifiers rely on the prompt containing a {class name} that is known to the text encoder. Therefore, they perform poorly on new classes or on classes whose names rarely appear on the Internet (e.g., scientific names of birds). For fine-grained classification, we propose PEEB, an explainable and editable classifier that (1) expresses the class name as a set of text descriptors that describe the visual parts of that class; and (2) matches the embeddings of the detected parts to their textual descriptors in each class to compute a logit score for classification. In a zero-shot setting where the class names are unknown, PEEB outperforms CLIP by a huge margin (~10x in top-1 accuracy). Compared to part-based classifiers, PEEB is not only the state-of-the-art (SOTA) in the supervised-learning setting (88.80% and 92.20% accuracy on CUB-200 and Dogs-120, respectively) but also the first to enable users to edit the text descriptors to form a new classifier without any re-training. Compared to concept bottleneck models, PEEB is also the SOTA in both zero-shot and supervised-learning settings.
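
The part-to-descriptor matching described in the abstract can be illustrated with a minimal sketch. This is not PEEB's implementation: the function names, the toy two-dimensional vectors, and the choice to score a class as the sum of per-part cosine similarities are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def class_logits(part_embeddings, class_descriptors):
    """Score each class by summing part/descriptor similarities.

    part_embeddings:   {part_name: image-side vector for the detected part}
    class_descriptors: {class_name: {part_name: text-side descriptor vector}}
    Summing cosine similarities is an assumed scoring rule, not the paper's.
    """
    logits = {}
    for class_name, descriptors in class_descriptors.items():
        logits[class_name] = sum(
            cosine(part_embeddings[part], text_vec)
            for part, text_vec in descriptors.items()
            if part in part_embeddings
        )
    return logits


def predict(part_embeddings, class_descriptors):
    """Return the class name with the highest logit."""
    logits = class_logits(part_embeddings, class_descriptors)
    return max(logits, key=logits.get)
```

Because the classes exist only as entries in `class_descriptors`, replacing or adding an entry yields a different classifier with no training step, which is the editing property the abstract highlights.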


How Suboptimal is Training rPPG Models with Videos and Targets from Different Body Sites?

arXiv.org Artificial Intelligence

Remote camera measurement of the blood volume pulse via photoplethysmography (rPPG) is a compelling technology for scalable, low-cost, and accessible assessment of cardiovascular information. Neural networks currently provide the state-of-the-art for this task and supervised training or fine-tuning is an important step in creating these models. However, most current models are trained on facial videos using contact PPG measurements from the fingertip as targets/labels. One of the reasons for this is that few public datasets to date have incorporated contact PPG measurements from the face. Yet there is copious evidence that the PPG signals at different sites on the body have very different morphological features. Is training a facial video rPPG model using contact measurements from another site on the body suboptimal? Using a recently released unique dataset with synchronized contact PPG and video measurements from both the hand and face, we can provide precise and quantitative answers to this question. We obtain up to 40% lower mean squared errors between the waveforms of the predicted and the ground truth PPG signals using state-of-the-art neural models when using PPG signals from the forehead compared to using PPG signals from the fingertip. We also show qualitatively that the neural models learn to predict the morphology of the ground truth PPG signal better when trained on the forehead PPG signals. However, while models trained from the forehead PPG produce a more faithful waveform, models trained from a finger PPG do still learn the dominant frequency (i.e., the heart rate) well.
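
The two quantities the abstract compares, waveform mean squared error and the dominant frequency, can be sketched as follows. This is an illustrative sketch rather than the authors' evaluation code: the function names, the 0.7 to 3.0 Hz search band, and the 0.01 Hz scan step are assumptions.

```python
import math


def mse(predicted, target):
    """Mean squared error between two waveforms of equal length."""
    if len(predicted) != len(target):
        raise ValueError("waveforms must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def dominant_frequency_hz(signal, fs, f_min=0.7, f_max=3.0, step=0.01):
    """Scan a frequency band and return the frequency with the most power.

    fs is the sampling rate in Hz. The default band (about 42 to 180 beats
    per minute) is an assumed plausible heart-rate range.
    """
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]  # remove the DC offset
    best_f, best_power = f_min, -1.0
    n_steps = int(round((f_max - f_min) / step))
    for k in range(n_steps + 1):
        f = f_min + k * step
        # Correlate the signal with a cosine and a sine at frequency f.
        re = sum(x * math.cos(2 * math.pi * f * i / fs) for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * f * i / fs) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_f, best_power = f, power
    return best_f


def heart_rate_bpm(signal, fs):
    """Heart rate in beats per minute from the dominant frequency."""
    return 60.0 * dominant_frequency_hz(signal, fs)
```

Two predictions can share the same dominant frequency while differing in MSE, which is consistent with the abstract's observation that fingertip-trained models still recover heart rate while reproducing the waveform shape less faithfully.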


MindGames: Targeting Theory of Mind in Large Language Models with Dynamic Epistemic Modal Logic

arXiv.org Artificial Intelligence

Theory of Mind (ToM) is a critical component of intelligence but its assessment remains the subject of heated debates. Prior research applied human ToM assessments to natural language processing models using either human-created standardized tests or rule-based templates. However, these methods primarily focus on simplistic reasoning and require further validation. Here, we leverage dynamic epistemic logic to isolate a particular component of ToM and to generate controlled problems. We also introduce new verbalization techniques to express these problems in English natural language. Our findings indicate that some language model scaling (from 70M to 6B and 350M to 174B) does not consistently yield results better than random chance. While GPT-4 demonstrates superior epistemic reasoning capabilities, there is still room for improvement. Our code and datasets are publicly available (https://huggingface.co/datasets/sileod/mindgames, https://github.com/sileod/llm-theory-of-mind).


Top Most Interesting Machine Learning Apps

#artificialintelligence

Machine Learning is the branch of science that studies how computers can learn without being explicitly programmed. As the name implies, it provides the computer with a feature that makes it more human-like: the ability to learn. Machine learning is being used actively today, possibly in many more places than one would expect. The same factors that have fueled the resurgence of interest in machine learning have also made data mining and Bayesian analysis more popular than ever before. All of this means that you can create models quickly and automatically that can analyze larger, more complex data and provide faster, more accurate results - even on a very large scale.


Real Time Video based Heart and Respiration Rate Monitoring

arXiv.org Artificial Intelligence

In recent years, research on monitoring vital signs with smartphones has grown significantly. Dedicated sensors such as electrocardiogram (ECG) and photoplethysmographic (PPG) sensors can detect heart rate (HR) and respiration rate (RR). Smartphone cameras can also measure HR by detecting and processing imaging photoplethysmographic (iPPG) signals from the video of a user's face. Indeed, the variation in the intensity of the green channel can be measured by the iPPG signals of the video. This study aimed to provide a method to extract heart rate and respiration rate using the video of individuals' faces. The proposed method is based on measuring fluctuations in the Hue, and can therefore extract both HR and RR from the video of a user's face. The proposed method was evaluated on 25 healthy individuals. For each subject, a 20-second video of his/her face was recorded. Results show that the proposed approach of measuring iPPG using Hue gives more accurate rates than the Green channel.
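
A minimal sketch of the hue-based idea follows, assuming each video frame has already been cropped to skin pixels and is given as a list of (r, g, b) floats in [0, 1]. The function names and the two search bands (0.7 to 3.0 Hz for HR, 0.1 to 0.5 Hz for RR) are assumptions for illustration and are not taken from the paper.

```python
import colorsys
import math


def mean_hue(frame):
    """Average hue of one frame, where frame is a list of (r, g, b) in [0, 1].

    A plain arithmetic mean is used; this assumes the hues do not wrap
    around the 0/1 boundary of the hue circle.
    """
    hues = [colorsys.rgb_to_hsv(r, g, b)[0] for r, g, b in frame]
    return sum(hues) / len(hues)


def peak_frequency_hz(trace, fps, f_min, f_max, step=0.01):
    """Return the frequency in [f_min, f_max] with the most spectral power."""
    mean = sum(trace) / len(trace)
    centered = [x - mean for x in trace]
    best_f, best_power = f_min, -1.0
    n_steps = int(round((f_max - f_min) / step))
    for k in range(n_steps + 1):
        f = f_min + k * step
        re = sum(x * math.cos(2 * math.pi * f * i / fps) for i, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * f * i / fps) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_f, best_power = f, power
    return best_f


def estimate_rates(frames, fps):
    """Estimate (heart rate, respiration rate) per minute from a hue trace."""
    trace = [mean_hue(frame) for frame in frames]
    hr = 60.0 * peak_frequency_hz(trace, fps, 0.7, 3.0)  # assumed HR band
    rr = 60.0 * peak_frequency_hz(trace, fps, 0.1, 0.5)  # assumed RR band
    return hr, rr
```

Using one hue trace with two non-overlapping bands is a simple way to read both rates from the same signal. A green-channel baseline could be sketched the same way by averaging the g value of each pixel instead of the hue.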


Analyzing Brain Activity to Detect and Treat Patient Pain Even When Unconscious

#artificialintelligence

Researchers from MIT and elsewhere have developed a system that measures a patient's pain level by analyzing brain activity from a portable neuroimaging device. The system could help doctors diagnose and treat pain in unconscious and noncommunicative patients, which could reduce the risk of chronic pain that can occur after surgery. Pain management is a surprisingly challenging, complex balancing act. Overtreating pain, for example, runs the risk of addicting patients to pain medication. Undertreating pain, on the other hand, may lead to long-term chronic pain and other complications.