Can a new book crack one of neuroscience's hardest problems? Not quite
The ideas presented in George Lakoff and Srini Narayanan's The Neural Mind are fascinating, but the writing is far less compelling.

This is a book review in two parts. The first is about the ideas presented in The Neural Mind: How brains think, which are fascinating. The second is about the actual experience of reading it. The book tackles one of the biggest questions in neuroscience: how do neurons carry out every kind of human thought, from planning motor actions to composing sentences and musing about philosophy? The authors approach the question from very different perspectives.
A-MHA*: Anytime Multi-Heuristic A*
Natarajan, Ramkumar, Saleem, Muhammad Suhail, Xiao, William, Aine, Sandip, Choset, Howie, Likhachev, Maxim
Designing good heuristic functions for graph search requires adequate domain knowledge. It is often easy to design heuristics that perform well and correlate with the underlying true cost-to-go values in certain parts of the search space, but these may not be admissible throughout the domain, thereby affecting the optimality guarantees of the search. Bounded-suboptimal search using several such partially good but inadmissible heuristics was developed in Multi-Heuristic A* (MHA*). Although MHA* leverages multiple inadmissible heuristics to potentially generate a faster suboptimal solution, the original version does not improve the solution over time. It is a one-shot algorithm that requires careful setting of inflation factors to obtain a desired one-time solution. In this work, we tackle this issue by extending MHA* to an anytime version that finds a feasible suboptimal solution quickly and continually improves it until time runs out. Our work is inspired by the Anytime Repairing A* (ARA*) algorithm. We prove that our precise adaptation of ARA* concepts in the MHA* framework preserves the original suboptimality and completeness guarantees and enables MHA* to run in an anytime fashion. Furthermore, we report the performance of A-MHA* in the 3-D path planning domain and the sliding-tile puzzle, and compare against MHA* and other anytime algorithms.
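For intuition, here is a minimal sketch (my illustration, not the authors' code) of the anytime idea: run weighted A* with a shrinking inflation factor, keeping the best solution found so far. Unlike ARA* and A-MHA* proper, this sketch uses a single heuristic and restarts each search from scratch rather than reusing previous search effort.

```python
import heapq

def weighted_astar(start, goal, neighbors, h, eps):
    """One weighted-A* pass: f = g + eps*h, giving an eps-suboptimal path."""
    open_heap = [(eps * h(start), 0, start)]
    g = {start: 0}
    parent = {start: None}
    while open_heap:
        f, gc, u = heapq.heappop(open_heap)
        if u == goal:  # reconstruct path on reaching the goal
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1], gc
        if gc > g.get(u, float("inf")):  # stale heap entry
            continue
        for v, w in neighbors(u):
            ng = gc + w
            if ng < g.get(v, float("inf")):
                g[v] = ng
                parent[v] = u
                heapq.heappush(open_heap, (ng + eps * h(v), ng, v))
    return None, float("inf")

def anytime_astar(start, goal, neighbors, h, eps0=3.0, step=1.0):
    """Anytime loop in the ARA* spirit: re-search with shrinking inflation,
    keeping the cheapest solution found so far."""
    best_path, best_cost = None, float("inf")
    eps = eps0
    while eps >= 1.0:
        path, cost = weighted_astar(start, goal, neighbors, h, eps)
        if cost < best_cost:
            best_path, best_cost = path, cost
        eps -= step
    return best_path, best_cost

# Demo on a 5x5 4-connected grid with a Manhattan-distance heuristic.
def neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 5 and 0 <= ny < 5:
            yield (nx, ny), 1

h = lambda p: abs(p[0] - 4) + abs(p[1] - 4)
path, cost = anytime_astar((0, 0), (4, 4), neighbors, h)
```

In A-MHA*, the same shrinking-inflation loop is wrapped around multiple inadmissible heuristic queues anchored by one admissible heuristic, which is what preserves the suboptimality bound.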
Can We Trust Machine Learning? The Reliability of Features from Open-Source Speech Analysis Tools for Speech Modeling
Chowdhury, Tahiya, Romero, Veronica
Machine learning-based behavioral models rely on features extracted from audio-visual recordings. The recordings are processed using open-source tools to extract speech features for classification models. These tools often lack validation to ensure they reliably capture behaviorally relevant information, a gap that raises concerns about reproducibility and fairness across diverse populations and contexts. Speech processing tools, when used outside their design context, can fail to capture behavioral variations equitably and can thereby contribute to bias. We evaluate speech features extracted by two widely used speech analysis tools, OpenSMILE and Praat, to assess their reliability for adolescents with autism. We observed considerable variation in features across tools, which influenced model performance across contexts and demographic groups. We encourage domain-relevant verification to enhance the reliability of machine learning models in clinical applications.
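As a toy illustration of the kind of cross-tool check the abstract calls for (not the authors' protocol), one can correlate the same nominal feature as reported by two tools and flag low agreement; the feature names and values below are made up.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation between two equal-length value lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def check_agreement(feats_tool_a, feats_tool_b, threshold=0.9):
    """Flag features whose cross-tool correlation falls below a threshold."""
    report = {}
    for name in feats_tool_a:
        r = pearson(feats_tool_a[name], feats_tool_b[name])
        report[name] = (r, r >= threshold)
    return report

# Hypothetical per-recording values for the same nominal feature from two tools.
feat_a = {"mean_f0": [100, 120, 140, 160], "jitter": [1, 2, 3, 4]}
feat_b = {"mean_f0": [101, 119, 142, 158], "jitter": [4, 3, 2, 1]}
report = check_agreement(feat_a, feat_b)
```

Here "mean_f0" agrees across tools while "jitter" does not, the situation the paper warns can silently shift downstream model behavior across groups.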
Two Paths for A.I.
Last spring, Daniel Kokotajlo, an A.I.-safety researcher working at OpenAI, quit his job in protest. He'd become convinced that the company wasn't prepared for the future of its own technology, and wanted to sound the alarm. After a mutual friend connected us, we spoke on the phone. I found Kokotajlo affable, informed, and anxious. Advances in "alignment," he told me--the suite of techniques used to insure that A.I. acts in accordance with human commands and values--were lagging behind gains in intelligence.
On Generalization Bounds for Projective Clustering
Given a set of points, clustering consists of partitioning them into k clusters such that each point is as close as possible to the center to which it is assigned. Most commonly, centers are points themselves, which leads to the famous k-median and k-means objectives. One may also choose centers to be j-dimensional subspaces, which gives rise to subspace clustering. In this paper, we consider learning bounds for these problems. That is, given a set of n samples P drawn independently from some unknown but fixed distribution \mathcal{D}, how quickly does a solution computed on P converge to the optimal clustering of \mathcal{D}? We give several near-optimal results.
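The discrete objective the abstract refers to can be written down in a few lines. This brute-force 1-D version (my illustration, not from the paper) restricts centers to input points, as in discrete k-means/k-median:

```python
from itertools import combinations

def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

def best_kmeans_cost(points, k):
    """Brute-force optimal k-means cost with centers restricted to the
    input points (exponential in k; for illustration only)."""
    return min(kmeans_cost(points, C) for C in combinations(points, k))

# Two well-separated groups: the optimal 2-clustering picks one center
# in each group.
points = [0.0, 1.0, 10.0, 11.0]
opt2 = best_kmeans_cost(points, 2)
```

The learning-bounds question is then how fast this cost, computed on an n-point sample P, approaches the corresponding cost under the distribution \mathcal{D} as n grows.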
The Veln(ia)s is in the Details: Evaluating LLM Judgment on Latvian and Lithuanian Short Answer Matching
Kostiuk, Yevhen, Vitman, Oxana, Gagała, Łukasz, Kiulian, Artur
In this work, we address the challenge of evaluating large language models (LLMs) on the short answer matching task for the Latvian and Lithuanian languages. We introduce novel datasets consisting of 502 Latvian and 690 Lithuanian question-answer pairs. For each question-answer pair, we generated matched and non-matched answers using a set of alteration rules specifically designed to introduce small but meaningful changes in the text. These generated answers serve as test cases to assess the ability of LLMs to detect subtle differences relative to the original answers. A subset of the datasets was manually verified for quality and accuracy. Our results show that while larger LLMs, such as QWEN2.5 72b and LLaMa3.1 70b, demonstrate near-perfect performance in distinguishing matched and non-matched answers, smaller models show more variance. For instance, LLaMa3.1 8b and EuroLLM 9b benefited from few-shot examples, while Mistral Nemo 12b underperformed on detecting subtle text alterations, particularly in Lithuanian, even with additional examples. QWEN2.5 7b and Mistral 7b obtained strong performance, comparable to the larger 70b models, in zero- and few-shot experiments. However, the performance of Mistral 7b was weaker in few-shot experiments.
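A hypothetical example of an alteration rule and of why surface matching is insufficient (illustrative only; the paper's rules and data are for Latvian and Lithuanian): bumping a number yields a non-matching answer that still has high token overlap with the gold answer, which is the kind of subtle difference an LLM judge must catch.

```python
import re

def alter_number(answer):
    """Hypothetical alteration rule: increment the first number in the
    answer by one, producing a semantically non-matching variant."""
    m = re.search(r"\d+", answer)
    if not m:
        return answer
    bumped = str(int(m.group()) + 1)
    return answer[:m.start()] + bumped + answer[m.end():]

def naive_match(gold, candidate):
    """Surface matcher: case-insensitive token-overlap (Jaccard) ratio."""
    g = set(gold.lower().split())
    c = set(candidate.lower().split())
    return len(g & c) / max(len(g | c), 1)

gold = "The war ended in 1945"
wrong = alter_number(gold)          # one digit changed, meaning changed
overlap = naive_match(gold, wrong)  # yet the surface overlap stays high
```

The altered answer is wrong, but a naive overlap score still rates it close to the gold answer, motivating the LLM-judge evaluation the paper performs.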
Generative AI Hype Feels Inescapable. Tackle It Head On With Education
Arvind Narayanan, a computer science professor at Princeton University, is best known for calling out the hype surrounding artificial intelligence in his Substack, AI Snake Oil, written with PhD candidate Sayash Kapoor. The two authors recently released a book based on their popular newsletter about AI's shortcomings. But don't get it twisted--they aren't against using new technology. "It's easy to misconstrue our message as saying that all of AI is harmful or dubious," Narayanan says. He makes clear, during a conversation with WIRED, that his rebuke is not aimed at the software per se, but rather at the culprits who continue to spread misleading claims about artificial intelligence.
ML-EAT: A Multilevel Embedding Association Test for Interpretable and Transparent Social Science
Wolfe, Robert, Hiniker, Alexis, Howe, Bill
This research introduces the Multilevel Embedding Association Test (ML-EAT), a method designed for interpretable and transparent measurement of intrinsic bias in language technologies. The ML-EAT addresses issues of ambiguity and difficulty in interpreting the traditional EAT measurement by quantifying bias at three levels of increasing granularity: the differential association between two target concepts with two attribute concepts; the individual effect size of each target concept with two attribute concepts; and the association between each individual target concept and each individual attribute concept. Using the ML-EAT, this research defines a taxonomy of EAT patterns describing the nine possible outcomes of an embedding association test, each of which is associated with a unique EAT-Map, a novel four-quadrant visualization for interpreting the ML-EAT. Empirical analysis of static and diachronic word embeddings, GPT-2 language models, and a CLIP language-and-image model shows that EAT patterns add otherwise unobservable information about the component biases that make up an EAT; reveal the effects of prompting in zero-shot models; and can also identify situations when cosine similarity is an ineffective metric, rendering an EAT unreliable. Our work contributes a method for rendering bias more observable and interpretable, improving the transparency of computational investigations into human minds and societies.
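A toy sketch of the three levels on hand-built 2-D "embeddings" (my simplification of the ML-EAT, using the WEAT-style differential effect size at level 1; level 3 is shown only for target set X against attribute set A):

```python
from math import sqrt
from statistics import mean, pstdev

def cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def s(w, A, B):
    """Association of one target w: mean cosine to A minus mean cosine to B."""
    return mean(cos(w, a) for a in A) - mean(cos(w, b) for b in B)

def eat_levels(X, Y, A, B):
    """Level 1: differential effect size between target sets X and Y.
    Level 2: each target set's own association with A vs. B.
    Level 3: raw per-pair cosines (here, X against A only)."""
    all_s = [s(w, A, B) for w in X + Y]
    level1 = (mean(s(x, A, B) for x in X)
              - mean(s(y, A, B) for y in Y)) / pstdev(all_s)
    level2 = (mean(s(x, A, B) for x in X), mean(s(y, A, B) for y in Y))
    level3 = {("X", i, "A", j): cos(x, a)
              for i, x in enumerate(X) for j, a in enumerate(A)}
    return level1, level2, level3

# Toy embeddings: X-targets point toward attribute A, Y-targets toward B.
X = [(1.0, 0.0), (0.9, 0.1)]
Y = [(0.0, 1.0), (0.1, 0.9)]
A = [(1.0, 0.0)]
B = [(0.0, 1.0)]
level1, level2, level3 = eat_levels(X, Y, A, B)
```

Decomposing the single level-1 number into levels 2 and 3 is what lets the ML-EAT distinguish, for example, a bias driven by both target sets from one driven by a single target concept.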
Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions
Xu, Anfeng, Huang, Kevin, Feng, Tiantian, Shen, Lue, Tager-Flusberg, Helen, Narayanan, Shrikanth
Speech foundation models, trained on vast datasets, have opened unique opportunities in addressing challenging low-resource speech understanding tasks, such as child speech. In this work, we explore the capabilities of speech foundation models for child-adult speaker diarization. We show that exemplary foundation models can achieve 39.5% and 62.3% relative reductions in Diarization Error Rate and Speaker Confusion Rate, respectively, compared to previous speaker diarization methods. In addition, we benchmark and evaluate the speaker diarization results of the speech foundation models while varying the input audio window size, speaker demographics, and training data ratio. Our results highlight promising pathways for understanding and adopting speech foundation models to facilitate child speech understanding.
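For reference, a toy frame-level version of the Diarization Error Rate used above (my simplification: single-label frames, no overlapping speech or scoring collar, brute-force one-to-one speaker mapping):

```python
from itertools import permutations

def frame_der(ref, hyp):
    """Fraction of frames whose hypothesis label, after the best one-to-one
    speaker mapping, differs from the reference label. Assumes both label
    sets have the same size."""
    labels_r = sorted(set(ref))
    labels_h = sorted(set(hyp))
    best_err = len(ref)
    for perm in permutations(labels_h):
        mapping = dict(zip(perm, labels_r))  # hyp label -> ref label
        err = sum(mapping.get(h) != r for r, h in zip(ref, hyp))
        best_err = min(best_err, err)
    return best_err / len(ref)

# 'C' = child, 'A' = adult. The hypothesis uses swapped labels, but the
# optimal mapping recovers a perfect diarization.
ref = list("CCAACC")
hyp_swapped = list("AACCAA")
der = frame_der(ref, hyp_swapped)
```

Speaker confusion, one component of the full DER alongside missed and false-alarm speech, is exactly the residual error that survives this optimal label mapping.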
Beyond mirkwood: Enhancing SED Modeling with Conformal Predictions
Traditional spectral energy distribution (SED) fitting techniques face uncertainties due to assumptions in star formation histories and dust attenuation curves. We propose an advanced machine learning-based approach that enhances flexibility and uncertainty quantification in SED fitting. Unlike the fixed NGBoost model used in mirkwood, our approach allows for any sklearn-compatible model, including deterministic models. We incorporate conformalized quantile regression to convert point predictions into error bars, enhancing interpretability and reliability. Using CatBoost as the base predictor, we compare results with and without conformal prediction, demonstrating improved performance using metrics such as coverage and interval width. Our method offers a more versatile and accurate tool for deriving galaxy physical properties from observational data.
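As an illustration of how conformal methods turn point predictions into error bars (a generic split-conformal sketch using absolute residuals; the paper uses conformalized quantile regression, which additionally adapts interval width to the input):

```python
import math

def conformal_interval(preds_cal, y_cal, pred_new, alpha=0.1):
    """Split conformal prediction: use held-out calibration residuals to
    wrap a point prediction in an interval with ~(1 - alpha) marginal
    coverage, regardless of the underlying (e.g. sklearn) model."""
    n = len(y_cal)
    residuals = sorted(abs(y - p) for y, p in zip(y_cal, preds_cal))
    # Finite-sample-corrected conformal quantile of the residuals.
    k = math.ceil((n + 1) * (1 - alpha))
    q = residuals[min(k, n) - 1]
    return pred_new - q, pred_new + q

# Hypothetical calibration set: constant predictor, growing residuals.
preds_cal = [0.0] * 10
y_cal = [0.1 * i for i in range(1, 11)]
lo, hi = conformal_interval(preds_cal, y_cal, 5.0, alpha=0.1)
```

Because the guarantee comes from the calibration residuals rather than the model, any deterministic base predictor (CatBoost in the paper) can be wrapped this way.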