methodology
Bayesian Mixture-of-Experts: Towards Making LLMs Know What They Don't Know
The Mixture-of-Experts (MoE) architecture has enabled the creation of massive yet efficient Large Language Models (LLMs). However, the standard deterministic routing mechanism presents a significant limitation: its inherent brittleness is a key contributor to model miscalibration and overconfidence, resulting in systems that often do not know what they don't know. This thesis confronts this challenge by proposing a structured \textbf{Bayesian MoE routing framework}. Instead of forcing a single, deterministic expert selection, our approach models a probability distribution over the routing decision itself. We systematically investigate three families of methods that introduce this principled uncertainty at different stages of the routing pipeline: in the \textbf{weight-space}, the \textbf{logit-space}, and the final \textbf{selection-space}. Through a series of controlled experiments on a 3-billion parameter MoE model, we demonstrate that this framework significantly improves routing stability, in-distribution calibration, and out-of-distribution (OoD) detection. The results show that by targeting this core architectural component, we can create a more reliable internal uncertainty signal. This work provides a practical and computationally tractable pathway towards building more robust and self-aware LLMs, taking a crucial step towards making them know what they don't know.
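To make the idea of a distribution over the routing decision concrete, here is a minimal sketch of one logit-space variant. It is an illustration under stated assumptions, not the thesis's method: the Gaussian noise model, the Monte Carlo estimate, and the names `routing_selection_frequencies` and `selection_entropy` are all hypothetical. Router logits are perturbed, the top-k experts are re-selected for each sample, and the entropy of the resulting selection frequencies serves as a rough uncertainty signal.

```python
import math
import random


def top_k_indices(values, k):
    """Indices of the k largest values (deterministic top-k routing)."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def routing_selection_frequencies(mean_logits, logit_std, k=2, n_samples=1000, seed=0):
    """Monte Carlo estimate of how often each expert lands in the top-k.

    Assumption: router logits are independent Gaussians centred on
    mean_logits with a shared standard deviation logit_std.
    """
    rng = random.Random(seed)
    counts = [0] * len(mean_logits)
    for _ in range(n_samples):
        noisy = [rng.gauss(mu, logit_std) for mu in mean_logits]
        for i in top_k_indices(noisy, k):
            counts[i] += 1
    return [c / n_samples for c in counts]


def selection_entropy(freqs, k):
    """Entropy (in nats) of the selection frequencies, normalised to sum to 1."""
    probs = [f / k for f in freqs]
    return -sum(p * math.log(p) for p in probs if p > 0)


# With zero noise the sketch reduces to ordinary deterministic top-k routing.
deterministic = routing_selection_frequencies([2.0, 1.9, 1.8, -1.0], logit_std=0.0)
# With noise, near-tied experts share the selection mass and entropy rises.
uncertain = routing_selection_frequencies([2.0, 1.9, 1.8, -1.0], logit_std=1.0)
entropy_deterministic = selection_entropy(deterministic, k=2)
entropy_uncertain = selection_entropy(uncertain, k=2)
```

In this toy setting the entropy is lowest, at log k, when the same k experts are chosen in every sample, and it grows as the selection becomes less stable. That is the kind of internal signal the abstract describes, although the actual estimators in the thesis may differ.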
A Day in the Life of a Data Scientist
A number of weeks ago I solicited feedback from my LinkedIn connections regarding what their typical day in the life of a data scientist consisted of. The response was genuinely overwhelming! Sure, no data scientist role is the same, and that's the reason for the inquiry. So many potential data scientists are interested in knowing what it is that those on the other side keep themselves busy with all day, and so I thought that having a few connections provide their insight might be a useful endeavor. What follows is some of the great feedback I received via email and LinkedIn messages from those who were interested in providing a few paragraphs on their daily professional tasks.
Fight Against Cancer with Artificial Intelligence and Big Data - OpenMind
From anywhere and with just a mobile phone, anyone can become an air traffic controller, or at least a virtual air traffic controller. One can follow the world traffic flow of airplanes live and find out where an aircraft is coming from and where it is headed. One just has to take advantage of the millions of pieces of data that fly across the Internet. This is the magic power of Big Data. Artificial intelligence then enters the picture to find patterns and give meaning to the massive and heterogeneous information stream.
424
Editor: On "Learning Language" I was dismayed by the inclusion of William Katke's article ("Learning Language Using A Pattern Recognition Approach," Spring 1985). Usually you do an excellent job of representing "the current state of the art in Artificial Intelligence" (to quote your Editorial Policy), but I consider this article an exception. First of all, although the article claims to be on "Learning Language," what it presents is at best a knowledge-free approach to learning syntax. I saw no evidence that the induced syntax is useful for anything, and good reasons to believe that it is not, such as the unmnemonic category names and the intrinsic limitations of finite state grammars. Second, this kind of stuff has been done before, and it didn't work too well then either; for a useful overview of the field and pointers into the literature, see the article on "Grammatical Inference" in Volume 3 of The Handbook of ... The ideas and issues presented were firmly focused on a conventional view of the design process, a view I can caricaturize as the SPIV methodology: ... plete specifications and the verification of proposed implementations, we should concentrate more on incremental development of specifications as a result of assessment of performance.
Techniques and Methodology
Should Artificial Intelligence strive to model and understand human cognitive and perceptual systems? Should it operate at a more abstract mathematical level of characterizing possible intelligent action, independent of human performance? Or should it focus on building working programs that exhibit increasingly expert behavior, irrespective of theoretical or psychological concerns? These questions lie at the heart of most current debate on whether AI is a science, an art, or a new branch of engineering. In fact, some researchers believe it is all three and consequently build systems that perform some interesting task, arguing for the "theoretical significance" and "psychological validity" of the approach. In fact, it assumes the cognitive psychology paradigm as central and suggests that AI research would benefit from closer adherence to the data and methods of psychological research. We welcome contributions in support of other research methodologies in AI, as well as discussions com... Research for this paper was conducted at the University of Chicago Center for Cognitive Science under a grant.
Letters
However, I believe that the distinction of "neats" and "scruffies" raised at Cog Sci in '81 didn't define scruffies as people who built expert systems [they didn't really exist as a "real" part of MAD. Instead, I believe AI ... These are the researchers who read Hawking and say "gee, if his model of the 10^-23 second big bang is right, then the distribution of intergalactic gases should be relatively even. Let's go see if that's true. However, to run our experiments we'll need a more sensitive space-based sensing device, so let's work with the engineers to design one." I think one could make the case (although not from the data collected in Cohen's survey) that the two methodologies are not informed and influenced by each other to the extent they should or could be.
Knowledge And Experience In Artificial Intelligence
Via G. Galilei 5, 21027 Ispra (VA), Italy. The period since the last conference in this series has been characterized by the explosive expansion of AI out of the confines of institutions of basic research like university departments into the worlds of industry, business, and government (a development I had long expected). But it seems to me that there are plenty, perhaps an overabundance, of other occasions, other conferences, other workshops, and the like, at which the applications of AI would appropriately be considered. In fact, it is ironic, though perhaps it may be understandable, that precisely now, when the outside world has discovered and started showing its appreciation of AI and its potential, there is a widespread malaise among research workers in the field about the health of their subject. This malaise has to do not only with logistic issues such as the drain of very good people from research into applications, or some of the gross inadequacies of structural and funding support by governments. It has to do also with the very heart and methodology of the subject.
The 1999 Asia-Pacific Conference on Intelligent-Agent Technology
IAT'99 was the first meeting in this new series and was held in Hong Kong from 14 to 17 December. It was sponsored by Hong Kong Baptist University, the Croucher Foundation, the Epson Foundation, The MIT Press, the Association for Computing Machinery (ACM) Hong Kong, and the Institute of Electrical and Electronics Engineers Hong Kong Section Computer Chapter and in cooperation with ACM Special Interest Groups in Artificial Intelligence (SIGART), Knowledge Discovery in Data (SIGKDD), and Computer-Human Interaction (SIGCHI). Jiming Liu (Hong Kong Baptist University) and Ning Zhong (Yamaguchi University, Japan) were the program chairs, and Setsuo Ohsuga (Waseda University) and Ernest Lam (Hong Kong Baptist University) were the general chairs. IAT'99 successfully brought together over 150 researchers and practitioners to share their original research results and practical development experiences in intelligent-agent technology. The participants were from Australia, Austria, Belgium, ...
Review of Intelligent Scheduling
Intelligent Scheduling is a system-oriented book on scheduling systems. Each chapter describes a scheduling system in terms of the particular scheduling problems being addressed, design assumptions, and the overall paradigm being used. The book is divided into two sections: (1) scheduling methodologies and (2) application case studies. The methodology chapters focus on research systems and scheduling techniques. The application chapters focus on fielded embedded scheduling systems and describe difficulties and lessons learned.
The Nature of AI: A Reply to Schank
In fact, there are enough opinions for four men. That is, the views advanced are contradictory. I agree with one of the Roger Schanks, and disagree with the other three. Schank hoped that his article would start a debate on the issues he raised. ... A fifth answer is also advanced, but is immediately withdrawn. As Schank points out, this is unsatisfactory because it leads ...