Truth is Universal: Robust Detection of Lies in LLMs
Large Language Models (LLMs) have revolutionised natural language processing, exhibiting impressive human-like capabilities. In particular, LLMs are capable of "lying": knowingly outputting false statements. It is therefore important to develop methods for detecting when LLMs lie. Indeed, several authors have trained classifiers to detect LLM lies based on the models' internal activations. However, other researchers showed that these classifiers may fail to generalise, for example to negated statements.
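The probing setup the abstract describes (fitting a classifier to internal activations) can be sketched on synthetic data. Everything below is invented for illustration, not the paper's data or method: the activation dimensionality, the single "truth direction", and the way negation is modelled as a sign flip of that direction, which reproduces the generalisation failure the abstract mentions.

```python
# Hypothetical sketch of a linear lie-detection probe on SYNTHETIC
# "activations"; dimensions, directions, and data are all illustrative.
import numpy as np

rng = np.random.default_rng(0)
d = 32                      # activation dimensionality (illustrative)
n = 400                     # statements per class
truth_dir = rng.normal(size=d)
truth_dir /= np.linalg.norm(truth_dir)

# Synthetic activations: true statements shifted along +truth_dir,
# false statements along -truth_dir, plus isotropic noise.
X_true = rng.normal(size=(n, d)) + 2.0 * truth_dir
X_false = rng.normal(size=(n, d)) - 2.0 * truth_dir
X = np.vstack([X_true, X_false])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Logistic-regression probe trained by plain gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / len(y))
    b -= 0.5 * np.mean(p - y)

acc = np.mean(((1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5) == y)

# Failure mode: if negation flips the activation pattern, TRUE negated
# statements land where false affirmative ones did, and the probe
# misclassifies them (mimicking the non-generalisation the abstract notes).
X_neg_true = rng.normal(size=(n, d)) - 2.0 * truth_dir
neg_acc = np.mean((1.0 / (1.0 + np.exp(-(X_neg_true @ w + b))) > 0.5) == 1.0)
```

On this toy data the probe separates affirmative statements almost perfectly yet scores near zero on the "negated" true statements, which is the generalisation failure motivating the paper.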
How Universal Are Our Emotions?
There's nothing like migration to reveal how things that seem natural may be artifacts of culture. When I left India for college in England, I was surprised to find that pinching my Adam's apple didn't mean, as I had thought it meant everywhere, "on my honor." I learned to expect only mockery at the side-to-side tilts of the head with which I expressed degrees of agreement or disagreement, and trained myself to keep to the Aristotelian binary of nod and shake. Around that time, I also learned--from watching the British version of "The Office"--that the word "cringe" could be an adjective, as in the phrase "so cringe." It turned out that there was a German word for the feeling inspired by David Brent, the cringe-making boss played by Ricky Gervais in the show: Fremdschämen--the embarrassment one feels when other people have, perhaps obliviously, embarrassed themselves.
TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm
Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial, where $t\ge1$ and $d\ge0$. Letting $c_{t,d}$ denote the smallest such factor, clearly $c_{1,0}=1$, and it can be shown that $c_{t,d}\ge 2$ for all other $t$ and $d$. Yet current computationally efficient algorithms show only $c_{t,1}\le 2.25$, and the bound rises quickly to $c_{t,d}\le 3$ for $d\ge 9$. We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t,d}=2$ for all $(t,d)\ne(1,0)$. Additionally, for many practical distributions, the lowest approximation distance is achieved by polynomials with vastly varying numbers of pieces. We provide a method that estimates this number near-optimally, and hence helps approach the best possible approximation. Experiments combining the two techniques confirm improved performance over existing methodologies.
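The $t$-piece, degree-$d$ polynomial approximation the abstract refers to can be illustrated in its simplest case, degree $0$, where the estimate is just a histogram, along with the $\ell_1$ distance by which it is judged. This is a minimal sketch, not TURF's estimator; the Beta(2,5) target, the sample size, and the piece counts are arbitrary choices for illustration.

```python
# Illustrative sketch, NOT TURF's algorithm: the simplest t-piece,
# degree-0 polynomial fit (a histogram) and its numerical l1 error
# against the true density.  Target and parameters are arbitrary.
import numpy as np
from math import gamma

rng = np.random.default_rng(1)
n = 20000
samples = rng.beta(2.0, 5.0, size=n)   # samples from an "unknown" density on [0,1]

def piecewise_constant_fit(samples, t):
    """t-piece, degree-0 estimate: one constant on each of t equal-width bins."""
    counts, edges = np.histogram(samples, bins=t, range=(0.0, 1.0))
    heights = counts / (len(samples) * np.diff(edges))  # normalised to integrate to 1
    return edges, heights

def l1_error(edges, heights, a=2.0, b=5.0, grid=10000):
    """Numerical l1 distance between the estimate and the true Beta(a,b) pdf."""
    x = np.linspace(1e-6, 1.0 - 1e-6, grid)
    true_pdf = gamma(a + b) / (gamma(a) * gamma(b)) * x**(a - 1) * (1 - x)**(b - 1)
    bin_idx = np.clip(np.searchsorted(edges, x) - 1, 0, len(heights) - 1)
    return float(np.sum(np.abs(heights[bin_idx] - true_pdf)) * (x[1] - x[0]))

# More pieces shrink the approximation bias until sampling noise takes over.
errs = {t: l1_error(*piecewise_constant_fit(samples, t)) for t in (2, 8, 32)}
```

Sweeping $t$ and keeping the best fit is the naive approach; choosing the number of pieces near-optimally from the samples themselves is the second contribution the abstract describes.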