Dyslexia and the Reading Wars
Proven methods for teaching the readers who struggle most have been known for decades. Why do we often fail to use them? "There's a window of opportunity to intervene," Mark Seidenberg, a cognitive neuroscientist, said. "You don't want to let that go." In 2024, my niece Caroline received a Ph.D. in gravitational-wave physics. Her research interests include "the impact of model inaccuracies on biases in parameters recovered from gravitational wave data" and "Petrov type, principal null directions, and Killing tensors of slowly rotating black holes in quadratic gravity." I watched a little of her dissertation defense, on Zoom, and was lost as soon as she'd finished introducing herself. She and her husband now live in Italy, where she has a postdoctoral appointment. Caroline's academic achievements seem especially impressive if you know that until third grade she could barely read: to her, words on a page looked like a pulsing mass. She attended a private school in Connecticut, and there was a set time every day when students selected books to read on their own. "I can't remember how long that lasted, but it felt endless," she told me. She hid her disability by turning pages when her classmates did, and by volunteering to draw illustrations during group story-writing projects. One day, she told her grandmother that she could sound out individual letters but when she got to "the end of a row" she couldn't remember what had come before. A psychologist eventually identified her condition as dyslexia. Fluent readers sometimes think of dyslexia as a tendency to put letters in the wrong order or facing the wrong direction, but it's more complicated than that.
- North America > United States > Connecticut (0.24)
- Europe > Italy (0.24)
- North America > United States > New York > Bronx County > New York City (0.05)
- (9 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education > Educational Setting (1.00)
State of play: who holds the power in the video games industry in 2025?
The world's most powerful people have started to realise that games have immense influence - why else would the White House post an image of Trump as Halo's Master Chief? I love playing video games, but what interests me most as a journalist are the ways in which games intersect with real life. One of the joys of spending 20 years on this beat has been meeting hundreds of people whose lives have been meaningfully enhanced by games, and as their cultural influence has grown, these stories have become more and more plentiful. There is another side to this, however.
- North America > United States (1.00)
- Oceania > Australia (0.05)
- North America > Canada (0.05)
- (3 more...)
- Information Technology > Artificial Intelligence > Games (0.96)
- Information Technology > Communications > Social Media (0.72)
- Information Technology > Game Technology (0.61)
SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators
Li, Jonathan, Farahini, Nasim, Iuliugin, Evgenii, Vesterlund, Magnus, Häggström, Christian, Wang, Guangtao, Upasani, Shubhangi, Sachdeva, Ayush, Li, Rui, Fu, Faline, Wu, Chen, Siddiqua, Ayesha, Long, John, Zhao, Tuowen, Musaddiq, Matheen, Zeffer, Håkan, Du, Yun, Wang, Mingran, Li, Qinghua, Li, Bo, Thakker, Urmish, Prabhakar, Raghu
The proliferation of 100B+ parameter Large Language Models (LLMs) with 100k+ context length support has resulted in increasing demands for on-chip memory to support large KV caches. Techniques such as StreamingLLM and SnapKV demonstrate how to control KV cache size while maintaining model accuracy. Yet, these techniques are not commonly used in industrial deployments built on frameworks like vLLM or SGLang. The reason is twofold: on one hand, the static graphs and continuous batching methodology employed by these frameworks make it difficult to admit modifications to the standard multi-head attention algorithm; on the other hand, the accuracy implications of such techniques on modern instruction-following and reasoning models are not well understood, obscuring the need to implement them. In this paper, we explore these accuracy implications on Llama-3.1-8B-Instruct and DeepSeek-R1, and develop SnapStream, a KV cache compression method that can be deployed at scale. We demonstrate the efficacy of SnapStream in a 16-way tensor-parallel deployment of DeepSeek-671B on SambaNova SN40L accelerators running at 128k context length and up to 1832 tokens per second in a real production setting. SnapStream enables $4\times$ improved on-chip memory usage and introduces minimal accuracy degradation on LongBench-v2, AIME24 and LiveCodeBench. To the best of our knowledge, this is the first implementation of sparse KV attention techniques deployed in a production inference system with static graphs and continuous batching.
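To make the KV-cache-control idea concrete, here is a minimal sketch of the StreamingLLM-style eviction policy the abstract references: keep a handful of initial "attention sink" positions plus a sliding window of the most recent positions, and evict everything in between. This is an illustrative toy, not SnapStream or the SN40L implementation; the function name and default sizes are assumptions for the example.

```python
# Hedged sketch of StreamingLLM-style KV cache control: retain a few
# initial "sink" positions plus a window of recent positions, evicting
# the middle of the sequence. Illustrative only; not SnapStream itself.

def streaming_keep_indices(seq_len, n_sink=4, window=8):
    """Return the KV cache positions retained for a sequence of seq_len."""
    if seq_len <= n_sink + window:
        return list(range(seq_len))                      # nothing to evict yet
    sinks = list(range(n_sink))                          # always-kept initial tokens
    recent = list(range(seq_len - window, seq_len))      # sliding recency window
    return sinks + recent
```

Under this policy the cache footprint is bounded at `n_sink + window` entries regardless of sequence length, which is what makes 100k+ contexts fit in fixed on-chip memory.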
- Africa (0.28)
- Asia > Japan (0.14)
- North America > United States > New York (0.14)
- (2 more...)
- Workflow (0.67)
- Overview (0.67)
- Research Report > New Finding (0.45)
- Health & Medicine (1.00)
- Energy (1.00)
- Information Technology (0.93)
- (2 more...)
Dynamic Rank Factor Model for Text Streams
Shaobo Han, Lin Du, Esther Salazar, Lawrence Carin
We propose a semi-parametric and dynamic rank factor model for topic modeling, capable of (i) discovering topic prevalence over time, and (ii) learning contemporary multi-scale dependence structures, providing topic and word correlations as a byproduct. The high-dimensional and time-evolving ordinal/rank observations (such as word counts), after an arbitrary monotone transformation, are well accommodated through an underlying dynamic sparse factor model. The framework naturally admits heavy-tailed innovations, capable of inferring abrupt temporal jumps in the importance of topics. Posterior inference is performed through straightforward Gibbs sampling, based on the forward-filtering backward-sampling algorithm. Moreover, an efficient data subsampling scheme is leveraged to speed up inference on massive datasets. The modeling framework is illustrated on two real datasets: the US State of the Union Address and the JSTOR collection from Science.
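The forward-filtering backward-sampling (FFBS) step the abstract mentions can be illustrated on the simplest possible state-space model. The sketch below is a toy scalar local-level model (x_t = x_{t-1} + w_t, y_t = x_t + v_t), not the paper's rank factor model; the parameter names and priors are assumptions for the example.

```python
import math, random

# Hedged toy sketch of forward-filtering backward-sampling (FFBS) for a
# scalar local-level model; returns one posterior draw of the state path.
#   x_t = x_{t-1} + w_t,  w_t ~ N(0, q)     (state evolution)
#   y_t = x_t + v_t,      v_t ~ N(0, r)     (observation)

def ffbs(ys, q=1.0, r=1.0, m0=0.0, p0=10.0, rng=random):
    # Forward pass: Kalman filter moments m_t = E[x_t | y_1:t], p_t = Var.
    ms, ps = [], []
    m, p = m0, p0
    for y in ys:
        p_pred = p + q                     # predict variance (random walk)
        k = p_pred / (p_pred + r)          # Kalman gain
        m = m + k * (y - m)                # filtered mean
        p = (1 - k) * p_pred               # filtered variance
        ms.append(m); ps.append(p)
    # Backward pass: sample x_T, then x_t | x_{t+1}, y_1:t for t = T-1..1.
    xs = [0.0] * len(ys)
    xs[-1] = rng.gauss(ms[-1], math.sqrt(ps[-1]))
    for t in range(len(ys) - 2, -1, -1):
        g = ps[t] / (ps[t] + q)            # backward gain
        mean = ms[t] + g * (xs[t + 1] - ms[t])
        var = (1 - g) * ps[t]
        xs[t] = rng.gauss(mean, math.sqrt(var))
    return xs
```

In a Gibbs sampler this draw of the latent path alternates with draws of the remaining model parameters conditional on the path.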
- North America > Mexico (0.14)
- North America > Panama (0.14)
- North America > Cuba (0.14)
- (7 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
The UN's AI warnings grow louder
The UN's AI warnings grow louder Welcome back to In the Loop, new twice-weekly newsletter about AI. It was a busy week for our team: Tharin Pillay was on site during the UN General Assembly in New York, while Harry Booth and Nikita Ostrovsky were at the "All In AI" event in Montreal. If you're reading this in your browser, why not subscribe to have the next one delivered straight to your inbox? The United Nations General Assembly met this week in New York. While the assembly members spent much of their time on the crises in Palestine and Sudan, they also devoted a good chunk to AI.
- North America > United States > New York (0.48)
- North America > Canada > Quebec > Montreal (0.26)
- Asia > Middle East > Palestine (0.25)
- (8 more...)
- Government (1.00)
- Law > Statutes (0.32)
- Media > Film (0.31)
Binary classification for perceived quality of headlines and links on worldwide news websites, 2018-2024
McCutcheon, Austin, de Oliveira, Thiago E. A., Zheleznov, Aleksandr, Brogly, Chris
The proliferation of online news enables potential widespread publication of perceived low-quality news headlines/links. As a result, we investigated whether it was possible to automatically distinguish perceived lower-quality news headlines/links from perceived higher-quality headlines/links. We evaluated twelve machine learning models on a binary, balanced dataset of 57,544,214 worldwide news website headlines/links from 2018-2024 (28,772,107 per class) with 115 extracted linguistic features. Binary labels for each text were derived from scores based on expert consensus regarding the respective news domain quality. Traditional ensemble methods, particularly the bagging classifier, had strong performance (88.1% accuracy, 88.3% F1, 80/20 train/test split). Fine-tuned DistilBERT achieved the highest accuracy (90.3%, 80/20 train/test split) but required more training time. The results suggest that both NLP features with traditional classifiers and deep learning models can effectively differentiate perceived news headline/link quality, with some trade-off between predictive performance and training time.
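The feature-based half of this pipeline can be sketched in a few lines: extract simple linguistic features from a headline, then feed the feature vector to any binary classifier. The feature names below are illustrative assumptions, not the paper's actual 115 features.

```python
import string

# Hedged sketch of linguistic feature extraction for headline-quality
# classification. Feature set is illustrative, not the paper's 115 features.

def headline_features(text):
    """Map a headline to a small dict of surface linguistic features."""
    words = text.split()
    n_chars = len(text)
    return {
        "n_words": len(words),
        "avg_word_len": n_chars / max(len(words), 1),
        "caps_ratio": sum(c.isupper() for c in text) / max(n_chars, 1),
        "n_exclaim": text.count("!"),
        "n_punct": sum(c in string.punctuation for c in text),
    }
```

Vectors produced this way could be fed to a bagging ensemble or any other off-the-shelf binary classifier, mirroring the traditional-features arm of the study.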
Review for NeurIPS paper: Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks
Weaknesses: Training methodology is not properly described. Is it a regression-based loss on the numerical value? It would be good if you could explain how it relates to IPA-GNN. GATs also allow attending with different weights to incoming messages from neighbors. Note that GAT is a kind of convolution-based GNN and does not use a recurrent unit, so you would have to adapt the attention mechanism to the context of GGNN.
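The reviewer's point about GAT-style attention — each incoming neighbor message receiving its own softmax weight — can be sketched in scalar form. This is a generic illustration of the mechanism, not code from the paper under review; the function names and scoring callback are assumptions.

```python
import math

# Hedged sketch of GAT-style aggregation: each incoming neighbor message
# gets its own attention weight via a softmax over learned scores.
# Scalar toy, not the reviewed IPA-GNN implementation.

def attention_aggregate(node_h, neighbor_hs, score):
    """Combine neighbor states using softmax attention weights."""
    logits = [score(node_h, h) for h in neighbor_hs]
    mx = max(logits)                         # stabilize the softmax
    exps = [math.exp(l - mx) for l in logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    return sum(w * h for w, h in zip(weights, neighbor_hs))
```

In a recurrent architecture like GGNN, the aggregated message would then be fed into the recurrent unit's state update rather than replacing the state directly, which is the adaptation the reviewer is pointing at.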
A Paradigm for Potential Model Performance Improvement in Classification and Regression Problems. A Proof of Concept
Lobo-Cabrera, Francisco Javier
Binary classification, multilabel classification, and regression prediction constitute fundamental paradigms in machine learning, addressing distinct types of predictive modeling tasks. Binary classification involves categorizing instances into one of two classes, typically denoted as positive and negative [1][2][3]. This modeling framework is particularly applicable to scenarios where outcomes are binary in nature, as observed in domains such as spam detection and medical diagnosis. In multilabel classification, the scope extends to situations where instances can be associated with multiple classes simultaneously, a common occurrence in applications like image tagging and document categorization [1][4]. By contrast, regression prediction is concerned with forecasting continuous outcomes, aiming to predict numeric values [3].
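The three paradigms above can be contrasted with minimal toy predictors; these are generic illustrations under assumed thresholds and a least-squares line, not the paper's proposed method.

```python
# Hedged toy illustrations of the three predictive paradigms described
# above (illustrative only; not the paper's proposed paradigm).

def binary_classify(score, threshold=0.5):
    """Binary classification: map a single score to one of two classes."""
    return "positive" if score >= threshold else "negative"

def multilabel_classify(scores, threshold=0.5):
    """Multilabel classification: decide each label independently."""
    return [label for label, s in scores.items() if s >= threshold]

def regress(xs, ys, x_new):
    """Regression: predict a continuous value via a least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (x_new - mx)
```

The key contrast: the first returns one of exactly two classes, the second an arbitrary-sized subset of labels, and the third an unbounded numeric value.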
California governor vetoes bill for obligatory human operators in autonomous trucks
California Gov. Gavin Newsom has blocked a bill that would have required autonomous trucks weighing more than 10,000 pounds (4,536 kg) to have human safety drivers on board while operating on public roads. The governor said in a statement that the legislation, which California Senate members passed in a 36-2 vote, was unnecessary. Newsom believes existing laws are sufficient to ensure there's an "appropriate regulatory framework." The governor noted that, under a 2012 law, the state's Department of Motor Vehicles collaborates with the National Highway Traffic Safety Administration, California Highway Patrol and other relevant bodies "to determine the regulations necessary for the safe operation of autonomous vehicles on public roads." Newsom added that the DMV is committed to making sure rules keep up with the pace of evolving autonomous vehicle tech.
- Transportation > Ground > Road (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)