
Inside the Colosseum's Passage of Commodus, where emperors once walked

Popular Science

One theory suggests the infamous Roman emperor Commodus survived an assassination attempt in the tunnel, which is now open to the public. From October 2024 to September 2025, a team of experts restored part of the tunnel, open to visitors for the first time. They say all roads lead to Rome. But in the Eternal City, all of the major roads were thought to lead somewhere very specific--a single column called the Milliarium Aureum, or the golden milestone.


Do You Know What I Know?

The New Yorker

Steven Pinker argues that common knowledge makes the world go round--and off the rails. Take your young kid with you as you commute through Penn Station and you'll find that you have a lot to explain. Walking through the Long Island Railroad concourse, my son was perplexed by the close proximity of three chicken-themed restaurants--Chick-fil-A, Raising Cane's, and Pollo Campero--and by the fact that a shop called Gotham News mainly seemed to sell candy and bottled water. He also wanted to know why some people, as they strolled or waited, drank out of cans in brown paper bags.


EMPEROR: Efficient Moment-Preserving Representation of Distributions

Liu, Xinran, Sharma, Shansita D., Kolouri, Soheil

arXiv.org Machine Learning

Popular choices such as global average pooling [1] and CLS-style attention pooling [2] are computationally attractive but collapse the underlying distribution of features without guarantees on what information is preserved. This heuristic reduction can hinder interpretability, robustness, and data efficiency, and has motivated alternatives that try to encode more distributional structure [3, 4, 5, 6, 7]. However, most existing approaches emphasize empirical performance over principled recoverability or quantifiable fidelity to the original feature distribution. In this paper, we propose EMPEROR, an Efficient Moment-Preserving Representation of Distributions, that treats a layer's features as samples from a finite positive measure and encodes that measure through its moments. The core idea is to replace ambiguous, high-dimensional moment estimation with sliced moments: we project features onto multiple directions, fit lightweight univariate Gaussian mixture models (GMMs) to each projection, and aggregate the resulting slice parameters into a compact descriptor. Theoretically, sliced moments determine the multivariate measure under mild conditions (via Carleman's condition and the Cramér-Wold theorem), and specializing to GMMs yields explicit, stable moment formulas.
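To make the encoding concrete, here is a minimal NumPy/scikit-learn sketch of the sliced-GMM pipeline the abstract describes: project features onto random directions, fit a univariate GMM per slice, and concatenate the slice parameters. The function name, hyperparameters, and Gaussian sampling of directions are illustrative assumptions, not the authors' implementation.

    # A minimal sketch of the sliced-moment idea, not the authors' code.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def emperor_descriptor(features: np.ndarray, n_slices: int = 16,
                           n_components: int = 3, seed: int = 0) -> np.ndarray:
        """Encode feature vectors (n_samples, dim) as sliced GMM parameters.

        Each slice is a random unit direction; features are projected onto it
        and a univariate GMM is fit. The slice's (weight, mean, variance)
        triples are concatenated into one compact descriptor.
        """
        rng = np.random.default_rng(seed)
        dim = features.shape[1]
        # Random directions on the unit sphere (the Cramér-Wold slices).
        directions = rng.normal(size=(n_slices, dim))
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)

        parts = []
        for d in directions:
            proj = features @ d                  # 1-D projection of the measure
            gmm = GaussianMixture(n_components=n_components, random_state=seed)
            gmm.fit(proj.reshape(-1, 1))
            # Sort components by mean so the descriptor is permutation-invariant.
            order = np.argsort(gmm.means_.ravel())
            parts.append(np.concatenate([gmm.weights_[order],
                                         gmm.means_.ravel()[order],
                                         gmm.covariances_.ravel()[order]]))
        return np.concatenate(parts)

The descriptor length is n_slices * 3 * n_components, independent of the number of input feature vectors, which is what makes it a fixed-size pooling alternative.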


NIRVANA: Structured pruning reimagined for large language models compression

Ai, Mengting, Wei, Tianxin, Chen, Sirui, He, Jingrui

arXiv.org Artificial Intelligence

Transformer-based large language models (LLMs) (Vaswani et al., 2017) have revolutionized natural language processing, but their scale makes deployment a critical bottleneck. To alleviate it, model compression techniques--particularly pruning (LeCun et al., 1989)--have emerged as an essential strategy for creating lighter, more accessible models. Recent unstructured pruning methods such as SparseGPT (Frantar and Alistarh, 2023) and Wanda (Sun et al., 2023) prune individual weights and can also be applied in semi-structured form. Semi-structured methods impose fixed patterns (e.g., 2:4 sparsity (Fang et al., 2024; Zheng et al., 2024)), yet still struggle to support efficient training and require specialized hardware. ShortGPT (Men et al., 2024) introduces global or layer-wise pruning strategies, and SliceGPT (Ashkboos et al., 2024) applies PCA-based transformations per block but remains highly sensitive to calibration data. These methods largely ignore fine-tuning dynamics, an oversight that often results in suboptimal pruning choices and impaired model performance. To address these critical gaps, we introduce NIRVANA (NTK-InfoRmed adaptiVe neuron & AttentioN heAd pruning), a novel structured pruning method explicitly designed to balance immediate zero-shot accuracy preservation with robust fine-tuning capability by tightly integrating pruning decisions with model fine-tuning dynamics through the lens of the Neural Tangent Kernel (NTK) (Jacot et al., 2018). NIRVANA also contributes an adaptive sparsity allocation strategy that dynamically adjusts pruning ratios across layers and modules, explicitly addressing disparities overlooked by existing pruning methodologies. Since most current LLMs are based on the SwiGLU structure (Shazeer, 2020), we focus on that setting; in Llama3's implementation, which employs Grouped Query Attention (GQA), multiple query heads share key-value projections, and without loss of generality the analysis extends to the vector-output case.
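Since the abstract names the mechanism (per-head importance scoring plus adaptive per-layer sparsity) without giving formulas, here is a deliberately generic PyTorch scaffold for structured attention-head pruning. The magnitude-based score is a stand-in for NIRVANA's NTK-informed criterion, which is not reproduced here; all names are hypothetical.

    # Illustrative structured-pruning scaffold only; NIRVANA's actual score is
    # NTK-informed, and this magnitude score merely marks where it plugs in.
    import torch

    def head_importance(attn_out_weight: torch.Tensor, n_heads: int) -> torch.Tensor:
        """Score each attention head by the L2 norm of its output-projection slice.

        attn_out_weight: (d_model, d_model) output projection, with columns
        grouped by head. Returns one score per head.
        """
        d_model = attn_out_weight.shape[1]
        head_dim = d_model // n_heads
        slices = attn_out_weight.view(attn_out_weight.shape[0], n_heads, head_dim)
        return slices.norm(dim=(0, 2))

    def prune_heads(scores: torch.Tensor, sparsity: float) -> torch.Tensor:
        """Return a boolean keep-mask removing the lowest-scoring heads.

        An adaptive allocation strategy, as described in the paper, would vary
        `sparsity` per layer and module rather than fixing it globally.
        """
        n_prune = int(sparsity * scores.numel())
        keep = torch.ones_like(scores, dtype=torch.bool)
        if n_prune > 0:
            keep[scores.argsort()[:n_prune]] = False
        return keep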


Lee Pace Has Big Hopes for the Fourth Season of 'Foundation'

WIRED

WIRED spoke to Lee Pace on the eve of the season finale of 'Foundation' about clone consciousness, robot gods, and what's next for the newly renewed show. In the world of prestige sci-fi, 'Foundation' reigns as the biggest sleeper hit. Mention the Apple TV+ adaptation of Isaac Asimov's classic series in a group of friends and you'll suddenly find everyone has been secretly watching it. Something of a flawed masterpiece, the show, which wraps its third season Friday, has been averaging about 1.5 million hours watched per week in the US over the last month, according to Luminate. Reasons for the show's popularity are many, but it seems to have gained traction as it's become more, well, relevant. The series, like Asimov's books, focuses on a group of economists using a predictive algorithm to guide the destiny of humanity through the collapse of a galactic empire.


Trump Is the Emperor of A.I. Slop

The New Yorker

On February 19th, Donald Trump logged onto Truth Social to congratulate himself on vanquishing congestion pricing in his home state. "CONGESTION PRICING IS DEAD," he posted. "Manhattan, and all of New York, is SAVED." The message was amplified by the White House's official X account, which tweeted it with an A.I.-generated image of Trump, golden-haired and golden-crowned, blotting out the New York City skyline. The illustration, which was styled to look like the cover of Time magazine, displayed the President's fondness for crude symbols of power and wealth.


Opinion

#artificialintelligence

ChatGPT opens a Pandora's box of existential fears. Silicon Valley brainiacs have talked about safeguards and kill switches for A.I., but you know they won't pull the plug when their baby turns into M3gan. Once A.I. can run disinformation campaigns at lightning speed, will democracy stand a chance? We seem headed toward a Matrix where "it will become cheaper to show fakes than to show reality," Jaron Lanier, the father of virtual reality, wrote in Tablet. Will bad actors use A.I. to promote bigotry or hijack nuclear weapons?


When "Foundation" Gets the Blockbuster Treatment, Isaac Asimov's Vision Gets Lost

The New Yorker

An innocent viewer of the new Apple TV series "Foundation"--a lavish production complete with clone emperors, a haunted starship, and a killer android who tears off her own face--might be surprised to learn that the novels it's based on inspired Paul Krugman to become an economist. Isaac Asimov's classic saga revolves around the dismal science of "psychohistory," a hybrid of math and psychology that can predict the future. Its inventor, Hari Seldon, lives in a twelve-thousand-year-old galactic empire, which, his equations reveal, is about to collapse. "Interstellar wars will be endless," he warns. His followers establish a Foundation on the frontier world of Terminus--a colony tasked with conserving all human knowledge--where they spend the next millennium fulfilling "Seldon's plan" to reunite the galaxy.


Is the Brain a Useful Model for Artificial Intelligence?

#artificialintelligence

In the summer of 2009, the Israeli neuroscientist Henry Markram strode onto the TED stage in Oxford, England, and made an immodest proposal: Within a decade, he said, he and his colleagues would build a complete simulation of the human brain inside a supercomputer. They'd already spent years mapping the cells in the neocortex, the supposed seat of thought and perception. "It's a bit like going and cataloging a piece of the rain forest," Markram explained. "How many trees does it have? What shapes are the trees?"


Summarising Historical Text in Modern Languages

Peng, Xutan, Zheng, Yi, Lin, Chenghua, Siddharthan, Advaith

arXiv.org Artificial Intelligence

We introduce the task of historical text summarisation, where documents in historical forms of a language are summarised in the corresponding modern language. This is a fundamentally important routine for historians and digital humanities researchers, but it has never been automated. We compile a high-quality gold-standard text summarisation dataset, which consists of historical German and Chinese news from hundreds of years ago summarised in modern German or Chinese. Based on cross-lingual transfer learning techniques, we propose a summarisation model that can be trained even with no cross-lingual (historical to modern) parallel data, and further benchmark it against state-of-the-art algorithms. We report automatic and human evaluations that distinguish the historical-to-modern summarisation task from standard cross-lingual summarisation (i.e., modern to modern language), highlight the distinctness and value of our dataset, and demonstrate that our transfer learning approach outperforms standard cross-lingual benchmarks on this task.
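For context, here is a minimal sketch of the kind of standard cross-lingual pipeline (translate, summarise, translate back) that such transfer approaches are benchmarked against. It is not the paper's model; the Hugging Face checkpoints named are assumed public models, not the authors' artifacts, and the English pivot is a deliberately naive choice.

    # A naive cross-lingual summarisation baseline, not the paper's method.
    from transformers import pipeline

    # Illustrative public checkpoints (assumed available on the HF hub).
    de_en = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")
    en_de = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
    summ = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

    def baseline_summarise(historical_de: str) -> str:
        """Translate -> summarise -> translate back, pivoting through English.

        A historical-to-modern system like the paper's transfers knowledge
        instead of relying on an MT model that has never seen historical
        orthography, which is exactly where this baseline breaks down.
        """
        english = de_en(historical_de[:3000])[0]["translation_text"]
        summary_en = summ(english, max_length=96, min_length=16)[0]["summary_text"]
        return en_de(summary_en)[0]["translation_text"]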