 mercury


Masked Diffusion Models as Energy Minimization

Neural Information Processing Systems

We present a systematic theoretical framework that interprets masked diffusion models (MDMs) as solutions to energy minimization problems in discrete optimal transport. Specifically, we prove that three distinct energy formulations (kinetic, conditional kinetic, and geodesic energy) are mathematically equivalent under the structure of MDMs, and that MDMs minimize all three when the mask schedule satisfies a closed-form optimality condition. This unification not only clarifies the theoretical foundations of MDMs, but also motivates practical improvements in sampling. By parameterizing interpolation schedules via Beta distributions, we reduce the schedule design space to a tractable 2D search, enabling efficient post-training tuning without model modification. Experiments on synthetic and real-world benchmarks demonstrate that our energy-inspired schedules outperform hand-crafted baselines, particularly in low-step sampling settings.
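As a rough illustration of what a Beta-parameterized schedule could look like, the sketch below treats the cumulative fraction of unmasked tokens as the CDF of a Beta(a, b) distribution, so that the whole schedule is controlled by two numbers. This reading of the abstract is an assumption, and the names `beta_schedule` and `unmask_counts` are hypothetical; the paper's actual parameterization and optimality condition may differ.

```python
import math

def beta_pdf(x, a, b):
    # Density of Beta(a, b) at x in (0, 1), computed in log space for stability.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def beta_schedule(num_steps, a, b, grid=2000):
    # Approximate the Beta CDF at k / num_steps with a midpoint rule
    # (assumed schedule: fraction of tokens unmasked after step k).
    cells = num_steps * grid
    h = 1.0 / cells
    acc, cdf = 0.0, [0.0]
    for i in range(cells):
        acc += beta_pdf((i + 0.5) * h, a, b) * h
        if (i + 1) % grid == 0:
            cdf.append(acc)
    # Normalize so the schedule ends exactly at 1.0.
    return [c / acc for c in cdf]

def unmask_counts(seq_len, num_steps, a, b):
    # Number of tokens to unmask at each sampling step.
    cdf = beta_schedule(num_steps, a, b)
    targets = [round(c * seq_len) for c in cdf]
    return [targets[k + 1] - targets[k] for k in range(num_steps)]

if __name__ == "__main__":
    # A 2D search over (a, b) would score each candidate with a frozen model;
    # here we only print the per-step counts for a few candidates.
    for a, b in [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]:
        print((a, b), unmask_counts(seq_len=64, num_steps=8, a=a, b=b))
```

With a = b = 1 the schedule is linear; other values shift unmasking toward the early or late steps. Because only (a, b) change, a search of this kind leaves the trained model untouched.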


They said their toothpaste was the best for my daughter... then I read the sickening claims

Daily Mail - Science & tech

I am the type of mom who reads every label before buying a product for my four-year-old daughter. So when I learned about a lawsuit against a toothpaste marketed as safe, natural and free from artificial dyes and sweeteners, I immediately checked the tube sitting in my bathroom.


Mercury: A Code Efficiency Benchmark for Code Large Language Models

Neural Information Processing Systems

Amidst the recent strides in evaluating Large Language Models for Code (Code LLMs), existing benchmarks have mainly focused on the functional correctness of generated code, neglecting the importance of its computational efficiency. To fill the gap, we present Mercury, the first code efficiency benchmark for Code LLMs. It comprises 1,889 Python tasks, each accompanied by adequate solutions that serve as real-world efficiency baselines, enabling a comprehensive analysis of the runtime distribution. Based on the distribution, we introduce a new metric, Beyond, which computes a runtime-percentile-weighted Pass score to reflect functional correctness and code efficiency simultaneously. On Mercury, leading Code LLMs can achieve 65% on Pass, but less than 50% on Beyond. Given that an ideal Beyond score would be aligned with the Pass score, this indicates that while Code LLMs exhibit impressive capabilities in generating functionally correct code, there remains a notable gap in their efficiency. Finally, our empirical experiments reveal that Direct Preference Optimization (DPO) serves as a robust baseline for enhancing code efficiency compared with Supervised Fine-Tuning (SFT), which paves a promising avenue for future exploration of efficient code generation. Our code and data are available on GitHub: https://github.com/Elfsong/Mercury.
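To make the idea of a runtime-percentile-weighted Pass score concrete, here is a minimal sketch. It assumes that a failing solution scores 0 and a passing solution scores the fraction of reference runtimes it matches or beats; the function names and this exact weighting are illustrative assumptions, not the benchmark's official definition, which lives in the linked repository.

```python
def pass_score(results):
    # results: list of (passed, runtime_seconds), one entry per task.
    return sum(1 for passed, _ in results if passed) / len(results)

def beyond_score(results, baselines):
    # baselines: for each task, runtimes of its reference solutions.
    # Assumed weighting: a passing solution earns the share of reference
    # runtimes that are at least as slow as it; a failing one earns 0.
    total = 0.0
    for (passed, runtime), refs in zip(results, baselines):
        if passed:
            total += sum(1 for r in refs if r >= runtime) / len(refs)
    return total / len(results)

if __name__ == "__main__":
    results = [(True, 1.0), (True, 3.0), (False, 0.5)]
    baselines = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], [1.0, 2.0]]
    print("Pass:", pass_score(results))
    print("Beyond:", beyond_score(results, baselines))
```

Under this weighting Beyond can never exceed Pass, and the two coincide only when every passing solution is at least as fast as all of its references, which matches the abstract's remark that an ideal Beyond score would align with Pass.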


'Planetary parade' will see SIX planets align in rare spectacle tonight - here's the best time to spot Mercury, Venus, Jupiter, Saturn, Uranus and Neptune in the night sky

Daily Mail - Science & tech

Keen astronomers are in for a treat tonight, as a rare 'planetary parade' of six planets lights up the night sky. Tonight, Mercury, Venus, Jupiter, Saturn, Uranus, and Neptune will all be visible from Earth. Excitingly, four of these planets will be visible with the naked eye, so you won't need any special equipment to enjoy the spectacle.


Mercury: A Code Efficiency Benchmark for Code Large Language Models

Neural Information Processing Systems

Amidst the recent strides in evaluating Large Language Models for Code (Code LLMs), existing benchmarks have mainly focused on the functional correctness of generated code, neglecting the importance of their computational efficiency.



Parallel Thinking, Sequential Answering: Bridging NAR and AR for Efficient Reasoning

arXiv.org Artificial Intelligence

We study reasoning tasks through a framework that integrates auto-regressive (AR) and non-autoregressive (NAR) language models. AR models, which generate text sequentially, excel at producing coherent outputs but often suffer from slow inference, particularly in reasoning-intensive domains such as mathematics and code, where lengthy chains of thought are required. In contrast, NAR models, such as discrete diffusion models, allow parallel generation and offer substantial speedups, though typically at the cost of reduced output quality. To address these limitations, we introduce a new paradigm in which an NAR model efficiently produces intermediate reasoning traces, which subsequently guide an AR model to deliver precise final answers. Experiments demonstrate that our approach yields a significant 26% improvement over strong baselines while substantially reducing inference cost.


Towards Better Correctness and Efficiency in Code Generation

arXiv.org Artificial Intelligence

While code large language models have demonstrated remarkable progress in code generation, the generated code often exhibits poor runtime efficiency, limiting its practical application in performance-sensitive scenarios. To address this limitation, we propose an efficiency-oriented reinforcement learning framework guided by a novel performance reward. Based on this framework, we take a deeper dive into the code efficiency problem, identifying key bottlenecks and proposing methods to overcome them: (1) Dynamic exploration overcomes the static data constraints of offline fine-tuning, enabling the discovery of more efficient code implementations. (2) The error-insensitive reinforcement learning method and high-contrast efficiency signals are crucial for mitigating systematic errors and achieving effective optimization. (3) Online exploration is most effective when starting from a high-correctness baseline, as this allows for efficiency improvements without sacrificing accuracy. With these discoveries, we finally propose a two-stage tuning method, which achieves high and balanced performance across correctness and efficiency. Experimental results show the effectiveness of the method, which improves code correctness by 10.18% and runtime efficiency by 7.75% on a 7B model, achieving performance comparable to that of a much larger model.


Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method

arXiv.org Artificial Intelligence

Compute optimization through token reduction of LLM prompts is an emerging task in NLP and next-generation agentic AI. In this white paper, we introduce a novel (patent pending) text representation scheme and a first-of-its-kind word-level semantic compression of paragraphs that can lead to over 90% token reduction, while retaining high semantic similarity to the source text. We explain how this novel compression technique can be lossless and how the detail granularity is controllable. We discuss benchmark results over open source data (i.e., Bram Stoker's Dracula, available through Project Gutenberg) and show how our results hold at the paragraph level, across multiple genres and models.