SpaceX IPO raised $10bn more than thought
SpaceX raised $10bn (£7.5bn) more than initially thought when it sold shares to the public on Friday - bringing in a total of $85.7bn. Elon Musk's rocket and artificial intelligence (AI) company pulled off the biggest initial public offering (IPO) in history when it joined New York's Nasdaq stock exchange last week. The listing had raised $75bn from investors, which Musk told employees will be spent funding a significant growth phase. But the banks which backed the IPO exercised a so-called greenshoe clause, which let them purchase an extra $10bn of SpaceX shares. The extra $10bn raised, revealed in a statement by SpaceX announcing the completion of the listing, would by itself rank as one of the biggest IPOs in history.
Imitation Beyond Expectation Using Pluralistic Stochastic Dominance
Imitation learning seeks to estimate policies reflecting the values of demonstrated behaviors. Prevalent approaches learn to match or exceed the demonstrator's performance in expectation without knowing the demonstrator's reward function. Unfortunately, this does not induce pluralistic imitators that learn to support distinct demonstrations.
Training-Free Constrained Generation With Stable Diffusion Models
Stable diffusion models represent the state-of-the-art in data synthesis across diverse domains and hold transformative potential for applications in science and engineering, e.g., by facilitating the discovery of novel solutions and simulating systems that are computationally intractable to model explicitly. While there is increasing effort to incorporate physics-based constraints into generative models, existing techniques are either limited in their applicability to latent diffusion frameworks or lack the capability to strictly enforce domain-specific constraints. To address this limitation, this paper proposes a novel integration of stable diffusion models with constrained optimization frameworks, enabling the generation of outputs satisfying stringent physical and functional requirements.
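As a rough illustration of what enforcing a constraint during sampling can look like, the sketch below alternates a denoising update with a Euclidean projection onto a feasible set. It is not the paper's algorithm: the abstract does not specify how the optimization is integrated, and the names `project_to_box`, `toy_constrained_sampling`, and the stand-in `denoise_step` callback are hypothetical. The box constraint is chosen only because its projection is a simple clamp, and the loop works directly on the sample rather than in a latent space.

```python
def project_to_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (an assumed, simple constraint set)."""
    return [min(max(xi, lo), hi) for xi in x]


def toy_constrained_sampling(denoise_step, x0, steps, lo, hi):
    """Run `steps` reverse-time updates, projecting after each one.

    `denoise_step(x, t)` is a hypothetical stand-in for one reverse diffusion
    update; a real model would call a trained denoiser here.
    """
    x = list(x0)
    for t in reversed(range(steps)):
        x = denoise_step(x, t)          # unconstrained update
        x = project_to_box(x, lo, hi)   # restore feasibility
    return x


if __name__ == "__main__":
    # Toy "denoiser" that drifts every coordinate upward, out of the box.
    drift = lambda x, t: [0.9 * xi + 0.5 for xi in x]
    sample = toy_constrained_sampling(drift, [3.0, -4.0], steps=5, lo=0.0, hi=1.0)
    print(sample)
```

Because the projection runs after every update, the returned sample satisfies the box constraint regardless of what the denoiser does; harder constraint sets would replace the clamp with a call to an optimization solver.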
Why do South Koreans love AI so much?
From eldercare robots to humanoid monks, South Koreans just can't get enough of AI. When I landed in Seoul after a grueling 12-hour flight from San Francisco, I walked through an unmanned immigration checkpoint, where a machine scanned my face and passport. On the subway home, people were glued to their phones (powered by flawless 5G even underground), as we raced past platforms lined with LED screens of ads celebrating K-pop idols' birthdays. When I got off at the station in Gangnam, a cartoon-eyed robot on wheels was waiting patiently at a crosswalk to deliver someone's dinner. Internet cafés dotted the sidewalks, crammed with teenagers playing computer games, maybe hoping to become the next legendary pro gamer.
Anthropic to meet White House over AI tool suspension
Bosses at the artificial intelligence (AI) firm Anthropic are set to meet senior White House officials amid fresh national security concerns over the company's latest release. The meeting is set to take place on Monday in Washington DC between executives at Anthropic and the US Department of Commerce, a government department led by Secretary Howard Lutnick, according to two people familiar with the matter. It comes after Anthropic blocked all public access to the recent release of its latest AI tool on Friday, which it has previously said is too powerful. The firm made the decision after the US government prohibited Anthropic from allowing any foreign national access to the technology. The AI tool at issue is named Fable 5 or Mythos 5. Fable 5 is a version of the tool with extra safeguards made available to the public, while Mythos 5 has different controls and is only available to a select group of organisations.
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer
Recent advances in knowledge distillation have emphasized the importance of decoupling different knowledge components. While existing methods utilize momentum mechanisms to separate task-oriented and distillation gradients, they overlook the inherent conflict between target-class and non-target-class knowledge flows. Furthermore, low-confidence dark knowledge in non-target classes introduces noisy signals that hinder effective knowledge transfer. To address these limitations, we propose DeepKD, a novel training framework that integrates dual-level decoupling with adaptive denoising. First, through theoretical analysis of gradient signal-to-noise ratio (GSNR) characteristics in task-oriented and non-task-oriented knowledge distillation, we design independent momentum updaters for each component to prevent mutual interference. We observe that the optimal momentum coefficients for task-oriented gradient (TOG), target-class gradient (TCG), and non-target-class gradient (NCG) should be positively related to their GSNR. Second, we introduce a dynamic top-k mask (DTM) mechanism that gradually increases K from a small initial value to incorporate more non-target classes as training progresses, following curriculum learning principles. The DTM jointly filters low-confidence logits from both teacher and student models, effectively purifying dark knowledge during early training. Extensive experiments on CIFAR-100, ImageNet, and MS-COCO demonstrate DeepKD's effectiveness.
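The dynamic top-k mask can be pictured with a small sketch. The abstract gives neither the schedule for K nor the exact ranking rule, so the code below makes two labeled assumptions: K grows linearly with the training step, and non-target classes are ranked by teacher logit, with the resulting mask meant to be applied to both teacher and student logits. The function names `scheduled_k` and `top_k_non_target_mask` are hypothetical.

```python
def scheduled_k(step, total_steps, k_start, k_end):
    """Grow K linearly from k_start to k_end (assumed schedule, not from the paper)."""
    if total_steps <= 1:
        return k_end
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return k_start + round(frac * (k_end - k_start))


def top_k_non_target_mask(teacher_logits, target, k):
    """Return a 0/1 mask that keeps the k highest-scoring non-target classes.

    The target class is always masked out here because this sketch covers
    only the non-target (dark knowledge) term.
    """
    non_target = [i for i in range(len(teacher_logits)) if i != target]
    ranked = sorted(non_target, key=lambda i: teacher_logits[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(teacher_logits))]


if __name__ == "__main__":
    logits = [5.0, 1.0, 3.0, 0.5, 2.0]   # class 0 is the target
    for step in (0, 50, 99):
        k = scheduled_k(step, total_steps=100, k_start=1, k_end=4)
        print(step, k, top_k_non_target_mask(logits, target=0, k=k))
```

Early in training only the most confident non-target classes pass the mask; by the final step every non-target class contributes, which matches the curriculum-style progression the abstract describes.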
Efficient Part-level 3D Object Generation via Dual Volume Packing
Recent progress in 3D object generation has greatly improved both quality and efficiency. However, most existing methods generate a single mesh with all parts fused together, which limits the ability to edit or manipulate individual parts. A key challenge is that different objects may have a varying number of parts. To address this, we propose a new end-to-end framework for part-level 3D object generation. Given a single input image, our method generates high-quality 3D objects with an arbitrary number of complete and semantically meaningful parts. We introduce a dual volume packing strategy that organizes all parts into two complementary volumes, allowing for the creation of complete and interleaved parts that assemble into the final object. Experiments show that our model achieves better quality, diversity, and generalization than previous image-based part-level generation methods. Our project page is at https://research.nvidia.com/
Emergence of Linear Truth Encodings in Language Models
Recent probing studies reveal that large language models exhibit linear subspaces that separate true from false statements, yet the mechanism behind their emergence is unclear. We introduce a transparent, one-layer toy model that reproduces such truth subspaces end-to-end and exposes one concrete route by which they can arise. We study one simple setting in which truth encoding can emerge: a data distribution where factual statements co-occur with other factual statements (and vice-versa), encouraging the model to learn this distinction in order to lower the LM loss on future tokens. We corroborate this pattern with experiments in pretrained language models. Finally, in the toy setting we observe a two-phase learning dynamic: networks first memorize individual factual associations in a few steps, then--over a longer horizon--learn to linearly separate true from false, which in turn lowers language-modeling loss. Together, these results provide both a mechanistic demonstration and an empirical motivation for how and why linear truth representations can emerge in language models.
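To make "linearly separate true from false" concrete, here is a minimal difference-of-means probe on toy vectors. It is a generic probing technique used only for illustration, not the paper's toy model or its experimental setup. The two-dimensional "hidden states" are made up, and the helper names are hypothetical.

```python
def mean_vec(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def fit_mean_difference_probe(true_vecs, false_vecs):
    """Fit w, b so that w.x + b > 0 predicts 'true'.

    w points from the mean of the false class to the mean of the true class,
    and b places the decision boundary at the midpoint between the two means.
    """
    mu_t, mu_f = mean_vec(true_vecs), mean_vec(false_vecs)
    w = [a - b for a, b in zip(mu_t, mu_f)]
    midpoint = [(a + b) / 2 for a, b in zip(mu_t, mu_f)]
    bias = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, bias


def predict_true(w, bias, x):
    """Classify a vector by which side of the probe's hyperplane it falls on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + bias > 0


if __name__ == "__main__":
    # Made-up representations: the first coordinate carries the "truth" signal.
    true_vecs = [[1.0, 0.2], [0.8, -0.1]]
    false_vecs = [[-1.0, 0.1], [-0.9, 0.0]]
    w, bias = fit_mean_difference_probe(true_vecs, false_vecs)
    print(predict_true(w, bias, [0.5, 0.0]), predict_true(w, bias, [-0.5, 0.3]))
```

If a single direction like `w` separates held-out true and false statements well, the representation is said to encode truth linearly along that direction.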
Urgent warning to all Outlook users about scam hijacking email accounts... here's how to stay safe
Robust LLM Alignment via Distributionally Robust Direct Preference Optimization
A major challenge in aligning large language models (LLMs) with human preferences is the issue of distribution shift. LLM alignment algorithms rely on static preference datasets, assuming that they accurately represent real-world user preferences. However, user preferences vary significantly across geographical regions, demographics, linguistic patterns, and evolving cultural trends. This preference distribution shift leads to catastrophic alignment failures in many real-world applications. We address this problem using the principled framework of distributionally robust optimization, and develop two novel distributionally robust direct preference optimization (DPO) algorithms, namely, Wasserstein DPO (WDPO) and Kullback-Leibler DPO (KLDPO). We characterize the sample complexity of learning the optimal policy parameters for WDPO and KLDPO. Moreover, we propose scalable gradient descent-style learning algorithms by developing suitable approximations for the challenging minimax loss functions of WDPO and KLDPO. Our empirical experiments using benchmark data sets and LLMs demonstrate the superior performance of WDPO and KLDPO in substantially improving the alignment when there is a preference distribution shift.
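For orientation, the sketch below pairs the standard per-pair DPO loss with a textbook upper bound on the worst-case mean loss over a Kullback-Leibler ball, namely tau * rho + tau * log(mean(exp(loss / tau))), which holds for any tau > 0. This illustrates the general distributionally robust idea only. The abstract does not give the paper's actual KLDPO or WDPO approximations, so the fixed temperature `tau`, the radius `rho`, and the function names are assumptions.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the policy and the reference
    model; the loss is -log(sigmoid(margin)).
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return softplus(-margin)


def kl_robust_bound(losses, rho, tau):
    """Upper bound on the worst-case mean loss over a KL ball of radius rho.

    Uses the dual form with a fixed temperature tau (assumed here; a full
    treatment would also minimize over tau).
    """
    m = max(losses)  # subtract the max before exponentiating for stability
    mean_exp = sum(math.exp((l - m) / tau) for l in losses) / len(losses)
    return tau * rho + m + tau * math.log(mean_exp)


if __name__ == "__main__":
    pair_losses = [
        dpo_loss(-10.0, -12.0, -11.0, -11.0),  # policy prefers the chosen answer
        dpo_loss(-12.0, -10.0, -11.0, -11.0),  # policy prefers the rejected answer
    ]
    print(sum(pair_losses) / len(pair_losses), kl_robust_bound(pair_losses, rho=0.1, tau=1.0))
```

The robust bound is never below the plain average and weights high-loss pairs more heavily, which is the mechanism by which a robust objective guards against a shifted preference distribution.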