
The Argument for Letting AI Burn It All Down

WIRED

When the AI bubble bursts, the nerds will do their best work. Suddenly, and not long ago, our dearest tech industry leaders began to suggest caution. Sam Altman said that AI is in a bubble "for sure," albeit one formed around "a kernel of truth." Mark Zuckerberg said an AI bubble "is quite possible," though "if the models keep on growing in capability year over year and demand keeps growing, then maybe there is no collapse, or something." Even Eric Schmidt is saying to calm down about artificial general intelligence and focus on competing with China.


Neural Robot Dynamics

Xu, Jie, Heiden, Eric, Akinola, Iretiayo, Fox, Dieter, Macklin, Miles, Narang, Yashraj

arXiv.org Artificial Intelligence

Simulation plays a crucial role in various robotics applications, such as policy learning [1, 2, 3, 4, 5, 6, 7], safe and scalable robotic control evaluation [8, 9, 10, 11], and computational optimization of robot designs [12, 13, 14]. Recently, neural robotics simulators have emerged as a promising alternative to traditional analytical simulators, as neural simulators can efficiently predict robot dynamics and learn intricate physics from real-world data. For instance, neural simulators have been leveraged to capture complex interactions challenging for analytical modeling [15, 16, 17, 18], or have served as learned world models to facilitate sample-efficient policy learning [19, 20]. However, existing neural robotics simulators typically require application-specific training, often assuming fixed environments [20, 21] or simultaneous training alongside control policies [22, 23]. These limitations primarily stem from their end-to-end frameworks with inadequate representations of the global simulation state, i.e., neural models often substitute the entire classical simulator and directly map robot state and control actions (e.g., target joint positions, target link orientations) to the robot's next state. Without encoding the environment in the state representation, the learned simulators have to implicitly memorize the task and environment details. Additionally, utilizing controller actions as input causes the simulators to overfit to particular low-level controllers used during training. Consequently, unlike classical simulators, these neural simulators often fail to generalize to novel state distributions (induced by new tasks), unseen environment setups, and customized controllers (e.g., novel control laws or controller gains).
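The architectural distinction the abstract draws can be sketched in miniature. This is a toy illustration under my own assumptions (the dimensions, linear "models," and environment codes are invented, not the paper's API): an end-to-end model maps (state, controller action) to the next state and so must memorize any environment effects in its weights, while a model that takes an explicit environment encoding and low-level generalized forces can change its prediction when the scene changes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions -- illustrative assumptions only.
STATE_DIM, INPUT_DIM, ENV_DIM = 4, 2, 3

# End-to-end style: f(state, controller_action) -> next_state.
# No environment input, so environment effects must live in the weights.
W_e2e = rng.normal(size=(STATE_DIM, STATE_DIM + INPUT_DIM))

def end_to_end_step(state, action):
    """Next-state prediction with no explicit environment representation."""
    return W_e2e @ np.concatenate([state, action])

# Environment-conditioned style: f(state, generalized_force, env_code) -> next_state.
# The environment enters as an explicit input, so a new setup changes the
# prediction without retraining.
W_env = rng.normal(size=(STATE_DIM, STATE_DIM + INPUT_DIM + ENV_DIM))

def env_conditioned_step(state, force, env_code):
    return W_env @ np.concatenate([state, force, env_code])

state = rng.normal(size=STATE_DIM)
force = rng.normal(size=INPUT_DIM)
env_a = np.array([1.0, 0.0, 0.0])  # e.g., a tabletop scene
env_b = np.array([0.0, 1.0, 0.0])  # e.g., a cluttered scene

# The end-to-end model produces one answer regardless of scene...
out_e2e = end_to_end_step(state, force)
# ...while the conditioned model distinguishes the two scenes.
out_a = env_conditioned_step(state, force, env_a)
out_b = env_conditioned_step(state, force, env_b)
```

The point of the sketch is only the signature difference: the second model's inputs carry the information the first one would otherwise have to memorize.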


Elon Musk stands accused of pretending to be good at video games. The irony is delicious Keza MacDonald

The Guardian

Last year on Joe Rogan's podcast, Elon Musk claimed to be one of the world's best Diablo IV players – and surprisingly, the leaderboards backed him up. For those that haven't had the pleasure, Diablo is one of the most mercilessly time-intensive video games out there; you build a character and carve through armies of demons, spending hundreds of hours refining skills and equipment for maximum hellspawn-cleansing efficiency. I played it for maybe five hours last year and immediately quit, for fear that it would consume my life. Most of the people who play it are young, often male, and have plenty of time to themselves to spend on the internet and playing games – so, the exact demographic of many Musk stans. It suited these hardcore gamer guys to believe that someone who tweets all day and runs several businesses was also an elite player who poured hundreds of hours into Diablo.


'All people could do was hope the nerds would fix it': the global panic over the millennium bug, 25 years on

The Guardian

Just before midnight on New Year's Eve, 25 years ago, Queen Elizabeth II stepped off a private barge to arrive at London's Millennium Dome for its grand opening ceremony. Dressed in a pumpkin-orange coat, she entered the venue with Prince Philip, taking her place alongside Tony and Cherie Blair and 12,000 guests to celebrate the dawn of a new millennium. At the stroke of midnight, Big Ben began to chime and 40 tonnes of fireworks were launched from 16 barges lined along the river. The crowd joined hands, preparing to sing Auld Lang Syne. For a few long moments, the Queen was neglected – she flapped her arms out like a toddler wanting to be lifted up, before Blair and Philip noticed her, took a hand each, and the singing began. A new century was born. One politician who wasn't in attendance at the glitzy celebration was Paddy Tipping, a Labour MP who spent the night in the Cabinet Office.


Beyond Boundaries: Learning a Universal Entity Taxonomy across Datasets and Languages for Open Named Entity Recognition

Yang, Yuming, Zhao, Wantong, Huang, Caishuang, Ye, Junjie, Wang, Xiao, Zheng, Huiyuan, Nan, Yang, Wang, Yuran, Xu, Xueying, Huang, Kaixin, Zhang, Yunke, Gui, Tao, Zhang, Qi, Huang, Xuanjing

arXiv.org Artificial Intelligence

Open Named Entity Recognition (NER), which involves identifying arbitrary types of entities from arbitrary domains, remains challenging for Large Language Models (LLMs). Recent studies suggest that fine-tuning LLMs on extensive NER data can boost their performance. However, training directly on existing datasets faces issues due to inconsistent entity definitions and redundant data, limiting LLMs to dataset-specific learning and hindering out-of-domain generalization. To address this, we present B2NERD, a cohesive and efficient dataset for Open NER, normalized from 54 existing English or Chinese datasets using a two-step approach. First, we detect inconsistent entity definitions across datasets and clarify them by distinguishable label names to construct a universal taxonomy of 400+ entity types. Second, we address redundancy using a data pruning strategy that selects fewer samples with greater category and semantic diversity. Comprehensive evaluation shows that B2NERD significantly improves LLMs' generalization on Open NER. Our B2NER models, trained on B2NERD, outperform GPT-4 by 6.8-12.0 F1 points and surpass previous methods in 3 out-of-domain benchmarks across 15 datasets and 6 languages.
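The second step, pruning for category and semantic diversity, can be illustrated with a simple two-phase heuristic: first cover every entity category, then greedily add the sample least similar to anything already selected. This is an illustrative stand-in, not B2NERD's actual strategy, and the embeddings and labels below are toy data.

```python
import numpy as np

def diversity_prune(embeddings, labels, k):
    """Select k sample indices balancing category coverage and semantic spread.
    Phase 1: one sample per category. Phase 2: farthest-point-style picks."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    n = len(labels)
    selected, seen = [], set()
    # Phase 1: guarantee category coverage.
    for i in range(n):
        if labels[i] not in seen:
            seen.add(labels[i])
            selected.append(i)
            if len(selected) == k:
                return selected
    # Phase 2: greedily add the sample least similar to the current set.
    while len(selected) < k:
        sims = X @ X[selected].T        # (n, |selected|) cosine similarities
        redundancy = sims.max(axis=1)   # closeness to the nearest selected sample
        redundancy[selected] = np.inf   # never re-pick
        selected.append(int(np.argmin(redundancy)))
    return selected

# Toy data: 10 samples, 3 hypothetical entity categories.
emb = np.random.default_rng(0).normal(size=(10, 8))
labels = ["PER", "PER", "LOC", "LOC", "ORG", "ORG", "PER", "LOC", "ORG", "PER"]
picks = diversity_prune(emb, labels, k=5)
```

The design choice mirrors the abstract's stated goal: fewer samples, but with greater category and semantic diversity than random subsampling would give.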


ULMA: Unified Language Model Alignment with Demonstration and Point-wise Human Preference

Cai, Tianchi, Song, Xierui, Jiang, Jiyan, Teng, Fei, Gu, Jinjie, Zhang, Guannan

arXiv.org Artificial Intelligence

Language model alignment is a cutting-edge technique in large language model training to align the model output to the user's intent, e.g., being helpful and harmless. Recent alignment frameworks consist of two steps: supervised fine-tuning with demonstration data and preference learning with human preference data. Previous preference learning methods, such as RLHF and DPO, mainly focus on pair-wise preference data. However, in many real-world scenarios where human feedback is intrinsically point-wise, these methods suffer from information loss or even fail. To fill this gap, in this paper, we first develop a preference learning method called point-wise DPO to tackle point-wise preference data. Further investigation of the connection between supervised fine-tuning and point-wise preference learning enables us to develop a unified framework for both human demonstration and point-wise preference data, which sheds new light on the construction of preference datasets. Extensive experiments on point-wise datasets with binary or continuous labels demonstrate the superior performance and efficiency of our proposed methods. A new dataset with high-quality demonstration samples on harmlessness is constructed and made publicly available.
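As a rough illustration of what a point-wise preference loss can look like (my own simplified sketch, not necessarily ULMA's exact objective): treat the DPO-style implicit reward r = beta * log(pi(y|x) / pi_ref(y|x)) as a logit and fit it to a binary point-wise label with cross-entropy, so each response is scored on its own rather than against a paired alternative.

```python
import math

def pointwise_pref_loss(logp_policy, logp_ref, label, beta=0.1):
    """Binary cross-entropy on the DPO-style implicit reward.
    label: 1 for a preferred ('good') response, 0 for a rejected one.
    Illustrative simplification only -- not the paper's exact loss."""
    r_hat = beta * (logp_policy - logp_ref)   # implicit reward, used as a logit
    p = 1.0 / (1.0 + math.exp(-r_hat))        # sigmoid
    eps = 1e-12                               # numerical guard for log
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))

# A positive-labeled response is penalized less as the policy raises its
# log-probability relative to the reference model:
loss_improved = pointwise_pref_loss(logp_policy=-2.0, logp_ref=-5.0, label=1)
loss_neutral = pointwise_pref_loss(logp_policy=-5.0, logp_ref=-5.0, label=1)
```

Because each sample carries its own label, this kind of loss can consume feedback that never comes in pairs, which is the gap the abstract describes.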


Nerds are a Menace to Society

Slate

For a long time, we've been sold -- and we've bought -- the idea of the nerd hero: usually a man, usually brilliant, and usually a social outcast who, inevitably, gets the girl. That was the happy ending. But now, we're surrounded by powerful, self-styled nerds who have it all and still want more. And, to some, it's increasingly hard to root for these guys. Ian Bogost, a writer and video game designer, joins us.


Neural Estimation of the Rate-Distortion Function With Applications to Operational Source Coding

Lei, Eric, Hassani, Hamed, Bidokhti, Shirin Saeedi

arXiv.org Artificial Intelligence

A fundamental question in designing lossy data compression schemes is how well one can do in comparison with the rate-distortion function, which describes the known theoretical limits of lossy compression. Motivated by the empirical success of deep neural network (DNN) compressors on large, real-world data, we investigate methods to estimate the rate-distortion function on such data, which would allow comparison of DNN compressors with optimality. While one could use the empirical distribution of the data and apply the Blahut-Arimoto algorithm, this approach presents several computational challenges and inaccuracies when the datasets are large and high-dimensional, such as the case of modern image datasets. Instead, we re-formulate the rate-distortion objective, and solve the resulting functional optimization problem using neural networks. We apply the resulting rate-distortion estimator, called NERD, on popular image datasets, and provide evidence that NERD can accurately estimate the rate-distortion function. Using our estimate, we show that the rate-distortion achievable by DNN compressors is within several bits of the rate-distortion function for real-world datasets. Additionally, NERD provides access to the rate-distortion achieving channel, as well as samples from its output marginal. Therefore, using recent results in reverse channel coding, we describe how NERD can be used to construct an operational one-shot lossy compression scheme with guarantees on the achievable rate and distortion. Experimental results demonstrate competitive performance with DNN compressors.
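The baseline the abstract mentions, running Blahut-Arimoto on an empirical distribution, can be sketched for a small discrete source. This is the classical algorithm at a fixed Lagrangian slope beta (not NERD itself), tracing one (rate, distortion) point on the R(D) curve.

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, iters=500):
    """One point on R(D) for source p_x with distortion matrix dist,
    at Lagrangian slope beta. Returns (rate_in_nats, distortion)."""
    n_x, n_y = dist.shape
    q_y = np.full(n_y, 1.0 / n_y)                 # output marginal, init uniform
    for _ in range(iters):
        # Optimal test channel given the current output marginal.
        q_y_given_x = q_y * np.exp(-beta * dist)
        q_y_given_x /= q_y_given_x.sum(axis=1, keepdims=True)
        # Re-estimate the output marginal.
        q_y = p_x @ q_y_given_x
    D = float(np.sum(p_x[:, None] * q_y_given_x * dist))
    R = float(np.sum(p_x[:, None] * q_y_given_x *
                     np.log(q_y_given_x / q_y)))
    return R, D

# Sanity check against the analytic R(D) of a uniform binary source with
# Hamming distortion: R = ln 2 - H_b(D) nats.
p_x = np.array([0.5, 0.5])
dist = 1.0 - np.eye(2)          # Hamming distortion matrix
R, D = blahut_arimoto(p_x, dist, beta=2.0)
h_b = -D * np.log(D) - (1 - D) * np.log(1 - D)   # binary entropy of D, in nats
```

The abstract's point is that this style of computation needs an explicit, tractable source distribution, which is exactly what breaks down on large high-dimensional image data and motivates the neural reformulation.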


Meet ML@GT: Lara J. Martin Trains AI Agents to Become Storytellers

#artificialintelligence

The Machine Learning Center at Georgia Tech (ML@GT) is home to many talented students from across campus, representing all six of Georgia Tech's colleges and the Georgia Tech Research Institute (GTRI). These students have diverse backgrounds and a wide variety of interests both inside and outside of the classroom. Today, we'd like you to meet Lara Martin, a fifth-year Ph.D. student who is interested in teaching artificial intelligence agents to tell interesting and coherent stories. Tell us about your research interests. Where might people be impacted by them in everyday life?


Tepper Wants to Nerd Out On Data With You

#artificialintelligence

There are many practical reasons why you should choose an online Master's in Business Analytics from the Tepper School of Business at Carnegie Mellon University. We can list facts like: our alumni average $103,000 in starting salary and 84% of our grads secured a promotion or new position within three months of graduation. However, one of the best parts of this degree is spending two years learning from extraordinarily talented people. Some are students, who make up our close-knit cohorts. Others are faculty, who are leading researchers committed to helping students get ahead.