NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations

Marco Ciccone, Marco Gallieri, Jonathan Masci, Christian Osendorfer, Faustino Gomez

Neural Information Processing Systems

Each block represents a time-invariant iterative process as the first layer in the i-th block, xi(1), is unrolled into a pattern-dependent number, Ki, of processing stages, using weight matrices Ai and Bi. The skip connections from the input, ui, to all layers in block i make the process non-autonomous. Blocks can be chained together (each block modeling a different latent space) by passing the final latent representation, xi(Ki), of block i as the input to block i+1.
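The unrolled block described above can be sketched as a simple residual iteration. This is a minimal illustration, not the paper's implementation: the tanh nonlinearity, the step size h, and the random weights are assumptions for the sketch; the actual NAIS-Net imposes stability constraints on Ai that are omitted here.

```python
import numpy as np

def nais_block(u, A, B, b, K, h=0.1):
    """Unroll one block for K stages with shared (time-invariant) weights.

    The block input u re-enters at every stage via the skip connection,
    which is what makes the iteration non-autonomous.
    """
    x = np.tanh(B @ u + b)                        # first layer, x_i(1)
    for _ in range(K - 1):
        x = x + h * np.tanh(A @ x + B @ u + b)    # one processing stage
    return x                                      # final latent, x_i(K_i)

# Chaining two blocks: the output of block 1 is the input of block 2.
rng = np.random.default_rng(0)
n = 4
A1, B1, b1 = 0.1 * rng.standard_normal((n, n)), rng.standard_normal((n, n)), np.zeros(n)
A2, B2, b2 = 0.1 * rng.standard_normal((n, n)), rng.standard_normal((n, n)), np.zeros(n)
u = rng.standard_normal(n)
x1 = nais_block(u, A1, B1, b1, K=5)
x2 = nais_block(x1, A2, B2, b2, K=3)
```

Each block may use a different, pattern-dependent K, while its weights stay fixed across stages.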




The great AI hype correction of 2025

MIT Technology Review

Four ways to think about this year's reckoning When OpenAI released a free web app called ChatGPT in late 2022, it changed the course of an entire industry--and several world economies. Millions of people started talking to their computers, and their computers started talking back. We were enchanted, and we expected more. Technology companies scrambled to stay ahead, putting out rival products that outdid one another with each new release: voice, images, video. With nonstop one-upmanship, AI companies have presented each new product drop as a major breakthrough, reinforcing a widespread faith that this technology would just keep getting better. Boosters told us that progress was exponential.


Tech billionaires seem to be doom prepping. Should we all be worried?

BBC News

Mark Zuckerberg is said to have started work on Koolau Ranch, his sprawling 1,400-acre compound on the Hawaiian island of Kauai, as far back as 2014. It is set to include a shelter, complete with its own energy and food supplies, though the carpenters and electricians working on the site were banned from talking about it by non-disclosure agreements, according to a report by Wired magazine. A six-foot wall blocked the project from view of a nearby road.


Semi-supervised Sequence Learning

Andrew M. Dai, Quoc V. Le

Neural Information Processing Systems

We present two approaches to use unlabeled data to improve sequence learning with recurrent networks. The first approach is to predict what comes next in a sequence, which is a language model in NLP. The second approach is to use a sequence autoencoder, which reads the input sequence into a vector and predicts the input sequence again. These two algorithms can be used as a "pretraining" algorithm for a later supervised sequence learning algorithm. In other words, the parameters obtained from the pretraining step can then be used as a starting point for other supervised training models. In our experiments, we find that long short-term memory recurrent networks pretrained with the two approaches become more stable to train and generalize better. With pretraining, we were able to achieve strong performance in many classification tasks, such as text classification with IMDB, DBpedia or image recognition in CIFAR-10.
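The two pretraining objectives differ only in how input/target pairs are built from an unlabeled sequence. The sketch below shows that data preparation; the start token and the shifted decoder input are assumptions about the decoding setup, not details given in the abstract.

```python
def lm_pairs(seq):
    """Language-model pretraining: at each step, predict the next token.
    Input at step t is seq[t]; the target is seq[t+1]."""
    return list(zip(seq[:-1], seq[1:]))

def autoencoder_pairs(seq, start_token="<s>"):
    """Sequence-autoencoder pretraining: the encoder reads the whole
    sequence into a vector, then the decoder reproduces it token by
    token, conditioned on that vector."""
    encoder_input = list(seq)
    decoder_input = [start_token] + list(seq[:-1])  # shifted right
    decoder_target = list(seq)
    return encoder_input, decoder_input, decoder_target

tokens = ["the", "movie", "was", "great"]
print(lm_pairs(tokens))
# [('the', 'movie'), ('movie', 'was'), ('was', 'great')]
```

After pretraining on either objective, the recurrent weights initialize a supervised model (e.g. a text classifier) instead of starting from random weights.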


'We're Definitely Going to Build a Bunker Before We Release AGI'

The Atlantic - Technology

In the summer of 2023, Ilya Sutskever, a co-founder and the chief scientist of OpenAI, was meeting with a group of new researchers at the company. By all traditional metrics, Sutskever should have felt invincible: He was the brain behind the large language models that helped build ChatGPT, then the fastest-growing app in history; his company's valuation had skyrocketed; and OpenAI was the unrivaled leader of the industry believed to power the future of Silicon Valley. But the chief scientist seemed to be at war with himself. Sutskever had long believed that artificial general intelligence, or AGI, was inevitable--now, as things accelerated in the generative-AI industry, he believed AGI's arrival was imminent, according to Geoff Hinton, an AI pioneer who was his Ph.D. adviser and mentor, and another person familiar with Sutskever's thinking. To people around him, Sutskever seemed consumed by thoughts of this impending civilizational transformation. What would the world look like when a supreme AGI emerged and surpassed humanity? And what responsibility did OpenAI have to ensure an end state of extraordinary prosperity, not extraordinary suffering?


Are We Taking A.I. Seriously Enough?

The New Yorker

My in-laws own a little two-bedroom beach bungalow. It's part of a condo development that hasn't changed much in fifty years. The units are connected by brick paths that wind through palm trees and tiki shelters to a beach. Nearby, developers have built big hotels and condo towers, and it's always seemed inevitable that the bungalows would be razed and replaced. But it's never happened, probably because, according to the association's bylaws, eighty per cent of the owners have to agree to a sale of the property.


Activation Steering in Neural Theorem Provers

Kirtania, Shashank

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown promise in proving formal theorems using proof assistants like Lean. However, current state-of-the-art language models struggle to predict the next step in proofs, leading practitioners to use different sampling techniques to improve LLMs' capabilities. We observe that the LLM is capable of predicting the correct tactic; however, it faces challenges in ranking it appropriately within the set of candidate tactics, affecting the overall selection process. To overcome this hurdle, we use activation steering to guide LLM responses and improve generations at inference time. Our results suggest that activation steering offers a promising lightweight alternative to specialized fine-tuning for enhancing theorem-proving capabilities in LLMs, particularly valuable in resource-constrained environments. Interactive proof assistants such as Lean (de Moura et al., 2015), Isabelle (Wenzel et al., 2008), and Coq (Barras et al., 1999) enable the formal verification of mathematical proofs and software by leveraging specialized programming languages (Avigad, 2023; Ringer et al., 2019).
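At its core, activation steering adds a fixed direction to a model's hidden activations at inference time. The sketch below shows the common difference-of-means recipe for building such a direction; the toy vectors, the scale alpha, and this particular construction are assumptions for illustration, since the abstract does not specify how the paper derives its steering vectors.

```python
import numpy as np

def steer(hidden, steering_vec, alpha=1.0):
    """Shift a hidden activation along a steering direction at inference.
    No weights are changed, which is why this is a lightweight
    alternative to fine-tuning."""
    return hidden + alpha * steering_vec

# Toy difference-of-means steering vector: mean activation on
# desired examples minus mean activation on undesired examples.
pos = np.array([[1.0, 0.0], [0.8, 0.2]])   # activations, desired behaviour
neg = np.array([[0.0, 1.0], [0.2, 0.8]])   # activations, undesired behaviour
v = pos.mean(axis=0) - neg.mean(axis=0)

h = np.array([0.5, 0.5])                   # some hidden state at inference
h_steered = steer(h, v, alpha=0.5)
```

In the theorem-proving setting, such a shift would nudge the model's internal representation so that the correct tactic ranks higher among the candidates it already generates.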


Can Generative AI be Egalitarian?

Feldman, Philip, Foulds, James R., Pan, Shimei

arXiv.org Artificial Intelligence

The recent explosion of "foundation" generative AI models has been built upon the extensive extraction of value from online sources, often without corresponding reciprocation. This pattern mirrors and intensifies the extractive practices of surveillance capitalism, while the potential for enormous profit has challenged technology organizations' commitments to responsible AI practices, raising significant ethical and societal concerns. However, a promising alternative is emerging: the development of models that rely on content willingly and collaboratively provided by users. This article explores this "egalitarian" approach to generative AI, taking inspiration from the successful model of Wikipedia. We explore the potential implications of this approach for the design, development, and constraints of future foundation models. We argue that such an approach is not only ethically sound but may also lead to models that are more responsive to user needs, more diverse in their training data, and ultimately more aligned with societal values. Furthermore, we explore potential challenges and limitations of this approach, including issues of scalability, quality control, and potential biases inherent in volunteer-contributed content.