The success of deep learning over the last decade, particularly in computer vision, has depended heavily on large labelled training data sets. Although progress in this area has boosted performance on many tasks such as object detection, recognition, and segmentation, the need for ever more labelled data remains the main bottleneck to further improvement. Self-supervised learning is among the most promising alternatives for learning useful representations directly from the data itself. In this article, we will briefly review the self-supervised learning methods in the literature and discuss the findings of a recent self-supervised learning paper from ICLR 2020. One could argue that most learning problems can be tackled given clean labels and more data obtained in an unsupervised way.
Imagine we want to train a self-driving car in New York so that we can take it all the way to Seattle without tediously driving it ourselves for over 48 hours. We hope our car can handle all kinds of environments on the trip and get us safely to our destination. We know that road conditions and views can vary enormously. It is intuitive to simply collect road data from this trip, let the car learn from every possible condition, and hope it becomes the perfect self-driving car for our New York to Seattle trip. It needs to understand the traffic and skyscrapers of big cities like New York and Chicago, the more unpredictable weather in Seattle, the mountains and forests of Montana, and all kinds of country views, farmlands, animals, and so on.
Here are the most tweeted papers that were uploaded to arXiv during July 2020. Results are powered by Arxiv Sanity Preserver. Abstract: Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and, even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowledge stored as parameters will also inevitably exhibit all of the biases inherent in the source materials.
The second invited talk at ICML 2020 was given by Brenna Argall. Her presentation covered the use of machine learning within the domain of assistive machines for rehabilitation. She described her lab's efforts towards customising assistive autonomous machines so that users can decide how much control they keep and how much autonomy they hand over to the machine. Within the field of rehabilitation, machines are used to rehabilitate the human body, and also to bridge gaps in functionality left by injury or disease. Brenna noted that when replacing lost function, a persistent challenge in the field is how to capture control signals from the human body in order to operate the machine.
Hosted by Dylan Doyle-Burke and Jessie J Smith, Radical AI is a podcast featuring the voices of the future in the field of artificial intelligence ethics. In this episode Jess and Dylan chat to Eun Seo Jo about "The History that Defines our Technological Future". How does your data tell your story? What do our archives have to do with defining the future of our technology? To answer these questions and more The Radical AI Podcast welcomes Stanford PhD student and archivist Eun Seo Jo to the show.
Twitter users will have seen the proliferation of "I have a joke" tweets in their feed over the past few days. The AI community produced some gems so we've collected a selection here for your amusement. I have a reinforcement learning joke, but not sure it's rewarding. I have a stochastic gradient descent joke but the punchline isn't on this saddle point https://t.co/B7GM2tmz5Z I have a deep learning joke but it has a lot of layers to it.
There were three invited talks at this year's virtual ICML. The first was given by Lester Mackey, and he highlighted some of his efforts to do some good with machine learning. During the talk he also outlined several ways in which social good efforts can be organised, and described numerous social good problems that would benefit from the community's attention. Lester took the audience on a journey from his grad school days to the present, focusing on the social good projects he's been involved in along the way. His research in this area has included efforts to combat nuclear proliferation, climate forecasting, and COVID-19 work.
Neural architecture search (NAS) -- selecting which neural model to use for your learning problem -- is a promising but computationally expensive direction for automating and democratizing machine learning. The weight-sharing method, whose initial success at dramatically accelerating NAS surprised many in the field, has come under scrutiny due to its poor performance as a surrogate for full model-training (a miscorrelation problem known as rank disorder) and inconsistent results on recent benchmarks. In this post, we give a quick overview of weight-sharing and argue in favor of its continued use for NAS. First-generation NAS methods were astronomically expensive due to the combinatorially large search space, requiring the training of thousands of neural networks to completion. Then, in their 2018 ENAS (for Efficient NAS) paper, Pham et al. introduced the idea of weight-sharing, in which only one shared set of model parameters is trained for all architectures.
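The core idea of weight-sharing can be illustrated with a toy sketch. The snippet below is a deliberately simplified illustration, not the ENAS implementation: each (layer, operation) pair in a tiny search space owns one shared scalar "weight" (standing in for a full parameter tensor), a random subnetwork is sampled at every training step, and all sampled architectures update and reuse the same shared parameters. The op names, the finite-difference-style update, and the scoring loop are all hypothetical choices made for this sketch.

```python
import random

# Toy search space: one candidate op per layer, NUM_LAYERS layers.
CANDIDATE_OPS = ["conv3x3", "conv5x5", "identity"]
NUM_LAYERS = 3

# ONE shared parameter per (layer, op) pair -- the essence of weight-sharing:
# every sampled architecture reads and writes this same dictionary instead of
# training a fresh network from scratch.
shared_weights = {(layer, op): 0.1
                  for layer in range(NUM_LAYERS)
                  for op in CANDIDATE_OPS}

def sample_architecture(rng):
    """Pick one candidate op per layer (a path through the supernetwork)."""
    return tuple(rng.choice(CANDIDATE_OPS) for _ in range(NUM_LAYERS))

def forward(arch, x):
    """Run input x through the sampled path, using the SHARED weights."""
    for layer, op in enumerate(arch):
        x = x * (1.0 + shared_weights[(layer, op)])  # toy 'operation'
    return x

def train_step(arch, x, target, lr=0.01):
    """One crude update step touching only the sampled path's weights."""
    err = forward(arch, x) - target
    for layer, op in enumerate(arch):
        # Crude sign-following update, for illustration only.
        shared_weights[(layer, op)] -= lr * err * x
    return err

rng = random.Random(0)
for step in range(100):
    arch = sample_architecture(rng)  # a different subnetwork each step...
    train_step(arch, 1.0, 2.0)       # ...but all of them share the weights

# After training the supernetwork once, candidate architectures can be
# ranked cheaply by evaluating them with the shared weights in place:
candidates = {sample_architecture(rng) for _ in range(5)}
scores = {a: abs(forward(a, 1.0) - 2.0) for a in candidates}
best = min(scores, key=scores.get)
```

The final loop is what makes weight-sharing attractive: once the shared parameters are trained, evaluating a new architecture is just a forward pass, rather than a full training run. The rank-disorder concern mentioned above is precisely the worry that this cheap shared-weight ranking may not correlate well with the ranking one would get from training each architecture to completion.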
Failure to use data effectively means we cannot deal with the most pressing issues that face us today, such as discrimination. Addressing this requires institutions that are fit to enable responsible use of data and technology for the public good, engaging civil society and the public as well as industry and government. The Royal Society's Data Governance Explainer (PDF) brings welcome clarity to a complex landscape. It builds upon another Royal Society report published in 2017, written in partnership with the British Academy, that helped set the wheels in motion within the government to create the Centre for Data Ethics and Innovation (CDEI). Since then, the UK institutional landscape has seen an expansion of organisations that are working together to better understand how we effectively and responsibly adopt data-driven technologies and artificial intelligence.
This month we discuss conferences and whether they will ever be the same again now we've had a taste of the virtual. Joining the discussion this week are: Kamalika Chaudhuri (University of California, San Diego), Tom Dietterich (Oregon State University), Sabine Hauert (University of Bristol), Carles Sierra (CSIC). Carles Sierra: I think we will see more and more of these virtual conferences. We could probably work as well as we did when meeting physically, although some aspects will be different. Sabine Hauert: Last week I gave three talks, at three separate conferences, and never had to leave the home, which is really great for work-life balance and childcare responsibilities.