

DsDm: Model-Aware Dataset Selection with Datamodels

Engstrom, Logan, Feldmann, Axel, Madry, Aleksander

arXiv.org Artificial Intelligence

When selecting data for training large-scale models, standard practice is to filter for examples that match human notions of data quality. Such filtering yields qualitatively clean datapoints that intuitively should improve model behavior. However, in practice the opposite can often happen: we find that selecting according to similarity with "high quality" data sources may not increase (and can even hurt) performance compared to randomly selecting data. To develop better methods for selecting data, we start by framing dataset selection as an optimization problem that we can directly solve: given target tasks, a learning algorithm, and candidate data, select the subset that maximizes model performance. This framework thus avoids hand-picked notions of data quality and instead explicitly models how the learning process uses training datapoints to predict on the target tasks. Our resulting method greatly improves language model (LM) performance on both pre-specified tasks and previously unseen tasks. Specifically, choosing target tasks representative of standard LM problems and evaluating on diverse held-out benchmarks, our selected datasets provide a 2x compute multiplier over baseline methods.
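The optimization framing above reduces, in the simplest linear-datamodel view, to ranking candidates by their predicted effect on target-task loss and keeping the best-scoring subset. The sketch below illustrates only that ranking step; the scores, sizes, and the `select_subset` helper are illustrative assumptions, not the paper's actual estimator.

```python
import random

random.seed(0)

# Hypothetical datamodel scores: theta[i][t] estimates how much including
# candidate example i changes the loss on target task t (negative = helps).
n_candidates, n_tasks = 1000, 3
theta = [[random.gauss(0, 1) for _ in range(n_tasks)]
         for _ in range(n_candidates)]

def select_subset(theta, k):
    """Pick the k candidates the (assumed) linear datamodel predicts will
    reduce average target-task loss the most, i.e. most negative score."""
    avg_effect = [sum(row) / len(row) for row in theta]
    ranked = sorted(range(len(theta)), key=lambda i: avg_effect[i])
    return ranked[:k]

chosen = select_subset(theta, k=100)
```

With scores in hand, selection is a one-pass sort; the hard part the paper addresses is estimating those scores for a given learning algorithm and target tasks.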


A Deep Learning Approach Towards Student Performance Prediction in Online Courses: Challenges Based on a Global Perspective

Moubayed, Abdallah, Injadat, MohammadNoor, Alhindawi, Nouh, Samara, Ghassan, Abuasal, Sara, Alazaidah, Raed

arXiv.org Artificial Intelligence

Analyzing and evaluating students' progress in any learning environment is stressful and time-consuming if done using traditional analysis methods. This is further exacerbated by the increasing number of students due to the shift toward integrating Internet technologies into education and the focus of academic institutions on moving toward e-Learning, blended, or online learning models. As a result, the topic of student performance prediction has become a vibrant research area in recent years, and machine learning and data mining techniques have emerged as a viable solution. To that end, this work proposes the use of deep learning techniques (CNN and RNN-LSTM) to predict students' performance at the midpoint stage of online course delivery, using three distinct datasets collected from three different regions of the world. Experimental results show that deep learning models have promising performance: they outperform other optimized traditional ML models on two of the three considered datasets while having comparable performance on the third.


Videogames 'Fortnite,' 'Minecraft' Catapult Smiley Salamander to Global Fame

WSJ.com: WSJD - Technology

A global audience of a half-billion gamers has gotten to know the axolotl, a salamander that clusters largely in the canals around Mexico City and looks like a little dragon with a goofy smile. The videogame "Fortnite" trotted out axolotl characters in 2020, and "Minecraft" followed suit last summer. Roblox, a platform with millions of user-made games, has dozens of axolotl-centric ones, including "Axolotl Tycoon" and "Axolotl Paradise." Axolotls appear in "Adopt Me!," one of the most-played games on Roblox. All of the exposure has spawned axolotl memes, YouTube videos, coloring books and nonfungible tokens.


Neuron-Specific Dropout: A Deterministic Regularization Technique to Prevent Neural Networks from Overfitting & Reduce Dependence on Large Training Samples

Shunk, Joshua

arXiv.org Machine Learning

In order to develop complex relationships between their inputs and outputs, deep neural networks train and adjust a large number of parameters. To make these networks work at high accuracy, vast amounts of data are needed. Sometimes, however, the quantity of data needed is not available for training. Neuron-specific dropout (NSDropout) is a tool to address this problem. NSDropout looks at both the training pass and the validation pass of a layer in a model. By comparing the average values produced by each neuron for each class in a dataset, the network is able to drop targeted units: the layer can identify the features, or noise, that the model relies on during training but that are not present in the validation samples. Unlike dropout, the "thinned" networks cannot be "unthinned" for testing. Neuron-specific dropout has proved to achieve similar, if not better, testing accuracy with far less data than traditional methods, including dropout and other regularization methods. Experimentation has shown that neuron-specific dropout reduces the chance of a network overfitting and reduces the need for large training samples on supervised learning tasks in image recognition, all while producing best-in-class results.
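The core comparison the abstract describes — per-class mean activations on a training pass versus a validation pass, with the most divergent units dropped deterministically — can be sketched as below. The toy numbers, the `drop_frac` parameter, and the `nsdropout_mask` helper are illustrative assumptions under this reading of the method, not the paper's implementation.

```python
# Toy per-class mean activations of one layer (units x classes), as if
# collected over a training pass and a validation pass. All numbers are
# illustrative; units 0-3 are made to "memorise" training-only noise.
n_units, n_classes = 16, 4
train_means = [[0.5] * n_classes for _ in range(n_units)]
val_means = [row[:] for row in train_means]
for u in range(4):
    for c in range(n_classes):
        val_means[u][c] += 2.0  # large train/validation discrepancy

def nsdropout_mask(train_means, val_means, drop_frac=0.25):
    """Deterministically zero the units whose per-class behaviour
    diverges most between the training and validation passes."""
    divergence = [sum(abs(t - v) for t, v in zip(tr, va)) / len(tr)
                  for tr, va in zip(train_means, val_means)]
    n_drop = int(round(drop_frac * len(divergence)))
    drop = set(sorted(range(len(divergence)),
                      key=lambda u: divergence[u])[-n_drop:])
    return [0.0 if u in drop else 1.0 for u in range(len(divergence))]

mask = nsdropout_mask(train_means, val_means)
```

In contrast to standard dropout's random Bernoulli mask, this mask is a deterministic function of observed activations, which is why the resulting "thinned" network cannot simply be "unthinned" at test time.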


NASA inaugurates 10 new astronauts who are set to walk on the moon and potentially Mars

Daily Mail - Science & tech

NASA inaugurated its 23rd class of new astronauts on Monday, which includes 10 individuals who are set to walk on the moon and maybe even Mars. Deemed the 'Artemis Generation,' this group consists of several former US military members, an ex-SpaceX medical director and a bioengineer who also competed in track cycling ahead of the Tokyo Olympics. The name is a reference to NASA's Artemis program, which aims to send the first woman and the first person of color to the moon as early as 2025. The astronaut candidates for 2021 are: Nichole Ayers, Marcos Berríos, Christina Birch, Deniz Burnham, Luke Delaney, Andre Douglas, Jack Hathaway, Anil Menon, Christopher Williams and Jessica Wittner. This is NASA's first new class in four years, and the group is set to begin the two-year training process in January 2022.


A Leap from Artificial to Intelligence

Communications of the ACM

I am astonished that people who know what computers can do, and, especially, how they do it, still think we (humankind) will ever create a rational being, much less that the day is near. A program that can play winning chess or Go is not one. We all knew it would happen sooner or later. We are talking about a large but finite set of paths through a well-defined set. But such things are the work of engineers, not of the computer or its programs.


Online Tensor Methods for Learning Latent Variable Models

Huang, Furong, Niranjan, U. N., Hakeem, Mohammad Umar, Anandkumar, Animashree

arXiv.org Machine Learning

We introduce an online tensor decomposition based approach for two latent variable modeling problems, namely: (1) community detection, in which we learn the latent communities that the social actors in social networks belong to, and (2) topic modeling, in which we infer hidden topics of text articles. We consider decomposition of moment tensors using stochastic gradient descent. We optimize the multilinear operations within SGD and avoid forming the tensors directly, saving computational and storage costs. We present optimized implementations on two platforms. Our GPU-based implementation exploits the parallelism of SIMD architectures to allow for maximum speed-up by a careful optimization of storage and data transfer, whereas our CPU-based implementation uses efficient sparse matrix computations and is suitable for large sparse datasets. For the community detection problem, we demonstrate accuracy and computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic modeling problem, we also demonstrate good performance on the New York Times dataset. We compare our results to state-of-the-art algorithms such as the variational method, and report gains in accuracy and speedups of several orders of magnitude in execution time.
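The key cost-saving idea — evaluating multilinear operations directly on the data rather than materialising the moment tensor — can be illustrated with a tensor power step. For the empirical third moment T = E[x ⊗ x ⊗ x], the contraction T(I, v, v) equals the average of x·(x,v)², so it never requires the d×d×d tensor. The toy data, iteration count, and `implicit_power_update` helper below are illustrative assumptions, not the paper's algorithm.

```python
import math
import random

random.seed(2)
d, n = 8, 2000
# Toy data whose empirical third moment is dominated by coordinate 0.
X = [[0.9 * (j == 0) + 0.05 * random.gauss(0, 1) for j in range(d)]
     for _ in range(n)]

def implicit_power_update(X, v):
    """One multilinear step T(I, v, v), computed straight from the data
    and never forming the d x d x d moment tensor T = E[x (x) x (x) x]."""
    dim = len(v)
    w = [0.0] * dim
    for x in X:
        proj_sq = sum(xj * vj for xj, vj in zip(x, v)) ** 2  # <x, v>^2
        for j in range(dim):
            w[j] += x[j] * proj_sq
    norm = math.sqrt(sum(wj * wj for wj in w))
    return [wj / norm for wj in w]

v = [1.0 / math.sqrt(d)] * d
for _ in range(15):
    v = implicit_power_update(X, v)  # converges to the dominant component
```

Each update costs O(nd) instead of the O(d³) needed to touch a materialised tensor, which is what makes the stochastic, online variants in the paper tractable at scale.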