Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization

arXiv.org Machine Learning

Due to the high computational demands involved, executing a rigorous comparison between hyperparameter optimization (HPO) methods is often cumbersome. The goal of this paper is to facilitate a better empirical evaluation of HPO methods by providing benchmarks that are cheap to evaluate but still represent realistic use cases. We believe these benchmarks provide an easy and efficient way to conduct reproducible experiments for neural hyperparameter search. Our benchmarks consist of a large grid of configurations of a feed-forward neural network on four different regression datasets, covering both architectural hyperparameters and hyperparameters concerning the training pipeline. Based on this data, we first performed an in-depth analysis to gain a better understanding of the properties of the optimization problem, as well as of the importance of different types of hyperparameters. Second, we exhaustively compared various state-of-the-art methods from the hyperparameter optimization literature on these benchmarks in terms of performance and robustness.
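To illustrate why such a tabular benchmark makes HPO experiments cheap, below is a minimal Python sketch; the grid, the stored values, and all names are hypothetical, not the paper's actual data or API. Every configuration in the grid maps to a pre-computed result, so "evaluating" a configuration is a table lookup rather than a training run.

```python
# Minimal sketch of a tabular HPO benchmark: every configuration in a fixed
# grid has been trained in advance, so evaluation is a dictionary lookup.
# All names and numbers here are illustrative, not the benchmark's real API.

import itertools
import random

# Hypothetical search space: architectural and training-pipeline hyperparameters.
GRID = {
    "n_units_1": [16, 32, 64],
    "n_units_2": [16, 32, 64],
    "dropout": [0.0, 0.3],
    "learning_rate": [1e-3, 1e-2],
    "batch_size": [8, 32],
}

def _fake_validation_error(config):
    # Stand-in for the pre-computed result that the real benchmark stores.
    random.seed(hash(config))
    return random.uniform(0.05, 1.0)

# Build the lookup table once: configuration tuple -> validation error.
keys = sorted(GRID)
TABLE = {
    cfg: _fake_validation_error(cfg)
    for cfg in itertools.product(*(GRID[k] for k in keys))
}

def evaluate(config_dict):
    """Query the benchmark instead of training the network."""
    cfg = tuple(config_dict[k] for k in keys)
    return TABLE[cfg]

# A trivial random-search baseline becomes nearly free to run and to repeat.
best = min(
    evaluate({k: random.choice(v) for k, v in GRID.items()}) for _ in range(100)
)
print(f"best validation error found by random search: {best:.3f}")
```

Because each query is only a lookup, an optimizer can be re-run many times with different seeds at negligible cost, which is what enables the kind of reproducible, exhaustive comparisons described above.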


6-Layer Model for a Structured Description and Categorization of Urban Traffic and Environment

arXiv.org Artificial Intelligence

Verification and validation of automated driving functions pose major challenges. Currently, scenario-based approaches are being investigated in research and industry, aiming to reduce testing effort by specifying safety-relevant scenarios. To define those scenarios and operate in a complex real-world design domain, a structured description of the environment is needed. Within the PEGASUS research project, the 6-Layer Model (6LM) was introduced for the description of highway scenarios. This paper refines the 6LM and extends it to urban traffic and environment. As defined in PEGASUS, the 6LM makes it possible to categorize the environment and therefore serves as a structured basis for subsequent scenario description. The model enables a structured description and categorization of the general environment, without incorporating any knowledge about, or anticipating any functions of, actors. Beyond that, there is a variety of other applications of the 6LM, which are elaborated in this paper. The 6LM includes a description of the road network and traffic guidance objects, roadside structures, temporary modifications of the former, dynamic objects, environmental conditions, and digital information. The work at hand specifies each layer by categorizing its items. Guidelines are formulated and explanatory examples are given to standardize the application of the model for an objective environment description. In contrast to previous publications, the model and its design are described in far more detail. Finally, the holistic description of the 6LM presented here includes remarks on possible future work on extending the concept to machine perception aspects.
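As a rough illustration of how such a layered categorization can be used as a tagging scheme in software, the sketch below assigns scene items to six layers in the order they are listed in the abstract; the enum, the item names, and the layer numbering are illustrative assumptions, not the 6LM specification itself.

```python
# Minimal sketch of the 6-Layer Model as a tagging scheme for scene elements,
# following the layer order listed in the abstract; item names are illustrative.

from dataclasses import dataclass
from enum import IntEnum

class Layer(IntEnum):
    ROAD_NETWORK_AND_TRAFFIC_GUIDANCE = 1   # road network and traffic guidance objects
    ROADSIDE_STRUCTURES = 2                 # roadside structures
    TEMPORARY_MODIFICATIONS = 3             # temporary modifications of layers 1 and 2
    DYNAMIC_OBJECTS = 4                     # dynamic objects
    ENVIRONMENTAL_CONDITIONS = 5            # environmental conditions
    DIGITAL_INFORMATION = 6                 # digital information

@dataclass
class SceneItem:
    name: str
    layer: Layer

# Example categorization of items from an urban scene description.
scene = [
    SceneItem("traffic light", Layer.ROAD_NETWORK_AND_TRAFFIC_GUIDANCE),
    SceneItem("bus shelter", Layer.ROADSIDE_STRUCTURES),
    SceneItem("construction-site barrier", Layer.TEMPORARY_MODIFICATIONS),
    SceneItem("cyclist", Layer.DYNAMIC_OBJECTS),
    SceneItem("heavy rain", Layer.ENVIRONMENTAL_CONDITIONS),
    SceneItem("V2X speed advisory", Layer.DIGITAL_INFORMATION),
]

for item in sorted(scene, key=lambda s: s.layer):
    print(f"Layer {item.layer.value}: {item.name}")
```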


Gel layer inspired by camel fur could keep food and medicines cool

New Scientist

A thin gel layer that mimics camel fur could help insulate objects, potentially keeping them cool for days without electricity. Researchers have long been interested in hydrogels, which can absorb water and then release it through evaporation to produce a passive cooling effect without power. But a key challenge has been finding ways to make this effect last longer. Jeffrey Grossman at the Massachusetts Institute of Technology and his colleagues looked to camels for inspiration by combining hydrogel with a thin layer of another gel, aerogel, a light, porous insulating material. "Our evaporation-insulation bilayer mimics the camels," says Grossman. The hydrogel layer is like the camel's sweat gland, allowing water to evaporate and provide a cooling effect, whereas the aerogel layer plays the same role as the camel's fur, he says, providing crucial insulation that keeps out heat from the surroundings while still allowing water from the hydrogel to escape through it.


On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

arXiv.org Machine Learning

Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks. Despite the strong empirical performance of fine-tuned models, fine-tuning is an unstable process: training the same model with multiple random seeds can result in a large variance of the task performance. Previous literature (Devlin et al., 2019; Lee et al., 2020; Dodge et al., 2020) identified two potential reasons for the observed instability: catastrophic forgetting and the small size of the fine-tuning datasets. In this paper, we show that both hypotheses fail to explain the fine-tuning instability. We analyze BERT, RoBERTa, and ALBERT, fine-tuned on three commonly used datasets from the GLUE benchmark, and show that the observed instability is caused by optimization difficulties that lead to vanishing gradients. Additionally, we show that the remaining variance of the downstream task performance can be attributed to differences in generalization, where fine-tuned models with the same training loss exhibit noticeably different test performance. Based on our analysis, we present a simple but strong baseline that makes fine-tuning BERT-based models significantly more stable than previously proposed approaches. Pre-trained transformer-based masked language models such as BERT (Devlin et al., 2019), RoBERTa (Liu et al., 2019), and ALBERT (Lan et al., 2020) have had a dramatic impact on the NLP landscape in recent years. The standard recipe for using such models typically involves training a pre-trained model for a few epochs on a supervised downstream dataset, which is known as fine-tuning.
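To make the instability claim concrete, here is a minimal sketch that fine-tunes the same pre-trained checkpoint with several random seeds and reports the spread of validation accuracy; the dataset choice, hyperparameters, and Hugging Face Trainer setup are illustrative assumptions, not the paper's exact experimental protocol.

```python
# Minimal sketch of measuring fine-tuning stability across random seeds.
# Dataset, hyperparameters, and metric are illustrative, not the paper's setup.

import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments, set_seed)

MODEL = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)

# RTE is one of the small GLUE datasets on which instability is most visible.
raw = load_dataset("glue", "rte")

def tokenize(batch):
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, padding="max_length", max_length=128)

data = raw.map(tokenize, batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

scores = []
for seed in (1, 2, 3, 4, 5):
    set_seed(seed)  # controls classifier-head initialization and data order
    model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
    args = TrainingArguments(output_dir=f"out/seed{seed}", seed=seed,
                             num_train_epochs=3, learning_rate=2e-5,
                             per_device_train_batch_size=16, logging_steps=50)
    trainer = Trainer(model=model, args=args, compute_metrics=accuracy,
                      train_dataset=data["train"], eval_dataset=data["validation"])
    trainer.train()
    scores.append(trainer.evaluate()["eval_accuracy"])

# A large spread across seeds is the instability the abstract describes.
print(f"mean={np.mean(scores):.3f}  std={np.std(scores):.3f}  runs={scores}")
```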


Internet of Things - Wisar Lab

#artificialintelligence

The Internet of Things (IoT) is the network of physical objects embedded with electronics, software and sensors, with network connectivity that allows objects to be sensed and controlled remotely across the Internet. The IoT World Forum reference model consists of 7 layers. Data is collected by sensors and systems at the physical layer (Layer 1) and sent via a network, Layer 2, to storage in Layer 4. In some instances, data is processed locally, near the sensor network, before being stored; this is referred to as Layer 3, "Edge Computing". Data sent from Layer 1 to Layer 4 is event-based, meaning that data is measured and sent periodically or when a certain event occurs, such as the movement of a sensor; in this sense, Layers 1 to 4 are often classified as "real time". Once the data is stored in a database in the cloud, it can be accessed or queried at any time; this is represented by Layer 5.
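A minimal sketch of the layers described above, with a toy event-based pipeline from sensing to query. The labels for Layers 6 and 7, which the text does not describe, follow the common naming of the 7-layer reference model, and all function names and data are illustrative assumptions.

```python
# Minimal sketch of the IoT World Forum reference model described above; layer
# names follow the text, and the tiny "pipeline" is purely illustrative.

from enum import IntEnum

class IoTLayer(IntEnum):
    PHYSICAL_DEVICES = 1      # sensors and systems that collect data
    CONNECTIVITY = 2          # network that transports the measurements
    EDGE_COMPUTING = 3        # optional local processing near the sensor network
    DATA_ACCUMULATION = 4     # storage of event-based data
    DATA_ABSTRACTION = 5      # cloud database, queryable at any time
    APPLICATION = 6           # remaining layers of the 7-layer model,
    COLLABORATION = 7         # not detailed in the text above

storage = []  # stand-in for the Layer 4 store behind the Layer 5 database

def sense(value):
    """Layer 1: a reading is produced when an event occurs (e.g. movement)."""
    return {"reading": value, "layer": IoTLayer.PHYSICAL_DEVICES}

def transmit(event, edge_filter=None):
    """Layer 2 transport, with optional Layer 3 edge processing before storage."""
    if edge_filter is not None:
        event = edge_filter(event)          # Layer 3: process locally first
    storage.append(event)                   # Layer 4: accumulate

def query():
    """Layer 5: stored data can be queried at any time, not just in real time."""
    return [e["reading"] for e in storage]

# Event-based flow: readings are sent when movement is detected.
for movement in (0.2, 0.9, 0.4):
    transmit(sense(movement),
             edge_filter=lambda e: {**e, "reading": round(e["reading"], 1)})
print(query())
```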