AITopics | hypermodel

Collaborating Authors

hypermodel

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps

Zhao, Linfeng, Wong, Lawson L. S.

arXiv.org Artificial IntelligenceDec-16-2024

Learning navigation capabilities in different environments has long been one of the major challenges in decision-making. In this work, we focus on zero-shot navigation ability using given abstract $2$-D top-down maps. Like human navigation by reading a paper map, the agent reads the map as an image when navigating in a novel layout, after learning to navigate on a set of training maps. We propose a model-based reinforcement learning approach for this multi-task learning problem, where it jointly learns a hypermodel that takes top-down maps as input and predicts the weights of the transition network. We use the DeepMind Lab environment and customize layouts using generated maps. Our method can adapt better to novel environments in zero-shot and is more robust to noise.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2412.12024

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.46)
Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

HyperTuning: Toward Adapting Large Language Models without Back-propagation

Phang, Jason, Mao, Yi, He, Pengcheng, Chen, Weizhu

arXiv.org Artificial IntelligenceNov-22-2022

Fine-tuning large language models for different tasks can be costly and inefficient, and even methods that reduce the number of tuned parameters still require full gradient-based optimization. We propose HyperTuning, a novel approach to model adaptation that uses a hypermodel to generate task-specific parameters for a fixed downstream model. We demonstrate a simple setup for hypertuning with HyperT5, a T5-based hypermodel that produces soft prefixes or LoRA parameters for a frozen T5 model from few-shot examples. We train HyperT5 in two stages: first, hyperpretraining with a modified conditional language modeling objective that trains a hypermodel to generate parameters; second, multi-task fine-tuning (MTF) on a large number of diverse language tasks. We evaluate HyperT5 on P3, MetaICL and Super-NaturalInstructions datasets, and show that it can effectively generate parameters for unseen tasks. Moreover, we show that using hypermodel-generated parameters as initializations for further parameter-efficient fine-tuning improves performance. HyperTuning can thus be a flexible and efficient way to leverage large language models for diverse downstream applications.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2211.12485

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York (0.04)
North America > Dominican Republic (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Automatic Hyperparameter Optimization With Keras Tuner

#artificialintelligenceAug-31-2022, 09:54:33 GMT

Hyperparameters are configurations that determine the structure of machine learning models and control their learning processes. They shouldn't be confused with the model's parameters (such as the bias) whose optimal values are determined during training. Hyperparameters are adjustable configurations that are manually set and tuned to optimize the model performance. They are top-level parameters whose values contribute to determining the weights of the model parameters. The two main types of hyperparameters are the model hyperparameters (such as the number and units of layers) which determine the structure of the model and the algorithm hyperparameters (such as the optimization algorithm and learning rate), which influences and controls the learning process.

hyperparameter, keras tuner, search space, (12 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Customizing keras-tuner for flexibility and maintainability

#artificialintelligenceJul-5-2022, 23:56:45 GMT

Let's get right into coding! We will import all necessary libraries, internal code, and specify callbacks, metrics, inputs, and loss function which will be passed to our custom HyperModel. Hypermodel is a class that allow us to use hp arguments to define hyperparameters. After initializing hypermodel, reassign directory name get_project_name()(where hpo trials will be saved) if already exists since if you run with already existing directory name keras tuner does not run properly. Another way to avoid this is to add overwrite True parameter when instantiating kt.RandomSearch.

customizing keras-tuner, dataset, flexibility and maintainability, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Hyperparameter Tuning with KerasTuner and TensorFlow

#artificialintelligenceAug-28-2021, 04:05:04 GMT

Building machine learning models is an iterative process that involves optimizing the model’s performance and compute resources. The settings that you adjust during each iteration are called…

hyperparameter, keras tuner, kerastuner and tensorflow, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Randomized Value Functions via Posterior State-Abstraction Sampling

Arumugam, Dilip, Van Roy, Benjamin

arXiv.org Artificial IntelligenceOct-5-2020

State abstraction has been an essential tool for dramatically improving the sample efficiency of reinforcement-learning algorithms. Indeed, by exposing and accentuating various types of latent structure within the environment, different classes of state abstraction have enabled improved theoretical guarantees and empirical performance. When dealing with state abstractions that capture structure in the value function, however, a standard assumption is that the true abstraction has been supplied or unrealistically computed a priori, leaving open the question of how to efficiently uncover such latent structure while jointly seeking out optimal behavior. Taking inspiration from the bandit literature, we propose that an agent seeking out latent task structure must explicitly represent and maintain its uncertainty over that structure as part of its overall uncertainty about the environment. We introduce a practical algorithm for doing this using two posterior distributions over state abstractions and abstract-state values. In empirically validating our approach, we find that substantial performance gains lie in the multi-task setting where tasks share a common, low-dimensional representation.

machine learning, reinforcement learning, state abstraction, (12 more...)

arXiv.org Artificial Intelligence

2010.02383

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Hypermodels for Exploration

Dwaracherla, Vikranth, Lu, Xiuyuan, Ibrahimi, Morteza, Osband, Ian, Wen, Zheng, Van Roy, Benjamin

arXiv.org Machine LearningJun-12-2020

We study the use of hypermodels to represent epistemic uncertainty and guide exploration. This generalizes and extends the use of ensembles to approximate Thompson sampling. The computational cost of training an ensemble grows with its size, and as such, prior work has typically been limited to ensembles with tens of elements. We show that alternative hypermodels can enjoy dramatic efficiency gains, enabling behavior that would otherwise require hundreds or thousands of elements, and even succeed in situations where ensemble methods fail to learn regardless of size. This allows more accurate approximation of Thompson sampling as well as use of more sophisticated exploration schemes. In particular, we consider an approximate form of information-directed sampling and demonstrate performance gains relative to Thompson sampling. As alternatives to ensembles, we consider linear and neural network hypermodels, also known as hypernetworks. We prove that, with neural network base models, a linear hypermodel can represent essentially any distribution over functions, and as such, hypernetworks are no more expressive.

base model, hypermodel, linear hypermodel, (15 more...)

arXiv.org Machine Learning

2006.07464

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

keras-team/keras-tuner

#artificialintelligenceJun-23-2019, 02:04:59 GMT

Then, start the search for the best hyperparameter configuration. The call to search has the same signature as model.fit(). Here's what happens in search: models are built iteratively by calling the model-building function, which populates the hyperparameter space (search space) tracked by the hp object. The tuner progressively explores the space, recording metrics for each configuration.

artificial intelligence, hypermodel, machine learning, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.39)
Information Technology > Artificial Intelligence > Machine Learning (0.34)

Add feedback