Lucas, James
Multi-student Diffusion Distillation for Better One-step Generators
Song, Yanke, Lorraine, Jonathan, Nie, Weili, Kreis, Karsten, Lucas, James
Diffusion models achieve high-quality sample generation at the cost of a lengthy multi-step inference procedure. To overcome this, diffusion distillation techniques produce student generators capable of matching or surpassing the teacher in a single step. However, the student model's inference speed is limited by the size of the teacher architecture, preventing real-time generation for computationally heavy applications. In this work, we introduce Multi-Student Distillation (MSD), a framework to distill a conditional teacher diffusion model into multiple single-step generators. Each student generator is responsible for a subset of the conditioning data, thereby obtaining higher generation quality for the same capacity. MSD trains multiple distilled students, allowing smaller sizes and, therefore, faster inference. MSD also offers a lightweight quality boost over single-student distillation with the same architecture. We demonstrate that MSD is effective by training multiple same-sized or smaller students for single-step distillation using distribution-matching and adversarial distillation techniques. With smaller students, MSD achieves competitive results with faster inference for single-step generation. Using 4 same-sized students, MSD significantly outperforms single-student baseline counterparts and achieves remarkable FID scores for one-step image generation: 1.20 on ImageNet-64x64 and 8.20 on zero-shot COCO2014.
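As a rough illustration of the routing idea above, a minimal sketch (assuming a toy stand-in generator and a simple round-robin split of class-conditional labels across students; the real students are distilled one-step diffusion networks) might look like:

    import torch
    import torch.nn as nn

    # Hypothetical illustration: split 1000 conditioning classes across 4 students,
    # so each distilled one-step generator only handles its assigned class subset.
    NUM_CLASSES, NUM_STUDENTS = 1000, 4
    class_to_student = torch.arange(NUM_CLASSES) % NUM_STUDENTS  # round-robin partition

    class TinyGenerator(nn.Module):
        """Stand-in for a one-step student generator (not a real distilled diffusion net)."""
        def __init__(self, latent_dim=64, num_classes=NUM_CLASSES):
            super().__init__()
            self.embed = nn.Embedding(num_classes, latent_dim)
            self.net = nn.Sequential(nn.Linear(2 * latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 3 * 8 * 8))
        def forward(self, z, y):
            return self.net(torch.cat([z, self.embed(y)], dim=-1)).view(-1, 3, 8, 8)

    students = [TinyGenerator() for _ in range(NUM_STUDENTS)]

    def route_and_generate(z, y):
        """Dispatch each conditioning label to the student responsible for it."""
        out = torch.empty(z.shape[0], 3, 8, 8)
        owner = class_to_student[y]
        for s_idx, student in enumerate(students):
            mask = owner == s_idx
            if mask.any():
                out[mask] = student(z[mask], y[mask])
        return out

    z = torch.randn(16, 64)
    y = torch.randint(0, NUM_CLASSES, (16,))
    samples = route_and_generate(z, y)  # each sample comes from exactly one student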
SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes
Shen, Tianchang, Li, Zhaoshuo, Law, Marc, Atzmon, Matan, Fidler, Sanja, Lucas, James, Gao, Jun, Sharp, Nicholas
Meshes are ubiquitous in visual computing and simulation, yet most existing machine learning techniques represent meshes only indirectly, e.g. as the level set of a scalar field or deformation of a template, or as a disordered triangle soup lacking local structure. This work presents a scheme to directly generate manifold, polygonal meshes of complex connectivity as the output of a neural network. Our key innovation is to define a continuous latent connectivity space at each mesh vertex, which implies the discrete mesh. In particular, our vertex embeddings generate cyclic neighbor relationships in a halfedge mesh representation, which gives a guarantee of edge-manifoldness and the ability to represent general polygonal meshes. This representation is well-suited to machine learning and stochastic optimization, without restriction on connectivity or topology. We first explore the basic properties of this representation, then use it to fit distributions of meshes from large datasets. The resulting models generate diverse meshes with tessellation structure learned from the dataset population, with concise details and high-quality mesh elements. In applications, this approach not only yields high-quality outputs from generative models, but also enables directly learning challenging geometry processing tasks such as mesh repair.
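A simplified sketch of the connectivity side of this idea: a cyclic neighbor ordering at every vertex (a rotation system) already pins down halfedge connectivity and hence the faces. In SpaceMesh these orderings are decoded from continuous per-vertex embeddings; here they are hand-specified for a tetrahedron, so the example only shows the halfedge bookkeeping, not the learned representation.

    # Cyclic neighbor order around each vertex of a tetrahedron (a rotation system).
    cyclic = {
        0: [1, 2, 3],
        1: [0, 3, 2],
        2: [0, 1, 3],
        3: [0, 2, 1],
    }

    def next_halfedge(u, v):
        """Next halfedge after (u -> v): leaves v toward the neighbor following u in v's cyclic order."""
        ring = cyclic[v]
        w = ring[(ring.index(u) + 1) % len(ring)]
        return (v, w)

    def trace_faces(cyclic):
        """Walk 'next' pointers to enumerate the faces implied by the rotation system."""
        faces, visited = [], set()
        for u in cyclic:
            for v in cyclic[u]:
                if (u, v) in visited:
                    continue
                face, h = [], (u, v)
                while h not in visited:
                    visited.add(h)
                    face.append(h[0])
                    h = next_halfedge(*h)
                faces.append(face)
        return faces

    print(trace_faces(cyclic))  # four triangular faces for this tetrahedron ordering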
Improving Hyperparameter Optimization with Checkpointed Model Weights
Mehta, Nikhil, Lorraine, Jonathan, Masson, Steve, Arunachalam, Ramanathan, Bhat, Zaid Pervaiz, Lucas, James, Zachariah, Arun George
When training deep learning models, performance depends largely on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat this as a black-box optimization problem. In contrast, gray-box HPO methods, which incorporate more information about the setup, have emerged as a promising direction for more efficient optimization; for example, they can use intermediate loss evaluations to terminate bad selections early. In this work, we propose an HPO method for neural networks using logged checkpoints of the trained weights to guide future hyperparameter selections. Our method, Forecasting Model Search (FMS), embeds weights into a Gaussian process deep kernel surrogate model, using a permutation-invariant graph metanetwork to be data-efficient with the logged network weights. To facilitate reproducibility and further research, we open-source our code.
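A hedged sketch of the surrogate-model shape described above: a Gaussian process whose kernel is applied to learned embeddings of each configuration. The encoder below is a toy MLP stand-in for the permutation-invariant graph metanetwork used in FMS, and the GP posterior is written out in closed form rather than with any particular library.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        """Toy stand-in for the learned embedding of (hyperparameters, weight checkpoints)."""
        def __init__(self, in_dim, emb_dim=8):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, emb_dim))
        def forward(self, x):
            return self.net(x)

    def rbf(a, b, lengthscale=1.0):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-0.5 * d2 / lengthscale**2)

    def gp_posterior(encoder, x_train, y_train, x_test, noise=1e-2):
        """Standard GP regression mean/variance, with the kernel applied to encoded inputs."""
        z_tr, z_te = encoder(x_train), encoder(x_test)
        k_tr = rbf(z_tr, z_tr) + noise * torch.eye(len(x_train))
        k_cross = rbf(z_te, z_tr)
        mean = k_cross @ torch.linalg.solve(k_tr, y_train)
        cov = rbf(z_te, z_te) - k_cross @ torch.linalg.solve(k_tr, k_cross.T)
        return mean, cov.diagonal()

    # Toy usage: x stands for a configuration's features, y for its validation metric.
    x_train, y_train = torch.randn(20, 5), torch.randn(20)
    x_test = torch.randn(4, 5)
    enc = Encoder(in_dim=5)
    mean, var = gp_posterior(enc, x_train, y_train, x_test)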
LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis
Xie, Kevin, Lorraine, Jonathan, Cao, Tianshi, Gao, Jun, Lucas, James, Torralba, Antonio, Fidler, Sanja, Zeng, Xiaohui
Recent text-to-3D generation approaches produce impressive 3D results but require time-consuming optimization that can take up to an hour per prompt [21, 39]. Amortized methods like ATT3D [26] optimize multiple prompts simultaneously to improve efficiency, enabling fast text-to-3D synthesis. However, they cannot capture high-frequency geometry and texture details and struggle to scale to large prompt sets, so they generalize poorly. We introduce Latte3D, addressing these limitations to achieve fast, high-quality generation on a significantly larger prompt set. Key to our method is 1) building a scalable architecture and 2) leveraging 3D data during optimization through 3D-aware diffusion priors, shape regularization, and model initialization to achieve robustness to diverse and complex training prompts. Latte3D amortizes both neural field and textured surface generation to produce highly detailed textured meshes in a single forward pass. Latte3D generates 3D objects in 400ms, and can be further enhanced with fast test-time optimization.
Graph Metanetworks for Processing Diverse Neural Architectures
Lim, Derek, Maron, Haggai, Law, Marc T., Lorraine, Jonathan, Lucas, James
Neural networks efficiently encode learned information within their parameters. Consequently, many tasks can be unified by treating neural networks themselves as input data. When doing so, recent studies demonstrated the importance of accounting for the symmetries and geometry of parameter spaces. However, those works developed architectures tailored to specific networks such as MLPs and CNNs without normalization layers, and generalizing such architectures to other types of networks can be challenging. In this work, we overcome these challenges by building new metanetworks -- neural networks that take weights from other neural networks as input. Put simply, we carefully build graphs representing the input neural networks and process the graphs using graph neural networks, yielding graph metanetworks (GMNs). We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network functions unchanged.

Neural networks are well-established for predicting, generating, and transforming data. A newer paradigm is to treat the parameters of neural networks themselves as data. This insight inspired researchers to suggest neural architectures that can predict properties of trained neural networks (Eilertsen et al., 2020), generate new networks (Erkoç et al., 2023), optimize networks (Metz et al., 2022), or otherwise transform them (Navon et al., 2023; Zhou et al., 2023a). We refer to these neural networks that process other neural networks as metanetworks, or metanets for short. Metanets enable new applications, but designing them is nontrivial. A common approach is to flatten the network parameters into a vector representation, neglecting the input network structure. More generally, a prominent challenge in metanet design is that the space of neural network parameters exhibits symmetries. For example, permuting the neurons in the hidden layers of a Multilayer Perceptron (MLP) leaves the network output unchanged (Hecht-Nielsen, 1990). In contrast, equivariant metanets respect these symmetries, so that if the input network is permuted then the metanet output is permuted in the same way. Recently, several works have proposed equivariant metanets that have shown significantly improved performance (Navon et al., 2023; Zhou et al., 2023a,b). However, these networks typically require highly specialized, hand-designed layers that can be difficult to devise.
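A minimal sketch of the "network as a graph" construction for the simplest case, a bias-free MLP: one node per neuron, one edge per weight, with the weight value as an edge feature. The paper's construction is much richer (biases, normalization, convolutions, and other layer types); this is only meant to show why hidden-neuron permutations become graph isomorphisms.

    import numpy as np

    def mlp_to_graph(weight_matrices):
        """weight_matrices[i] has shape (fan_out, fan_in) for layer i."""
        layer_sizes = [weight_matrices[0].shape[1]] + [w.shape[0] for w in weight_matrices]
        offsets = np.cumsum([0] + layer_sizes)          # node index offset of each layer
        edges, edge_feats = [], []
        for i, w in enumerate(weight_matrices):
            for out_idx in range(w.shape[0]):
                for in_idx in range(w.shape[1]):
                    edges.append((offsets[i] + in_idx, offsets[i + 1] + out_idx))
                    edge_feats.append(w[out_idx, in_idx])
        num_nodes = offsets[-1]
        return num_nodes, np.array(edges), np.array(edge_feats)

    # Permuting hidden neurons permutes graph nodes but leaves the graph isomorphic,
    # which is why a GNN operating on this graph respects the parameter symmetry.
    w1, w2 = np.random.randn(16, 4), np.random.randn(3, 16)
    num_nodes, edges, feats = mlp_to_graph([w1, w2])
    print(num_nodes, edges.shape, feats.shape)  # 23 nodes, 112 weighted edges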
ATT3D: Amortized Text-to-3D Object Synthesis
Lorraine, Jonathan, Xie, Kevin, Zeng, Xiaohui, Lin, Chen-Hsuan, Takikawa, Towaki, Sharp, Nicholas, Lin, Tsung-Yi, Liu, Ming-Yu, Fidler, Sanja, Lucas, James
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimization to create 3D objects. To address this, we amortize optimization over text prompts by training on many prompts simultaneously with a unified model, instead of separately. With this, we share computation across a prompt set, training in less time than per-prompt optimization. Our framework - Amortized text-to-3D (ATT3D) - enables knowledge-sharing between prompts to generalize to unseen setups and smooth interpolations between text for novel assets and simple animations.
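A hedged sketch of the amortization idea, with toy stand-ins throughout: a single prompt-conditioned model is optimized over the whole prompt set, rather than running a separate optimization per prompt. The text embedding, mapping network, and loss below are placeholders, not ATT3D's actual neural field, rendering, or score-distillation objective.

    import random
    import torch
    import torch.nn as nn

    prompts = ["a red chair", "a wooden table", "a blue teapot"]

    def embed_prompt(p):                      # placeholder text embedding
        torch.manual_seed(hash(p) % (2**31))
        return torch.randn(32)

    amortized_model = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 64))
    opt = torch.optim.Adam(amortized_model.parameters(), lr=1e-3)

    for step in range(100):
        p = random.choice(prompts)            # sample a prompt; all prompts share one model
        scene_params = amortized_model(embed_prompt(p))
        loss = scene_params.pow(2).mean()     # stand-in for the rendering + distillation loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    # At test time, a prompt (even an unseen one) gets its 3D parameters from a single
    # forward pass, with no per-prompt optimization loop.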
Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes
Lucas, James, Bae, Juhan, Zhang, Michael R., Fort, Stanislav, Zemel, Richard, Grosse, Roger
Linear interpolation between initial neural network parameters and converged parameters after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease in the training objective. This Monotonic Linear Interpolation (MLI) property, first observed by Goodfellow et al. (2014), persists in spite of the non-convex objectives and highly non-linear training dynamics of neural networks. Extending this work, we evaluate several hypotheses for this property that, to our knowledge, have not yet been explored. Using tools from differential geometry, we draw connections between the interpolated paths in function space and the monotonicity of the network, providing sufficient conditions for the MLI property under mean squared error. While the MLI property holds under various settings (e.g. network architectures and learning problems), we show in practice that networks violating the MLI property can be produced systematically by encouraging the weights to move far from initialization. The MLI property raises important questions about the loss landscape geometry of neural networks and highlights the need to further study their global properties.
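A minimal sketch of the interpolation being studied, with a toy model and synthetic data: train with SGD, then evaluate the training loss along theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final for alpha in [0, 1].

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x, y = torch.randn(256, 10), torch.randn(256, 1)
    model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))
    theta_init = copy.deepcopy(model.state_dict())

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(200):                                  # train the toy model with SGD
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()
    theta_final = copy.deepcopy(model.state_dict())

    def loss_at(alpha):
        """Training loss at the linear blend of initial and final parameters."""
        blended = {k: (1 - alpha) * theta_init[k] + alpha * theta_final[k] for k in theta_init}
        model.load_state_dict(blended)
        with torch.no_grad():
            return nn.functional.mse_loss(model(x), y).item()

    losses = [loss_at(a / 20) for a in range(21)]
    print(losses)  # typically (though not always) monotonically non-increasing in alpha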
Flexible Few-Shot Learning with Contextual Similarity
Ren, Mengye, Triantafillou, Eleni, Wang, Kuan-Chieh, Lucas, James, Snell, Jake, Pitkow, Xaq, Tolias, Andreas S., Zemel, Richard
Existing approaches to few-shot learning deal with tasks that have persistent, rigid notions of classes. Typically, the learner observes data only from a fixed number of classes at training time and is asked to generalize to a new set of classes at test time. Two examples from the same class would always be assigned the same labels in any episode. In this work, we consider a realistic setting where the similarities between examples can change from episode to episode depending on the task context, which is not given to the learner. We define new benchmark datasets for this flexible few-shot scenario, where the tasks are based on images of faces (Celeb-A), shoes (Zappos50K), and general objects (ImageNet-with-Attributes). While classification baselines and episodic approaches learn representations that work well for standard few-shot learning, they suffer in our flexible tasks as novel similarity definitions arise during testing. We propose to build upon recent contrastive unsupervised learning techniques and use a combination of instance and class invariance learning, aiming to obtain general and flexible features. We find that our approach performs strongly on our new flexible few-shot learning benchmarks, demonstrating that unsupervised learning obtains more generalizable representations.
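A hedged sketch of combining instance-level and class-level invariance objectives; the specific losses, encoder, and "augmentations" here are generic stand-ins (an NT-Xent contrastive term plus a class cross-entropy term), not necessarily the paper's exact formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def nt_xent(z1, z2, temperature=0.5):
        """Instance-level contrastive loss over two augmented views of the same batch."""
        z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)
        sim = z @ z.T / temperature
        n = z1.shape[0]
        sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
        return F.cross_entropy(sim, targets)

    encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
    classifier = nn.Linear(32, 10)

    x_view1, x_view2 = torch.randn(16, 64), torch.randn(16, 64)   # stand-ins for two augmentations
    class_labels = torch.randint(0, 10, (16,))

    z1, z2 = encoder(x_view1), encoder(x_view2)
    # Instance invariance (contrastive) + class invariance (classification) in one objective.
    loss = nt_xent(z1, z2) + F.cross_entropy(classifier(z1), class_labels)
    loss.backward()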
Theoretical bounds on estimation error for meta-learning
Lucas, James, Ren, Mengye, Kameni, Irene, Pitassi, Toniann, Zemel, Richard
Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly. However, recent successes in few-shot learning and related problems are encouraging signs that these models can be adapted to more realistic settings where train and test distributions differ. Unfortunately, there is severely limited theoretical support for these algorithms and little is known about the difficulty of these problems. In this work, we provide novel information-theoretic lower-bounds on minimax rates of convergence for algorithms that are trained on data from multiple sources and tested on novel data. Our bounds depend intuitively on the information shared between sources of data, and characterize the difficulty of learning in this setting for arbitrary algorithms. We demonstrate these bounds on a hierarchical Bayesian model of meta-learning, computing both upper and lower bounds on parameter estimation via maximum-a-posteriori inference.
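As a concrete (simplified) instance of the kind of hierarchical Bayesian model referenced above, one standard formulation is the following; the specific priors and likelihoods are illustrative rather than the exact model analyzed in the paper:

    \theta \sim p(\theta), \qquad
    \phi_t \mid \theta \sim p(\phi \mid \theta), \quad t = 1, \dots, T, \qquad
    x_{t,i} \mid \phi_t \sim p(x \mid \phi_t),

    \hat{\phi}_{T+1} = \arg\max_{\phi} \; \log p\!\left(x_{T+1,1:m} \mid \phi\right) + \log p\!\left(\phi \mid \hat{\theta}\right),

where the shared parameter \hat{\theta} is estimated from the T observed tasks; maximum-a-posteriori estimators of this form are the setting in which the abstract's upper and lower bounds on parameter estimation are computed.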
Regularized linear autoencoders recover the principal components, eventually
Bao, Xuchan, Lucas, James, Sachdeva, Sushant, Grosse, Roger
Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs). We show that when trained with proper regularization, LAEs can directly learn the optimal representation -- ordered, axis-aligned principal components. We analyze two such regularization schemes: non-uniform $\ell_2$ regularization and a deterministic variant of nested dropout [Rippel et al., ICML 2014]. Though both regularization schemes converge to the optimal representation, we show that this convergence is slow due to ill-conditioning that worsens with increasing latent dimension. We show that the inefficiency of learning the optimal representation is not inevitable -- we present a simple modification to the gradient descent update that greatly speeds up convergence empirically.
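A hedged sketch of a linear autoencoder trained with non-uniform L2 regularization, where each latent dimension gets its own increasing penalty; this is the kind of symmetry-breaking regularizer described above, though the exact objective, constants, and training details may differ from the paper's.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 500, 10, 3
    X = rng.normal(size=(n, d)) * np.linspace(3.0, 0.5, d)     # toy data with anisotropic variance
    X -= X.mean(axis=0)

    W1 = rng.normal(scale=0.1, size=(k, d))                    # encoder
    W2 = rng.normal(scale=0.1, size=(d, k))                    # decoder
    lam = np.array([1e-3, 2e-3, 3e-3])                         # non-uniform, increasing penalties
    lr = 1e-3

    for step in range(20000):
        Z = X @ W1.T                                            # (n, k) latent codes
        R = Z @ W2.T - X                                        # reconstruction residual
        grad_W2 = 2 * R.T @ Z / n + 2 * lam[None, :] * W2       # per-column penalty on the decoder
        grad_W1 = 2 * (W2.T @ R.T @ X) / n + 2 * lam[:, None] * W1  # per-row penalty on the encoder
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2

    # Compare normalized encoder rows with the leading principal directions of X; they should
    # increasingly align, though slowly -- the ill-conditioning the abstract describes.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    print(np.abs((W1 / np.linalg.norm(W1, axis=1, keepdims=True)) @ Vt[:k].T).round(2))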