
 node function


1abed6ee581b9ceb4e2ddf37822c7fcb-Supplemental-Conference.pdf

Neural Information Processing Systems

A.1 Graph-building strategies. The graphs were built using the IsayevNN class from the pymatgen [48] package, which implements the commonly used Voronoi tessellation to define neighbors. Two atoms are considered bonded if they share a face in the Voronoi tessellation of the supercell and their distance is less than the sum of their Cordero radii (a measure of the atomic radius) plus a cutoff of 0.5 Å. This cutoff was increased relative to [32] to reduce the number of disconnected graphs. We provide statistics for the graphs obtained by the method described in Section 5. A hard cutoff of 6 Å is also imposed on atomic distances. Figure 5: Histogram of the number of primitive cell sites per material in the processed Materials Project dataset.
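The bonding rule described above can be sketched in a few lines. This is a minimal illustration and not the pymatgen implementation: the function name `is_bonded`, the small radius table, and the `shares_voronoi_face` flag (which stands in for the result of the Voronoi tessellation) are assumptions made for this example, and periodic images are ignored.

```python
from math import dist

# Illustrative covalent radii in angstroms (assumed values for this sketch;
# use a complete tabulation of Cordero radii in real work).
CORDERO_RADIUS = {"H": 0.31, "C": 0.76, "O": 0.66}


def is_bonded(elem_a, pos_a, elem_b, pos_b, shares_voronoi_face,
              tol=0.5, hard_cutoff=6.0):
    """Return True if two atoms count as bonded under the rule in the text.

    shares_voronoi_face is a hypothetical input: it is assumed that a
    Voronoi tessellation has already been computed elsewhere.
    """
    # Condition 1: the atoms must share a face in the Voronoi tessellation.
    if not shares_voronoi_face:
        return False
    d = dist(pos_a, pos_b)
    # Condition 2: hard cutoff on the interatomic distance (6 angstroms).
    if d > hard_cutoff:
        return False
    # Condition 3: distance below the sum of radii plus the 0.5 angstrom cutoff.
    return d < CORDERO_RADIUS[elem_a] + CORDERO_RADIUS[elem_b] + tol
```

For instance, a C and an O atom 1.43 Å apart that share a Voronoi face pass the test, since 1.43 < 0.76 + 0.66 + 0.5, while the same pair at 2.5 Å does not.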



Graph Variate Neural Networks

arXiv.org Artificial Intelligence

Modelling dynamically evolving spatio-temporal signals is a prominent challenge in the Graph Neural Network (GNN) literature. Notably, GNNs assume an existing underlying graph structure. While this underlying structure may not always exist, or may be derived independently from the signal, a temporally evolving functional network can always be constructed from multi-channel data. Graph Variate Signal Analysis (GVSA) defines a unified framework consisting of a network tensor of instantaneous connectivity profiles against a stable support usually constructed from the signal itself. Building on GVSA and tools from graph signal processing, we introduce Graph-Variate Neural Networks (GVNNs): layers that convolve spatio-temporal signals with a signal-dependent connectivity tensor combining a stable long-term support with instantaneous, data-driven interactions. This design captures dynamic statistical interdependencies at each time step without ad hoc sliding windows and admits an efficient implementation with linear complexity in sequence length. Across forecasting benchmarks, GVNNs consistently outperform strong graph-based baselines and are competitive with widely used sequence models such as LSTMs and Transformers. On EEG motor-imagery classification, GVNNs achieve strong accuracy, highlighting their potential for brain-computer interface applications.
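The idea of combining a stable support with instantaneous connectivity can be illustrated with a small sketch. It is not the authors' layer: the node function (a plain product of two channel values at the same time step), the Hadamard combination, and the name `graph_variate_filter` are assumptions chosen for illustration, and no learned weights are included.

```python
def graph_variate_filter(signal, support):
    """Filter a multi-channel signal with a signal-dependent connectivity.

    signal:  list of T time steps, each a list of N channel values.
    support: N x N stable (long-term) connectivity matrix.

    At each time step t, an instantaneous connectivity
    J_t[i][j] = x_t[i] * x_t[j] (assumed node function) is combined
    elementwise with the support and applied to x_t.
    """
    out = []
    for x_t in signal:  # one pass over time: cost is linear in T
        n = len(x_t)
        y_t = [
            sum(support[i][j] * (x_t[i] * x_t[j]) * x_t[j] for j in range(n))
            for i in range(n)
        ]
        out.append(y_t)
    return out
```

Because each time step uses only its own sample, no sliding window is needed, and the work per step depends on the number of channels but not on the sequence length.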


Sparse Grouped Gaussian Processes for Solar Power Forecasting

arXiv.org Machine Learning

We consider multi-task regression models where observations are assumed to be a linear combination of several latent node and weight functions, all drawn from Gaussian process priors that allow nonzero covariance between grouped latent functions. Motivated by the problem of developing scalable methods for distributed solar forecasting, we exploit sparse covariance structures where latent functions are assumed to be conditionally independent given a group-pivot latent function. We exploit properties of multivariate Gaussians to construct sparse Cholesky factors directly, rather than obtaining them through iterative routines, and by doing so achieve significantly improved time and memory complexity including prediction complexity that is linear in the number of grouped functions. We test our approach on large multi-task datasets and find that sparse specifications achieve the same or better accuracy than non-sparse counterparts in less time, and improve on benchmark model accuracy.
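As a rough illustration of the model structure, the sketch below evaluates the linear combination of node and weight functions at a single input. The function name `combine` and the fixed numbers are hypothetical; in the model described above, the node and weight values would be draws from Gaussian process priors and observation noise would be added.

```python
def combine(weights, nodes):
    """Combine latent function values at one input point.

    weights: P x Q matrix, weights[p][q] is the value of the weight
             function linking task p to node function q.
    nodes:   length-Q list of node-function values.

    Returns the length-P list of noise-free task outputs.
    """
    return [sum(w * g for w, g in zip(row, nodes)) for row in weights]
```

With two tasks and two node functions, `combine([[1, 2], [0, 1]], [3, 4])` gives one output per task.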


Grouped Gaussian Processes for Solar Power Prediction

arXiv.org Machine Learning

Edwin V. Bonilla, School of Computer Science and Engineering, University of New South Wales, Sydney, Australia.

We consider multi-task regression models where the observations are assumed to be a linear combination of several latent node functions and weight functions, which are both drawn from Gaussian process priors. Driven by the problem of developing scalable methods for distributed solar power forecasting, we propose coupled priors over groups of (node or weight) processes to estimate a forecast model for solar power production at multiple distributed sites, exploiting spatial dependence between functions. Our results show that our approach provides better quantification of predictive uncertainties than competing benchmarks while maintaining high point-prediction accuracy.


Bayesian Optimization for Parameter Tuning of the XOR Neural Network

arXiv.org Machine Learning

When applying machine learning techniques to a problem, one must select model parameters that ensure the system converges without becoming stuck at a local minimum of the objective function. Tuning these parameters becomes a non-trivial task for large models, and it is not always apparent whether the user has found the optimal parameters. We aim to automate the process of tuning a neural network (where only a limited number of parameter search attempts are available) by implementing Bayesian optimization. In particular, by assigning Gaussian process priors to the parameter space, we use Bayesian optimization to tune an artificial neural network that learns the XOR function, achieving higher prediction accuracy as a result.
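One step of such a search can be sketched as follows. The abstract does not say which acquisition function is used; expected improvement is assumed here because it is a common choice, and the names `expected_improvement` and `pick_next` are hypothetical. The posterior mean and standard deviation for each candidate are taken as given, as if a Gaussian process had already been fitted to earlier trials.

```python
from math import erf, exp, pi, sqrt


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` when maximizing accuracy.

    mu, sigma: posterior mean and standard deviation at a candidate.
    best:      highest accuracy observed so far.
    """
    if sigma <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + erf(z / sqrt(2.0)))      # standard normal CDF
    pdf = exp(-0.5 * z * z) / sqrt(2.0 * pi)    # standard normal PDF
    return (mu - best) * cdf + sigma * pdf


def pick_next(candidates, best):
    """Choose the candidate with the largest expected improvement.

    candidates: list of (params, mu, sigma) tuples.
    """
    return max(candidates,
               key=lambda c: expected_improvement(c[1], c[2], best))[0]
```

The criterion trades off exploitation against exploration: a candidate whose predicted accuracy is slightly below the current best can still be selected if its uncertainty is large enough.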


Neural Tree Indexers for Text Understanding

arXiv.org Machine Learning

Recurrent neural networks (RNNs) process input text sequentially and model the conditional transition between word tokens. In contrast, recursive networks have the advantage of explicitly modeling the compositionality and recursive structure of natural language. However, the current recursive architecture is limited by its dependence on a syntactic tree. In this paper, we introduce a robust, syntactic-parsing-independent, tree-structured model, Neural Tree Indexers (NTI), which provides a middle ground between sequential RNNs and syntactic tree-based recursive models. NTI constructs a full n-ary tree by processing the input text with its node function in a bottom-up fashion. An attention mechanism can then be applied to both the structure and the node function. We implemented and evaluated a binary-tree model of NTI, showing that the model achieved state-of-the-art performance on three different NLP tasks: natural language inference, answer sentence selection, and sentence classification, outperforming state-of-the-art recurrent and recursive neural networks.
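The bottom-up construction can be sketched for the binary case. This is an illustration under stated assumptions and not the authors' model: `build_full_binary_tree` is a hypothetical name, the leaves are padded to a power of two with a caller-supplied `pad` item, and `node_fn` is any function that merges two children, where the real model would use a learned neural node function over vectors.

```python
def build_full_binary_tree(leaves, node_fn, pad):
    """Reduce a sequence to a single root by pairwise bottom-up composition.

    leaves:  non-empty list of leaf representations (e.g. token vectors).
    node_fn: function (left, right) -> parent representation.
    pad:     item appended until the leaf count is a power of two.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    # Pad so that every internal node has exactly two children.
    size = 1
    while size < len(level):
        size *= 2
    level += [pad] * (size - len(level))
    # Merge adjacent pairs level by level until only the root remains.
    while len(level) > 1:
        level = [node_fn(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Using strings in place of vectors makes the resulting structure visible: `build_full_binary_tree(["a", "b", "c"], lambda l, r: f"({l} {r})", "_")` yields `"((a b) (c _))"`. No syntactic parse is consulted at any point; the tree shape depends only on the sequence length.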


Constant-Time Loading of Shallow 1-Dimensional Networks

Neural Information Processing Systems

The complexity of learning in shallow 1-Dimensional neural networks has been shown elsewhere to be linear in the size of the network. However, when the network has a huge number of units (as cortex has), even linear time might be unacceptable. Furthermore, the algorithm that was given to achieve this time was based on a single serial processor and was biologically implausible. In this work we consider the more natural parallel model of processing and demonstrate an expected-time complexity that is constant (i.e.

