consensus score
Representation learning for a generalized, quantitative comparison of complex model outputs
Cess, Colin G., Finley, Stacey D.
Computational models are quantitative representations of systems. By analyzing and comparing the outputs of such models, it is possible to gain a better understanding of the system itself. As the complexity of model outputs increases, though, it becomes increasingly difficult to compare simulations with each other. While it is straightforward to compare a few specific model outputs across multiple simulations, additional useful information can come from comparing model simulations as a whole. However, it is difficult to holistically compare model simulations in an unbiased manner. To address these limitations, we use representation learning to transform model simulations into low-dimensional points, with the neural networks capturing the relationships between the model outputs without the need to manually specify which outputs to focus on. The distance in low-dimensional space acts as a comparison metric, reducing the difference between simulations to a single value. We provide an approach to training neural networks on model simulations and show how the trained networks can then be used to provide a holistic comparison of model outputs. This approach can be applied to a wide range of model types, providing a quantitative method of analyzing the complex outputs of computational models.
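The idea of using distance in the low-dimensional space as a comparison metric can be illustrated with a minimal sketch. The embeddings below are hypothetical stand-ins for the outputs of a trained encoder, and Euclidean distance is an assumption; the abstract does not state which distance is used.

```python
import math


def embedding_distance(a, b):
    """Euclidean distance between two low-dimensional points (assumed metric)."""
    return math.dist(a, b)


def pairwise_distances(embeddings):
    """Symmetric matrix of distances between every pair of simulations."""
    n = len(embeddings)
    return [
        [embedding_distance(embeddings[i], embeddings[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2-D embeddings, one point per simulation.
embeddings = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
distances = pairwise_distances(embeddings)
```

Each entry `distances[i][j]` reduces the difference between simulations `i` and `j` to a single value, which is the role the abstract assigns to the learned representation.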
All Neural Networks are Created Equal
Hacohen, Guy, Weinshall, Daphna
One of the unresolved questions in the context of deep learning is the triumph of gradient descent (GD) based optimization, which is guaranteed to converge to one of many local minima. To shed light on the nature of the solutions that are thus being discovered, we investigate the ensemble of solutions reached by the same network architecture, with different random initialization of weights and random mini-batches. Surprisingly, we observe that these solutions are in fact very similar: more often than not, each train and test example is either classified correctly by all the networks, or by none at all. Moreover, all the networks seem to share the same learning dynamics, whereby initially the same train and test examples are incorporated into the learnt model, followed by other examples which are learnt in roughly the same order. When different neural network architectures are compared, the same learning dynamics are observed even when one architecture is significantly stronger than the other and achieves higher accuracy. Finally, when investigating other methods that involve the gradual refinement of a solution, such as boosting, once again we see the same learning pattern. In all cases, it appears as if all the classifiers start by learning to classify correctly the same train and test examples, while the more powerful classifiers continue to learn to classify correctly additional examples. These results are incredibly robust, observed for a large variety of architectures, hyperparameters and different datasets of images. Thus we observe that different classification solutions may be discovered by different means, but typically they evolve in roughly the same manner and demonstrate similar success and failure behavior. For a given dataset, such behavior seems to be strongly correlated with effective generalization, while the induced ranking of examples may reflect inherent structure in the data.
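The "all or none" observation can be made concrete with a small sketch. The function names and the toy predictions are hypothetical; the code simply measures, for each example, the fraction of networks that classify it correctly, and then the share of examples on which the networks are unanimous. It is an illustration of the idea, not the authors' exact measure.

```python
def per_example_agreement(predictions, labels):
    """Fraction of models that classify each example correctly.

    predictions[m][i] is the label predicted by model m for example i.
    """
    n_models = len(predictions)
    return [
        sum(model[i] == label for model in predictions) / n_models
        for i, label in enumerate(labels)
    ]


def all_or_none_fraction(agreement):
    """Share of examples classified correctly by every model or by no model."""
    return sum(a in (0.0, 1.0) for a in agreement) / len(agreement)


# Toy data: three hypothetical networks, four examples.
labels = [0, 1, 1, 0]
predictions = [
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 1, 0, 1],
]
agreement = per_example_agreement(predictions, labels)
unanimous = all_or_none_fraction(agreement)
```

A value of `unanimous` close to 1 would correspond to the behavior the abstract describes, where most examples are either learnt by all networks or by none.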
Quantifying consensus of rankings based on q-support patterns
Xue, Zhengui, Lin, Zhiwei, Wang, Hui, McClean, Sally
Rankings, representing preferences over a set of candidates, are widely used in many information systems, e.g., group decision making. It is of great importance to evaluate the consensus of the rankings obtained from multiple agents. There is often no ground truth available for a ranking task. An overall measure of the degree of consensus gives a clear understanding of the ranking data. Moreover, it could provide a quantitative indicator for comparing consensus between groups and for further improving a ranking system. In this paper, a novel consensus-quantifying approach, without the need for any correlation or distance functions, is proposed based on the concept of q-support patterns of rankings. The q-support patterns represent the commonality embedded in a set of rankings. A method for detecting outliers in a set of rankings is naturally derived from the proposed consensus-quantifying approach. Experimental studies are conducted to demonstrate the effectiveness of the proposed approach.
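The abstract does not define q-support patterns, so the following is only one possible reading, offered as a rough sketch. It assumes that a pattern is an ordered pair of candidates (a ranked before b) and that a pattern is q-supported when at least q rankings agree with it. The paper's actual definition may be more general, and the function names here are hypothetical.

```python
from itertools import combinations


def pairwise_support(rankings):
    """Count how many rankings place a before b, for each ordered pair (a, b)."""
    support = {}
    for ranking in rankings:
        # combinations() preserves list order, so a precedes b in this ranking.
        for a, b in combinations(ranking, 2):
            support[(a, b)] = support.get((a, b), 0) + 1
    return support


def q_supported_pairs(rankings, q):
    """Ordered pairs supported by at least q rankings (assumed simplification)."""
    return {pair for pair, n in pairwise_support(rankings).items() if n >= q}


# Three hypothetical agents ranking the same three candidates.
rankings = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
shared_by_all = q_supported_pairs(rankings, q=3)
shared_by_two = q_supported_pairs(rankings, q=2)
```

Under this reading, the more patterns that survive at high q, the more commonality the rankings share, and no correlation or distance function between rankings is needed.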
Reference Based LSTM for Image Captioning
Chen, Minghai (Tsinghua University) | Ding, Guiguang (Tsinghua University) | Zhao, Sicheng (Tsinghua University) | Chen, Hui (Tsinghua University) | Liu, Qiang (Tsinghua University) | Han, Jungong (Northumbria University)
Image captioning is an important problem in artificial intelligence, related to both computer vision and natural language processing. There are two main problems in existing methods: in the training phase, it is difficult to find which parts of the captions are more essential to the image; in the caption generation phase, the objects or the scenes are sometimes misrecognized. In this paper, we consider the training images as the references and propose a Reference based Long Short Term Memory (R-LSTM) model, aiming to solve both problems at once. When training the model, we assign different weights to different words, which enables the network to better learn the key information of the captions. When generating a caption, the consensus score is utilized to exploit the reference information of neighbor images, which might fix the misrecognition and make the descriptions more natural-sounding. The proposed R-LSTM model outperforms the state-of-the-art approaches on the benchmark dataset MS COCO and ranks in the top two on 11 of the 14 metrics on the online test server.
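As a rough illustration of how a consensus score might use neighbor images, the sketch below scores each candidate caption by its mean similarity to the reference captions of neighboring images and keeps the best one. The Jaccard token overlap is a stand-in chosen for simplicity, and the captions and function names are hypothetical; the abstract does not specify the similarity measure or how the score is combined with the model's own output.

```python
def jaccard(a, b):
    """Token-overlap similarity between two captions (stand-in metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def consensus_score(candidate, references):
    """Mean similarity between a candidate and the neighbors' reference captions.

    Assumes `references` is non-empty.
    """
    return sum(jaccard(candidate, r) for r in references) / len(references)


# Hypothetical reference captions taken from neighbor images.
references = ["a dog runs on the grass", "a dog plays on grass"]
# Hypothetical candidate captions produced by a captioning model.
candidates = ["a dog runs on grass", "a cat sits on a sofa"]

best = max(candidates, key=lambda c: consensus_score(c, references))
```

Here the candidate that agrees with the neighbors' captions wins, which is how a consensus score could correct a misrecognized object such as "cat" for "dog".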