Intelligence, physics and information -- the tradeoff between accuracy and simplicity in machine learning
How can we enable machines to make sense of the world, and become better at learning? I believe two perspectives are useful for approaching this goal: viewing intelligence in terms of many integral aspects, and recognizing a universal two-term tradeoff between task performance and complexity. In this thesis, I address several key questions in some aspects of intelligence, and study the phase transitions in the two-term tradeoff, using strategies and tools from physics and information. Firstly, how can we make the learning models more flexible and efficient, so that agents can learn quickly with fewer examples? Inspired by how physicists model the world, we introduce a paradigm and an AI Physicist agent for simultaneously learning many small specialized models (theories) and the domains in which they are accurate, which can then be simplified, unified and stored, facilitating few-shot learning in a continual way. Secondly, for representation learning, when can we learn a good representation, and how does learning depend on the structure of the dataset? We approach this question by studying phase transitions when tuning the tradeoff hyperparameter. In the information bottleneck, we theoretically show that these phase transitions are predictable and reveal structure in the relationships between the data, the model, the learned representation and the loss landscape. Thirdly, how can agents discover causality from observations? We address part of this question by introducing an algorithm that combines prediction and minimizing information from the input, for exploratory causal discovery from observational time series. Fourthly, to make models more robust to label noise, we introduce Rank Pruning, a robust algorithm for classification with noisy labels. I believe that building on the work of this thesis brings us one step closer to enabling more intelligent machines that can make sense of the world.
Deep Image Clustering with Tensor Kernels and Unsupervised Companion Objectives
Trosten, Daniel J., Kampffmeyer, Michael C., Jenssen, Robert
In this paper we develop a new model for deep image clustering, using convolutional neural networks and tensor kernels. The proposed Deep Tensor Kernel Clustering (DTKC) consists of a convolutional neural network (CNN), which is trained to reflect a common cluster structure at the output of its intermediate layers. Encouraging a consistent cluster structure throughout the network has the potential to guide it towards meaningful clusters, even though these clusters might appear to be nonlinear in the input space. The cluster structure is enforced through the idea of unsupervised companion objectives, where separate loss functions are attached to layers in the network. These unsupervised companion objectives are constructed based on a proposed generalization of the Cauchy-Schwarz (CS) divergence, from vectors to tensors of arbitrary rank. Generalizing the CS divergence to tensor-valued data is a crucial step, due to the tensorial nature of the intermediate representations in the CNN. Several experiments are conducted to thoroughly assess the performance of the proposed DTKC model. The results indicate that the model outperforms, or performs comparably to, a wide range of baseline algorithms. We also empirically demonstrate that our model does not suffer from objective function mismatch, which can be a problematic artifact in autoencoder-based clustering models.
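For orientation, the vector-valued Cauchy-Schwarz divergence that the abstract takes as its starting point can be sketched for two discrete distributions as follows. This is only a minimal illustration of the standard vector case, not the tensor generalization or the kernel-based estimator proposed in the paper; the function name `cs_divergence` is hypothetical.

```python
import math


def cs_divergence(p, q):
    """Cauchy-Schwarz divergence between two discrete distributions.

    Uses the common form D_CS(p, q) = -log(<p, q> / (||p|| * ||q||)).
    Illustrative helper only; not the paper's tensor-kernel estimator.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    inner = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if inner == 0.0:
        # Distributions with disjoint support are maximally divergent.
        return math.inf
    return -math.log(inner / (norm_p * norm_q))
```

The divergence is zero when the two distributions coincide and grows as their overlap shrinks, which is the property a clustering objective can exploit to push clusters apart.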
Exact Information Bottleneck with Invertible Neural Networks: Getting the Best of Discriminative and Generative Modeling
Ardizzone, Lynton, Mackowiak, Radek, Köthe, Ullrich, Rother, Carsten
The Information Bottleneck (IB) principle offers a unified approach to many learning and prediction problems. Although optimal in an information-theoretic sense, practical applications of IB are hampered by a lack of accurate high-dimensional estimators of mutual information, its main constituent. We propose to combine IB with invertible neural networks (INNs), which for the first time allows exact calculation of the required mutual information. Applied to classification, our proposed method results in a generative classifier we call IB-INN. It accurately models the class conditional likelihoods, generalizes well to unseen data and reliably recognizes out-of-distribution examples. In contrast to existing generative classifiers, these advantages incur only minor reductions in classification accuracy in comparison to corresponding discriminative methods such as feed-forward networks. Furthermore, we provide insight into why IB-INNs are superior to other generative architectures and training procedures and show experimentally that our method outperforms alternative models of comparable complexity.
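To make the role of mutual information concrete, the sketch below computes it exactly for a small discrete joint distribution and combines two such terms into the usual IB tradeoff, I(X;Z) - beta * I(Z;Y). It assumes the standard Lagrangian form of IB and toy probability tables; it does not reproduce the INN-based estimator of the paper, and the names `mutual_information` and `ib_objective` are hypothetical.

```python
import math


def mutual_information(joint):
    """I(X;Y) in nats for a discrete joint distribution given as a 2-D list."""
    px = [sum(row) for row in joint]        # marginal over rows
    py = [sum(col) for col in zip(*joint)]  # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:  # zero-probability cells contribute nothing
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


def ib_objective(joint_xz, joint_zy, beta):
    """Standard IB Lagrangian (to be minimized): I(X;Z) - beta * I(Z;Y)."""
    return mutual_information(joint_xz) - beta * mutual_information(joint_zy)
```

For tables this small the computation is exact and cheap; the difficulty the abstract refers to arises in high-dimensional continuous settings, where such a direct sum is not available.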
Unsupervised Sentiment Analysis for Code-mixed Data
Yadav, Siddharth, Chakraborty, Tanmoy
Code-mixing is the practice of alternating between two or more languages. It is mostly observed in multilingual societies, and its importance is growing as it becomes more common. Most sentiment analysis research has been monolingual, and most monolingual methods perform poorly on code-mixed text. In this work, we introduce methods that use different kinds of multilingual and cross-lingual embeddings to efficiently transfer knowledge from monolingual text to code-mixed text for sentiment analysis. Our methods handle code-mixed text through zero-shot learning. They beat the state of the art on English-Spanish code-mixed sentiment analysis by an absolute 3% F1-score. We achieve 0.58 F1-score (without a parallel corpus) and 0.62 F1-score (with a parallel corpus) on the same benchmark in a zero-shot way, compared to 0.68 F1-score in supervised settings. Our code is publicly available.
Measuring Diversity of Artificial Intelligence Conferences
Freire, Ana, Porcaro, Lorenzo, Gómez, Emilia
The lack of diversity in the Artificial Intelligence (AI) field is nowadays a concern, and several initiatives such as funding schemes and mentoring programs have been designed to fight against it. However, there is no indication of how these initiatives actually impact AI diversity in the short and long term. This work studies the concept of diversity in this particular context and proposes a small set of diversity indicators (i.e., indexes) of AI scientific events. These indicators are designed to quantify the lack of diversity of the AI field and monitor its evolution. We consider diversity in terms of gender, geographical location and business (understood as the presence of academia versus industry). We compute these indicators for the different communities of a conference: authors, keynote speakers and organizing committee. From these components we compute a summarized diversity indicator for each AI event. We evaluate the proposed indexes for a set of recent major AI conferences and we discuss their values and limitations.
Negative Statements Considered Useful
Arnaout, Hiba, Razniewski, Simon, Weikum, Gerhard
Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities.
Clearview AI: The company that might end privacy as we know it - ETtech
By Kashmir Hill. Until recently, Hoan Ton-That's greatest hit was an app that let people put Donald Trump's distinctive yellow hair on their own photos. Then Ton-That did something momentous: He invented a tool that could end your ability to walk down the street anonymously and provided it to hundreds of law enforcement agencies. His tiny company, Clearview AI, devised a groundbreaking facial recognition app. You take a picture of a person, upload it and get to see public photos of that person along with links to where those photos appeared.
Indian Ocean Dipole can be better predicted thru machine learning, say researchers
Researchers in Japan and The Netherlands have, for the first time, used machine learning techniques, in particular artificial neural networks (ANNs), to predict the Indian Ocean Dipole (IOD), a positive phase of which has affected weather and climate in India and Australia in a spectacular fashion so far in 2019-20. The IOD has both positive and negative phases and has large socio-economic impacts on many countries, so predicting it well in advance will benefit the affected societies, note authors JV Ratnam and Swadhin K Behera (Application Laboratory, Japan Agency for Marine-Earth Science and Technology, Yokohama) and HA Dijkstra (Institute for Marine and Atmospheric Research Utrecht, Utrecht University in The Netherlands) in a paper published by Nature. The IOD is a mode of climate variability observed in Indian Ocean sea surface temperature anomalies, with one pole off Sumatra (Indonesia) and the other near East Africa. The IOD is therefore represented by an index derived from the gradient between the western equatorial Indian Ocean and the south-eastern equatorial Indian Ocean. It starts sometime in May-June, peaks in September-October and ends in November (2019's rather strong positive phase of the IOD lasted into early January of 2020).
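The index described above can be illustrated with a few lines of code: it is the difference between the sea surface temperature anomaly of the western box and that of the south-eastern box. The sketch below is illustrative only. The function names, the sample anomaly values and the 0.4 degree threshold used to label a phase are assumptions, not figures taken from the article or the paper.

```python
def dipole_index(west_sst_anomaly, east_sst_anomaly):
    """West-minus-east difference of SST anomalies (degrees Celsius)."""
    return west_sst_anomaly - east_sst_anomaly


def iod_phase(index, threshold=0.4):
    """Label the phase for a given index value.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    if index >= threshold:
        return "positive"
    if index <= -threshold:
        return "negative"
    return "neutral"


# Made-up anomalies: a warm western box and a cool south-eastern box.
example_index = dipole_index(0.6, -0.5)
example_phase = iod_phase(example_index)
```

A warmer-than-normal west combined with a cooler-than-normal south-east yields a positive index, which corresponds to the positive phase mentioned in the article.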
Christmas gift ideas 2019: 20 great tech gifts for the whole family
Christmas is just around the corner, which means it's time to start planning your presents. Finding the perfect gift for your loved ones can be tricky, but don't worry, TechRadar is here to help you plan ahead. There's nothing like watching the people you care about erupt into smiles as they tear off your wrapping, and are greeted with a gift they actually love. So if you want to leave a lasting impression, the latest tech gadget can do just that. Technology is evolving so quickly that if you decided on a gizmo last year, there's always something new to choose from this year.
Short Text Classification via Term Graph
Short text classification is the task of assigning predefined labels to short sentences. However, the limited length of short text leads to the challenging problem of sparse features. Most existing methods treat each short sentence as independently and identically distributed (IID): they focus only on the local context within the sentence itself, and the relational information between sentences is lost. To overcome these limitations, we propose a PathWalk model that combines the strengths of graph networks and short sentences to address the sparseness of short text. Experimental results on four different available datasets show that our PathWalk method achieves state-of-the-art results, demonstrating the efficiency and robustness of graph networks for short text classification.