Oceania
Natural Language Processing for Information Extraction
With the rise of the digital age, there has been an explosion of information in the form of news, articles, social media, and so on. Much of this data is unstructured, and managing and making effective use of it manually is tedious and labor intensive. This explosion of information, together with the need for more sophisticated and efficient information-handling tools, has given rise to Information Extraction (IE) and Information Retrieval (IR) technology. Information Extraction systems take natural language text as input and produce structured information, specified by certain criteria, that is relevant to a particular application. Various sub-tasks of IE, such as Named Entity Recognition, Coreference Resolution, Named Entity Linking, Relation Extraction, and Knowledge Base reasoning, form the building blocks of high-end Natural Language Processing (NLP) tasks such as Machine Translation, Question Answering, Natural Language Understanding, and Text Summarization, and of digital assistants like Siri, Cortana and Google Now. This paper introduces Information Extraction technology and its sub-tasks, and highlights state-of-the-art research in those sub-tasks, current challenges, and future research directions.
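As a toy illustration of the "unstructured text in, structured records out" idea described above, the sketch below extracts one relation type with a hand-written pattern. It is not taken from the paper: the `founded_by` relation, the `extract_founders` helper, and the sample sentences are all hypothetical, and real IE systems typically rely on learned models rather than a single regular expression.

```python
import re

# Hypothetical pattern for a single relation: "<Organization> was founded by <Person>".
# Capitalised word sequences stand in for named entities (a crude assumption).
FOUNDED_BY = re.compile(
    r"(?P<org>[A-Z][\w&]*(?: [A-Z][\w&]*)*) was founded by "
    r"(?P<person>[A-Z]\w*(?: [A-Z]\w*)*)"
)


def extract_founders(text):
    """Return one structured record per 'was founded by' mention in text."""
    return [
        {
            "relation": "founded_by",
            "organization": match.group("org"),
            "person": match.group("person"),
        }
        for match in FOUNDED_BY.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        "Acme Robotics was founded by Jane Doe in 1999. "
        "Later, Globex was founded by John Smith."
    )
    for record in extract_founders(sample):
        print(record)
```

The output is a list of dictionaries, which is the kind of structured form that downstream tasks such as knowledge base population could consume.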
Temporal graph-based clustering for historical record linkage
Nanayakkara, Charini, Christen, Peter, Ranbaduge, Thilina
Research in the social sciences is increasingly based on large and complex data collections, where individual data sets from different domains are linked and integrated to allow advanced analytics. A popular type of data used in such a context is historical censuses, as well as birth, death, and marriage certificates. Individually, however, such data sets limit the types of studies that can be conducted. Specifically, it is impossible to track individuals, families, or households over time. Once such data sets are linked and family trees spanning several decades are available, it is possible to investigate, for example, how education, health, mobility, employment, and social status influence each other and the lives of people over two or even more generations. A major challenge, however, is the accurate linkage of historical data sets, owing to data quality issues and, commonly, the lack of available ground truth data. Unsupervised techniques need to be employed, which can be based on similarity graphs generated by comparing individual records. In this paper we present initial results from clustering birth records from Scotland, where we aim to identify all births of the same mother and group siblings into clusters. We extend an existing clustering technique for record linkage by incorporating temporal constraints that must hold between births by the same mother, and propose a novel greedy temporal clustering technique. Experimental results show improvements over non-temporal approaches; however, further work is needed to obtain links of high quality.
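To make the idea of temporal constraints in clustering concrete, here is a minimal sketch of a greedy merge over a similarity graph. It is not the authors' algorithm: the gap thresholds (`MIN_GAP_DAYS`, `TWIN_GAP_DAYS`), the similarity cut-off `min_sim`, the function names, and the sample records are all assumptions made for illustration.

```python
from datetime import date
from itertools import product

MIN_GAP_DAYS = 270  # assumed minimum gap between two separate pregnancies
TWIN_GAP_DAYS = 1   # assumed maximum gap between births in a multiple birth


def temporally_compatible(d1, d2):
    """Two births can share a mother if they are near-simultaneous or far apart."""
    gap = abs((d1 - d2).days)
    return gap <= TWIN_GAP_DAYS or gap >= MIN_GAP_DAYS


def greedy_temporal_clustering(birth_dates, scored_pairs, min_sim=0.5):
    """Merge clusters along the most similar edges first, skipping any merge
    that would place two temporally incompatible births in one cluster."""
    cluster_of = {rec: {rec} for rec in birth_dates}
    for sim, a, b in sorted(scored_pairs, key=lambda p: p[0], reverse=True):
        if sim < min_sim:
            break  # remaining edges are too weak to consider
        ca, cb = cluster_of[a], cluster_of[b]
        if ca is cb:
            continue  # already in the same cluster
        if all(
            temporally_compatible(birth_dates[x], birth_dates[y])
            for x, y in product(ca, cb)
        ):
            merged = ca | cb
            for rec in merged:
                cluster_of[rec] = merged
    unique = {id(c): c for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique.values())


if __name__ == "__main__":
    births = {
        "b1": date(1870, 3, 1),
        "b2": date(1871, 6, 15),
        "b3": date(1870, 7, 1),   # only four months after b1
        "b4": date(1873, 1, 10),
    }
    pairs = [(0.9, "b1", "b2"), (0.85, "b1", "b3"), (0.8, "b2", "b4"), (0.3, "b3", "b4")]
    print(greedy_temporal_clustering(births, pairs))
    # → [['b1', 'b2', 'b4'], ['b3']]
```

In this example the high-similarity edge between `b1` and `b3` is rejected because the two births are about four months apart, which a purely similarity-based clustering would not catch.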
Causal Deep Information Bottleneck
Parbhoo, Sonali, Wieser, Mario, Roth, Volker
Estimating causal effects in the presence of latent confounding is a frequently occurring problem in several tasks. In real world applications such as medicine, accounting for the effects of latent confounding is even more challenging as a result of high-dimensional and noisy data. In this work, we propose estimating the causal effect from the perspective of the information bottleneck principle by explicitly identifying a low-dimensional representation of latent confounding. In doing so, we prove theoretically that the proposed model can be used to recover the average causal effect. Experiments on both synthetic data and existing causal benchmarks illustrate that our method achieves state-of-the-art performance in terms of prediction accuracy and sample efficiency, without sacrificing interpretability.
Fully Scalable Gaussian Processes using Subspace Inducing Inputs
Panos, Aristeidis, Dellaportas, Petros, Titsias, Michalis K.
We introduce fully scalable Gaussian processes, an implementation scheme that tackles the problem of treating a high number of training instances together with high-dimensional input data. Our key idea is a representation trick over the inducing variables called subspace inducing inputs. This is combined with certain matrix-preconditioning-based parametrizations of the variational distributions that lead to simplified and numerically stable variational lower bounds. Our illustrative applications are based on challenging extreme multi-label classification problems, with the extra burden of a very large number of class labels. We demonstrate the usefulness of our approach by presenting predictive performance together with low computational times on datasets with an extremely large number of instances and input dimensions.
Broad interests reap benefits for science
We asked young scientists this question: How do broad interests benefit your science? Scientists with a variety of hobbies responded that their extracurricular activities have enhanced a wide range of skills, from creativity to communication to resilience. Many also mentioned the value of clearing their minds and relaxing. Follow NextGen and share your own hobbies on Twitter with #NextGenSci. As a rock climber, you have to risk falling in order to become better; the same principle applies in science.
Tinder gets animated: New two-second 'Loops' profile pictures launched
Tinder is finally allowing users to animate their profile pictures. The dating app today confirmed its 'Loops' feature is available globally, after it was initially tested in Canada and Sweden. It allows two-second video loops to be uploaded. 'It all started with the swipe--that fun, simple movement that changed the way people meet,' Tinder said in a blog post announcing the new feature.
Australian airport begins passport-free biometric check-in trials
Qantas passengers who travel through Sydney Airport will be among the first groups of travelers to use facial recognition in automated check-ins, bag drop, lounge access and plane boarding. The system will ultimately allow officials to process travelers more quickly. Early trials, which provide a glimpse into a seamless, passport-free future, are currently underway, but their implementation is provoking mixed responses. Sydney Airport CEO Geoff Culbert is optimistic: "There will be no more juggling passports and bags at check-in and digging through pockets or smartphones to show your boarding pass -- your face will be your passport and your boarding pass at every step of the process". The biometrics system has also been endorsed by the Australian federal government, which promised to pump $22.5 million AUD ($16.6 million) into ensuring facial recognition technology is adopted across all Australian airports.
Sydney Uni harvests big data to boost crops
Embedding big data and machine learning into everyday farming could soon help Australia significantly boost its food production to meet growing demands without degrading soil and water quality or overusing fertilisers. That's the vision a University of Sydney research team, led by associate professor Thomas Bishop, will put to the test as they look for better ways to harvest information literally in the field by finding novel approaches to precision farming. The individual datapoints might be small, but the vision is large with agriculture taking up 56 percent of Australia's land surface and generating a wealth of data in the process - much of it currently unconnected. To get a much clearer picture of agricultural performance, researchers will investigate combining disparate data sets surrounding crop yield, weather, and management practices to better predict crop volumes and quality. This will guide researchers and the agricultural industry on how best to apply fertilisers to maximise grain output and quality, without increasing the amount of chemical runoff into waterways.
UNSW student designs system to improve IVF success rates
A final-year medicine student at the University of New South Wales (UNSW) has designed an artificial intelligence (AI) system that's helping women become pregnant using IVF techniques. UNSW said on Wednesday that Aengus Tran, 24, and his MBA-accredited brother Dimitry have founded Harrison-AI, an organisation that uses machine learning technology to improve the embryo selection process. The men were driven to start the company following a lecture from IVF Australia's scientific director, Dr Simon Cooke, who said that embryologists manually assess groups of embryos based on physical appearance at a limited number of critical development checkpoints. This is before they select the embryo they feel is most likely to result in a pregnancy, UNSW said. Tran has created a system called 'Ivy' that uses machine learning from thousands of previous successful and unsuccessful embryos to help make decisions about viability faster and better.