 Oceania


After China landed a probe on the dark side of the Moon in secret we must wake up to a threat

Daily Mail - Science & tech

When the Apollo 11 spacecraft was orbiting the Moon prior to the first lunar landing, Nasa officials told the astronauts on board to look out for the 'lovely girl with a big rabbit'. They were jokingly referring to a story from Chinese mythology in which the goddess Chang'e escapes Earth to live on the Moon with her pet, Jade Rabbit. This week, almost 50 years on from that 'giant leap for mankind', the legend of Chang'e resurfaced -- and this time the joke is on the Americans as China announced it had become the first nation to land a spacecraft on the 'dark side of the moon'. The robotic probe was named Chang'e 4, a product of China's £3.9 billion a year space exploration project. If ever there was a metaphor for the Communist super-power's obsessive secrecy and soaring global ambition, then this audacious secret mission provides it.


Population-Guided Large Margin Classifier for High-Dimension Low-Sample-Size Problems

arXiv.org Machine Learning

Various applications in different fields, such as gene expression analysis and computer vision, involve high-dimensional low-sample-size (HDLSS) data sets, which pose significant challenges for standard statistical and modern machine learning methods. In this paper, we propose a novel linear binary classifier, the population-guided large margin classifier (PGLMC), which is applicable to any sort of data, including HDLSS. PGLMC is built around a projecting direction w that is chosen by jointly considering the local structural information of the hyperplane and the statistics of the training samples. Our proposed model has several advantages over widely used approaches. First, it is not sensitive to the intercept term b. Second, it operates well with imbalanced data. Third, it is relatively simple to implement using quadratic programming. Fourth, it is robust to the model specification for various real applications. The theoretical properties of PGLMC are proven. We conduct a series of evaluations on two simulated and six real-world benchmark data sets, including DNA classification, digit recognition, medical image analysis, and face recognition. PGLMC outperforms the state-of-the-art classification methods in most cases, or at least obtains comparable results.


Dawn of the Robo-train: Autonomous railway is the largest robot in the world

Daily Mail - Science & tech

The world's largest robot has been unveiled, and it is a completely autonomous railway system. AutoHaul has been developed by a mining firm and is being used to transport iron ore from mines to shipping ports 500 miles (800 km) away in Western Australia. This journey can be completed in just 40 hours, including the loading and dumping of the ferrous cargo. Its deployment is the end result of a project which has so far cost $940 million (£740 million). Rio Tinto, the corporation that built the infrastructure and hardware for the locomotive, says this could be the first step in transforming the firm's 1,000-mile (1,700-kilometre) network connecting 16 iron ore mines and two ports.


Baraja raises $32 million for autonomous vehicle lidar with prism-like optics

#artificialintelligence

Lidar startup Baraja has raised $32 million in a series A round of funding led by Sequoia China and Main Sequence Ventures' CSIRO Innovation Fund, with participation from Blackbird Ventures. Founded out of Sydney, Australia, in 2015, Baraja is one of a number of lidar startups targeting the burgeoning driverless car industry with the necessary smarts to safely navigate busy thoroughfares without human intervention. Lidar technology essentially surveys the environment by beaming out laser-powered light to measure distances. Anyone who has observed the big players in the autonomous vehicle realm, such as Alphabet's Waymo, will have noticed the giant spinning lasers mounted atop the vehicle's roof that rotate to garner a comprehensive view of the environment. These are not only bulky, but expensive -- perhaps prohibitively expensive if self-driving cars are ever to hit mass production.


More Intelligent, More Intuitive, More Adaptive: Contact Center Software for A New Year

#artificialintelligence

Heading into 2019, year-end trendspotting is surfacing predictions for the future of nearly every form of technology, and after years of software disruption in the contact center, customer engagement and customer experience industry, one CEO of a fast-growing Denver-based startup is turning away from the usual predictions and sharing, instead, insights driven by years of developing sensitive software that can analyze in near real time "digital conversations" happening between brands and consumers. "We're in a new world now," said Joe Galvin, founder of CustomerView, a company supported by CR-X, a global software development firm based in Australia, and developer of the CustomerView big data analytics, AI, natural language processing and deep machine learning platform. "Our collective team here in the US, Australia and elsewhere are veterans of the telecom and customer service industries, and even as we have been working very closely with customers and partners over the last three years to build a capability to truly understand, record, transcribe, analyze and report on interactions the landscape has become more dynamic and exciting than we originally believed." With technology giants like Amazon, Google and Microsoft investing more in the space, Galvin believes the bar is being raised and that any successful brand – banks, insurance companies, travel companies, healthcare providers, consumer electronics innovators and more – will need to "lead with speed and service, giving this and coming generations who will expect immediate and accurate answers, and efficient and friendly support, through the channel of their choosing." 
Galvin sees the affordability and availability of more scalable and less expensive technologies (including AWS and other cloud application providers) as key to unlocking innovation which will not only drive adoption but change the compliance landscape in ways we could never have dreamed of in the premise, then hybrid, and now nearly pure-cloud world.


An Adaptive Weighted Deep Forest Classifier

arXiv.org Machine Learning

A modification of the confidence screening mechanism based on adaptive weighting of every training instance at each cascade level of the Deep Forest is proposed. The idea underlying the modification is very simple and stems from the confidence screening mechanism proposed by Pang et al. to simplify the Deep Forest classifier by updating the training set at each level in accordance with the classification accuracy of every training instance. However, whereas the confidence screening mechanism just removes instances from the training and testing processes, the proposed modification is more flexible and assigns weights by taking into account the classification accuracy. The modification is similar to AdaBoost to some extent. Numerical experiments illustrate good performance of the proposed modification in comparison with the original Deep Forest proposed by Zhou and Feng.
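The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of reweighting instances by how well they are currently classified. The function name `update_instance_weights`, the `1 - p` rule, and the `floor` parameter are all illustrative assumptions, not the authors' method.

```python
from typing import List, Sequence


def update_instance_weights(true_class_probs: Sequence[float],
                            floor: float = 0.05) -> List[float]:
    """Hypothetical reweighting rule for one cascade level.

    true_class_probs[i] is the probability the current level assigns to the
    correct class of training instance i. Poorly classified instances get
    larger weights; confidently classified ones keep a small, non-zero
    weight (the floor) instead of being removed outright.
    """
    if not true_class_probs:
        raise ValueError("need at least one instance")
    if any(p < 0.0 or p > 1.0 for p in true_class_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    # Assumed rule: raw weight grows as the true-class probability shrinks.
    raw = [max(1.0 - p, floor) for p in true_class_probs]
    total = sum(raw)
    # Normalise so the weights sum to one.
    return [w / total for w in raw]


if __name__ == "__main__":
    print(update_instance_weights([0.9, 0.5, 0.1]))
```

The floor is what distinguishes this kind of soft weighting from hard screening: an instance the model already classifies confidently still contributes a little to the next level rather than being dropped.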


Learning to Remember More with Less Memorization

arXiv.org Machine Learning

Memory-augmented neural networks consisting of a neural controller and an external memory have shown potential in long-term sequential learning. Current RAM-like memory models access memory at every timestep, and thus do not effectively leverage the short-term memory held in the controller. We hypothesize that this scheme of writing is suboptimal in memory utilization and introduces redundant computation. To validate our hypothesis, we derive a theoretical bound on the amount of information stored in a RAM-like system and formulate an optimization problem that maximizes the bound. The proposed solution, dubbed Uniform Writing, is proved to be optimal under the assumption of equal timestep contributions. To relax this assumption, we introduce modifications to the original solution, resulting in a solution termed Cached Uniform Writing. This method aims to balance between maximizing memorization and forgetting via overwriting mechanisms. Through an extensive set of experiments, we empirically demonstrate the advantages of our solutions over other recurrent architectures, achieving state-of-the-art results in various sequential modeling tasks.
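To make the contrast with per-timestep writing concrete, here is a small sketch of a write schedule that touches memory only at evenly spaced timesteps. The exact interval the authors derive is not stated in the abstract; the function name `uniform_write_steps` and the choice of dividing the sequence evenly among the memory slots are assumptions made for illustration.

```python
from typing import List


def uniform_write_steps(seq_len: int, num_slots: int) -> List[int]:
    """Return 1-indexed timesteps at which a write would occur.

    Assumption: writes are spaced seq_len // num_slots steps apart, so the
    controller carries information in its own state between writes.
    """
    if seq_len <= 0 or num_slots <= 0:
        raise ValueError("seq_len and num_slots must be positive")
    if num_slots >= seq_len:
        # Enough slots to write at every step (the conventional scheme).
        return list(range(1, seq_len + 1))
    interval = seq_len // num_slots
    return [interval * k for k in range(1, num_slots + 1)]


if __name__ == "__main__":
    print(uniform_write_steps(12, 4))
```

With 12 timesteps and 4 slots this schedule performs 4 writes instead of 12, which is the kind of saving in memory access the abstract alludes to.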


Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

arXiv.org Machine Learning

We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed "identities", represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data allow us to generate "identity-based", rather than "identifier-based", user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of "finite-sample bias" that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers having little data themselves, but that can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy.
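For readers unfamiliar with inverse propensity weighting, the sketch below shows the standard estimator of a target policy's average reward from logged data. The optional weight clipping is one common way to tame large weights when events are rare; whether the authors handle their weights this way is not stated in the abstract, and the function name `ips_estimate` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def ips_estimate(rewards: Sequence[float],
                 target_probs: Sequence[float],
                 logging_probs: Sequence[float],
                 clip: Optional[float] = None) -> float:
    """Inverse-propensity-weighted estimate of a target policy's mean reward.

    rewards[i]       observed outcome for logged decision i (e.g. 1 = conversion)
    target_probs[i]  probability the policy under evaluation takes that decision
    logging_probs[i] probability the logging policy took it (must be > 0)
    clip             optional cap on each weight (an assumption of this sketch)
    """
    n = len(rewards)
    if n == 0 or n != len(target_probs) or n != len(logging_probs):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for r, p_target, p_log in zip(rewards, target_probs, logging_probs):
        if p_log <= 0.0:
            raise ValueError("logging probabilities must be positive")
        weight = p_target / p_log
        if clip is not None:
            weight = min(weight, clip)
        total += weight * r
    return total / n


if __name__ == "__main__":
    rewards = [1, 0, 1, 0]
    target = [0.5, 0.5, 0.5, 0.5]
    logged = [0.25, 0.5, 0.5, 0.25]
    print(ips_estimate(rewards, target, logged))
    print(ips_estimate(rewards, target, logged, clip=1.5))
```

Clipping lowers variance at the price of bias, so the cap is a tuning choice rather than a fixed constant.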


Learning Graph Embedding with Adversarial Training Methods

arXiv.org Machine Learning

Graph embedding aims to transfer a graph into vectors to facilitate subsequent graph analytics tasks like link prediction and graph clustering. Most approaches to graph embedding focus on preserving the graph structure or minimizing the reconstruction errors for graph data. They have mostly overlooked the embedding distribution of the latent codes, which unfortunately may lead to inferior representation in many cases. In this paper, we present a novel adversarially regularized framework for graph embedding. By employing the graph convolutional network as an encoder, our framework embeds the topological information and node content into a vector representation, from which a graph decoder is further built to reconstruct the input graph. The adversarial training principle is applied to force our latent codes to match a prior Gaussian or Uniform distribution. Based on this framework, we derive two variants of adversarial models, the adversarially regularized graph autoencoder (ARGA) and its variational version, adversarially regularized variational graph autoencoder (ARVGA), to learn the graph embedding effectively. We also exploit other potential variations of ARGA and ARVGA to get a deeper understanding of our designs. Experimental results compared among twelve algorithms for link prediction and twenty algorithms for graph clustering validate our solutions.


Mapping Informal Settlements in Developing Countries using Machine Learning and Low Resolution Multi-spectral Data

arXiv.org Machine Learning

Informal settlements are home to the most socially and economically vulnerable people on the planet. In order to deliver effective economic and social aid, non-government organizations (NGOs), such as the United Nations Children's Fund (UNICEF), require detailed maps of the locations of informal settlements. However, data regarding informal and formal settlements is largely unavailable and, where available, is often incomplete. This is due, in part, to the cost and complexity of gathering data on a large scale. An additional complication is that the definition of an informal settlement is also very broad, which makes it a non-trivial task to collect data. This also makes it challenging to teach a machine what to look for. Due to these challenges we provide three contributions in this work. 1) A brand new machine learning data-set, purposely developed for informal settlement detection, that contains a series of low and very-high resolution imagery, with accompanying ground truth annotations marking the locations of known informal settlements. 2) We demonstrate that it is possible to detect informal settlements using freely available low-resolution (LR) data, in contrast to previous studies that use very-high resolution (VHR) satellite and aerial imagery, which is typically cost-prohibitive for NGOs. 3) We demonstrate two effective classification schemes on our curated data set, one that is cost-efficient for NGOs and another that is cost-prohibitive for NGOs, but has additional utility. We integrate these schemes into a semi-automated pipeline that converts either an LR or VHR satellite image into a binary map that encodes the locations of informal settlements. We evaluate and compare our methods.