Oceania
Cryptographers Dismiss AI, Quantum Computing Threats
SAN FRANCISCO--Cryptographers said at the RSA Conference Tuesday they're skeptical that advances in quantum computing and artificial intelligence will profoundly transform computer security. "I'm skeptical there will be much of an impact," Ron Rivest, an MIT professor and inventor of several symmetric key encryption algorithms, said early at the annual Cryptographers' Panel here. Susan Landau, a professor who specializes in cybersecurity policy and computer science at Worcester Polytechnic Institute, said that while artificial intelligence can be helpful when it comes to processing lots of data effectively, she doesn't think it will be useful in figuring out serious attacks or anomalous situations. Adi Shamir, Borman Professor of Computer Science at the Weizmann Institute, said he was optimistic about AI's potential when it comes to defense--anything that involves finding deviations in behavior--but said he doubts it can ever be used in an offensive sense, such as in identifying zero days, something he said requires more ingenuity and originality. The discussion was steered by a report recently released by the Global Risk Institute on the emergence of quantum computing technologies.
10 things marketers need to know about AI
For years, marketing was considered more art than science. But more recently, as marketing automation software has proliferated, marketers have had to blend the art of storytelling with the science of data. Then along comes artificial intelligence (AI) and machine learning, which promise to help marketers make sense of all that data. Some experts believe AI's impact on marketing will be hugely significant, that it could even change the nature of marketing entirely -- enabling brands to break through the noise and deliver a more personalized experience to customers. Not surprisingly, though, there are challenges ahead for organizations seeking to add AI to their marketing technology stack.
Christopher Strachey's Nineteen-Fifties Love Machine
Overwrought love letters began turning up on the notice board at the University of Manchester's computer lab in August, 1953. Dripping with lustful vocabulary, they were all variations on a basic syntactic template: "YOU ARE MY [adjective] [noun]." And the signatory was always the same: "M.U.C.," for the Manchester University computer, a Ferranti Mark 1, the world's first general-purpose and commercially available machine of its kind. But the real author of the letters (in the first instance, anyway) was Christopher Strachey, a pioneering programmer. As he confessed in an article the following year, "There are many obvious imperfections in this scheme (indeed very little thought went into its devising), and the fact that the vocabulary was largely based on Roget's Thesaurus lends a very peculiar flavor to the results." For Strachey, though, the interesting thing was how a simple setup, using only about seventy base words, could produce a combinatorial explosion of results--on the order of three hundred billion different letters. The lovelorn user could run the program over and over until his fingers seized up, and never see the same letter twice. Strachey was something of an outlier, according to Martin Campbell-Kelly, a historian of computing at the University of Warwick. While scientists and mathematicians of the day typically used computers strictly for numerical calculations, like analyzing weapons trajectories or seeking prime factors of huge numbers, his fascination was with non-numerical computations--what soon became known as artificial intelligence. "Strachey grabbed hold of that much more than anybody else," Campbell-Kelly told me. The results were not always lovey-dovey. Besides training the Mark 1 to churn out billets-doux, he also taught it to play checkers ("draughts," in British parlance). If M.U.C.'s opponent made too many mistakes, it would get crotchety and print out a reprimand: "I refuse to waste any more time.
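The combinatorial growth described here can be sketched in a few lines of Python. This is only an illustration of template filling: the word lists are made up for the example and are not Strachey's vocabulary, and the function names `love_line` and `count_letters` are hypothetical.

```python
import random

# Hypothetical word lists for illustration; not Strachey's actual vocabulary.
ADJECTIVES = ["DARLING", "PRECIOUS", "TENDER", "WISTFUL"]
NOUNS = ["AFFECTION", "DEVOTION", "FANCY", "LONGING"]


def love_line(rng):
    """Fill the template 'YOU ARE MY [adjective] [noun].' with random words."""
    return f"YOU ARE MY {rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}."


def count_letters(lines_per_letter):
    """Count distinct letters built from independently filled template lines."""
    per_line = len(ADJECTIVES) * len(NOUNS)
    return per_line ** lines_per_letter


if __name__ == "__main__":
    rng = random.Random(1953)
    print(love_line(rng))
    # Four adjectives and four nouns give 16 lines; five such lines per letter
    # already allow 16**5 distinct letters.
    print(count_letters(5))
```

Even with eight words, the count grows exponentially in the number of template slots, which is the effect the article attributes to the much larger seventy-word setup.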
Multi-View Correlated Feature Learning by Uncovering Shared Component
Xue, Xiaowei (Zhejiang University) | Nie, Feiping (Northwestern Polytechnical University) | Wang, Sen (Griffith University) | Chang, Xiaojun (University of Technology Sydney) | Stantic, Bela (Griffith University) | Yao, Min (Zhejiang University)
Learning multiple heterogeneous features from different data sources is challenging. One research topic is how to exploit and utilize the correlations among various features across multiple views with the aim of improving the performance of learning tasks, such as classification. In this paper, we propose a new multi-view feature learning algorithm that simultaneously analyzes features from different views. Compared to most of the existing subspace learning methods that only focus on exploiting a shared latent subspace, our algorithm not only learns individual information in each view but also captures feature correlations among multiple views by learning a shared component. By assuming that such a component is shared by all views, we simultaneously exploit the shared component and individual information of each view in a batch mode. Since the objective function is non-smooth and difficult to solve, we propose an efficient iterative algorithm for optimization with guaranteed convergence. Extensive experiments are conducted on several benchmark datasets. The results demonstrate that our proposed algorithm performs better than all the compared multi-view learning algorithms.
Improving Efficiency of SVM k-Fold Cross-Validation by Alpha Seeding
Wen, Zeyi (The University of Melbourne) | Li, Bin (South China University of Technology) | Kotagiri, Ramamohanarao (The University of Melbourne) | Chen, Jian (South China University of Technology) | Chen, Yawen (South China University of Technology) | Zhang, Rui (The University of Melbourne)
The k-fold cross-validation is commonly used to evaluate the effectiveness of SVMs with the selected hyper-parameters. It is known that the SVM k-fold cross-validation is expensive, since it requires training k SVMs. However, little work has explored reusing the h-th SVM for training the (h+1)-th SVM for improving the efficiency of k-fold cross-validation. In this paper, we propose three algorithms that reuse the h-th SVM for improving the efficiency of training the (h+1)-th SVM. Our key idea is to efficiently identify the support vectors and to accurately estimate their associated weights (also called alpha values) of the next SVM by using the previous SVM. Our experimental results show that our algorithms are several times faster than the k-fold cross-validation which does not make use of the previously trained SVM. Moreover, our algorithms produce the same results (hence same accuracy) as the k-fold cross-validation which does not make use of the previously trained SVM.
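The general idea of alpha seeding can be illustrated with a small, self-contained sketch. It is not one of the paper's three algorithms, which the abstract does not describe. It assumes a bias-free linear SVM trained by dual coordinate descent, where any alpha vector in [0, C] is feasible, so alphas of instances shared with the previous fold can simply be carried over and new instances start at zero. A standard SVM with a bias term adds an equality constraint on the alphas that a seeding scheme would have to keep satisfied. All names and the toy data below are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_dcd(X, y, C=1.0, alpha0=None, tol=1e-6, max_epochs=1000):
    """Dual coordinate descent for a bias-free linear SVM with hinge loss.

    alpha0 optionally seeds the dual variables; returns (alpha, w, epochs).
    """
    n, d = len(X), len(X[0])
    alpha = list(alpha0) if alpha0 is not None else [0.0] * n
    # Rebuild the primal weights from the starting alphas: w = sum_i alpha_i y_i x_i.
    w = [0.0] * d
    for a, label, x in zip(alpha, y, X):
        for j in range(d):
            w[j] += a * label * x[j]
    sq_norms = [dot(x, x) for x in X]
    for epoch in range(1, max_epochs + 1):
        max_violation = 0.0
        for i in range(n):
            if sq_norms[i] == 0.0:
                continue
            grad = y[i] * dot(w, X[i]) - 1.0
            # Projected gradient: ignore directions that would leave [0, C].
            if alpha[i] <= 0.0:
                proj = min(grad, 0.0)
            elif alpha[i] >= C:
                proj = max(grad, 0.0)
            else:
                proj = grad
            max_violation = max(max_violation, abs(proj))
            if proj != 0.0:
                new_alpha = min(max(alpha[i] - grad / sq_norms[i], 0.0), C)
                step = (new_alpha - alpha[i]) * y[i]
                for j in range(d):
                    w[j] += step * X[i][j]
                alpha[i] = new_alpha
        if max_violation < tol:
            return alpha, w, epoch
    return alpha, w, max_epochs


def cross_validate(X, y, k, C=1.0, seed=True):
    """k-fold CV; with seed=True each fold starts from the previous fold's alphas."""
    n = len(X)
    prev_alpha = {}  # instance index -> alpha value from the previous fold
    correct, total_epochs = 0, 0
    for h in range(k):
        train = [i for i in range(n) if i % k != h]
        test = [i for i in range(n) if i % k == h]
        # Shared instances reuse their old alpha; instances new to this fold get 0.
        alpha0 = [prev_alpha.get(i, 0.0) for i in train] if seed else None
        alpha, w, epochs = train_dcd(
            [X[i] for i in train], [y[i] for i in train], C, alpha0
        )
        prev_alpha = dict(zip(train, alpha))
        total_epochs += epochs
        correct += sum(1 for i in test if y[i] * dot(w, X[i]) > 0)
    return correct / n, total_epochs


# Toy, linearly separable data (made up for the example).
X = [(2, 1), (3, 2), (2, 3), (4, 1), (-1, -2), (-2, -3), (-3, -1), (-1, -4)]
Y = [1, 1, 1, 1, -1, -1, -1, -1]

if __name__ == "__main__":
    print("cold start :", cross_validate(X, Y, k=4, seed=False))
    print("alpha seeds:", cross_validate(X, Y, k=4, seed=True))
```

Both runs solve the same convex problem per fold, so they should agree on accuracy; the second value in each tuple is the total number of solver epochs, which is where any saving from seeding would show up.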
Unbiased Multivariate Correlation Analysis
Wang, Yisen (Tsinghua University) | Romano, Simone (University of Melbourne) | Nguyen, Vinh (University of Melbourne) | Bailey, James (University of Melbourne) | Ma, Xingjun (University of Melbourne) | Xia, Shu-Tao (Tsinghua University)
Correlation measures are a key element of statistics and machine learning, and essential for a wide range of data analysis tasks. Most existing correlation measures are for pairwise relationships, but real-world data can also exhibit complex multivariate correlations, involving three or more variables. We argue that multivariate correlation measures should be comparable, interpretable, scalable and unbiased. However, no existing measures satisfy all these requirements. In this paper, we propose an unbiased multivariate correlation measure, called UMC, which satisfies all the above criteria. UMC is a cumulative entropy based non-parametric multivariate correlation measure, which can capture both linear and non-linear correlations for groups of three or more variables. It employs a correction for chance using a statistical model of independence to address the issue of bias. UMC has high interpretability and we empirically show it outperforms state-of-the-art multivariate correlation measures in terms of statistical power, as well as for use in both subspace clustering and outlier detection tasks.
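For readers unfamiliar with cumulative entropy, the following sketch computes the usual empirical estimate for a single variable, which replaces the density in Shannon's differential entropy with the cumulative distribution function. It only illustrates the building block named in the abstract; UMC's multivariate construction and its correction for chance are not reproduced here, and the function name is hypothetical.

```python
import math


def cumulative_entropy(samples):
    """Empirical cumulative entropy: -sum (x[i+1] - x[i]) * F(x[i]) * log F(x[i]).

    F is the empirical CDF, which equals i/n just after the i-th smallest sample.
    """
    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for i in range(1, n):
        p = i / n  # empirical CDF on the interval [xs[i-1], xs[i])
        total += (xs[i] - xs[i - 1]) * p * math.log(p)
    return -total


if __name__ == "__main__":
    print(cumulative_entropy([0.0, 1.0]))       # two points, one gap
    print(cumulative_entropy([3.0, 3.0, 3.0]))  # constant data has no spread
```

The estimate is non-negative and scales with the spread of the data, since every term weights a gap between neighbouring samples.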
Cost-Sensitive Feature Selection via F-Measure Optimization Reduction
Liu, Meng (Peking University) | Xu, Chang (University of Technology, Sydney) | Luo, Yong (Nanyang Technological University) | Xu, Chao (Peking University) | Wen, Yonggang (Nanyang Technological University) | Tao, Dacheng (University of Technology, Sydney)
Feature selection aims to select a small subset from the high-dimensional features which can lead to better learning performance, lower computational complexity, and better model readability. The class imbalance problem has been neglected by traditional feature selection methods, so the selected features will be biased towards the majority classes. Because of the superiority of F-measure to accuracy for imbalanced data, we propose to use F-measure as the performance measure for feature selection algorithms. Because F-measure is a pseudo-linear function, its optimization can be achieved by minimizing the total costs. In this paper, we present a novel cost-sensitive feature selection (CSFS) method which optimizes F-measure instead of accuracy to take the class imbalance issue into account. The features will be selected according to the optimal F-measure classifier after solving a series of cost-sensitive feature selection sub-problems. The features selected by our method will fully represent the characteristics of not only majority classes, but also minority classes. Extensive experimental results conducted on synthetic, multi-class and multi-label datasets validate the efficiency and significance of our feature selection method.
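A small numeric sketch shows why accuracy can mislead on imbalanced data while F-measure does not. It uses the standard F-beta definition written as a ratio of two linear functions of the confusion counts, which is the sense in which F-measure is pseudo-linear. The counts are invented for illustration, and the sketch says nothing about the CSFS method itself.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all instances classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f_measure(tp, fp, fn, beta=1.0):
    """F-beta as a ratio of linear functions of the confusion counts."""
    b2 = beta ** 2
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Invented example: 10 positives among 1000 instances.
    # Classifier A predicts everything negative.
    print(accuracy(tp=0, fp=0, fn=10, tn=990), f_measure(tp=0, fp=0, fn=10))
    # Classifier B finds half the positives at the cost of five false alarms.
    print(accuracy(tp=5, fp=5, fn=5, tn=985), f_measure(tp=5, fp=5, fn=5))
```

Both classifiers reach 99% accuracy, but only the second has a non-zero F1, because true negatives do not enter the F-measure at all.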
Robust Manifold Matrix Factorization for Joint Clustering and Feature Extraction
Zhang, Lefei (Wuhan University) | Zhang, Qian (Alibaba Group) | Du, Bo (Wuhan University) | Tao, Dacheng (University of Technology Sydney) | You, Jane (The Hong Kong Polytechnic University)
Low-rank matrix approximation has been widely used for data subspace clustering and feature representation in many computer vision and pattern recognition applications. However, in order to enhance the discriminability, most of the matrix approximation based feature extraction algorithms usually generate the cluster labels by a certain clustering algorithm (e.g., k-means) and then perform the matrix approximation guided by such label information. In addition, the noise and outliers in the dataset with large reconstruction errors will easily dominate the objective function under the conventional ℓ2-norm based squared residue minimization. In this paper, we propose a novel clustering and feature extraction algorithm based on a unified low-rank matrix factorization framework, which suggests that the observed data matrix can be approximated by the product of the projection matrix and the low-dimensional representation, among which the low-dimensional representation can be approximated by the cluster indicator and latent feature matrix simultaneously. Furthermore, we have proposed using the ℓ2,1-norm and integrating the manifold regularization to further promote the proposed model. A novel Augmented Lagrangian Method (ALM) based procedure is designed to effectively and efficiently seek the optimal solution of the problem. The experimental results in both clustering and feature extraction perspectives demonstrate the superior performance of the proposed method.
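The robustness argument can be made concrete with a toy residual matrix. The sketch below compares the squared ℓ2 (Frobenius) loss with the ℓ2,1-norm, assuming one sample per row so that the ℓ2,1-norm is the sum of per-row Euclidean norms; the opposite layout would sum over columns instead. The residual values are invented, and nothing here reproduces the paper's factorization model.

```python
import math


def squared_frobenius(residual):
    """Sum of squared entries: large residuals are squared before summing."""
    return sum(value * value for row in residual for value in row)


def l21_norm(residual):
    """Sum of Euclidean norms of the rows: each sample contributes linearly."""
    return sum(math.sqrt(sum(value * value for value in row)) for row in residual)


if __name__ == "__main__":
    # Two well-fitted samples and one outlier (invented values).
    residual = [[1.0, 0.0], [0.0, 1.0], [30.0, 40.0]]
    print(squared_frobenius(residual))  # outlier contributes 2500 of the total
    print(l21_norm(residual))           # outlier contributes 50 of the total
```

Under the squared loss the outlier outweighs each inlier by a factor of 2500, under the ℓ2,1-norm by a factor of 50, which is why the latter is less easily dominated by a few bad samples.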
Event Video Mashup: From Hundreds of Videos to Minutes of Skeleton
Gao, Lianli (University of Electronic Science and Technology of China) | Wang, Peng (The University of Queensland) | Song, Jingkuan (Columbia University) | Huang, Zi (The University of Queensland) | Shao, Jie (University of Electronic Science and Technology of China) | Shen, Heng Tao (University of Electronic Science and Technology of China)
The explosive growth of video content on the Web has been revolutionizing the way people share, exchange and perceive information, such as events. While an individual video usually concerns a specific aspect of an event, the videos that are uploaded by different users at different locations and times can embody different emphases and complement each other in describing the event. Combining these videos from different sources together can unveil a more complete picture of the event. Simply concatenating videos together is an intuitive solution, but it may degrade user experience since it is time-consuming and tedious to view such highly redundant, noisy and disorganized content. Therefore, we develop a novel approach, termed event video mashup (EVM), to automatically generate a unified short video from a collection of Web videos to describe the storyline of an event. We propose a submodular-based content selection model that embodies both importance and diversity to depict the event from comprehensive aspects in an efficient way. Importantly, the video content is organized temporally and semantically conforming to the event evolution. We evaluate our approach on a real-world YouTube event dataset collected by ourselves. The extensive experimental results demonstrate the effectiveness of the proposed framework.
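To show how a submodular objective can trade importance against diversity, here is a minimal greedy selection sketch. The abstract does not give the paper's objective, so this uses an assumed stand-in: the sum of per-segment importance scores plus a weighted count of distinct topics covered, which is monotone submodular. Segment names, scores and topics are hypothetical.

```python
def objective(chosen, diversity_weight):
    """Importance of the chosen segments plus a reward per distinct topic covered."""
    importance = sum(score for _, score, _ in chosen)
    covered = set().union(*(topics for _, _, topics in chosen))
    return importance + diversity_weight * len(covered)


def greedy_select(segments, budget, diversity_weight=1.0):
    """Repeatedly add the segment with the largest marginal gain."""
    chosen = []
    remaining = list(segments)
    while remaining and len(chosen) < budget:
        base = objective(chosen, diversity_weight)
        best = max(
            remaining,
            key=lambda seg: objective(chosen + [seg], diversity_weight) - base,
        )
        chosen.append(best)
        remaining.remove(best)
    return [name for name, _, _ in chosen]


# (name, importance score, topics shown) -- invented values for illustration.
SEGMENTS = [
    ("a", 0.9, {"stage"}),
    ("b", 0.8, {"stage"}),
    ("c", 0.5, {"crowd"}),
    ("d", 0.4, {"backstage", "crowd"}),
]

if __name__ == "__main__":
    print(greedy_select(SEGMENTS, budget=2, diversity_weight=1.0))
    print(greedy_select(SEGMENTS, budget=2, diversity_weight=0.0))
```

With the diversity term switched off, the greedy pass picks the two highest-scoring but redundant segments; with it on, a lower-scoring segment that covers new topics is chosen first.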
Diagnosability Planning for Controllable Discrete Event Systems
Ibrahim, Hassan (LRI, Univ. Paris-Sud and CNRS, Univ. Paris-Saclay) | Dague, Philippe (LRI, Univ. Paris-Sud and CNRS, Univ. Paris-Saclay) | Grastien, Alban (Data61 and Australian National University) | Ye, Lina (LRI, Univ. Paris-Sud and CNRS, Univ. Paris-Saclay) | Simon, Laurent (LaBRI, Univ. Bordeaux and CNRS)
In this paper, we propose an approach to ensure the diagnosability of a partially controllable system. Given a model of correct and faulty behaviors of a partially observable discrete event system, equipped with a set of elementary actions that do not intertwine with autonomous events, we search for a diagnosability plan, i.e., a sequence of applicable actions that leads the system from an initial belief state (a set of potentially current states) to a diagnosable belief state, in which the system is then left to run freely. This helps in reducing the diagnosis interaction with running systems and can be applied, e.g., on the output of a repair plan, like in power networks. The two successive stages of this approach keep diagnosability planning, including diagnosability tests, in PSpace, in comparison to the Exptime test for the more complex active diagnosability usually used in such cases. For this, we propose to incrementally construct the twin plant structure of the given system and to exploit its parts already constructed while testing the candidate plans and constructing its next parts. This helps in pruning the twin plant constructions and many non-diagnosability plan tests. We have created a special benchmark and tested three proposed methods, according to the recycling level of twin plants construction, with one cost function used for plan optimality and an optional heuristic.