South America
Generating Representative Headlines for News Stories
Gu, Xiaotao, Mao, Yuning, Han, Jiawei, Liu, Jialu, Yu, Hongkun, Wu, You, Yu, Cong, Finnie, Daniel, Zhai, Jiaqi, Zukoski, Nicholas
Millions of news articles are published online every day, which can be overwhelming for readers to follow. Grouping articles that report the same event into news stories is a common way of assisting readers in their news consumption. However, it remains a challenging research problem to efficiently and effectively generate a representative headline for each story. Automatic summarization of a document set has been studied for decades, while few studies have focused on generating representative headlines for a set of articles. Unlike summaries, which aim to capture the most information with the least redundancy, headlines aim to capture, in a short length, the information jointly shared by the story articles, and to exclude information that is too specific to any individual article. In this work, we study the problem of generating representative headlines for news stories. We develop a distant supervision approach to train large-scale generation models without any human annotation. This approach centers on two technical components. First, we propose a multi-level pre-training framework that incorporates massive unlabeled corpora with a different quality-vs.-quantity balance at each level. We show that models trained within this framework outperform those trained on a purely human-curated corpus. Second, we propose a novel self-voting-based article attention layer to extract salient information shared by multiple articles. We show that models that incorporate this layer are robust to potential noise in news stories and outperform existing baselines with or without noise. We can further enhance our model by incorporating human labels, and we show that our distant supervision approach significantly reduces the demand for labeled data.
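The abstract does not specify how the self-voting article attention layer is computed, so the following is only a minimal sketch of the general idea: each article "votes" for the others through a similarity score, and articles that receive few votes (for example, an off-topic article that was grouped into the story by mistake) end up with a low attention weight. The function name `self_voting_weights`, the dot-product similarity, and the toy vectors are all assumptions made for illustration, not the authors' method.

```python
import math


def self_voting_weights(article_vecs):
    """Return one attention weight per article from mutual dot-product votes.

    Hypothetical helper: the similarity measure and averaging are assumptions.
    """
    n = len(article_vecs)
    if n == 1:
        return [1.0]
    scores = []
    for j in range(n):
        # Article j's score is the average vote (dot product) cast by the others.
        votes = [
            sum(a * b for a, b in zip(article_vecs[i], article_vecs[j]))
            for i in range(n)
            if i != j
        ]
        scores.append(sum(votes) / len(votes))
    # Softmax over the scores, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two articles about the same event and one unrelated (noisy) article.
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights = self_voting_weights(vecs)
```

In this toy input the third vector shares little with the other two, so it receives the smallest weight, which is the kind of robustness to noisy stories the abstract describes.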
"Hey, Update My Voice" Exposes Cyber Harassment.
The "Hey, Update My Voice" movement, in partnership with UNESCO, was born out of this context with the goal of teaching respect towards virtual assistants and, in addition, asking tech companies to update their assistants' responses. Because if that happens to them, imagine what happens in real life to real women. Every day around the world, virtual assistants suffer abuse and harassment of all kinds. In Brazil, for example, Lu, the virtual assistant of Magazine Luiza stores, has been victimized by this sort of violence. Worldwide, cases have been reported involving Siri and Alexa, among others.
London Cops Will Use Facial Recognition to Hunt Suspects
There will soon be a new bobby on the beat in London: artificial intelligence. London's Metropolitan Police said Friday that it will deploy facial recognition technology to find wanted criminals and missing persons. It said the technology will be deployed at "specific locations," each with a "bespoke watch list" of wanted persons, mostly violent offenders. However, a spokesperson was unable to specify how many facial recognition systems will be used, where, or how frequently. The Met said use of the technology would be publicized beforehand and marked by signs on site.
Multi-task Learning for Voice Trigger Detection
Sigtia, Siddharth, Clark, Pascal, Haynes, Rob, Richards, Hywel, Bridle, John
We describe the design of a voice trigger detection system for smart speakers. In this study, we address two major challenges. The first is that the detectors are deployed in complex acoustic environments with external noise and loud playback by the device itself. The second is that collecting training examples for a specific keyword or trigger phrase is challenging, resulting in a scarcity of trigger-phrase-specific training data. We describe a two-stage cascaded architecture where a low-power detector is always running and listening for the trigger phrase. If a detection is made at this stage, the candidate audio segment is re-scored by larger, more complex models to verify that the segment contains the trigger phrase. In this study, we focus our attention on the architecture and design of these second-pass detectors. We start by training a general acoustic model that produces phonetic transcriptions given a large labelled training dataset. Next, we collect a much smaller dataset of examples that are challenging for the baseline system. We then use multi-task learning to train a model to simultaneously produce accurate phonetic transcriptions on the larger dataset and discriminate between true and easily confusable examples using the smaller dataset. Our results demonstrate that the proposed model reduces errors by half compared to the baseline in a range of challenging test conditions without requiring extra parameters.
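The two-stage cascade described above can be sketched as follows. This is an illustrative outline only: the function name `cascade_detect`, the two thresholds, and the toy scoring functions are assumptions, and the real detectors are neural acoustic models rather than dictionary lookups.

```python
def cascade_detect(segment, first_pass, second_pass,
                   first_threshold=0.3, second_threshold=0.8):
    """Return True only if both stages accept the audio segment.

    The threshold values are hypothetical, chosen for the toy data below.
    """
    # Stage 1: cheap always-on detector; a low threshold keeps recall high.
    if first_pass(segment) < first_threshold:
        return False
    # Stage 2: the larger model re-scores only candidates that stage 1 let through.
    return second_pass(segment) >= second_threshold


second_pass_calls = []


def cheap_score(segment):
    # Stand-in for the low-power first-pass detector.
    return segment["first_score"]


def large_score(segment):
    # Stand-in for the larger second-pass model; records when it is invoked.
    second_pass_calls.append(segment["name"])
    return segment["second_score"]


segments = [
    {"name": "silence", "first_score": 0.1, "second_score": 0.0},
    {"name": "confusable", "first_score": 0.6, "second_score": 0.4},
    {"name": "trigger", "first_score": 0.7, "second_score": 0.95},
]
results = [cascade_detect(s, cheap_score, large_score) for s in segments]
```

The expensive model runs only on the two segments that pass the first stage, and the confusable segment is rejected at the second stage, which is the role the abstract assigns to the second-pass detectors.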
Multi-task Learning for Speaker Verification and Voice Trigger Detection
Sigtia, Siddharth, Marchi, Erik, Kajarekar, Sachin, Naik, Devang, Bridle, John
Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic and speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models.
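To make the multi-task objective concrete, the sketch below computes a CTC loss for the phonetic branch with the standard forward algorithm, a cross-entropy loss for the speaker branch, and adds them. The abstract does not say how the two losses are combined, so the weighted sum and the `speaker_weight` parameter are assumptions, as are the function names. The CTC routine works with plain probabilities rather than log-probabilities, so it is suitable for short toy inputs only.

```python
import math


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` given per-frame label probabilities.

    probs[t][k] is the probability of label k at frame t.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank].
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood)


def multitask_loss(frame_probs, phones, speaker_probs, speaker_id,
                   speaker_weight=1.0):
    """Phonetic CTC loss plus weighted speaker cross-entropy (assumed combination)."""
    phonetic = ctc_loss(frame_probs, phones)
    speaker = -math.log(speaker_probs[speaker_id])
    return phonetic + speaker_weight * speaker


# Two frames over the labels {0: blank, 1: one phone}; two candidate speakers.
frame_probs = [[0.5, 0.5], [0.5, 0.5]]
loss = multitask_loss(frame_probs, phones=[1],
                      speaker_probs=[0.25, 0.75], speaker_id=1)
```

With two uniform frames, three of the four possible alignments collapse to the single target phone, so the CTC likelihood is 0.75; the speaker branch also assigns 0.75 to the correct speaker, and the total loss is the sum of the two negative logs.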
Learning the Hypotheses Space from data Part I: Learning Space and U-curve Property
Marcondes, Diego, Simonis, Adilson, Barrera, Junior
The agnostic PAC learning model consists of: a Hypothesis Space $\mathcal{H}$, a probability distribution $P$, a sample complexity function $m_{\mathcal{H}}(\epsilon,\delta): [0,1]^{2} \mapsto \mathbb{Z}_{+}$ of precision $\epsilon$ and confidence $1 - \delta$, a finite i.i.d. sample $\mathcal{D}_{N}$, a cost function $\ell$ and a learning algorithm $\mathbb{A}(\mathcal{H},\mathcal{D}_{N})$, which estimates $\hat{h} \in \mathcal{H}$ that approximates a target function $h^{\star} \in \mathcal{H}$ seeking to minimize out-of-sample error. In this model, prior information is represented by $\mathcal{H}$ and $\ell$, while problem solution is performed through their instantiation in several applied learning models, with specific algebraic structures for $\mathcal{H}$ and corresponding learning algorithms. However, these applied models use additional important concepts not covered by the classic PAC learning theory: model selection and regularization. This paper presents an extension of this model which covers these concepts. The main principle added is the selection, based solely on data, of a subspace of $\mathcal{H}$ with a VC-dimension compatible with the available sample. In order to formalize this principle, the concept of Learning Space $\mathbb{L}(\mathcal{H})$, which is a poset of subsets of $\mathcal{H}$ that covers $\mathcal{H}$ and satisfies a property regarding the VC dimension of related subspaces, is presented as the natural search space for model selection algorithms. A remarkable result obtained in this new framework is a set of conditions on $\mathbb{L}(\mathcal{H})$ and $\ell$ under which the estimated out-of-sample error surfaces are true U-curves on $\mathbb{L}(\mathcal{H})$ chains, enabling a more efficient search on $\mathbb{L}(\mathcal{H})$. Hence, in this new framework, the U-curve optimization problem becomes a natural component of model selection algorithms.
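A small sketch may help show why a U-curve property makes the search more efficient. If the estimated error along a chain of nested subspaces first decreases and then increases, a search can stop at the first rise instead of evaluating every model in the chain. The function below is a generic illustration of that idea, not the algorithm proposed in the paper; the name `u_curve_search`, the integer stand-ins for subspaces, and the error values are all invented for the example.

```python
def u_curve_search(chain, estimated_error):
    """Walk a chain of nested models and stop at the first rise in estimated error."""
    best_model = chain[0]
    best_error = estimated_error(best_model)
    evaluated = 1
    for model in chain[1:]:
        error = estimated_error(model)
        evaluated += 1
        if error > best_error:
            # On a true U-curve, models further along the chain cannot do better.
            break
        best_model, best_error = model, error
    return best_model, best_error, evaluated


# Hypothetical chain: model complexity 0..6 with a U-shaped estimated error.
toy_errors = {0: 0.40, 1: 0.25, 2: 0.18, 3: 0.21, 4: 0.30, 5: 0.45, 6: 0.60}
chain = sorted(toy_errors)
best_model, best_error, evaluated = u_curve_search(chain, toy_errors.get)
```

Here the search evaluates four of the seven models and returns the minimum. The early stop is only safe when the error really is a U-curve along the chain, which is why conditions guaranteeing that property matter.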
Scalable and Customizable Benchmark Problems for Many-Objective Optimization
Meneghini, Ivan Reinaldo, Alves, Marcos Antonio, Gaspar-Cunha, António, Guimarães, Frederico Gadelha
Solving many-objective problems (MaOPs) is still a significant challenge in the multi-objective optimization (MOO) field. One way to measure algorithm performance is through the use of benchmark functions (also called test functions or test suites), which are artificial problems with a well-defined mathematical formulation, known solutions and a variety of features and difficulties. In this paper we propose a parameterized generator of scalable and customizable benchmark problems for MaOPs. It is able to generate problems that reproduce features present in other benchmarks and also problems with some new features. We propose here the concept of generative benchmarking, in which one can generate an infinite number of MOO problems, by varying parameters that control specific features that the problem should have: scalability in the number of variables and objectives, bias, deceptiveness, multimodality, robust and non-robust solutions, shape of the Pareto front, and constraints. The proposed Generalized Position-Distance (GPD) tunable benchmark generator uses the position-distance paradigm, a basic approach to building test functions, used in other benchmarks such as Deb, Thiele, Laumanns and Zitzler (DTLZ), Walking Fish Group (WFG) and others. It includes scalable problems in any number of variables and objectives and it presents Pareto fronts with different characteristics. The resulting functions are easy to understand and visualize, easy to implement, fast to compute and their Pareto optimal solutions are known.
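The position-distance paradigm mentioned above can be illustrated with a DTLZ2-style test function: the first variables (position) decide where a solution sits along the Pareto front, and the remaining variables (distance) decide how far it is from the front. This sketch is not the GPD generator itself, whose parameterization is not given in the abstract; it only shows the underlying paradigm, and the function name is an assumption.

```python
import math


def position_distance_problem(x, n_objectives):
    """Evaluate a DTLZ2-style objective vector for x in [0, 1]^n."""
    position = x[:n_objectives - 1]   # where on the front the solution lies
    distance = x[n_objectives - 1:]   # how far from the front it lies
    g = sum((v - 0.5) ** 2 for v in distance)
    objectives = []
    for m in range(n_objectives):
        value = 1.0 + g
        # Product of cosines over the leading position variables...
        for v in position[:n_objectives - 1 - m]:
            value *= math.cos(v * math.pi / 2)
        # ...times one sine term, except for the first objective.
        if m > 0:
            value *= math.sin(position[n_objectives - 1 - m] * math.pi / 2)
        objectives.append(value)
    return objectives


# Distance variables at 0.5 give g = 0, so the point lies on the spherical front.
on_front = position_distance_problem([0.3, 0.7, 0.5, 0.5], n_objectives=3)
# Moving the distance variables away from 0.5 pushes the point off the front.
off_front = position_distance_problem([0.3, 0.7, 0.9, 0.1], n_objectives=3)
# The same function scales to more objectives and variables.
five_obj = position_distance_problem([0.2, 0.4, 0.6, 0.8, 0.5, 0.5], n_objectives=5)
```

For this family the squared objectives sum to (1 + g)^2, so Pareto-optimal solutions are known in closed form (g = 0), which matches the properties the abstract lists: easy to implement, fast to compute, and with known optimal solutions.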
Top Artificial Intelligence Funding in December 2019
Companies in the market need significant funding to serve their objectives better. For startups in particular, fundraising is crucial to harness their rich potential to contribute to the growth of their respective industry and market. Without a funding source, a business, especially a technology business, will flounder under the weight of its own debt. With the advancements in technology, the requirements, assets, and liabilities of such firms have grown exponentially in recent years. Amid this, funding works as the fuel on which a business runs and excels. When it comes to technologies like the omnipresent artificial intelligence (AI), the pressure to thrive naturally increases in a market where big tech companies like Google, Microsoft, and other significant players are operating.
AI applications for social good | Tryolabs Blog
Artificial intelligence is gaining traction in areas of social responsibility. From climate change to social polarization to epidemics, humankind has been seeking new solutions to old but persistent problems. From a technological point of view, the amount of daily data produced in the digital universe now allows for state-of-the-art approaches, which may lead to innovative solutions in these underserved areas. AI for social good turned into a reality for us at Tryolabs after we collaborated with an NGO to improve upon how African lions are tracked, which helps with species preservation. We will go into more detail on that timely case, especially as wildlife conservation faces the immense challenges posed by devastating megafires threatening the lives of millions of animals in historic ways.
Alcatraz escape mystery may have just been solved with facial-recognition tech
The 57-year-old mystery of an infamous prison break from Alcatraz may have finally been solved using artificial-intelligence and facial-recognition technology. Rothco, the Irish creative agency owned by Accenture Interactive, teamed up with AI specialists at Identv to analyse a picture of two escapees and have, for the first time, confirmed their identities. On 11 June 1962, three prisoners – Frank Morris, along with brothers John and Clarence Anglin – broke out of their cells and escaped from the prison on Alcatraz Island, in San Francisco Bay. The trio's extraordinary escape, in which they used sharpened spoons to dig through the walls and made papier-mâché dummies to fool the guards, was made famous in the 1979 movie Escape from Alcatraz. The prison, which shut down in 1963, was famed for being supposedly impossible to escape from.