South America
Are we ready for the next steps of artificial intelligence?
Disclaimer: This is the translation of an article published at TAB UOL. Google was the first company to release a mass-market solution using artificial intelligence (AI) to generate images, Deep Dream. Back in the day, this technology looked very exciting, probably for its potential rather than what it was actually doing at the moment: a scary psychedelic creation filled with random dog faces. More recently, mobile apps like Zao began to automate deep fakes, through which people could replace the faces of superheroes and Hollywood celebrities with their own. Now, the same company has released Dream.ai, an AI based on text prompts.
50 Global Hubs for Top AI Talent
Artificial intelligence (AI) has crossed a threshold. "In the past five years, AI has made the leap from something that mostly happens in research labs or other highly controlled settings to something that's out in society affecting people's lives," says Michael Littman, chair of the One Hundred Year Study on Artificial Intelligence, hosted at Stanford. It's easy to see what he's talking about: The technology's impact can be seen introducing automation, driving efficiency gains and enhancing productivity, creating new jobs, and reducing risks associated with cyber-threats and fraud. During the pandemic, AI enabled more effective testing for Covid-19 and faster vaccine development, and helped manage grocery supply chains and tailor lessons for individual students affected by remote schooling. As AI expands into more and more facets of our lives, there is also more scrutiny on who's developing it.
How Can the Use of AI Tools Benefit the Indian Parliament?
From self-driving cars to intuitive automatic vacuum cleaners, AI has found its way into industries and government organizations around the world. Artificial intelligence is an emerging focus area of policy development in India. Many governments have begun to implement AI in small-scale pilots, but many are still limited to early implementation and experimentation. If implemented effectively, AI tools can generate benefits for both private and public-sector organizations.
Towards Building ASR Systems for the Next Billion Users
Javed, Tahir, Doddapaneni, Sumanth, Raman, Abhigyan, Bhogale, Kaushal Santosh, Ramesh, Gowtham, Kunchukuttan, Anoop, Kumar, Pratyush, Khapra, Mitesh M.
Recent methods in speech and language technology pretrain very LARGE models which are fine-tuned for specific tasks. However, the benefits of such LARGE models are often limited to a few resource rich languages of the world. In this work, we make multiple contributions towards building ASR systems for low resource languages from the Indian subcontinent. First, we curate 17,000 hours of raw speech data for 40 Indian languages from a wide variety of domains including education, news, technology, and finance. Second, using this raw speech data we pretrain several variants of wav2vec style models for 40 Indian languages. Third, we analyze the pretrained models to find key features: codebook vectors of similar sounding phonemes are shared across languages, representations across layers are discriminative of the language family, and attention heads often pay attention within small local windows. Fourth, we fine-tune this model for downstream ASR for 9 languages and obtain state-of-the-art results on 3 public datasets, including on very low-resource languages such as Sinhala and Nepali. Our work establishes that multilingual pretraining is an effective strategy for building ASR systems for the linguistically diverse speakers of the Indian subcontinent. Our code, data and models are available publicly at https://indicnlp.ai4bharat.org/indicwav2vec/ and we hope they will help advance research in ASR for Indic languages.
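The observation that attention heads often attend within small local windows can be made concrete with a simple locality score. The sketch below is illustrative only and is not taken from the paper's code: the function name `local_attention_mass` and the window parameter are assumptions, and the input is any attention matrix whose rows are queries and whose columns are keys.

```python
import numpy as np

def local_attention_mass(attn, window):
    """Average fraction of each query's attention that lands within
    +/- `window` positions of the query itself (hypothetical helper)."""
    attn = np.asarray(attn, dtype=float)
    n_queries, n_keys = attn.shape
    q_idx = np.arange(n_queries)[:, None]
    k_idx = np.arange(n_keys)[None, :]
    # Boolean band around the diagonal: True where |query - key| <= window.
    band = np.abs(q_idx - k_idx) <= window
    local = (attn * band).sum(axis=1)
    total = attn.sum(axis=1)
    return float(np.mean(local / total))

# Toy inputs: a perfectly local head and a perfectly uniform head.
diagonal_head = np.eye(6)
uniform_head = np.full((6, 6), 1 / 6)
print(local_attention_mass(diagonal_head, window=1))
print(local_attention_mass(uniform_head, window=1))
```

A score close to 1 for a small window indicates a head that mostly looks at neighbouring frames, while a uniform head scores roughly the fraction of keys that fall inside the band.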
Constraining cosmological parameters from N-body simulations with Bayesian Neural Networks
In this paper we use the Quijote simulations to extract cosmological parameters with Bayesian Neural Networks (BNNs). This kind of model has a remarkable ability to estimate the associated uncertainty, which is one of the ultimate goals in the precision cosmology era. We demonstrate the advantages of BNNs for extracting more complex output distributions and non-Gaussian information from the simulations.
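As a rough illustration of how a BNN yields an uncertainty estimate, the sketch below draws weights from an assumed mean-field Gaussian posterior and summarizes the spread of the resulting predictions. The network shape, the `post` dictionary layout, and the function names are all hypothetical. This is not the architecture or inference scheme used in the paper.

```python
import numpy as np

def forward(x, w1, w2):
    """Tiny one-hidden-layer network: tanh hidden units, linear output."""
    return np.tanh(x @ w1) @ w2

def predictive_moments(x, post, n_samples=1000, seed=0):
    """Monte Carlo estimate of the predictive mean and standard deviation.

    `post` holds the means and standard deviations of an assumed
    independent Gaussian posterior over each weight.
    """
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_samples):
        # Sample one set of weights, then run a deterministic forward pass.
        w1 = rng.normal(post["w1_mean"], post["w1_std"])
        w2 = rng.normal(post["w2_mean"], post["w2_std"])
        draws.append(forward(x, w1, w2))
    draws = np.array(draws)
    return draws.mean(), draws.std()

# Hypothetical posterior for a 2-input, 3-hidden-unit network.
post = {
    "w1_mean": np.ones((2, 3)), "w1_std": np.full((2, 3), 0.3),
    "w2_mean": np.ones(3), "w2_std": np.full(3, 0.3),
}
mean, std = predictive_moments(np.array([0.5, -0.2]), post)
print(mean, std)
```

The standard deviation returned here reflects only uncertainty in the weights. A full treatment would also model observation noise.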
Surrogate Likelihoods for Variational Annealed Importance Sampling
Variational inference is a powerful paradigm for approximate Bayesian inference with a number of appealing properties, including support for model learning and data subsampling. By contrast, MCMC methods like Hamiltonian Monte Carlo do not share these properties but remain attractive since, contrary to parametric methods, MCMC is asymptotically unbiased. For these reasons, researchers have sought to combine the strengths of both classes of algorithms, with recent approaches coming closer to realizing this vision in practice. However, supporting data subsampling in these hybrid methods can be a challenge, a shortcoming that we address by introducing a surrogate likelihood that can be learned jointly with other variational parameters. We argue theoretically that the resulting algorithm permits the user to make an intuitive trade-off between inference fidelity and computational cost. In an extensive empirical comparison we show that our method performs well in practice and that it is well-suited for black-box inference in probabilistic programming frameworks.
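For readers unfamiliar with annealed importance sampling (AIS), the following is a minimal sketch of plain AIS on a one-dimensional toy problem. It makes several simplifying assumptions that differ from the abstract: it uses random-walk Metropolis transitions instead of Hamiltonian Monte Carlo, a fixed standard normal base distribution instead of a learned variational one, and no surrogate likelihood or data subsampling. All function names are hypothetical.

```python
import numpy as np

def log_p0(x):
    """Log density of the normalized base distribution, N(0, 1)."""
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_f(x):
    """Unnormalized log target: a Gaussian bump centred at 1 with scale 0.5."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2

def log_bridge(x, beta):
    """Geometric bridge between the base (beta=0) and the target (beta=1)."""
    return (1.0 - beta) * log_p0(x) + beta * log_f(x)

def ais_log_z(n_chains=2000, n_steps=200, step_size=0.5, seed=0):
    """Estimate the log normalizing constant of exp(log_f) with AIS."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    x = rng.normal(size=n_chains)  # exact samples from the base
    log_w = np.zeros(n_chains)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Importance weight update for moving from b_prev to b.
        log_w += (b - b_prev) * (log_f(x) - log_p0(x))
        # One random-walk Metropolis step that leaves the bridge at b invariant.
        prop = x + step_size * rng.normal(size=n_chains)
        log_accept = log_bridge(prop, b) - log_bridge(x, b)
        accept = np.log(rng.uniform(size=n_chains)) < log_accept
        x = np.where(accept, prop, x)
    # Numerically stable log of the mean weight.
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

true_log_z = 0.5 * np.log(2 * np.pi) + np.log(0.5)
print(ais_log_z(), true_log_z)
```

The number of annealing steps controls the trade-off between cost and estimator variance in this toy setting: fewer steps are cheaper but give noisier weights.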
Neuroevolution deep learning architecture search for estimation of river surface elevation from photogrammetric Digital Surface Models
Szostak, Radosław, Pietroń, Marcin, Zimnoch, Mirosław, Wachniew, Przemysław, Ćwiąkała, Paweł, Puniach, Edyta
The development of new methods of surface water observation is crucial given the increasingly frequent extreme hydrological events related to global warming and the increasing demand for water. Orthophotos and digital surface models (DSMs) obtained using UAV photogrammetry can be used to determine the Water Surface Elevation (WSE) of a river. However, this task is difficult due to disturbances of the water surface on DSMs caused by limitations of photogrammetric algorithms. In this study, machine learning was used to extract a WSE value from disturbed photogrammetric data. A brand new dataset has been prepared specifically for this purpose by hydrology and photogrammetry experts. The new method is an important step toward automating water surface level measurements with high spatial and temporal resolution. Such data can be used to validate and calibrate hydrological, hydraulic and hydrodynamic models, making hydrological forecasts more accurate, in particular when predicting extreme and dangerous events such as floods or droughts. To our knowledge, this is the first approach in which a dataset was created for this purpose and deep learning models were used for this task. Additionally, a neuroevolution algorithm was used to explore different architectures to find locally optimal models, and a non-gradient search was performed to fine-tune the model parameters. The achieved results have better accuracy compared to manual methods of determining WSE from photogrammetric DSMs.
Adversarial Attacks against Windows PE Malware Detection: A Survey of the State-of-the-Art
Ling, Xiang, Wu, Lingfei, Zhang, Jiangyu, Qu, Zhenqing, Deng, Wei, Chen, Xiang, Wu, Chunming, Ji, Shouling, Luo, Tianyue, Wu, Jingzheng, Wu, Yanjun
Malware has been one of the most damaging threats to computers, spanning multiple operating systems and various file formats. To defend against the ever-increasing and ever-evolving threats of malware, tremendous efforts have been made to propose a variety of malware detection methods that attempt to effectively and efficiently detect malware. Recent studies have shown that, on the one hand, existing ML and DL methods enable superior detection of newly emerging and previously unseen malware. On the other hand, ML and DL models are inherently vulnerable to adversarial attacks in the form of adversarial examples, which are maliciously generated by slightly and carefully perturbing legitimate inputs to confuse the targeted models. Adversarial attacks were initially studied extensively in the domain of computer vision, and some quickly expanded to other domains, including NLP, speech recognition and even malware detection. In this paper, we focus on malware with the file format of portable executable (PE) in the family of Windows operating systems, namely Windows PE malware, as a representative case to study the adversarial attack methods in such adversarial settings. To be specific, we start by outlining the general learning framework of Windows PE malware detection based on ML/DL and then highlight three unique challenges of performing adversarial attacks in the context of PE malware. We then conduct a comprehensive and systematic review to categorize the state-of-the-art adversarial attacks against PE malware detection, as well as corresponding defenses to increase the robustness of PE malware detection. We conclude the paper by first presenting other related attacks against Windows PE malware detection beyond the adversarial attacks and then shedding light on future research directions and opportunities.
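To make the notion of an adversarial example concrete, here is a minimal sketch of the fast gradient sign method (FGSM), a standard attack from the computer vision literature, applied to a toy logistic regression classifier. The weights, input, and step size are made-up values for illustration. This is a feature-space perturbation only and does not produce a valid PE file, so it should not be read as one of the attacks surveyed in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against logistic regression with cross-entropy loss.

    The gradient of the loss with respect to the input is (p - y) * w,
    so the attack moves each feature by eps in the direction of its sign.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Hypothetical classifier and a correctly classified input with label 1.
w = np.array([2.0, -1.0, 0.5])
b = 0.0
x = np.array([0.4, 0.1, 0.2])
y = 1.0

x_adv = fgsm(x, y, w, b, eps=0.5)
print("clean score:", sigmoid(w @ x + b))
print("adversarial score:", sigmoid(w @ x_adv + b))
```

In this toy case the perturbation is bounded by `eps` in every coordinate, yet it is enough to push the classifier's score across the 0.5 decision threshold.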
Algorithmic Probability of Large Datasets and the Simplicity Bubble Problem in Machine Learning
Abrahão, Felipe S., Zenil, Hector, Porto, Fabio, Wehmuth, Klaus
When mining large datasets in order to predict new data, limitations of the principles behind statistical machine learning pose a serious challenge not only to the Big Data deluge, but also to the traditional assumptions that data generating processes are biased toward low algorithmic complexity. Even when one assumes an underlying algorithmic-informational bias toward simplicity in finite dataset generators, we show that fully automated, with or without access to pseudo-random generators, computable learning algorithms, in particular those of statistical nature used in current approaches to machine learning (including deep learning), can always be deceived, naturally or artificially, by sufficiently large datasets. In particular, we demonstrate that, for every finite learning algorithm, there is a sufficiently large dataset size above which the algorithmic probability of an unpredictable deceiver is an upper bound (up to a multiplicative constant that only depends on the learning algorithm) for the algorithmic probability of any other larger dataset. In other words, very large and complex datasets are as likely to deceive learning algorithms into a "simplicity bubble" as any other particular dataset. These deceiving datasets guarantee that any prediction will diverge from the high-algorithmic-complexity globally optimal solution while converging toward the low-algorithmic-complexity locally optimal solution. We discuss the framework and empirical conditions for circumventing this deceptive phenomenon, moving away from statistical machine learning towards a stronger type of machine learning based on, or motivated by, the intrinsic power of algorithmic information theory and computability theory.
Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Laskar, Md Tahmid Rahman, Hoque, Enamul, Huang, Jimmy Xiangji
The Query Focused Text Summarization (QFTS) task aims at building systems that generate a summary of the text document(s) based on a given query. A key challenge in addressing this task is the lack of large labeled datasets for training the summarization model. In this paper, we address this challenge by exploring a series of domain adaptation techniques. Given the recent success of pre-trained transformer models in a wide range of natural language processing tasks, we utilize such models to generate abstractive summaries for the QFTS task in both single-document and multi-document scenarios. For domain adaptation, we apply a variety of techniques using pre-trained transformer-based summarization models, including transfer learning, weakly supervised learning, and distant supervision. Extensive experiments on six datasets show that our proposed approach is very effective in generating abstractive summaries for the QFTS task while setting a new state-of-the-art result on several datasets across a set of automatic and human evaluation metrics.