AITopics | South America

Extended Parallel Corpus for Amharic-English Machine Translation

Gezmu, Andargachew Mekonnen, Nürnberger, Andreas, Bati, Tesfaye Bayu

arXiv.org Artificial Intelligence | Apr-8-2021

This paper describes the acquisition, preprocessing, segmentation, and alignment of an Amharic-English parallel corpus. It will be useful for machine translation of an under-resourced language, Amharic. The corpus is larger than previously compiled corpora; it is released for research purposes. We trained neural machine translation and phrase-based statistical machine translation models using the corpus. In the automatic evaluation, neural machine translation models outperform phrase-based statistical machine translation models.

computational linguistic, machine translation, translation, (12 more...)

arXiv.org Artificial Intelligence

2104.03543

Country:

Europe > Germany > Berlin (0.05)
Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
Europe > Czechia > Prague (0.04)
(24 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks

Alam, Firoj, Qazi, Umair, Imran, Muhammad, Ofli, Ferda

arXiv.org Artificial Intelligence | Apr-8-2021

Social networks are widely used for information consumption and dissemination, especially during time-critical events such as natural disasters. Despite its significantly large volume, social media content is often too noisy for direct use in any application. Therefore, it is important to filter, categorize, and concisely summarize the available content to facilitate effective consumption and decision-making. To address such issues, automatic classification systems have been developed using supervised modeling approaches, thanks to earlier efforts on creating labeled datasets. However, existing datasets are limited in different aspects (e.g., size, presence of duplicates) and less suitable for supporting more advanced and data-hungry deep learning models. In this paper, we present a new large-scale dataset with ~77K human-labeled tweets, sampled from a pool of ~24 million tweets across 19 disaster events that happened between 2016 and 2019. Moreover, we propose a data collection and sampling pipeline, which is important for social media data sampling for human annotation. We report multiclass classification results using classic and deep learning (fastText and transformer) based models to set the ground for future studies. The dataset and associated resources are publicly available. https://crisisnlp.qcri.org/humaid_dataset.html

dataset, information, tweet, (16 more...)

arXiv.org Artificial Intelligence

2104.03090

Country:

North America > Haiti (0.14)
South America > Ecuador (0.05)
North America > United States > Maryland (0.05)
(16 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What the Hell Are You Supposed to Do With Your Vaccine Card?

Slate | Apr-7-2021, 19:47:27 GMT

The joy, anxiety, and anticipation of getting a COVID vaccine in America culminate, quite anticlimactically, with a piece of white cardstock. Some have already lost their vaccine cards or never got them to begin with. Others have their names misspelled and crossed out on them. Many are having trouble reconciling how something so simple--and easily forged--can carry such import and weight. The White House has recently clarified that there will be no federal vaccine passport.

levine, rutherford, vaccine card, (14 more...)

Slate

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > New York (0.07)
South America (0.05)
(2 more...)

Genre: Personal (0.31)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (0.35)

Technology: Information Technology > Artificial Intelligence (0.30)

Council Post: AI's Role In Analyzing Shifting Sentiments Around Companies

#artificialintelligence | Apr-7-2021, 17:52:32 GMT

Despite only being early in the year, significant events have already taken place in 2021. Mass vaccinations for Covid-19 have begun around the world, and new strains of the disease have surfaced in the United Kingdom, South Africa and Brazil. For companies, this news has had a direct impact on their ability to conduct business while further placing their pandemic response under the public microscope. How companies are being talked and written about is changing as the pandemic unfolds, and these nuances could reveal more than simply how effective an organization's marketing department is. What if shifts in sentiment could help traders make more informed financial decisions?

analyzing shifting sentiment, sentiment, stock price, (9 more...)

#artificialintelligence

Country:

South America > Brazil (0.25)
Europe > United Kingdom (0.25)
Africa > South Africa (0.25)

Industry: Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.54)

Three keys to working effectively on Artificial Intelligence projects

#artificialintelligence | Apr-7-2021, 13:38:52 GMT

On many occasions, the greatest impediments to creating Artificial Intelligence solutions do not lie in the capacity of highly qualified teams, but in establishing an effective way of working between the different professional profiles involved in the life cycle of analytical models. This is one of the main tasks we are currently tackling at BBVA AI Factory. It is a task guided by three concepts: simplify, accelerate and reuse. My first direct contact with the AI Factory was in April 2020, in the middle of lockdown. I found myself with a team of data scientists who were extremely competent in creating AI models, but who needed to continue to push for common working guidelines in order to deal with the complexity – both organisational and technical – that exists in the Engineering domain.

ai factory, analytical model, artificial intelligence project, (8 more...)

#artificialintelligence

Country:

South America > Venezuela (0.05)
South America > Peru (0.05)
South America > Colombia (0.05)
(3 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Detection of marine litter using deep learning

AIHub | Apr-7-2021, 09:31:26 GMT

Researchers at the University of Barcelona have developed an open access, deep learning-based web app that will enable the detection and quantification of floating plastics in the sea with a reliability of over 80%. Floating sea macro-litter is a threat to the conservation of marine ecosystems worldwide. According to UNESCO, plastic debris causes the deaths of more than a million seabirds every year, as well as more than 100,000 marine mammals. Eroded fragments, known as micro-plastics, are now prevalent across the food chain. The largest density of floating litter is found in the great ocean gyres (systems of circular currents) with litter being caught and spun in these vast cycles.

deep learning, litter, marine litter, (8 more...)

AIHub

AI-Alerts: 2021 > 2021-04 > AAAI AI-Alert for Apr 13, 2021 (1.00)

Country:

South America > Paraguay > Asunción > Asunción (0.06)
Europe > Spain > Catalonia (0.06)

Genre: Research Report (0.75)

Industry: Media > Photography (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adaptive Clustering of Robust Semantic Representations for Adversarial Image Purification

Silva, Samuel Henrique, Das, Arun, Scarff, Ian, Najafirad, Peyman

arXiv.org Artificial Intelligence | Apr-7-2021

Deep Learning models are highly susceptible to adversarial manipulations that can lead to catastrophic consequences. One of the most effective methods to defend against such disturbances is adversarial training, but it comes at the cost of generalization to unseen attacks and transferability across models. In this paper, we propose a robust defense against adversarial attacks, which is model agnostic and generalizable to unseen adversaries. Initially, with a baseline model, we extract the latent representations for each class and adaptively cluster the latent representations that share a semantic similarity. We obtain the distributions for the clustered latent representations and, from their originating images, learn semantic reconstruction dictionaries (SRD). We adversarially train a new model, constraining the latent space representation to minimize the distance between the adversarial latent representation and the true cluster distribution. To purify the image, we decompose the input into low- and high-frequency components. The high-frequency component is reconstructed based on the most adequate SRD from the clean dataset. To select the most adequate SRD, we rely on the distance between robust latent representations and semantic cluster distributions. The output is a purified image with no perturbation. Image purification on CIFAR-10 and ImageNet-10 using our proposed method improved the accuracy by more than 10% compared to state-of-the-art results.

adversarial attack, latent representation, representation, (15 more...)

arXiv.org Artificial Intelligence

2104.02155

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Texas > Bexar County > San Antonio (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Deep learning for prediction of complex geology ahead of drilling

Fossum, Kristian, Alyaev, Sergey, Tveranger, Jan, Elsheikh, Ahmed

arXiv.org Machine Learning | Apr-6-2021

During a geosteering operation, the well path is intentionally adjusted in response to the new data acquired while drilling. To achieve consistent high-quality decisions, especially when drilling in complex environments, decision support systems can help cope with high volumes of data and interpretation complexities. They can assimilate the real-time measurements into a probabilistic earth model and use the updated model for decision recommendations. Recently, machine learning (ML) techniques have enabled a wide range of methods that redistribute computational cost from on-line to off-line calculations. In this paper, we introduce two ML techniques into the geosteering decision support framework. Firstly, a complex earth model representation is generated using a Generative Adversarial Network (GAN). Secondly, a commercial extra-deep electromagnetic simulator is represented using a Forward Deep Neural Network (FDNN). The numerical experiments demonstrate that the combination of the GAN and the FDNN in an ensemble randomized maximum likelihood data assimilation scheme provides real-time estimates of complex geological uncertainty. This yields a reduction of geological uncertainty ahead of the drill-bit from the measurements gathered behind and around the well bore.

deep learning, realization, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

2104.02550

Country:

North America > United States > Colorado > Garfield County (0.16)
Europe > Norway (0.14)
South America (0.14)
(2 more...)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Heuristics2Annotate: Efficient Annotation of Large-Scale Marathon Dataset For Bounding Box Regression

Rajput, Pranjal Singh, Napolean, Yeshwanth, van Gemert, Jan

arXiv.org Artificial Intelligence | Apr-6-2021

Annotating a large-scale in-the-wild person re-identification dataset, especially of marathon runners, is a challenging task. The variations in the scenarios such as camera viewpoints, resolution, occlusion, and illumination make the problem non-trivial. Manually annotating bounding boxes in such large-scale datasets is cost-inefficient. Additionally, due to crowdedness and occlusion in the videos, aligning the identity of runners across multiple disjoint cameras is a challenge. We collected a novel large-scale in-the-wild video dataset of marathon runners. The dataset consists of hours of recording of thousands of runners captured using 42 hand-held smartphone cameras and covering real-world scenarios. Due to the presence of crowdedness and occlusion in the videos, the annotation of runners becomes a challenging task. We propose a new scheme for tackling the challenges in the annotation of such a large dataset. Our technique reduces the overall cost of annotation in terms of time as well as budget. We demonstrate an fps analysis to reduce the effort and time of annotation. We investigate several annotation methods for efficiently generating tight bounding boxes. Our results prove that interpolating bounding boxes between keyframes is the most efficient method of bounding box generation amongst several other methods and is 3x faster than the naive baseline method. We introduce a novel way of aligning the identity of runners in disjoint cameras. Our inter-camera alignment tool integrated with the state-of-the-art person re-id system proves to be sufficient and effective in the alignment of the runners across multiple cameras with non-overlapping views. Our proposed framework of annotation reduces the annotation cost of the dataset by a factor of 16, while effectively aligning 93.64% of the runners in the cross-camera setting.

annotation, dataset, video, (17 more...)

arXiv.org Artificial Intelligence

2104.02749

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.05)
Europe > Netherlands > South Holland > Delft (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment > Sports > Running (1.00)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

A Heuristic-driven Uncertainty based Ensemble Framework for Fake News Detection in Tweets and News Articles

Das, Sourya Dipta, Basak, Ayan, Dutta, Saikat

arXiv.org Artificial Intelligence | Apr-5-2021

The significance of social media has increased manifold in the past few decades as it helps people from even the most remote corners of the world to stay connected. With the advent of technology, digital media has become more relevant and widely used than ever before and, along with this, there has been a resurgence in the circulation of fake news and tweets that demand immediate attention. In this paper, we describe a novel Fake News Detection system that automatically identifies whether a news item is "real" or "fake", as an extension of our work in the CONSTRAINT COVID-19 Fake News Detection in English challenge. We have used an ensemble model consisting of pre-trained models followed by a statistical feature fusion network, along with a novel heuristic algorithm that incorporates various attributes present in news items or tweets, like source, username handles, URL domains and authors, as statistical features. Our proposed framework has also quantified reliable predictive uncertainty along with proper class output confidence level for the classification task. We have evaluated our results on the COVID-19 Fake News dataset and FakeNewsNet dataset to show the effectiveness of the proposed algorithm on detecting fake news in short news content as well as in news articles. We obtained a best F1-score of 0.9892 on the COVID-19 dataset, and an F1-score of 0.9073 on the FakeNewsNet dataset.

arxiv preprint arxiv, dataset, news item, (11 more...)

arXiv.org Artificial Intelligence

2104.01791

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > India (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.46)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)
