AITopics | South America

South America

An Empirical Study on the Joint Impact of Feature Selection and Data Re-sampling on Imbalance Classification

Zhang, Chongsheng, Soda, Paolo, Bi, Jingjun, Fan, Gaojuan, Almpanidis, George, Garcia, Salvador

arXiv.org Artificial Intelligence, Sep-13-2021

In predictive tasks, real-world datasets often present different degrees of imbalanced (i.e., long-tailed or skewed) distributions. While the majority (the head) classes have sufficient samples, the minority (the tail) classes can be under-represented by a rather limited number of samples. Data pre-processing has been shown to be very effective in dealing with such problems. On one hand, data re-sampling is a common approach to tackling class imbalance. On the other hand, dimension reduction, which reduces the feature space, is a conventional technique for reducing noise and inconsistencies in a dataset. However, the possible synergy between feature selection and data re-sampling for high-performance imbalance classification has rarely been investigated before. To address this issue, we carry out a comprehensive empirical study on the joint influence of feature selection and re-sampling on two-class imbalance classification. Specifically, we study the performance of two opposite pipelines for imbalance classification by applying feature selection before or after data re-sampling. We conduct a large number of experiments, with a total of 9225 tests, on 52 publicly available datasets, using 9 feature selection methods, 6 re-sampling approaches for class imbalance learning, and 3 well-known classification algorithms. Experimental results show that there is no constant winner between the two pipelines; thus both of them should be considered to derive the best performing model for imbalance classification. We find that the performance of an imbalance classification model not only depends on the classifier adopted and the ratio between the number of majority and minority samples, but also depends on the ratio between the number of samples and features. Overall, this study should provide new reference value for researchers and practitioners in imbalance learning.
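The two opposite pipelines described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' experimental code: random oversampling and a class-mean-gap feature score are simple stand-ins for the 6 re-sampling approaches and 9 feature selection methods studied, and all function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def select_top_k(X, y, k):
    """Indices of the k features with the largest gap between class means (labels 0/1)."""
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        scores.append((abs(m1 - m0), j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)


def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]


def fs_then_rs(X, y, k):
    """Pipeline A: feature selection first, then re-sampling."""
    cols = select_top_k(X, y, k)
    X_res, y_res = random_oversample(project(X, cols), y)
    return cols, X_res, y_res


def rs_then_fs(X, y, k):
    """Pipeline B: re-sampling first, then feature selection."""
    X_res, y_res = random_oversample(X, y)
    cols = select_top_k(X_res, y_res, k)
    return cols, project(X_res, cols), y_res


# Toy two-class data: four majority samples (0), one minority sample (1).
X_toy = [[0.0, 1.0, 5.0], [0.1, 1.2, 5.1], [0.2, 0.9, 4.9],
         [0.1, 1.1, 5.0], [1.0, 1.0, 5.2]]
y_toy = [0, 0, 0, 0, 1]

cols_a, X_a, y_a = fs_then_rs(X_toy, y_toy, k=2)
cols_b, X_b, y_b = rs_then_fs(X_toy, y_toy, k=2)
print(cols_a, cols_b)
```

On this toy input both orderings pick the same features; on real data the two can diverge, because re-sampling changes the statistics the selector sees, which is why the abstract suggests evaluating both.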

feature selection, pipeline, specific combination, (13 more...)

arXiv.org Artificial Intelligence

2109.00201

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > New York (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

TEASEL: A Transformer-Based Speech-Prefixed Language Model

Arjmand, Mehdi, Dousti, Mohammad Javad, Moradi, Hadi

arXiv.org Artificial Intelligence, Sep-12-2021

Multimodal language analysis is a burgeoning field of NLP that aims to simultaneously model a speaker's words, acoustical annotations, and facial expressions. In this area, lexicon features usually outperform other modalities because they are pre-trained on large corpora via Transformer-based models. Despite their strong performance, training a new self-supervised learning (SSL) Transformer on any modality is not usually attainable due to insufficient data, which is the case in multimodal language learning. This work proposes a Transformer-Based Speech-Prefixed Language Model called TEASEL to address these constraints without training a complete Transformer model. Compared to a conventional language model, the TEASEL model includes the speech modality as a dynamic prefix alongside the textual modality. This method exploits a conventional pre-trained language model as a cross-modal Transformer model. We evaluated TEASEL for the multimodal sentiment analysis task defined by the CMU-MOSI dataset. Extensive experiments show that our model outperforms unimodal baseline language models by 4% and outperforms the current multimodal state-of-the-art (SoTA) model by 1% in F1-score. Additionally, our proposed method is 72% smaller than the SoTA model.

language model, modality, transformer, (14 more...)

arXiv.org Artificial Intelligence

2109.05522

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Estimating a new panel MSK dataset for comparative analyses of national absorptive capacity systems, economic growth, and development in low and middle income economies

Khan, Muhammad Salar

arXiv.org Machine Learning, Sep-12-2021

Within the national innovation system literature, empirical analyses are severely lacking for developing economies. In particular, the low- and middle-income countries (LMICs) eligible for the World Bank's International Development Association (IDA) support are rarely part of any empirical discourse on growth, development, and innovation. One major issue hindering panel analyses of LMICs, and thus their inclusion in empirical discussion, is the lack of complete data. This work offers a new complete panel dataset with no missing values for LMICs eligible for IDA's support. I use a standard, widely respected multiple imputation technique (specifically, Predictive Mean Matching) developed by Rubin (1987). This technique respects the structure of multivariate continuous panel data at the country level. I employ this technique to create a large dataset consisting of many variables drawn from publicly available established sources. These variables, in turn, capture six crucial country-level capacities: technological capacity, financial capacity, human capital capacity, infrastructural capacity, public policy capacity, and social capacity. Such capacities are part and parcel of the National Absorptive Capacity Systems (NACS). The resulting dataset (the MSK dataset) contains data on 47 variables for 82 LMICs between 2005 and 2019. The dataset has passed a quality and reliability check and can thus be used for comparative analyses of national absorptive capacities and development, transition, and convergence analyses among LMICs.
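The core idea of Predictive Mean Matching can be sketched in a few lines. This is a simplified, single-imputation illustration under stated assumptions, not the author's procedure: it uses one predictor and a fixed least-squares fit, whereas full multiple imputation would redraw the regression parameters and repeat the process several times. The function names and the toy values are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pmm_impute(xs, ys, k=3, seed=0):
    """Fill each missing y (None) with the observed y of a randomly chosen donor
    among the k observed cases whose predicted value is closest."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed],
                                [y for _, y in observed])

    def predict(x):
        return intercept + slope * x

    filled = []
    for x, y in zip(xs, ys):
        if y is not None:
            filled.append(y)
            continue
        # Rank donors by distance between predicted means, keep the k nearest.
        donors = sorted(observed,
                        key=lambda d: abs(predict(d[0]) - predict(x)))[:k]
        filled.append(rng.choice(donors)[1])
    return filled


# Toy series with two missing values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, None, 8.2, 9.8, None]
filled = pmm_impute(xs, ys, k=2)
print(filled)
```

Because every imputed value is borrowed from an actually observed case, the filled-in values stay within the observed range, which is one reason the method is popular for continuous data.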

dataset, imputation, multiple imputation, (15 more...)

arXiv.org Machine Learning

2109.05529

Country:

Asia > Pakistan (0.04)
Asia > Myanmar (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
(73 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Epidemiology (1.00)
Government (1.00)
Banking & Finance > Economy (1.00)
(4 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

US judge rules only humans, not AI, can get patents

#artificialintelligence, Sep-11-2021, 14:00:17 GMT

The big picture: A US judge ruled this week that an artificial intelligence cannot be listed as the inventor of a patent. This ruling is the latest on an issue that has come before judges in multiple countries. A court in Alexandria, Virginia, ruled that inventions can only be patented under the name of a "natural person." The decision was made against someone who tried to list two designs under the name of an AI as part of a broader project to gain worldwide recognition of AI-powered inventions. Imagination Engines, Inc. CEO Stephen Thaler built an AI called DABUS, which independently designed a new kind of drink holder and flashing light (used to get someone's attention). The name "DABUS," along with "Invention generated by artificial intelligence," was used in the attempted patent filing for the inventions.

inventor, patent, us judge rule only human, (4 more...)

Country:

North America > United States > Virginia > Alexandria County > Alexandria (0.27)
South America > Brazil (0.07)
Oceania > Australia (0.07)
(5 more...)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Government > Regional Government > North America Government > United States Government (0.62)

Technology: Information Technology > Artificial Intelligence (1.00)

Trust in EU approach to artificial intelligence risks being undermined by new AI rules

#artificialintelligence, Sep-11-2021, 13:02:42 GMT

The EU is winning the battle for trust among artificial intelligence (AI) researchers, academics on both sides of the Atlantic say, bolstering the Commission's ambitions to set global standards for the technology. But some fear the EU risks squandering this confidence by imposing ill-thought-through rules in its recently proposed Artificial Intelligence Act, which some academics say are at odds with the realities of AI research. "We do see a push for trustworthy and transparent AI also in the US, but, in terms of governance, we are not as far [ahead] as the EU in this regard," said Bart Selman, president of the Association for the Advancement of Artificial Intelligence (AAAI) and a professor at Cornell University. Highly international AI researchers are "aware that AI developments in the US are dominated by business interests, and in China by the government interest," said Holger Hoos, professor of machine learning at Leiden University, and a founder of the Confederation of Laboratories for Artificial Intelligence Research in Europe (CLAIRE). EU policymaking, though slower, incorporated "more voices, and more perspectives" than the more centralised process in the US and China, he argued, with the EU having taken strong action on privacy through the General Data Protection Regulation, which came into effect in 2018.

ai researcher, artificial intelligence risk, regulation, (12 more...)

Country:

Asia > China (0.58)
Europe > Netherlands > South Holland > Leiden (0.25)
South America > Brazil (0.05)
(3 more...)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > Europe Government (0.49)

Technology: Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

The Third Revolution in Warfare

The Atlantic - Technology, Sep-11-2021, 12:00:00 GMT

On the 20th anniversary of 9/11, against the backdrop of the rushed U.S.-allied Afghanistan withdrawal, the grisly reality of armed combat and the challenge posed by asymmetric suicide terror attacks grow harder to ignore. But weapons technology has changed substantially over the past two decades. And thinking ahead to the not-so-distant future, we must ask: What if these assailants were able to remove human suicide bombers or attackers from the equation altogether? As someone who has studied and worked in artificial intelligence for the better part of four decades, I worry about such a technology threat, born from artificial intelligence and robotics. Autonomous weaponry is the third revolution in warfare, following gunpowder and nuclear arms.

arm race, autonomous weapon, danger, (16 more...)

Country:

Asia > Afghanistan (0.25)
South America > Venezuela (0.15)
North America > United States (0.05)
(4 more...)

Industry: Government > Military (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

To present AI as optimistic or dystopian? "That was the biggest argument"

#artificialintelligence, Sep-11-2021, 03:30:15 GMT

AI 2041: Ten Visions for Our Future is an unusual book. Each chapter consists of a short story, penned by science fiction writer Chen Qiufan, and a related analysis piece from Kai-Fu Lee, CEO of Sinovation Ventures and author of the nonfiction bestseller AI Superpowers. Chen, who is also the founder of Thema Mundi, a content development studio, spoke with Fast Company on the eve of the release of AI 2041 about his collaboration with Lee, his own experiences with artificial intelligence, and what machine learning will mean for artists and writers. This interview was edited for length and clarity. Fast Company: How did this project come about?

biggest argument, kai-fu lee, science fiction, (12 more...)

Country:

South America > Brazil > São Paulo (0.05)
Oceania > Australia (0.05)
Europe > Middle East (0.05)
(6 more...)

Genre: Personal > Interview (0.35)

Technology:

Information Technology > Artificial Intelligence > Science Fiction (0.41)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.30)
Information Technology > Artificial Intelligence > Natural Language (0.30)

Sequential Modelling with Applications to Music Recommendation, Fact-Checking, and Speed Reading

Hansen, Christian

arXiv.org Artificial Intelligence, Sep-11-2021

Sequential modelling entails making sense of sequential data, which naturally occurs in a wide array of domains. One example is systems that interact with users, log user actions and behaviour, and make recommendations of items of potential interest to users on the basis of their previous interactions. In such cases, the sequential order of user interactions is often indicative of what the user is interested in next. Similarly, for systems that automatically infer the semantics of text, capturing the sequential order of words in a sentence is essential, as even a slight re-ordering could significantly alter its original meaning. This thesis makes methodological contributions and new investigations of sequential modelling for the specific application areas of systems that recommend music tracks to listeners and systems that process text semantics in order to automatically fact-check claims, or "speed read" text for efficient further classification.

automatic identification and verification, state-of-the-art speed reading model, veracity prediction model, (17 more...)

arXiv.org Artificial Intelligence

2109.06736

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(31 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.92)
Research Report > Promising Solution (0.67)

Industry:

Media > News (1.00)
Media > Music (1.00)
Leisure & Entertainment (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(8 more...)

On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning

Oliveira, Felipe Dennis de Resende, Batista, Eduardo Luiz Ortiz, Seara, Rui

arXiv.org Artificial Intelligence, Sep-10-2021

Despite the growing availability of high-capacity computational platforms, implementation complexity is still a major concern for the real-world deployment of neural networks. This concern is due not only to the huge costs of state-of-the-art network architectures, but also to the recent push towards edge intelligence and the use of neural networks in embedded applications. In this context, network compression techniques have been gaining interest due to their ability to reduce deployment costs while keeping inference accuracy at satisfactory levels. The present paper is dedicated to the development of a novel compression scheme for neural networks. To this end, a new $\ell_0$-norm-based regularization approach is first developed, which is capable of inducing strong sparseness in the network during training. Then, by targeting the smaller weights of the trained network with pruning techniques, smaller yet highly effective networks can be obtained. The proposed compression scheme also involves the use of $\ell_2$-norm regularization to avoid overfitting, as well as fine-tuning to improve the performance of the pruned network. Experimental results are presented to show the effectiveness of the proposed scheme and to compare it with competing approaches.
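The pruning step, and the quantity an $\ell_0$-norm measures, can be illustrated with a short sketch. It covers only magnitude-based pruning of an already trained weight vector; the paper's $\ell_0$-norm-based regularizer applied during training is not reproduced here, and the function names and example weights are hypothetical.

```python
def l0_norm(weights, tol=0.0):
    """Number of weights whose magnitude exceeds tol (the l0 'norm')."""
    return sum(1 for w in weights if abs(w) > tol)


def prune_smallest(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Toy weight vector standing in for one trained layer.
weights = [0.8, -0.01, 0.3, 0.002, -0.6, 0.05]
pruned = prune_smallest(weights, sparsity=0.5)
print(l0_norm(weights), l0_norm(pruned), pruned)
```

A regularizer that pushes many weights close to zero during training makes this step cheaper in accuracy terms, since the weights removed here already contribute little.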

neural network, pruning, regularization, (15 more...)

arXiv.org Artificial Intelligence

2109.05075

Country:

South America > Brazil > Santa Catarina > Florianópolis (0.04)
Asia (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

MultiAzterTest: a Multilingual Analyzer on Multiple Levels of Language for Readability Assessment

Bengoetxea, Kepa, Gonzalez-Dios, Itziar

arXiv.org Artificial Intelligence, Sep-10-2021

Readability assessment is the task of determining how difficult or easy a text is or which level/grade it has. Traditionally, language-dependent readability formulae have been used, but these formulae take few text characteristics into account. However, Natural Language Processing (NLP) tools that assess the complexity of texts are able to measure a wider range of features and can be adapted to different languages. In this paper, we present the MultiAzterTest tool: (i) an open source NLP tool which analyzes texts on over 125 measures of cohesion, language, and readability for English, Spanish and Basque, but whose architecture is designed to adapt easily to other languages; (ii) readability assessment classifiers that improve the performance of Coh-Metrix in English, Coh-Metrix-Esp in Spanish and ErreXail in Basque; (iii) a web tool. MultiAzterTest obtains 90.09% in accuracy when classifying into three reading levels (elementary, intermediate, and advanced) in English and 95.50% in Basque and 90% in Spanish when classifying into two reading levels (simple and complex) using an SMO classifier.
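As an illustration of the traditional, language-dependent formulae the abstract contrasts with feature-rich NLP tools, the sketch below computes the Flesch Reading Ease score for English text. It is not part of MultiAzterTest; the vowel-group syllable counter is a rough assumption, and the function names and sample sentences are hypothetical.

```python
import re


def count_syllables(word):
    """Rough English syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores indicate easier text.
    Uses only two characteristics: words per sentence and syllables per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


simple_text = "The cat sat on the mat. The dog ran to the park."
complex_text = ("Readability assessment necessitates multidimensional "
                "linguistic characterization.")
print(flesch_reading_ease(simple_text), flesch_reading_ease(complex_text))
```

The formula depends on just two surface statistics and on English-specific constants, which shows concretely why such formulae capture few text characteristics and do not transfer directly to Spanish or Basque.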

accuracy, classifier, multiaztertest, (16 more...)

arXiv.org Artificial Intelligence

2109.0487

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
