AITopics | SPE

Homophily Outlier Detection in Non-IID Categorical Data

Pang, Guansong, Cao, Longbing, Chen, Ling

arXiv.org Artificial Intelligence, Mar-21-2021

Most existing outlier detection methods assume that the outlier factors (i.e., outlierness scoring measures) of data entities (e.g., feature values and data objects) are Independent and Identically Distributed (IID). This assumption does not hold in real-world applications where the outlierness of different entities is dependent on each other and/or taken from different probability distributions (non-IID). This may lead to a failure to detect important outliers that are too subtle to be identified without considering the non-IID nature. The issue is intensified in more challenging contexts, e.g., high-dimensional data with many noisy features. This work introduces a novel outlier detection framework and its two instances to identify outliers in categorical data by capturing non-IID outlier factors. Our approach first defines and incorporates distribution-sensitive outlier factors and their interdependence into a value-value graph-based representation. It then models an outlierness propagation process in the value graph to learn the outlierness of feature values. The learned value outlierness allows for either direct outlier detection or outlying feature selection. The graph representation and mining approach is employed here to capture the rich non-IID characteristics. Our empirical results on 15 real-world data sets with different levels of data complexity show that (i) the proposed outlier detection methods significantly outperform five state-of-the-art methods at the 95%/99% confidence level, achieving 10%-28% AUC improvement on the 10 most complex data sets; and (ii) the proposed feature selection methods significantly outperform three competing methods in enabling subsequent outlier detection by two different existing detectors.
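
As a rough illustration of what "outlierness propagation in a value graph" could look like, the sketch below runs a damped random walk over a small weighted graph whose nodes stand for feature values. It is not the authors' method: the function name `propagate_outlierness`, the damping factor, and the example edge weights are all assumptions made for illustration, and the abstract does not say how the paper defines its outlier factors or edge weights.

```python
def propagate_outlierness(weights, damping=0.85, iters=100):
    """Damped random walk over a value graph (illustrative assumption only).

    weights[u][v] is a hypothetical non-negative influence of value u on value v.
    Returns one score per value; the scores sum to 1.
    """
    n = len(weights)
    # Row-normalise the weights into transition probabilities.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total if total else 1.0 / n for w in row])
    # Start from a uniform score and iterate the propagation step.
    score = [1.0 / n] * n
    for _ in range(iters):
        score = [
            (1 - damping) / n
            + damping * sum(score[u] * trans[u][v] for u in range(n))
            for v in range(n)
        ]
    return score


# Hypothetical graph of three feature values; value 2 receives the most weight.
example = [[0, 1, 3], [1, 0, 3], [3, 3, 0]]
print(propagate_outlierness(example))
```

In this toy graph the third value ends up with the highest score because both other values point to it most strongly; a detector could then rank data objects by the scores of the values they contain.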

detection, outlier detection, outlierness, (14 more...)

arXiv.org Artificial Intelligence

2103.11516

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing

Gui, Tao, Wang, Xiao, Zhang, Qi, Liu, Qin, Zou, Yicheng, Zhou, Xin, Zheng, Rui, Zhang, Chong, Wu, Qinzhuo, Ye, Jiacheng, Pang, Zexiong, Zhang, Yongxin, Li, Zhengyan, Ma, Ruotian, Fei, Zichu, Cai, Ruijian, Zhao, Jun, Hu, Xinwu, Yan, Zhiheng, Tan, Yiding, Hu, Yuan, Bian, Qiyuan, Liu, Zhihua, Zhu, Bolin, Qin, Shan, Xing, Xiaoyu, Fu, Jinlan, Zhang, Yue, Peng, Minlong, Zheng, Xiaoqing, Zhou, Yaqian, Wei, Zhongyu, Qiu, Xipeng, Huang, Xuanjing

arXiv.org Artificial Intelligence, Mar-21-2021

Various robustness evaluation methodologies from different perspectives have been proposed for different natural language processing (NLP) tasks. These methods have often focused on either universal or task-specific generalization capabilities. In this work, we propose a multilingual robustness evaluation platform for NLP tasks (TextFlint) that incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis. TextFlint enables practitioners to automatically evaluate their models from all aspects or to customize their evaluations as desired with just a few lines of code. To guarantee user acceptability, all the text transformations are linguistically based, and we provide a human evaluation for each one. TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness. To validate TextFlint's utility, we performed large-scale empirical evaluations (over 67,000 evaluations) on state-of-the-art deep learning models, classic supervised methods, and real-world systems. Almost all models showed significant performance degradation, including a decline of more than 50% in BERT's prediction accuracy on tasks such as aspect-level sentiment classification, named entity recognition, and natural language inference. Therefore, we call for robustness to be included in model evaluation, so as to promote the healthy development of NLP technology.

dataset, evaluation, transformation, (14 more...)

arXiv.org Artificial Intelligence

2103.11441

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Government > Military (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.88)

White Paper Machine Learning in Certified Systems

Delseny, Hervé, Gabreau, Christophe, Gauffriau, Adrien, Beaudouin, Bernard, Ponsolle, Ludovic, Alecu, Lucian, Bonnin, Hugues, Beltran, Brice, Duchel, Didier, Ginestet, Jean-Brice, Hervieu, Alexandre, Martinez, Ghilaine, Pasquet, Sylvain, Delmas, Kevin, Pagetti, Claire, Gabriel, Jean-Marc, Chapdelaine, Camille, Picard, Sylvaine, Damour, Mathieu, Cappi, Cyril, Gardès, Laurent, De Grancey, Florence, Jenn, Eric, Lefevre, Baptiste, Flandin, Gregory, Gerchinovitz, Sébastien, Mamalet, Franck, Albore, Alexandre

arXiv.org Artificial Intelligence, Mar-18-2021

Machine Learning (ML) seems to be one of the most promising solutions to partially or completely automate some of the complex tasks currently performed by humans, such as driving vehicles, recognizing voice, etc. It is also an opportunity to implement and embed new capabilities out of the reach of classical implementation techniques. However, ML techniques introduce new potential risks. Therefore, they have only been applied in systems where their benefits are considered worth the increase of risk. In practice, ML techniques raise multiple challenges that could prevent their use in systems submitted to certification constraints. But what are the actual challenges? Can they be overcome by selecting appropriate ML techniques, or by adopting new engineering or certification practices? These are some of the questions addressed by the ML Certification 3 Workgroup (WG) set up by the Institut de Recherche Technologique Saint Exupéry de Toulouse (IRT), as part of the DEEL Project.

development process, neural network, safety requirement, (17 more...)

arXiv.org Artificial Intelligence

2103.10529

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.24)
North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Quebec > Montreal (0.04)
(10 more...)

Genre:

Research Report > Experimental Study (0.67)
Research Report > Promising Solution (0.48)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Transportation > Ground > Rail (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(6 more...)

Using a Personal Health Library-Enabled mHealth Recommender System for Self-Management of Diabetes Among Underserved Populations: Use Case for Knowledge Graphs and Linked Data

Ammar, Nariman, Bailey, James E, Davis, Robert L, Shaban-Nejad, Arash

arXiv.org Artificial Intelligence, Mar-16-2021

Personal health libraries (PHLs) provide a single point of secure access to patients' digital health data and enable the integration of knowledge stored in their digital health profiles with other sources of global knowledge. PHLs can help empower caregivers and health care providers to make informed decisions about patients' health by understanding medical events in the context of their lives. This paper reports the implementation of a mobile health digital intervention that incorporates both digital health data stored in patients' PHLs and other sources of contextual knowledge to deliver tailored recommendations for improving self-care behaviors in diabetic adults. We conducted a thematic assessment of patient functional and nonfunctional requirements that are missing from current EHRs based on evidence from the literature. We used the results to identify the technologies needed to address those requirements. We describe the technological infrastructures used to construct, manage, and integrate the types of knowledge stored in the PHL. We leverage the Social Linked Data (Solid) platform to design a fully decentralized and privacy-aware platform that supports interoperability and care integration. We provided an initial prototype design of a PHL and drafted a use case scenario that involves four actors to demonstrate how the proposed prototype can be used to address user requirements, including the construction and management of the PHL and its utilization for developing a mobile app that queries the knowledge stored and integrated into the PHL in a private and fully decentralized manner to provide better recommendations. The proposed PHL helps patients and their caregivers take a central role in making decisions regarding their health and equips their health care providers with informatics tools that support the collection and interpretation of the collected knowledge.

jmir form res 2021, jmir formative research ammar, phl, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.2196/24738

2103.09311

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Tennessee > Shelby County > Memphis (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
(8 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(3 more...)

Technology:

Information Technology > Biomedical Informatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Communications > Web > Semantic Web (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.64)

Robots increase the gender pay gap despite raising wages overall

New Scientist - News, Mar-15-2021, 16:49:54 GMT

When industries replace workers with robots, wages rise for all on average due to productivity gains, but the difference in pay for men and women widens. They found that the number of robots per 10,000 workers increased, on average, by 47 per cent between 2006 and 2014.

computer engineering, gender pay gap, sustainability, (4 more...)

New Scientist - News

AI-Alerts: 2021 > 2021-03 > AAAI AI-Alert for Mar 16, 2021 (1.00)

Country: Europe (0.56)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

How Data Training Accelerates the Implementation of AI into Medical Industry

#artificialintelligence, Mar-12-2021, 21:30:23 GMT

COVID-19 has undoubtedly accelerated the application of AI in the healthcare industry, such as virus surveillance, diagnosis, and patient risk assessments. AI-powered robots and digital assistants with real-time monitoring and analysis have enabled doctors to provide more effective and personalized treatment. Machine learning is the study of computer algorithms that improve automatically through experience. It is seen as a part of artificial intelligence. It gives algorithms the ability to "learn" from training data so as to identify patterns and make decisions with little human intervention.

artificial intelligence, data training accelerate, machine learning, (7 more...)

#artificialintelligence

Industry: Health & Medicine > Health Care Providers & Services (0.38)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.38)

Sentinel: A Hyper-Heuristic for the Generation of Mutant Reduction Strategies

Guizzo, Giovani, Sarro, Federica, Krinke, Jens, Vergilio, Silvia Regina

arXiv.org Artificial Intelligence, Mar-12-2021

Mutation testing is an effective approach to evaluate and strengthen software test suites, but its adoption is currently limited by the computational cost of executing the mutants. Several strategies have been proposed to reduce this cost (a.k.a. mutation cost reduction strategies); however, none of them has proven to be effective for all scenarios, since they often need ad-hoc manual selection and configuration depending on the software under test (SUT). In this paper, we propose a novel multi-objective evolutionary hyper-heuristic approach, dubbed Sentinel, to automate the generation of optimal cost reduction strategies for every new SUT. We evaluate Sentinel by carrying out a thorough empirical study involving 40 releases of 10 open-source real-world software systems and both baseline and state-of-the-art strategies as a benchmark. We execute a total of 4,800 experiments, and evaluate their results with both quality indicators and statistical significance tests, following the most recent best practice in the literature. The results show that strategies generated by Sentinel outperform the baseline strategies in 95% of the cases, always with large effect sizes. They also obtain statistically significantly better results than state-of-the-art strategies in 88% of the cases, with large effect sizes for 95% of them. Also, our study reveals that the mutation strategies generated by Sentinel for a given software version can be used without any loss in quality for subsequently developed versions in 95% of the cases. These results show that Sentinel is able to automatically generate mutation strategies that reduce mutation testing cost without affecting its testing effectiveness (i.e., mutation score), thus relieving the tester of the burden of manually selecting and configuring strategies for each SUT.
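
For readers unfamiliar with the metric named in this abstract, the mutation score is conventionally the fraction of non-equivalent mutants that the test suite kills. The helper below is a minimal sketch of that conventional definition; the function name and the `equivalent` parameter are illustrative assumptions, and the abstract does not state exactly how Sentinel computes or approximates the score.

```python
def mutation_score(killed: int, total: int, equivalent: int = 0) -> float:
    """Fraction of non-equivalent mutants killed by a test suite.

    killed: mutants detected by at least one failing test.
    total: all generated mutants.
    equivalent: mutants known to behave identically to the original program.
    """
    considered = total - equivalent
    if considered <= 0:
        raise ValueError("no non-equivalent mutants to score")
    if not 0 <= killed <= considered:
        raise ValueError("killed must be between 0 and the number of scored mutants")
    return killed / considered


# Hypothetical counts: 45 of 50 mutants killed, none known to be equivalent.
print(mutation_score(45, 50))
```

A cost reduction strategy executes only a subset of the mutants, so the practical question is how closely the score on that subset tracks the score on the full set.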

mutation score, operator, sentinel, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TSE.2020.3002496

2103.07241

Country:

Europe > United Kingdom (0.14)
South America > Uruguay > Maldonado > Maldonado (0.04)
South America > Brazil > Paraná > Curitiba (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems

Guo, Ping, Huang, Kaizhu, Xu, Zenglin

arXiv.org Artificial Intelligence, Mar-9-2021

In this work, we generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics into the neural partial differential equations (NPDE), which can be considered as the fundamental equations in the field of artificial intelligence research. We use the finite difference method to discretize the NPDE and find numerical solutions, and in doing so generate the basic building blocks of deep neural network architectures, including the multi-layer perceptron, convolutional neural networks, and recurrent neural networks. Learning strategies, such as adaptive moment estimation, L-BFGS, pseudoinverse learning algorithms, and partial differential equation constrained optimization, are also presented. We believe it is significant to present a clear physical image of interpretable deep neural networks, which makes it possible to apply them to analog computing device design and paves the road to physical artificial intelligence.
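
To make the finite-difference idea concrete, the sketch below discretizes the plain 1D diffusion equation u_t = D u_xx with an explicit Euler step. This is the standard textbook scheme, not the paper's NPDE; the function name `diffusion_step`, the fixed boundary values, and the parameter r = D*dt/dx**2 are assumptions for illustration. The point it shows is that one time step is a 3-tap weighted sum over neighbouring grid points, which has the same shape as a small convolution.

```python
def diffusion_step(u, r):
    """One explicit Euler step of u_t = D * u_xx on a 1D grid.

    u: list of grid values; the two end points are held fixed.
    r: D * dt / dx**2 (the explicit scheme is stable for r <= 0.5).
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        # Weighted sum of a point and its two neighbours: kernel [r, 1 - 2r, r].
        new[i] = r * u[i - 1] + (1 - 2 * r) * u[i] + r * u[i + 1]
    return new


# A unit spike spreads to its neighbours after one step.
print(diffusion_step([0, 0, 1, 0, 0], 0.25))  # → [0, 0.25, 0.5, 0.25, 0]
```

Stacking such steps gives a layered computation in which each layer applies the same local kernel, which is one way to read the abstract's claim that discretization yields network building blocks.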

deep learning, equation, upstream oil & gas, (18 more...)

arXiv.org Artificial Intelligence

2103.08313

Country:

North America > United States > New York (0.14)
North America > Canada > Quebec (0.14)
Asia > China > Guangdong Province (0.14)
(3 more...)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The AI Index 2021 Annual Report

Zhang, Daniel, Mishra, Saurabh, Brynjolfsson, Erik, Etchemendy, John, Ganguli, Deep, Grosz, Barbara, Lyons, Terah, Manyika, James, Niebles, Juan Carlos, Sellitto, Michael, Shoham, Yoav, Clark, Jack, Perrault, Raymond

arXiv.org Artificial Intelligence, Mar-8-2021

Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.

data mining, large language model, machine learning, (27 more...)

arXiv.org Artificial Intelligence

2103.06312

Country:

Asia > India (0.46)
Oceania > Australia (0.28)
Asia > Japan (0.28)
(110 more...)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Social Sector (1.00)
Media > News (1.00)
Leisure & Entertainment > Sports (1.00)
(17 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
(15 more...)

Teaching Home Robots

#artificialintelligence, Mar-7-2021, 18:55:41 GMT

At TRI, our goal is to make breakthrough capabilities in Artificial Intelligence (AI). Despite recent advancements in AI, the large amount of data collection needed to deploy systems in unstructured environments continues to be a burden. Data collection in computer vision can be both quite costly and time-consuming, largely due to the process of annotating. Annotating data is typically done by a team of labelers, who are provided a long list of rules for how to handle different scenarios and what data to collect. For complex systems like a home robot or a self-driving car, these rules must be constantly refined, which creates an expensive feedback loop.

artificial intelligence, sensor, simulator, (16 more...)

#artificialintelligence

Industry: Automobiles & Trucks > Manufacturer (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.61)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.56)