AITopics

2207.00681

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (1.00)

Industry:

Energy (0.93)
Information Technology (0.67)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

#artificialintelligence

Aug-30-2022, 14:32:28 GMT

Frequently Asked Data Science Interview Questions - Analytics Vidhya

This article was published as a part of the Data Science Blogathon. It discusses data science interview questions and their answers to help you fare well in job interviews. Though some of the questions may sound basic, they are frequently asked in interviews. Many candidates overlook the basics and face rejection as a result.

actual result, algorithm, data science interview question, (8 more...)

#artificialintelligence

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)

Fault Detection for Non-Condensing Boilers using Simulated Building Automation System Sensor Data

Shohet, Rony, Kandil, Mohamed, Wang, Y., McArthur, J. J.

Building performance has been shown to degrade significantly after commissioning, resulting in increased energy consumption and associated greenhouse gas emissions. Continuous Commissioning using existing sensor networks and IoT devices has the potential to minimize this waste by continually identifying system degradation and re-tuning control strategies to adapt to real building performance. Due to its significant contribution to greenhouse gas emissions, the performance of gas boiler systems for building heating is critical. A review of boiler performance studies has been used to develop a set of common faults and degraded performance conditions, which have been integrated into a MATLAB/Simulink emulator. This resulted in a labeled dataset with approximately 10,000 simulations of steady-state performance for each of 14 non-condensing boilers. The collected data is used for training and testing fault classification using K-Nearest Neighbour, Decision Tree, Random Forest, and Support Vector Machines. The results show that the Support Vector Machines method gave the best prediction accuracy, consistently exceeding 90%, but that generalization across multiple boilers is not possible due to low classification accuracy.

artificial intelligence, boiler, machine learning, (19 more...)

doi: 10.1016/j.aei.2020.101176

2205.08418

Country: North America > United States (0.45)

Genre:

Overview (1.00)
Research Report > New Finding (0.34)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Energy > Renewable (0.95)
Construction & Engineering > HVAC (0.72)
Energy > Power Industry (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Virtual impactor-based label-free bio-aerosol detection using holography and deep learning

Luo, Yi, Zhang, Yijie, Liu, Tairan, Yu, Alan, Wu, Yichen, Ozcan, Aydogan

Exposure to bio-aerosols such as mold spores and pollen can lead to adverse health effects. There is a need for a portable and cost-effective device for long-term monitoring and quantification of various bio-aerosols. To address this need, we present a mobile and cost-effective label-free bio-aerosol sensor that takes holographic images of flowing particulate matter concentrated by a virtual impactor, which selectively slows down and guides particles larger than ~6 microns to fly through an imaging window. The flowing particles are illuminated by a pulsed laser diode, casting their inline holograms on a CMOS image sensor in a lens-free mobile imaging device. The illumination contains three short pulses with a negligible shift of the flowing particle within one pulse, and triplicate holograms of the same particle are recorded at a single frame before it exits the imaging field-of-view, revealing different perspectives of each particle. The particles within the virtual impactor are localized through a differential detection scheme, and a deep neural network classifies the aerosol type in a label-free manner, based on the acquired holographic images. We demonstrated the success of this mobile bio-aerosol detector with a virtual impactor using different types of pollen (i.e., bermuda, elm, oak, pine, sycamore, and wheat) and achieved a blind classification accuracy of 92.91%. This mobile and cost-effective device weighs ~700 g and can be used for label-free sensing and quantification of various bio-aerosols over extended periods since it is based on a cartridge-free virtual impactor that does not capture or immobilize particulate matter.

artificial intelligence, machine learning, particle, (15 more...)

doi: 10.1021/acssensors.2c01890

2208.13979

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
North America > Bermuda (0.24)
North America > United States > Texas (0.04)
Europe (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Semiconductors & Electronics (0.88)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Mo, Shentong, Morgado, Pedro

A Closer Look at Weakly-Supervised Audio-Visual Source Localization

Audio-visual source localization is a challenging task that aims to predict the location of visual sound sources in a video. Since collecting ground-truth annotations of sounding objects can be costly, a plethora of weakly-supervised localization methods that can learn from datasets with no bounding-box annotations have been proposed in recent years, by leveraging the natural co-occurrence of audio and visual signals. Despite significant interest, popular evaluation protocols have two major flaws. First, they allow for the use of a fully annotated dataset to perform early stopping, thus significantly increasing the annotation effort required for training. Second, current evaluation metrics assume the presence of sound sources at all times. This is of course an unrealistic assumption, and thus better metrics are necessary to capture the model's performance on (negative) samples with no visible sound sources. To accomplish this, we extend the test set of popular benchmarks, Flickr SoundNet and VGG-Sound Sources, in order to include negative samples, and measure performance using metrics that balance localization accuracy and recall. Using the new protocol, we conducted an extensive evaluation of prior methods, and found that most prior works are not capable of identifying negatives and suffer from significant overfitting problems (they rely heavily on early stopping for best results). We also propose a new approach for visual sound source localization that addresses both these problems. In particular, we found that, through extreme visual dropout and the use of momentum encoders, the proposed approach combats overfitting effectively, and establishes a new state-of-the-art performance on both Flickr SoundNet and VGG-Sound Sources. Code and pre-trained models are available at https://github.com/stoneMo/SLAVC.

artificial intelligence, localization, machine learning, (15 more...)

2209.09634

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Virginia (0.04)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)

Frittoli, Luca, Carrera, Diego, Boracchi, Giacomo

Nonparametric and Online Change Detection in Multivariate Datastreams using QuantTree

We address the problem of online change detection in multivariate datastreams, and we introduce QuantTree Exponentially Weighted Moving Average (QT-EWMA), a nonparametric change-detection algorithm that can control the expected time before a false alarm, yielding a desired Average Run Length (ARL$_0$). Controlling false alarms is crucial in many applications and is rarely guaranteed by online change-detection algorithms that can monitor multivariate datastreams without knowing the data distribution. Like many change-detection algorithms, QT-EWMA builds a model of the data distribution, in our case a QuantTree histogram, from a stationary training set. To monitor datastreams even when the training set is extremely small, we propose QT-EWMA-update, which incrementally updates the QuantTree histogram during monitoring, always keeping the ARL$_0$ under control. Our experiments, performed on synthetic and real-world datastreams, demonstrate that QT-EWMA and QT-EWMA-update control the ARL$_0$ and the false alarm rate better than state-of-the-art methods operating in similar conditions, achieving lower or comparable detection delays.
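The monitoring idea in this abstract can be illustrated with a minimal sketch. It is not the authors' QT-EWMA: a 1-D equal-frequency histogram stands in for the QuantTree histogram, the statistic is a simple chi-square-style distance between EWMA-smoothed bin frequencies and their expected value, and the fixed `threshold` is a hypothetical placeholder that does not control ARL$_0$. All function and parameter names below are illustrative assumptions.

```python
def fit_bins(training, n_bins):
    """Equal-frequency bin edges from a 1-D training set (stand-in for a QuantTree histogram)."""
    ordered = sorted(training)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]


def bin_index(x, edges):
    """Index of the bin containing x (number of edges at or below x)."""
    return sum(1 for e in edges if x >= e)


def monitor(stream, edges, lam=0.05, threshold=0.5):
    """Return the index of the first alarm, or None if no alarm is raised.

    lam and threshold are hypothetical values chosen for illustration only.
    """
    n_bins = len(edges) + 1
    target = 1.0 / n_bins          # expected bin probability when nothing has changed
    z = [target] * n_bins          # EWMA of the bin-membership indicators
    for t, x in enumerate(stream):
        hit = bin_index(x, edges)
        z = [(1 - lam) * zj + lam * (1.0 if j == hit else 0.0)
             for j, zj in enumerate(z)]
        # Distance between smoothed frequencies and the expected ones.
        stat = sum((zj - target) ** 2 / target for zj in z)
        if stat > threshold:
            return t
    return None


# Deterministic toy data: values spread evenly over 0..99, then stuck at 90.
edges = fit_bins(list(range(100)), n_bins=4)
stationary = [(i * 37) % 100 for i in range(200)]
changed = [90] * 50
alarm = monitor(stationary + changed, edges)
```

In this toy run the alarm is expected shortly after the stream switches to the constant value, because the smoothed frequency of one bin drifts away from its expected share. Choosing thresholds that guarantee a target ARL$_0$ is the part of the method this sketch leaves out.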

algorithm, arl 0, datastream, (14 more...)

doi: 10.1109/TKDE.2022.3201635

2208.14801

Country:

North America > United States > California (0.04)
Europe > Finland > Pirkanmaa > Tampere (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Rodrigues, Érick Oliveira, de Morais, Felipe Fernandes Cordeiro, Conci, Aura

On the Automated Segmentation of Epicardial and Mediastinal Cardiac Adipose Tissues Using Classification Algorithms

The quantification of fat depots on the surroundings of the heart is an accurate procedure for evaluating health risk factors correlated with several diseases. However, this type of evaluation is not widely employed in clinical practice due to the required human workload. This work proposes a novel technique for the automatic segmentation of cardiac fat pads. The technique is based on applying classification algorithms to the segmentation of cardiac CT images. Furthermore, we extensively evaluate the performance of several algorithms on this task and discuss which provided better predictive models. Experimental results have shown that the mean accuracy for the classification of epicardial and mediastinal fats was 98.4%, with a mean true positive rate of 96.2%. On average, the Dice similarity index, regarding the segmented patients and the ground truth, was equal to 96.8%. Therefore, our technique has achieved the most accurate results for the automatic segmentation of cardiac fats, to date.
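For readers unfamiliar with the Dice similarity index reported above, the following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), applied to two binary masks. The flat 0/1 list representation and the toy masks are assumptions for illustration and are not taken from the paper.

```python
def dice_index(predicted, truth):
    """Dice similarity between two binary masks given as flat sequences of 0/1."""
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same length")
    # Pixels marked as foreground in both masks.
    overlap = sum(1 for p, t in zip(predicted, truth) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in predicted if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy example: 3 foreground pixels in each mask, 2 of them shared.
predicted_mask = [1, 1, 0, 0, 1, 0]
truth_mask = [1, 0, 0, 0, 1, 1]
score = dice_index(predicted_mask, truth_mask)  # 2 * 2 / (3 + 3)
```

A value of 1.0 means the segmentation and the ground truth coincide exactly, and 0.0 means they share no foreground pixels.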

algorithm, epicardial fat, segmentation, (14 more...)

2208.14352

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
South America > Brazil > Rio de Janeiro > Niterói (0.04)

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Technology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Non-readily identifiable data collaboration analysis for multiple datasets including personal information

Imakura, Akira, Sakurai, Tetsuya, Okada, Yukihiko, Fujii, Tomoya, Sakamoto, Teppei, Abe, Hiroyuki

Multi-source data fusion, in which multiple data sources are jointly analyzed to obtain improved information, has attracted considerable research attention. For the datasets of multiple medical institutions, data confidentiality and cross-institutional communication are critical. In such cases, data collaboration (DC) analysis by sharing dimensionality-reduced intermediate representations without iterative cross-institutional communications may be appropriate. Identifiability of the shared data is essential when analyzing data including personal information. In this study, the identifiability of the DC analysis is investigated. The results reveal that the shared intermediate representations are readily identifiable to the original data for supervised learning. This study then proposes a non-readily identifiable DC analysis only sharing non-readily identifiable data for multiple medical datasets including personal information. The proposed method solves identifiability concerns based on a random sample permutation, the concept of interpretable DC analysis, and usage of functions that cannot be reconstructed. In numerical experiments on medical datasets, the proposed method is non-readily identifiable while maintaining the high recognition performance of the conventional DC analysis. For a hospital dataset, the proposed method exhibits a nine percentage point improvement in recognition performance over the local analysis that uses only the local dataset.

dc analysis, intermediate representation, representation, (14 more...)

2208.14611

Country:

Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
North America > United States > California (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.54)
(4 more...)

Bionda, Andrea, Frittoli, Luca, Boracchi, Giacomo

Deep Autoencoders for Anomaly Detection in Textured Images using CW-SSIM

Detecting anomalous regions in images is a frequently encountered problem in industrial monitoring. A relevant example is the analysis of tissues and other products that in normal conditions conform to a specific texture, while defects introduce changes in the normal pattern. We address the anomaly detection problem by training a deep autoencoder, and we show that adopting a loss function based on Complex Wavelet Structural Similarity (CW-SSIM) yields superior detection performance on this type of image compared to traditional autoencoder loss functions. Our experiments on well-known anomaly detection benchmarks show that a simple model trained with this loss function can achieve comparable or superior performance to state-of-the-art methods leveraging deeper, larger and more computationally demanding neural networks.

anomaly detection, autoencoder, detection, (12 more...)

doi: 10.1007/978-3-031-06430-2_56

2208.14045

Country: Europe > Italy > Lombardy > Milan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Multimodal Learning on Graphs for Disease Relation Extraction

Lin, Yucong, Lu, Keming, Yu, Sheng, Cai, Tianxi, Zitnik, Marinka

Objective: Disease knowledge graphs are a way to connect, organize, and access disparate information about diseases with numerous benefits for artificial intelligence (AI). To create knowledge graphs, it is necessary to extract knowledge from multimodal datasets in the form of relationships between disease concepts and normalize both concepts and relationship types. Methods: We introduce REMAP, a multimodal approach for disease relation extraction and classification. The REMAP machine learning approach jointly embeds a partial, incomplete knowledge graph and a medical language dataset into a compact latent vector space, followed by aligning the multimodal embeddings for optimal disease relation extraction. Results: We apply the REMAP approach to a disease knowledge graph with 96,913 relations and a text dataset of 1.24 million sentences. On a dataset annotated by human experts, REMAP improves text-based disease relation extraction by 10.0% (accuracy) and 17.2% (F1-score) by fusing disease knowledge graphs with text information. Further, REMAP leverages text information to recommend new relationships in the knowledge graph, outperforming graph-based methods by 8.4% (accuracy) and 10.4% (F1-score). Conclusion: REMAP is a multimodal approach for extracting and classifying disease relationships by fusing structured knowledge and text information. REMAP provides a flexible neural architecture to easily find, access, and validate AI-driven relationships between disease concepts.

knowledge graph, relation, relation extraction, (15 more...)

2203.08893

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > China > Beijing > Beijing (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)