


Convolutional Neural Nets vs Vision Transformers: A SpaceNet Case Study with Balanced vs Imbalanced Regimes

arXiv.org Artificial Intelligence

We present a controlled comparison of a convolutional neural network (EfficientNet-B0) and a Vision Transformer (ViT-Base) on SpaceNet under two label-distribution regimes: a naturally imbalanced five-class split and a balanced-resampled split with 700 images per class (70:20:10 train/val/test). With matched preprocessing (224x224, ImageNet normalization), lightweight augmentations, and a 40-epoch budget on a single NVIDIA P100, we report accuracy, macro-F1, balanced accuracy, per-class recall, and deployment metrics (model size and latency). On the imbalanced split, EfficientNet-B0 reaches 93% test accuracy with strong macro-F1 and lower latency; ViT-Base is competitive at 93% with a larger parameter count and runtime. On the balanced split, both models are strong; EfficientNet-B0 reaches 99% while ViT-Base remains competitive, indicating that balancing narrows architecture gaps while CNNs retain an efficiency edge. We release manifests, logs, and per-image predictions to support reproducibility.
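To make the reported metrics concrete, here is a minimal sketch of how per-class recall, balanced accuracy, and macro-F1 can be computed from lists of labels. The function names and the toy labels are illustrative assumptions and are not taken from the paper's released code. The sketch also assumes every class appears at least once in the true labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def per_class_recall(y_true, y_pred, num_classes):
    """Recall for each class: correct predictions / true members of the class."""
    recalls = []
    for c in range(num_classes):
        support = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support if support else 0.0)
    return recalls

def balanced_accuracy(y_true, y_pred, num_classes):
    """Unweighted mean of per-class recall."""
    recalls = per_class_recall(y_true, y_pred, num_classes)
    return sum(recalls) / num_classes

def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes

# Toy imbalanced example (hypothetical data): a classifier that always
# predicts the majority class 0.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred, num_classes=2)
f1 = macro_f1(y_true, y_pred, num_classes=2)
```

In this toy case plain accuracy looks high, while balanced accuracy and macro-F1 expose the ignored minority class. That is why both are worth reporting alongside accuracy on an imbalanced split.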


Implementation of AI Deep Learning Algorithm For Multi-Modal Sentiment Analysis

arXiv.org Artificial Intelligence

A multi-modal emotion recognition method is built by combining a two-channel convolutional neural network with a ring network. The method extracts emotional information effectively and improves learning efficiency. Words are vectorized with GloVe, and the word vectors are fed into the convolutional neural network. By combining an attention mechanism with the maximum-pool converter BiSRU channel, the model captures both local deep emotional features and the sequential emotional semantics of the preceding and following context. Finally, the multiple features are fused and used to predict emotion polarity, which gives the emotion analysis of the target. Experiments show that this feature-fusion method improves recognition accuracy on the emotion dataset and reduces training time. The model also shows some ability to generalize.


Hypergraph convolutional neural network-based clustering technique

arXiv.org Artificial Intelligence

This paper presents a novel hypergraph convolutional neural network-based clustering technique. The technique is applied to the clustering problem on the Citeseer and Cora datasets. Each dataset contains a feature matrix and the incidence matrix of a hypergraph constructed from that feature matrix, and the clustering method uses both matrices. First, hypergraph auto-encoders transform both the incidence matrix and the feature matrix from a high-dimensional space to a low-dimensional space. Then the k-means clustering technique is applied to the transformed matrix. In experiments, the hypergraph convolutional neural network (CNN)-based clustering technique performed better than the other classical clustering techniques.


Fully automated convolutional neural network-based affine algorithm improves liver registration and lesion co-localization on hepatobiliary phase T1-weighted MR images

#artificialintelligence

Liver alignment between series/exams is challenged by dynamic morphology or variability in patient positioning or motion. Image registration can improve image interpretation and lesion co-localization. We assessed the performance of a convolutional neural network algorithm to register cross-sectional liver imaging series and compared its performance to manual image registration. Three hundred fourteen patients, including internal and external datasets, who underwent gadoxetate disodium-enhanced magnetic resonance imaging for clinical care from 2011 to 2018, were retrospectively selected. Automated registration was applied to all 2,663 within-patient series pairs derived from these datasets.


Convolutional Neural Net in Tensorflow – Good Audience

#artificialintelligence

One of the most exciting areas of deep learning is computer vision. Through recent advances in convolutional neural nets we have been able to create self-driving cars, facial detection systems, and automated medical imagery analysis that outperforms specialists, just to name a few. In this article I will show you the fundamentals of convolutional neural nets and how you can create one yourself to classify handwritten digits. Many fields of deep learning are hyped to the public as replications of biological functions in the human brain; convolutional neural nets actually come very close. Back in 1959, David Hubel and Torsten Wiesel conducted experiments on cats and monkeys that revealed a great deal about how the visual cortex functions. They found that many neurons have a small local receptive field and react only to a small, finite area of the total visual field.
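The local receptive field idea maps directly onto the convolution operation: each output value depends only on a small patch of the input. The following is a minimal pure-Python sketch, not the article's TensorFlow code. The function name `conv2d_valid`, the toy image, and the edge-detecting kernel are all illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide a kernel over a 2D image with no padding and stride 1.

    Each output value is computed from one kernel-sized patch of the
    input, which plays the role of that unit's local receptive field.
    (As in most deep learning libraries, the kernel is not flipped.)
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # Only the pixels under the kernel contribute to this output.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# Toy 4x4 image: dark left half, bright right half (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
```

The resulting feature map is nonzero only in the column where the kernel straddles the edge, so each output unit reacts to its own small region of the image and nothing else.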


Deep Learning Paves Way for Better Diagnostics

#artificialintelligence

Stanford researchers are leveraging GPU-based machines in the Amazon EC2 cloud to run deep learning workloads with the goal of improving diagnostics for a chronic eye disease, called diabetic retinopathy. The disease is a complication of diabetes that can lead to blindness if blood sugar is poorly controlled. It affects about 45 percent of diabetics and 100 million people worldwide, many in developing nations. Final-year Stanford PhD students Apaar Sadhwani and Jason Su got involved in developing the diagnostic solution as part of a class project and corresponding Kaggle competition that was held last year. Sponsor Amazon provided AWS cloud credits in support of the research.


softmax-classifiers-explained

#artificialintelligence

Last week, we discussed Multi-class SVM loss; specifically, the hinge loss and squared hinge loss functions. In reality, these values would not be randomly generated -- they would instead be the output of your scoring function f. Let's exponentiate the output of the scoring function, yielding our unnormalized probabilities: Figure 2: Exponentiating the output values from the scoring function gives us our unnormalized probabilities. Figure 4: Taking the negative log of the probability for the correct ground-truth class yields the final loss for the data point. To examine some actual probabilities, let's loop over a few randomly sampled training examples and examine the output probabilities returned by the classifier: Note: I'm randomly sampling from the training data rather than the testing data to demonstrate that there should be a noticeably large gap in between the probabilities for each class label.
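The steps described in the excerpt (exponentiate the scores, normalize them, then take the negative log of the correct class's probability) can be sketched as follows. This is a minimal illustration under stated assumptions, not the original post's code: the function names and the example scores are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    # Subtracting the max score is a common numerical-stability trick;
    # it does not change the resulting probabilities.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]  # unnormalized probabilities
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(scores, true_class):
    """Negative log of the probability assigned to the ground-truth class."""
    probs = softmax(scores)
    return -math.log(probs[true_class])

# Hypothetical scoring-function output for three classes;
# index 0 is assumed to be the correct class.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
loss = cross_entropy_loss(scores, true_class=0)
```

The loss is small when the correct class receives most of the probability mass and grows without bound as that probability approaches zero.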


SigOpt for ML: TensorFlow ConvNets on a Budget with Bayesian Optimization

#artificialintelligence

In this post on integrating SigOpt with machine learning frameworks, we will show you how to use SigOpt and TensorFlow to efficiently search for an optimal configuration of a convolutional neural network (CNN). There are a large number of tunable parameters associated with defining and training deep neural networks (Bergstra [1]), and SigOpt accelerates searching through these settings to find optimal configurations. This search is typically a slow and expensive process, especially when using standard techniques like grid or random search, as evaluating each configuration can take multiple hours. SigOpt finds good combinations far more efficiently than these standard methods by employing an ensemble of state-of-the-art Bayesian optimization techniques, allowing users to arrive at the best models faster and more cheaply. In this example, we consider the same optical character recognition task of the SVHN dataset as discussed in a previous post.