AITopics

2403.11483

Country:

Asia > China (0.04)
Oceania > Australia > Queensland (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Habchi, Yassine, Kheddar, Hamza, Himeur, Yassine, Boukabou, Abdelkrim, Chouchane, Ammar, Ouamane, Abdelmalik, Atalla, Shadi, Mansoor, Wathiq

Machine Learning and Vision Transformers for Thyroid Carcinoma Diagnosis: A review

arXiv.org Artificial IntelligenceMar-17-2024

The growing interest in developing smart diagnostic systems to help medical experts process extensive data for treating incurable diseases has been notable. In particular, the challenge of identifying thyroid cancer (TC) has seen progress with the use of machine learning (ML) and big data analysis, incorporating transformers to evaluate TC prognosis and determine the risk of malignancy in individuals. This review article presents a summary of various studies on AIbased approaches, especially those employing transformers, for diagnosing TC. It introduces a new categorization system for these methods based on artifcial intelligence (AI) algorithms, the goals of the framework, and the computing environments used. Additionally, it scrutinizes and contrasts the available TC datasets by their features. The paper highlights the importance of AI instruments in aiding the diagnosis and treatment of TC through supervised, unsupervised, or mixed approaches, with a special focus on the ongoing importance of transformers in medical diagnostics and disease management. It further discusses the progress made and the continuing obstacles in this area. Lastly, it explores future directions and focuses within this research feld.

classification, diagnosis, thyroid nodule, (11 more...)

2403.13843

Country:

North America > United States (0.27)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Thyroid Cancer (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
(7 more...)

Uddin, Md. Ashraf, Aryal, Sunil, Bouadjenek, Mohamed Reda, Al-Hawawreh, Muna, Talukder, Md. Alamin

A Dual-Tier Adaptive One-Class Classification IDS for Emerging Cyberthreats

arXiv.org Artificial IntelligenceMar-17-2024

In today's digital age, our dependence on IoT (Internet of Things) and IIoT (Industrial IoT) systems has grown immensely, which facilitates sensitive activities such as banking transactions and personal, enterprise data, and legal document exchanges. Cyberattackers consistently exploit weak security measures and tools. The Network Intrusion Detection System (IDS) acts as a primary tool against such cyber threats. However, machine learning-based IDSs, when trained on specific attack patterns, often misclassify new emerging cyberattacks. Further, the limited availability of attack instances for training a supervised learner and the ever-evolving nature of cyber threats further complicate the matter. This emphasizes the need for an adaptable IDS framework capable of recognizing and learning from unfamiliar/unseen attacks over time. In this research, we propose a one-class classification-driven IDS system structured on two tiers. The first tier distinguishes between normal activities and attacks/threats, while the second tier determines if the detected attack is known or unknown. Within this second tier, we also embed a multi-classification mechanism coupled with a clustering algorithm. This model not only identifies unseen attacks but also uses them for retraining them by clustering unseen attacks. This enables our model to be future-proofed, capable of evolving with emerging threat patterns. Leveraging one-class classifiers (OCC) at the first level, our approach bypasses the need for attack samples, addressing data imbalance and zero-day attack concerns and OCC at the second level can effectively separate unknown attacks from the known attacks. Our methodology and evaluations indicate that the presented framework exhibits promising potential for real-world deployments.

category, dataset, unknown attack, (13 more...)

2403.1301

Country:

Oceania > Australia > New South Wales (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.48)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
(4 more...)

arXiv.org Artificial IntelligenceMar-16-2024

Graph Regularized NMF with L20-norm for Unsupervised Feature Learning

Wang, Zhen, Min, Wenwen

Nonnegative Matrix Factorization (NMF) is a widely applied technique in the fields of machine learning and data mining. Graph Regularized Non-negative Matrix Factorization (GNMF) is an extension of NMF that incorporates graph regularization constraints. GNMF has demonstrated exceptional performance in clustering and dimensionality reduction, effectively discovering inherent low-dimensional structures embedded within high-dimensional spaces. However, the sensitivity of GNMF to noise limits its stability and robustness in practical applications. In order to enhance feature sparsity and mitigate the impact of noise while mining row sparsity patterns in the data for effective feature selection, we introduce the $\ell_{2,0}$-norm constraint as the sparsity constraints for GNMF. We propose an unsupervised feature learning framework based on GNMF\_$\ell_{20}$ and devise an algorithm based on PALM and its accelerated version to address this problem. Additionally, we establish the convergence of the proposed algorithms and validate the efficacy and superiority of our approach through experiments conducted on both simulated and real image data.

algorithm, constraint, matrix factorization, (15 more...)

2403.1091

Country:

Asia > China > Yunnan Province > Kunming (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

arXiv.org Artificial IntelligenceMar-16-2024

Advancing multivariate time series similarity assessment: an integrated computational approach

Tonle, Franck, Tonnang, Henri, Ndadji, Milliam, Tchendji, Maurice, Nzeukou, Armand, Senagi, Kennedy, Niassy, Saliou

Data mining, particularly the analysis of multivariate time series data, plays a crucial role in extracting insights from complex systems and supporting informed decision-making across diverse domains. However, assessing the similarity of multivariate time series data presents several challenges, including dealing with large datasets, addressing temporal misalignments, and the need for efficient and comprehensive analytical frameworks. To address all these challenges, we propose a novel integrated computational approach known as Multivariate Time series Alignment and Similarity Assessment (MTASA). MTASA is built upon a hybrid methodology designed to optimize time series alignment, complemented by a multiprocessing engine that enhances the utilization of computational resources. This integrated approach comprises four key components, each addressing essential aspects of time series similarity assessment, thereby offering a comprehensive framework for analysis. MTASA is implemented as an open-source Python library with a user-friendly interface, making it accessible to researchers and practitioners. To evaluate the effectiveness of MTASA, we conducted an empirical study focused on assessing agroecosystem similarity using real-world environmental data. The results from this study highlight MTASA's superiority, achieving approximately 1.5 times greater accuracy and twice the speed compared to existing state-of-the-art integrated frameworks for multivariate time series similarity assessment. It is hoped that MTASA will significantly enhance the efficiency and accessibility of multivariate time series analysis, benefitting researchers and practitioners across various domains. Its capabilities in handling large datasets, addressing temporal misalignments, and delivering accurate results make MTASA a valuable tool for deriving insights and aiding decision-making processes in complex systems.

2403.11044

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany (0.04)
Asia > Vietnam (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.67)

Industry:

Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)
Government (0.68)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Neural Information Processing SystemsMar-15-2024, 12:06:33 GMT

Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery

Skill discovery algorithms in reinforcement learning typically identify single states or regions in state space that correspond to task-specific subgoals. However, such methods do not directly address the question of how many distinct skills are appropriate for solving the tasks that the agent faces. This can be highly inefficient when many identified subgoals correspond to the same underlying skill, but are all used individually as skill goals. Furthermore, skills created in this manner are often only transferable to tasks that share identical state spaces, since corresponding subgoals across tasks are not merged into a single skill goal. We show that these problems can be overcome by clustering subgoal data defined in an agent-space and using the resulting clusters as templates for skill termination conditions. Clustering via a Dirichlet process mixture model is used to discover a minimal, sufficient collection of portable skills.

agent, algorithm, termination, (16 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
(2 more...)

Neural Information Processing SystemsMar-15-2024, 12:06:19 GMT

A blind deconvolution method for neural spike identification

We consider the problem of estimating neural spikes from extracellular voltage recordings. Most current methods are based on clustering, which requires substantial human supervision and systematically mishandles temporally overlapping spikes. We formulate the problem as one of statistical inference, in which the recorded voltage is a noisy sum of the spike trains of each neuron convolved with its associated spike waveform. Joint maximum-a-posteriori (MAP) estimation of the waveforms and spikes is then a blind deconvolution problem in which the coefficients are sparse. We develop a block-coordinate descent procedure to approximate the MAP solution, based on our recently developed continuous basis pursuit method. We validate our method on simulated data as well as real data for which ground truth is available via simultaneous intracellular recordings. In both cases, our method substantially reduces the number of missed spikes and false positives when compared to a standard clustering algorithm, primarily by recovering overlapping spikes. The method offers a fully automated alternative to clustering methods that is less susceptible to systematic errors.

amplitude, spike, waveform, (16 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Genre: Workflow (0.46)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.87)

Neural Information Processing SystemsMar-15-2024, 12:04:59 GMT

Noise Thresholds for Spectral Clustering

Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed. We analyze the performance of a spectral algorithm for hierarchical clustering and show that on a class of hierarchically structured similarity matrices, this algorithm can tolerate noise that grows with the number of data points while still perfectly recovering the hierarchical clusters with high probability. We additionally improve upon previous results for k-way spectral clustering to derive conditions under which spectral clustering makes no mistakes. Further, using minimax analysis, we derive tight upper and lower bounds for the clustering problem and compare the performance of spectral clustering to these information theoretic limits.

algorithm, matrix, spectral, (15 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Neural Information Processing SystemsMar-15-2024, 11:03:55 GMT

Manifold Précis: An Annealing Technique for Diverse Sampling of Manifolds

In this paper, we consider the Précis problem of sampling K representative yet diverse data points from a large dataset. This problem arises frequently in applications such as video and document summarization, exploratory data analysis, and pre-filtering. We formulate a general theory which encompasses not just traditional techniques devised for vector spaces, but also non-Euclidean manifolds, thereby enabling these techniques to shapes, human activities, textures and many other image and video based datasets. We propose intrinsic manifold measures for measuring the quality of a selection of points with respect to their representative power, and their diversity. We then propose efficient algorithms to optimize the cost function using a novel annealing-based iterative alternation algorithm. The proposed formulation is applicable to manifolds of known geometry as well as to manifolds whose geometry needs to be estimated from samples. Experimental results show the strength and generality of the proposed approach.

algorithm, exemplar, manifold, (16 more...)

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > Arizona (0.04)
Asia > Middle East > Lebanon (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Neural Information Processing SystemsMar-15-2024, 06:57:42 GMT

On U-processes and clustering performance

Many clustering techniques aim at optimizing empirical criteria that are of the form of a U-statistic of degree two. Given a measure of dissimilarity between pairs of observations, the goal is to minimize the within cluster point scatter over a class of partitions of the feature space. It is the purpose of this paper to define a general statistical framework, relying on the theory of U-processes, for studying the performance of such clustering methods.

partition, u-process, u-statistics, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)