AITopics | semi-supervised algorithm

Collaborating Authors

semi-supervised algorithm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robot Learning with Sparsity and Scarcity

Xu, Jingxi

arXiv.org Artificial IntelligenceSep-23-2025

Unlike in language or vision, one of the fundamental challenges in robot learning is the lack of access to vast data resources. We can further break down the problem into (1) data sparsity from the angle of data representation and (2) data scarcity from the angle of data quantity. In this thesis, I will discuss selected works on two domains: (1) tactile sensing and (2) rehabilitation robots, which are exemplars of data sparsity and scarcity, respectively. Tactile sensing is an essential modality for robotics, but tactile data are often sparse, and for each interaction with the physical world, tactile sensors can only obtain information about the local area of contact. I will discuss my work on learning vision-free tactile-only exploration and manipulation policies through model-free reinforcement learning to make efficient use of sparse tactile information. On the other hand, rehabilitation robots are an example of data scarcity to the extreme due to the significant challenge of collecting biosignals from disabled-bodied subjects at scale for training. I will discuss my work in collaboration with the medical school and clinicians on intent inferral for stroke survivors, where a hand orthosis developed in our lab collects a set of biosignals from the patient and uses them to infer the activity that the patient intends to perform, so the orthosis can provide the right type of physical assistance at the right moment. My work develops machine learning algorithms that enable intent inferral with minimal data, including semi-supervised, meta-learning, and generative AI methods.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.16834

Country: North America > United States (0.27)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Predicting Anthropometric Body Composition Variables Using 3D Optical Imaging and Machine Learning

Agrahari, Gyaneshwar, Bist, Kiran, Pandey, Monika, Kapita, Jacob, James, Zachary, Knox, Jackson, Heymsfield, Steven, Ramirez, Sophia, Wolenski, Peter, Drenska, Nadejda

arXiv.org Artificial IntelligenceJun-19-2025

Accurate prediction of anthropometric body composition variables, such as Appendicular Lean Mass (ALM), Body Fat Percentage (BFP), and Bone Mineral Density (BMD), is essential for early diagnosis of several chronic diseases. Currently, researchers rely on Dual-Energy X-ray Absorptiometry (DXA) scans to measure these metrics; however, DXA scans are costly and time-consuming. This work proposes an alternative to DXA scans by applying statistical and machine learning models on biomarkers (height, volume, left calf circumference, etc) obtained from 3D optical images. The dataset consists of 847 patients and was sourced from Pennington Biomedical Research Center. Extracting patients' data in healthcare faces many technical challenges and legal restrictions. However, most supervised machine learning algorithms are inherently data-intensive, requiring a large amount of training data. To overcome these limitations, we implemented a semi-supervised model, the $p$-Laplacian regression model. This paper is the first to demonstrate the application of a $p$-Laplacian model for regression. Our $p$-Laplacian model yielded errors of $\sim13\%$ for ALM, $\sim10\%$ for BMD, and $\sim20\%$ for BFP when the training data accounted for 10 percent of all data. Among the supervised algorithms we implemented, Support Vector Regression (SVR) performed the best for ALM and BMD, yielding errors of $\sim 8\%$ for both, while Least Squares SVR performed the best for BFP with $\sim 11\%$ error when trained on 80 percent of the data. Our findings position the $p$-Laplacian model as a promising tool for healthcare applications, particularly in a data-constrained environment.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.14815

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.90)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)

Add feedback

A semi-supervised sparse K-Means algorithm

Vouros, Avgoustinos, Vasilaki, Eleni

arXiv.org Machine LearningMar-20-2020

We consider the problem of data clustering with unidentified feature quality but the existence of small amount of label data. In the first case a sparse clustering method can be employed in order to detect the subgroup of features necessary for clustering and in the second case a semi-supervised method can use the labelled data to create constraints and enhance the clustering solution. In this paper we propose a K-Means inspired algorithm that employs these techniques. We show that the algorithm maintains the high performance of other similar semi-supervised algorthms as well as keeping the ability to identify informative from uninformative features. We examine the performance of the algorithm on real world data sets with unknown features quality as well as a real world data set with a known uninformative feature. We use a series of scenarios with different number and types of constraints.

algorithm, constraint, k-means, (16 more...)

arXiv.org Machine Learning

2003.06973

Country:

Europe > United Kingdom (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

A Neural Network for Semi-Supervised Learning on Manifolds

Genkin, Alexander, Sengupta, Anirvan M., Chklovskii, Dmitri

arXiv.org Machine LearningAug-21-2019

Semi-supervised learning algorithms typically construct a weighted graph of data points to represent a manifold. However, an explicit graph representation is problematic for neural networks operating in the online setting. Here, we propose a feed-forward neural network capable of semi-supervised learning on manifolds without using an explicit graph representation. Our algorithm uses channels that represent localities on the manifold such that correlations between channels represent manifold structure. The proposed neural network has two layers. The first layer learns to build a representation of low-dimensional manifolds in the input data as proposed recently in [8]. The second learns to classify data using both occasional supervision and similarity of the manifold representation of the data. The channel carrying label information for the second layer is assumed to be "silent" most of the time. Learning in both layers is Hebbian, making our network design biologically plausible. We experimentally demonstrate the effect of semi-supervised learning on non-trivial manifolds.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-030-30487-4_30

1908.08145

Country: North America > United States (0.29)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Credit Scoring for Micro-Loans

Dubina, Nikolay, Kang, Dasom, Suh, Alex

arXiv.org Machine LearningMay-10-2019

Credit Scores are ubiquitous and instrumental for loan providers and regulators. In this paper we showcase how micro-loan credit system can be developed in real setting. We show what challenges arise and discuss solutions. Particularly, we are concerned about model interpretability and data quality. In the final section, we introduce semi-supervised algorithm that aids model development and evaluate its performance.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1905.03946

Country: North America > United States (0.68)

Genre: Research Report (0.53)

Industry:

Banking & Finance > Credit (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Add feedback

Improved Generalization of Heading Direction Estimation for Aerial Filming Using Semi-supervised Regression

Wang, Wenshan, Ahuja, Aayush, Zhang, Yanfu, Bonatti, Rogerio, Scherer, Sebastian

arXiv.org Artificial IntelligenceMar-26-2019

In the task of Autonomous aerial filming of a moving actor (e.g. a person or a vehicle), it is crucial to have a good heading direction estimation for the actor from the visual input. However, the models obtained in other similar tasks, such as pedestrian collision risk analysis and human-robot interaction, are very difficult to generalize to the aerial filming task, because of the difference in data distributions. Towards improving generalization with less amount of labeled data, this paper presents a semi-supervised algorithm for heading direction estimation problem. We utilize temporal continuity as the unsupervised signal to regularize the model and achieve better generalization ability. This semi-supervised algorithm is applied to both training and testing phases, which increases the testing performance by a large margin. We show that by leveraging unlabeled sequences, the amount of labeled data required can be significantly reduced. We also discuss several important details on improving the performance by balancing labeled and unlabeled loss, and making good combinations. Experimental results show that our approach robustly outputs the heading direction for different types of actor. The aesthetic value of the video is also improved in the aerial filming task.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1903.11174

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Japan > Honshū > Chūbu > Shizuoka Prefecture > Shizuoka (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

A semi-supervised approach to message stance classification

Giasemidis, Georgios, Kaplis, Nikolaos, Agrafiotis, Ioannis, Nurse, Jason R. C.

arXiv.org Machine LearningJan-29-2019

Social media communications are becoming increasingly prevalent; some useful, some false, whether unwittingly or maliciously. An increasing number of rumours daily flood the social networks. Determining their veracity in an autonomous way is a very active and challenging field of research, with a variety of methods proposed. However, most of the models rely on determining the constituent messages' stance towards the rumour, a feature known as the "wisdom of the crowd". Although several supervised machine-learning approaches have been proposed to tackle the message stance classification problem, these have numerous shortcomings. In this paper we argue that semi-supervised learning is more effective than supervised models and use two graph-based methods to demonstrate it. This is not only in terms of classification accuracy, but equally important, in terms of speed and scalability. We use the Label Propagation and Label Spreading algorithms and run experiments on a dataset of 72 rumours and hundreds of thousands messages collected from Twitter. We compare our results on two available datasets to the state-of-the-art to demonstrate our algorithms' performance regarding accuracy, speed and scalability for real-time applications.

algorithm, dataset, tweet, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/TKDE.2018.2880192

1902.03097

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (0.93)
Information Technology (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Active Community Detection: A Maximum Likelihood Approach

Mirabelli, Benjamin, Kushnir, Dan

arXiv.org Machine LearningJan-10-2018

We propose novel semi-supervised and active learning algorithms for the problem of community detection on networks. The algorithms are based on optimizing the likelihood function of the community assignments given a graph and an estimate of the statistical model that generated it. The optimization framework is inspired by prior work on the unsupervised community detection problem in Stochastic Block Models (SBM) using Semi-Definite Programming (SDP). In this paper we provide the next steps in the evolution of learning communities in this context which involves a constrained semi-definite programming algorithm, and a newly presented active learning algorithm. The active learner intelligently queries nodes that are expected to maximize the change in the model likelihood. Experimental results show that this active learning algorithm outperforms the random-selection semi-supervised version of the same algorithm as well as other state-of-the-art active learning algorithms. Our algorithms significantly improved performance is demonstrated on both real-world and SBM-generated networks even when the SBM has a signal to noise ratio (SNR) below the known unsupervised detectability threshold.

algorithm, node, semi-supervised algorithm, (14 more...)

arXiv.org Machine Learning

1801.05856

Country: North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Semi-supervised Clustering Ensemble by Voting

Iqbal, Ashraf Mohammed, Moh'd, Abidalrahman, Khan, Zahoor

arXiv.org Machine LearningAug-20-2012

Clustering ensemble is one of the most recent advances in unsupervised learning. It aims to combine the clustering results obtained using different algorithms or from different runs of the same clustering algorithm for the same data set, this is accomplished using on a consensus function, the efficiency and accuracy of this method has been proven in many works in literature. In the first part of this paper we make a comparison among current approaches to clustering ensemble in literature. All of these approaches consist of two main steps: the ensemble generation and consensus function. In the second part of the paper, we suggest engaging supervision in the clustering ensemble procedure to get more enhancements on the clustering results. Supervision can be applied in two places: either by using semi-supervised algorithms in the clustering ensemble generation step or in the form of a feedback used by the consensus function stage. Also, we introduce a flexible two parameter weighting mechanism, the first parameter describes the compatibility between the datasets under study and the semi-supervised clustering algorithms used to generate the base partitions, the second parameter is used to provide the user feedback on the these partitions. The two parameters are engaged in a "relabeling and voting" based consensus function to produce the final clustering.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

1208.4138

Country: North America > Canada (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.73)

Add feedback

Learning from Concept Drifting Data Streams with Unlabeled Data

Li, Peipei (Hefei University of Technology) | Wu, Xindong (University of Vermont) | Hu, Xuegang (Hefei University of Technology)

AAAI ConferencesJul-15-2010

Contrary to the previous beliefs that all arrived streaming data are labeled and the class labels are immediately availa- ble, we propose a Semi-supervised classification algorithm for data streams with concept drifts and UNlabeled data, called SUN. SUN is based on an evolved decision tree. In terms of deviation between history concept clusters and new ones generated by a developed clustering algorithm of k-Modes, concept drifts are distinguished from noise at leaves. Extensive studies on both synthetic and real data demonstrate that SUN performs well compared to several known online algorithms on unlabeled data. A conclusion is hence drawn that a feasible reference framework is provided for tackling concept drifting data streams with unlabeled data.

algorithm, artificial intelligence, machine learning, (13 more...)

AAAI Conferences

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.15)
Asia > China > Anhui Province > Hefei (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.37)

Add feedback