AITopics | realistic dataset

realistic dataset

Learning curves of generic features maps for realistic datasets with a teacher-student model

Neural Information Processing Systems | Dec-24-2025, 13:06:18 GMT

Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form.

generic feature map, realistic dataset, teacher-student model, (4 more...)

Neural Information Processing Systems

Industry: Education > Educational Technology > Educational Software (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.83)

Learning curves of generic features maps for realistic datasets with a teacher-student model

Neural Information Processing Systems | Jan-17-2025, 20:05:58 GMT

Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form. In this paper, we introduce a Gaussian covariate generalisation of the model where the teacher and student can act on different spaces, generated with fixed but generic feature maps. While still solvable in closed form, this generalisation is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework. Our contribution is two-fold. First, we prove a rigorous formula for the asymptotic training loss and generalisation error. Second, we present a number of situations where the learning curve of the model captures that of a realistic data set learned with kernel regression and classification, with out-of-the-box feature maps such as random projections or scattering transforms, or with pre-learned ones, such as the features learned by training multi-layer neural networks.

generic feature map, realistic dataset, teacher-student model, (1 more...)

Neural Information Processing Systems

Industry: Education > Educational Technology > Educational Software (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

MACK: Mismodeling Addressed with Contrastive Knowledge

Sheldon, Liam Rankin; Rankin, Dylan Sheldon; Harris, Philip

arXiv.org Artificial Intelligence | Oct-17-2024

The use of machine learning methods in high energy physics typically relies on large volumes of precise simulation for training. As machine learning models become more complex, they can become increasingly sensitive to differences between this simulation and the real data collected by experiments. We present a generic methodology based on contrastive learning that greatly mitigates this negative effect. Crucially, the method does not require prior knowledge of the specifics of the mismodeling. While we demonstrate the efficacy of this technique on the task of jet-tagging at the Large Hadron Collider, it is applicable to a wide array of tasks both in and out of the field of high energy physics.

artificial intelligence, dataset, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.13947

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Federated Learning with Differential Privacy

Banse, Adrien; Kreischer, Jan; Jürgens, Xavier Oliva i

arXiv.org Artificial Intelligence | Feb-3-2024

Federated learning (FL), a type of distributed machine learning, can largely prevent clients' private data from being shared among different parties. Nevertheless, private information can still be divulged by analyzing the parameter weights uploaded by clients. In this report, we present an empirical benchmark of how the number of clients and the addition of differential privacy (DP) mechanisms affect model performance on different types of data. Our results show that non-i.i.d. and small datasets suffer the largest decrease in performance in a distributed and differentially private setting.
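To make the idea of adding a DP mechanism to uploaded weights concrete, here is a minimal sketch of one common pattern: clip each client's update, average, and add Gaussian noise. This is not the report's implementation. The function names, `max_norm`, and `noise_std` are hypothetical, and the noise level is not calibrated to any particular privacy budget.

```python
import math
import random

def clip_update(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(w * w for w in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [w * scale for w in update]

def dp_federated_average(client_updates, max_norm=1.0, noise_std=0.1, rng=None):
    """Average clipped client updates and add Gaussian noise to the result.

    Assumption: noise_std is an illustrative multiplier, not a value
    derived from a target (epsilon, delta) guarantee.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    clipped = [clip_update(u, max_norm) for u in client_updates]
    n_clients = len(clipped)
    dim = len(clipped[0])
    # Coordinate-wise mean of the clipped updates.
    averaged = [sum(u[i] for u in clipped) / n_clients for i in range(dim)]
    # Noise is scaled by the clipping bound and divided by the client count.
    sigma = noise_std * max_norm / n_clients
    return [a + rng.gauss(0.0, sigma) for a in averaged]

# Three hypothetical client updates for a two-parameter model.
updates = [[3.0, 4.0], [0.3, 0.4], [-0.5, 0.2]]
noisy_average = dp_federated_average(updates, max_norm=1.0, noise_std=0.1)
```

With fewer clients the same clipping bound yields a larger noise term relative to the averaged signal, which is one intuition for why small datasets and few clients can hurt accuracy in this setting.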

dataset, experiment, privacy, (14 more...)

arXiv.org Artificial Intelligence

2402.0223

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Belgium (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Utilizing Large Language Models for Natural Interface to Pharmacology Databases

Lu, Hong; Li, Chuan; Li, Yinheng; Zhao, Jie

arXiv.org Artificial Intelligence | Jul-26-2023

The drug development process necessitates that pharmacologists undertake various tasks, such as reviewing literature, formulating hypotheses, designing experiments, and interpreting results. Each stage requires accessing and querying vast amounts of information. In this abstract, we introduce a Large Language Model (LLM)-based Natural Language Interface designed to interact with structured information stored in databases. Our experiments demonstrate the feasibility and effectiveness of the proposed framework. This framework can generalize to query a wide range of pharmaceutical data and knowledge bases.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2307.15717

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.56)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Synthetic Data 101: What are the use cases for synthetic data?

#artificialintelligence | Mar-24-2023, 13:50:14 GMT

Synthetic data accurately mimics real-world data. It serves as a placeholder for production data in development and testing workflows and is also used to improve the quality of machine learning algorithms. Common use cases revolve around product development/testing, machine learning, data analysis, and data privacy and security. For example, financial institutions use synthetic data to generate reliable market data for algorithmic trading and risk analysis, while healthcare providers use it to analyze patient data without compromising sensitive patient information. Additionally, synthetic data is used in machine learning algorithms to improve performance and accuracy and thus accelerate the development process.

synthetic data, synthetic data 101, use case, (10 more...)

#artificialintelligence

Country: Europe > Switzerland (0.06)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.36)

Introduction to Machine Learning: K Nearest Neighbors (KNN) - PythonAlgos

#artificialintelligence | Jun-27-2022, 17:50:21 GMT

K Nearest Neighbors (KNN) is a standard Machine Learning algorithm used for classification. In KNN, we plot already labeled points with their labels and then define decision boundaries based on the value of the hyperparameter "K". A hyperparameter is simply a parameter that we control and can use for tuning. "K" represents how many of the nearest neighbors we take into account when determining the class of a new point. In this post we'll cover how to do KNN on two datasets: one contrived sample dataset and one more realistic dataset about wine from sklearn.
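As a rough illustration of the voting rule described above, the following sketch classifies a new point by majority vote among its K nearest labeled points. It uses only the standard library and a small contrived dataset. The function name `knn_predict` and all data values are hypothetical and are not taken from the post, which works with sklearn.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Contrived two-cluster dataset (hypothetical values).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(points, labels, (1.8, 1.6), k=3)
prediction_near_b = knn_predict(points, labels, (6.2, 6.4), k=3)
```

Changing `k` changes how many neighbors vote, which is what shifts the decision boundary between the two clusters.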

dataset, decision boundary, nearest neighbor, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.62)

Posture Prediction for Healthy Sitting using a Smart Chair

Gelaw, Tariku Adane; Hagos, Misgina Tsighe

arXiv.org Artificial Intelligence | Jan-5-2022

Poor sitting habits have been identified as a risk factor for musculoskeletal disorders and lower back pain, especially among the elderly, disabled people, and office workers. In the current computerized world, even while involved in leisure or work activity, people tend to spend most of their days sitting at computer desks. This can result in spinal pain and related problems. Therefore, a means to remind people about their sitting habits and provide recommendations to counterbalance them, such as physical exercise, is important. Recognition of seated postures has not received enough attention, as most works focus on standing postures. Wearable sensors, pressure or force sensors, videos, and images have been used for posture recognition in the literature. The aim of this study is to build Machine Learning models for classifying the sitting posture of a person by analyzing data collected from a chair fitted with two 32 by 32 pressure sensors at its seat and backrest. Models were built using five algorithms: Random Forest (RF), Gaussian Naïve Bayes, Logistic Regression, Support Vector Machine, and Deep Neural Network (DNN). All the models are evaluated using the K-fold cross-validation technique. This paper presents experiments conducted using two separate datasets, controlled and realistic, and discusses results achieved at classifying six sitting postures. Average classification accuracies of 98% and 97% were achieved on the controlled and realistic datasets, respectively.
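For readers unfamiliar with the evaluation scheme mentioned above, here is a minimal sketch of how K-fold cross-validation partitions sample indices into train and test sets. The abstract does not state the number of folds or whether the data were shuffled, so the function `k_fold_indices`, the contiguous unshuffled folds, and the sizes used below are all assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        # The current fold is held out for testing.
        test = list(range(start, start + size))
        # Every other index is used for training.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Hypothetical example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
```

Each sample appears in exactly one test fold, so averaging a model's accuracy over the folds uses every sample for evaluation once.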

dataset, posture, sensor, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-030-93709-6_26

2201.02615

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Italy (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(5 more...)

Genre:

Research Report > Experimental Study (0.87)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.68)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)
