Vallevik, Vibeke Binz
Rethinking Synthetic Data definitions: A privacy driven approach
Vallevik, Vibeke Binz, Marshall, Serena Elizabeth, Babic, Aleksandar, Nygaard, Jan Franz
Synthetic data is emerging as a cost-effective solution to meet the increasing data demands of AI development and can be generated either from existing knowledge or derived from real data. The traditional classification of synthetic data into hybrid, partial, or fully synthetic datasets has limited value and does not reflect the ever-growing range of methods for generating synthetic data. The characteristics of synthetic data are greatly shaped by the generation method and the data source, which in turn determine its practical applications. We suggest a different approach to grouping synthetic data types that better reflects privacy perspectives. This is a crucial step towards improved regulatory guidance on the generation and processing of synthetic data. This classification approach provides flexibility to accommodate new advancements such as deep generative methods and offers a more practical framework for future applications.
Permissioned Blockchain-based Framework for Ranking Synthetic Data Generators
Veeraragavan, Narasimha Raghavan, Tabatabaei, Mohammad Hossein, Elvatun, Severin, Vallevik, Vibeke Binz, Larønningen, Siri, Nygård, Jan F
Synthetic data generation is increasingly recognized as a crucial solution to address data-related challenges such as scarcity, bias, and privacy concerns. As synthetic data proliferates, the need for a robust evaluation framework to select a synthetic data generator becomes more pressing given the variety of options available. In this research study, we investigate two primary questions: 1) How can we select the most suitable synthetic data generator from a set of options for a specific purpose? 2) How can we make the selection process more transparent, accountable, and auditable? To address these questions, we introduce a novel approach in which the proposed ranking algorithm is implemented as a smart contract within a permissioned blockchain framework called Sawtooth. Through comprehensive experiments and comparisons with state-of-the-art baseline ranking solutions, our framework demonstrates its effectiveness in providing nuanced rankings that consider both desirable and undesirable properties. Furthermore, our framework serves as a valuable tool for selecting the optimal synthetic data generators for specific needs while ensuring compliance with data protection principles.
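To make the ranking idea concrete, the following is a minimal sketch of multi-criteria ranking in which desirable properties (e.g. fidelity, utility) raise a generator's score and undesirable properties (e.g. privacy risk) lower it. The generator names, metric names, scores, and weights are all hypothetical illustrations; the paper's actual algorithm and its Sawtooth smart-contract implementation are not reproduced here.

```python
# Illustrative weighted-sum ranking of synthetic data generators.
# All names, scores, and weights below are made up for illustration.

def rank_generators(scores, weights):
    """Rank generators by a weighted sum: positive weights reward
    desirable metrics, negative weights penalize undesirable ones."""
    totals = {
        name: sum(weights[m] * v for m, v in metrics.items())
        for name, metrics in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)

scores = {
    "gen_A": {"fidelity": 0.90, "utility": 0.80, "privacy_risk": 0.30},
    "gen_B": {"fidelity": 0.85, "utility": 0.90, "privacy_risk": 0.10},
}
# privacy_risk is undesirable, so it carries a negative weight
weights = {"fidelity": 1.0, "utility": 1.0, "privacy_risk": -1.0}

print(rank_generators(scores, weights))  # ['gen_B', 'gen_A'] (1.65 vs 1.40)
```

A real deployment would replace the weighted sum with the paper's nuanced ranking logic and execute it as a smart contract so that every ranking decision is recorded on the permissioned ledger.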
Can I trust my fake data -- A comprehensive quality assessment framework for synthetic tabular data in healthcare
Vallevik, Vibeke Binz, Babic, Aleksandar, Marshall, Serena Elizabeth, Elvatun, Severin, Brøgger, Helga, Alagaratnam, Sharmini, Edwin, Bjørn, Veeraragavan, Narasimha Raghavan, Befring, Anne Kjersti, Nygård, Jan Franz
Ensuring safe adoption of AI tools in healthcare hinges on access to sufficient data for training, testing and validation. In response to privacy concerns and regulatory requirements, using synthetic data has been suggested. Synthetic data is created by training a generator on real data to produce a dataset with similar statistical properties. Competing metrics with differing taxonomies for quality evaluation have been suggested, resulting in a complex landscape. Optimising quality entails balancing considerations that make the data fit for use, yet relevant dimensions are left out of existing frameworks. We performed a comprehensive literature review on the use of quality evaluation metrics on synthetic data (SD), within the scope of tabular healthcare data and SD made using deep generative methods. Based on this review and the team's collective experience, we developed a conceptual framework for quality assurance. Its applicability was benchmarked against a practical case from the Dutch National Cancer Registry. We present a conceptual framework for quality assurance of SD for AI applications in healthcare that aligns diverging taxonomies, expands on common quality dimensions to include the dimensions of Fairness and Carbon footprint, and proposes stages necessary to support real-life applications. Building trust in synthetic data by increasing transparency and reducing the safety risk will accelerate the development and uptake of trustworthy AI tools for the benefit of patients. Despite the growing emphasis on algorithmic fairness and carbon footprint, these metrics were scarce in the literature review. The overwhelming focus was on statistical similarity using distance metrics, while sequential logic detection was scarce. A consensus-backed framework that includes all relevant quality dimensions can provide assurance for safe and responsible real-life applications of SD.
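As an illustration of the kind of statistical-similarity check the review found dominant, the sketch below computes a per-column distance between real and synthetic marginal distributions using the empirical 1-Wasserstein distance. The data values and column name are invented for the example; production frameworks combine many such metrics across several quality dimensions.

```python
# Minimal sketch of a statistical-similarity metric for one tabular
# column. Sample values below are hypothetical.

def wasserstein_1d(real, synthetic):
    """Empirical 1-Wasserstein distance for equal-sized 1D samples:
    the mean absolute difference between the sorted values."""
    assert len(real) == len(synthetic), "sketch assumes equal sample sizes"
    return sum(abs(r - s) for r, s in zip(sorted(real), sorted(synthetic))) / len(real)

real_age = [34, 45, 52, 61, 70]     # column from the real dataset
synth_age = [33, 47, 50, 63, 72]    # same column from the synthetic dataset

print(round(wasserstein_1d(real_age, synth_age), 2))  # 1.8; lower = closer marginals
```

A distance of zero would mean identical empirical marginals; a quality framework would interpret such scores alongside utility, privacy, fairness, and carbon-footprint measures rather than in isolation.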