Kuo, Tzu-Sheng
Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia
Kuo, Tzu-Sheng, Halfaker, Aaron, Cheng, Zirui, Kim, Jiwoo, Wu, Meng-Hsin, Wu, Tongshuang, Holstein, Kenneth, Zhu, Haiyi
AI tools are increasingly deployed in community contexts. However, datasets used to evaluate AI are typically created by developers and annotators outside a given community, which can yield misleading conclusions about AI performance. How might we empower communities to drive the intentional design and curation of evaluation datasets for AI that impacts them? We investigate this question on Wikipedia, an online community with multiple AI-based content moderation tools deployed. We introduce Wikibench, a system that enables communities to collaboratively curate AI evaluation datasets, while navigating ambiguities and differences in perspective through discussion. A field study on Wikipedia shows that datasets curated using Wikibench can effectively capture community consensus, disagreement, and uncertainty. Furthermore, study participants used Wikibench to shape the overall data curation process, including refining label definitions, determining data inclusion criteria, and authoring data statements. Based on our findings, we propose future directions for systems that support community-driven data curation.
DMLR: Data-centric Machine Learning Research -- Past, Present and Future
Oala, Luis, Maskey, Manil, Bat-Leah, Lilith, Parrish, Alicia, Gürel, Nezihe Merve, Kuo, Tzu-Sheng, Liu, Yang, Dror, Rotem, Brajovic, Danilo, Yao, Xiaozhe, Bartolo, Max, Rojas, William A Gaviria, Hileman, Ryan, Aliment, Rainier, Mahoney, Michael W., Risdal, Meg, Lease, Matthew, Samek, Wojciech, Dutta, Debojyoti, Northcutt, Curtis G, Coleman, Cody, Hancock, Braden, Koch, Bernard, Tadesse, Girmaw Abebe, Karlaš, Bojan, Alaa, Ahmed, Dieng, Adji Bousso, Noy, Natasha, Reddi, Vijay Janapa, Zou, James, Paritosh, Praveen, van der Schaar, Mihaela, Bollacker, Kurt, Aroyo, Lora, Zhang, Ce, Vanschoren, Joaquin, Guyon, Isabelle, Mattson, Peter
Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and prior meetings, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods toward positive scientific, societal, and business impact.
Measuring Adversarial Datasets
Bai, Yuanchen, Huang, Raoyi, Viswanathan, Vijay, Kuo, Tzu-Sheng, Wu, Tongshuang
In the era of widespread public use of AI systems across various domains, ensuring adversarial robustness has become increasingly vital to maintain safety and prevent undesirable errors. Researchers have curated various adversarial datasets (through perturbations) for capturing model deficiencies that cannot be revealed in standard benchmark datasets. However, little is known about how these adversarial examples differ from the original data points, and there is still no methodology to measure the intended and unintended consequences of those adversarial transformations. In this research, we conducted a systematic survey of existing quantifiable metrics that describe text instances in NLP tasks, along the dimensions of difficulty, diversity, and disagreement. We selected several existing adversarial datasets and compared the distributions of these metrics between the original data points and their adversarial counterparts. The results provide valuable insights into what makes these datasets more challenging from a metrics perspective and whether they align with underlying assumptions.
DataPerf: Benchmarks for Data-Centric AI Development
Mazumder, Mark, Banbury, Colby, Yao, Xiaozhe, Karlaš, Bojan, Rojas, William Gaviria, Diamos, Sudnya, Diamos, Greg, He, Lynn, Parrish, Alicia, Kirk, Hannah Rose, Quaye, Jessica, Rastogi, Charvi, Kiela, Douwe, Jurado, David, Kanter, David, Mosquera, Rafael, Ciro, Juan, Aroyo, Lora, Acun, Bilge, Chen, Lingjiao, Raje, Mehul Smriti, Bartolo, Max, Eyuboglu, Sabri, Ghorbani, Amirata, Goodman, Emmett, Inel, Oana, Kane, Tariq, Kirkpatrick, Christine R., Kuo, Tzu-Sheng, Mueller, Jonas, Thrush, Tristan, Vanschoren, Joaquin, Warren, Margaret, Williams, Adina, Yeung, Serena, Ardalani, Newsha, Paritosh, Praveen, Bat-Leah, Lilith, Zhang, Ce, Zou, James, Wu, Carole-Jean, Coleman, Cody, Ng, Andrew, Mattson, Peter, Reddi, Vijay Janapa
Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing dataset benchmarks. In response, we present DataPerf, a community-led benchmark suite for evaluating ML datasets and data-centric algorithms. We aim to foster innovation in data-centric AI through competition, comparability, and reproducibility. We enable the ML community to iterate on datasets, instead of just architectures, and we provide an open, online platform with multiple rounds of challenges to support this iterative development. The first iteration of DataPerf contains five benchmarks covering a wide spectrum of data-centric techniques, tasks, and modalities in vision, speech, acquisition, debugging, and diffusion prompting, and we support hosting new contributed benchmarks from the community. The benchmarks, online evaluation platform, and baseline implementations are open source, and the MLCommons Association will maintain DataPerf to ensure long-term benefits to academia and industry.
LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs
Wu, Tongshuang, Zhu, Haiyi, Albayrak, Maya, Axon, Alexis, Bertsch, Amanda, Deng, Wenxing, Ding, Ziqi, Guo, Bill, Gururaja, Sireesh, Kuo, Tzu-Sheng, Liang, Jenny T., Liu, Ryan, Mandal, Ihita, Milbauer, Jeremiah, Ni, Xiaolin, Padmanabhan, Namrata, Ramkumar, Subhashini, Sudjianto, Alexis, Taylor, Jordan, Tseng, Ying-Jui, Vaidos, Patricia, Wu, Zhijin, Wu, Wei, Yang, Chenyang
LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but the level of success is variable and influenced by requesters' understanding of LLM capabilities, the specific skills required for sub-tasks, and the optimal interaction modality for performing these sub-tasks. We reflect on humans' and LLMs' differing sensitivities to instructions, stress the importance of enabling human-facing safeguards for LLMs, and discuss the potential of training humans and LLMs with complementary skill sets. Crucially, we show that replicating crowdsourcing pipelines offers a valuable platform to investigate (1) the relative strengths of LLMs on different tasks (by cross-comparing their performance on sub-tasks) and (2) LLMs' potential in complex tasks, where they can complete part of the tasks while leaving others to humans.
Understanding Frontline Workers' and Unhoused Individuals' Perspectives on AI Used in Homeless Services
Kuo, Tzu-Sheng, Shen, Hong, Geum, Jisoo, Jones, Nev, Hong, Jason I., Zhu, Haiyi, Holstein, Kenneth
Recent years have seen growing adoption of AI-based decision-support systems (ADS) in homeless services, yet we know little about stakeholder desires and concerns surrounding their use. In this work, we aim to understand impacted stakeholders' perspectives on a deployed ADS that prioritizes scarce housing resources. We employed AI lifecycle comicboarding, an adapted version of the comicboarding method, to elicit stakeholder feedback and design ideas across various components of an AI system's design. We elicited feedback from county workers who operate the ADS daily, service providers whose work is directly impacted by the ADS, and unhoused individuals in the region. Our participants shared concerns and design suggestions around the AI system's overall objective, specific model design choices, dataset selection, and use in deployment. Our findings demonstrate that stakeholders, even without AI knowledge, can provide specific and critical feedback on an AI system's design and deployment, if empowered to do so.