AITopics | Potosí

Collaborating Authors

Potosí

Fairness-in-the-Workflow: How Machine Learning Practitioners at Big Tech Companies Approach Fairness in Recommender Systems

Yan, Jing Nathan, Harvey, Emma, Wang, Junxiong, Rzeszotarski, Jeffrey M., Koenecke, Allison

arXiv.org Artificial IntelligenceSep-22-2025

Recommender systems (RS), which are widely deployed across high-stakes domains, are susceptible to biases that can cause large-scale societal impacts. Researchers have proposed methods to measure and mitigate such biases -- but translating academic theory into practice is inherently challenging. RS practitioners must balance the competing interests of diverse stakeholders, including providers and users, and operate in dynamic environments. Through a semi-structured interview study (N=11), we map the RS practitioner workflow within large technology companies, focusing on how technical teams consider fairness internally and in collaboration with other (legal, data, and fairness) teams. We identify key challenges to incorporating fairness into existing RS workflows: defining fairness in RS contexts, particularly when navigating multi-stakeholder and dynamic fairness considerations. We also identify key organization-wide challenges: making time for fairness work and facilitating cross-team communication. Finally, we offer actionable recommendations for the RS community, including HCI researchers and practitioners.

artificial intelligence, machine learning, social media, (14 more...)

arXiv.org Artificial Intelligence

2505.19441

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
(13 more...)

Genre:

Workflow (1.00)
Research Report (1.00)
Questionnaire & Opinion Survey (1.00)
Personal > Interview (1.00)

Industry:

Law (1.00)
Information Technology > Services (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Rethinking LLM Bias Probing Using Lessons from the Social Sciences

Morehouse, Kirsten N., Swaroop, Siddharth, Pan, Weiwei

arXiv.org Artificial IntelligenceFeb-28-2025

The proliferation of LLM bias probes introduces three significant challenges: (1) we lack principled criteria for choosing appropriate probes, (2) we lack a system for reconciling conflicting results across probes, and (3) we lack formal frameworks for reasoning about when (and why) probe results will generalize to real user behavior. We address these challenges by systematizing LLM social bias probing using actionable insights from social sciences. We then introduce EcoLevels - a framework that helps (a) determine appropriate bias probes, (b) reconcile conflicting findings across probes, and (c) generate predictions about bias generalization. Overall, we ground our analysis in social science research because many LLM probes are direct applications of human probes, and these fields have faced similar challenges when studying social bias in humans. Based on our work, we suggest how the next generation of LLM bias probing can (and should) benefit from decades of social science research.

bias probe, probe, rethinking llm bias, (14 more...)

arXiv.org Artificial Intelligence

2503.00093

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Bolivia > Potosí Department > Tomás Frías Province > Potosí (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(11 more...)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers

Lopez, Angel Yahir Loredo, McDonald, Tyler, Emami, Ali

arXiv.org Artificial IntelligenceDec-2-2024

Large Language Models (LLMs) have shown impressive performance on various benchmarks, yet their ability to engage in deliberate reasoning remains questionable. We present NYT-Connections, a collection of 358 simple word classification puzzles derived from the New York Times Connections game. This benchmark is designed to penalize quick, intuitive "System 1" thinking, isolating fundamental reasoning skills. We evaluated six recent LLMs, a simple machine learning heuristic, and humans across three configurations: single-attempt, multiple attempts without hints, and multiple attempts with contextual hints. Our findings reveal a significant performance gap: even top-performing LLMs like GPT-4 fall short of human performance by nearly 30%. Notably, advanced prompting techniques such as Chain-of-Thought and Self-Consistency show diminishing returns as task difficulty increases. NYT-Connections uniquely combines linguistic isolation, resistance to intuitive shortcuts, and regular updates to mitigate data leakage, offering a novel tool for assessing LLM reasoning capabilities.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.01621

Country:

South America > Bolivia > Potosí Department > Tomás Frías Province > Potosí (0.04)
North America > Mexico > San Luis Potosí (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

TrustUQA: A Trustful Framework for Unified Structured Data Question Answering

Zhang, Wen, Jin, Long, Zhu, Yushan, Chen, Jiaoyan, Huang, Zhiwei, Wang, Junjie, Hua, Yin, Liang, Lei, Chen, Huajun

arXiv.org Artificial IntelligenceJun-27-2024

Natural language question answering (QA) over structured data sources such as tables and knowledge graphs (KGs) have been widely investigated, for example with Large Language Models (LLMs). The main solutions include question to formal query parsing and retrieval-based answer generation. However, current methods of the former often suffer from weak generalization, failing to dealing with multiple sources simultaneously, while the later is limited in trustfulness. In this paper, we propose UnifiedTQA, a trustful QA framework that can simultaneously support multiple types of structured data in a unified way. To this end, it adopts an LLM-friendly and unified knowledge representation method called Condition Graph (CG), and uses an LLM and demonstration-based two-level method for CG querying. For enhancement, it is also equipped with dynamic demonstration retrieval. We have evaluated UnifiedTQA with 5 benchmarks covering 3 types of structured data. It outperforms 2 existing unified structured data QA methods and in comparison with the baselines that are specific to a data type, it achieves state-of-the-art on 2 of them. Further more, we demonstrates potential of our method for more general QA tasks, QA over mixed structured data and QA across structured data.

information, query, relation, (16 more...)

arXiv.org Artificial Intelligence

2406.18916

Country:

Europe > Russia (0.14)
Asia > Russia (0.14)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)
(20 more...)

Genre:

Research Report (0.63)
Personal > Honors (0.46)

Industry:

Media > Film (1.00)
Government (0.93)
Leisure & Entertainment > Sports > Soccer (0.67)
Leisure & Entertainment > Sports > Football (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering

Zhou, Wei, Mesgar, Mohsen, Adel, Heike, Friedrich, Annemarie

arXiv.org Artificial IntelligenceApr-29-2024

Table Question Answering (TQA) aims at composing an answer to a question based on tabular data. While prior research has shown that TQA models lack robustness, understanding the underlying cause and nature of this issue remains predominantly unclear, posing a significant obstacle to the development of robust TQA systems. In this paper, we formalize three major desiderata for a fine-grained evaluation of robustness of TQA systems. They should (i) answer questions regardless of alterations in table structure, (ii) base their responses on the content of relevant cells rather than on biases, and (iii) demonstrate robust numerical reasoning capabilities. To investigate these aspects, we create and publish a novel TQA evaluation benchmark in English. Our extensive experimental analysis reveals that none of the examined state-of-the-art TQA systems consistently excels in these three aspects. Our benchmark is a crucial instrument for monitoring the behavior of TQA systems and paves the way for the development of robust TQA systems. We release our benchmark publicly.

perturbation, robustness, tqa system, (14 more...)

arXiv.org Artificial Intelligence

2404.18585

Country:

Europe > United Kingdom > England (0.28)
Asia > China (0.04)
South America > Bolivia > Potosí Department > Tomás Frías Province > Potosí (0.04)
(25 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Unlock the Future of Autonomous Drones with Innovative Secure Runtime Assurance (SRTA)

IEEE Spectrum RoboticsApr-18-2024, 15:52:55 GMT

By submitting this content request, I have legitimate interest in the content and agree that Technology Innovation Institute, their partners, and the creators of any other content I have selected may contact me regarding news, products, and services that may be of interest to me. By submitting this content request, I have legitimate interest in the content and agree that Technology Innovation Institute, their partners, and the creators of any other content I have selected may contact me regarding news, products, and services that may be of interest to me. I agree to the IEEE Privacy Policy Are you an IEEE member?

autonomous drone, information, innovative secure runtime assurance, (9 more...)

IEEE Spectrum Robotics

Country:

Oceania > Australia > Australian Indian Ocean Territories > Territory of Cocos (Keeling) Islands (0.15)
Asia > China > Hong Kong (0.15)
Oceania > Samoa (0.07)
(285 more...)

Industry:

Health & Medicine (0.49)
Consumer Products & Services (0.49)
Government (0.31)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.43)

Add feedback

Small Area Estimation of Case Growths for Timely COVID-19 Outbreak Detection

She, Zhaowei, Wang, Zilong, Chhatwal, Jagpreet, Ayer, Turgay

arXiv.org Machine LearningDec-7-2023

The COVID-19 pandemic has exerted a profound impact on the global economy and continues to exact a significant toll on human lives. The COVID-19 case growth rate stands as a key epidemiological parameter to estimate and monitor for effective detection and containment of the resurgence of outbreaks. A fundamental challenge in growth rate estimation and hence outbreak detection is balancing the accuracy-speed tradeoff, where accuracy typically degrades with shorter fitting windows. In this paper, we develop a machine learning (ML) algorithm, which we call Transfer Learning Generalized Random Forest (TLGRF), that balances this accuracy-speed tradeoff. Specifically, we estimate the instantaneous COVID-19 exponential growth rate for each U.S. county by using TLGRF that chooses an adaptive fitting window size based on relevant day-level and county-level features affecting the disease spread. Through transfer learning, TLGRF can accurately estimate case growth rates for counties with small sample sizes. Out-of-sample prediction analysis shows that TLGRF outperforms established growth rate estimation methods. Furthermore, we conducted a case study based on outbreak case data from the state of Colorado and showed that the timely detection of outbreaks could have been improved by up to 224% using TLGRF when compared to the decisions made by Colorado's Department of Health and Environment (CDPHE). To facilitate implementation, we have developed a publicly available outbreak detection tool for timely detection of COVID-19 outbreaks in each U.S. county, which received substantial attention from policymakers.

artificial intelligence, estimator, machine learning, (19 more...)

arXiv.org Machine Learning

2312.0411

Country:

North America > United States > Maryland (0.14)
North America > United States > Colorado > Mesa County (0.04)
North America > United States > Washington > Kittitas County (0.04)
(13 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Bridging Background Knowledge Gaps in Translation with Automatic Explicitation

Han, HyoJung, Boyd-Graber, Jordan Lee, Carpuat, Marine

arXiv.org Artificial IntelligenceDec-3-2023

Translations help people understand content written in another language. However, even correct literal translations do not fulfill that goal when people lack the necessary background to understand them. Professional translators incorporate explicitations to explain the missing context by considering cultural differences between source and target audiences. Despite its potential to help users, NLP research on explicitation is limited because of the dearth of adequate evaluation methods. This work introduces techniques for automatically generating explicitations, motivated by WikiExpl: a dataset that we collect from Wikipedia and annotate with human translators. The resulting explicitations are useful as they help answer questions more accurately in a multilingual question answering framework.

computational linguistic, explicitation, translation, (14 more...)

arXiv.org Artificial Intelligence

2312.01308

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France (0.05)
Asia > Middle East > Jordan (0.04)
(26 more...)

Genre: Research Report (0.50)

Industry:

Media (0.67)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Add feedback

Experimental Assessment of a Forward-Collision Warning System Fusing Deep Learning and Decentralized Radio Sensing

Cardenas, Jorge D., Contreras-Ponce, Omar, Gutierrez, Carlos A., Aguilar-Ponce, Ruth, Castillo-Soria, Francisco R., Azurdia-Meza, Cesar A.

arXiv.org Artificial IntelligenceSep-15-2023

This paper presents the idea of an automatic forward-collision warning system based on a decentralized radio sensing (RS) approach. In this framework, a vehicle in receiving mode employs a continuous waveform (CW) transmitted by a second vehicle as a probe signal to detect oncoming vehicles and warn the driver of a potential forward collision. Such a CW can easily be incorporated as a pilot signal within the data frame of current multicarrier vehicular communication systems. Detection of oncoming vehicles is performed by a deep learning (DL) module that analyzes the features of the Doppler signature imprinted on the CW probe signal by a rapidly approaching vehicle. This decentralized CW RS approach was assessed experimentally using data collected by a series of field trials conducted in a two-lanes high-speed highway. Detection performance was evaluated for two different DL models: a long short-term memory network and a convolutional neural network. The obtained results demonstrate the feasibility of the envisioned forward-collision warning system based on the fusion of DL and decentralized CW RS.

doppler signature, frequency, vehicle, (14 more...)

arXiv.org Artificial Intelligence

2309.08737

Country:

South America > Bolivia > Potosí Department > Tomás Frías Province > Potosí (0.05)
North America > Mexico > San Luis Potosí (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(8 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (1.00)
Telecommunications (0.93)
Commercial Services & Supplies > Security & Alarm Services (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models

Tonja, Atnafu Lambebo, Nigatu, Hellina Hailu, Kolesnikova, Olga, Sidorov, Grigori, Gelbukh, Alexander, Kalita, Jugal

arXiv.org Artificial IntelligenceMay-27-2023

This paper describes CIC NLP's submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. We present the system descriptions for three methods. We used two multilingual models, namely M2M-100 and mBART50, and one bilingual (one-to-one) -- Helsinki NLP Spanish-English translation model, and experimented with different transfer learning setups. We experimented with 11 languages from America and report the setups we used as well as the results we achieved. Overall, the mBART setup was able to improve upon the baseline for three out of the eleven languages.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2305.17406

Country:

Europe > Finland > Uusimaa > Helsinki (0.27)
South America > Paraguay (0.14)
South America > Peru (0.05)
(27 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback