Collaborating Authors

 Cramer, Henriette


Sociotechnical Implications of Generative Artificial Intelligence for Information Access

arXiv.org Artificial Intelligence

Robust access to trustworthy information is a critical need for society, with implications for knowledge production, public health education, and promoting an informed citizenry in democratic societies. Generative AI technologies such as large language models (LLMs) may enable new ways to access information and improve the effectiveness of existing information retrieval (IR) systems. More efficient execution of basic tasks with the help of LLMs can also enable people to focus on the more challenging aspects of information retrieval-related tasks and research. However, the long-term social implications of deploying these technologies in the context of information access are not yet well understood. Existing research has focused on how these models may generate biased and harmful content [11, 23, 69, 80, 124, 158, 236] as well as the environmental costs [23, 31, 61, 166, 167, 241] of developing and deploying these models at scale. In the context of information access, Shah and Bender [187] have argued that certain framings of LLMs as "search engines" lack the necessary theoretical underpinnings and may constitute a category error. In this work, we present a broader perspective on the sociotechnical implications of generative AI for information access. Our perspective is informed by existing literature and aims to provide a summary of known challenges, viewed through a systemic lens, that we hope will serve as a useful resource for future critical research in this area. We present a summary of these implications next, followed by recommendations for evaluation and mitigation later in this chapter.


Challenges and Methods in Design of Domain-specific Voice Assistants

AAAI Conferences

Most existing voice assistants, such as Alexa, Siri, Google Assistant, and Cortana, are generalists. They act as a unifying voice interface to a myriad of controls but rarely support domain-specific expert functionalities. There are efforts to provide more targeted assistant experiences and capabilities in specific application areas. In this paper, we discuss several challenges and opportunities in the design of domain-specific voice assistants. We outline a variety of methods for building and using an understanding of domain-specific user language, along with ideas for prototyping and studying the envisioned user experiences.


Describing and Understanding Neighborhood Characteristics through Online Social Media

arXiv.org Machine Learning

Geotagged data can be used to describe regions in the world and discover local themes. However, not all data produced within a region is necessarily descriptive of that area. To surface the content that is characteristic of a region, we present the geographical hierarchy model (GHM), a probabilistic model based on the assumption that data observed in a region is a random mixture of content that pertains to different levels of a hierarchy. We apply the GHM to a dataset of 8 million Flickr photos in order to discriminate between content (i.e., tags) that specifically characterizes a region (e.g., neighborhood) and content that characterizes surrounding areas or more general themes. Knowledge of the discriminative and non-discriminative terms used throughout the hierarchy enables us to quantify the uniqueness of a given region and to compare similar but distant regions. Our evaluation demonstrates that our model improves upon traditional Naive Bayes classification by 47% and hierarchical TF-IDF by 27%. We further highlight the differences and commonalities with human reasoning about what is locally characteristic of a neighborhood, distilled from ten interviews and a survey that covered themes such as time, events, and prior regional knowledge.
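The mixture assumption described in this abstract can be illustrated with a small sketch. This is not the authors' GHM implementation: the level names, tag distributions, and mixing weights below are hypothetical toy values, and a real model would estimate them from data rather than fix them by hand. The sketch only shows how, under a mixture over hierarchy levels, one could compute the posterior probability that a tag observed in a region came from each level.

```python
def level_responsibilities(tag, level_dists, weights):
    """Posterior probability that `tag` was generated by each hierarchy level.

    level_dists: level name -> {tag: P(tag | level)}  (assumed given)
    weights:     level name -> mixing weight P(level) (assumed given)
    """
    # Joint probability P(level) * P(tag | level) for every level.
    joint = {lvl: weights[lvl] * level_dists[lvl].get(tag, 0.0) for lvl in weights}
    total = sum(joint.values())
    if total == 0.0:
        # The tag is unseen at every level; no level explains it.
        return {lvl: 0.0 for lvl in weights}
    # Normalize so the responsibilities sum to one (Bayes' rule).
    return {lvl: p / total for lvl, p in joint.items()}


# Hypothetical toy parameters for one neighborhood and its ancestors.
LEVEL_DISTS = {
    "neighborhood": {"castro": 0.5, "rainbow": 0.3, "sanfrancisco": 0.2},
    "city": {"sanfrancisco": 0.6, "goldengate": 0.3, "castro": 0.1},
    "global": {"travel": 0.5, "california": 0.4, "sanfrancisco": 0.1},
}
WEIGHTS = {"neighborhood": 0.5, "city": 0.3, "global": 0.2}

if __name__ == "__main__":
    for tag in ("castro", "sanfrancisco", "travel"):
        resp = level_responsibilities(tag, LEVEL_DISTS, WEIGHTS)
        best = max(resp, key=resp.get)
        print(tag, "->", best, {k: round(v, 3) for k, v in resp.items()})
```

With these toy values, a tag such as "castro" is attributed mostly to the neighborhood level, while "sanfrancisco" is attributed mostly to the city level, which mirrors the distinction the abstract draws between discriminative and non-discriminative tags.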