 Generative AI


California examines benefits, risks of using artificial intelligence in state government

Los Angeles Times

Artificial intelligence that can generate text, images and other content could help improve state programs but also poses risks, according to a report released by the governor's office on Tuesday. Generative AI could help quickly translate government materials into multiple languages, analyze tax claims to detect fraud, summarize public comments and answer questions about state services. Still, deploying the technology, the analysis warned, also comes with concerns around data privacy, misinformation, equity and bias. "When used ethically and transparently, GenAI has the potential to dramatically improve service delivery outcomes and increase access to and utilization of government programs," the report stated. The 34-page report, ordered by Gov. Gavin Newsom, provides a glimpse into how California could apply the technology to state programs even as lawmakers grapple with how to protect people without hindering innovation.


FOX Sports expands Google Cloud partnership, generative AI to automate archived sports video search

FOX News

Over the past nearly three decades, FOX Sports, a unit of FOX Corp., parent to Fox News and FOX Business, has accumulated a vast amount of video footage. Millions of hours' worth of sports-related content live within its archives. At any given time, various individuals have been tasked with sorting through the seemingly endless footage in order to produce new pieces of content.


Can You Trust a Robot With Your Holiday Shopping?

WSJ.com: WSJD - Technology

Artificial intelligence could upend the way we shop for loved ones around the holidays. It's just not ready to do everything yet. A third of people say they plan to use ChatGPT or some other generative AI while holiday shopping this year, according to a September analytics survey by Adobe. Those surveyed said they wanted AI to surface good deals, brand recommendations and gift alternatives.


Gender inference: can chatGPT outperform common commercial tools?

arXiv.org Artificial Intelligence

An increasing number of studies use gender information to understand phenomena such as gender bias, inequity in access and participation, or the impact of the Covid pandemic response. Unfortunately, most datasets do not include self-reported gender information, making it necessary for researchers to infer gender from other information, such as names alone or names combined with country information. An important limitation of gender inference tools is that they fail to capture the fact that gender exists on a non-binary scale; however, it remains important to evaluate and compare how well these tools perform in a variety of contexts. In this paper, we compare the performance of a generative Artificial Intelligence (AI) tool, ChatGPT, with three commercially available list-based and machine learning-based gender inference tools (Namsor, Gender-API, and genderize.io) on a unique dataset. Specifically, we use a large Olympic athlete dataset and report how variations in the input (e.g., first name and first and last name, with and without country information) impact the accuracy of their predictions. We report results for the full set, as well as for the subsets: medal versus non-medal winners, athletes from the largest English-speaking countries, and athletes from East Asia. On these sets, we find that Namsor is the best traditional commercially available tool. However, ChatGPT performs at least as well as Namsor and often outperforms it, especially for the female sample when country and/or last name information is available. All tools perform better on medalists versus non-medalists and on names from English-speaking countries. Although not designed for this purpose, ChatGPT may be a cost-effective tool for gender prediction. In the future, it might even be possible for ChatGPT or other large scale language models to better identify self-reported gender rather than report gender on a binary scale.


Who is leading in AI? An analysis of industry AI research

arXiv.org Artificial Intelligence

AI research is increasingly industry-driven, making it crucial to understand company contributions to this field. We compare leading AI companies by research publications, citations, size of training runs, and contributions to algorithmic innovations. Our analysis reveals the substantial role played by Google, OpenAI and Meta. We find that these three companies have been responsible for some of the largest training runs, developed a large fraction of the algorithmic innovations that underpin large language models, and led in various metrics of citation impact. In contrast, leading Chinese companies such as Tencent and Baidu had a lower impact on many of these metrics compared to US counterparts. We observe many industry labs are pursuing large training runs, and that training runs from relative newcomers -- such as OpenAI and Anthropic -- have matched or surpassed those of long-standing incumbents such as Google. The data reveals a diverse ecosystem of companies steering AI progress, though US labs such as Google, OpenAI and Meta lead across critical metrics.


Potential Societal Biases of ChatGPT in Higher Education: A Scoping Review

arXiv.org Artificial Intelligence

ChatGPT and other Generative Artificial Intelligence (GAI) models tend to inherit and even amplify prevailing societal biases as they are trained on large amounts of existing data. Given the increasing usage of ChatGPT and other GAI by students, faculty members, and staff in higher education institutions (HEIs), there is an urgent need to examine the ethical issues involved, such as potential biases. In this scoping review, we clarify the ways in which biases related to GAI in higher education settings have been discussed in recent academic publications and identify what types of potential biases are commonly reported in this body of literature. We searched for academic articles written in English, Chinese, and Japanese across four main databases concerned with GAI usage in higher education and bias. Our findings show that while there is an awareness of potential biases around large language models (LLMs) and GAI, the majority of articles touch on "bias" at a relatively superficial level. Few identify what types of bias may occur under what circumstances. Neither do they discuss the possible implications for higher education, staff, faculty members, or students. There is a notable lack of empirical work at this point, and we call for higher education researchers and AI experts to conduct more research in this area.


Ethical implications of ChatGPT in higher education: A scoping review

arXiv.org Artificial Intelligence

This scoping review explores the ethical challenges of using ChatGPT in education, focusing particularly on issues related to higher education. By reviewing recent academic articles written in English, Chinese, and Japanese, we aimed to provide a comprehensive overview of relevant research while identifying gaps for future considerations. Drawing on Arksey and O'Malley's (2005) five-stage scoping review framework, we identified research questions and search terms, and conducted article searches across four databases in the three target languages. Each article was reviewed by at least two researchers identifying the main ethical issues of utilizing AI in education, particularly higher education. Our analysis of ethical issues followed the framework developed by DeepMind (Weidinger et al., 2021) to identify six main areas of ethical concern in Language Models. The majority of papers were concerned with misinformation harms (n=25) and/or human-computer interaction related harms (n=24). Given the rapid deployment of Generative Artificial Intelligence (GAI), it is imperative for educators to conduct more empirical studies to develop sound ethical policies for the use of GAI.


One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space

arXiv.org Machine Learning

Deploying Large Language Models (LLMs) in streaming applications that involve long contexts, particularly for extended dialogues and text analysis, is of paramount importance but presents two significant challenges. Firstly, the memory consumption is substantial during the decoding phase due to the caching of Key and Value states (KV) of previous tokens. Secondly, attention computation is time-consuming with a time complexity of $O(n^2)$ for the generation of each token. At the recent OpenAI DevDay (Nov 6, 2023), OpenAI released a new model that is able to support a 128K-long document. In our paper, we focus on the memory-efficiency issue when context length $n$ is much greater than 128K ($n \gg 2^d$). Considering a single-layer self-attention with Query, Key, and Value matrices $Q, K, V \in \mathbb{R}^{n \times d}$, the polynomial method approximates the attention output $T \in \mathbb{R}^{n \times d}$. It accomplishes this by constructing $U_1, U_2 \in \mathbb{R}^{n \times t}$ to expedite the computation of attention ${\sf Attn}(Q, K, V)$ in $n^{1+o(1)}$ time. Despite this, storing the Key and Value matrices $K, V \in \mathbb{R}^{n \times d}$ still necessitates $O(nd)$ space, leading to significant memory usage. In response to these challenges, we introduce a new algorithm that reads the data in a single streaming pass. This method employs sublinear space $o(n)$ to store three sketch matrices, alleviating the need for exact $K, V$ storage. Notably, our algorithm exhibits exceptional memory-efficient performance with super-long tokens. As the token length $n$ increases, our error guarantee diminishes while the memory usage remains nearly constant. This unique attribute underscores the potential of our technique in efficiently handling LLMs in streaming applications.
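To make the one-pass idea in the abstract above concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it does not build the three sketch matrices the abstract mentions, and it uses a much simpler stand-in, a degree-2 Taylor feature map `phi`, to approximate the softmax weights. The names `phi`, `StreamingAttention`, and `softmax_attention` are hypothetical, and the approximation is only reasonable when query-key dot products are small. What it does show is that the key and value stream can be folded into fixed-size summaries whose size depends on the dimension $d$ but not on the context length $n$.

```python
import math
import random


def phi(x):
    """Degree-2 Taylor feature map (an illustrative assumption, not the paper's):
    phi(q) . phi(k) == 1 + q.k + (q.k)**2 / 2, which approximates exp(q.k)."""
    quad = [a * b / math.sqrt(2.0) for a in x for b in x]
    return [1.0] + list(x) + quad


class StreamingAttention:
    """Folds (key, value) pairs into fixed-size summaries in a single pass."""

    def __init__(self, d_key, d_value):
        t = 1 + d_key + d_key * d_key  # feature dimension, independent of n
        self.S = [[0.0] * d_value for _ in range(t)]  # sum_i phi(k_i) v_i^T
        self.z = [0.0] * t                            # sum_i phi(k_i)

    def update(self, key, value):
        """Consume one (key, value) pair; the pair can be discarded afterwards."""
        for r, fr in enumerate(phi(key)):
            self.z[r] += fr
            row = self.S[r]
            for c, vc in enumerate(value):
                row[c] += fr * vc

    def query(self, q):
        """Approximate attention output for one query from the summaries alone."""
        f = phi(q)
        # Each weight 1 + x + x^2/2 is positive, so denom > 0 after any update.
        denom = sum(fr * zr for fr, zr in zip(f, self.z))
        d_value = len(self.S[0])
        numer = [sum(f[r] * self.S[r][c] for r in range(len(f)))
                 for c in range(d_value)]
        return [x / denom for x in numer]


def softmax_attention(q, keys, values):
    """Exact reference; needs all n keys and values in memory at once."""
    w = [math.exp(sum(a * b for a, b in zip(q, k))) for k in keys]
    total = sum(w)
    return [sum(wi * v[c] for wi, v in zip(w, values)) / total
            for c in range(len(values[0]))]


if __name__ == "__main__":
    rng = random.Random(0)
    d, n = 3, 200
    stream = StreamingAttention(d, d)
    keys, values = [], []  # kept only so the exact reference can be computed
    for _ in range(n):
        k = [rng.uniform(-0.3, 0.3) for _ in range(d)]
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        keys.append(k)
        values.append(v)
        stream.update(k, v)
    q = [0.2, -0.1, 0.3]
    print("streaming:", stream.query(q))
    print("exact:    ", softmax_attention(q, keys, values))
```

In this sketch the summaries occupy $O(d^3)$ numbers regardless of $n$, so memory stays constant as the stream grows. The price is a fixed approximation error from truncating the Taylor series, unlike the error guarantee described in the abstract.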


OpenAI's directors have been anything but open. What the hell happened?

The Guardian

The OpenAI farce has moved at such speed in the past week that it is easy to forget that nobody has yet said in clear terms why Sam Altman – the returning chief executive and all-round genius, according to his vocal fanclub – was fired in the first place. Since we are constantly told, not least by Altman himself, that the worst outcome from the adoption of artificial general intelligence could be "lights out for all of us", somebody needs to find a voice here. If the old board judged, for example, that Altman was unfit for the job because he was taking OpenAI down a reckless path, lights-wise, there would plainly be an obligation to speak up. Or, if the fear is unfounded, the architects of the failed boardroom coup could do everybody a favour and say so. Saying nothing useful, especially when your previous stance has been that transparency and safety go hand in hand, is indefensible.


'Huge egos are in play': behind the firing and rehiring of OpenAI's Sam Altman

The Guardian

OpenAI's messy firing and re-hiring of its powerful chief executive this week shocked the tech world. But the power struggle has implications beyond the company's boardroom, AI experts said. It throws into relief the greenness of the AI industry and the strong desire in Silicon Valley to be first, and raises urgent questions about the safety of the technology. "The AI that we're looking at now is immature. There are no standards, no professional body, no certifications. Everybody figures out how to do it, figures out their own internal norms," said Rayid Ghani, a professor of machine learning and public policy at Carnegie Mellon University.