Collaborating Authors

 pittman


Towards a Method for Synthetic Generation of Persons with Aphasia Transcripts

arXiv.org Artificial Intelligence

Jason M. Pittman¹, Anton Phillips Jr.², Yesenia Medina-Santos², Brielle C. Stark²
¹University of Maryland Global Campus
²Indiana University Bloomington, Department of Speech, Language and Hearing Sciences

ABSTRACT

In aphasia research, Speech-Language Pathologists (SLPs) devote extensive time to manually coding speech samples using Correct Information Units (CIUs), a measure of how informative an individual sample of speech is. The development of automated systems to recognize aphasic language is limited by data scarcity. For example, only about 600 transcripts are available in AphasiaBank, whereas large language models (LLMs) are trained on billions of tokens. In the broader field of machine learning (ML), researchers increasingly turn to synthetic data when real data are sparse. Therefore, this study constructs and validates two methods to generate synthetic transcripts of the AphasiaBank Cat Rescue picture description task. One method uses a procedural programming approach, while the second uses the Mistral 7b Instruct and Llama 3.1 8b Instruct LLMs. The methods generate transcripts across four severity levels (Mild, Moderate, Severe, Very Severe) through word dropping, filler insertion, and paraphasia substitution. Overall, compared with human-elicited transcripts, we found that Mistral 7b Instruct best captures key aspects of the linguistic degradation observed in aphasia among the synthetic generation methods, showing realistic directional changes in number of different words (NDW), word count, and word length. Based on the results, future work should create a larger dataset, fine-tune models for better aphasic representation, and have SLPs assess the realism and usefulness of the synthetic transcripts.

Keywords: aphasia, synthetic data, natural language processing, machine learning

Introduction

Per Nicholas and Brookshire (1993), coding Correct Information Units (CIUs) involves transcribing a connected speech sample verbatim, counting all intelligible words, and then identifying as a CIU each word that is intelligible, accurate, relevant, and informative about the topic, excluding fillers, repetitions, and tangential remarks. From these counts, clinicians calculate the percentage of CIUs and CIUs per minute to quantify communicative informativeness and efficiency.
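The two CIU summary measures described above reduce to simple ratios. The following is a minimal sketch, not the authors' tooling: the `CiuCounts` container and both function names are hypothetical, and it assumes the word counts and sample duration have already been obtained by a human coder.

```python
from dataclasses import dataclass


@dataclass
class CiuCounts:
    """Hypothetical container for one coded speech sample."""
    total_words: int         # all intelligible words in the sample
    ciu_words: int           # words also judged accurate, relevant, informative
    duration_seconds: float  # length of the speech sample


def percent_cius(counts: CiuCounts) -> float:
    """Percentage of intelligible words that count as CIUs (informativeness)."""
    if counts.total_words == 0:
        return 0.0  # assumption: an empty sample is scored as 0%
    return 100.0 * counts.ciu_words / counts.total_words


def cius_per_minute(counts: CiuCounts) -> float:
    """CIUs produced per minute of speech (efficiency)."""
    if counts.duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return counts.ciu_words / (counts.duration_seconds / 60.0)


# Illustrative numbers only, not taken from any real transcript.
sample = CiuCounts(total_words=120, ciu_words=90, duration_seconds=90.0)
print(percent_cius(sample))     # 75.0
print(cius_per_minute(sample))  # 60.0
```

The judgment of which words qualify as CIUs remains the manual, time-consuming step; the sketch only covers the arithmetic that follows it.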


Facebook and Matterport collaborate on realistic virtual training environments for AI – TechCrunch

#artificialintelligence

To train a robot to navigate a house, you either need to give it a lot of real time in a lot of real houses, or a lot of virtual time in a lot of virtual houses. The latter is definitely the better option, and Facebook and Matterport are working together to make thousands of virtual, interactive digital twins of real spaces available for researchers and their voracious young AIs. On Facebook's side the big advance is in two parts: the new Habitat 2.0 training environment and the dataset they created to enable it. You may remember Habitat from a couple years back; in the pursuit of what it calls "embodied AI," which is to say AI models that interact with the real world, Facebook assembled a number of passably photorealistic virtual environments for them to navigate. Many robots and AIs have learned things like movement and object recognition in idealized, unrealistic spaces that resemble games more than reality.


Virtual house hunting gets a pandemic boost

BBC News

Temporarily forgetting she is sitting beside me, I shout to my wife: "I'm in the children's bedroom." We can't go to the Republic of Ireland ourselves to do this. Travellers from Great Britain need to restrict their movements for a fortnight, so nipping over and back is off the cards. But I can take several paces through a virtual seaside flat in Dublin's Dún Laoghaire, while based in our south London home. Circles appear on the floor of the Dublin flat and, using hand controls, I can glide between them and explore.


Finding the Root - Jason M. Pittman

#artificialintelligence

You may have thought we were done with decision trees. I am done with respect to discussing general approaches and types of problems. You could say that we're moving from a view of the forest to finding the root for our tree. However, there is a bit more to explore when it comes to the underlying mathematical functions associated with navigating data to construct our trees. In our last discussion, I introduced the concept of a cost function and gave a specific example in the Gini coefficient.
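As a rough illustration of a cost function for choosing a split, the sketch below assumes that the "Gini coefficient" mentioned here refers to the Gini impurity commonly used in decision trees. The function names `gini_impurity` and `split_cost` are hypothetical and are not taken from the post.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Gini impurity of a group: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty group is treated as pure
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_cost(left: Sequence[Hashable], right: Sequence[Hashable]) -> float:
    """Size-weighted average impurity of the two groups a split produces."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


# A split that separates the classes perfectly costs 0.0;
# one that leaves both groups evenly mixed costs 0.5.
print(split_cost(["a", "a"], ["b", "b"]))  # 0.0
print(split_cost(["a", "b"], ["a", "b"]))  # 0.5
```

Under this assumption, the candidate split with the lowest cost would be chosen as the root of the tree.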


Artificial Intelligence Dominates The Retail Conversation At Shoptalk Europe

#artificialintelligence

Target is using Pinterest's Lens visual search technology. If there was one overarching term at Shoptalk Europe this week, it was artificial intelligence. From machine learning to visual search, natural language processing and more, the role of systems that facilitate smarter and more personalized customer experiences was key. Keynote talks from Google, Alibaba, Westfield and more all referenced such a focus, with repeats of numerous big stats bandied about in terms of where this space is moving. By 2020, 85% of customer interaction in retail will be managed by AI, according to Gartner, multiple speakers said. And 30% of all companies will employ AI to augment at least one of their primary sales processes by the same time period, they further added.


eBay's new Google Home chatbot can tell you how much your stuff is worth

PCWorld

Determining the value of some of that old gear taking up space in your closet or garage might become a lot easier if you have a Google Home. In an onstage demo at Google Cloud Next, eBay chief product officer RJ Pittman showed how the online auctioneer might tie into Google's digital assistant. He started by asking the bot if eBay could find the value of his Canon digital camera. "You can ask me what something is worth," the bot introduced itself. The chatbot asked a couple of follow-up questions, including the model of the camera (it was an EOS 5D), if it was new, and its overall condition.


What to see in L.A. galleries: An ode to a black sci-fi trailblazer and Lari Pittman 'Mood Books'

Los Angeles Times

The best part of "Radio Imagination: Artists in the Archive of Octavia E. Butler" is the view it provides into Butler's archive itself. Butler, who died in 2006, was a bestselling Pasadena novelist and the only science fiction writer to win a MacArthur fellowship. She was also African American, and her novels reworked the sci-fi genre with far-reaching insights on race, sex and gender. The exhibition at the Armory Center for the Arts in Pasadena features works by eight artists who were granted access to Butler's archives at the Huntington Library in San Marino. Their responses take a variety of forms, including sound and video as well as photography, drawing and installation.


EBay eyes 'huge opportunities' to personalize shopping through artificial intelligence

#artificialintelligence

LAS VEGAS--Cutting-edge artificial intelligence (AI) technologies are the keys to delivering richer, more sophisticated shopping experiences, eBay Chief Product Officer RJ Pittman said during a presentation Tuesday here at the Shoptalk 2016 retail conference. "I've yet to see great, personalized shopping experiences at scale," Pittman said. "We think there's huge opportunities to move forward through AI, machine learning and predictive modeling." Pittman cited the potential commerce benefits to be gained from AI technologies like natural language understanding, which enables software systems to understand human speech as it is spoken. For example, by processing the information in a shopper statement like "Going with my wife on a camping trip in Tahoe next month, need a tent," eBay would automatically suggest a tent large enough for two people that's also appropriate for Tahoe's altitude and average temperatures in June.