Unsupervised Speech Representation Learning for Behavior Modeling using Triplet Enhanced Contextualized Networks
Li, Haoqi, Baucom, Brian, Narayanan, Shrikanth, Georgiou, Panayiotis
arXiv.org Artificial Intelligence
Human behavior refers to the way humans act and interact in response to a stimulus, internal or external. Understanding human behavior through observational study is one of the core methodologies in fields such as psychology and sociology (Margolin, Oliver, Gordis, O'Hearn, Medina, Ghosh and Morland, 1998). Human behaviors encompass rich information: from emotional expression, processing, and regulation to the intricate dynamics of interactions, including the context and knowledge of interlocutors and their thinking and problem-solving intent (Li, Baucom and Georgiou, 2020). Furthermore, the behavioral constructs of interest often depend on the domain of interaction (Narayanan and Georgiou, 2013). Hence, characterizing human behavior usually requires domain-specific knowledge and adequate windows of observation. Notably, across psychological health science and practice (Bone, Lee, Chaspari, Gibson and Narayanan, 2017), such as couple therapy (Christensen, Atkins, Berns, Wheeler, Baucom and Simpson, 2004), suicide cognition evaluation (Bryan, Rudd, Wertenberger, Etienne, Ray-Sannerud, Morrow, Peterson and Young-McCaughon, 2014), and addiction counseling (Xiao, Imel, Georgiou, Atkins and Narayanan, 2015), this is exemplified in the definition and derivation of a variety of domain-specific behavioral constructs (e.g., blame and affect patterns exhibited by partners, suicidal ideation of an individual at risk, and empathy expressed by a therapist in the respective aforementioned domains) that support specific subsequent plans of action. Human speech offers rich information about the mental states and traits of talkers.
Vocal cues, including speech and spoken language as well as nonverbal vocalizations and disfluency patterns, have been shown to carry information relevant to human behavior (e.g., in marital interaction (Baucom, Atkins, Simpson and Christensen, 2009) and in motivational interviewing (Amrhein, Miller, Yahne, Palmer and Fulcher, 2003; Imel, Barco, Brown, Baucom, Baer, Kircher and Atkins, 2014; Miller, Benefield and Tonigan, 1993)). Many automatic computational approaches supporting the measurement, analysis, and modeling of human behavior from speech have been investigated in affective computing (Lee and Narayanan, 2005), social signal processing (Vinciarelli, Pantic and Bourlard, 2009), and behavioral signal processing (BSP) (Narayanan and Georgiou, 2013).
Apr-1-2021