Collaborating Authors

 Garmash, Ekaterina


PODTILE: Facilitating Podcast Episode Browsing with Auto-generated Chapters

arXiv.org Artificial Intelligence

Listeners of long-form talk-audio content, such as podcast episodes, often find it challenging to understand the overall structure and locate relevant sections. A practical solution is to divide episodes into chapters: semantically coherent segments labeled with titles and timestamps. Since most episodes on our platform at Spotify currently lack creator-provided chapters, automating the creation of chapters is essential. Scaling the chapterization of podcast episodes presents unique challenges. First, episodes tend to be less structured than written texts, featuring spontaneous discussions with nuanced transitions. Second, the transcripts are usually lengthy, averaging about 16,000 tokens, which necessitates efficient processing that can preserve context. To address these challenges, we introduce PODTILE, a fine-tuned encoder-decoder transformer that segments conversational data. The model simultaneously generates chapter transitions and titles for the input transcript. To preserve context, each input text is augmented with global context, including the episode's title, description, and previous chapter titles. In our intrinsic evaluation, PODTILE achieved an 11% improvement in ROUGE score over the strongest baseline. Additionally, we provide insights into the practical benefits of auto-generated chapters for listeners navigating episode content. Our findings indicate that auto-generated chapters serve as a useful tool for engaging with less popular podcasts. Finally, we present empirical evidence that using chapter titles can enhance the effectiveness of sparse retrieval in search tasks.


Cem Mil Podcasts: A Spoken Portuguese Document Corpus For Multi-modal, Multi-lingual and Multi-Dialect Information Access Research

arXiv.org Artificial Intelligence

In this paper we describe the Portuguese-language podcast dataset we have released for academic research purposes. We give an overview of how the data was sampled, descriptive statistics over the collection, and information about the distribution over Brazilian and Portuguese dialects. We give results from experiments on multi-lingual summarization, showing that summarizing podcast transcripts can be performed well by a system supporting both English and Portuguese. We also show experiments on Portuguese podcast genre classification using text metadata. Combining this collection with the previously released English-language collection opens up the potential for multi-modal, multi-lingual and multi-dialect podcast information access research.


Connecting degree and polarity: An artificial language learning study

arXiv.org Artificial Intelligence

Linguistic expressions can be characterized along a variety of properties: what they mean, what parts they consist of, how they combine with other expressions and so on. Some of these properties are systematically related to each other. When these relations appear systematically in language after language, they can be grounds for implicational linguistic universals, for example, Greenberg's Universal 37: A language never has more gender categories in nonsingular numbers than in the singular (Greenberg, 1963). Here, two properties of linguistic expressions are related: the grammatical number of an expression and how many gender distinctions are available for this expression. More complex generalizations may concern correlation between continuous properties A and B.

One prominent method is Artificial Language Learning (Friederici et al., 2002; Motamedi et al., 2019; Kanwal et al., 2017; Culbertson et al., 2012; Ettlinger et al., 2014; Finley and Badecker, 2009). It has the following main ingredients:

1. fragment of an artificial language in the form of expressions that do not belong to the language that participants are speakers of;
2. training phase, where some information about the language fragment is given to the participants;
3. testing phase, where it is checked what other knowledge, beside the provided, was inferred