Segou
Dealing with the Hard Facts of Low-Resource African NLP
Diarra, Yacouba, Coulibaly, Nouhoum Souleymane, Kamaté, Panga Azazia, Tall, Madani Amadou, Koné, Emmanuel Élisé, Dembélé, Aymane, Leventhal, Michael
Creating speech datasets, models, and evaluation frameworks for low-resource languages remains challenging given the lack of a broad base of pertinent experience to draw from. This paper reports on the field collection of 612 hours of spontaneous speech in Bambara, a low-resource West African language; the semi-automated annotation of that dataset with transcriptions; the creation of several monolingual ultra-compact and small models using the dataset; and the automatic and human evaluation of their output. We offer practical suggestions for data collection protocols, annotation, and model design, as well as evidence for the importance of performing human evaluation. In addition to the main dataset, multiple evaluation datasets, models, and code are made publicly available.
- Africa > Mali > Bamako > Bamako (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
Real-time Spatial Retrieval Augmented Generation for Urban Environments
Campo, David Nazareno, Conde, Javier, Alonso, Álvaro, Huecas, Gabriel, Salvachúa, Joaquín, Reviriego, Pedro
The proliferation of Generative Artificial Intelligence (AI), especially Large Language Models, presents transformative opportunities for urban applications through Urban Foundation Models. However, base models face limitations, as they only contain the knowledge available at the time of training, and updating them is both time-consuming and costly. Retrieval Augmented Generation (RAG) has emerged in the literature as the preferred approach for injecting contextual information into Foundation Models. It prevails over techniques such as fine-tuning, which are less effective in dynamic, real-time scenarios like those found in urban environments. However, traditional RAG architectures, based on semantic databases, knowledge graphs, structured data, or AI-powered web searches, do not fully meet the demands of urban contexts. Urban environments are complex systems characterized by large volumes of interconnected data, frequent updates, real-time processing requirements, security needs, and strong links to the physical world. This work proposes a real-time spatial RAG architecture that defines the necessary components for the effective integration of generative AI into cities, leveraging temporal and spatial filtering capabilities through linked data. The proposed architecture is implemented using FIWARE, an ecosystem of software components to develop smart city solutions and digital twins. The design and implementation are demonstrated through the use case of a tourism assistant in the city of Madrid. The use case serves to validate the correct integration of Foundation Models through the proposed RAG architecture.
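As a rough illustration of the temporal and spatial filtering the abstract mentions, the sketch below keeps only records that are both close to the user and recently observed before they would be passed to a model as context. It is not the paper's implementation and does not use FIWARE: the in-memory list of dictionaries, the field names (`name`, `lat`, `lon`, `observed_at`), the function names, and the default radius and age limit are all hypothetical choices made for this example.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def retrieve_context(entities, lat, lon, now,
                     radius_km=2.0, max_age=timedelta(minutes=30)):
    """Return entities within radius_km of (lat, lon) observed within max_age.

    Hypothetical record layout: each entity is a dict with the keys
    'name', 'lat', 'lon', and 'observed_at' (a datetime).
    Results are ordered from nearest to farthest.
    """
    hits = []
    for entity in entities:
        distance = haversine_km(lat, lon, entity["lat"], entity["lon"])
        is_fresh = now - entity["observed_at"] <= max_age
        if distance <= radius_km and is_fresh:
            hits.append((distance, entity))
    hits.sort(key=lambda pair: pair[0])  # sort by distance only
    return [entity for _, entity in hits]


# Illustrative data only: approximate coordinates, made-up timestamps.
NOW = datetime(2025, 1, 1, 12, 0)
ENTITIES = [
    {"name": "near_fresh", "lat": 40.4170, "lon": -3.7036,
     "observed_at": NOW - timedelta(minutes=5)},
    {"name": "museum_fresh", "lat": 40.4138, "lon": -3.6921,
     "observed_at": NOW - timedelta(minutes=10)},
    {"name": "near_stale", "lat": 40.4170, "lon": -3.7036,
     "observed_at": NOW - timedelta(hours=2)},
    {"name": "far_fresh", "lat": 41.3851, "lon": 2.1734,
     "observed_at": NOW - timedelta(minutes=1)},
]

context = retrieve_context(ENTITIES, 40.4169, -3.7035, NOW)
```

In this sketch the stale and the distant records are dropped, so only the two nearby, recent records would reach the prompt. A real deployment would presumably push these filters down into the context store's query rather than filtering in application code.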
- Europe > Spain > Galicia > Madrid (0.26)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Singapore (0.04)
- Information Technology > Security & Privacy (0.93)
- Transportation (0.68)
Sequential Change Point Detection via Denoising Score Matching
Zhou, Wenbin, Xie, Liyan, Peng, Zhigang, Zhu, Shixiang
Sequential change-point detection plays a critical role in numerous real-world applications, where timely identification of distributional shifts can greatly mitigate adverse outcomes. Classical methods commonly rely on parametric density assumptions of pre- and post-change distributions, limiting their effectiveness for high-dimensional, complex data streams. This paper proposes a score-based CUSUM change-point detection method, in which the score functions of the data distribution are estimated by injecting noise and applying denoising score matching. We consider both offline and online versions of score estimation. Through theoretical analysis, we demonstrate that denoising score matching can enhance detection power by effectively controlling the injected noise scale. Finally, we validate the practical efficacy of our method through numerical experiments on two synthetic datasets and a real-world earthquake precursor detection task, demonstrating its effectiveness in challenging scenarios.
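To make the idea of a score-based CUSUM recursion concrete, here is a minimal one-dimensional sketch. It assumes the common formulation in which the log-likelihood ratio of classical CUSUM is replaced by a difference of Hyvärinen scores, and it uses closed-form Gaussian scores as a stand-in for the scores that the paper estimates with denoising score matching. The function names, the `scale` multiplier, and the threshold value are hypothetical; this is not the authors' code.

```python
def hyvarinen_score(x, mu, sigma):
    """Hyvarinen score of a 1-D Gaussian N(mu, sigma^2) at x.

    Defined as 0.5 * (d/dx log p(x))^2 + d^2/dx^2 log p(x).
    Analytic stand-in (assumption): a learned score model would replace this.
    """
    grad_log_p = -(x - mu) / sigma ** 2
    laplacian_log_p = -1.0 / sigma ** 2
    return 0.5 * grad_log_p ** 2 + laplacian_log_p


def score_cusum(stream, pre, post, threshold, scale=1.0):
    """Return the first index at which the statistic reaches threshold, else None.

    pre and post are (mu, sigma) pairs for the pre- and post-change models.
    The increment is positive when the post-change model fits x better.
    """
    statistic = 0.0
    for t, x in enumerate(stream):
        increment = scale * (hyvarinen_score(x, *pre) - hyvarinen_score(x, *post))
        statistic = max(statistic + increment, 0.0)  # CUSUM reset at zero
        if statistic >= threshold:
            return t
    return None


# Deterministic toy stream: the mean jumps from 0 to 3 at index 50.
stream = [0.0] * 50 + [3.0] * 50
alarm = score_cusum(stream, pre=(0.0, 1.0), post=(3.0, 1.0), threshold=10.0)
```

On this toy stream each pre-change sample contributes an increment of -4.5, which the reset clips to zero, and each post-change sample contributes +4.5, so the statistic crosses the threshold of 10 on the third post-change sample (index 52).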
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Data Science > Data Mining (0.68)
Coming of Age: Emerging Technologies And The World's Children
Read "technology" and "children" in the same sentence, and you'll probably think about screen time or social media. But technology's implications are vastly more profound: AI, machine learning, big data and automation will fundamentally reshape the lives of our youngest generation. How might we direct the power of emerging innovations to fulfill their rights? One trailblazer addressing this question is Erica Kochi, Co-Founder of UNICEF Innovation at the United Nations Children's Fund, who was named one of TIME's most influential people in the world. Erica continues to accelerate action – unveiling a new urban tech bets opportunity just this week – and to drive crucial dialogues as Co-Chair of the World Economic Forum's Global Future Council on Human Rights.
- South America > Colombia (0.16)
- Africa > Senegal (0.05)
- Africa > Mali > Segou > Segou (0.05)
- Government > Intergovernmental Programs (0.56)
- Health & Medicine > Therapeutic Area > Immunology (0.49)
- Banking & Finance > Economy (0.36)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.31)