Segou
Dealing with the Hard Facts of Low-Resource African NLP
Diarra, Yacouba, Coulibaly, Nouhoum Souleymane, Kamaté, Panga Azazia, Tall, Madani Amadou, Koné, Emmanuel Élisé, Dembélé, Aymane, Leventhal, Michael
Creating speech datasets, models, and evaluation frameworks for low-resource languages remains challenging given the lack of a broad base of pertinent experience to draw from. This paper reports on the field collection of 612 hours of spontaneous speech in Bambara, a low-resource West African language; the semi-automated annotation of that dataset with transcriptions; the creation of several monolingual ultra-compact and small models using the dataset; and the automatic and human evaluation of their output. We offer practical suggestions for data collection protocols, annotation, and model design, as well as evidence for the importance of performing human evaluation. In addition to the main dataset, multiple evaluation datasets, models, and code are made publicly available.
- Africa > Mali > Bamako > Bamako (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- (8 more...)
Real-time Spatial Retrieval Augmented Generation for Urban Environments
Campo, David Nazareno, Conde, Javier, Alonso, Álvaro, Huecas, Gabriel, Salvachúa, Joaquín, Reviriego, Pedro
The proliferation of Generative Artificial Intelligence (AI), especially Large Language Models, presents transformative opportunities for urban applications through Urban Foundation Models. However, base models face limitations, as they only contain the knowledge available at the time of training, and updating them is both time-consuming and costly. Retrieval Augmented Generation (RAG) has emerged in the literature as the preferred approach for injecting contextual information into Foundation Models. It prevails over techniques such as fine-tuning, which are less effective in dynamic, real-time scenarios like those found in urban environments. However, traditional RAG architectures, based on semantic databases, knowledge graphs, structured data, or AI-powered web searches, do not fully meet the demands of urban contexts. Urban environments are complex systems characterized by large volumes of interconnected data, frequent updates, real-time processing requirements, security needs, and strong links to the physical world. This work proposes a real-time spatial RAG architecture that defines the necessary components for the effective integration of generative AI into cities, leveraging temporal and spatial filtering capabilities through linked data. The proposed architecture is implemented using FIWARE, an ecosystem of software components to develop smart city solutions and digital twins. The design and implementation are demonstrated through the use case of a tourism assistant in the city of Madrid. The use case serves to validate the correct integration of Foundation Models through the proposed RAG architecture.
- Europe > Spain > Community of Madrid > Madrid (0.26)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Singapore (0.04)
- (6 more...)
- Information Technology > Security & Privacy (0.93)
- Transportation (0.68)
Sequential Change Point Detection via Denoising Score Matching
Zhou, Wenbin, Xie, Liyan, Peng, Zhigang, Zhu, Shixiang
Sequential change-point detection plays a critical role in numerous real-world applications, where timely identification of distributional shifts can greatly mitigate adverse outcomes. Classical methods commonly rely on parametric density assumptions of pre- and post-change distributions, limiting their effectiveness for high-dimensional, complex data streams. This paper proposes a score-based CUSUM change-point detection method, in which the score functions of the data distribution are estimated by injecting noise and applying denoising score matching. We consider both offline and online versions of score estimation. Through theoretical analysis, we demonstrate that denoising score matching can enhance detection power by effectively controlling the injected noise scale. Finally, we validate the practical efficacy of our method through numerical experiments on two synthetic datasets and a real-world earthquake precursor detection task, demonstrating its effectiveness in challenging scenarios.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Data Science > Data Mining (0.68)
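As a rough illustration of the score-based CUSUM idea mentioned in the abstract above, the following minimal sketch runs a CUSUM recursion whose increments are differences of Hyvarinen scores. It is not the authors' implementation: the abstract does not give the exact statistic, and here the scores are assumed to be known in closed form (unit-variance 1-D Gaussians) rather than learned by denoising score matching. The names `hyvarinen_score`, `score_cusum`, `scale`, and `threshold` are hypothetical.

```python
import random


def hyvarinen_score(x, mu):
    """Hyvarinen score of a unit-variance 1-D Gaussian with mean mu.

    The score function is s(x) = -(x - mu) and its derivative is -1,
    so the Hyvarinen score is 0.5 * s(x)**2 + ds/dx.
    (Assumption: closed-form scores stand in for learned ones.)
    """
    s = -(x - mu)
    return 0.5 * s * s - 1.0


def score_cusum(stream, mu_pre, mu_post, threshold, scale=1.0):
    """Return the first index where the statistic exceeds threshold, else None."""
    z = 0.0
    for t, x in enumerate(stream):
        # Positive increment when x is better explained by the post-change model.
        increment = scale * (hyvarinen_score(x, mu_pre) - hyvarinen_score(x, mu_post))
        # CUSUM recursion: accumulate evidence, clipped at zero.
        z = max(z + increment, 0.0)
        if z > threshold:
            return t
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stream: mean 0 for 200 samples, then mean 2 for 200 samples.
    stream = [rng.gauss(0.0, 1.0) for _ in range(200)]
    stream += [rng.gauss(2.0, 1.0) for _ in range(200)]
    print("alarm index:", score_cusum(stream, 0.0, 2.0, threshold=10.0))
```

For unit-variance Gaussians the increment reduces to the usual log-likelihood ratio, so this toy case coincides with classical CUSUM; the score-based form becomes useful when densities are unnormalized or only their scores can be estimated.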
True-data Testbed for 5G/B5G Intelligent Network
Huang, Yongming, Liu, Shengheng, Zhang, Cheng, You, Xiaohu, Wu, Hequan
Future beyond fifth-generation (B5G) and sixth-generation (6G) mobile communications will shift from facilitating interpersonal communications to supporting the Internet of Everything (IoE), where intelligent communications with full integration of big data and artificial intelligence (AI) will play an important role in improving network efficiency and providing high-quality service. As a rapidly evolving paradigm, AI-empowered mobile communications demand large amounts of data acquired from real network environments for systematic testing and verification. Hence, we build the world's first true-data testbed for 5G/B5G intelligent network (TTIN), which comprises 5G/B5G on-site experimental networks, data acquisition & data warehouse, and AI engine & network optimization. In the TTIN, true network data acquisition, storage, standardization, and analysis are available, which enable system-level online verification of B5G/6G-oriented key technologies and support data-driven network optimization through a closed-loop control mechanism. This paper elaborates on the system architecture and module design of TTIN. Detailed technical specifications and some of the established use cases are also showcased.
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States (0.04)
- Europe > United Kingdom (0.04)
- (9 more...)
- Telecommunications (1.00)
- Information Technology > Networks (1.00)
- Energy (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Coming of Age: Emerging Technologies And The World's Children
Read "technology" and "children" in the same sentence, and you'll probably think about screen time or social media. But technology's implications are vastly more profound: AI, machine learning, big data and automation will fundamentally reshape the lives of our youngest generation. How might we direct the power of emerging innovations to fulfill their rights? One trailblazer addressing this question is Erica Kochi, Co-Founder of UNICEF Innovation at the United Nations Children's Fund, who was named one of TIME's most influential people in the world. Erica continues to accelerate action – unveiling a new urban tech bets opportunity just this week – and to drive crucial dialogues as Co-Chair of the World Economic Forum's Global Future Council on Human Rights.
- South America > Colombia (0.16)
- Africa > Senegal (0.05)
- Africa > Mali > Segou > Segou (0.05)
- (2 more...)
- Government > Intergovernmental Programs (0.56)
- Health & Medicine > Therapeutic Area > Immunology (0.49)
- Banking & Finance > Economy (0.36)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.31)