Overview
Fanar: An Arabic-Centric Multimodal Generative AI Platform
Fanar Team, null, Abbas, Ummar, Ahmad, Mohammad Shahmeer, Alam, Firoj, Altinisik, Enes, Asgari, Ehsannedin, Boshmaf, Yazan, Boughorbel, Sabri, Chawla, Sanjay, Chowdhury, Shammur, Dalvi, Fahim, Darwish, Kareem, Durrani, Nadir, Elfeky, Mohamed, Elmagarmid, Ahmed, Eltabakh, Mohamed, Fatehkia, Masoomali, Fragkopoulos, Anastasios, Hasanain, Maram, Hawasly, Majd, Husaini, Mus'ab, Jung, Soon-Gyo, Lucas, Ji Kim, Magdy, Walid, Messaoud, Safa, Mohamed, Abubakr, Mohiuddin, Tasnim, Mousi, Basel, Mubarak, Hamdy, Musleh, Ahmad, Naeem, Zan, Ouzzani, Mourad, Popovic, Dorde, Sadeghi, Amin, Sencar, Husrev Taha, Shinoy, Mohammed, Sinan, Omar, Zhang, Yifan, Ali, Ahmed, Kheir, Yassine El, Ma, Xiaosong, Ruan, Chaoyi
We present Fanar, a platform for Arabic-centric multimodal generative AI systems, that supports language, speech and image generation tasks. At the heart of Fanar are Fanar Star and Fanar Prime, two highly capable Arabic Large Language Models (LLMs) that are best in the class on well established benchmarks for similar sized models. Fanar Star is a 7B (billion) parameter model that was trained from scratch on nearly 1 trillion clean and deduplicated Arabic, English and Code tokens. Fanar Prime is a 9B parameter model continually trained on the Gemma-2 9B base model on the same 1 trillion token set. Both models are concurrently deployed and designed to address different types of prompts transparently routed through a custom-built orchestrator. The Fanar platform provides many other capabilities including a customized Islamic Retrieval Augmented Generation (RAG) system for handling religious prompts, a Recency RAG for summarizing information about current or recent events that have occurred after the pre-training data cut-off date. The platform provides additional cognitive capabilities including in-house bilingual speech recognition that supports multiple Arabic dialects, voice and image generation that is fine-tuned to better reflect regional characteristics. Finally, Fanar provides an attribution service that can be used to verify the authenticity of fact based generated content. The design, development, and implementation of Fanar was entirely undertaken at Hamad Bin Khalifa University's Qatar Computing Research Institute (QCRI) and was sponsored by Qatar's Ministry of Communications and Information Technology to enable sovereign AI technology development.
In the Picture: Medical Imaging Datasets, Artifacts, and their Living Review
Jiménez-Sánchez, Amelia, Avlona, Natalia-Rozalia, de Boer, Sarah, Campello, Víctor M., Feragen, Aasa, Ferrante, Enzo, Ganz, Melanie, Gichoya, Judy Wawira, González, Camila, Groefsema, Steff, Hering, Alessa, Hulman, Adam, Joskowicz, Leo, Juodelyte, Dovile, Kandemir, Melih, Kooi, Thijs, Lérida, Jorge del Pozo, Li, Livie Yumeng, Pacheco, Andre, Rädsch, Tim, Reyes, Mauricio, Sourget, Théo, van Ginneken, Bram, Wen, David, Weng, Nina, Xu, Jack Junchi, Zając, Hubert Dariusz, Zuluaga, Maria A., Cheplygina, Veronika
Datasets play a critical role in medical imaging research, yet issues such as label quality, shortcuts, and metadata are often overlooked. This lack of attention may harm the generalizability of algorithms and, consequently, negatively impact patient outcomes. While existing medical imaging literature reviews mostly focus on machine learning (ML) methods, with only a few focusing on datasets for specific applications, these reviews remain static -- they are published once and not updated thereafter. This fails to account for emerging evidence, such as biases, shortcuts, and additional annotations that other researchers may contribute after the dataset is published. We refer to these newly discovered findings of datasets as research artifacts. To address this gap, we propose a living review that continuously tracks public datasets and their associated research artifacts across multiple medical imaging applications. Our approach includes a framework for the living review to monitor data documentation artifacts, and an SQL database to visualize the citation relationships between research artifact and dataset. Lastly, we discuss key considerations for creating medical imaging datasets, review best practices for data annotation, discuss the significance of shortcuts and demographic diversity, and emphasize the importance of managing datasets throughout their entire lifecycle. Our demo is publicly available at http://130.226.140.142.
Mixture of Experts (MoE): A Big Data Perspective
Gan, Wensheng, Ning, Zhenyao, Qi, Zhenlian, Yu, Philip S.
As the era of big data arrives, traditional artificial intelligence algorithms have difficulty processing the demands of massive and diverse data. Mixture of experts (MoE) has shown excellent performance and broad application prospects. This paper provides an in-depth review and analysis of the latest progress in this field from multiple perspectives, including the basic principles, algorithmic models, key technical challenges, and application practices of MoE. First, we introduce the basic concept of MoE and its core idea and elaborate on its advantages over traditional single models. Then, we discuss the basic architecture of MoE and its main components, including the gating network, expert networks, and learning algorithms. Next, we review the applications of MoE in addressing key technical issues in big data. For each challenge, we provide specific MoE solutions and their innovations. Furthermore, we summarize the typical use cases of MoE in various application domains. This fully demonstrates the powerful capability of MoE in big data processing. We also analyze the advantages of MoE in big data environments. Finally, we explore the future development trends of MoE. We believe that MoE will become an important paradigm of artificial intelligence in the era of big data. In summary, this paper systematically elaborates on the principles, techniques, and applications of MoE in big data processing, providing theoretical and practical references to further promote the application of MoE in real scenarios.
Harnessing the Potential of Large Language Models in Modern Marketing Management: Applications, Future Directions, and Strategic Recommendations
Aghaei, Raha, Kiaei, Ali A., Boush, Mahnaz, Vahidi, Javad, Zavvar, Mohammad, Barzegar, Zeynab, Rofoosheh, Mahan
Large Language Models (LLMs) have revolutionized the process of customer engagement, campaign optimization, and content generation, in marketing management. In this paper, we explore the transformative potential of LLMs along with the current applications, future directions, and strategic recommendations for marketers. In particular, we focus on LLMs major business drivers such as personalization, real-time-interactive customer insights, and content automation, and how they enable customers and business outcomes. For instance, the ethical aspects of AI with respect to data privacy, transparency, and mitigation of bias are also covered, with the goal of promoting responsible use of the technology through best practices and the use of new technologies businesses can tap into the LLM potential, which help growth and stay one step ahead in the turmoil of digital marketing. This article is designed to give marketers the necessary guidance by using best industry practices to integrate these powerful LLMs into their marketing strategy and innovation without compromising on the ethos of their brand.
Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond
Chen, Weiyu, Zhang, Xiaoyuan, Lin, Baijiong, Lin, Xi, Zhao, Han, Zhang, Qingfu, Kwok, James T.
Multi-objective optimization (MOO) in deep learning aims to simultaneously optimize multiple conflicting objectives, a challenge frequently encountered in areas like multi-task learning and multi-criteria learning. Recent advancements in gradient-based MOO methods have enabled the discovery of diverse types of solutions, ranging from a single balanced solution to finite or even infinite Pareto sets, tailored to user needs. These developments have broad applications across domains such as reinforcement learning, computer vision, recommendation systems, and large language models. This survey provides the first comprehensive review of gradient-based MOO in deep learning, covering algorithms, theories, and practical applications. By unifying various approaches and identifying critical challenges, it serves as a foundational resource for driving innovation in this evolving field. A comprehensive list of MOO algorithms in deep learning is available at \url{https://github.com/Baijiong-Lin/Awesome-Multi-Objective-Deep-Learning}.
CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews
Systematic literature reviews (SLRs) play an essential role in summarising, synthesising and validating scientific evidence. In recent years, there has been a growing interest in using machine learning techniques to automate the identification of relevant studies for SLRs. However, the lack of standardised evaluation datasets makes comparing the performance of such automated literature screening systems difficult. In this paper, we analyse the citation screening evaluation datasets, revealing that many of the available datasets are either too small, suffer from data leakage or have limited applicability to systems treating automated literature screening as a classification task, as opposed to, for example, a retrieval or question-answering task. To address these challenges, we introduce CSMED, a meta-dataset consolidating nine publicly released collections, providing unified access to 325 SLRs from the fields of medicine and computer science.
A Comprehensive Insights into Drones: History, Classification, Architecture, Navigation, Applications, Challenges, and Future Trends
Singh, Ruchita, Kumar, Sandeep
Unmanned Aerial Vehicles (UAVs), commonly known as Drones, are one of 21st century most transformative technologies. Emerging first for military use, advancements in materials, electronics, and software have catapulted drones into multipurpose tools for a wide range of industries. In this paper, we have covered the history, taxonomy, architecture, navigation systems and branched activities for the same. It explores important future trends like autonomous navigation, AI integration, and obstacle avoidance systems, emphasizing how they contribute to improving the efficiency and versatility of drones. It also looks at the major challenges like technical, environmental, economic, regulatory and ethical, that limit the actual take-up of drones, as well as trends that are likely to mitigate these obstacles in the future. This work offers a structured synthesis of existing studies and perspectives that enable insights about how drones will transform agriculture, logistics, healthcare, disaster management, and other areas, while also identifying new opportunities for innovation and development.
New Fashion Products Performance Forecasting: A Survey on Evolutions, Models and Emerging Trends
Avogaro, Andrea, Capogrosso, Luigi, Toaiari, Andrea, Fummi, Franco, Cristani, Marco
The fast fashion industry's insatiable demand for new styles and rapid production cycles has led to a significant environmental burden. Overproduction, excessive waste, and harmful chemicals have contributed to the negative environmental impact of the industry. To mitigate these issues, a paradigm shift that prioritizes sustainability and efficiency is urgently needed. Integrating learning-based predictive analytics into the fashion industry represents a significant opportunity to address environmental challenges and drive sustainable practices. By forecasting fashion trends and optimizing production, brands can reduce their ecological footprint while remaining competitive in a rapidly changing market. However, one of the key challenges in forecasting fashion sales is the dynamic nature of consumer preferences. Fashion is acyclical, with trends constantly evolving and resurfacing. In addition, cultural changes and unexpected events can disrupt established patterns. This problem is also known as New Fashion Products Performance Forecasting (NFPPF), and it has recently gained more and more interest in the global research landscape. Given its multidisciplinary nature, the field of NFPPF has been approached from many different angles. This comprehensive survey wishes to provide an up-to-date overview that focuses on learning-based NFPPF strategies. The survey is based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodological flow, allowing for a systematic and complete literature review. In particular, we propose the first taxonomy that covers the learning panorama for NFPPF, examining in detail the different methodologies used to increase the amount of multimodal information, as well as the state-of-the-art available datasets. Finally, we discuss the challenges and future directions.
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results
Kiefer, Benjamin, Žust, Lojze, Muhovič, Jon, Kristan, Matej, Perš, Janez, Teršek, Matija, Desai, Uma Mudenagudi Chaitra, Wiliem, Arnold, Kreis, Marten, Akalwadi, Nikhil, Quan, Yitong, Zhong, Zhiqiang, Zhang, Zhe, Liu, Sujie, Chen, Xuran, Yang, Yang, Fabijanić, Matej, Ferreira, Fausto, Lee, Seongju, Lee, Junseok, Lee, Kyoobin, Yao, Shanliang, Guan, Runwei, Huang, Xiaoyu, Ni, Yi, Kumar, Himanshu, Feng, Yuan, Cheng, Yi-Ching, Lin, Tzu-Yu, Lee, Chia-Ming, Hsu, Chih-Chung, Sheikh, Jannik, Michel, Andreas, Gross, Wolfgang, Weinmann, Martin, Šarić, Josip, Lin, Yipeng, Yang, Xiang, Jiang, Nan, Lu, Yutang, Feng, Fei, Awad, Ali, Lucas, Evan, Saleem, Ashraf, Cheng, Ching-Heng, Lin, Yu-Fan, Lin, Tzu-Yu, Hsu, Chih-Chung
The 3rd Workshop on Maritime Computer Vision (MaCVi) 2025 addresses maritime computer vision for Unmanned Surface Vehicles (USV) and underwater. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 700 submissions. All datasets, evaluation code, and the leaderboard are available to the public at https://macvi.org/workshop/macvi25.
A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks
LLM test-time compute (or LLM inference) via search has emerged as a promising research area with rapid developments. However, current frameworks often adopt distinct perspectives on three key aspects (task definition, LLM profiling, and search procedures), making direct comparisons challenging. Moreover, the search algorithms employed often diverge from standard implementations, and their specific characteristics are not thoroughly specified. In this survey, we provide a comprehensive technical review that unifies task definitions and provides modular definitions of LLM profiling and search procedures. The definitions enable precise comparisons of various LLM inference frameworks while highlighting their departures from conventional search algorithms. We also discuss the applicability, performance, and efficiency of these methods. For further details and ongoing updates, please refer to our GitHub repository: https://github.com/xinzhel/LLM-Agent-Survey/blob/main/search.md