South America
Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
Saif, A F M, Chen, Lisha, Cui, Xiaodong, Lu, Songtao, Kingsbury, Brian, Chen, Tianyi
Training a single model for multilingual, multi-task speech processing (MSP) is severely hampered by conflicting objectives between tasks like speech recognition and translation. While multi-objective optimization (MOO) aims to align gradient updates, its effectiveness diminishes as the number of tasks grows, making it difficult to find a common descent direction. This raises a fundamental question: should highly conflicting objectives be optimized jointly or separated into a hierarchical structure? To address this question, this paper investigates three multi-objective MSP formulations, which we refer to as objective soup recipes. These formulations apply multi-objective optimization at different optimization levels to mitigate potential conflicts among all objectives. To ensure efficiency, we introduce a lightweight layer-selection mechanism that computes the conflict-avoiding gradient using only the most problematic layers, minimizing computational and memory overhead. Extensive experiments on CoVoST v2, LibriSpeech, and AISHELL-1 reveal that a bi-level recipe separating recognition and translation tasks consistently outperforms standard flat optimization. Our work demonstrates that hierarchical MOO is a more effective and scalable approach for building state-of-the-art MSP models. Our code has been released at https://github.com/afmsaif/Objective_Soups.
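The abstract does not say how the conflict-avoiding gradient or the layer selection is computed. As a rough illustration only, the sketch below ranks layers by the cosine similarity between two task gradients and resolves a conflict with a PCGrad-style projection, which is a swapped-in technique and not necessarily the authors' method. The function names, layer names, and gradient values are all hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two flat gradient vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)

def most_conflicting_layers(grads_a, grads_b, k):
    """Return the k layer names whose two task gradients are least aligned.

    grads_a, grads_b: dict mapping layer name -> flat gradient (list of floats).
    """
    scores = {name: cosine(grads_a[name], grads_b[name]) for name in grads_a}
    return sorted(scores, key=scores.get)[:k]

def project_conflict(g_i, g_j):
    """PCGrad-style step (assumption): if g_i and g_j conflict (negative
    inner product), remove from g_i its component along g_j."""
    d = dot(g_i, g_j)
    if d >= 0:
        return list(g_i)  # no conflict, leave the gradient unchanged
    scale = d / dot(g_j, g_j)
    return [x - scale * y for x, y in zip(g_i, g_j)]

# Hypothetical per-layer gradients for a recognition task and a translation task.
asr = {"encoder": [1.0, 0.0], "decoder": [1.0, 1.0]}
st = {"encoder": [-1.0, 1.0], "decoder": [1.0, 1.0]}

selected = most_conflicting_layers(asr, st, k=1)
fixed = project_conflict(asr["encoder"], st["encoder"])
```

Here only the encoder gradients conflict, so only that layer would need the projection; after projection the recognition gradient is orthogonal to the translation gradient.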
A deep learning and machine learning approach to predict neonatal death in the context of São Paulo
Raihan, Mohon, Saha, Plabon Kumar, Gupta, Rajan Das, Kabir, A Z M Tahmidul, Tamanna, Afia Anjum, Harun-Ur-Rashid, Md., Salam, Adnan Bin Abdus, Anjum, Md Tanvir, Kabir, A Z M Ahteshamul
Neonatal death is still a concerning reality for underdeveloped and even some developed countries. Worldwide data indicate that 26.693 babies out of 1,000 births die, according to Macro Trades. To reduce this number, early prediction of endangered babies is crucial. Such prediction makes it possible to take ample care of the child and mother so that early child death can be avoided. In this context, machine learning was used to determine whether a newborn baby is at risk. To train the predictive model, historical data of 1.4 million newborns was used. Machine learning and deep learning techniques such as logistic regression, K-nearest neighbor, random forest classifier, extreme gradient boosting (XGBoost), convolutional neural network, and long short-term memory (LSTM) were implemented using the dataset to identify the most accurate model for predicting neonatal mortality. Among the machine learning algorithms, XGBoost and random forest classifier achieved the best accuracy at 94%, while among the deep learning models, LSTM delivered the highest accuracy at 99%. Therefore, using LSTM appears to be the most suitable approach to predict whether precautionary measures for a child are necessary.
Claire's on brink of collapse putting 2,150 jobs at risk
Tom Espiner, Business reporter, BBC News
Fashion accessories chain Claire's is on the brink of collapse after the retailer said it will appoint administrators in the UK and Ireland, putting 2,150 jobs at risk. The company has 278 stores in the UK and 28 in Ireland but has been struggling with falling sales and fierce competition. All the shops will continue trading while administrators at Interpath, once appointed, will "assess options for the company". Interpath chief executive Will Wright said options include "exploring the possibility of a sale which would secure a future for this well-loved brand". Claire's filed for bankruptcy in the US earlier this month.
Graph Reordering for Cache-Efficient Near Neighbor Search
Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a greedy walk along a pruned near neighbor graph. However, graph traversal applications often suffer from poor memory access patterns, and near neighbor search is no exception to this rule. Our measurements show that popular search indices such as the hierarchical navigable small-world graph (HNSW) can have poor cache miss performance. To address this issue, we formulate the graph traversal problem as a cache hit maximization task and propose several graph reordering methods as a solution.
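To make the idea of reordering concrete, here is a minimal sketch that relabels nodes in breadth-first order so that neighbors tend to receive nearby ids, and therefore nearby memory locations when node data is stored by id. BFS ordering is just one simple heuristic chosen for illustration; the abstract does not name the reordering methods studied. The helper names and the mean id gap, used here as a crude locality proxy, are assumptions.

```python
from collections import deque

def bfs_order(adj, start=0):
    """Return node ids in breadth-first visiting order (covers all components)."""
    n = len(adj)
    seen = [False] * n
    order = []
    for root in [start] + [v for v in range(n) if v != start]:
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return order

def relabel(adj, order):
    """Rebuild the adjacency list so that order[i] becomes node i."""
    new_id = {old: new for new, old in enumerate(order)}
    new_adj = [[] for _ in adj]
    for old, nbrs in enumerate(adj):
        new_adj[new_id[old]] = [new_id[v] for v in nbrs]
    return new_adj

def mean_id_gap(adj):
    """Average |u - v| over all directed edges; smaller means neighbors sit closer."""
    gaps = [abs(u - v) for u, nbrs in enumerate(adj) for v in nbrs]
    return sum(gaps) / len(gaps) if gaps else 0.0

# A path graph 0-3-1-4-2 whose labels are scattered.
adj = [[3], [3, 4], [4], [0, 1], [1, 2]]
order = bfs_order(adj)
reordered = relabel(adj, order)
gap_before = mean_id_gap(adj)
gap_after = mean_id_gap(reordered)
```

On this toy path the mean id gap drops from 2.5 to 1.0 after relabeling. A real evaluation would measure cache misses during search rather than this proxy.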
Integrating attention into explanation frameworks for language and vision transformers
Eggen, Marte, Lysnæs-Larsen, Jacob, Strümke, Inga
The attention mechanism lies at the core of the transformer architecture, providing an interpretable model-internal signal that has motivated a growing interest in attention-based model explanations. Although attention weights do not directly determine model outputs, they reflect patterns of token influence that can inform and complement established explainability techniques. This work studies the potential of utilising the information encoded in attention weights to provide meaningful model explanations by integrating them into explainable AI (XAI) frameworks that target fundamentally different aspects of model behaviour. To this end, we develop two novel explanation methods applicable to both natural language processing and computer vision tasks. The first integrates attention weights into the Shapley value decomposition by redefining the characteristic function in terms of pairwise token interactions via attention weights, thus adapting this widely used game-theoretic solution concept to provide attention-driven attributions for local explanations. The second incorporates attention weights into token-level directional derivatives defined through concept activation vectors to measure concept sensitivity for global explanations. Our empirical evaluations on standard benchmarks and in a comparison study with widely used explanation methods show that attention weights can be meaningfully incorporated into the studied XAI frameworks, highlighting their value in enriching transformer explainability.
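As a hedged illustration of the first method, the sketch below computes exact Shapley values for a small token set under an assumed characteristic function: the value of a coalition is the sum of attention weights between every ordered pair of tokens in it, self-attention included. The abstract does not give the actual definition, so this form, the function names, and the attention matrix are assumptions. Exact enumeration over permutations is feasible only for a handful of tokens.

```python
from itertools import permutations

def coalition_value(attn, coalition):
    """Assumed characteristic function: total attention among coalition members."""
    return sum(attn[i][j] for i in coalition for j in coalition)

def shapley_values(attn):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    n = len(attn)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for ordering in orderings:
        members = []
        previous = 0.0
        for token in ordering:
            members.append(token)
            current = coalition_value(attn, members)
            phi[token] += current - previous  # marginal contribution of token
            previous = current
    return [p / len(orderings) for p in phi]

# Hypothetical 3-token attention matrix (each row sums to 1).
attn = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]
phi = shapley_values(attn)
```

For this pairwise game each token's attribution equals its self-attention plus half of the attention it exchanges with every other token, and the attributions sum to the value of the full token set.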