 Aswan


Mystery item spotted in 2,000-year-old Egyptian child mummy

Popular Science

Critical information about this unknown boy was destroyed during World War II. CT scanning and X-ray imaging allowed archaeologists to examine the mummy in extreme detail. Archaeologists in Poland are finally solving an over 2,000-year-old mummy mystery.


Language Model Tokenizers Introduce Unfairness Between Languages

Neural Information Processing Systems

Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. Despite this, there are concerns about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tokenization lengths, with differences up to 15 times in some cases. These disparities persist even for tokenizers that are intentionally trained for multilingual support.
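
The tokenization length disparity described in this abstract can be measured as a simple ratio of token counts. The sketch below is an illustration only: it uses UTF-8 bytes as a stand-in tokenizer (an assumption, not one of the tokenizers studied in the paper), and the names `byte_tokenize` and `tokenization_premium` are hypothetical.

```python
from typing import Callable, List


def byte_tokenize(text: str) -> List[int]:
    """Stand-in tokenizer (assumption): one token per UTF-8 byte."""
    return list(text.encode("utf-8"))


def tokenization_premium(
    text: str,
    reference: str,
    tokenize: Callable[[str], List[int]] = byte_tokenize,
) -> float:
    """Return the token count of `text` divided by that of `reference`."""
    reference_length = len(tokenize(reference))
    if reference_length == 0:
        raise ValueError("reference text must produce at least one token")
    return len(tokenize(text)) / reference_length


if __name__ == "__main__":
    # Toy strings, not translations: basic Latin letters take 1 byte each
    # in UTF-8, while Greek letters take 2 bytes each.
    print(tokenization_premium("αβγ", "abc"))
```

Any callable that maps a string to a list of tokens could be passed as `tokenize`, so the same ratio could in principle be computed for a real subword tokenizer on parallel sentences.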


The Longest Solar Eclipse for 100 Years Is Coming. Don't Miss It

WIRED

NASA has announced when the longest total solar eclipse of the century will occur, and you won't have to wait long. Here's what you should know. The duration of a total solar eclipse always varies. In April 2024, the eclipse that crossed North America lasted 4 minutes and 28 seconds.



A Appendix

Neural Information Processing Systems

It suggests that, for any $m \in \{k, \ldots, n-1\}$ and $z \in \mathbb{R}$, L A.2 Proofs for Lemmas 2 and 3 for the case when $K$ is unknown in §4. Lemma 2. It suggests that, for any $m \in \{0, \ldots, n-1\}$ and $z \in \mathbb{R}$, L For any $m \in \{0, \ldots, n-1\}$ and $z \in \mathbb{R}$, we have L A.3 Additional tricks for methods proposed in §3. Finding optimal CP vector when z = in paraCP(n,k, ˆ T Additional pruning condition for parametric DP when $K$ is fixed. In §3.3, we showed that Lemma 4. For $n \in [N]$ and $k \in [K]$, let T Therefore, it fails to control the false positive rate. This is an asymptotic test for multiple detected CPs. Fused Lasso (proposed by the same authors) is worse than BinSeg-SI. BinSeg-SI had been considered as a computationally efficient approximation of the problem in (7), where the authors additionally condition on extra information for computational tractability, e.g., the order in which CPs are detected.


The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR

arXiv.org Artificial Intelligence

Code-switching, the act of alternating between languages, has emerged as a prevalent global phenomenon that needs to be addressed to build user-friendly language technologies. A main bottleneck in this pursuit is data scarcity, which motivates research on code-switched data augmentation. However, the current literature lacks comprehensive studies that explain the relation between the quality of synthetic data and improvements on NLP tasks. We extend previous research conducted in this direction on machine translation (MT) with results on automatic speech recognition (ASR) and cascaded speech translation (ST) to test the generalizability of the findings. Our experiments involve a wide range of augmentation techniques, covering lexical replacements, linguistic theories, and back-translation. Based on the results of MT, ASR, and ST, we draw conclusions and insights regarding the efficacy of various augmentation techniques and the impact of quality on performance.
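
Of the augmentation techniques named in this abstract, lexical replacement is the simplest to illustrate. The sketch below is a minimal toy version under stated assumptions, not the authors' pipeline: the function name `lexical_replace`, the two-entry English-to-Spanish lexicon, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
import random
from typing import Dict, Optional


def lexical_replace(
    sentence: str,
    lexicon: Dict[str, str],
    rate: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Swap words found in `lexicon` for their translations with probability `rate`."""
    rng = rng or random.Random(0)  # fixed seed by default, for reproducibility
    output = []
    for token in sentence.split():
        translation = lexicon.get(token.lower())
        # rng.random() lies in [0, 1), so rate=1.0 always replaces
        # and rate=0.0 never does.
        if translation is not None and rng.random() < rate:
            output.append(translation)
        else:
            output.append(token)
    return " ".join(output)


if __name__ == "__main__":
    toy_lexicon = {"house": "casa", "water": "agua"}  # illustrative entries only
    print(lexical_replace("the house has water", toy_lexicon, rate=1.0))
```

A dictionary swap like this ignores grammar entirely, which is one reason the quality of the resulting synthetic data can vary.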


TeaserGen: Generating Teasers for Long Documentaries

arXiv.org Artificial Intelligence

Teasers are an effective tool for promoting content in entertainment, commercial and educational fields. However, creating an effective teaser for long videos is challenging because it requires long-range multimodal modeling of the input videos while also maintaining audiovisual alignment, managing scene changes and preserving factual accuracy in the output teasers. Progress along this research direction has been hindered by the lack of a publicly available dataset. In this work, we present DocumentaryNet, a collection of 1,269 documentaries paired with their teasers, featuring multimodal data streams of video, speech, music, sound effects and narrations. With DocumentaryNet, we propose a new two-stage system for generating teasers from long documentaries. The proposed TeaserGen system first generates the teaser narration from the transcribed narration of the documentary using a pretrained large language model, and then selects the most relevant visual content to accompany the generated narration through language-vision models. For narration-video matching, we explore two approaches: a pretraining-based model using pretrained contrastive language-vision models and a deep sequential model that learns the mapping between the narrations and visuals. Our experimental results show that the pretraining-based approach is more effective at identifying relevant visual content than directly trained deep autoregressive models.
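
The narration-video matching step can be pictured as a nearest-neighbor search in a shared embedding space. The sketch below assumes that narration sentences and video clips have already been embedded by some contrastive language-vision model, which is not shown; the function names and the greedy per-sentence selection are hypothetical simplifications, not the TeaserGen implementation.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_clips(
    narration_embeddings: Sequence[Sequence[float]],
    clip_embeddings: Sequence[Sequence[float]],
) -> List[int]:
    """For each narration sentence, return the index of the most similar clip."""
    if not clip_embeddings:
        raise ValueError("at least one clip embedding is required")
    matches = []
    for sentence in narration_embeddings:
        scores = [cosine_similarity(sentence, clip) for clip in clip_embeddings]
        matches.append(max(range(len(scores)), key=scores.__getitem__))
    return matches


if __name__ == "__main__":
    narration = [[1.0, 0.0], [0.0, 1.0]]  # toy 2-D embeddings
    clips = [[0.0, 2.0], [3.0, 0.1]]
    print(match_clips(narration, clips))
```

This greedy version may pick the same clip for several sentences and does nothing about scene changes, both of which a full system would need to handle.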


A Survey of Large Language Models for Arabic Language and its Dialects

arXiv.org Artificial Intelligence

This survey offers a comprehensive overview of Large Language Models (LLMs) designed for the Arabic language and its dialects. It covers key architectures, including encoder-only, decoder-only, and encoder-decoder models, along with the datasets used for pre-training, spanning Classical Arabic, Modern Standard Arabic, and Dialectal Arabic. The study also explores monolingual, bilingual, and multilingual LLMs, analyzing their architectures and performance across downstream tasks such as sentiment analysis, named entity recognition, and question answering. Furthermore, it assesses the openness of Arabic LLMs based on factors such as the availability of source code, training data, model weights, and documentation. The survey highlights the need for more diverse dialectal datasets and underscores the importance of openness for research reproducibility and transparency. It concludes by identifying key challenges and opportunities for future research and stressing the need for more inclusive and representative models.


FGR-Net: Interpretable fundus image gradeability classification based on deep reconstruction learning

arXiv.org Artificial Intelligence

The performance of diagnostic Computer-Aided Diagnosis (CAD) systems for retinal diseases depends on the quality of the retinal images being screened. Thus, many studies have been developed to evaluate and assess the quality of such retinal images. However, most of them did not investigate the relationship between the accuracy of the developed models and the quality of the visualizations produced by interpretability methods for distinguishing between gradable and ungradable retinal images. Consequently, this paper presents a novel framework called FGR-Net to automatically assess and interpret underlying fundus image quality by merging an autoencoder network with a classifier network. The FGR-Net model also provides an interpretable quality assessment through visualizations. In particular, FGR-Net uses a deep autoencoder to reconstruct the input image in order to extract the visual characteristics of the input fundus images based on self-supervised learning. The features extracted by the autoencoder are then fed into a deep classifier network to distinguish between gradable and ungradable fundus images. FGR-Net is evaluated with different interpretability methods, which indicate that the autoencoder is a key factor in forcing the classifier to focus on the relevant structures of the fundus images, such as the fovea, optic disk, and prominent blood vessels. Additionally, the interpretability methods can provide visual feedback that helps ophthalmologists understand how our model evaluates the quality of fundus images. The experimental results showed the superiority of FGR-Net over state-of-the-art quality assessment methods, with an accuracy of 89% and an F1-score of 87%.


Application Research On Real-Time Perception Of Device Performance Status

arXiv.org Artificial Intelligence

To accurately identify the performance status of mobile devices and fine-tune the user experience, a real-time performance perception and evaluation method was studied that combines TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) with the entropy weighting method and time series modeling. After collecting performance characteristics from a variety of mobile devices, the device performance profile was fitted using PCA (principal component analysis) dimensionality reduction and feature engineering methods such as descriptive time series analysis. The ability of the performance features and profiles to describe the real-time performance status of devices was then examined by applying the TOPSIS method and multi-level weighting. A time series model was constructed for the feature set under objective weighting, and performance status perception results at multiple sensitivities (real-time, short-term, long-term) were produced to obtain real-time performance evaluation data and long-term stable performance prediction data. Finally, the usability of the method was verified by configuring dynamic A/B experiments and overlaying fine-grained power reduction strategies, and the accuracy of device performance status identification and prediction was compared across profile features including dimensionality reduction with time series modeling, the TOPSIS method with entropy weighting, subjective weighting, and the HMA method. The results show that accurate real-time performance perception can greatly enhance business value, and that this research has practical applicability and some forward-looking significance.
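
The combination of entropy weighting and TOPSIS named in this abstract can be sketched in a few lines. This is a textbook formulation under stated assumptions, not the authors' implementation: the two device features (frame rate as a benefit criterion, latency as a cost criterion) and their values are hypothetical, and the PCA and time series stages are omitted.

```python
import math
from typing import List, Sequence


def entropy_weights(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Objective weights: criteria whose values vary more get larger weights.

    Assumes at least two rows and strictly positive column sums.
    """
    rows, cols = len(matrix), len(matrix[0])
    divergences = []
    for j in range(cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0.0:
                entropy -= p * math.log(p)
        entropy /= math.log(rows)  # scale entropy into [0, 1]
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0.0:
        return [1.0 / cols] * cols  # all criteria uniform: fall back to equal weights
    return [d / total_divergence for d in divergences]


def topsis(
    matrix: Sequence[Sequence[float]],
    weights: Sequence[float],
    benefit: Sequence[bool],
) -> List[float]:
    """Closeness of each row to the ideal solution (1.0 = ideal, 0.0 = anti-ideal)."""
    cols = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(cols)] for row in matrix]
    best = [(max if benefit[j] else min)(r[j] for r in weighted) for j in range(cols)]
    worst = [(min if benefit[j] else max)(r[j] for r in weighted) for j in range(cols)]
    scores = []
    for row in weighted:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        denominator = d_best + d_worst
        scores.append(d_worst / denominator if denominator > 0.0 else 0.5)
    return scores


if __name__ == "__main__":
    # Hypothetical devices: [frame rate (higher is better), latency (lower is better)]
    devices = [[60.0, 20.0], [30.0, 40.0], [45.0, 30.0]]
    weights = entropy_weights(devices)
    print(topsis(devices, weights, benefit=[True, False]))
```

In this toy data the first device is best on both criteria and the second is worst on both, so they sit at the two ends of the closeness scale, with the third in between.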