helsinki
Translation Entropy: A Statistical Framework for Evaluating Translation Systems
Gross, Ronit D., Harel, Yanir, Kanter, Ido
The translation of written language has been known since the 3rd century BC; however, the need for it has grown steadily in the information age. Today, many translators exist, based on encoder-decoder deep architectures; nevertheless, no quantitative objective methods are available to assess their performance, likely because the entropy of even a single language remains unknown. This study presents a quantitative method for estimating translation entropy, with the following key finding. Given a translator, several sentences that differ by only one selected token of a given pivot sentence yield identical translations. Analyzing the statistics of this phenomenon across an ensemble of such sentences, each built around a selected pivot token, yields the probabilities of replacing this specific token with others while preserving the translation. These probabilities determine the entropy of the selected token, and the average across all selected pivot tokens provides an estimate of the translator's overall translation entropy, which is enhanced along the decoder blocks. This entropic measure allows for the quantitative ranking of several publicly available translators and reveals whether mutual translation entropy is symmetric. Extending the proposed method to include the replacement of two tokens in a given pivot sentence demonstrates a multiplicative effect, where translation degeneracy is proportional to the product of the degeneracies of the two tokens. These findings establish translation entropy as a measurable property and an objective benchmark for artificial translators. Results are based on the MarianMT, T5-Base and NLLB-200 translators.
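The abstract does not give the exact formula, so the following is only a minimal sketch of the idea, assuming the per-token entropy is the Shannon entropy of the replacement probabilities and the overall estimate is a plain mean over pivot tokens. The function names and the input layout (a count of translation-preserving occurrences per replacement token) are hypothetical.

```python
import math

def token_entropy(replacement_counts):
    """Shannon entropy (in bits) of the replacement distribution for one pivot token.

    replacement_counts maps each replacement token to how often substituting it
    left the translation unchanged (assumed input format, not from the paper).
    """
    total = sum(replacement_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in replacement_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

def translation_entropy(per_token_counts):
    """Average the per-token entropies over all selected pivot tokens."""
    entropies = [token_entropy(counts) for counts in per_token_counts]
    return sum(entropies) / len(entropies) if entropies else 0.0

# Toy example: one pivot token with 2 equally likely replacements,
# another with 4 equally likely replacements.
example = [
    {"big": 1, "large": 1},
    {"red": 1, "blue": 1, "green": 1, "black": 1},
]
print(translation_entropy(example))
```

Under these assumptions, a token with more interchangeable replacements contributes a higher entropy, which matches the abstract's notion of translation degeneracy.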
- Europe > Finland > Uusimaa > Helsinki (0.06)
- Asia > Middle East > Israel (0.04)
- North America > United States > Pennsylvania (0.04)
- (3 more...)
- Asia > Afghanistan (0.14)
- Europe > Finland > Uusimaa > Helsinki (0.05)
- Asia > Middle East > Israel (0.04)
- (20 more...)
- Leisure & Entertainment > Sports > Football (1.00)
- Law Enforcement & Public Safety > Terrorism (0.67)
- Government > Regional Government > North America Government > United States Government (0.67)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.95)
- Information Technology > Communications (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.46)
Do Street View Imagery and Public Participation GIS align: Comparative Analysis of Urban Attractiveness
Malekzadeh, Milad, Willberg, Elias, Torkko, Jussi, Korpilo, Silviya, Hasanzadeh, Kamyar, Järv, Olle, Toivonen, Tuuli
As digital tools increasingly shape spatial planning practices, understanding how different data sources reflect human experiences of urban environments is essential. Street View Imagery (SVI) and Public Participation GIS (PPGIS) represent two prominent approaches for capturing place-based perceptions that can support urban planning decisions, yet their comparability remains underexplored. This study investigates the alignment between SVI-based perceived attractiveness and residents' reported experiences gathered via a city-wide PPGIS survey in Helsinki, Finland. Using participant-rated SVI data and semantic image segmentation, we trained a machine learning model to predict perceived attractiveness based on visual features. We compared these predictions to PPGIS-identified locations marked as attractive or unattractive, calculating agreement using two sets of criteria, strict and moderate. Our findings reveal only partial alignment between the two datasets. While agreement (with a moderate threshold) reached 67% for attractive and 77% for unattractive places, agreement (with a strict threshold) dropped to 27% and 29%, respectively. By analysing a range of contextual variables, including noise, traffic, population presence, and land use, we found that non-visual cues significantly contributed to mismatches. The model failed to account for experiential dimensions such as activity levels and environmental stressors that shape perceptions but are not visible in images. These results suggest that while SVI offers a scalable and visual proxy for urban perception, it cannot fully substitute for the experiential richness captured through PPGIS. We argue that both methods are valuable but serve different purposes; therefore, a more integrated approach is needed to holistically capture how people perceive urban environments.
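The abstract does not define the strict and moderate criteria. As a rough illustration only, the sketch below assumes each PPGIS-marked location has a model-predicted attractiveness score on a hypothetical 0 to 1 scale, and that a criterion is simply a score cutoff; the cutoffs and scores are invented for the example and are not the study's values.

```python
def agreement_rate(predicted_scores, cutoff, attractive=True):
    """Share of PPGIS-marked locations where the model prediction agrees.

    For places marked attractive, agreement means score >= cutoff;
    for places marked unattractive, agreement means score <= cutoff.
    (Assumed definition, not taken from the paper.)
    """
    if not predicted_scores:
        raise ValueError("predicted_scores must not be empty")
    if attractive:
        hits = sum(1 for s in predicted_scores if s >= cutoff)
    else:
        hits = sum(1 for s in predicted_scores if s <= cutoff)
    return hits / len(predicted_scores)

# Hypothetical model scores at locations residents marked as attractive.
scores_at_attractive = [0.9, 0.7, 0.55, 0.4]

moderate = agreement_rate(scores_at_attractive, cutoff=0.5)
strict = agreement_rate(scores_at_attractive, cutoff=0.8)
print(moderate, strict)
```

Tightening the cutoff can only lower the agreement rate in this setup, which is consistent with the drop the abstract reports between the moderate and strict thresholds.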
- Europe > Finland > Uusimaa > Helsinki (0.26)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (3 more...)
- Health & Medicine (0.68)
- Transportation > Infrastructure & Services (0.46)
- Transportation > Ground > Rail (0.46)
Neural Machine Translation for Coptic-French: Strategies for Low-Resource Ancient Languages
Chaoui, Nasma, Khoury, Richard
This paper presents the first systematic study of strategies for translating Coptic into French. Our comprehensive pipeline systematically evaluates: pivot versus direct translation, the impact of pre-training, the benefits of multi-version fine-tuning, and model robustness to noise. Utilizing aligned biblical corpora, we demonstrate that fine-tuning with a stylistically-varied and noise-aware training corpus significantly enhances translation quality. Our findings provide crucial practical insights for developing translation tools for historical languages in general.
- Europe > Finland > Uusimaa > Helsinki (0.08)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- (5 more...)
Federated Learning for Epileptic Seizure Prediction Across Heterogeneous EEG Datasets
Baykara, Cem Ata, Pandey, Saurav Raj, Ünal, Ali Burak, Lee, Harlin, Akgün, Mete
Developing accurate and generalizable epileptic seizure prediction models from electroencephalography (EEG) data across multiple clinical sites is hindered by patient privacy regulations and significant data heterogeneity (non-IID characteristics). Federated Learning (FL) offers a privacy-preserving framework for collaborative training, but standard aggregation methods like Federated Averaging (FedAvg) can be biased by dominant datasets in heterogeneous settings. This paper investigates FL for seizure prediction using a single EEG channel across four diverse public datasets (Siena, CHB-MIT, Helsinki, NCH), representing distinct patient populations (adult, pediatric, neonate) and recording conditions. We implement privacy-preserving global normalization and propose a Random Subset Aggregation strategy, where each client trains on a fixed-size random subset of its data per round, ensuring equal contribution during aggregation. Our results show that locally trained models fail to generalize across sites, and standard weighted FedAvg yields highly skewed performance (e.g., 89.0% accuracy on CHB-MIT but only 50.8% on Helsinki and 50.6% on NCH). In contrast, Random Subset Aggregation significantly improves performance on under-represented clients (accuracy increases to 81.7% on Helsinki and 68.7% on NCH) and achieves a superior macro-average accuracy of 77.1% and pooled accuracy of 80.0% across all sites, demonstrating a more robust and fair global model. This work highlights the potential of balanced FL approaches for building effective and generalizable seizure prediction systems in realistic, heterogeneous multi-hospital environments while respecting data privacy. Epilepsy is a prevalent neurological disorder characterized by recurrent, unpredictable seizures, significantly impacting the quality of life and safety of millions worldwide [1].
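To make the contrast between size-weighted FedAvg and Random Subset Aggregation concrete, here is a minimal sketch using plain Python lists as stand-ins for model parameters. It assumes aggregation weights are proportional to the number of samples each client trained on in the round; the helper names, client sizes, and parameter values are hypothetical, and local training itself is omitted.

```python
import random

def sample_subset(n_samples, subset_size, rng):
    """Pick a fixed-size random subset of sample indices for one round.

    If a client holds fewer than subset_size samples, all of them are used,
    so its contribution would then be smaller than the others.
    """
    k = min(subset_size, n_samples)
    return rng.sample(range(n_samples), k)

def fedavg(client_params, client_sizes):
    """Weighted average of client parameter vectors (weights = sample counts)."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

rng = random.Random(0)
dataset_sizes = [900, 100]        # hypothetical: one dominant client, one small client
client_params = [[0.0], [1.0]]    # hypothetical locally trained parameters

# Standard FedAvg: weights follow the full dataset sizes, so the big client dominates.
weighted = fedavg(client_params, dataset_sizes)

# Random Subset Aggregation (as sketched here): every client trains on the same
# number of samples, so the weights become equal.
subsets = [sample_subset(n, 50, rng) for n in dataset_sizes]
balanced = fedavg(client_params, [len(s) for s in subsets])
print(weighted, balanced)
```

With equal subset sizes the weighted average reduces to an unweighted mean, which is one way to read the abstract's "equal contribution during aggregation".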
- Europe > Finland > Uusimaa > Helsinki (0.67)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
- North America > United States > Massachusetts (0.04)
SETS: Leveraging Self-Verification and Self-Correction for Improved Test-Time Scaling
Chen, Jiefeng, Ren, Jie, Chen, Xinyun, Yang, Chengrun, Sun, Ruoxi, Arık, Sercan Ö
Recent advancements in Large Language Models (LLMs) have created new opportunities to enhance performance on complex reasoning tasks by leveraging test-time computation. However, conventional approaches, such as repeated sampling with majority voting or reward model scoring, often face diminishing returns as test-time compute scales, in addition to requiring costly task-specific reward model training. In this paper, we present Self-Enhanced Test-Time Scaling (SETS), a novel method that leverages the self-verification and self-correction capabilities of recent advanced LLMs to overcome these limitations. SETS integrates sampling, self-verification, and self-correction into a unified framework, enabling efficient and scalable test-time computation for improved capabilities on complex tasks. Through extensive experiments on challenging planning and reasoning benchmarks, we demonstrate that SETS achieves significant performance improvements and more favorable test-time scaling laws than the alternatives.
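One way to picture how sampling, self-verification, and self-correction could fit into a single loop is sketched below. This is not the authors' implementation: the three callables stand in for LLM calls and are hypothetical, and picking the final answer by majority vote over the refined samples is an assumption made for the example.

```python
from collections import Counter

def sets_sketch(sample, verify, correct, num_samples, max_rounds):
    """Sample answers, refine each with verify/correct rounds, then vote.

    sample()        -> a candidate answer            (stand-in for an LLM call)
    verify(answer)  -> True if judged correct        (stand-in for self-verification)
    correct(answer) -> a revised answer              (stand-in for self-correction)
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    finals = []
    for _ in range(num_samples):
        answer = sample()
        for _ in range(max_rounds):
            if verify(answer):
                break  # stop refining once the answer passes verification
            answer = correct(answer)
        finals.append(answer)
    # Assumed final step: return the most common refined answer.
    return Counter(finals).most_common(1)[0][0]

# Deterministic toy stubs: the "right" answer is 4.
drafts = iter([3, 4, 5])
result = sets_sketch(
    sample=lambda: next(drafts),
    verify=lambda a: a == 4,
    correct=lambda a: a + 1 if a < 4 else a - 1,
    num_samples=3,
    max_rounds=2,
)
print(result)
```

In this toy run each draft is pulled toward the verified answer before the vote, so test-time compute is spent on refinement rather than only on drawing more samples.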
- Europe > Finland > Uusimaa > Helsinki (0.07)
- Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
- Europe > Hungary > Budapest > Budapest (0.04)
- (2 more...)
Urban Visual Appeal According to ChatGPT: Contrasting AI and Human Insights
Malekzadeh, Milad, Willberg, Elias, Torkko, Jussi, Toivonen, Tuuli
The visual appeal of urban environments significantly impacts residents' satisfaction with their living spaces and their overall mood, which, in turn, affects their health and well-being. Given the resource-intensive nature of gathering evaluations on urban visual appeal through surveys or inquiries from residents, there is a constant quest for automated solutions to streamline this process and support spatial planning. In this study, we applied an off-the-shelf AI model to automate the analysis of urban visual appeal, using over 1,800 Google Street View images of Helsinki, Finland. We assessed these images using the GPT-4 model with specified criteria. Simultaneously, 24 participants were asked to rate the images. Our results demonstrated a strong alignment between GPT-4 and participant ratings, although geographic disparities were noted. Specifically, GPT-4 showed a preference for suburban areas with significant greenery, contrasting with participants who found these areas less appealing. Conversely, in the city centre and densely populated urban regions of Helsinki, GPT-4 assigned lower visual appeal scores than participant ratings. While there was general agreement between AI and human assessments across various locations, GPT-4 struggled to incorporate contextual nuances into its ratings, unlike participants, who considered both context and features of the urban environment. The study suggests that leveraging AI models like GPT-4 allows spatial planners to gather insights into the visual appeal of different areas efficiently, aiding decisions that enhance residents' and travellers' satisfaction and mental health. Although AI models provide valuable insights, human perspectives are essential for a comprehensive understanding of urban visual appeal. This will ensure that planning and design decisions promote healthy living environments effectively.
- Europe > Finland > Uusimaa > Helsinki (0.47)
- North America > United States > New York (0.04)
- North America > United States > Minnesota (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Consumer Health (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.66)
- Transportation > Ground > Road (0.46)
NATURAL PLAN: Benchmarking LLMs on Natural Language Planning
Zheng, Huaixiu Steven, Mishra, Swaroop, Zhang, Hugh, Chen, Xinyun, Chen, Minmin, Nova, Azade, Hou, Le, Cheng, Heng-Tze, Le, Quoc V., Chi, Ed H., Zhou, Denny
We introduce NATURAL PLAN, a realistic planning benchmark in natural language containing 3 key tasks: Trip Planning, Meeting Planning, and Calendar Scheduling. We focus our evaluation on the planning capabilities of LLMs with full information on the task, by providing outputs from tools such as Google Flights, Google Maps, and Google Calendar as contexts to the models. This eliminates the need for a tool-use environment for evaluating LLMs on Planning. We observe that NATURAL PLAN is a challenging benchmark for state-of-the-art models. For example, in Trip Planning, GPT-4 and Gemini 1.5 Pro could only achieve solve rates of 31.1% and 34.8%, respectively. We find that model performance drops drastically as the complexity of the problem increases: all models perform below 5% when there are 10 cities, highlighting a significant gap in planning in natural language for SoTA LLMs. We also conduct extensive ablation studies on NATURAL PLAN to further shed light on the (in)effectiveness of approaches such as self-correction, few-shot generalization, and in-context planning with long contexts on improving LLM planning.
- Europe > Finland > Uusimaa > Helsinki (0.08)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.05)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Combining Qualitative and Computational Approaches for Literary Analysis of Finnish Novels
DOI and link will be added once available. What can we learn from the classics of Finnish literature by using computational emotion analysis? This article tries to answer this question by examining how computational methods of sentiment analysis can be used in the study of literary works in conjunction with a qualitative or more 'traditional' approach to literature and affect. We present and develop a simple but robust computational approach of affect analysis that uses a carefully curated emotion lexicon adapted to Finnish turn-of-the-century literary texts combined with word embeddings to map out the semantic emotional spaces of seminal works of Finnish literature. We focus our qualitative analysis on selected case studies: four works by Juhani Aho, Minna Canth, Maria Jotuni and F. E. Sillanpää, but provide emotion arcs for a total of 975 Finnish novels. We argue that a computational analysis of a text's lexicon can be valuable in evaluating the broad distribution of emotional valence in a text and provide guidelines to help other researchers replicate our findings. We show that computational approaches have a place in traditional studies on affect in literature as a support tool for close-reading-based analyses, and that they also allow for large-scale comparison between, for example, genres or national canons. Introduction: The study of literature provides interesting insights for an interdisciplinary study of emotion since literature can be considered a genre in which the affective functions of language are of principal importance (Hogan 2011, 1).
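As a rough illustration of what a lexicon-based emotion arc can look like, the sketch below splits a tokenized text into equal segments and averages the valence of lexicon words found in each. It is an assumption-laden toy: the two-word lexicon and its scores are invented, the function name is hypothetical, and the word-embedding component mentioned in the abstract is left out entirely.

```python
def emotion_arc(tokens, lexicon, num_segments):
    """Mean lexicon valence per segment of the text.

    tokens       : list of word tokens in reading order
    lexicon      : dict mapping a word to a valence score (toy values here)
    num_segments : number of equal-length slices along the text
    Segments without any lexicon word get a neutral 0.0.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    n = len(tokens)
    arc = []
    for i in range(num_segments):
        start = i * n // num_segments
        end = (i + 1) * n // num_segments
        scores = [lexicon[t] for t in tokens[start:end] if t in lexicon]
        arc.append(sum(scores) / len(scores) if scores else 0.0)
    return arc

# Toy Finnish example: "ilo" (joy) scored positive, "suru" (sorrow) negative.
toy_lexicon = {"ilo": 1.0, "suru": -1.0}
toy_tokens = ["ilo", "ja", "suru", "suru"]
arc = emotion_arc(toy_tokens, toy_lexicon, num_segments=2)
print(arc)
```

Plotting such per-segment values over the length of a novel gives a curve that can be compared across works, which is the kind of large-scale comparison the abstract points to.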
- Europe > Finland > Uusimaa > Helsinki (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)