Tirana County
- Europe > United Kingdom > Scotland (0.05)
- Europe > Albania > Tirana County > Tirana (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
- (17 more...)
- Research Report > New Finding (0.68)
- Personal (0.46)
- Leisure & Entertainment > Sports (1.00)
- Leisure & Entertainment > Games > Computer Games (0.46)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- (96 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education > Health & Safety > School Nutrition (0.93)
- Health & Medicine > Consumer Health (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)
Four of the Strangest AI Moments in 2025
Pillay is an editorial fellow at TIME. Albania's new AI-generated minister Diella speaks during the parliamentary session for the voting of the new government of Albania, in Tirana on Sept. 18, 2025. It's been three years since the launch of ChatGPT gave hundreds of millions of people access to a kind of digital genie in their pocket--and things have been getting stranger by the month. Besides billions of AI-generated emails and the technology's widespread disruption of education and cognitive work, in 2025, some people began to fall in love with their AIs.
- Europe > Albania > Tirana County > Tirana (0.46)
- North America > United States (0.05)
- Europe > Hungary (0.05)
- (4 more...)
MAGE-ID: A Multimodal Generative Framework for Intrusion Detection Systems
Loodaricheh, Mahdi Arab, Manshaei, Mohammad Hossein, Raja, Anita
Modern Intrusion Detection Systems (IDS) face severe challenges due to heterogeneous network traffic, evolving cyber threats, and pronounced data imbalance between benign and attack flows. While generative models have shown promise in data augmentation, existing approaches are limited to single modalities and fail to capture cross-domain dependencies. This paper introduces MAGE-ID (Multimodal Attack Generator for Intrusion Detection), a diffusion-based generative framework that couples tabular flow features with their transformed images through a unified latent prior. By jointly training Transformer- and CNN-based variational encoders with an EDM-style denoiser, MAGE-ID achieves balanced and coherent multimodal synthesis. Evaluations on CIC-IDS-2017 and NSL-KDD demonstrate significant improvements in fidelity, diversity, and downstream detection performance over TabSyn and TabDDPM, highlighting MAGE-ID's effectiveness for multimodal IDS augmentation.
- North America > United States (0.04)
- Europe > Albania > Tirana County (0.04)
Data-efficient U-Net for Segmentation of Carbide Microstructures in SEM Images of Steel Alloys
Gerçek, Alinda Ezgi, Korten, Till, Chekhonin, Paul, Hassan, Maleeha, Steinbach, Peter
Understanding reactor-pressure-vessel steel microstructure is crucial for predicting mechanical properties, as carbide precipitates both strengthen the alloy and can initiate cracks. In scanning electron microscopy images, gray-value overlap between carbides and matrix makes simple thresholding ineffective. We present a data-efficient segmentation pipeline using a lightweight U-Net (30.7 M parameters) trained on just 10 annotated scanning electron microscopy images. Despite limited data, our model achieves a Dice-Sørensen coefficient of 0.98, significantly outperforming the state of the art in the field of metallurgy (classical image analysis: 0.85), while reducing annotation effort by one order of magnitude compared to the state-of-the-art data-efficient segmentation model. This approach enables rapid, automated carbide quantification for alloy design and generalizes to other steel types, demonstrating the potential of data-efficient deep learning in reactor-pressure-vessel steel analysis.
- Europe > Germany > Saxony > Dresden (0.05)
- South America > Peru > Loreto Department (0.04)
- North America > United States (0.04)
- Europe > Albania > Tirana County (0.04)
- Energy (0.55)
- Materials > Metals & Mining > Steel (0.41)
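For readers unfamiliar with the Dice-Sørensen coefficient quoted in the carbide-segmentation abstract above, here is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), applied to two binary masks. The function name `dice_coefficient`, the flat 0/1 list representation, and the toy masks are illustrative assumptions; this is not the authors' evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice-Sørensen coefficient of two equally sized binary masks.

    Masks are flat sequences of 0/1 values (a hypothetical representation
    chosen for illustration; real pipelines usually work on image arrays).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: predicted vs. annotated mask over six pixels.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 0, 1]
score = dice_coefficient(pred, target)
print(round(score, 3))
```

A score of 1.0 means the predicted and annotated masks coincide exactly, and 0.0 means they share no foreground pixels.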
Data Heterogeneity and Forgotten Labels in Split Federated Learning
Tirana, Joana, Tsigkari, Dimitra, Noguero, David Solans, Kourtellis, Nicolas
In Split Federated Learning (SFL), the clients collaboratively train a model with the help of a server by splitting the model into two parts. Part-1 is trained locally at each client and aggregated by the aggregator at the end of each round. Part-2 is trained at a server that sequentially processes the intermediate activations received from each client. We study the phenomenon of catastrophic forgetting (CF) in SFL in the presence of data heterogeneity. In detail, due to the nature of SFL, local updates of part-1 may drift away from global optima, while part-2 is sensitive to the processing sequence, similar to forgetting in continual learning (CL). Specifically, we observe that the trained model performs better on classes (labels) seen at the end of the sequence. We investigate this phenomenon with emphasis on key aspects of SFL, such as the processing order at the server and the cut layer. Based on our findings, we propose Hydra, a novel mitigation method inspired by multi-head neural networks and adapted to the SFL setting. Extensive numerical evaluations show that Hydra outperforms baselines and methods from the literature.
AutoLibra: Agent Metric Induction from Open-Ended Human Feedback
Zhu, Hao, Cuvin, Phil, Yu, Xinkai, Yan, Charlotte Ka Yee, Zhang, Jason, Yang, Diyi
Agents are predominantly evaluated and optimized via task success metrics, which are coarse, rely on manual design from experts, and fail to reward intermediate emergent behaviors. We propose AutoLibra, a framework for agent evaluation that transforms open-ended human feedback (e.g., "If you find that the button is disabled, don't click it again" or "This agent has too much autonomy to decide what to do on its own") into metrics for evaluating fine-grained behaviors in agent trajectories. AutoLibra accomplishes this by grounding feedback to an agent's behavior, clustering similar positive and negative behaviors, and creating concrete metrics with clear definitions and concrete examples, which can be used to prompt LLM-as-a-Judge evaluators. We further propose two meta-metrics to evaluate the alignment of a set of (induced) metrics with open feedback: "coverage" and "redundancy". Through optimizing these meta-metrics, we experimentally demonstrate AutoLibra's ability to induce more concrete agent evaluation metrics than the ones proposed in previous agent evaluation benchmarks and discover new metrics to analyze agents. We also present two applications of AutoLibra in agent improvement: First, we show that AutoLibra can help human prompt engineers diagnose agent failures and improve prompts iteratively. Moreover, we find that AutoLibra can induce metrics for automatic optimization of agents, which lets agents improve through self-regulation. Our results suggest that AutoLibra is a powerful task-agnostic tool for evaluating and improving language agents.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania (0.04)
- (8 more...)
- Education (0.92)
- Leisure & Entertainment > Games (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)
Aligning LLMs for Multilingual Consistency in Enterprise Applications
Agarwal, Amit, Meghwani, Hansa, Patel, Hitesh Laxmichand, Sheng, Tao, Ravi, Sujith, Roth, Dan
Large language models (LLMs) remain unreliable for global enterprise applications due to substantial performance gaps between high-resource and mid/low-resource languages, driven by English-centric pretraining and internal reasoning biases. This inconsistency undermines customer experience and operational reliability in multilingual settings such as customer support, content moderation, and information retrieval. Even with advanced Retrieval-Augmented Generation (RAG) systems, we observe up to a 29% accuracy drop in non-English languages compared to English. We propose a practical, batch-wise alignment strategy for fine-tuning LLMs, leveraging semantically equivalent multilingual data in each training batch to directly align model outputs across languages. This approach improves non-English accuracy by up to 23.9% without compromising English performance, model reasoning, or retrieval quality. Our method is simple to implement, scalable, and integrates seamlessly with existing LLM training & deployment pipelines, enabling more robust and equitable multilingual AI solutions in industry.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
Who's Asking? Investigating Bias Through the Lens of Disability Framed Queries in LLMs
Hari, Vishnu, Panda, Kalpana, Panda, Srikant, Agarwal, Amit, Patel, Hitesh Laxmichand
Large Language Models (LLMs) routinely infer users' demographic traits from phrasing alone, which can result in biased responses, even when no explicit demographic information is provided. The role of disability cues in shaping these inferences remains largely uncharted. Thus, we present the first systematic audit of disability-conditioned demographic bias across eight state-of-the-art instruction-tuned LLMs ranging from 3B to 72B parameters. Using a balanced template corpus that pairs nine disability categories with six real-world business domains, we prompt each model to predict five demographic attributes (gender, socioeconomic status, education, cultural background, and locality) under both neutral and disability-aware conditions. Across a varied set of prompts, models deliver a definitive demographic guess in up to 97% of cases, exposing a strong tendency to make arbitrary inferences with no clear justification. Disability context heavily shifts predicted attribute distributions, and domain context can further amplify these deviations. We observe that larger models are simultaneously more sensitive to disability cues and more prone to biased reasoning, indicating that scale alone does not mitigate stereotype amplification. Our findings reveal persistent intersections between ableism and other demographic stereotypes, pinpointing critical blind spots in current alignment strategies. We release our evaluation framework and results to encourage disability-inclusive benchmarking and recommend integrating abstention calibration and counterfactual fine-tuning to curb unwarranted demographic inference. Code and data will be released on acceptance.
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education (1.00)
The World's First AI-Powered Minister Tests the Future of Government
Pillay is an editorial fellow at TIME. Albania's new AI-generated minister Diella speaks during the parliamentary session for the voting of the new government of Albania, in Tirana, on September 18, 2025. In September, Albania appointed an AI system to a cabinet-level position--a world-first. Called Diella (Albanian for "sun"), the system was declared "Minister of State for Artificial Intelligence," and tasked by Albania's Prime Minister with addressing corruption in government contracting.
- Europe > Albania > Tirana County > Tirana (0.46)
- North America > United States > Pennsylvania (0.05)
- Europe > Ukraine > Kyiv Oblast > Chernobyl (0.05)