Hungary
Dogs can fulfill our need to nurture
Just as birth rates decline in many wealthy and developed nations, dog parenting is remaining steady and even gaining in popularity. Up to half of households in Europe and 66 percent of homes in the United States have at least one dog, and these pets are often regarded as a family member or "fur baby." To dig into what this shift says about our society, researchers from Eötvös Loránd University in Budapest, Hungary, conducted a literature review to analyze the data. They propose that while dogs do not replace children, they can offer a chance to fulfill an innate nurturing drive similar to parenting, but with fewer demands than raising biological children.
Your eyes can reveal the accuracy of your memories
We like to think our brains are reliable recorders--but reality says otherwise. From misremembered childhood moments to mistakenly "recalling" that you took your pills when you didn't, false memories are surprisingly common. And in high-stakes situations like courtroom testimony, these errors can have devastating consequences. Wouldn't it be amazing if there were an objective way to measure just how accurate someone's memory really is? New research suggests we might be able to do just that--by watching the eyes.
ALT: A Python Package for Lightweight Feature Representation in Time Series Classification
Halmos, Balázs P., Hajós, Balázs, Molnár, Vince Á., Kurbucz, Marcell T., Jakovác, Antal
We introduce ALT, an open-source Python package created for efficient and accurate time series classification (TSC). The package implements the adaptive law-based transformation (ALT) algorithm, which transforms raw time series data into a linearly separable feature space using variable-length shifted time windows. This adaptive approach enhances its predecessor, the linear law-based transformation (LLT), by effectively capturing patterns of varying temporal scales. The software is implemented for scalability, interpretability, and ease of use, achieving state-of-the-art performance with minimal computational overhead. Extensive benchmarking on real-world datasets demonstrates the utility of ALT for diverse TSC tasks in physics and related domains.
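The "variable-length shifted time windows" mentioned in the abstract can be pictured with a small sketch. This is not the ALT package's API and it does not implement the law-based transformation itself; the function names `shifted_windows` and `multiscale_windows`, the window lengths, and the shift are all hypothetical choices made only to show how one series yields windows at several temporal scales.

```python
import numpy as np


def shifted_windows(series, length, shift):
    """Return windows of `length` samples, one starting every `shift` samples."""
    series = np.asarray(series, dtype=float)
    if length > series.size:
        # No complete window fits into the series.
        return np.empty((0, length))
    # All windows with stride 1, then keep every `shift`-th one.
    windows = np.lib.stride_tricks.sliding_window_view(series, length)
    return windows[::shift]


def multiscale_windows(series, lengths, shift=1):
    """Map each window length to the matrix of windows of that length."""
    return {length: shifted_windows(series, length, shift) for length in lengths}


# Toy signal (hypothetical): two periods of a sine wave, 50 samples.
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 50))
windows_by_length = multiscale_windows(signal, lengths=(5, 10, 20), shift=5)

for length, windows in windows_by_length.items():
    print(length, windows.shape)
```

Each matrix has one window per row, so shorter lengths expose fast local patterns and longer lengths expose slower ones; a downstream transformation would then be applied to each scale.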
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
Yang, Haote, Wei, Xingjian, Wu, Jiang, Ligeti-Nagy, Noémi, Sun, Jiaxing, Wang, Yinfan, Yang, Zijian Győző, Gao, Junyuan, Wang, Jingchao, Jiang, Bowen, Wang, Shasha, Yu, Nanjun, Zhang, Zihao, Hong, Shixin, Liu, Hongwei, Li, Wei, Zhang, Songyang, Lin, Dahua, Wu, Lijun, Prószéky, Gábor, He, Conghui
We introduce OpenHuEval, the first benchmark for LLMs focusing on the Hungarian language and specifics. OpenHuEval is constructed from a vast collection of Hungarian-specific materials sourced from multiple origins. In the construction, we incorporated the latest design principles for evaluating LLMs, such as using real user queries from the internet, emphasizing the assessment of LLMs' generative capabilities, and employing LLM-as-judge to enhance the multidimensionality and accuracy of evaluations. Ultimately, OpenHuEval encompasses eight Hungarian-specific dimensions, featuring five tasks and 3953 questions. Consequently, OpenHuEval provides a comprehensive, in-depth, and scientifically accurate assessment of LLM performance in the context of the Hungarian language and its specifics. We evaluated current mainstream LLMs, including both traditional LLMs and recently developed Large Reasoning Models (LRMs). The results demonstrate the significant necessity for evaluation and model optimization tailored to the Hungarian language and specifics. We also established a framework for analyzing the thinking processes of LRMs with OpenHuEval, revealing intrinsic patterns and mechanisms of these models in non-English languages, with Hungarian serving as a representative example. We will release OpenHuEval at https://github.com/opendatalab/OpenHuEval.
Adaptive Exploration for Data-Efficient General Value Function Evaluations
General Value Functions (GVFs) (Sutton et al., 2011) represent predictive knowledge in reinforcement learning. Each GVF computes the expected return for a given policy, based on a unique reward. Existing methods relying on fixed behavior policies or pre-collected data often face data efficiency issues when learning multiple GVFs in parallel using off-policy methods. To address this, we introduce GVFExplorer, which adaptively learns a single behavior policy that efficiently collects data for evaluating multiple GVFs in parallel.
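To make "each GVF computes the expected return for a given policy, based on a unique reward" concrete, the sketch below computes discounted returns for several reward signals along one trajectory. It is only an illustration under stated assumptions: the function name `discounted_returns`, the toy reward values, and the discount factor are hypothetical, and the code covers plain Monte Carlo returns, not GVFExplorer's adaptive behavior policy or any off-policy correction.

```python
import numpy as np


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every reward signal at once.

    rewards: array of shape (T, K), one column per GVF's reward signal.
    Returns an array of shape (T, K) holding the return from each time step.
    """
    rewards = np.asarray(rewards, dtype=float)
    returns = np.zeros_like(rewards)
    running = np.zeros(rewards.shape[1])
    # Walk the trajectory backwards so each step reuses the later return.
    for t in range(rewards.shape[0] - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical three-step trajectory observed under two reward signals.
trajectory_rewards = np.array([[1.0, 0.0],
                               [0.0, 1.0],
                               [1.0, 1.0]])
returns = discounted_returns(trajectory_rewards, gamma=0.5)
print(returns[0])  # return from the first step, one entry per reward signal
```

Averaging such returns over many trajectories generated by the target policy estimates that policy's value under each reward signal; the data-efficiency question raised in the abstract is which behavior policy should generate those trajectories.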
Predicting performance-related properties of refrigerant based on tailored small-molecule functional group contribution
Cao, Peilin, Geng, Ying, Feng, Nan, Zhang, Xiang, Qi, Zhiwen, Song, Zhen, Gani, Rafiqul
As current group contribution (GC) methods are mostly proposed for a wide size-range of molecules, applying them to property prediction of small refrigerant molecules could lead to unacceptable errors. In this sense, for the design of novel refrigerants and refrigeration systems, tailoring GC-based models specifically fitted to refrigerant molecules is of great interest. In this work, databases of potential refrigerant molecules are first collected, focusing on five key properties related to the operational efficiency of refrigeration systems, namely normal boiling point, critical temperature, critical pressure, enthalpy of vaporization, and acentric factor. Based on tailored small-molecule groups, the GC method is combined with machine learning (ML) to model these performance-related properties. Following the development of GC-ML models, their performance is analyzed to highlight the potential group-to-property contributions. Additionally, the refrigerant property databases are extended internally and externally, based on which examples are presented to highlight the significance of the developed models.
MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models
Li, Jiazheng, Yu, Lu, Cui, Qing, Zhang, Zhiqiang, Zhou, Jun, Ye, Yanfang, Zhang, Chuxu
High-quality data plays a critical role in the pretraining and fine-tuning of large language models (LLMs), even determining their performance ceiling to some degree. Consequently, numerous data selection methods have been proposed to identify subsets of data that can effectively and efficiently enhance model performance. However, most of these methods focus on general data selection and tend to overlook the specific nuances of domain-related data. In this paper, we introduce MASS, a MAthematical data Selection framework using the Skill graph for pretraining LLMs in the mathematical reasoning domain. By taking into account the unique characteristics of mathematics and reasoning, we construct a skill graph that captures the mathematical skills and their interrelations from a reference dataset. This skill graph guides us in assigning quality scores to the target dataset, enabling us to select the top-ranked subset which is further used to pretrain LLMs. Experimental results demonstrate the efficiency and effectiveness of MASS across different model sizes (1B and 7B) and pretraining datasets (web data and synthetic data). Specifically, in terms of efficiency, models trained on subsets selected by MASS can achieve similar performance to models trained on the original datasets, with a significant reduction in the number of trained tokens, ranging from 50% to 70% fewer tokens. In terms of effectiveness, when trained on the same amount of tokens, models trained on the data selected by MASS outperform those trained on the original datasets by 3.3% to 5.9%. These results underscore the potential of MASS to improve both the efficiency and effectiveness of pretraining LLMs.
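The final step described above, keeping the top-ranked subset once quality scores exist, can be sketched in a few lines. How MASS derives scores from the skill graph is not specified in this abstract, so the scores below are simply given as input; the function name `select_top_fraction`, the toy documents, and the 50% fraction are hypothetical.

```python
import math


def select_top_fraction(documents, scores, fraction):
    """Keep the highest-scoring `fraction` of documents, best first."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    keep = math.ceil(len(documents) * fraction)
    # Rank document indices by score, highest first.
    ranked = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return [documents[i] for i in ranked[:keep]]


# Hypothetical corpus with externally supplied quality scores.
corpus = ["doc_a", "doc_b", "doc_c", "doc_d"]
quality = [0.1, 0.9, 0.4, 0.7]
selected = select_top_fraction(corpus, quality, fraction=0.5)
print(selected)
```

Selecting by document count is the simplest choice; a token-budget variant would instead accumulate document lengths until the budget is reached.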
A finite-sample bound for identifying partially observed linear switched systems from a single trajectory
Racz, Daniel, Petreczky, Mihaly, Daroczy, Balint
We derive a finite-sample probabilistic bound on the parameter estimation error of a system identification algorithm for Linear Switched Systems. The algorithm estimates Markov parameters from a single trajectory and applies a variant of the Ho-Kalman algorithm to recover the system matrices. Our bound guarantees statistical consistency under the assumption that the true system exhibits quadratic stability. The proof leverages the theory of weakly dependent processes. To the best of our knowledge, this is the first finite-sample bound for this algorithm in the single-trajectory setting.
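For orientation, here is a minimal sketch of the classical Ho-Kalman step for an ordinary linear time-invariant system, assuming exact, noise-free Markov parameters. It is not the switched-system variant analyzed in the paper, and it skips the estimation of Markov parameters from a trajectory; the example matrices and the names `markov_parameters` and `ho_kalman` are hypothetical.

```python
import numpy as np


def markov_parameters(A, B, C, count):
    """Return the list [C B, C A B, ..., C A^(count-1) B]."""
    params = []
    state_map = B.copy()
    for _ in range(count):
        params.append(C @ state_map)
        state_map = A @ state_map
    return params


def ho_kalman(params, order, rows):
    """Recover (A, B, C), up to a change of state basis.

    params must contain at least 2 * rows Markov parameters.
    """
    p, m = params[0].shape
    # Block Hankel matrix and its one-step shifted version.
    hankel = np.block([[params[i + j] for j in range(rows)] for i in range(rows)])
    shifted = np.block([[params[i + j + 1] for j in range(rows)] for i in range(rows)])
    # A rank-`order` factorization of the Hankel matrix gives the
    # observability and controllability factors.
    U, s, Vt = np.linalg.svd(hankel, full_matrices=False)
    sqrt_s = np.sqrt(s[:order])
    observability = U[:, :order] * sqrt_s
    controllability = sqrt_s[:, None] * Vt[:order]
    A_hat = np.linalg.pinv(observability) @ shifted @ np.linalg.pinv(controllability)
    B_hat = controllability[:, :m]
    C_hat = observability[:p]
    return A_hat, B_hat, C_hat


# Hypothetical stable second-order system with one input and one output.
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

true_params = markov_parameters(A, B, C, count=6)
A_hat, B_hat, C_hat = ho_kalman(true_params, order=2, rows=3)
recovered_params = markov_parameters(A_hat, B_hat, C_hat, count=6)
print(np.sort(np.linalg.eigvals(A_hat).real))
```

The recovered matrices generally differ from the originals by a similarity transformation, so the meaningful check is that they reproduce the same Markov parameters and eigenvalues. With estimated rather than exact Markov parameters, the estimation error propagates through the SVD, which is the kind of effect a finite-sample bound has to control.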
Mesterséges Intelligencia Kutatások Magyarországon (Artificial Intelligence Research in Hungary)
Benczúr, András A., Gyimóthy, Tibor, Szegedy, Balázs
Artificial intelligence (AI) has undergone remarkable development since the mid-2000s, particularly in the fields of machine learning and deep learning, driven by the explosive growth of large databases and computational capacity. Hungarian researchers recognized the significance of AI early on, actively participating in international research and achieving significant results in both theoretical and practical domains. This article presents some key achievements in Hungarian AI research. It highlights the results from the period before the rise of deep learning (the early 2010s), then discusses major theoretical advancements in Hungary after 2010. Finally, it provides a brief overview of AI-related applied scientific achievements from 2010 onward.