AITopics

Large language models (LLMs) have significantly impacted human society, influencing various domains. Academia is not only a domain affected by LLMs but also a pivotal force in their development. In academic publishing, this is reflected in the incorporation of LLMs into the peer review of manuscripts. We proposed the concept of automated scholarly paper review (ASPR) in our previous paper. As this incorporation grows, the field is entering the phase in which ASPR and peer review coexist, as described in that paper. LLMs hold transformative potential for the full-scale implementation of ASPR, but they also pose new issues and challenges that need to be addressed. In this survey paper, we aim to provide a holistic view of ASPR in the era of LLMs. We begin with a survey of which LLMs are used to conduct ASPR. Then, we review which ASPR-related technological bottlenecks have been solved by incorporating LLM technology. After that, we explore the new methods, new datasets, new source code, and new online systems that LLMs have brought to ASPR. Furthermore, we summarize the performance and issues of LLMs in ASPR, and investigate the attitudes and reactions of publishers and academia to ASPR. Lastly, we discuss the challenges associated with the development of LLMs for ASPR. We hope this survey can serve as an inspirational reference for researchers and promote the progress of ASPR toward actual implementation.

large language model, machine learning, natural language, (19 more...)

2501.10326

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bobojonova, Latofat, Akhundjanova, Arofat, Ostheimer, Phil, Fellenz, Sophie

BBPOS: BERT-based Part-of-Speech Tagging for Uzbek

This paper advances NLP research for the low-resource Uzbek language by evaluating two previously untested monolingual Uzbek BERT models on the part-of-speech (POS) tagging task and introducing the first publicly available UPOS-tagged benchmark dataset for Uzbek. Our fine-tuned models achieve 91% average accuracy, outperforming the baseline multi-lingual BERT as well as the rule-based tagger. Notably, these models capture intermediate POS changes through affixes and demonstrate context sensitivity, unlike existing rule-based taggers.

artificial intelligence, machine learning, natural language, (18 more...)

2501.10107

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Saarland (0.04)
Europe > Slovenia > Coastal-Karst > Municipality of Koper > Koper (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education

Hersh, William

Generative AI has had a profound impact on biomedicine and health, both in professional work and in education. Based on large language models (LLMs), generative AI has been found to perform as well as humans in simulated situations: taking medical board exams, answering clinical questions, solving clinical cases, applying clinical reasoning, and summarizing information. Generative AI is also being used widely in education, performing well in academic courses and their assessments. This review summarizes the successes of LLMs and highlights some of their challenges in the context of education, most notably aspects that may undermine the acquisition of knowledge and skills for professional work. It then provides recommendations for best practices to overcome the shortcomings of LLM use in education. Although there are challenges in using generative AI in education, all students and faculty, in biomedicine and health and beyond, must understand it and be competent in its use.

generative ai, large language model, machine learning, (17 more...)

2501.10186

Country:

North America > United States > New York > Monroe County > Rochester (0.05)
Asia > Singapore (0.04)
Asia > Middle East > Israel (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
(2 more...)

Industry:

Law (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Agrawal, Roshni, Subramanian, Sivakumar, Runkana, Venkataramana

Two-level Solar Irradiance Clustering with Season Identification: A Comparative Analysis

Solar irradiance clustering can enhance solar power capacity planning and help improve forecasting models by identifying similar irradiance patterns influenced by seasonal and weather changes. In this study, we adopt an efficient two-level clustering approach that automatically identifies seasons using the clear sky irradiance in the first level and then classifies the daily cloud level as clear, cloudy, or partly cloudy within each season in the second level. In the second level of clustering, three methods are compared, namely, Daily Irradiance Index (DII or $\beta$), Euclidean Distance (ED), and Dynamic Time Warping (DTW) distance. The DII is computed as the ratio of the time integral of the measured irradiance to the time integral of the clear sky irradiance. The identified clusters were compared quantitatively using established clustering metrics and qualitatively by comparing the mean irradiance profiles. The results clearly establish the superiority of the $\beta$-based clustering approach, setting a new benchmark for solar irradiance clustering studies. Moreover, $\beta$-based clustering remains effective even for annual data, unlike the time-series methods, which suffer significant performance degradation. Interestingly, contrary to expectations, ED-based clustering outperforms the more compute-intensive DTW distance-based clustering. The method has been rigorously validated using data from two distinct US locations, demonstrating robust scalability for larger datasets and potential applicability to other locations.
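The DII definition above (ratio of two time integrals) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the trapezoidal integration rule, and the sample irradiance values are all assumptions chosen for illustration.

```python
from typing import Sequence


def trapezoid_integral(times: Sequence[float], values: Sequence[float]) -> float:
    """Integrate values over times with the trapezoidal rule (assumed here)."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return total


def daily_irradiance_index(
    times: Sequence[float],
    measured: Sequence[float],
    clear_sky: Sequence[float],
) -> float:
    """DII (beta): integral of measured irradiance / integral of clear sky irradiance."""
    clear_total = trapezoid_integral(times, clear_sky)
    if clear_total <= 0:
        raise ValueError("clear sky integral must be positive")
    return trapezoid_integral(times, measured) / clear_total


# Hypothetical day: hours since midnight and irradiance in W/m^2.
hours = [6, 9, 12, 15, 18]
clear_sky = [0, 500, 900, 500, 0]
measured = [0, 250, 450, 250, 0]  # exactly half of the clear sky profile

beta = daily_irradiance_index(hours, measured, clear_sky)
print(beta)
```

Because each day is reduced to a single scalar, clustering on this index works on a one-dimensional feature instead of a full time series, which is one plausible reason it stays cheap on annual data.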

data mining, irradiance, machine learning, (19 more...)

2501.10084

Country:

North America > United States > Colorado > Jefferson County > Golden (0.05)
North America > United States > Hawaii > Honolulu County > Kailua (0.04)
Africa > Southern Africa (0.04)
(13 more...)

Genre: Research Report (0.84)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Karaca, Ali Can, Ozelbas, M. Enes, Berber, Saadettin, Karimli, Orkhan, Yildirim, Turabi, Amasyali, M. Fatih

Robust Change Captioning in Remote Sensing: SECOND-CC Dataset and MModalCC Framework

Remote sensing change captioning (RSICC) aims to describe changes between bitemporal images in natural language. Existing methods often fail under challenges like illumination differences, viewpoint changes, and blur effects, leading to inaccuracies, especially in no-change regions. Moreover, images that are acquired at different spatial resolutions and have registration errors tend to affect the captions. To address these issues, we introduce SECOND-CC, a novel RSICC dataset featuring high-resolution RGB image pairs, semantic segmentation maps, and diverse real-world scenarios. SECOND-CC contains 6,041 pairs of bitemporal RS images and 30,205 sentences describing the differences between images. Additionally, we propose MModalCC, a multimodal framework that integrates semantic and visual data using advanced attention mechanisms, including Cross-Modal Cross Attention (CMCA) and Multimodal Gated Cross Attention (MGCA). Detailed ablation studies and attention visualizations further demonstrate its effectiveness and ability to address RSICC challenges. Comprehensive experiments show that MModalCC outperforms state-of-the-art RSICC methods, including RSICCformer, Chg2Cap, and PSNet, with a +4.6% improvement in BLEU4 score and a +9.6% improvement in CIDEr score. We will make our dataset and codebase publicly available to facilitate future research at https://github.com/ChangeCapsInRS/SecondCC

artificial intelligence, machine learning, natural language, (20 more...)

2501.10075

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
(2 more...)

Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information

Jansen, Christoph

This habilitation thesis is cumulative and therefore collects and connects research that I (together with several co-authors) have conducted over the last few years. Thus, the absolute core of the work is formed by the ten publications listed on page 5 under the name Contributions 1 to 10. The references to the complete versions of these articles are also found in this list, making them as easily accessible as possible for readers wishing to dive deep into the different research projects. The chapters following this thesis, namely Parts A to C and the concluding remarks, serve to place the articles in a larger scientific context, to (briefly) explain their respective content on a less formal level, and to highlight some interesting perspectives for future research in their respective contexts. Naturally, therefore, the following presentation has neither the level of detail nor the formal rigor that can (hopefully) be found in the papers. The purpose of the following text is to provide the reader with easy, high-level access to this interesting and important research field as a whole, thereby advertising it to a broader audience.

artificial intelligence, decision support system, machine learning, (18 more...)

2501.10195

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Mutual Regression Distance

Qiao, Dong, Fan, Jicong

The maximum mean discrepancy and Wasserstein distance are popular distance measures between distributions and play important roles in many machine learning problems such as metric learning, generative modeling, domain adaptation, and clustering. However, since they are functions of pairwise distances between data points in two distributions, they do not exploit the potential manifold properties of data, such as smoothness, and hence are not effective in measuring the dissimilarity between two distributions in the form of manifolds. In this paper, different from existing measures, we propose a novel distance called Mutual Regression Distance (MRD), induced by a constrained mutual regression problem, which can exploit the manifold property of data. We prove that MRD is a pseudometric that satisfies almost all the axioms of a metric. Since the optimization of the original MRD is costly, we provide a tight MRD and a simplified MRD, based on which a heuristic algorithm is established. We also provide kernel variants of MRDs that are more effective in handling nonlinear data. Our MRDs, especially the simplified MRDs, have much lower computational complexity than the Wasserstein distance. We provide theoretical guarantees, such as robustness, for MRDs. Finally, we apply MRDs to distribution clustering, generative models, and domain adaptation. The numerical results demonstrate the effectiveness and superiority of MRDs compared to the baselines.
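To make the "functions of pairwise distances" remark concrete, here is a minimal sketch of the maximum mean discrepancy that the abstract uses as a point of comparison. It is not MRD and not the authors' code: the Gaussian kernel, the bandwidth value, the biased (V-statistic) estimator, and the sample points are assumptions for illustration.

```python
import math
from typing import Sequence

Point = Sequence[float]


def rbf_kernel(x: Point, y: Point, bandwidth: float) -> float:
    """Gaussian kernel; depends on x and y only through their squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs: Sequence[Point], ys: Sequence[Point], bandwidth: float) -> float:
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs: Sequence[Point], ys: Sequence[Point], bandwidth: float = 1.0) -> float:
    """Biased estimate of squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Hypothetical 2-D samples: `shifted` is `base` translated by (3, 3).
base = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(x + 3.0, y + 3.0) for x, y in base]

print(mmd_squared(base, base))     # identical samples
print(mmd_squared(base, shifted))  # translated sample
```

Every term is an average over point pairs, so the estimate sees only inter-point distances and not any smooth structure connecting the points, which is the limitation the abstract attributes to such measures.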

artificial intelligence, machine learning, mrd, (15 more...)

2501.10617

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Michigan (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hashimoto, Ichiro, Volgushev, Stanislav, Zwiernik, Piotr

Universality of Benign Overfitting in Binary Linear Classification

The practical success of deep learning has led to the discovery of several surprising phenomena. One of these phenomena, which has spurred intense theoretical research, is "benign overfitting": deep neural networks seem to generalize well in the over-parametrized regime even though the networks show a perfect fit to noisy training data. It is now known that benign overfitting also occurs in various classical statistical models. For linear maximum margin classifiers, benign overfitting has been established theoretically in a class of mixture models with very strong assumptions on the covariate distribution. However, even in this simple setting, many questions remain open. For instance, most of the existing literature focuses on the noiseless case where all true class labels are observed without errors, whereas the more interesting noisy case remains poorly understood. We provide a comprehensive study of benign overfitting for linear maximum margin classifiers. We discover a previously unknown phase transition in test error bounds for the noisy model and provide some geometric intuition behind it. We further considerably relax the required covariate assumptions in both the noisy and noiseless cases. Our results demonstrate that benign overfitting of maximum margin classifiers holds in a much wider range of scenarios than was previously known and provide new insights into the underlying mechanisms.

artificial intelligence, deep learning, machine learning, (17 more...)

2501.10538

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

DPERC: Direct Parameter Estimation for Mixed Data

Vo, Tuan L., Do, Quan Huu, Dang, Uyen, Nguyen, Thu, Halvorsen, Pål, Riegler, Michael A., Nguyen, Binh T.

The covariance matrix is foundational to numerous statistical and machine-learning applications, such as principal component analysis and correlation heatmaps. However, missing values within datasets present a formidable obstacle to accurately estimating this matrix. While imputation methods offer one avenue for addressing this challenge, they often entail a trade-off between computational efficiency and estimation accuracy. Consequently, attention has shifted towards direct parameter estimation, given its precision and reduced computational burden. In this paper, we propose Direct Parameter Estimation for Randomly Missing Data with Categorical Features (DPERC), an efficient approach for direct parameter estimation tailored to mixed data that contains missing values within continuous features. Our method is motivated by leveraging information from categorical features, which can significantly enhance covariance matrix estimation for continuous features. Our approach effectively harnesses the information embedded within mixed data structures. Through comprehensive evaluations on diverse datasets, we demonstrate the competitive performance of DPERC compared to various contemporary techniques. In addition, we show experimentally that DPERC is a valuable tool for visualizing the correlation heatmap.

artificial intelligence, covariance matrix, machine learning, (17 more...)

2501.1054

Country:

Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.05)
Europe > Norway > Eastern Norway > Oslo (0.04)
North America > United States > California (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Differentiable Adversarial Attacks for Marked Temporal Point Processes

Chakraborty, Pritish, Gupta, Vinayak, R, Rahul, Bedathur, Srikanta J., De, Abir

Marked temporal point processes (MTPPs) have been shown to be extremely effective in modeling continuous time event sequences (CTESs). In this work, we present adversarial attacks designed specifically for MTPP models. A key criterion for a good adversarial attack is its imperceptibility. For objects such as images or text, this is often achieved by bounding the perturbation in some fixed $L_p$ norm-ball. However, similarly minimizing distance norms between two CTESs in the context of MTPPs is challenging due to their sequential nature and varying time-scales and lengths. We address this challenge by first permuting the events and then adding noise to the arrival timestamps. However, the worst-case optimization of such adversarial attacks is a hard combinatorial problem, requiring exploration across a permutation space that is factorially large in the length of the input sequence. As a result, we propose a novel differentiable scheme, PERMTPP, with which we can perform adversarial attacks by learning to minimize the likelihood while minimizing the distance between two CTESs. Our experiments on four real-world datasets demonstrate the offensive and defensive capabilities, and lower inference times, of PERMTPP.
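The two-step perturbation described above (permute the events, then add noise to the arrival timestamps) can be sketched as a naive random baseline. This is not PERMTPP, which learns the permutation differentiably. Treating "permuting the events" as reassigning marks across time slots is one possible reading, and the uniform noise, the re-sorting step, and all names are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple

Event = Tuple[float, str]  # (arrival time, mark)


def perturb_sequence(
    events: Sequence[Event], max_noise: float, seed: int = 0
) -> List[Event]:
    """Randomly permute marks, then add bounded noise to arrival times."""
    rng = random.Random(seed)
    times = [t for t, _ in events]
    marks = [m for _, m in events]

    # Step 1 (assumed reading): permute which mark occupies which time slot.
    rng.shuffle(marks)

    # Step 2: add uniform noise in [-max_noise, max_noise] to each timestamp,
    # then sort so the result is still a valid, time-ordered sequence.
    noisy_times = sorted(t + rng.uniform(-max_noise, max_noise) for t in times)
    return list(zip(noisy_times, marks))


# Hypothetical sequence of four events with three mark types.
original = [(0.5, "a"), (1.2, "b"), (2.0, "a"), (3.7, "c")]
perturbed = perturb_sequence(original, max_noise=0.1)
print(perturbed)
```

A random draw like this explores a single point in the factorially large permutation space, which is why the abstract motivates a learned, differentiable search instead of enumeration.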

artificial intelligence, data mining, machine learning, (19 more...)

2501.10606

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Data Science > Data Mining (0.68)
(2 more...)