novelty of the proposed method, which addresses how to embed continuous time into a differentiable functional domain

Neural Information Processing Systems

We'd like to thank the reviewers for their careful reading and valuable comments. We apologize for the typos, grammatical mistakes, and unclear notation; they will be corrected in the final version. We also provide additional experimental results in Table 1 (values converted to percentages by multiplying by 100).


BioMNER: A Dataset for Biomedical Method Entity Recognition

Tang, Chen, Yang, Bohao, Zhao, Kun, Lv, Bo, Xiao, Chenghao, Guerin, Frank, Lin, Chenghua

arXiv.org Artificial Intelligence

Named entity recognition (NER) stands as a fundamental and pivotal task within the realm of Natural Language Processing. Particularly within the domain of Biomedical Method NER, this task presents notable challenges, stemming from the continual influx of domain-specific terminologies in scholarly literature. Current research in Biomedical Method (BioMethod) NER suffers from a scarcity of resources, primarily attributed to the intricate nature of methodological concepts, which necessitate a profound understanding for precise delineation. In this study, we propose a novel dataset for biomedical method entity recognition, employing an automated BioMethod entity recognition and information retrieval system to assist human annotation. Furthermore, we comprehensively explore a range of conventional and contemporary open-domain NER methodologies, including the utilization of cutting-edge large-scale language models (LLMs) customised for our dataset. Our empirical findings reveal that the large parameter counts of language models surprisingly inhibit the effective assimilation of entity extraction patterns pertaining to biomedical methods. Remarkably, the approach leveraging the modestly sized ALBERT model (only 11MB), in conjunction with conditional random fields (CRF), achieves state-of-the-art (SOTA) performance.


Evaluating Large Language Models in Analysing Classroom Dialogue

Long, Yun, Luo, Haifeng, Zhang, Yu

arXiv.org Artificial Intelligence

This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement. Recognizing the knowledge-intensive and labor-intensive nature of traditional qualitative methods in educational research, this study investigates the potential of LLMs to streamline and enhance the analysis process. The study involves datasets from a middle school, encompassing classroom dialogues across mathematics and Chinese classes. These dialogues were manually coded by educational experts and then analyzed using a customised GPT-4 model. This study focuses on comparing manual annotations with the outputs of GPT-4 to evaluate its efficacy in analyzing educational dialogues. Time efficiency, inter-coder agreement, and inter-coder reliability between human coders and GPT-4 are evaluated. Results indicate substantial time savings with GPT-4 and a high degree of consistency in coding between the model and human coders, with some discrepancies in specific codes. These findings highlight the strong potential of LLMs in teaching evaluation and facilitation.


On the speed of uniform convergence in Mercer's theorem

Takhanov, Rustem

arXiv.org Artificial Intelligence

Mercer kernels play an important role in machine learning and are a mathematical basis of such techniques as kernel density estimation and spline models [14], Support Vector Machines [11], kernel principal components analysis [10], regularization of neural networks [13], and many others. According to Aronszajn's theorem, any Mercer kernel induces a reproducing kernel Hilbert space (RKHS) and vice versa, any RKHS corresponds to a kernel. The relationship between these two notions is described in the classical Mercer's theorem. The goal of this note is to refine this theorem and give some estimates on the speed of uniform convergence stated in it.
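Mercer's theorem expands a continuous positive-definite kernel as k(x, y) = Σᵢ λᵢ φᵢ(x) φᵢ(y), and the note above concerns how fast the truncated series converges uniformly to k. As an illustrative numerical sketch (not the paper's method), one can approximate the Mercer eigenvalues of a Gaussian kernel on [0, 1] via the eigendecomposition of its Gram matrix on a grid and watch the uniform error of the truncated expansion shrink; the kernel choice, grid, and bandwidth here are assumptions for demonstration.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel on 1-D inputs, returned as a matrix."""
    return np.exp(-gamma * (x[:, None] - y[None, :]) ** 2)

# Grid points approximating the uniform measure on [0, 1].
n = 200
x = np.linspace(0.0, 1.0, n)
K = rbf_kernel(x, x)

# Eigendecomposition of the symmetric PSD Gram matrix.  Scaling by 1/n
# makes the eigenvalues approximate the Mercer eigenvalues with respect
# to the uniform measure on [0, 1] (a Nystrom-style discretization).
eigvals, eigvecs = np.linalg.eigh(K / n)
eigvals = eigvals[::-1]          # sort eigenvalues in descending order
eigvecs = eigvecs[:, ::-1]

# Truncated Mercer expansion: keep only the top-m terms.
m = 10
K_m = (eigvecs[:, :m] * eigvals[:m]) @ eigvecs[:, :m].T * n

# The maximum entrywise error is a discrete proxy for the uniform
# (sup-norm) error whose decay rate the note studies.
err = np.max(np.abs(K - K_m))
print(f"uniform error with {m} terms: {err:.2e}")
```

For smooth kernels like the Gaussian, the eigenvalues decay very quickly, so even a small number of terms gives a tiny uniform error; rougher kernels (e.g. the Matérn family mentioned in the next abstract) have polynomially decaying eigenvalues and correspondingly slower uniform convergence.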


Uniform Generalization Bounds for Overparameterized Neural Networks

Vakili, Sattar, Bromberg, Michael, Shiu, Da-shan, Bernacchia, Alberto

arXiv.org Machine Learning

An interesting observation in artificial neural networks is their favorable generalization error despite typically being extremely overparameterized. It is well known that classical statistical learning methods often result in vacuous generalization errors in the case of overparameterized neural networks. Adopting the recently developed Neural Tangent (NT) kernel theory, we prove uniform generalization bounds for overparameterized neural networks in kernel regimes, when the true data generating model belongs to the reproducing kernel Hilbert space (RKHS) corresponding to the NT kernel. Importantly, our bounds capture the exact error rates depending on the differentiability of the activation functions. In order to establish these bounds, we propose the information gain of the NT kernel as a measure of complexity of the learning problem. Our analysis uses a Mercer decomposition of the NT kernel in the basis of spherical harmonics and the decay rate of the corresponding eigenvalues. As a byproduct of our results, we show the equivalence between the RKHS corresponding to the NT kernel and its counterpart corresponding to the Matérn family of kernels, which induces a very general class of models. We further discuss the implications of our analysis for some recent results on the regret bounds for reinforcement learning algorithms, which use overparameterized neural networks.


AI Continues DevOps Expansion

#artificialintelligence

AI gives us the potential to look through the clutter and pick out the pieces of data that really matter. It's no wonder, then, that AI is increasingly being used to target complex IT tasks, including DevOps. For instance, the Swedish company CodeScene is finding success in using machine learning to analyze source code. The company's offering, which is partly based on co-founder Adam Tornhill's book "Your Code As A Crime Scene," analyzes version control metadata over time to identify "hot spots" in the code that companies should be paying more attention to. CodeScene, which was founded in 2015, is owned by Empear AB and raised 30 million Swedish Kronor (about $3.6 million) earlier this year.


Can We Trust the Presidential-Election Polls?

The New Yorker

On October 18, 2016, the New York Times gave Hillary Clinton a ninety-one-per-cent chance of beating Donald Trump. Five days later, ABC News released a tracking poll showing her ahead of Trump by twelve points. Buoyed by the polls, Democrats--especially Democratic women--approached November 8th with a joyful sense of inevitability. The collective disbelief when Clinton lost was tinged with confusion: How could the pollsters have been so wrong? Now, with Joe Biden leading Trump by double digits in the lead-up to Election Day, according to the latest NPR/PBS NewsHour/Marist survey, the question has to be asked: Are voters hoping for a Biden victory about to fall into the same trap?