Hou, Jue
Implicit assessment of language learning during practice as accurate as explicit testing
Hou, Jue, Katinskaia, Anisia, Vu, Anh-Duc, Yangarber, Roman
Assessment of the learner's proficiency is an essential part of Intelligent Tutoring Systems (ITS). We use Item Response Theory (IRT) in computer-aided language learning for assessment of student ability in two contexts: in test sessions, and in exercises during practice sessions. Exhaustive testing across a wide range of skills can provide a detailed picture of proficiency, but may be undesirable for a number of reasons. Therefore, we first aim to replace exhaustive tests with efficient but accurate adaptive tests. We use learner data collected from exhaustive tests under imperfect conditions to train an IRT model to guide adaptive tests. Simulations and experiments with real learner data confirm that this approach is efficient and accurate. Second, we explore whether we can accurately estimate learner ability directly from the context of practice with exercises, without testing. We transform learner data collected from exercise sessions into a form that can be used for IRT modeling. This is done by linking the exercises to linguistic constructs; the constructs are then treated as "items" within IRT. We present results from large-scale studies with thousands of learners. Using teacher assessments of student ability as "ground truth," we compare the estimates obtained from tests with those obtained from exercises. The experiments confirm that the IRT models can produce accurate ability estimates based on exercises.
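The sketch below is a minimal illustration (not the authors' implementation) of the IRT idea described above: each linguistic construct is treated as an "item" in a two-parameter logistic (2PL) model, and a learner's ability is estimated by maximum likelihood from binary correct/incorrect responses. The item parameters and responses here are hypothetical placeholders.

```python
# Minimal 2PL IRT sketch: estimate a learner's ability theta given known
# item parameters (discrimination a, difficulty b) and binary responses.
import numpy as np
from scipy.optimize import minimize_scalar

def p_correct(theta, a, b):
    """2PL probability that a learner with ability theta answers an item
    with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_ability(responses, a, b):
    """Maximum-likelihood estimate of theta from 0/1 responses,
    assuming the item parameters were fit beforehand on pooled learner data."""
    responses, a, b = map(np.asarray, (responses, a, b))

    def neg_log_lik(theta):
        p = np.clip(p_correct(theta, a, b), 1e-9, 1 - 1e-9)
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

    return minimize_scalar(neg_log_lik, bounds=(-4, 4), method="bounded").x

if __name__ == "__main__":
    # Hypothetical parameters for five constructs and one learner's responses.
    a = [1.2, 0.8, 1.5, 1.0, 0.9]     # discrimination
    b = [-1.0, 0.0, 0.5, 1.0, 2.0]    # difficulty
    responses = [1, 1, 1, 0, 0]
    print(f"estimated ability: {estimate_ability(responses, a, b):.2f}")
```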
What do Transformers Know about Government?
Hou, Jue, Katinskaia, Anisia, Kotilainen, Lari, Trangcasanchai, Sathianpong, Vu, Anh-Duc, Yangarber, Roman
This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. We use several probing classifiers and data from two morphologically rich languages. Our experiments show that information about government is encoded across all transformer layers, but predominantly in the early layers of the model. We find that, for both languages, a small number of attention heads encode enough information about the government relations to enable us to train a classifier capable of discovering new, previously unknown types of government, never seen in the training data. Data is currently lacking for the research community working on grammatical constructions, and on government in particular. We release the Government Bank -- a dataset defining the government relations for thousands of lemmas in the languages in our experiments.
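A simplified probing sketch of the general approach: freeze a BERT encoder, take the hidden states of a chosen layer for a head word and a dependent word, concatenate them, and fit a linear classifier that predicts whether the pair stands in a government relation. The model name, layer choice, sentences, token positions, and labels below are assumptions for illustration, not the paper's exact setup or data.

```python
# Probing sketch: linear classifier over frozen BERT representations of word pairs.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL = "bert-base-multilingual-cased"   # assumed multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

def pair_features(sentence, head_idx, dep_idx, layer=4):
    """Concatenated hidden states of two token positions from one layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]   # (seq_len, dim)
    return torch.cat([hidden[head_idx], hidden[dep_idx]]).numpy()

# Hypothetical probe data: (sentence, head position, dependent position, label).
examples = [
    ("Hän luottaa ystävään.", 1, 2, 1),
    ("Hän näkee ystävän.", 1, 2, 0),
]
X = [pair_features(s, h, d) for s, h, d, _ in examples]
y = [label for *_, label in examples]
probe = LogisticRegression(max_iter=1000).fit(X, y)
```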
Effects of sub-word segmentation on performance of transformer language models
Hou, Jue, Katinskaia, Anisia, Vu, Anh-Duc, Yangarber, Roman
Language modeling is a fundamental task in natural language processing, which has been thoroughly explored with various architectures and hyperparameters. However, few studies focus on the effect of sub-word segmentation on the performance of language models (LMs). In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation -- Morfessor and StateMorph. We train the models for several languages -- including ones with very rich morphology -- and compare their performance with different segmentation algorithms, vocabulary sizes, and model sizes. The results show that training with morphological segmentation allows the LMs to: 1. achieve lower perplexity, 2. converge more efficiently in terms of training time, and 3. achieve equivalent or better evaluation scores on downstream tasks. Lastly, we show 4. that LMs of smaller size using morphological segmentation can perform comparably to models of larger size trained with BPE -- both in terms of (1) perplexity and (3) scores on downstream tasks. Points (2) and (4) affect the sustainability of LMs, since they reduce model cost in terms of size and computation time. While (2) reduces cost only in the training phase, (4) does so also in the inference phase.
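One practical detail worth making explicit when comparing LMs over different segmentations is that per-token perplexities are not directly comparable across vocabularies. The sketch below (an assumption about the evaluation protocol, not necessarily the paper's) renormalizes the total negative log-likelihood by a segmentation-independent unit, such as the word count of the shared test text.

```python
# Perplexity normalization sketch: per-token vs. per-word perplexity.
import math

def perplexity(total_neg_log_lik_nats, n_units):
    """Perplexity = exp(NLL / number of normalization units)."""
    return math.exp(total_neg_log_lik_nats / n_units)

# Hypothetical statistics for the same test text under two segmentations.
nll_bpe, n_tokens_bpe = 21500.0, 9000        # BPE tokens
nll_morph, n_tokens_morph = 21100.0, 11000   # morphological segments
n_words = 7000                                # word count shared by both

print("per-token PPL:", perplexity(nll_bpe, n_tokens_bpe), perplexity(nll_morph, n_tokens_morph))
print("per-word  PPL:", perplexity(nll_bpe, n_words), perplexity(nll_morph, n_words))
```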
LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records
Wen, Jun, Hou, Jue, Bonzel, Clara-Lea, Zhao, Yihan, Castro, Victor M., Gainer, Vivian S., Weisenfeld, Dana, Cai, Tianrun, Ho, Yuk-Lam, Panickan, Vidul A., Costa, Lauren, Hong, Chuan, Gaziano, J. Michael, Liao, Katherine P., Lu, Junwei, Cho, Kelly, Cai, Tianxi
Electronic health record (EHR) data are increasingly used to support real-world evidence (RWE) studies. Yet their ability to generate reliable RWE is limited by the lack of readily available, precise information on the timing of clinical events, such as the onset time of heart failure. We propose a LAbel-efficienT incidenT phEnotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging pre-trained semantic embedding vectors from large-scale EHR data as prior knowledge, LATTE selects predictive EHR features in a concept re-weighting module by mining their relationship to the target event, and compresses their information into longitudinal visit embeddings through a visit attention learning network. LATTE employs a recurrent neural network to capture the sequential dependency between the target event and the visit embeddings before/after it. To improve label efficiency, LATTE constructs highly informative longitudinal silver-standard labels from large numbers of unlabeled patients to perform unsupervised pre-training and semi-supervised joint training. Finally, LATTE enhances cross-site portability via contrastive representation learning. LATTE is evaluated on three analyses: the onset of type-2 diabetes, the onset of heart failure, and the onset and relapses of multiple sclerosis. We use various evaluation metrics from the literature, including $ABC_{gain}$, the proportion by which the area between the observed event indicator and the predicted cumulative incidence is reduced, relative to a prediction based on incident prevalence. LATTE consistently achieves substantial improvement over benchmark methods such as SAMGEP and RETAIN in all settings.
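A heavily simplified PyTorch sketch of the ingredients named above: frozen concept embeddings, a concept re-weighting module, attention-pooled visit embeddings, and a recurrent layer over visits. Dimensions, layer choices, and the output head are assumptions for illustration and do not reproduce the LATTE implementation (pre-training, silver-standard labels, and contrastive learning are omitted).

```python
# Simplified incident-phenotyping sketch with visit attention and a GRU.
import torch
import torch.nn as nn

class IncidentPhenotyper(nn.Module):
    def __init__(self, concept_emb, hidden=64):
        super().__init__()
        n_concepts, dim = concept_emb.shape
        # Pre-trained semantic embeddings of EHR concepts, kept frozen.
        self.emb = nn.Embedding.from_pretrained(concept_emb, freeze=True)
        # Concept re-weighting: one learned relevance weight per concept.
        self.concept_weight = nn.Parameter(torch.zeros(n_concepts))
        # Attention scorer for pooling concepts within a visit.
        self.attn = nn.Linear(dim, 1)
        self.rnn = nn.GRU(dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # per-visit event score

    def forward(self, visits):
        # visits: (batch, n_visits, n_codes) integer concept ids
        e = self.emb(visits) * torch.sigmoid(self.concept_weight)[visits].unsqueeze(-1)
        a = torch.softmax(self.attn(e).squeeze(-1), dim=-1)      # (B, V, C)
        visit_emb = torch.einsum("bvc,bvcd->bvd", a, e)          # attention pooling
        h, _ = self.rnn(visit_emb)
        return torch.sigmoid(self.head(h)).squeeze(-1)           # (B, V) event probs

# Usage with random placeholder data:
emb = torch.randn(500, 128)                  # 500 concepts, 128-dim embeddings
model = IncidentPhenotyper(emb)
codes = torch.randint(0, 500, (2, 10, 20))   # 2 patients, 10 visits, 20 codes each
print(model(codes).shape)                    # torch.Size([2, 10])
```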
Linguistic Constructs as the Representation of the Domain Model in an Intelligent Language Tutoring System
Katinskaia, Anisia, Hou, Jue, Vu, Anh-Duc, Yangarber, Roman
This paper presents the development of Revita, an AI-based language-learning platform. It is a freely available intelligent online tutor, developed to support learners of multiple languages, from low-intermediate to advanced levels. It has been in pilot use by hundreds of students at several universities, whose feedback and needs are shaping its development. One of the main emerging features of Revita is the introduction of a system of linguistic constructs as the representation of domain knowledge. The system of constructs is developed in close collaboration with experts in language teaching. Constructs define the types of exercises and the content of the feedback, and enable detailed modeling and evaluation of learning progress.
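As a purely hypothetical illustration (not Revita's actual schema) of what a construct record in such a domain model might contain, one could represent each construct with an identifier, a language, and the exercise types and feedback it licenses:

```python
# Hypothetical representation of a linguistic construct in a domain model.
from dataclasses import dataclass, field

@dataclass
class Construct:
    construct_id: str                  # e.g. "fi.noun.case.illative" (made-up id)
    language: str
    description: str
    exercise_types: list[str] = field(default_factory=list)
    feedback_templates: list[str] = field(default_factory=list)

illative = Construct(
    construct_id="fi.noun.case.illative",
    language="Finnish",
    description="Use of the illative case to express direction into something.",
    exercise_types=["cloze", "multiple-choice"],
    feedback_templates=["Which case answers the question 'mihin?' (into what)?"],
)
```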
High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss
Lin, Yucong, Su, Jinhua, Li, Yuhang, Wei, Yuhao, Yan, Hanchao, Zhang, Saining, Luo, Jiaan, Ai, Danni, Song, Hong, Fan, Jingfan, Fu, Tianyu, Xiao, Deqiang, Wang, Feifei, Hou, Jue, Yang, Jian
Fully automatic methods for segmentation tasks such as liver and liver tumor segmentation, brain and brain tumor segmentation, optic disc segmentation, cell segmentation, lung segmentation, pulmonary nodule segmentation, and cardiac image segmentation [2] are essential for the diagnosis of serious diseases [3]. Therefore, it is important to improve the efficiency and accuracy of medical image segmentation methods. Medical image segmentation involves segmenting specific organs (e.g., the pancreas, liver, and bladder), determining certain functional parts of an organ (e.g., cardiac segmentation and retinal vessel segmentation), and identifying tumors in the organs. Medical images can generally be categorized according to the imaging technology and the data form. Imaging technologies include X-ray radiography, computed tomography, magnetic resonance imaging (MRI), and ultrasound imaging. Raw measurements are transformed into pixelated imaging data as part of the standard process. Although the original data are mostly three-dimensional images, two-dimensional slices are often created according to clinical procedure protocols that target specific applications. Most medical image segmentation methods are designed for two-dimensional slices.
Surrogate Assisted Semi-supervised Inference for High Dimensional Risk Prediction
Hou, Jue, Guo, Zijian, Cai, Tianxi
Risk modeling with EHR data is challenging due to the lack of direct observations of the disease outcome and the high dimensionality of the candidate predictors. In this paper, we develop a surrogate-assisted semi-supervised learning (SAS) approach to risk modeling with high-dimensional predictors, leveraging a large unlabeled dataset with candidate predictors and surrogates of the outcome, as well as a small labeled dataset with annotated outcomes. The SAS procedure borrows information from the surrogates, along with the candidate predictors, to impute the unobserved outcomes via a sparse working imputation model; moment conditions provide robustness against mis-specification of the imputation model, and a one-step bias correction enables interval estimation for the predicted risk. We demonstrate that the SAS procedure provides valid inference for the predicted risk derived from a high-dimensional working model, even when the underlying risk prediction model is dense and the risk model is mis-specified. We present an extensive simulation study to demonstrate the superiority of our SAS approach compared to existing supervised methods. We apply the method to derive genetic risk prediction of type-2 diabetes mellitus using an EHR biobank cohort.
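A simplified sketch of the surrogate-assisted imputation idea, assuming generic array inputs and sklearn-style estimators; the moment-condition calibration and the one-step bias correction described above are omitted, so this is an outline of the data flow rather than the SAS procedure itself.

```python
# Surrogate-assisted semi-supervised sketch: impute outcomes on unlabeled data,
# then fit a sparse risk model on the imputed (soft) labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sas_risk_model(X_lab, S_lab, y_lab, X_unlab, S_unlab, C=0.1):
    """X_*: candidate predictors, S_*: surrogate features, y_lab: annotated outcomes."""
    # Step 1: sparse working imputation model on the small labeled set,
    # using surrogates together with the candidate predictors.
    imputer = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    imputer.fit(np.hstack([X_lab, S_lab]), y_lab)
    y_imputed = imputer.predict_proba(np.hstack([X_unlab, S_unlab]))[:, 1]

    # Step 2: high-dimensional risk model on predictors only, trained against
    # the imputed outcomes via a weighted (soft-label) logistic fit.
    risk = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    X_all = np.vstack([X_unlab, X_unlab])
    y_all = np.concatenate([np.ones(len(X_unlab)), np.zeros(len(X_unlab))])
    w_all = np.concatenate([y_imputed, 1.0 - y_imputed])
    risk.fit(X_all, y_all, sample_weight=w_all)
    return risk
```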
Estimating Treatment Effect under Additive Hazards Models with High-dimensional Covariates
Hou, Jue, Bradic, Jelena, Xu, Ronghui
Estimating causal effects for survival outcomes in the high-dimensional setting is an extremely important topic for many biomedical applications as well as areas of the social sciences. We propose a new orthogonal score method for treatment effect estimation and inference that results in asymptotically valid confidence intervals, assuming only good estimation properties of the hazard outcome model and of the conditional probability of treatment. This guarantee allows us to provide valid inference for the conditional treatment effect under the high-dimensional additive hazards model with considerably more generality than existing approaches. In addition, we develop a new Hazards Difference (HDi) estimator. We show that our approach has double-robustness properties in high dimensions: with cross-fitting, the HDi estimate is consistent under a wide variety of treatment assignment models; the HDi estimate is also consistent when the hazards model is misspecified and the true data-generating mechanism instead follows a partially linear additive hazards model. We further develop a novel sparsity double-robustness result, where either the outcome or the treatment model can be a fully dense high-dimensional model. We apply our methods to study the treatment effect of radical prostatectomy versus conservative management for prostate cancer patients, using the SEER-Medicare Linked Data.
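For background, one standard way to write an additive hazards model with a binary treatment is shown below (the Lin-Ying form); the paper's exact model, score construction, and assumptions may differ in details.

```latex
% Background only: additive hazards model with binary treatment A and
% high-dimensional covariates X.
\[
  \lambda(t \mid A, X) \;=\; \lambda_0(t) + A\,\theta + X^{\top}\beta ,
\]
% where $\lambda_0(t)$ is an unspecified baseline hazard and $\theta$, the
% additive effect of treatment on the hazard, is the hazards difference
% targeted by an HDi-type estimator.
```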
Fine-Gray competing risks model with high-dimensional covariates: estimation and inference
Hou, Jue, Bradic, Jelena, Xu, Ronghui
The purpose of this paper is to construct confidence intervals for the regression coefficients in the Fine-Gray model for competing risks data with random censoring, where the number of covariates can be larger than the sample size. Despite strong motivation from biostatistics applications, the high-dimensional Fine-Gray model has attracted relatively little attention in the methodological or theoretical literature. We fill this gap by proposing first a consistent regularized estimator and then confidence intervals based on a one-step bias-correcting estimator. We are able to generalize the partial likelihood approach for the Fine-Gray model under random censoring despite many technical difficulties. We lay down a methodological and theoretical framework for the one-step bias-correcting estimator with the partial likelihood, which does not have independent and identically distributed entries. Our theory also handles the approximation error from inverse probability weighting (IPW), for which we propose novel concentration results for time-dependent processes. In addition to the theoretical results and algorithms, we present extensive numerical experiments and an application to a study of non-cancer mortality among prostate cancer patients using the linked Medicare-SEER data.
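For background, the Fine-Gray model places a proportional-hazards structure on the subdistribution hazard of the event of interest; the textbook form is shown below, and the paper's notation may differ.

```latex
% Background only: Fine-Gray subdistribution hazard for the event of interest (cause 1).
\[
  \lambda_1(t \mid Z) \;=\; \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,
  \Pr\bigl\{ t \le T < t+\Delta t,\ \epsilon = 1 \,\bigm|\, T \ge t \ \text{or}\ (T < t,\ \epsilon \ne 1),\ Z \bigr\}
  \;=\; \lambda_{10}(t)\, \exp(Z^{\top}\beta),
\]
% where $T$ is the event time, $\epsilon$ the event type, $\lambda_{10}(t)$ an
% unspecified baseline subdistribution hazard, and $\beta$ the regression
% coefficients for which the confidence intervals are constructed.
```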