category detection
Estimating the strength and timing of syntactic structure building in naturalistic reading
A central question in psycholinguistics is the timing of syntax in sentence processing. Much of the existing evidence comes from violation paradigms, which conflate two separable processes - syntactic category detection and phrase structure construction - and implicitly assume that phrase structure follows category detection. In this study, we use co-registered EEG and eye-tracking data from the ZuCo corpus to disentangle these processes and test their temporal order under naturalistic reading conditions. Analyses of gaze transitions showed that readers preferentially moved between syntactic heads, suggesting that phrase structures, rather than serial word order, organize scanpaths. Bayesian network modeling further revealed that structural depth was the strongest driver of deviations from linear reading, outweighing lexical familiarity and surprisal. Finally, fixation-related potentials demonstrated that syntactic surprisal influences neural activity before word onset (-184 to -10 ms) and during early integration (48 to 300 ms). These findings extend current models of syntactic timing by showing that phrase structure construction can precede category detection and dominate lexical influences, supporting a predictive "tree-scaffolding" account of comprehension.
Liu
Open category detection is the problem of detecting "alien" test instances that belong to categories/classes that were not present in the training data. In many applications, reliably detecting such aliens is central to ensuring safety and/or quality of test data analysis. Unfortunately, to the best of our knowledge, there are no algorithms that provide theoretical guarantees on their ability to detect aliens under general assumptions. Further, while there are algorithms for open category detection, there are few empirical results that directly report alien-detection rates. Thus, there are significant theoretical and empirical gaps in our understanding of open category detection.
Representation Learning for Aspect Category Detection in Online Reviews
Zhou, Xinjie (Peking University) | Wan, Xiaojun (Peking University) | Xiao, Jianguo (Peking University)
User-generated reviews are valuable resources for decision making. Identifying the aspect categories discussed in a given review sentence (e.g., โfoodโ and โserviceโ in restaurant reviews) is an important task of sentiment analysis and opinion mining. Given a predefined aspect category set, most previous researches leverage hand-crafted features and a classification algorithm to accomplish the task. The crucial step to achieve better performance is feature engineering which consumes much human effort and may be unstable when the product domain changes. In this paper, we propose a representation learning approach to automatically learn useful features for aspect category detection. Specifically, a semi-supervised word embedding algorithm is first proposed to obtain continuous word representations on a large set of reviews with noisy labels. Afterwards, we propose to generate deeper and hybrid features through neural networks stacked on the word vectors. A logistic regression classifier is finally trained with the hybrid features to predict the aspect category. The experiments are carried out on a benchmark dataset released by SemEval-2014. Our approach achieves the state-of-the-art performance and outperforms the best participating team as well as a few strong baselines.