cpc
Paper: Generalization of Reinforcement Learners with Working and Episodic Memory
We thank the reviewers for their thoughtful and constructive feedback on our manuscript. This should help both contextualize each task's difficulty and illustrate what it involves. Reviewer 3 noted the Section 2 task descriptions could be better presented. We have reformatted it so that "the order We also changed our description of IMPALA to match Reviewer 5's suggestion. Regarding the task suite, Reviewer 4 raised a thoughtful consideration on whether "most of the findings translate when Some 3D tasks in the suite already have '2D-like' semi-counterparts that do not require navigation, '2D-like' because everything is fully observable and the agent has a first-person point of view from a fixed point, without Spot the Difference level, was overall harder than Change Detection for our ablation models.
Supplementary for Neural Methods for Point-wise Dependency Estimation
Four approaches are discussed: Variational Bounds of Mutual Information, Density Matching, Probabilistic Classifier, and Density-Ratio Fitting. Proposition 3 (IJS and its neural estimation, restating Jensen-Shannon bound with f-GAN objective [22]). We adopt the "concatenate critic" design [20, 22, 23] for our neural network parametrized function. Note that the Probabilistic Classifier method applies a sigmoid function to the outputs to ensure probabilistic outputs. To proceed, it suffices if we could provide an upper bound for PrS(|lS(θk)| ε/2).
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Jordan (0.04)
Supplementary for Neural Methods for Point-wise Dependency Estimation Yao-Hung Hubert Tsai
In this section, we shall show detailed derivations for the point-wise dependency estimation methods. We let M be any class of functions m: Ω → R. Work done at Carnegie Mellon University. This approach casts the PD estimation as the problem of estimating the 'class'-posterior probability. The MI neural estimation methods can be dissected into two procedures: learning and inference.
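As a rough illustration of the 'class'-posterior view mentioned above: if a classifier is trained to separate joint samples (label 1) from product-of-marginals samples (label 0), Bayes' rule gives the point-wise dependency p(x, y) / (p(x) p(y)) as the posterior odds scaled by the ratio of class sizes. The sketch below assumes the classifier emits a logit that is passed through a sigmoid. The function names and the n_joint / n_marginal parameters are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw classifier output to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def pd_from_logit(logit: float, n_joint: int = 1, n_marginal: int = 1) -> float:
    """Turn a classifier logit into a point-wise dependency estimate.

    Assumption: label 1 = sample from p(x, y), label 0 = sample from p(x)p(y).
    Then p(c=1 | x, y) / p(c=0 | x, y) = (n_joint / n_marginal) * PD(x, y),
    so PD is the posterior odds times n_marginal / n_joint.
    """
    p = sigmoid(logit)
    odds = p / (1.0 - p)
    return odds * (n_marginal / n_joint)


if __name__ == "__main__":
    # A logit of 0 means the classifier cannot tell the two classes apart,
    # which corresponds to independence at that point when classes are balanced.
    print(pd_from_logit(0.0))
```

With balanced classes, positive logits correspond to pairs that co-occur more often than independence would predict, and negative logits to pairs that co-occur less often.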
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Improving constraint-based discovery with robust propagation and reliable LLM priors
Lyu, Ruiqi, Turcan, Alistair, Zhang, Martin Jinye, Wilder, Bryan
Learning causal structure from observational data is central to scientific modeling and decision-making. Constraint-based methods aim to recover conditional independence (CI) relations in a causal directed acyclic graph (DAG). Classical approaches such as PC and subsequent methods orient v-structures first and then propagate edge directions from these seeds, assuming perfect CI tests and exhaustive search of separating subsets -- assumptions often violated in practice, leading to cascading errors in the final graph. Recent work has explored using large language models (LLMs) as experts, prompting sets of nodes for edge directions, and could augment edge orientation when assumptions are not met. However, such methods implicitly assume perfect experts, which is unrealistic for hallucination-prone LLMs. We propose MosaCD, a causal discovery method that propagates edges from a high-confidence set of seeds derived from both CI tests and LLM annotations. To filter hallucinations, we introduce shuffled queries that exploit LLMs' positional bias, retaining only high-confidence seeds. We then apply a novel confidence-down propagation strategy that orients the most reliable edges first, and can be integrated with any skeleton-based discovery method. Across multiple real-world graphs, MosaCD achieves higher accuracy in final graph construction than existing constraint-based methods, largely due to the improved reliability of initial seeds and robust propagation strategies.
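One way to picture the shuffled-query idea in the abstract above is to ask for an edge direction twice, with the answer options in swapped order, and to keep the answer only if it does not change. This is a minimal sketch under that assumption and is not the MosaCD implementation; the `ask` callable, the `Edge` type, and the mock experts are all hypothetical.

```python
from typing import Callable, Optional, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def consistent_orientation(
    u: str, v: str, ask: Callable[[Edge, Edge], Edge]
) -> Optional[Edge]:
    """Query a (hypothetical) expert with both option orders.

    `ask(first_option, second_option)` returns the option the expert picks.
    An expert that merely favours a position will flip its answer when the
    options are swapped, so disagreement is treated as low confidence.
    """
    option_uv, option_vu = (u, v), (v, u)
    first_answer = ask(option_uv, option_vu)
    second_answer = ask(option_vu, option_uv)
    return first_answer if first_answer == second_answer else None


def position_biased(first: Edge, second: Edge) -> Edge:
    """Mock expert that always picks whatever is listed first."""
    return first


def informed(first: Edge, second: Edge) -> Edge:
    """Mock expert that picks the same direction regardless of order."""
    return ("smoking", "cancer")


if __name__ == "__main__":
    print(consistent_orientation("smoking", "cancer", position_biased))
    print(consistent_orientation("smoking", "cancer", informed))
```

In this toy setup the position-biased expert is filtered out, while the order-invariant expert yields a seed edge.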
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Advancing Medical Artificial Intelligence Using a Century of Cases
Buckley, Thomas A., Conci, Riccardo, Brodeur, Peter G., Gusdorf, Jason, Beltrán, Sourik, Behrouzi, Bita, Crowe, Byron, Dockterman, Jacob, Muhammad, Muzzammil, Ohnigian, Sarah, Sanchez, Andrew, Diao, James A., Shah, Aashna P., Restrepo, Daniel, Rosenberg, Eric S., Lea, Andrew S., Zitnik, Marinka, Podolsky, Scott H., Kanjee, Zahir, Abdulnour, Raja-Elie E., Koshy, Jacob M., Rodman, Adam, Manrai, Arjun K.
BACKGROUND: For over a century, the New England Journal of Medicine Clinicopathological Conferences (CPCs) have tested the reasoning of expert physicians and, recently, artificial intelligence (AI). However, prior AI evaluations have focused on final diagnoses without addressing the multifaceted reasoning and presentation skills required of expert discussants. METHODS: Using 7102 CPCs (1923-2025) and 1021 Image Challenges (2006-2025), we conducted extensive physician annotation and automated processing to create CPC-Bench, a physician-validated benchmark spanning 10 text-based and multimodal tasks, against which we evaluated leading large language models (LLMs). Then, we developed "Dr. CaBot," an AI discussant designed to produce written and slide-based video presentations using only the case presentation, modeling the role of the human expert in these cases. RESULTS: When challenged with 377 contemporary CPCs, o3 (OpenAI) ranked the final diagnosis first in 60% of cases and within the top ten in 84% of cases, outperforming a 20-physician baseline; next-test selection accuracy reached 98%. Event-level physician annotations quantified AI diagnostic accuracy per unit of information. Performance was lower on literature search and image tasks; o3 and Gemini 2.5 Pro (Google) achieved 67% accuracy on image challenges. In blinded comparisons of CaBot vs. human expert-generated text, physicians misclassified the source of the differential in 46 of 62 (74%) trials, and scored CaBot more favorably across quality dimensions. To promote research, we are releasing CaBot and CPC-Bench. CONCLUSIONS: LLMs exceed physician performance on complex text-based differential diagnosis and convincingly emulate expert medical presentations, but image interpretation and literature retrieval remain weaker. CPC-Bench and CaBot may enable transparent and continued tracking of progress in medical AI.
- North America > United States > Massachusetts > Suffolk County > Boston (0.05)
- North America > United States > New York > Monroe County > Rochester (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
CPC-CMS: Cognitive Pairwise Comparison Classification Model Selection Framework for Document-level Sentiment Analysis
Li, Jianfei, Yuen, Kevin Kam Fung
This study proposes the Cognitive Pairwise Comparison Classification Model Selection (CPC-CMS) framework for document-level sentiment analysis. The CPC, based on expert knowledge judgment, is used to calculate the weights of evaluation criteria, including accuracy, precision, recall, F1-score, specificity, Matthews Correlation Coefficient (MCC), Cohen's Kappa (Kappa), and efficiency. Naive Bayes, Linear Support Vector Classification (LSVC), Random Forest, Logistic Regression, Extreme Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM), and A Lite Bidirectional Encoder Representations from Transformers (ALBERT) are chosen as classification baseline models. A weighted decision matrix, consisting of classification evaluation scores with respect to criteria weights, is formed to select the best classification model for a classification problem. Three open social media datasets are used to demonstrate the feasibility of the proposed CPC-CMS. Based on our simulation, for evaluation results excluding the time factor, ALBERT is the best for the three datasets; if time consumption is included, no single model always performs better than the other models. The CPC-CMS can be applied to other classification applications in different areas.
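The weighted decision matrix described above can be sketched as a weighted sum of per-criterion scores. The example below is only an illustration: it does not reproduce how CPC derives the weights, and the model names, scores, and weights are made-up placeholders rather than results from the study.

```python
from typing import Dict, Tuple

Scores = Dict[str, Dict[str, float]]  # model -> criterion -> score in [0, 1]


def select_model(scores: Scores, weights: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Rank models by the weighted sum of their criterion scores.

    Weights are renormalised to sum to 1 so that dropping a criterion
    (for example, efficiency) still yields comparable overall scores.
    """
    total = sum(weights.values())
    norm = {criterion: w / total for criterion, w in weights.items()}
    overall = {
        model: sum(norm[c] * model_scores[c] for c in norm)
        for model, model_scores in scores.items()
    }
    best = max(overall, key=overall.get)
    return best, overall


# Hypothetical scores: model_a is more accurate, model_b is much faster.
scores = {
    "model_a": {"accuracy": 0.90, "f1": 0.88, "efficiency": 0.20},
    "model_b": {"accuracy": 0.85, "f1": 0.84, "efficiency": 0.90},
}
# Hypothetical criteria weights (in CPC-CMS these would come from expert judgment).
weights_with_time = {"accuracy": 0.5, "f1": 0.3, "efficiency": 0.2}
weights_without_time = {"accuracy": 0.5, "f1": 0.3}

if __name__ == "__main__":
    print(select_model(scores, weights_with_time))
    print(select_model(scores, weights_without_time))
```

In this toy data the preferred model changes once efficiency is weighted in, which is the kind of trade-off the abstract reports when time consumption is included.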
- Asia > China > Hong Kong (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Data Normalization Strategies for EEG Deep Learning
Normalization is a critical yet often overlooked component in the preprocessing pipeline for EEG deep learning applications. The rise of large-scale pretraining paradigms such as self-supervised learning (SSL) introduces a new set of tasks whose nature is substantially different from supervised training common in EEG deep learning applications. This raises new questions about optimal normalization strategies for the applicable task. In this study, we systematically evaluate the impact of normalization granularity (recording vs. window level) and scope (cross-channel vs. within-channel) on both supervised (age and gender prediction) and self-supervised (Contrastive Predictive Coding) tasks. Using high-density resting-state EEG from 2,836 subjects in the Healthy Brain Network dataset, we show that optimal normalization strategies differ significantly between training paradigms. Window-level within-channel normalization yields the best performance in supervised tasks, while minimal or cross-channel normalization at the window level is more effective for SSL. These results underscore the necessity of task-specific normalization choices and challenge the assumption that a universal normalization strategy can generalize across learning settings. Our findings provide practical insights for developing robust EEG deep learning pipelines as the field shifts toward large-scale, foundation model training.
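The two axes of variation named above, granularity (recording vs. window) and scope (cross-channel vs. within-channel), can be made concrete as a choice of which array axes the statistics are computed over. The sketch below assumes z-score normalization and an array layout of (n_windows, n_channels, n_samples) for a single recording; both are assumptions for illustration, and the function names are hypothetical rather than taken from the study.

```python
import numpy as np


def zscore(x: np.ndarray, axes: tuple, eps: float = 1e-8) -> np.ndarray:
    """Subtract the mean and divide by the standard deviation over `axes`."""
    mean = x.mean(axis=axes, keepdims=True)
    std = x.std(axis=axes, keepdims=True)
    return (x - mean) / (std + eps)


def normalize_windows(windows: np.ndarray, granularity: str, scope: str) -> np.ndarray:
    """Normalize EEG windows cut from one recording.

    windows: array of shape (n_windows, n_channels, n_samples).
    granularity: "recording" pools statistics over all windows,
                 "window" computes them separately per window.
    scope: "within" keeps one statistic per channel,
           "cross" pools statistics across channels.
    """
    if scope not in ("within", "cross"):
        raise ValueError(f"unknown scope: {scope}")
    if granularity == "recording":
        axes = (0, 2) if scope == "within" else (0, 1, 2)
    elif granularity == "window":
        axes = (2,) if scope == "within" else (1, 2)
    else:
        raise ValueError(f"unknown granularity: {granularity}")
    return zscore(windows, axes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(loc=5.0, scale=3.0, size=(4, 8, 256))
    out = normalize_windows(demo, granularity="window", scope="within")
    print(out.shape)
```

Window-level within-channel normalization removes per-channel offsets and scale in every window, whereas cross-channel normalization preserves relative amplitude differences between channels.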
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)