Problem Solving
Reviews: Expressive power of tensor-network factorizations for probabilistic modeling
The authors compare the ranks of tensor representations of HMMs and of the outputs of quantum circuits with two-qubit unitary gates, which yield Matrix Product States (MPS) and, when ancillary unmeasured bits are present, so-called Locally Purified States (LPS). A general comment: Born machines automatically enforce positivity, but is it clear that (3) and (4) are less than 1? Do the A's come from some unitary circuits in the SM? If so, the main problem formulation in Sect. 2 does not seem self-contained. Some are more surprising, namely the very large (at least of the order of the number of qubits) difference in rank when one works over the real field versus the complex field.
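The normalization question raised above can be illustrated with a minimal sketch. It assumes a generic open-boundary MPS Born machine with random complex cores and an explicit division by the partition sum; equations (3) and (4) of the paper are not reproduced in this excerpt, so this is not necessarily the authors' formulation. The names `mps_amplitude` and `born_probabilities` are hypothetical helpers introduced for illustration.

```python
import itertools
import numpy as np

def mps_amplitude(cores, x):
    """Contract an open-boundary MPS for one configuration x."""
    v = cores[0][:, x[0], :]              # shape (1, D)
    for core, xi in zip(cores[1:], x[1:]):
        v = v @ core[:, xi, :]            # multiply by the next site matrix
    return v[0, 0]                        # final shape is (1, 1)

def born_probabilities(cores, d):
    """Squared-modulus weights over all configurations, normalized explicitly."""
    n = len(cores)
    configs = list(itertools.product(range(d), repeat=n))
    amps = np.array([mps_amplitude(cores, x) for x in configs])
    weights = np.abs(amps) ** 2           # non-negative by construction
    return configs, weights / weights.sum()

# Assumed toy sizes: 4 sites, local dimension 2, bond dimension 3.
rng = np.random.default_rng(0)
n, d, D = 4, 2, 3
shapes = [(1, d, D)] + [(D, d, D)] * (n - 2) + [(D, d, 1)]
cores = [rng.normal(size=s) + 1j * rng.normal(size=s) for s in shapes]
configs, probs = born_probabilities(cores, d)
```

Positivity comes for free from the squared modulus. The bound by 1 holds here only because of the explicit division by the sum of weights; without it, or without cores derived from a unitary circuit, the raw weights need not be bounded by 1.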
Reviews: On the Expressive Power of Deep Polynomial Neural Networks
Post-rebuttal: After reading the authors' response and further consideration, I am downgrading my score to 7 from 9. While I am still very excited about the new perspective this work brings, I now realize that there is still a lot of work remaining in order to tie the theoretical results to real-world phenomena. Regardless of whether the paper gets accepted, I'd ask the authors to make the gap clearer and to lay out more clearly an agenda for future work that addresses the various issues discussed in the rebuttal, e.g., approximation, empirical notions of filling, etc. ORIGINALITY The paper considers the functional space of polynomial networks as an algebraic object. The authors use tools from algebraic geometry to analyze the dimension of the Zariski closure of this space. The paper is highly original in relating recent results from algebra to basic issues about neural networks. QUALITY & CLARITY This work tackles head-on the problem of analyzing the functional space of polynomial varieties.
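As a rough illustration of what "dimension of the functional space" means, the dimension can be estimated numerically as the rank of the Jacobian of the parameter-to-function map at a random parameter point. This excerpt does not state that the paper computes it this way. The sketch assumes a toy network f(x) = sum_j w2[j] * (W1[j] . x)^r with two inputs, two hidden units, and r = 2; `jacobian_wrt_params` is a hypothetical helper.

```python
import numpy as np

def jacobian_wrt_params(W1, w2, X, r):
    """Jacobian of f(x) = sum_j w2[j] * (W1[j] @ x)**r, evaluated at sample points X.

    Rows index sample points; columns index the entries of W1, then of w2.
    """
    H = X @ W1.T                                   # (N, h) pre-activations
    d_w2 = H ** r                                  # df/dw2[j] = (W1[j] @ x)**r
    # df/dW1[j, i] = w2[j] * r * (W1[j] @ x)**(r - 1) * x[i]
    d_W1 = (w2 * r * H ** (r - 1))[:, :, None] * X[:, None, :]   # (N, h, d)
    return np.hstack([d_W1.reshape(len(X), -1), d_w2])

# Assumed toy architecture: 2 inputs, 2 hidden units, 1 output, activation x**2.
rng = np.random.default_rng(0)
d_in, hidden, r = 2, 2, 2
W1 = rng.normal(size=(hidden, d_in))
w2 = rng.normal(size=hidden)
X = rng.normal(size=(20, d_in))                    # generic evaluation points

jac = jacobian_wrt_params(W1, w2, X, r)
rank = np.linalg.matrix_rank(jac)                  # estimated dimension
```

In this toy case there are 6 parameters, but the outputs are quadratic forms in two variables, a 3-dimensional space, so the rank cannot exceed 3. Reaching that bound is what "filling" refers to in this setting.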
Review for NeurIPS paper: Generating Correct Answers for Progressive Matrices Intelligence Tests
Weaknesses: My first concern is that this model seems far from minimalism. Generating correct answers for RPM is an interesting task. But one of the reasons it is interesting to the current AI community is that humans can somehow generate some results correctly without a huge amount of training. Although this work demonstrates the possibility of a generator that shows some reasoning capability, I strongly suspect that this is a distillation from the subnetworks for context extraction, which are trained with strong supervision. There is still a long distance between this model and the human brain. The latter is believed to be designed by nature following minimalism.
Reviews: Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
Compact search spaces would confer computational benefits if nothing else. Overall, studying how compact representations of the state might compare when used inside graph search seems like a nice way to evaluate just how much utility is added by the distributional RL component of the overall approach.
DarkMind: Latent Chain-of-Thought Backdoor in Customized LLMs
With the growing demand for personalized AI solutions, customized LLMs have become a preferred choice for businesses and individuals, driving the deployment of millions of AI agents across various platforms; for example, the GPT Store hosts over 3 million customized GPTs. Their popularity is partly driven by advanced reasoning capabilities, such as Chain-of-Thought, which enhance their ability to tackle complex tasks. However, their rapid proliferation introduces new vulnerabilities, particularly in reasoning processes that remain largely unexplored. We introduce DarkMind, a novel backdoor attack that exploits the reasoning capabilities of customized LLMs. Designed to remain latent, DarkMind activates within the reasoning chain to covertly alter the final outcome. Unlike existing attacks, it operates without injecting triggers into user queries, making it a more potent threat. We evaluate DarkMind across eight datasets covering arithmetic, commonsense, and symbolic reasoning domains, using five state-of-the-art LLMs with five distinct trigger implementations. Our results demonstrate DarkMind's effectiveness across all scenarios, underscoring its impact. Finally, we explore potential defense mechanisms to mitigate its risks, emphasizing the need for stronger security measures.
Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains
Chu, Xu, Tan, Zhijie, Xue, Hanlin, Wang, Guanyu, Mo, Tong, Li, Weiping
Large Language Models (LLMs) are widely applied to downstream domains. However, current LLMs for high-stakes domain tasks, such as financial investment and legal QA, typically generate brief answers without reasoning processes and explanations. This limits users' confidence in making decisions based on their responses. While original CoT shows promise, it lacks self-correction mechanisms during reasoning. This work introduces Domain$o1$s, which enhances LLMs' reasoning capabilities on domain tasks through supervised fine-tuning and tree search. We construct CoT-stock-2k and CoT-legal-2k datasets for fine-tuning models that activate domain-specific reasoning steps based on their judgment. Additionally, we propose Selective Tree Exploration to spontaneously explore solution spaces and sample optimal reasoning paths to improve performance. We also introduce PROOF-Score, a new metric for evaluating domain models' explainability, complementing traditional accuracy metrics with richer assessment dimensions. Extensive experiments on stock investment recommendation and legal reasoning QA tasks demonstrate Domaino1s's leading performance and explainability. Our code is available at https://anonymous.4open.science/r/Domaino1s-006F/.
Towards Human-Guided, Data-Centric LLM Co-Pilots
Saveliev, Evgeny, Liu, Jiashuo, Seedat, Nabeel, Boyd, Anders, van der Schaar, Mihaela
Machine learning (ML) has the potential to revolutionize various domains, but its adoption is often hindered by the disconnect between the needs of domain experts and the translation of these needs into robust and valid ML tools. Despite recent advances in LLM-based co-pilots to democratize ML for non-technical domain experts, these systems remain predominantly focused on model-centric aspects while overlooking critical data-centric challenges. This limitation is problematic in complex real-world settings where raw data often contains complex issues, such as missing values, label noise, and domain-specific nuances requiring tailored handling. To address this, we introduce CliMB-DC, a human-guided, data-centric framework for LLM co-pilots that combines advanced data-centric tools with LLM-driven reasoning to enable robust, context-aware data processing. At its core, CliMB-DC introduces a novel, multi-agent reasoning system that combines a strategic coordinator for dynamic planning and adaptation with a specialized worker agent for precise execution. Domain expertise is then systematically incorporated to guide the reasoning process using a human-in-the-loop approach. To guide development, we formalize a taxonomy of key data-centric challenges that co-pilots must address. Thereafter, to address the dimensions of the taxonomy, we integrate state-of-the-art data-centric tools into an extensible, open-source architecture, facilitating the addition of new tools from the research community. Empirically, using real-world healthcare datasets, we demonstrate CliMB-DC's ability to transform uncurated datasets into ML-ready formats, significantly outperforming existing co-pilot baselines for handling data-centric challenges. CliMB-DC promises to empower domain experts from diverse domains -- healthcare, finance, social sciences and more -- to actively participate in driving real-world impact using ML.