AITopics | scf

scf

1 Supplementary

Neural Information Processing Systems, Feb-16-2026, 20:55:14 GMT

Code and data to replicate our experiments can be found at https://github.com/ppope/rho-learn. 1.1 DFT Relaxations: We use the PBE exchange-correlation functional for all relaxations. In particular, a much smaller model was used than the state-of-the-art SCN results. An SCF run may be initialized with a custom density, e.g. one generated from a machine-learning

artificial intelligence, machine learning, quantum espresso, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Library Liberation: Competitive Performance Matmul Through Compiler-composed Nanokernels

Thangamani, Arun, Shahid, Md Asghar Ahmad, Siemieniuk, Adam, Morel, Rolf, Golin, Renato, Heinecke, Alexander

arXiv.org Artificial Intelligence, Nov-19-2025

The rapidly evolving landscape of AI and machine learning workloads has widened the gap between high-level domain operations and efficient hardware utilization. Achieving near-peak performance still demands deep hardware expertise: experts either handcraft target-specific kernels (e.g., DeepSeek) or rely on specialized libraries (e.g., CUTLASS), and both approaches add complexity and limit scalability for most ML practitioners. This paper introduces a compilation scheme that automatically generates scalable, high-performance microkernels by leveraging MLIR dialects to bridge domain-level operations and processor capabilities. Our approach removes dependence on low-level libraries by enabling the compiler to auto-generate near-optimal code directly. At its core is a mechanism for composing nanokernels from low-level IR constructs with near-optimal register utilization, forming efficient microkernels tailored to each target. We implement this technique in an MLIR-based compiler supporting both vector and tile-based CPU instructions. Experiments show that the generated nanokernels are of production quality and competitive with state-of-the-art microkernel libraries.
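As a rough illustration of the idea of composing a matmul from small fixed-size kernels, the following Python sketch tiles the output matrix and fills each tile with a sequence of rank-1 updates. It is not the paper's MLIR pipeline: the names `nanokernel` and `tiled_matmul` and the tile sizes `mr=4`, `nr=8` are assumptions made for this example, and a real compiler would keep the accumulator tile in registers rather than in a NumPy array.

```python
import numpy as np

def nanokernel(acc, a_panel, b_panel):
    """Accumulate a_panel @ b_panel into acc as a sequence of rank-1 updates.

    acc: (mr, nr) accumulator tile, a_panel: (mr, k), b_panel: (k, nr).
    """
    for p in range(a_panel.shape[1]):
        acc += np.outer(a_panel[:, p], b_panel[p, :])
    return acc

def tiled_matmul(A, B, mr=4, nr=8):
    """Compute A @ B tile by tile; mr and nr are hypothetical tile sizes."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n), dtype=np.result_type(A, B))
    for i in range(0, m, mr):
        for j in range(0, n, nr):
            # Edge tiles may be smaller than mr x nr.
            acc = np.zeros((min(mr, m - i), min(nr, n - j)), dtype=C.dtype)
            C[i:i + mr, j:j + nr] = nanokernel(acc, A[i:i + mr, :], B[:, j:j + nr])
    return C
```

The result can be compared against `A @ B` with `np.allclose` for any compatible shapes, including shapes that are not multiples of the tile sizes.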

artificial intelligence, machine learning, vector, (17 more...)

2511.13764

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Category-Level Object Shape and Pose Estimation in Less Than a Millisecond

Shaikewitz, Lorenzo, Nguyen, Tim, Carlone, Luca

arXiv.org Artificial Intelligence, Sep-24-2025

Object shape and pose estimation is a foundational robotics problem, supporting tasks from manipulation to scene understanding and navigation. We present a fast local solver for shape and pose estimation which requires only category-level object priors and admits an efficient certificate of global optimality. Given an RGB-D image of an object, we use a learned front-end to detect sparse, category-level semantic keypoints on the target object. We represent the target object's unknown shape using a linear active shape model and pose a maximum a posteriori optimization problem to solve for position, orientation, and shape simultaneously. Expressed in unit quaternions, this problem admits first-order optimality conditions in the form of an eigenvalue problem with eigenvector nonlinearities. Our primary contribution is to solve this problem efficiently with self-consistent field iteration, which only requires computing a 4-by-4 matrix and finding its minimum eigenvalue-vector pair at each iterate. Solving a linear system for the corresponding Lagrange multipliers gives a simple global optimality certificate. One iteration of our solver runs in about 100 microseconds, enabling fast outlier rejection. We test our method on synthetic data and a variety of real-world settings, including two public datasets and a drone tracking scenario. Code is released at https://github.com/MIT-SPARK/Fast-ShapeAndPose.
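The self-consistent field iteration mentioned in the abstract can be sketched generically as follows: build a matrix from the current iterate, take its minimum eigenvector, and repeat until the vector stops changing. This is a minimal sketch under stated assumptions, not the authors' solver: the matrix `A`, the function `toy_matrix`, and the coupling constant `c` are hypothetical stand-ins for the problem-specific 4-by-4 matrix described in the paper.

```python
import numpy as np

# Hypothetical 4-by-4 symmetric base matrix, chosen only for illustration.
A = np.diag([1.0, 2.0, 3.0, 4.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))

def toy_matrix(q, c=0.2):
    """Toy eigenvector nonlinearity: the matrix depends on the current unit vector q."""
    return A + c * np.diag(q ** 2)

def scf_min_eigvec(build_matrix, q0, tol=1e-10, max_iter=200):
    """Iterate q <- minimum eigenvector of build_matrix(q) until q stops changing."""
    q = q0 / np.linalg.norm(q0)
    lam = None
    for it in range(1, max_iter + 1):
        w, V = np.linalg.eigh(build_matrix(q))  # eigenvalues in ascending order
        q_new, lam = V[:, 0], w[0]
        if q_new @ q < 0:  # eigenvectors are defined up to sign; keep it consistent
            q_new = -q_new
        if np.linalg.norm(q_new - q) < tol:
            return q_new, lam, it
        q = q_new
    return q, lam, max_iter

q, lam, iters = scf_min_eigvec(toy_matrix, np.ones(4))
print(iters, np.linalg.norm(toy_matrix(q) @ q - lam * q))
```

At a fixed point, `q` is an eigenvector of the matrix built from `q` itself, which is the self-consistency condition; the printed residual measures how closely that holds.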

artificial intelligence, optimization problem, rotation, (18 more...)

2509.18979

Country: North America > United States > Massachusetts (0.46)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

d961e9f236177d65d21100592edb0769-Supplemental.pdf

Neural Information Processing Systems, Aug-16-2025, 17:43:06 GMT

co-sod method, icnet, visual comparison, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.71)

Set-Rationalizable Choice and Self-Stability

Brandt, Felix, Harrenstein, Paul

arXiv.org Artificial Intelligence, Jul-22-2025

A common assumption in modern microeconomic theory is that choice should be rationalizable via a binary preference relation, which Sen showed to be equivalent to two consistency conditions, namely $\alpha$ (contraction) and $\gamma$ (expansion). Within the context of social choice, however, rationalizability and similar notions of consistency have proved to be highly problematic, as witnessed by a range of impossibility results, among which Arrow's is the most prominent. Since choice functions select sets of alternatives rather than single alternatives, we propose to rationalize choice functions by preference relations over sets (set-rationalizability). We also introduce two consistency conditions, $\hat{\alpha}$ and $\hat{\gamma}$, which are defined in analogy to $\alpha$ and $\gamma$, and find that a choice function is set-rationalizable if and only if it satisfies $\hat{\alpha}$. Moreover, a choice function satisfies $\hat{\alpha}$ and $\hat{\gamma}$ if and only if it is self-stable, a new concept based on earlier work by Dutta. The class of self-stable social choice functions contains a number of appealing Condorcet extensions such as the minimal covering set and the essential set.

artificial intelligence, choice function, relation, (14 more...)

doi: 10.1016/j.jet.2011.03.006

0910.3580

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Cognitive Science (0.58)

Conversation Forests: The Key to Fine Tuning Large Language Models for Multi-Turn Medical Conversations is Branching

Savage, Thomas

arXiv.org Artificial Intelligence, Jul-16-2025

Fine-tuning methods such as Direct Preference Optimization (DPO) and Group Relative Policy Optimization (GRPO) have demonstrated success in training large language models (LLMs) for single-turn tasks. However, these methods fall short in multi-turn applications, such as diagnostic patient interviewing, where understanding how early conversational turns influence downstream completions and outcomes is essential. In medicine, a multi-turn perspective is critical for learning diagnostic schemas and better understanding conversation dynamics. To address this gap, I introduce Savage Conversation Forests (SCF), a reinforcement learning framework that leverages a branched conversation architecture to fine-tune LLMs for multi-turn dialogue. SCF generates multiple possible conversation continuations at each turn, enabling the model to learn how different early responses affect downstream interactions and diagnostic outcomes. In experiments simulating doctor-patient conversations, SCF with branching outperforms linear conversation architectures on diagnostic accuracy. I hypothesize that SCF's improvements stem from its ability to provide richer, interdependent training signals across conversation turns. These results suggest that a branched training architecture is an important strategy for fine-tuning LLMs in complex multi-turn conversational tasks.
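The branching idea can be pictured as a tree in which every turn spawns several candidate continuations and outcome scores at the leaves are averaged back up to earlier turns. The sketch below only illustrates that data structure: `Node`, `grow`, `backup`, and `leaves` are hypothetical names, the `generate` and `score` callables stand in for an LLM and a diagnostic-accuracy reward, and averaging child values is an assumed way to credit early turns, not the paper's training objective.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Node:
    history: List[str]                      # conversation turns so far
    children: List["Node"] = field(default_factory=list)
    value: float = 0.0                      # mean outcome score of the subtree

def grow(node: Node, generate: Callable[[List[str], int], str],
         branching: int, depth: int) -> None:
    """Expand `branching` continuations per turn, `depth` turns deep."""
    if depth == 0:
        return
    for k in range(branching):
        child = Node(history=node.history + [generate(node.history, k)])
        node.children.append(child)
        grow(child, generate, branching, depth - 1)

def backup(node: Node, score: Callable[[List[str]], float]) -> float:
    """Score leaves, then set each inner node's value to the mean of its children."""
    if not node.children:
        node.value = score(node.history)
    else:
        node.value = sum(backup(c, score) for c in node.children) / len(node.children)
    return node.value

def leaves(node: Node) -> List[Node]:
    """Collect all leaf nodes (complete conversations)."""
    if not node.children:
        return [node]
    return [leaf for c in node.children for leaf in leaves(c)]
```

With a branching factor of `b` and `d` turns, the tree holds `b**d` complete conversations, so each early turn is evaluated through many downstream outcomes rather than a single linear rollout.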

large language model, machine learning, natural language, (20 more...)

2507.04099

Country: North America > United States > Pennsylvania (0.15)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Diagnostic Medicine (0.49)
Health & Medicine > Health Care Technology > Medical Record (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction

Zhang, He, Liu, Chang, Wang, Zun, Wei, Xinran, Liu, Siyuan, Zheng, Nanning, Shao, Bin, Liu, Tie-Yan

arXiv.org Artificial Intelligence, Jun-5-2024

Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an exact training method that does not require labeled data. It distinguishes the task from predicting other molecular properties by the following benefits: (1) it enables the model to be trained on a large amount of unlabeled data, hence addresses the data scarcity challenge and enhances generalization; (2) it is more efficient than running DFT to generate labels for supervised training, since it amortizes DFT calculation over a set of queries. We empirically demonstrate the better generalization in data-scarce and out-of-distribution scenarios, and the better efficiency over DFT labeling. These benefits push forward the applicability of Hamiltonian prediction to an ever-larger scale.
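One way to read the self-consistency principle is as a label-free residual: a predicted Hamiltonian should reproduce itself when the density it implies is fed back through the mean-field construction. The sketch below illustrates that reading on a toy model and is not the paper's loss. It assumes an orthonormal basis (no overlap matrix), and `toy_build_hamiltonian`, `H_CORE`, and the coupling `g` are hypothetical placeholders for a real DFT Hamiltonian build.

```python
import numpy as np

def density_from_hamiltonian(H, n_occ):
    """Density matrix from the n_occ lowest eigenvectors (orthonormal basis assumed)."""
    _, V = np.linalg.eigh(H)
    C = V[:, :n_occ]
    return C @ C.T

def toy_build_hamiltonian(D, H_core, g=0.2):
    """Hypothetical mean-field map: core term plus a density-dependent diagonal."""
    return H_core + g * np.diag(np.diag(D))

def self_consistency_loss(H_pred, H_core, n_occ):
    """Squared mismatch between H_pred and the Hamiltonian rebuilt from its own density."""
    D = density_from_hamiltonian(H_pred, n_occ)
    H_rebuilt = toy_build_hamiltonian(D, H_core)
    return float(np.sum((H_pred - H_rebuilt) ** 2))

def scf_fixed_point(H_core, n_occ, n_iter=100):
    """Plain fixed-point iteration, used here only to obtain a reference solution."""
    H = H_core.copy()
    for _ in range(n_iter):
        H = toy_build_hamiltonian(density_from_hamiltonian(H, n_occ), H_core)
    return H

# Hypothetical core Hamiltonian for the demo.
H_CORE = np.diag([1.0, 2.0, 3.0, 4.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))
print(self_consistency_loss(H_CORE, H_CORE, 2))                      # not self-consistent
print(self_consistency_loss(scf_fixed_point(H_CORE, 2), H_CORE, 2))  # near zero
```

The loss needs no reference label: it vanishes at the self-consistent solution and is positive for an inconsistent guess, which is what would let it act as a training signal on unlabeled structures.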

molecule, prediction, self-consistency training, (16 more...)

2403.0956

Country:

North America > United States (0.45)
Europe > Austria > Vienna (0.14)
Asia > China > Guangxi Province > Nanning (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Government > Regional Government > North America Government > United States Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models

Boutilier, Craig, Mladenov, Martin, Tennenholtz, Guy

arXiv.org Artificial Intelligence, Sep-21-2023

Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors. Despite this, the focus of the majority of recommender research (and of most practical recommenders of any import) is on the local, myopic optimization of the recommendations made to individual users. This comes at a significant cost to the long-term utility that recommenders could generate for their users. We argue that explicitly modeling the incentives and behaviors of all actors in the system, and the interactions among them induced by the recommender's policy, is strictly necessary if one is to maximize the value the system brings to these actors and improve overall ecosystem "health". Doing so requires: optimization over long horizons using techniques such as reinforcement learning; making inevitable tradeoffs in the utility that can be generated for different actors using the methods of social choice; reducing information asymmetry, while accounting for incentives and strategic behavior, using the tools of mechanism design; better modeling of both user and item-provider behaviors by incorporating notions from behavioral economics and psychology; and exploiting recent advances in generative and foundation models to make these mechanisms interpretable and actionable. We propose a conceptual framework that encompasses these elements, and articulate a number of research challenges that emerge at the intersection of these different disciplines.

creator, mechanism design, proceedings, (15 more...)

2309.06375

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(27 more...)

Genre:

Instructional Material (0.67)
Research Report (0.63)

Industry:

Leisure & Entertainment (0.92)
Marketing (0.92)
Information Technology > Services (0.67)
Media > Music (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

OpenHLS: High-Level Synthesis for Low-Latency Deep Neural Networks for Experimental Science

Levental, Maksim, Khan, Arham, Chard, Ryan, Yoshii, Kazutomo, Chard, Kyle, Foster, Ian

arXiv.org Artificial Intelligence, Mar-15-2023

In many experiment-driven scientific domains, such as high-energy physics, material science, and cosmology, high data rate experiments impose hard constraints on data acquisition systems: collected data must either be indiscriminately stored for post-processing and analysis, thereby necessitating large storage capacity, or accurately filtered in real time, thereby necessitating low-latency processing. Deep neural networks, effective in other filtering tasks, have not been widely employed in such data acquisition systems, due to design and deployment difficulties. We present OpenHLS, an open-source, lightweight compiler framework without any proprietary dependencies, based on high-level synthesis techniques, for translating high-level representations of deep neural networks to low-level representations suitable for deployment to near-sensor devices such as field-programmable gate arrays. We evaluate OpenHLS on various workloads and present a case-study implementation of a deep neural network for Bragg peak detection in the context of high-energy diffraction microscopy. We show OpenHLS is able to produce an implementation of the network with a throughput of 4.8 $\mu$s/sample, which is approximately a 4$\times$ improvement over the existing implementation.

artificial intelligence, machine learning, operation, (19 more...)

2302.06751

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory

Li, Tianbo, Lin, Min, Hu, Zheyuan, Zheng, Kunhao, Vignale, Giovanni, Kawaguchi, Kenji, Neto, A. H. Castro, Novoselov, Kostya S., Yan, Shuicheng

arXiv.org Artificial Intelligence, Mar-1-2023

Kohn-Sham Density Functional Theory (KS-DFT) has traditionally been solved by the Self-Consistent Field (SCF) method. Behind the SCF loop is the physical intuition of solving a system of non-interacting single-electron wave functions under an effective potential. In this work, we propose a deep learning approach to KS-DFT. First, in contrast to the conventional SCF loop, we propose to directly minimize the total energy by reparameterizing the orthogonality constraint as a feed-forward computation. We prove that such an approach has the same expressivity as the SCF method, yet reduces the computational complexity from O(N^4) to O(N^3). Second, the numerical integration, which involves a summation over the quadrature grids, can be amortized over the optimization steps. At each step, stochastic gradient descent (SGD) is performed with a sampled minibatch of the grids. Extensive experiments are carried out to demonstrate the advantage of our approach in terms of efficiency and stability. In addition, we show that our approach enables us to explore more complex neural-based wave functions.
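The contrast with an SCF loop can be illustrated on a toy problem: instead of repeatedly diagonalizing, keep an unconstrained parameter matrix `W`, obtain orthonormal orbitals through a feed-forward map, and run plain gradient descent on the energy. The sketch below assumes a QR factorization as that map and a fixed symmetric matrix `H` with energy `trace(Q.T @ H @ Q)`, so it is a linear stand-in rather than a real Kohn-Sham functional; the function names, step size, and step count are illustrative choices.

```python
import numpy as np

def energy_and_grad(W, H):
    """Energy E(W) = trace(Q^T H Q) with Q from W = Q R, and its gradient w.r.t. W.

    For this energy the gradient has the closed form 2 (I - Q Q^T) H Q R^{-T}.
    """
    Q, R = np.linalg.qr(W)
    E = np.trace(Q.T @ H @ Q)
    X = H @ Q
    X = X - Q @ (Q.T @ X)                 # project out the occupied subspace
    G = 2.0 * np.linalg.solve(R, X.T).T   # equals 2 * X @ inv(R).T
    return E, G, Q

def minimize_energy(H, k, lr=0.05, steps=3000, seed=0):
    """Unconstrained gradient descent on W; orthonormality is enforced only via QR."""
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.standard_normal((H.shape[0], k)))[0]
    for _ in range(steps):
        _, G, _ = energy_and_grad(W, H)
        W = W - lr * G
    E, _, Q = energy_and_grad(W, H)
    return E, Q

# Hypothetical 6-by-6 symmetric "Hamiltonian" for the demo.
H = np.diag(np.arange(1.0, 7.0)) + 0.1 * (np.ones((6, 6)) - np.eye(6))
E, Q = minimize_energy(H, k=2)
print(E, np.linalg.eigvalsh(H)[:2].sum())
```

For this linear toy problem the minimum is the sum of the `k` lowest eigenvalues of `H`, which gives a direct check that the reparameterized descent reaches the same answer as diagonalization.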

artificial intelligence, machine learning, wave function, (17 more...)

2303.00399

Country:

Asia > Singapore (0.04)
North America > United States > Minnesota (0.04)

Genre: Research Report (0.50)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
