 deepctrl


Controlling Neural Networks with Rule Representations

Neural Information Processing Systems

DNNs get more accurate as the size and coverage of training data increase [17]. While investing in high-quality and large-scale labeled data is one path, another is utilizing prior knowledge - concisely referred to as 'rules': reasoning heuristics, equations, associative logic, constraints or blacklists.


Controlling Neural Networks with Rule Representations

Neural Information Processing Systems

We propose a novel training method that integrates rules into deep learning in such a way that the strength of the rules is controllable at inference. Deep Neural Networks with Controllable Rule Representations (DeepCTRL) incorporates a rule encoder into the model coupled with a rule-based objective, enabling a shared representation for decision making. DeepCTRL is agnostic to data type and model architecture. It can be applied to any kind of rule defined for inputs and outputs. The key aspect of DeepCTRL is that it does not require retraining to adapt the rule strength -- at inference, the user can adjust it based on the desired operating point on accuracy vs. rule verification ratio. In real-world domains where incorporating rules is critical -- such as Physics, Retail and Healthcare -- we show the effectiveness of DeepCTRL in teaching rules for deep learning. DeepCTRL improves the trust and reliability of the trained models by significantly increasing their rule verification ratio, while also providing accuracy gains at downstream tasks. Additionally, DeepCTRL enables novel use cases such as hypothesis testing of the rules on data samples, and unsupervised adaptation based on shared rules between datasets.
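The abstract names the ingredients (a rule encoder, a rule-based objective, a shared representation, and a rule strength that can be changed at inference) but not how they fit together. The sketch below is one plausible reading, not the authors' implementation. It assumes a scalar strength `alpha` in [0, 1] that scales a rule representation and a data representation before they are concatenated, and a loss that is a convex combination of a rule loss and a task loss. All names, dimensions and the randomly initialised weights are hypothetical. The rule verification ratio is read here as the fraction of samples whose output satisfies the rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and random weights, for illustration only (nothing is trained).
D_IN, D_HID = 4, 8
W_rule = rng.normal(size=(D_IN, D_HID))  # stands in for the rule encoder
W_data = rng.normal(size=(D_IN, D_HID))  # stands in for the data encoder
W_out = rng.normal(size=(2 * D_HID, 1))  # stands in for the decision block


def forward(x, alpha):
    """Assumed forward pass: scale each representation by alpha, then concatenate."""
    z_rule = np.tanh(x @ W_rule)
    z_data = np.tanh(x @ W_data)
    z = np.concatenate([alpha * z_rule, (1.0 - alpha) * z_data], axis=-1)
    return z @ W_out


def combined_loss(task_loss, rule_loss, alpha):
    """Assumed objective: convex combination of the rule loss and the task loss."""
    return alpha * rule_loss + (1.0 - alpha) * task_loss


def rule_verification_ratio(satisfied):
    """Fraction of samples whose output satisfies the rule (boolean array in)."""
    return float(np.mean(satisfied))
```

Under these assumptions, calling `forward(x, 0.0)` ignores the rule representation entirely and `forward(x, 1.0)` ignores the data representation, so sweeping `alpha` at inference moves between the two without touching the weights.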


10 Biggest Algorithmic Breakthroughs of 2022

#artificialintelligence

As artificial intelligence grows more capable, demand for it keeps rising across fields. That surge calls for faster chips, more data, and better algorithms. Here are the top algorithmic breakthroughs that have become essential in the modern AI developer's toolbox. Meta AI contributed to the challenging area of automated theorem proving with HTPS, a deep learning model that solved several International Math Olympiad (IMO) problems. The method demonstrates that deep neural networks are capable of mathematical reasoning.


Google AI researchers present a new method to train models, 'DeepCTRL'

#artificialintelligence

Deep neural networks (DNNs) provide increasingly accurate outputs as the volume and variety of their training data increase. While investing in high-quality, large-scale labelled datasets is one strategy to improve models, another is to apply "rules" – reasoning heuristics, equations, associative logic, or limitations. Consider the classic physics problem of forecasting the future state of a double pendulum system using a model. While the model may learn to predict the system's total energy at a given point in time from empirical data alone, it will often overestimate the energy unless given an equation that incorporates known physical constraints, such as energy conservation. The model can't represent such well-established physical principles on its own.
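To make the double pendulum example concrete, here is a minimal sketch of what an energy rule could look like in code. It is not taken from the article. It assumes point masses on massless rods, angles measured from the downward vertical, and a rule stating that a predicted state may not carry more total energy than the current state. The function names, default masses and lengths, and the tolerance are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def double_pendulum_energy(theta1, theta2, omega1, omega2,
                           m1=1.0, m2=1.0, l1=1.0, l2=1.0):
    """Total mechanical energy (kinetic + potential) of an ideal double pendulum."""
    kinetic = (
        0.5 * m1 * (l1 * omega1) ** 2
        + 0.5 * m2 * (
            (l1 * omega1) ** 2
            + (l2 * omega2) ** 2
            + 2 * l1 * l2 * omega1 * omega2 * math.cos(theta1 - theta2)
        )
    )
    potential = (
        -(m1 + m2) * G * l1 * math.cos(theta1)
        - m2 * G * l2 * math.cos(theta2)
    )
    return kinetic + potential


def energy_rule_satisfied(current_state, predicted_state, tol=1e-9):
    """Assumed rule: the predicted state must not have more energy than the current one."""
    e_now = double_pendulum_energy(*current_state)
    e_next = double_pendulum_energy(*predicted_state)
    return e_next <= e_now + tol
```

A state here is a tuple `(theta1, theta2, omega1, omega2)`. Counting how often `energy_rule_satisfied` returns `True` over a set of predictions gives one simple way to measure how well a model respects the constraint.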