drnet
Dynamic Resolution Network
Deep convolutional neural networks (CNNs) are often of sophisticated design with numerous learnable parameters for the accuracy reason. To alleviate the expensive costs of deploying them on mobile devices, recent works have made huge efforts for excavating redundancy in pre-defined architectures. Nevertheless, the redundancy on the input resolution of modern CNNs has not been fully investigated, i.e., the resolution of input image is fixed. In this paper, we observe that the smallest resolution for accurately predicting the given image is different using the same neural network. To this end, we propose a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically based on each input sample. Wherein, a resolution predictor with negligible computational costs is explored and optimized jointly with the desired network.
Dynamic Resolution Network
Deep convolutional neural networks (CNNs) are often of sophisticated design with numerous learnable parameters for the accuracy reason. To alleviate the expensive costs of deploying them on mobile devices, recent works have made huge efforts for excavating redundancy in pre-defined architectures. Nevertheless, the redundancy on the input resolution of modern CNNs has not been fully investigated, i.e., the resolution of input image is fixed. In this paper, we observe that the smallest resolution for accurately predicting the given image is different using the same neural network. To this end, we propose a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically based on each input sample. Wherein, a resolution predictor with negligible computational costs is explored and optimized jointly with the desired network.
Learning Visual Abstract Reasoning through Dual-Stream Networks
Zhao, Kai, Xu, Chang, Si, Bailu
Visual abstract reasoning tasks present challenges for deep neural networks, exposing limitations in their capabilities. In this work, we present a neural network model that addresses the challenges posed by Raven's Progressive Matrices (RPM). Inspired by the two-stream hypothesis of visual processing, we introduce the Dual-stream Reasoning Network (DRNet), which utilizes two parallel branches to capture image features. On top of the two streams, a reasoning module first learns to merge the high-level features of the same image. Then, it employs a rule extractor to handle combinations involving the eight context images and each candidate image, extracting discrete abstract rules and utilizing an multilayer perceptron (MLP) to make predictions. Empirical results demonstrate that the proposed DRNet achieves state-of-the-art average performance across multiple RPM benchmarks. Furthermore, DRNet demonstrates robust generalization capabilities, even extending to various out-of-distribution scenarios. The dual streams within DRNet serve distinct functions by addressing local or spatial information. They are then integrated into the reasoning module, leveraging abstract rules to facilitate the execution of visual reasoning tasks. These findings indicate that the dual-stream architecture could play a crucial role in visual abstract reasoning.
- Asia > China > Beijing > Beijing (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Austria (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
DRNet: A Decision-Making Method for Autonomous Lane Changingwith Deep Reinforcement Learning
Xu, Kunpeng, Chen, Lifei, Wang, Shengrui
Machine learning techniques have outperformed numerous rule-based methods for decision-making in autonomous vehicles. Despite recent efforts, lane changing remains a major challenge, due to the complex driving scenarios and changeable social behaviors of surrounding vehicles. To help improve the state of the art, we propose to leveraging the emerging \underline{D}eep \underline{R}einforcement learning (DRL) approach for la\underline{NE} changing at the \underline{T}actical level. To this end, we present "DRNet", a novel and highly efficient DRL-based framework that enables a DRL agent to learn to drive by executing reasonable lane changing on simulated highways with an arbitrary number of lanes, and considering driving style of surrounding vehicles to make better decisions. Furthermore, to achieve a safe policy for decision-making, DRNet incorporates ideas from safety verification, the most important component of autonomous driving, to ensure that only safe actions are chosen at any time. The setting of our state representation and reward function enables the trained agent to take appropriate actions in a real-world-like simulator. Our DRL agent has the ability to learn the desired task without causing collisions and outperforms DDQN and other baseline models.
- North America > United States > California (0.04)
- North America > Canada > Quebec > Estrie Region > Sherbrooke (0.04)
- Asia > China > Fujian Province > Fuzhou (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
Gaze Estimation Approach Using Deep Differential Residual Network
Huang, Longzhao, Li, Yujie, Wang, Xu, Wang, Haoyu, Bouridane, Ahmed, Chaddad, Ahmad
Gaze estimation, which is a method to determine where a person is looking at given the person's full face, is a valuable clue for understanding human intention. Similarly to other domains of computer vision, deep learning (DL) methods have gained recognition in the gaze estimation domain. However, there are still gaze calibration problems in the gaze estimation domain, thus preventing existing methods from further improving the performances. An effective solution is to directly predict the difference information of two human eyes, such as the differential network (Diff-Nn). However, this solution results in a loss of accuracy when using only one inference image. We propose a differential residual model (DRNet) combined with a new loss function to make use of the difference information of two eye images. We treat the difference information as auxiliary information. We assess the proposed model (DRNet) mainly using two public datasets (1) MpiiGaze and (2) Eyediap. Considering only the eye features, DRNet outperforms the state-of-the-art gaze estimation methods with $angular-error$ of 4.57 and 6.14 using MpiiGaze and Eyediap datasets, respectively. Furthermore, the experimental results also demonstrate that DRNet is extremely robust to noise images.
- Asia > China (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > UAE > Sharjah Emirate > Sharjah (0.04)
DRNets can solve Sudoku, speed scientific discovery
Say you're driving with a friend in a familiar neighborhood, and the friend asks you to turn at the next intersection. The friend doesn't say which way to turn, but since you both know it's a one-way street, it's understood. That type of reasoning is at the heart of a new artificial-intelligence framework – tested successfully on overlapping Sudoku puzzles – that could speed discovery in materials science, renewable energy technology and other areas. An interdisciplinary research team led by Carla Gomes, the Ronald C. and Antonia V. Nielsen Professor of Computing and Information Science in the Cornell Ann S. Bowers College of Computing and Information Science, has developed Deep Reasoning Networks (DRNets), which combine deep learning – even with a relatively small amount of data – with an understanding of the subject's boundaries and rules, known as "constraint reasoning." Di Chen, a computer science doctoral student in Gomes' group, is first author of "Automating Crystal-Structure Phase Mapping by Combining Deep Learning with Constraint Reasoning," published Sept. 16 in Nature Machine Intelligence.
- Energy > Renewable (0.72)
- Leisure & Entertainment > Games > Sudoku (0.66)
Automating Crystal-Structure Phase Mapping: Combining Deep Learning with Constraint Reasoning
Chen, Di, Bai, Yiwei, Ament, Sebastian, Zhao, Wenting, Guevarra, Dan, Zhou, Lan, Selman, Bart, van Dover, R. Bruce, Gregoire, John M., Gomes, Carla P.
Crystal-structure phase mapping is a core, long-standing challenge in materials science that requires identifying crystal structures, or mixtures thereof, in synthesized materials. Materials science experts excel at solving simple systems but cannot solve complex systems, creating a major bottleneck in high-throughput materials discovery. Herein we show how to automate crystal-structure phase mapping. We formulate phase mapping as an unsupervised pattern demixing problem and describe how to solve it using Deep Reasoning Networks (DRNets). DRNets combine deep learning with constraint reasoning for incorporating scientific prior knowledge and consequently require only a modest amount of (unlabeled) data. DRNets compensate for the limited data by exploiting and magnifying the rich prior knowledge about the thermodynamic rules governing the mixtures of crystals with constraint reasoning seamlessly integrated into neural network optimization. DRNets are designed with an interpretable latent space for encoding prior-knowledge domain constraints and seamlessly integrate constraint reasoning into neural network optimization. DRNets surpass previous approaches on crystal-structure phase mapping, unraveling the Bi-Cu-V oxide phase diagram, and aiding the discovery of solar-fuels materials.
Deep Reasoning Networks: Thinking Fast and Slow
Chen, Di, Bai, Yiwei, Zhao, Wenting, Ament, Sebastian, Gregoire, John M., Gomes, Carla P.
We introduce Deep Reasoning Networks (DRNets), an end-to-end framework that combines deep learning with reasoning for solving complex tasks, typically in an unsupervised or weakly-supervised setting. DRNets exploit problem structure and prior knowledge by tightly combining logic and constraint reasoning with stochastic-gradient-based neural network optimization. We illustrate the power of DRNets on de-mixing overlapping hand-written Sudokus (Multi-MNIST-Sudoku) and on a substantially more complex task in scientific discovery that concerns inferring crystal structures of materials from X-ray diffraction data under thermodynamic rules (Crystal-Structure-Phase-Mapping). At a high level, DRNets encode a structured latent space of the input data, which is constrained to adhere to prior knowledge by a reasoning module. The structured latent encoding is used by a generative decoder to generate the targeted output. Finally, an overall objective combines responses from the generative decoder (thinking fast) and the reasoning module (thinking slow), which is optimized using constraint-aware stochastic gradient descent. We show how to encode different tasks as DRNets and demonstrate DRNets' effectiveness with detailed experiments: DRNets significantly outperform the state of the art and experts' capabilities on Crystal-Structure-Phase-Mapping, recovering more precise and physically meaningful crystal structures. On Multi-MNIST-Sudoku, DRNets perfectly recovered the mixed Sudokus' digits, with 100% digit accuracy, outperforming the supervised state-of-the-art MNIST de-mixing models. Finally, as a proof of concept, we also show how DRNets can solve standard combinatorial problems -- 9-by-9 Sudoku puzzles and Boolean satisfiability problems (SAT), outperforming other specialized deep learning models. DRNets are general and can be adapted and expanded to tackle other tasks.
- North America > United States > New York > Tompkins County > Ithaca (0.05)
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)