fovea
JOINEDTrans: Prior Guided Multi-task Transformer for Joint Optic Disc/Cup Segmentation and Fovea Detection
He, Huaqing, Lin, Li, Cai, Zhiyuan, Cheng, Pujin, Tang, Xiaoying
Deep learning-based image segmentation and detection models have largely improved the efficiency of analyzing retinal landmarks such as the optic disc (OD), optic cup (OC), and fovea. However, factors including ophthalmic disease-related lesions and low image quality may severely complicate automatic OD/OC segmentation and fovea detection. Most existing works treat the identification of each landmark as a single task and do not take prior information into account. To address these issues, we propose a prior guided multi-task transformer framework for joint OD/OC segmentation and fovea detection, named JOINEDTrans. JOINEDTrans effectively combines various spatial features of the fundus images, relieving the structural distortions induced by lesions and other imaging issues. It contains a segmentation branch and a detection branch. Notably, we employ an encoder pretrained on a vessel segmentation task to exploit the positional relationship among vessels, OD/OC, and fovea, thereby incorporating a spatial prior into the proposed JOINEDTrans framework. JOINEDTrans operates in two stages, a coarse stage and a fine stage. In the coarse stage, OD/OC coarse segmentation and fovea heatmap localization are obtained through a joint segmentation and detection module. In the fine stage, we crop regions of interest for subsequent refinement and use predictions obtained in the coarse stage to provide additional information for better performance and faster convergence. Experimental results demonstrate that JOINEDTrans outperforms existing state-of-the-art methods on the publicly available GAMMA, REFUGE, and PALM fundus image datasets. We make our code available at https://github.com/HuaqingHe/JOINEDTrans
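The hand-off from a coarse heatmap to a cropped region of interest can be illustrated with a minimal Python sketch. It is not taken from the JOINEDTrans code: the function names `heatmap_peak` and `crop_roi`, the square window, and the clamping rule are all assumptions made for illustration.

```python
def heatmap_peak(heatmap):
    """Return the (row, col) of the largest value in a 2-D list of numbers."""
    best = (0, 0)
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > heatmap[best[0]][best[1]]:
                best = (r, c)
    return best


def crop_roi(image, center, half_size):
    """Crop a square window of side 2 * half_size + 1 around `center`.

    The window is shifted (not shrunk) when it would cross the image border,
    so the crop keeps a fixed size as long as the image is large enough.
    Returns the crop and the (top, left) offset needed to map refined
    coordinates back to the full image.
    """
    rows, cols = len(image), len(image[0])
    size = 2 * half_size + 1
    r, c = center
    top = max(0, min(r - half_size, rows - size))
    left = max(0, min(c - half_size, cols - size))
    crop = [row[left:left + size] for row in image[top:top + size]]
    return crop, (top, left)


# Toy 5x5 example: the coarse heatmap peaks at row 1, column 3.
heatmap = [[0.0] * 5 for _ in range(5)]
heatmap[1][3] = 0.9
image = [[10 * r + c for c in range(5)] for r in range(5)]

peak = heatmap_peak(heatmap)
roi, offset = crop_roi(image, peak, half_size=1)
```

A fine-stage model would then be run on `roi`, and its output shifted by `offset` to recover full-image coordinates.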
Active Dynamical Prospection: Modeling Mental Simulation as Particle Filtering for Sensorimotor Control during Pathfinding
What do humans do when confronted with a common challenge: knowing where we want to go, but not yet being sure of the best way to get there, or even whether we can get there at all? This is the problem posed to agents during spatial navigation and pathfinding, and its solution may give us clues about the more abstract domain of planning in general. In this work, we model pathfinding behavior in a continuous, explicitly exploratory paradigm. In our task, participants (and agents) must coordinate both visual exploration and navigation within a partially observable environment. Our contribution has three primary components: 1) an analysis of behavioral data from 81 human participants in a novel pathfinding paradigm conducted as an online experiment, 2) a proposal to model prospective mental simulation during navigation as particle filtering, and 3) an instantiation of this proposal in a computational agent. We show that our model, Active Dynamical Prospection, demonstrates similar patterns of map solution rate, path selection, and trial duration, as well as attentional behavior (at both aggregate and individual levels) when compared with data from human participants. We also find that both distal attention and delay prior to first move (both potential correlates of prospective simulation) are predictive of task performance.
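For readers unfamiliar with particle filtering, the following is a minimal sketch of a generic bootstrap particle filter on a one-dimensional tracking problem. It is not the Active Dynamical Prospection model: the 1-D state, the noise levels, and the names `particle_filter_step` and `run_demo` are assumptions chosen only to show the predict, weight, and resample cycle.

```python
import math
import random


def particle_filter_step(particles, control, observation, rng,
                         motion_std=0.2, obs_std=0.5):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # 1. Predict: push every hypothesis through a noisy motion model.
    moved = [p + control + rng.gauss(0.0, motion_std) for p in particles]
    # 2. Weight: score each hypothesis by the Gaussian likelihood of the observation.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in moved]
    if sum(weights) == 0.0:
        # Numerical underflow guard: fall back to uniform weights.
        weights = [1.0] * len(moved)
    # 3. Resample: draw a new population in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def run_demo(steps=20, n_particles=500, seed=0):
    """Track a point that moves one unit per step, starting from a vague prior."""
    rng = random.Random(seed)
    true_x = 2.0
    particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
    for _ in range(steps):
        true_x += 1.0
        observation = true_x + rng.gauss(0.0, 0.5)
        particles = particle_filter_step(particles, 1.0, observation, rng)
    estimate = sum(particles) / len(particles)
    return true_x, estimate


true_x, estimate = run_demo()
```

Each particle is one simulated hypothesis about the hidden state; the population mean in `estimate` serves as the filter's point estimate.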
Optic disc and fovea localisation in ultra-widefield scanning laser ophthalmoscope images captured in multiple modalities
Wakeford, Peter Robert, Pellegrini, Enrico, Robertson, Gavin, Verhoek, Michael, Fleming, Alan Duncan, van Hemert, Jano, Heng, Ik Siong
We propose a convolutional neural network for localising the centres of the optic disc (OD) and fovea in ultra-wide field of view scanning laser ophthalmoscope (UWFoV-SLO) images of the retina. Images captured in both reflectance and autofluorescence (AF) modes, and in both central pole and eye-steered gazes, were used. The method achieved an OD localisation accuracy of 99.4% within one OD radius, and a fovea localisation accuracy of 99.1% within one OD radius, on a test set comprising 1790 images. The performance of fovea localisation in AF images was comparable to the variation between human annotators at this task. The laterality of the image (whether the image is of the left or right eye) was inferred from the OD and fovea coordinates with an accuracy of 99.9%.
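The "within one OD radius" criterion can be made concrete with a short Python sketch. The function name `localisation_accuracy`, the use of a single radius for all images, and the toy coordinates are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
import math


def localisation_accuracy(predicted, annotated, od_radius):
    """Fraction of predictions whose Euclidean distance to the annotated
    landmark is at most one optic disc radius (all values in pixels)."""
    hits = sum(
        1
        for (px, py), (ax, ay) in zip(predicted, annotated)
        if math.hypot(px - ax, py - ay) <= od_radius
    )
    return hits / len(predicted)


# Toy example: the distances are 5, 30 and 1 pixels, with a radius of 5.
predicted = [(0, 0), (10, 10), (5, 5)]
annotated = [(3, 4), (10, 40), (5, 6)]
accuracy = localisation_accuracy(predicted, annotated, od_radius=5)
```

Here two of the three predictions fall within the radius, so `accuracy` is 2/3.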
Why Google DeepMind Is Putting AI on the Psychologist's Couch
Artificial intelligence can now carry out many of the same cognitive tasks humans can, but we still don't really understand how AIs think. Google DeepMind plans to train long-standing tests of human cognitive skills on machine minds to learn how they work. A long-standing problem in AI research has been the fact that deep neural networks are "black boxes." You can't tell how these algorithms work just by looking at their code. They teach themselves by training on data and there's no simple flow diagram a human can follow.
A POMDP Model of Eye-Hand Coordination
Erez, Tom (Washington University in St. Louis) | Tramper, Julian J. (Radboud University) | Smart, William D (Washington University in St. Louis) | Gielen, Stan CAM (Radboud University)
This paper presents a generative model of eye-hand coordination. We use numerical optimization to solve for the joint behavior of an eye and two hands, deriving a predicted motion pattern from first principles, without imposing heuristics. We model the planar scene as a POMDP with 17 continuous state dimensions. Belief-space optimization is facilitated by using a nominal-belief heuristic, whereby we assume (during planning) that the maximum likelihood observation is always obtained. Since a globally-optimal solution for such a high-dimensional domain is computationally intractable, we employ local optimization in the belief domain. By solving for a locally-optimal plan through belief space, we generate a motion pattern of mutual coordination between hands and eye: the eye's saccades disambiguate the scene in a task-relevant manner, and the hands' motions anticipate the eye's saccades. Finally, the model is validated through a behavioral experiment, in which human subjects perform the same eye-hand coordination task. We show how simulation is congruent with the experimental results.
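The effect of assuming the maximum likelihood observation during planning can be sketched for the simplest possible case. The example below assumes a one-dimensional Gaussian belief updated with a Kalman filter, which is an illustrative choice and not the paper's 17-dimensional model; the function name and all parameters are hypothetical.

```python
def nominal_belief_step(mean, var, control, process_var, obs_var):
    """Propagate a 1-D Gaussian belief one step, assuming the maximum
    likelihood observation is received."""
    # Predict: apply the control and inflate the uncertainty.
    mean_pred = mean + control
    var_pred = var + process_var
    # The most likely observation equals the predicted mean,
    # so the innovation (observation minus prediction) is zero.
    observation = mean_pred
    gain = var_pred / (var_pred + obs_var)
    mean_new = mean_pred + gain * (observation - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new


mean, var = nominal_belief_step(mean=0.0, var=1.0, control=2.0,
                                process_var=0.5, obs_var=1.5)
```

Under this assumption the belief mean follows the dynamics deterministically and only the variance changes, which makes the belief trajectory a deterministic function of the controls that a local optimizer can work with.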
Sensor Map Discovery for Developing Robots
Stober, Jeremy (The University of Texas at Austin) | Fishgold, Lewis (The University of Texas at Austin) | Kuipers, Benjamin (University of Michigan)
Modern mobile robots navigate uncertain environments using complex compositions of camera, laser, and sonar sensor data. Manual calibration of these sensors is a tedious process that involves determining sensor behavior, geometry and location through model specification and system identification. Instead, we seek to automate the construction of sensor model geometry by mining uninterpreted sensor streams for regularities. Manifold learning methods are powerful techniques for deriving sensor structure from streams of sensor data. In recent years, the proliferation of manifold learning algorithms has led to a variety of choices for autonomously generating models of sensor geometry. We present a series of comparisons between different manifold learning methods for discovering sensor geometry for the specific case of a mobile robot with a variety of sensors. We also explore the effect of control laws and sensor boundary size on the efficacy of manifold learning approaches. We find that "motor babbling" control laws generate better geometric sensor maps than mid-line or wall following control laws and identify a novel method for distinguishing boundary sensor elements. We also present a new learning method, sensorimotor embedding, that takes advantage of the controllable nature of robots to build sensor maps.
Saliency-Driven Image Acuity Modulation on a Reconfigurable Array of Spiking Silicon Neurons
Vogelstein, R. J., Mallik, Udayan, Culurciello, Eugenio, Cauwenberghs, Gert, Etienne-Cummings, Ralph
We have constructed a system that uses an array of 9,600 spiking silicon neurons, a fast microcontroller, and digital memory to implement a reconfigurable network of integrate-and-fire neurons. The system is designed for rapid prototyping of spiking neural networks that require high-throughput communication with external address-event hardware. Arbitrary network topologies can be implemented by selectively routing address-events to specific internal or external targets according to a memory-based projective field mapping. The utility and versatility of the system are demonstrated by configuring it as a three-stage network that accepts input from an address-event imager, detects salient regions of the image, and performs spatial acuity modulation around a high-resolution fovea that is centered on the location of highest salience.
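The idea of routing address-events through a memory-based lookup table into integrate-and-fire neurons can be sketched in software. This is only a conceptual illustration, not the hardware's actual behaviour: the table layout, the weights, the threshold, the reset-to-zero rule, and the name `route_events` are all assumptions.

```python
def route_events(events, projective_field, threshold=1.0):
    """Deliver each incoming address-event to its targets and return the
    addresses of the integrate-and-fire neurons that spike, in order.

    events           -- list of source addresses, one per incoming spike
    projective_field -- dict: source address -> list of (target address, weight)
    """
    potentials = {}  # membrane potential of each target neuron
    spikes = []
    for source in events:
        for target, weight in projective_field.get(source, []):
            potentials[target] = potentials.get(target, 0.0) + weight
            if potentials[target] >= threshold:
                spikes.append(target)     # emit an output address-event
                potentials[target] = 0.0  # reset after firing
    return spikes


# Two input pixels (addresses 0 and 1) project onto neurons 10 and 11.
projective_field = {
    0: [(10, 0.6)],
    1: [(10, 0.6), (11, 1.0)],
}
spikes = route_events([0, 1], projective_field)
```

Changing the network topology only requires editing the `projective_field` table, which mirrors the reconfigurability described in the abstract.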