Goto

Collaborating Authors

 soma


SOMA: Feature Gradient Enhanced Affine-Flow Matching for SAR-Optical Registration

arXiv.org Artificial Intelligence

Achieving pixel-level registration between SAR and optical images remains a challenging task due to their fundamentally different imaging mechanisms and visual characteristics. Although deep learning has achieved great success in many cross-modal tasks, its performance on SAR-Optical registration tasks is still unsatisfactory. Gradient-based information has traditionally played a crucial role in handcrafted descriptors by highlighting structural differences. However, such gradient cues have not been effectively leveraged in deep learning frameworks for SAR-Optical image matching. To address this gap, we propose SOMA, a dense registration framework that integrates structural gradient priors into deep features and refines alignment through a hybrid matching strategy. Specifically, we introduce the Feature Gradient Enhancer (FGE), which embeds multi-scale, multi-directional gradient filters into the feature space using attention and reconstruction mechanisms to boost feature distinctiveness. Furthermore, we propose the Global-Local Affine-Flow Matcher (GLAM), which combines affine transformation and flow-based refinement within a coarse-to-fine architecture to ensure both structural consistency and local accuracy. Experimental results demonstrate that SOMA significantly improves registration precision, increasing the CMR@1px by 12.29% on the SEN1-2 dataset and 18.50% on the GFGE SO dataset. In addition, SOMA exhibits strong robustness and generalizes well across diverse scenes and resolutions.


An Ontology for Unified Modeling of Tasks, Actions, Environments, and Capabilities in Personal Service Robotics

arXiv.org Artificial Intelligence

Personal service robots are increasingly used in domestic settings to assist older adults and people requiring support. Effective operation involves not only physical interaction but also the ability to interpret dynamic environments, understand tasks, and choose appropriate actions based on context. This requires integrating both hardware components (e.g. sensors, actuators) and software systems capable of reasoning about tasks, environments, and robot capabilities. Frameworks such as the Robot Operating System (ROS) provide open-source tools that help connect low-level hardware with higher-level functionalities. However, real-world deployments remain tightly coupled to specific platforms. As a result, solutions are often isolated and hard-coded, limiting interoperability, reusability, and knowledge sharing. Ontologies and knowledge graphs offer a structured way to represent tasks, environments, and robot capabilities. Existing ontologies, such as the Socio-physical Model of Activities (SOMA) and the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE), provide models for activities, spatial relationships, and reasoning structures. However, they often focus on specific domains and do not fully capture the connection between environment, action, robot capabilities, and system-level integration. In this work, we propose the Ontology for roBOts and acTions (OntoBOT), which extends existing ontologies to provide a unified representation of tasks, actions, environments, and capabilities. Our contributions are twofold: (1) we unify these aspects into a cohesive ontology to support formal reasoning about task execution, and (2) we demonstrate its generalizability by evaluating competency questions across four embodied agents - TIAGo, HSR, UR3, and Stretch - showing how OntoBOT enables context-aware reasoning, task-oriented execution, and knowledge sharing in service robotics.


Bridging Bots: from Perception to Action via Multimodal-LMs and Knowledge Graphs

arXiv.org Artificial Intelligence

Personal service robots are deployed to support daily living in domestic environments, particularly for elderly and individuals requiring assistance. These robots must perceive complex and dynamic surroundings, understand tasks, and execute context-appropriate actions. However, current systems rely on proprietary, hard-coded solutions tied to specific hardware and software, resulting in siloed implementations that are difficult to adapt and scale across platforms. Ontologies and Knowledge Graphs (KGs) offer a solution to enable interoperability across systems, through structured and standardized representations of knowledge and reasoning. However, symbolic systems such as KGs and ontologies struggle with raw and noisy sensory input. In contrast, multimodal language models are well suited for interpreting input such as images and natural language, but often lack transparency, consistency, and knowledge grounding. In this work, we propose a neurosymbolic framework that combines the perceptual strengths of multimodal language models with the structured representations provided by KGs and ontologies, with the aim of supporting interoperability in robotic applications. Our approach generates ontology-compliant KGs that can inform robot behavior in a platform-independent manner. We evaluated this framework by integrating robot perception data, ontologies, and five multimodal models (three LLaMA and two GPT models), using different modes of neural-symbolic interaction. We assess the consistency and effectiveness of the generated KGs across multiple runs and configurations, and perform statistical analyzes to evaluate performance. Results show that GPT-o1 and LLaMA 4 Maverick consistently outperform other models. However, our findings also indicate that newer models do not guarantee better results, highlighting the critical role of the integration strategy in generating ontology-compliant KGs.


Extending the Benefits of Parallel Elasticity across Multiple Actuation Tasks: A Geometric and Optimization-Based Approach

arXiv.org Artificial Intelligence

A spring in parallel with an effort source (e.g., electric motor or human muscle) can reduce its energy consumption and effort (i.e., torque or force) depending on the spring stiffness, spring preload, and actuation task. However, selecting the spring stiffness and preload that guarantees effort or energy reduction for an arbitrary set of tasks is a design challenge. This work formulates a convex optimization problem to guarantee that a parallel spring reduces the root-mean-square source effort or energy consumption for multiple tasks. Specifically, we guarantee the benefits across multiple tasks by enforcing a set of convex quadratic constraints in our optimization variables -- the parallel spring stiffness and preload. These quadratic constraints are equivalent to ellipses in the stiffness and preload plane, any combination of stiffness and preload inside the ellipse represents a parallel spring that minimizes effort source or energy consumption with respect to an actuator without a spring. This geometric interpretation intuitively guides the stiffness and preload selection process. We analytically and experimentally prove the convex quadratic function of the spring stiffness and preload. As applications, we analyze the stiffness and preload selection of a parallel spring for a knee exoskeleton using human muscle as the effort source and a prosthetic ankle powered by electric motors. To promote adoption, the optimization and geometric methods are available as supplemental open-source software that can be executed in a web browser.


An Overlooked Role of Context-Sensitive Dendrites

arXiv.org Artificial Intelligence

To date, most dendritic studies have predominantly focused on the apical zone of pyramidal two-point neurons (TPNs) receiving only feedback (FB) connections from higher perceptual layers and using them for learning. Recent cellular neurophysiology and computational neuroscience studies suggests that the apical input (context), coming from feedback and lateral connections, is multifaceted and far more diverse, with greater implications for ongoing learning and processing in the brain than previously realized. In addition to the FB, the apical tuft receives signals from neighboring cells of the same network as proximal (P) context, other parts of the brain as distal (D) context, and overall coherent information across the network as universal (U) context. The integrated context (C) amplifies and suppresses the transmission of coherent and conflicting feedforward (FF) signals, respectively. Specifically, we show that complex context-sensitive (CS)-TPNs flexibly integrate C moment-by-moment with the FF somatic current at the soma such that the somatic current is amplified when both feedforward (FF) and C are coherent; otherwise, it is attenuated. This generates the event only when the FF and C currents are coherent, which is then translated into a singlet or a burst based on the FB information. Spiking simulation results show that this flexible integration of somatic and contextual currents enables the propagation of more coherent signals (bursts), making learning faster with fewer neurons. Similar behavior is observed when this functioning is used in conventional artificial networks, where orders of magnitude fewer neurons are required to process vast amounts of heterogeneous real-world audio-visual (AV) data trained using backpropagation (BP). The computational findings presented here demonstrate the universality of CS-TPNs, suggesting a dendritic narrative that was previously overlooked.


Self-training superconducting neuromorphic circuits using reinforcement learning rules

arXiv.org Artificial Intelligence

Reinforcement learning algorithms are used in a wide range of applications, from gaming and robotics to autonomous vehicles. In this paper we describe a set of reinforcement learning-based local weight update rules and their implementation in superconducting hardware. Using SPICE circuit simulations, we implement a small-scale neural network with a learning time of order one nanosecond. This network can be trained to learn new functions simply by changing the target output for a given set of inputs, without the need for any external adjustments to the network. In this implementation the weights are adjusted based on the current state of the overall network response and locally stored information about the previous action. This removes the need to program explicit weight values in these networks, which is one of the primary challenges that analog hardware implementations of neural networks face. The adjustment of weights is based on a global reinforcement signal that obviates the need for circuitry to back-propagate errors.


A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation

arXiv.org Artificial Intelligence

Neuronal morphology is essential for studying brain functioning and understanding neurodegenerative disorders. As the acquiring of real-world morphology data is expensive, computational approaches especially learning-based ones e.g. MorphVAE for morphology generation were recently studied, which are often conducted in a way of randomly augmenting a given authentic morphology to achieve plausibility. Under such a setting, this paper proposes \textbf{MorphGrower} which aims to generate more plausible morphology samples by mimicking the natural growth mechanism instead of a one-shot treatment as done in MorphVAE. Specifically, MorphGrower generates morphologies layer by layer synchronously and chooses a pair of sibling branches as the basic generation block, and the generation of each layer is conditioned on the morphological structure of previous layers and then generate morphologies via a conditional variational autoencoder with spherical latent space. Extensive experimental results on four real-world datasets demonstrate that MorphGrower outperforms MorphVAE by a notable margin. Our code will be publicly available to facilitate future research.


Implementing Simple Neural Network in C#

#artificialintelligence

Artificial Neural Networks are insipired by our nervous system. The general idea is that if we copy the brain structure, we will build a learning machine. As you are probably aware, the smallest unit of the nervous system is a neuron. These are cells with similar and simple structures. Yet, by continuous communication, these cells achieve enormous processing power.


Change with distance from the soma

Science

Neuroscience Hippocampal neurons receive and integrate synaptic input along their dendritic tree. Inputs located near the cell soma or in the distal dendrite contribute differently to neuronal integration by a variety of mechanisms, including NMDA receptor (NMDAR)–dependent plasticity processes. Using superresolution microscopy, single-nanoparticle imaging, and glutamate uncaging, Ferreira et al. investigated the nanoscale organization of NMDARs containing the subunits GluN2A and GluN2B along the dendritic tree. The organization and surface dynamics of GluN2B-NMDARs, but not of GluN2A-NMDARs, changed between proximal and distal clusters, with a gradual increase in receptor local density from proximal to distal dendritic segments. At the proximal dendrite, the nanoscale organization and membrane dynamics of GluN2B-NMDARs were influenced by physical interplay with the protein kinase CaMKII. Proc. Natl. Acad. Sci. U.S.A. 117 , 24526 (2020).


The Neuron – A Hackers Perspective

#artificialintelligence

It's not too often that you see handkerchiefs around anymore. Today, they're largely viewed as unsanitary and well… just plain gross. You'll be quite disappointed to learn that they have absolutely nothing to do with this article other than a couple of similarities they share when compared to your neocortex. If you were to pull the neocortex from your brain and stretch it out on a table, you most likely wouldn't be able to see that not only is it roughly the size of a large handkerchief; it also shares the same thickness. The neocortex, or cortex for short, is Latin for "new rind", or "new bark", and represents the most recent evolutionary change to the mammalian brain.