AITopics | octopi

octopi

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

VLA-Touch: Enhancing Vision-Language-Action Models with Dual-Level Tactile Feedback

Bi, Jianxin, Ma, Kevin Yuchen, Hao, Ce, Shou, Mike Zheng, Soh, Harold

arXiv.org Artificial Intelligence, Jul-30-2025

Tactile feedback is generally recognized to be crucial for effective interaction with the physical world. However, state-of-the-art Vision-Language-Action (VLA) models lack the ability to interpret and use tactile signals, limiting their effectiveness in contact-rich tasks. Incorporating tactile feedback into these systems is challenging due to the absence of large multi-modal datasets. We present VLA-Touch, an approach that enhances generalist robot policies with tactile sensing without fine-tuning the base VLA. Our method introduces two key innovations: (1) a pipeline that leverages a pretrained tactile-language model that provides semantic tactile feedback for high-level task planning, and (2) a diffusion-based controller that refines VLA-generated actions with tactile signals for contact-rich manipulation. Through real-world experiments, we demonstrate that our dual-level integration of tactile feedback improves task planning efficiency while enhancing execution precision. Code is open-sourced at https://github.com/jxbi1010/VLA-Touch.

artificial intelligence, manipulation, tactile feedback, (15 more...)

arXiv.org Artificial Intelligence

2507.17294

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (0.68)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.56)

Robotic Perception with a Large Tactile-Vision-Language Model for Physical Property Inference

Guo, Zexiang, Chen, Hengxiang, Mai, Xinheng, Qiu, Qiusang, Ma, Gan, Kappassov, Zhanat, Li, Qiang, Chen, Nutan

arXiv.org Artificial Intelligence, Jun-25-2025

Inferring physical properties can significantly enhance robotic manipulation by enabling robots to handle objects safely and efficiently through adaptive grasping strategies. Previous approaches have typically relied on either tactile or visual data, limiting their ability to fully capture properties. We introduce a novel cross-modal perception framework that integrates visual observations with tactile representations within a multimodal vision-language model. Our physical reasoning framework, which employs a hierarchical feature alignment mechanism and a refined prompting strategy, enables our model to make property-specific predictions that strongly correlate with ground-truth measurements. Evaluated on 35 diverse objects, our approach outperforms existing baselines and demonstrates strong zero-shot generalization.

large language model, natural language, octopi, (17 more...)

arXiv.org Artificial Intelligence

2506.19303

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

How I Used DALL·E 2 to Generate The Logo for OctoSQL

#artificialintelligence, Aug-2-2022, 18:22:30 GMT

Everybody has heard about the latest cool thing™, which is DALL·E 2 (henceforth called Dall-e). A few months ago, when the first previews started, it was basically everywhere. Then, a few weeks ago, the floodgates opened and lots of people on the waitlist got access, myself included. I’ve spent a day playing around with it, learned some basics (like the fact that adding “artstation” to the end of your phrase automatically makes the output much better…), and generated a bunch of images (even a few nice-looking ones).

database, digital art, logo, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
