AITopics | head position

head position

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Nonverbal Cues in Multiparty Social Interactions for Robotic Facilitators

Martin-Ozimek, Antonio Lech, Jayarathne, Isuru, Mon, Su Larb, Chew, Jouhyeong

arXiv.org Artificial Intelligence | Jan-18-2025

Conventional behavior cloning (BC) models often struggle to replicate the subtleties of human actions. Previous studies have attempted to address this issue through the development of a new BC technique: Implicit Behavior Cloning (IBC). This new technique consistently outperformed the conventional Mean Squared Error (MSE) BC models in a variety of tasks. Our goal is to replicate the performance of the IBC model by Florence [in Proceedings of the 5th Conference on Robot Learning, 164:158-168, 2022], for social interaction tasks using our custom dataset. While previous studies have explored the use of large language models (LLMs) for enhancing group conversations, they often overlook the significance of non-verbal cues, which constitute a substantial part of human communication. We propose using IBC to replicate nonverbal cues like gaze behaviors. The model is evaluated against various types of facilitator data and compared to an explicit, MSE BC model. Results show that the IBC model outperforms the MSE BC model across session types using the same metrics used in the previous IBC paper. Despite some metrics showing mixed results which are explainable for the custom dataset for social interaction, we successfully replicated the IBC model to generate nonverbal cues. Our contributions are (1) the replication and extension of the IBC model, and (2) a nonverbal cues generation model for social interaction. These advancements facilitate the integration of robots into the complex interactions between robots and humans, e.g., in the absence of a human facilitator.
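The difference between an explicit MSE policy and an implicit one can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional "gaze angle" actions, the function names, and above all the hand-written energy function, which stands in for the energy model that IBC would learn from data. It is not the authors' model.

```python
# Toy contrast between explicit (MSE) and implicit (energy-based) action selection.
# Hypothetical setup: demonstrated gaze angles are either -1.0 (left) or 1.0 (right).

def explicit_policy(demos):
    """The constant prediction that minimizes mean squared error is the mean."""
    return sum(demos) / len(demos)

def energy(action, demos):
    """Hand-written stand-in for a learned energy: squared distance to the
    nearest demonstrated action (low energy means 'looks like a demo')."""
    return min((action - d) ** 2 for d in demos)

def implicit_policy(demos, candidates):
    """Implicit inference: pick the candidate action with the lowest energy."""
    return min(candidates, key=lambda a: energy(a, demos))

demos = [-1.0, 1.0]
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]

print(explicit_policy(demos))              # 0.0, an angle nobody demonstrated
print(implicit_policy(demos, candidates))  # -1.0, one of the demonstrated angles
```

In this toy case the explicit policy averages the two demonstrated behaviors, while the argmin over the energy returns one of them. Whether the same effect explains the results reported in the abstract is not something the abstract states.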

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.10857

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.69)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)

Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

Lai, Junyu, Xu, Jiahe, Yang, Yao, Huang, Yunpeng, Cao, Chun, Xu, Jingwei

arXiv.org Artificial Intelligence | Oct-10-2024

Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of natural language processing and reasoning tasks. However, their performance in the foundational domain of arithmetic remains unsatisfactory. When dealing with arithmetic tasks, LLMs often memorize specific examples rather than learning the underlying computational logic, limiting their ability to generalize to new problems. In this paper, we propose a Composable Arithmetic Execution Framework (CAEF) that enables LLMs to learn to execute step-by-step computations by emulating Turing Machines, thereby gaining a genuine understanding of computational logic. Moreover, the proposed framework is highly scalable, allowing composing learned operators to significantly reduce the difficulty of learning complex operators. In our evaluation, CAEF achieves nearly 100% accuracy across seven common mathematical operations on the LLaMA 3.1-8B model, effectively supporting computations involving operands with up to 100 digits, a level where GPT-4o falls short noticeably in some settings.
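As a rough picture of what "executing step-by-step computations" means, the sketch below performs decimal addition as a sequence of small state transitions, each of which reads one digit per operand and updates a carry. The state layout and the names `add_step` and `execute_add` are hypothetical; the abstract does not describe CAEF's actual tape format, prompts, or operators, so this shows only the general idea of stepwise execution rather than the framework itself.

```python
# Stepwise addition as explicit state transitions (illustrative only).
# State: (remaining digits of a, remaining digits of b, carry, output so far).

def add_step(state):
    """One transition: consume the rightmost digit of each operand."""
    a, b, carry, out = state
    digit_a = int(a[-1]) if a else 0
    digit_b = int(b[-1]) if b else 0
    total = digit_a + digit_b + carry
    return (a[:-1], b[:-1], total // 10, str(total % 10) + out)

def execute_add(a, b):
    """Run transitions until both operands are consumed and the carry is zero.
    Returns the result string and the full trace of intermediate states."""
    state = (a, b, 0, "")
    trace = [state]
    while state[0] or state[1] or state[2]:
        state = add_step(state)
        trace.append(state)
    return state[3] or "0", trace

result, trace = execute_add("999", "1")
print(result)      # 1000
print(len(trace))  # 5 states: the initial one plus four transitions
```

Because each transition depends only on the current state, the same procedure applies unchanged to operands of any length, which is the property a model would need to learn in order to generalize beyond memorized examples.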

executor, operator, turing machine, (14 more...)

arXiv.org Artificial Intelligence

2410.07896

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre:

Workflow (1.00)
Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

CapHuman: Capture Your Moments in Parallel Universes

Liang, Chao, Ma, Fan, Zhu, Linchao, Deng, Yingying, Yang, Yi

arXiv.org Artificial Intelligence | Feb-1-2024

We concentrate on a novel human-centric image synthesis task, that is, given only one reference facial photograph, it is expected to generate specific individual images with diverse head positions, poses, and facial expressions in different contexts. To accomplish this goal, we argue that our generative model should be capable of the following favorable characteristics: (1) a strong visual and semantic understanding of our world and human society for basic object and human image generation. (2) generalizable identity preservation ability. (3) flexible and fine-grained head control. Recently, large pre-trained text-to-image diffusion models have shown remarkable results, serving as a powerful generative foundation. As a basis, we aim to unleash the above two capabilities of the pre-trained model. In this work, we present a new framework named CapHuman. We embrace the "encode then learn to align" paradigm, which enables generalizable identity preservation for new individuals without cumbersome tuning at inference. CapHuman encodes identity features and then learns to align them into the latent space. Moreover, we introduce the 3D facial prior to equip our model with control over the human head in a flexible and 3D-consistent manner. Extensive qualitative and quantitative analyses demonstrate our CapHuman can produce well-identity-preserved, photo-realistic, and high-fidelity portraits with content-rich representations and various head renditions, superior to established baselines. Code and checkpoint will be released at https://github.com/VamosC/CapHuman.

caphuman, identity feature, identity preservation, (14 more...)

arXiv.org Artificial Intelligence

2402.00627

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
Asia > Macao (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Appearance-based gaze estimation enhanced with synthetic images using deep neural networks

Herashchenko, Dmytro, Farkaš, Igor

arXiv.org Artificial Intelligence | Nov-23-2023

Human eye gaze estimation is an important cognitive ingredient for successful human-robot interaction, enabling the robot to read and predict human behavior. We approach this problem using artificial neural networks and build a modular system estimating gaze from separately cropped eyes, taking advantage of existing well-functioning components for face detection (RetinaFace) and head pose estimation (6DRepNet). Our proposed method does not require any special hardware or infrared filters but uses a standard notebook-builtin RGB camera, as often approached with appearance-based methods. Using the MetaHuman tool, we also generated a large synthetic dataset of more than 57,000 human faces and made it publicly available. The inclusion of this dataset (with eye gaze and head pose information) on top of the standard Columbia Gaze dataset into training the model led to better accuracy with a mean average error below two degrees in eye pitch and yaw directions, which compares favourably to related methods. We also verified the feasibility of our model by its preliminary testing in real-world setting using the builtin 4K camera in NICO semi-humanoid robot's eye.

dataset, estimation, interaction, (16 more...)

arXiv.org Artificial Intelligence

2311.14175

Country: Europe > Slovakia > Bratislava > Bratislava (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)

Subject-Independent Magnetoencephalographic Source Localization by a Multilayer Perceptron

Neural Information Processing Systems | Apr-6-2023, 16:11:20 GMT

We describe a system that localizes a single dipole to reasonable accuracy from noisy magnetoencephalographic (MEG) measurements in real time. At its core is a multilayer perceptron (MLP) trained to map sensor signals and head position to dipole location. Including head position overcomes the previous need to retrain the MLP for each subject and session. The training dataset was generated by mapping randomly chosen dipoles and head positions through an analytic model and adding noise from real MEG recordings. After training, a localization took 0.7 ms with an average error of 0.90 cm.

head position, multilayer perceptron, subject-independent magnetoencephalographic source localization, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Ford F-150 Lightning Electric Pickup to Have Level 2 Autonomous Driving - AI Trends

#artificialintelligence | Jun-8-2021, 11:15:42 GMT

The all-electric Ford F-150 Lightning, announced recently by the Ford Motor Co., will feature hands-free driving by virtue of Blue Cruise advanced driving assistance system (ADAS). The hands-free driving features will also be available on the 2021 internal combustion pickup truck and certain Mustang models through a software update later this year, according to an account in TechCrunch. The hands-free capability uses cameras, radar sensors and software to provide a combination of adaptive cruise control, lane centering and speed-sign recognition. It has undergone some 500,000 miles of development testing, Ford emphasized in an announcement in April. The system also has an in-cabin camera that monitors eye gaze and head position to help ensure the driver's eyes remain on the road.

bluecruise, ford, vehicle, (15 more...)

#artificialintelligence

Country:

North America > United States (0.33)
North America > Canada (0.15)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.40)

Universality of Gradient Descent Neural Network Training

Welper, G.

arXiv.org Machine Learning | Jul-27-2020

It has been observed that design choices of neural networks are often crucial for their successful optimization. In this article, we therefore discuss the question if it is always possible to redesign a neural network so that it trains well with gradient descent. This yields the following universality result: If, for a given network, there is any algorithm that can find good network weights for a classification task, then there exists an extension of this network that reproduces these weights and the corresponding forward output by mere gradient descent training. The construction is not intended for practical computations, but it provides some orientation on the possibilities of meta-learning and related approaches.

artificial intelligence, machine learning, turing machine, (16 more...)

arXiv.org Machine Learning

2007.13664

Country:

North America > United States > Florida > Orange County > Orlando (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.94)

Progress Extrapolating Algorithmic Learning to Arbitrary Sequence Lengths

Robinson, Andreas

arXiv.org Machine Learning | Mar-23-2020

Recent neural network models for algorithmic tasks have led to significant improvements in extrapolation to sequences much longer than training, but it remains an outstanding problem that the performance still degrades for very long or adversarial sequences. We present alternative architectures and loss-terms to address these issues, and our testing of these approaches has not detected any remaining extrapolation errors within memory constraints. We focus on linear time algorithmic tasks including copy, parentheses parsing, and binary addition. First, activation binning was used to discretize the trained network in order to avoid computational drift from continuous operations, and a binning-based digital loss term was added to encourage discretizable representations. In addition, a localized differentiable memory (LDM) architecture, in contrast to distributed memory access, addressed remaining extrapolation errors and avoided unbounded growth of internal computational states. Previous work has found that algorithmic extrapolation issues can also be alleviated with approaches relying on program traces, but the current effort does not rely on such traces.
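The idea of binning activations and penalizing values that sit far from a bin can be sketched as follows. The bin levels, the squared-distance penalty, and the names `bin_activation` and `binning_loss` are assumptions chosen for illustration; the abstract does not specify the paper's binning scheme or the exact form of its digital loss term.

```python
# Illustrative activation binning: snap each activation to its nearest level,
# and measure how far a set of activations is from being discretizable.

def bin_activation(x, levels=(0.0, 1.0)):
    """Return the level closest to x (ties go to the earlier level)."""
    return min(levels, key=lambda level: abs(x - level))

def binning_loss(activations, levels=(0.0, 1.0)):
    """Mean squared distance between each activation and its nearest level.
    Zero means every activation already sits exactly on a level."""
    return sum((x - bin_activation(x, levels)) ** 2 for x in activations) / len(activations)

print(bin_activation(0.93))      # 1.0
print(bin_activation(0.1))       # 0.0
print(binning_loss([0.0, 1.0]))  # 0.0
print(binning_loss([0.5, 0.5]))  # 0.25
```

Snapping values back to a fixed set of levels after each step prevents small numerical errors from accumulating over long sequences, which is the kind of computational drift the abstract mentions.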

activation, extrapolation, sequence, (14 more...)

arXiv.org Machine Learning

2003.08494

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Smart PILLOW uses airbags to adjust your head position in your sleep and stop you snoring

Daily Mail - Science & tech | Jan-8-2020, 12:09:47 GMT

A smart pillow made by lifestyle technology company 10minds wants to end the scourge of snoring once and for all. The Motion Pillow, which was showcased at CES in Las Vegas, is a memory foam pillow that uses multiple different technologies to help alleviate issues that contribute to snoring. Using the company's 'Sleep Pressure Monitoring System' - pads inside the pillow that can detect the position of one's head - the Motion Pillow is able to activate airbags inside the product to give sleepers' heads and necks a nudge in the right direction. The pillow technology is coupled with an audio detection system that is capable of hearing snores when they happen. Once the recorder picks up on any heavy breathing, it is able to communicate with the Motion Pillow, which then inflates airbags to reposition a user's head.

artificial intelligence, motion pillow, social media, (11 more...)

Daily Mail - Science & tech

Country: North America > United States > Nevada > Clark County > Las Vegas (0.26)

Industry:

Information Technology (1.00)
Transportation > Ground > Road (0.83)
Automobiles & Trucks > Parts Supplier (0.83)

Technology:

Information Technology > Artificial Intelligence (0.50)
Information Technology > Sensing and Signal Processing (0.36)
Information Technology > Communications > Social Media (0.32)

AI Sucks at Making Adorable Cat Photos, Clearly Misses the Entire Point of the Internet

#artificialintelligence | Mar-2-2019, 06:12:13 GMT

Artificial intelligence (AI) recently tried to generate cat photos from scratch, and the results were cat-astrophic. This particular neural network (a type of AI modeled after the workings of the human brain) can produce astonishingly realistic original photos of human faces. In fact, the images of these made-up people were nearly impossible for human viewers to distinguish from photos of real people, programmers of the AI reported in a study that was posted December 2018 to the preprint journal arXiv. Felines, however, proved to be another story. The same algorithm that generated flawless human faces created cats with misshapen heads; the wrong number of eyes and legs; and bodies that were too long, too short, unusually rotund or rectangular, and bent at peculiar angles.

artificial intelligence, machine learning, stylegan, (10 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)
