Joo, Jungseock
Integrating Deep Metric Learning with Coreset for Active Learning in 3D Segmentation
Vepa, Arvind Murari, Yang, Zukang, Choi, Andrew, Joo, Jungseock, Scalzo, Fabien, Sun, Yizhou
Deep learning has driven remarkable advances in machine learning, yet it often demands extensive annotated data. Tasks like 3D semantic segmentation impose a substantial annotation burden, especially in domains like medicine, where expert annotations drive up the cost. Active learning (AL) holds great potential to alleviate this annotation burden in 3D medical segmentation. The majority of existing AL methods, however, are not tailored to the medical domain. While weakly supervised methods have been explored to reduce the annotation burden, the fusion of AL with weak supervision remains unexplored, despite its potential to significantly reduce annotation costs. Additionally, there is little focus on slice-based AL for 3D segmentation, which can also significantly reduce costs in comparison to conventional volume-based AL. This paper introduces a novel metric learning method for Coreset to perform slice-based active learning in 3D medical segmentation. By merging contrastive learning with inherent data groupings in medical imaging, we learn a metric that emphasizes the differences in samples that are relevant for training 3D medical segmentation models. We perform comprehensive evaluations using both weak and full annotations across four datasets (medical and non-medical). Our findings demonstrate that our approach surpasses existing active learning techniques on both weak and full annotations and obtains superior performance with low annotation budgets, which is crucial in medical imaging. Source code for this project is available in the supplementary materials and on GitHub: https://github.com/arvindmvepa/al-seg.
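The Coreset acquisition described above is typically realized as k-center greedy selection; below is a minimal sketch of that step run in a learned embedding space. The encoder, the random embeddings, and the budget are illustrative placeholders, not the authors' released implementation (see the GitHub link above for that).

```python
import numpy as np

def k_center_greedy(embeddings, labeled_idx, budget):
    """Coreset-style selection: greedily pick the slice farthest from the
    current labeled set, measured in the learned embedding space."""
    selected = list(labeled_idx)
    # distance from every slice to its nearest labeled slice
    dists = np.min(
        np.linalg.norm(embeddings[:, None] - embeddings[selected][None], axis=-1),
        axis=1,
    )
    picked = []
    for _ in range(budget):
        idx = int(np.argmax(dists))                   # farthest-first selection
        picked.append(idx)
        new_d = np.linalg.norm(embeddings - embeddings[idx], axis=1)
        dists = np.minimum(dists, new_d)              # update nearest-center distances
    return picked

# toy usage: 1000 slice embeddings from a (hypothetical) contrastive encoder
emb = np.random.randn(1000, 128).astype(np.float32)
to_annotate = k_center_greedy(emb, labeled_idx=[0, 1], budget=10)
```

The learned metric enters purely through the embedding space in which distances are computed; the greedy loop itself is the standard Coreset procedure.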
Preemptive Motion Planning for Human-to-Robot Indirect Placement Handovers
Choi, Andrew, Jawed, Mohammad Khalid, Joo, Jungseock
As technology advances, the need for safe, efficient, and collaborative human-robot teams has become increasingly important. One of the most fundamental collaborative tasks in any setting is the object handover. Human-to-robot handovers can take either of two approaches: (1) direct hand-to-hand or (2) indirect hand-to-placement-to-pick-up. The latter approach ensures minimal contact between the human and robot, but it can also increase idle time, since the robot must wait for the object to first be placed on a surface. To minimize such idle time, the robot must preemptively predict where the human intends to place the object. Furthermore, for the robot to act preemptively in any productive manner, predictions and motion planning must occur in real time. We introduce a novel prediction-planning pipeline that allows the robot to preemptively move toward the human agent's intended placement location, using gaze and gestures as model inputs. In this paper, we investigate the performance and drawbacks of our early intent predictor-planner, as well as the practical benefits of using such a pipeline, through a human-robot case study.
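A hypothetical sketch of the prediction-planning loop described above: an intent model scores candidate placement locations from gaze and hand motion, and the robot starts moving once the prediction is confident. The candidate locations, feature weighting, and confidence threshold are all illustrative assumptions, not the paper's trained models.

```python
import numpy as np

CANDIDATES = np.array([[0.4, 0.2], [0.4, -0.2], [0.6, 0.0]])  # candidate placement spots (m)

def predict_intent(gaze_dir, wrist_traj):
    """Toy stand-in for the learned intent predictor: score each candidate
    placement by how well gaze and recent hand motion point toward it."""
    hand_vel = wrist_traj[-1] - wrist_traj[-2]
    scores = CANDIDATES @ (0.7 * gaze_dir + 0.3 * hand_vel)
    p = np.exp(scores - scores.max())
    return p / p.sum()

def preemptive_step(robot_pos, probs, conf=0.6, step=0.05):
    """Move toward the most likely placement once the prediction is confident,
    so the robot is already nearby when the object is set down."""
    if probs.max() < conf:
        return robot_pos                               # intent still ambiguous: wait
    goal = CANDIDATES[int(np.argmax(probs))]
    d = goal - robot_pos
    n = np.linalg.norm(d)
    return goal if n < step else robot_pos + step * d / n

probs = predict_intent(np.array([1.0, 0.1]),
                       np.array([[0.00, 0.00], [0.05, 0.01]]))
pos = preemptive_step(np.array([0.0, 0.0]), probs)
```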
Sim2Real Neural Controllers for Physics-based Robotic Deployment of Deformable Linear Objects
Tong, Dezhong, Choi, Andrew, Qin, Longhui, Huang, Weicheng, Joo, Jungseock, Jawed, M. Khalid
Deformable linear objects (DLOs), such as rods, cables, and ropes, play important roles in daily life. However, manipulating DLOs is challenging because large, geometrically nonlinear deformations may occur during manipulation. The problem is made even more difficult because the different deformation modes (e.g., stretching, bending, and twisting) may trigger elastic instabilities during manipulation. In this paper, we formulate a physics-guided, data-driven method to solve a challenging manipulation task: accurately deploying a DLO (an elastic rod) onto a rigid substrate along various prescribed patterns. Our framework combines machine learning, scaling analysis, and physical simulations to develop a physics-based neural controller for deployment. We explore the complex interplay between the gravitational and elastic energies of the manipulated DLO and obtain a control method for DLO deployment that is robust against friction and material properties. Through physical analysis, we show that, out of the numerous geometric and material properties of the rod and substrate, only three non-dimensional parameters are needed to describe the deployment process. The essence of the control law for the manipulation task can therefore be captured by a low-dimensional model, drastically increasing computation speed. The effectiveness of our optimal control scheme is shown through a comprehensive robotic case study that compares it against a heuristic control method for deploying rods in a wide variety of patterns. In addition, we showcase the practicality of our control scheme by having a robot accomplish challenging high-level tasks such as mimicking human handwriting, placing cables, and tying knots.
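As a rough illustration of the scaling idea (the existence of three governing non-dimensional parameters is the paper's result, but the specific grouping below is an assumption made for this sketch), the deployment state can be collapsed into dimensionless inputs before querying the neural controller:

```python
import numpy as np

def gravito_bending_length(EI, rho_A, g=9.81):
    """Length scale at which a rod's bending stiffness balances gravity."""
    return (EI / (rho_A * g)) ** (1.0 / 3.0)

def controller_inputs(height, v_robot, v_rod, curvature, EI, rho_A):
    """Collapse the deployment state into three dimensionless numbers.
    This particular grouping is an illustrative guess at the scaling
    analysis, not the paper's exact parameterization."""
    L_gb = gravito_bending_length(EI, rho_A)
    return np.array([
        height / L_gb,      # normalized deployment height
        v_robot / v_rod,    # speed ratio: end-effector vs. material feed
        curvature * L_gb,   # normalized curvature of the target pattern
    ])

x = controller_inputs(height=0.3, v_robot=0.04, v_rod=0.05,
                      curvature=2.0, EI=1e-4, rho_A=0.01)
# x would then be fed to the trained neural controller
```

Because the controller only ever sees these dimensionless inputs, it transfers across rods of different materials and geometries without retraining, which is the source of the claimed speedup.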
Deep Learning of Force Manifolds from the Simulated Physics of Robotic Paper Folding
Choi, Andrew, Tong, Dezhong, Terzopoulos, Demetri, Joo, Jungseock, Jawed, M. Khalid
Robotic manipulation of slender objects is challenging, especially when the induced deformations are large and nonlinear. Traditionally, learning-based control approaches, such as imitation learning, have been used to address deformable-material manipulation. These approaches lack generality and often fail critically after a simple change of material, geometric, and/or environmental (e.g., friction) properties. This article tackles a fundamental but difficult deformable manipulation task: forming a predefined fold in paper with only a single manipulator. A data-driven framework combining physically accurate simulation and machine learning is used to train a deep neural network capable of predicting the external forces induced on the manipulated paper given a grasp position. We frame the problem using scaling analysis, resulting in a control framework robust against material and geometric changes. Path planning is then carried out over the generated "neural force manifold" to produce robot manipulation trajectories optimized to prevent sliding, with offline trajectory generation finishing 15× faster than previous physics-based folding methods. The inference speed of the trained model enables the incorporation of real-time visual feedback to achieve closed-loop sensorimotor control. Real-world experiments demonstrate that our framework can greatly improve robotic manipulation performance compared to state-of-the-art folding strategies, even when manipulating paper objects of various materials and shapes.
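A minimal sketch of planning over a "neural force manifold": a greedy planner steps toward the goal while penalizing grasp positions with high predicted force. The toy force function stands in for the trained network, and the cost weighting is an ad hoc assumption rather than the paper's optimization.

```python
import numpy as np

def plan_over_force_manifold(force_fn, start, goal, step=0.01, n_dirs=16, max_iter=500):
    """Greedy planner: at each step, move toward the goal while avoiding
    grasp positions with high predicted force (a proxy for the sliding
    the trajectories are optimized to prevent)."""
    pos, goal = np.asarray(start, float), np.asarray(goal, float)
    path = [pos.copy()]
    for _ in range(max_iter):
        if np.linalg.norm(goal - pos) < step:
            break
        ang = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
        cands = pos + step * np.stack([np.cos(ang), np.sin(ang)], axis=1)
        cost = (np.linalg.norm(cands - goal, axis=1)
                + 5.0 * np.array([force_fn(c) for c in cands]))  # force weight is ad hoc
        pos = cands[int(np.argmin(cost))]
        path.append(pos.copy())
    return np.array(path)

# toy stand-in for the trained force-prediction network
toy_force = lambda p: float(np.exp(-10.0 * np.linalg.norm(p - np.array([0.1, 0.2]))))
traj = plan_over_force_manifold(toy_force, start=[0.0, 0.0], goal=[0.25, 0.3])
```

Because the force network evaluates in microseconds, the same cost can be re-queried online as visual feedback updates the paper's state, which is what enables the closed-loop control mentioned above.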
mBEST: Realtime Deformable Linear Object Detection Through Minimal Bending Energy Skeleton Pixel Traversals
Choi, Andrew, Tong, Dezhong, Park, Brian, Terzopoulos, Demetri, Joo, Jungseock, Jawed, Mohammad Khalid
Robotic manipulation of deformable materials is a challenging task that often requires realtime visual feedback. This is especially true for deformable linear objects (DLOs), or "rods", whose slender, flexible structures make proper tracking and detection nontrivial. To address this challenge, we present mBEST, a robust algorithm for the realtime detection of DLOs that produces an ordered pixel sequence for each DLO's centerline along with segmentation masks. Our algorithm obtains a binary mask of the DLOs and then thins it to produce a skeleton pixel representation. After refining the skeleton to ensure topological correctness, the pixels are traversed to generate paths along each unique DLO. At the core of our algorithm, we postulate that intersections can be robustly handled by choosing the combination of paths that minimizes the cumulative bending energy of the DLO(s). We show that this simple and intuitive formulation outperforms state-of-the-art methods for detecting DLOs with large numbers of sporadic crossings, ranging from curvatures with high variance to nearly parallel configurations. Furthermore, our method achieves a significant performance improvement over the state of the art: approximately 50% faster runtime and better scaling.
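The minimal-bending-energy rule at intersections can be illustrated as a small matching problem over branch tangents; the skeletonization and tangent-estimation steps are omitted here, and the squared-angle cost is a simple proxy for bending energy.

```python
import numpy as np
from itertools import permutations

def bending_cost(t_in, t_out):
    """Bending proxy at a crossing: squared angle between incoming and
    outgoing unit tangents (a straight continuation costs ~0)."""
    return np.arccos(np.clip(np.dot(t_in, t_out), -1.0, 1.0)) ** 2

def match_branches(incoming, outgoing):
    """Pair incoming and outgoing skeleton branches at an intersection so
    that the total bending energy across the crossing is minimized."""
    best, best_cost = None, np.inf
    for perm in permutations(range(len(outgoing))):
        c = sum(bending_cost(incoming[i], outgoing[j]) for i, j in enumerate(perm))
        if c < best_cost:
            best, best_cost = perm, c
    return best  # best[i] = index of the outgoing branch continuing incoming[i]

# two rods crossing almost at a right angle (unit tangent vectors)
inc = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
out = [np.array([0.0, 1.0]), np.array([1.0, 0.0])]
print(match_branches(inc, out))  # (1, 0): each rod continues straight through
```

Since a crossing involves only a handful of branches, the brute-force search over pairings stays cheap even in realtime use.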
Emergent Graphical Conventions in a Visual Communication Game
Qiu, Shuwen, Xie, Sirui, Fan, Lifeng, Gao, Tao, Joo, Jungseock, Zhu, Song-Chun, Zhu, Yixin
Humans communicate with graphical sketches in addition to symbolic languages (Fay et al., 2014). Primarily focusing on the latter, recent studies of emergent communication (Lazaridou and Baroni, 2020) overlook sketches; they do not account for the evolutionary process through which symbolic sign systems emerge from the trade-off between iconicity and symbolicity. In this work, we take the very first step to model and simulate this process via two neural agents playing a visual communication game; the sender communicates with the receiver by sketching on a canvas. We devise a novel reinforcement learning method such that agents are evolved jointly towards successful communication and abstract graphical conventions. To inspect the emerged conventions, we define three fundamental properties--iconicity, symbolicity, and semanticity--and design evaluation methods accordingly. Our experimental results under different controls are consistent with observations from studies of human graphical conventions (Hawkins et al., 2019; Fay et al., 2010). Of note, we find that evolved sketches can preserve the continuum of semantics (Mikolov et al., 2013) under proper environmental pressures. More interestingly, co-evolved agents can switch between conventionalized and iconic communication based on their familiarity with referents. We hope the present research can pave the way for studying emergent communication in the modality of sketches.
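A toy, self-contained version of the referential game loop, assuming linear stand-ins for both agents (the actual sender draws stroke-based sketches on a canvas) and a REINFORCE-style shared reward; the architecture and baseline below are assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

# toy stand-ins for the sketching sender and the classifying receiver
sender = nn.Sequential(nn.Flatten(), nn.Linear(64, 32))    # image -> "sketch" code
receiver = nn.Sequential(nn.Flatten(), nn.Linear(64, 32))  # image -> embedding
opt = torch.optim.Adam(list(sender.parameters()) + list(receiver.parameters()), lr=1e-3)

def play_round(target, distractors):
    """One game round: the receiver must pick the target among candidates
    given the sender's sketch; both agents are rewarded on success."""
    sketch = sender(target.unsqueeze(0))                   # (1, 32)
    cands = torch.stack([target] + distractors)            # target at index 0
    logits = (receiver(cands) @ sketch.t()).squeeze(1)     # similarity scores
    dist = torch.distributions.Categorical(logits=logits)
    choice = dist.sample()
    reward = (choice == 0).float()
    loss = -(reward - 0.5) * dist.log_prob(choice)         # baselined REINFORCE
    opt.zero_grad(); loss.backward(); opt.step()
    return reward.item()

images = torch.randn(4, 8, 8)
acc = sum(play_round(images[0], list(images[1:])) for _ in range(100)) / 100
```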
Understanding and Mitigating Annotation Bias in Facial Expression Recognition
Chen, Yunliang, Joo, Jungseock
The performance of a computer vision model depends on the size and quality of its training data. Recent studies have unveiled previously unknown composition biases in common image datasets that lead to skewed model outputs, and have proposed methods to mitigate these biases. However, most existing works assume that human-generated annotations can be considered gold-standard and unbiased. In this paper, we reveal that this assumption can be problematic and that special care should be taken to prevent models from learning such annotation biases. We focus on facial expression recognition and compare the label biases between lab-controlled and in-the-wild datasets. We demonstrate that many expression datasets contain significant annotation biases between genders, especially for the happy and angry expressions, and that traditional methods cannot fully mitigate such biases in trained models. To remove expression annotation bias, we propose an AU-Calibrated Facial Expression Recognition (AUC-FER) framework that utilizes facial action units (AUs) and incorporates a triplet loss into the objective function. Experimental results suggest that the proposed method is more effective at removing expression annotation bias than existing techniques.
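A sketch of how a triplet loss might be driven by action units rather than by the (potentially biased) expression labels; the positive/negative mining rule here is an illustrative assumption, not necessarily AUC-FER's exact formulation.

```python
import torch
import torch.nn.functional as F

def au_triplet_loss(embed, au_vectors, margin=0.2):
    """Triplet loss where positives/negatives are mined by action-unit (AU)
    similarity, anchoring the embedding to objective facial muscle activity
    instead of annotator-provided expression labels."""
    d_au = torch.cdist(au_vectors, au_vectors)                    # AU-space distances
    pos_idx = (d_au + torch.eye(len(d_au)) * 1e6).argmin(dim=1)   # most AU-similar other sample
    neg_idx = d_au.argmax(dim=1)                                  # least AU-similar sample
    return F.triplet_margin_loss(embed, embed[pos_idx], embed[neg_idx], margin=margin)

# toy usage: 16 samples, 64-dim face embeddings, 17 AU activations each
emb = torch.randn(16, 64, requires_grad=True)
aus = torch.rand(16, 17)
loss = au_triplet_loss(emb, aus)
loss.backward()
```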
Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene
Wu, Qi, Wu, Cheng-Ju, Zhu, Yixin, Joo, Jungseock
Human-robot collaboration is an essential research topic in artificial intelligence (AI), enabling researchers to devise cognitive AI systems and affording users an intuitive means of interacting with robots. Communication plays a central role in this collaboration. To date, studies of embodied agent navigation have demonstrated only that human languages facilitate communication, via instructions in natural language; a plethora of other forms of communication remains unexplored. In fact, human communication originated in gestures and is often delivered through multimodal cues, e.g., "go there" with a pointing gesture. To bridge the gap and fill in this missing dimension of communication in embodied agent navigation, we propose investigating the effects of using gestures as the communicative interface instead of verbal cues. Specifically, we develop a VR-based 3D simulation environment, named Ges-THOR, based on the AI2-THOR platform. In this virtual environment, a human player is placed in the same virtual scene and shepherds the artificial agent using only gestures. The agent is tasked with solving the navigation problem guided by natural gestures with unknown semantics; we do not use any predefined gestures, owing to the diversity and versatility of human gestures. We argue that learning the semantics of natural gestures is mutually beneficial to learning the navigation task--learn to communicate and communicate to learn. In a series of experiments, we demonstrate that human gesture cues, even without predefined semantics, improve object-goal navigation for an embodied agent, outperforming various state-of-the-art methods.
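One way to make "learn to communicate and communicate to learn" concrete is to add a gesture-understanding term to the navigation reward; the sketch below is a guess at such a shaping scheme, with all weights and terms assumed rather than taken from Ges-THOR.

```python
import numpy as np

def shaped_reward(reached_goal, d_prev, d_now, gesture_pred, pointing_dir,
                  step_penalty=-0.01):
    """Illustrative reward for gesture-guided object-goal navigation: dense
    progress shaping plus an auxiliary bonus for correctly interpreting the
    human's pointing direction. Terms and weights are assumptions."""
    r = step_penalty
    r += 1.0 * (d_prev - d_now)                           # progress toward the goal
    r += 0.1 * float(np.dot(gesture_pred, pointing_dir))  # gesture-understanding bonus
    if reached_goal:
        r += 5.0                                          # success bonus
    return r

r = shaped_reward(False, d_prev=2.0, d_now=1.8,
                  gesture_pred=np.array([1.0, 0.0]),
                  pointing_dir=np.array([0.9, 0.1]))
```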
Gender Slopes: Counterfactual Fairness for Computer Vision Models by Attribute Manipulation
Joo, Jungseock, Kรคrkkรคinen, Kimmo
Automated computer vision systems have been applied in many domains, including security, law enforcement, and personal devices, but recent reports suggest that these systems may produce biased results, discriminating against people in certain demographic groups. Diagnosing and understanding the true underlying causes of model biases, however, are challenging tasks because modern computer vision systems rely on complex black-box models whose behaviors are hard to decode. We propose to use an encoder-decoder network developed for image attribute manipulation to synthesize facial images varying in the dimensions of gender and race while keeping other signals intact. We use these synthesized images to measure the counterfactual fairness of commercial computer vision classifiers by examining the degree to which these classifiers are affected by gender and racial cues controlled in the images; e.g., feminine faces may elicit higher scores for the concept of nurse and lower scores for STEM-related concepts. We also report skewed gender representations in an online search service for profession-related keywords, which may explain the origin of the biases encoded in the models.
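The measurement itself reduces to comparing classifier scores on each face against its attribute-edited counterpart; in this sketch both the classifier (standing in for a commercial API) and the encoder-decoder editor are hypothetical stand-ins.

```python
import numpy as np

def counterfactual_gap(classifier, images, edit_attribute):
    """Counterfactual fairness probe: score each face, score its
    attribute-edited counterpart (e.g., gender swapped by an image-editing
    network), and report the mean per-concept score shift. A shift near
    zero indicates the classifier is counterfactually fair for that cue."""
    orig = np.array([classifier(img) for img in images])            # (N, n_concepts)
    edited = np.array([classifier(edit_attribute(img)) for img in images])
    return (edited - orig).mean(axis=0)                             # per-concept shift

# toy stand-ins so the sketch runs end to end
toy_clf = lambda img: np.array([img.mean(), 1.0 - img.mean()])      # two fake concepts
toy_edit = lambda img: 1.0 - img                                    # fake attribute edit
gap = counterfactual_gap(toy_clf, [np.random.rand(8, 8) for _ in range(10)], toy_edit)
```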
Fashion Conversation Data on Instagram
Ha, Yu-i (KAIST), Kwon, Sejeong (KAIST), Cha, Meeyoung (KAIST), Joo, Jungseock (University of California, Los Angeles)
The fashion industry is establishing its presence on a number of visual-centric social media platforms such as Instagram. This creates an interesting clash, as fashion brands that have traditionally practiced highly creative and editorialized image marketing must now engage with people on a platform that epitomizes impromptu, realtime conversation. What kinds of fashion images do brands and individuals share, and what types of visual features attract likes and comments? In this research, we take both quantitative and qualitative approaches to answer these questions. We analyze the visual features of fashion posts, first via manual tagging and then via trained convolutional neural networks. The classified images are examined across four types of fashion brands: mega couture, small couture, designers, and high street. We find that while product-only images make up the majority of fashion conversation in terms of volume, body snaps and face images that portray fashion items more naturally tend to receive a larger number of likes and comments from the audience. Our findings offer insights toward building an automated tool for classifying or generating influential fashion information. We make our novel dataset of 24,752 labeled images on fashion conversations, containing visual and textual cues, available to the research community.
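A minimal sketch of the CNN tagging stage, assuming a standard fine-tuning setup; the class list is drawn from the image categories the abstract discusses, but the architecture, optimizer, and data pipeline are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# illustrative post categories taken from the abstract's discussion
CLASSES = ["product_only", "body_snap", "face"]

net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
net.fc = nn.Linear(net.fc.in_features, len(CLASSES))   # replace the classifier head
opt = torch.optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One supervised step on manually tagged fashion posts.
    images: float tensor of shape (N, 3, 224, 224); labels: long tensor (N,)."""
    opt.zero_grad()
    loss = loss_fn(net(images), labels)
    loss.backward()
    opt.step()
    return loss.item()
```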