AITopics | Object-Oriented Architecture

Collaborating Authors

Object-Oriented Architecture

News Overviews Instructional Materials AI-Alerts Classics

MOGAM: A Multimodal Object-oriented Graph Attention Model for Depression Detection

Cha, Junyeop, Kim, Seoyun, Kim, Dongjae, Park, Eunil

arXiv.org Artificial IntelligenceMar-21-2024

Early detection plays a crucial role in the treatment of depression. Therefore, numerous studies have focused on social media platforms, where individuals express their emotions, aiming to achieve early detection of depression. However, the majority of existing approaches often rely on specific features, leading to limited scalability across different types of social media datasets, such as text, images, or videos. To overcome this limitation, we introduce a Multimodal Object-Oriented Graph Attention Model (MOGAM), which can be applied to diverse types of data, offering a more scalable and versatile solution. Furthermore, to ensure that our model can capture authentic symptoms of depression, we only include vlogs from users with a clinical diagnosis. To leverage the diverse features of vlogs, we adopt a multimodal approach and collect additional metadata such as the title, description, and duration of the vlogs. To effectively aggregate these multimodal features, we employed a cross-attention mechanism. MOGAM achieved an accuracy of 0.871 and an F1-score of 0.888. Moreover, to validate the scalability of MOGAM, we evaluated its performance with a benchmark dataset and achieved comparable results with prior studies (0.61 F1-score). In conclusion, we believe that the proposed model, MOGAM, is an effective solution for detecting depression in social media, offering potential benefits in the early detection and treatment of this mental health condition.

depression, mogam, vlog, (11 more...)

arXiv.org Artificial Intelligence

2403.15485

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Asia > Bangladesh (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.73)
(2 more...)

Add feedback

Generative AI and CS Education

Communications of the ACMMar-19-2024, 17:00:22 GMT

I have spent most of my career working on computer science (CS) education whether teaching undergraduate CS or managing technical education for software engineers at Google. In the early 1990s, when Pascal was the language of choice, I began teaching CS1 and CS2 at Stanford. Over the next few years, I saw the transition from Pascal to C to object-oriented programming. I also saw the pace at which we had to consistently update our course materials and projects, whether it was in the introductory courses or later electives such as graphics or compilers. Languages, software frameworks, libraries, APIs, and so forth change rapidly.

engineer, google, software engineer, (10 more...)

Communications of the ACM

Genre: Instructional Material > Course Syllabus & Notes (0.36)

Industry: Education > Curriculum (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.55)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.43)

Add feedback

Large Scale Category Structure Aware Image Categorization

Neural Information Processing SystemsMar-15-2024, 11:04:30 GMT

Most previous research on image categorization has focused on medium-scale data sets, while large-scale image categorization with millions of images from thousands of categories remains a challenge. With the emergence of structured large-scale dataset such as the ImageNet, rich information about the conceptual relationships between images, such as a tree hierarchy among various image categories, become available. As human cognition of complex visual world benefits from underlying semantic relationships between object classes, we believe a machine learning system can and should leverage such information as well for better performance. In this paper, we employ such semantic relatedness among image categories for large-scale image categorization. Specifically, a category hierarchy is utilized to properly define loss function and select common set of features for related categories. An efficient optimization method based on proximal approximation and accelerated parallel gradient method is introduced. Experimental results on a subset of ImageNet containing 1.2 million images from 1000 categories demonstrate the effectiveness and promise of our proposed approach.

category, hierarchy, image category, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.67)

Add feedback

Semantic Labeling of 3D Point Clouds for Indoor Scenes

Neural Information Processing SystemsMar-15-2024, 06:56:22 GMT

Inexpensive RGB-D cameras that give an RGB image together with depth data have become widely available. In this paper, we use this data to build 3D point clouds of full indoor scenes such as an office and address the task of semantic labeling of these 3D point clouds. We propose a graphical model that captures various features and contextual relations, including the local visual appearance and shape cues, object co-occurence relationships and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important and we address that by using multiple types of edge potentials. The model admits efficient approximate inference, and we train it using a maximum-margin learning approach. In our experiments over a total of 52 3D scenes of homes and offices (composed from about 550 views, having 2495 segments labeled with 27 object classes), we get a performance of 84.06% in labeling 17 object classes for offices, and 73.38% in labeling 17 object classes for home scenes. Finally, we applied these algorithms successfully on a mobile robot for the task of finding objects in large cluttered rooms.

experiment, point cloud, precision and recall, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Why The Brain Separates Face Recognition From Object Recognition

Neural Information Processing SystemsMar-14-2024, 23:37:23 GMT

Many studies have uncovered evidence that visual cortex contains specialized regions involved in processing faces but not other object classes. Recent electrophysiology studies of cells in several of these specialized regions revealed that at least some of these regions are organized in a hierarchical manner with viewpointspecific cells projecting to downstream viewpoint-invariant identity-specific cells [1]. A separate computational line of reasoning leads to the claim that some transformations of visual inputs that preserve viewed object identity are class-specific. In particular, the 2D images evoked by a face undergoing a 3D rotation are not produced by the same image transformation (2D) that would produce the images evoked by an object of another class undergoing the same 3D rotation. However, within the class of faces, knowledge of the image transformation evoked by 3D rotation can be reliably transferred from previously viewed faces to help identify a novel face at a new viewpoint. We show, through computational simulations, that an architecture which applies this method of gaining invariance to class-specific transformations is effective when restricted to faces and fails spectacularly when applied to other object classes. We argue here that in order to accomplish viewpoint-invariant face identification from a single example view, visual cortex must separate the circuitry involved in discounting 3D rotations of faces from the generic circuitry involved in processing other objects. The resulting model of the ventral stream of visual cortex is consistent with the recent physiology results showing the hierarchical organization of the face processing network.

invariance, transformation, visual cortex, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.90)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.65)

Add feedback

Semantic Kernel Forests from Multiple Taxonomies

Neural Information Processing SystemsMar-14-2024, 17:44:51 GMT

When learning features for complex visual recognition problems, labeled image exemplars alone can be insufficient. While an object taxonomy specifying the categories' semantic relationships could bolster the learning process, not all relationships are relevant to a given visual classification task, nor does a single taxonomy capture all ties that are relevant. In light of these issues, we propose a discriminative feature learning approach that leverages multiple hierarchical taxonomies representing different semantic views of the object categories (e.g., for animal classes, one taxonomy could reflect their phylogenic ties, while another could reflect their habitats). For each taxonomy, we first learn a tree of semantic kernels, where each node has a Mahalanobis kernel optimized to distinguish between the classes in its children nodes. Then, using the resulting semantic kernel forest, we learn class-specific kernel combinations to select only those relationships relevant to recognize each object class. To learn the weights, we introduce a novel hierarchical regularization term that further exploits the taxonomies' structure. We demonstrate our method on challenging object recognition datasets, and show that interleaving multiple taxonomic views yields significant accuracy improvements.

kernel, semantic kernel forest, taxonomy, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.47)

Add feedback

3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model

Neural Information Processing SystemsMar-14-2024, 11:03:08 GMT

This paper addresses the problem of category-level 3D object detection. Given a monocular image, our aim is to localize the objects in 3D by enclosing them with tight oriented 3D bounding boxes. We propose a novel approach that extends the well-acclaimed deformable part-based model [1] to reason in 3D. Our model represents an object class as a deformable 3D cuboid composed of faces and parts, which are both allowed to deform with respect to their anchors on the 3D box. We model the appearance of each face in fronto-parallel coordinates, thus effectively factoring out the appearance variation induced by viewpoint.

cuboid, deformation, detection, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Add feedback

Localizing 3D cuboids in single-view images

Neural Information Processing SystemsMar-14-2024, 08:57:11 GMT

In this paper we seek to detect rectangular cuboids and localize their corners in uncalibrated single-view images depicting everyday scenes. In contrast to recent approaches that rely on detecting vanishing points of the scene and grouping line segments to form cuboids, we build a discriminative parts-based detector that models the appearance of the cuboid corners and internal edges while enforcing consistency to a 3D cuboid model. Our model copes with different 3D viewpoints and aspect ratios and is able to detect cuboids across many different object categories. We introduce a database of images with cuboid annotations that spans a variety of indoor and outdoor scenes and show qualitative and quantitative results on our collected database.

corner location, cuboid, detection, (13 more...)

Neural Information Processing Systems

Country:

Asia (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.35)

Add feedback

3D Object Proposals for Accurate Object Class Detection Xiaozhi Chen 1 Andrew Berneshawi

Neural Information Processing SystemsMar-13-2024, 00:16:13 GMT

The goal of this paper is to generate high-quality 3D object proposals in the context of autonomous driving. Our method exploits stereo imagery to place proposals in the form of 3D bounding boxes. We formulate the problem as minimizing an energy function encoding object size priors, ground plane as well as several depth informed features that reason about free space, point cloud densities and distance to the ground. Our experiments show significant performance gains over existing RGB and RGB-D object proposal methods on the challenging KITTI benchmark. Combined with convolutional neural net (CNN) scoring, our approach outperforms all existing results on all three KITTI object classes.

machine learning, object-oriented architecture, proposal, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.68)

Industry:

Automobiles & Trucks (0.90)
Transportation > Ground > Road (0.50)
Information Technology > Robotics & Automation (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.86)

Add feedback

Hierarchical Object Representation for Open-Ended Object Category Learning and Recognition

Neural Information Processing SystemsMar-12-2024, 09:02:34 GMT

Most robots lack the ability to learn new objects from past experiences. To migrate a robot to a new environment one must often completely re-generate the knowledgebase that it is running with. Since in open-ended domains the set of categories to be learned is not predefined, it is not feasible to assume that one can pre-program all object categories required by robots. Therefore, autonomous robots must have the ability to continuously execute learning and recognition in a concurrent and interleaved fashion. This paper proposes an open-ended 3D object recognition system which concurrently learns both the object categories and the statistical features for encoding objects. In particular, we propose an extension of Latent Dirichlet Allocation to learn structural semantic features (i.e.

category, natural language, object-oriented architecture, (20 more...)

Neural Information Processing Systems

Country: