AITopics

2211.15501

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.40)

arXiv.org Artificial Intelligence Nov-22-2022

PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models

Yao, Yuan, Chen, Qianyu, Zhang, Ao, Ji, Wei, Liu, Zhiyuan, Chua, Tat-Seng, Sun, Maosong

Vision-language pre-training (VLP) has shown impressive performance on a wide range of cross-modal tasks, where VLP models without reliance on object detectors are becoming the mainstream due to their superior computation efficiency and competitive performance. However, the removal of object detectors also deprives VLP models of the capability for explicit object modeling, which is essential to various position-sensitive vision-language (VL) tasks, such as referring expression comprehension and visual commonsense reasoning. To address this challenge, we introduce PEVL, which enhances the pre-training and prompt tuning of VLP models with explicit object position modeling. Specifically, PEVL reformulates discretized object positions and language in a unified language modeling framework, which facilitates explicit VL alignment during pre-training, and also enables flexible prompt tuning for various downstream tasks. We show that PEVL enables state-of-the-art performance of detector-free VLP models on position-sensitive tasks such as referring expression comprehension and phrase grounding, and also improves the performance on position-insensitive tasks with grounded inputs. We make the data and code for this paper publicly available at https://github.com/thunlp/PEVL.
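The idea of expressing discretized object positions in the same sequence as language can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the PEVL code: the bin count of 512, the `<pos_k>` token spelling, and the helper names `discretize_box`, `box_to_tokens` and `ground_phrase` are hypothetical.

```python
def discretize_box(box, image_width, image_height, num_bins=512):
    """Map a pixel-space box (x_min, y_min, x_max, y_max) to integer bin indices.

    num_bins=512 is an assumed value chosen for this sketch.
    """
    x_min, y_min, x_max, y_max = box

    def to_bin(value, size):
        # Clamp to the image, scale to [0, num_bins), keep the far edge in range.
        value = min(max(float(value), 0.0), float(size))
        return min(int(value / size * num_bins), num_bins - 1)

    return (
        to_bin(x_min, image_width),
        to_bin(y_min, image_height),
        to_bin(x_max, image_width),
        to_bin(y_max, image_height),
    )


def box_to_tokens(box, image_width, image_height, num_bins=512):
    """Render the four bin indices as hypothetical position tokens."""
    bins = discretize_box(box, image_width, image_height, num_bins)
    return [f"<pos_{index}>" for index in bins]


def ground_phrase(text, phrase, box, image_width, image_height, num_bins=512):
    """Insert the position tokens right after the first occurrence of phrase."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"phrase {phrase!r} not found in text")
    end = start + len(phrase)
    tokens = " ".join(box_to_tokens(box, image_width, image_height, num_bins))
    return f"{text[:end]} {tokens}{text[end:]}"
```

For example, `ground_phrase("A dog chases a ball", "dog", (320, 240, 640, 480), 640, 480)` yields a single text sequence in which the phrase is followed by four position tokens, which a language model could then be trained to predict or condition on.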

machine learning, natural language, object-oriented architecture, (21 more...)

2205.11169

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.34)

Yang, Chenhongyi, Huang, Lichao, Crowley, Elliot J.

Plug and Play Active Learning for Object Detection

arXiv.org Artificial Intelligence Nov-21-2022

Annotating data for supervised learning is expensive and tedious, and we want to do as little of it as possible. To make the most of a given "annotation budget" we can turn to active learning (AL), which aims to identify the most informative samples in a dataset for annotation. Active learning algorithms are typically uncertainty-based or diversity-based. Both have seen success in image classification, but fall short when it comes to object detection. We hypothesise that this is because: (1) it is difficult to quantify uncertainty for object detection as it consists of both localisation and classification, where some classes are harder to localise, and others are harder to classify; (2) it is difficult to measure similarities for diversity-based AL when images contain different numbers of objects. We propose a two-stage active learning algorithm, Plug and Play Active Learning (PPAL), that overcomes these difficulties. It consists of (1) Difficulty Calibrated Uncertainty Sampling, in which we use a category-wise difficulty coefficient that takes both classification and localisation into account to re-weight object uncertainties for uncertainty-based sampling; (2) Category Conditioned Matching Similarity to compute the similarities of multi-instance images as ensembles of their instance similarities. PPAL is highly generalisable because it makes no change to model architectures or detector training pipelines. We benchmark PPAL on the MS-COCO and Pascal VOC datasets using different detector architectures and show that our method outperforms the prior state-of-the-art. Code is available at https://github.com/ChenhongyiYang/PPAL.
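The first stage described above, re-weighting per-object uncertainty by a category-wise difficulty coefficient, might look roughly like the following sketch. It is a simplification under stated assumptions and not the PPAL implementation: uncertainty is taken to be the binary entropy of the detection confidence, the difficulty coefficients are given rather than learned, image scores are plain sums, and the names `binary_entropy`, `image_uncertainty` and `select_for_annotation` are hypothetical. The second stage (similarity-based diversity sampling) is not shown.

```python
import math


def binary_entropy(confidence):
    """Entropy of a detection confidence, used here as the object uncertainty."""
    p = min(max(confidence, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def image_uncertainty(detections, difficulty):
    """Sum object uncertainties, each scaled by its category's difficulty.

    detections: list of (category, confidence) pairs for one image.
    difficulty: dict mapping category to an assumed difficulty coefficient;
                categories missing from the dict default to 1.0.
    """
    return sum(
        difficulty.get(category, 1.0) * binary_entropy(confidence)
        for category, confidence in detections
    )


def select_for_annotation(pool, difficulty, budget):
    """Return the ids of the `budget` most uncertain images in the pool."""
    ranked = sorted(
        pool.items(),
        key=lambda item: image_uncertainty(item[1], difficulty),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:budget]]
```

Because the scoring only consumes detector outputs, a scheme of this shape needs no change to the detector itself, which matches the plug-and-play claim in the abstract.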

artificial intelligence, machine learning, object-oriented architecture, (15 more...)

2211.11612

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.49)

arXiv.org Artificial Intelligence Nov-18-2022

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

Dou, Zi-Yi, Kamath, Aishwarya, Gan, Zhe, Zhang, Pengchuan, Wang, Jianfeng, Li, Linjie, Liu, Zicheng, Liu, Ce, LeCun, Yann, Peng, Nanyun, Gao, Jianfeng, Wang, Lijuan

Vision-language (VL) pre-training has recently received considerable attention. However, most existing end-to-end pre-training approaches either only aim to tackle VL tasks such as image-text retrieval, visual question answering (VQA) and image captioning that test high-level understanding of images, or only target region-level understanding for tasks such as phrase grounding and object detection. We present FIBER (Fusion-In-the-Backbone-based transformER), a new VL model architecture that can seamlessly handle both these types of tasks. Instead of having dedicated transformer layers for fusion after the uni-modal backbones, FIBER pushes multimodal fusion deep into the model by inserting cross-attention into the image and text backbones, bringing gains in terms of memory and performance. In addition, unlike previous work that is either only pre-trained on image-text data or on fine-grained data with box-level annotations, we present a two-stage pre-training strategy that uses both these kinds of data efficiently: (i) coarse-grained pre-training based on image-text data; followed by (ii) fine-grained pre-training based on image-text-box data. We conduct comprehensive experiments on a wide range of VL tasks, ranging from VQA, image captioning, and retrieval, to phrase grounding, referring expression comprehension, and object detection. Using deep multimodal fusion coupled with the two-stage pre-training, FIBER provides consistent performance improvements over strong baselines across all tasks, often outperforming methods using magnitudes more data. Code is available at https://github.com/microsoft/FIBER.
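Inserting cross-attention into both backbones, rather than stacking separate fusion layers on top, can be pictured with a toy sketch. This is not the FIBER implementation: it uses a single attention head with no learned projections, represents tokens as plain lists of floats, and assumes a scalar `gate` on the cross-modal residual so that a gate of zero leaves each backbone unchanged. The names `cross_attention` and `fused_block` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention of one modality over the other."""
    dim = len(queries[0])
    attended = []
    for query in queries:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys_values
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[d] for w, value in zip(weights, keys_values)) for d in range(dim)]
        )
    return attended


def fused_block(image_tokens, text_tokens, gate):
    """One fused layer: each modality adds a gated view of the other."""
    image_context = cross_attention(image_tokens, text_tokens)
    text_context = cross_attention(text_tokens, image_tokens)
    new_image = [
        [x + gate * c for x, c in zip(token, context)]
        for token, context in zip(image_tokens, image_context)
    ]
    new_text = [
        [x + gate * c for x, c in zip(token, context)]
        for token, context in zip(text_tokens, text_context)
    ]
    return new_image, new_text
```

With `gate=0.0` the block returns its inputs unchanged, so the uni-modal backbones behave as before; raising the gate mixes in information from the other modality at that depth.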

backbone, machine learning, object-oriented architecture, (18 more...)

2206.07643

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.48)

arXiv.org Artificial Intelligence Nov-15-2022

Knowledge Retrieval using Functional Object-Oriented Network

Shaik, Naseem

Robots can complete all human-performed tasks, but due to their current lack of knowledge, some tasks still cannot be completed by them with a high degree of success. However, with the right knowledge, these tasks can be completed by robots with a high degree of success, reducing the amount of human effort required to complete daily tasks. In this paper, the functional object-oriented network (FOON), which describes the robot action success rate, is discussed. FOON is a knowledge representation for symbolic task planning that takes the shape of a graph. To demonstrate the adaptability of FOON in developing a novel and adaptive method of solving a problem using knowledge obtained from various sources, a graph retrieval methodology is shown to produce manipulation motion sequences from the FOON to accomplish a desired aim. The outcomes are illustrated using motion sequences created by the FOON to complete the desired objectives in a simulated environment.
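Graph retrieval of a motion sequence from such a network can be sketched as a backward search from the goal object. The sketch below is a simplification under stated assumptions, not the paper's method: objects are plain strings without states, each `FunctionalUnit` links input objects to output objects through one motion, success rates are ignored, and the search is a depth-first traversal that takes the first workable unit. The names `FunctionalUnit` and `retrieve_task_tree` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalUnit:
    """One edge bundle of the graph: input objects, a motion, output objects."""
    inputs: tuple
    motion: str
    outputs: tuple


def retrieve_task_tree(goal, units, available):
    """Return functional units in execution order that produce `goal`, or None."""
    plan = []
    have = set(available)

    def solve(obj, visiting):
        if obj in have:
            return True
        if obj in visiting:
            return False  # already trying to make this object further up
        for unit in units:
            if obj not in unit.outputs:
                continue
            plan_mark, have_mark = len(plan), set(have)
            if all(solve(item, visiting | {obj}) for item in unit.inputs):
                plan.append(unit)
                have.update(unit.outputs)
                return True
            # Undo partial progress before trying the next candidate unit.
            del plan[plan_mark:]
            have.clear()
            have.update(have_mark)
        return False

    return plan if solve(goal, frozenset()) else None
```

Reading the `motion` field of each returned unit in order gives the manipulation sequence; a goal that cannot be reached from the available objects yields `None`.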

artificial intelligence, object-oriented architecture, task tree, (14 more...)

2211.03037

Country: North America > United States > Florida > Hillsborough County > Tampa (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.62)

#artificialintelligence Nov-7-2022, 22:20:39 GMT

PixelRNN, image generation with RNN (lab note 1: model architecture)

With a complex image, first binarize the image intensity to 0 or 1, so as to avoid blurring the image, and then flatten each line of the image across all colour channels, i.e. keep the previous logic, but replace pixel-by-pixel generation with row-by-row generation. After generation, comparing against the original images, there is very little loss: 0.1160. The only difference between them is which RNN output sections dominate the generation of the next pixel row; in other words, for Many-To-One there's an extra call ... Assume we have an m × n × c image, where m is the number of rows, n is the number of columns, and c is the number of colour channels. For grey-scale, the input_size should be n × 1 because there is only one colour channel.
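The preprocessing described in this note can be sketched without any deep learning library. The sketch assumes the image is a nested list of shape m × n × c with intensities in [0, 1] and uses a threshold of 0.5, which the note does not specify; the helper names `binarize`, `rows_as_sequence` and `next_row_pairs` are hypothetical, and the RNN itself is not shown.

```python
def binarize(image, threshold=0.5):
    """Map every intensity in an m x n x c nested list to 0 or 1."""
    return [
        [[1 if value > threshold else 0 for value in pixel] for pixel in row]
        for row in image
    ]


def rows_as_sequence(image):
    """Flatten each row across columns and channels: m steps of length n * c."""
    return [[value for pixel in row for value in pixel] for row in image]


def next_row_pairs(sequence):
    """Pair each prefix of rows with the row that follows it."""
    return [(sequence[:t], sequence[t]) for t in range(1, len(sequence))]
```

The length of each flattened row is the `input_size` the RNN would see per step: n × 1 for a grey-scale image and n × c in general.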

image generation, lab note 1, model architecture, (2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

#artificialintelligence Nov-6-2022, 12:37:10 GMT

[100%OFF] 150+ Exercises - Object Oriented Programming In Python - OOP

Welcome to the 150 Exercises – Object Oriented Programming in Python – OOP course, where you can test your Python programming skills in object-oriented programming (OOP) and complete over 150 exercises! Python is a programming language that lets you work quickly and integrate systems more effectively. Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The course is designed for people who have basic knowledge in Python and OOP concepts. It consists of over 150 exercises with solutions.

object oriented programming, oop, python

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (1.00)

#artificialintelligence Nov-6-2022, 00:26:29 GMT

Learn Python from Zero to Hero [Basic, GUI, Web, Full Stack]

Welcome to Learn Python from Zero to Hero [Basic, GUI, Web, Full Stack]. As you know, Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Python developers are in demand. Across a wide range of fields, there is a demand for those with Python skills. If you're looking to start or change your career, it could be a vital skill to help you. It could lead to a well-paid career. There will be many job opportunities.

application, python, rest api, (8 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Software > Programming Languages (0.99)
Information Technology > Graphics (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.40)

#artificialintelligence Nov-2-2022, 12:45:21 GMT

Other - Visual C++ programming for desktop application development

Visual C++ programming for desktop application development. Published 10/2022. MP4 Video: h264, 1280x720. Audio: AAC, 44.1 KHz, 2 Ch. Genre: eLearning. Language: English. Duration: 19 lectures (3h 53m). Size: 1.69 GB.

What you'll learn: Upon successful completion of the course, the students will be able to develop Graphical User Interface (GUI)-based applications using Visual C++. Students will be able to develop GUI desktop applications in VC++ for the applications that they have previously made in a console environment using C++. Develop desktop applications using VC++ in the latest version of Microsoft Visual Studio that will enable students to perform various user interface operations. Students previously knowing only C++ will be able to learn how to develop Graphical User Interface applications through VC++ via easy-to-learn short tutorials.

Requirements: Basic knowledge of C++ (console-based programming). Basic knowledge of Object-Oriented programming.

Description: Welcome to the course Beginning Visual C++ programming for desktop application development. This is a must-take course if you have just learned basic C++ using the console interface and are wondering how various user-interface applications can be created using C++. This course will enable you to understand the basics of desktop application development using the latest version of Microsoft Visual Studio. The teaching methodology of this course is based on hands-on, topic-specific examples that enable quicker learning. In this course, you will be learning VC++ using the latest version of Microsoft Visual Studio.

application, desktop application development, qtzjt, (12 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (0.75)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.59)

#artificialintelligence Nov-1-2022, 12:08:20 GMT

[100%OFF] Certified Associate & Professional Python Programming Pack

Are you ready to take the PCAP – Certified Associate in Python Programming exam? The last three exams are in the form of practice tests and consist of 240 questions that may appear during the PCAP – Certified Associate in Python Programming exam. Where necessary, explanations are added to the questions. This course allows you to confirm your proficiency and gives you the confidence you need to earn the PCAP – Certified Associate in Python Programming certification. The PCAP – Certified Associate in Python Programming certification is a professional, high-stakes credential that measures the candidate's ability to perform intermediate-level coding tasks in the Python language, including the ability to design, develop, debug, execute, and refactor multi-module Python programs. It also measures their skills and knowledge related to analyzing and modeling real-life problems in OOP categories with the use of the fundamental notions and techniques available in the object-oriented approach.

certified associate, certified professional, python programming 1, (13 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.42)

Industry:

Information Technology > Software (0.40)
Education (0.32)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.55)