 computer vision problem


Interview with Filippos Gouidis: Object state classification

AIHub

Filippos's PhD dissertation focuses on developing a method for recognizing object states without visual training data. By leveraging semantic knowledge from online sources and Large Language Models, structured as Knowledge Graphs, Graph Neural Networks learn representations for accurate state classification. In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we met with Filippos Gouidis, who has recently completed his PhD, and found out more about his research on object state classification.


Transfer Learning Applied to Computer Vision Problems: Survey on Current Progress, Limitations, and Opportunities

arXiv.org Artificial Intelligence

The field of Computer Vision (CV) has faced challenges. Initially, it relied on handcrafted features and rule-based algorithms, resulting in limited accuracy. The introduction of machine learning (ML) has brought progress, particularly Transfer Learning (TL), which addresses various CV problems by reusing pre-trained models. TL requires less data and computing while delivering nearly equal accuracy, making it a prominent technique in the CV landscape. Our research focuses on TL development and how CV applications use it to solve real-world problems. We discuss recent developments, limitations, and opportunities.


Instability of computer vision models is a necessary result of the task itself

arXiv.org Machine Learning

Adversarial examples resulting from instability of current computer vision models are an extremely important topic due to their potential to compromise any application. In this paper we demonstrate that instability is inevitable due to a) symmetries (translational invariance) of the data, b) the categorical nature of the classification task, and c) the fundamental discrepancy of classifying images as objects themselves. The issue is further exacerbated by non-exhaustive labelling of the training data. Therefore, we conclude that instability is a necessary result of how the problem of computer vision is currently formulated. While the problem cannot be eliminated, through the analysis of the causes, we have arrived at ways in which it can be partially alleviated. These include i) increasing the resolution of images, ii) providing contextual information for the image, iii) exhaustive labelling of training data, and iv) preventing attackers from frequent access to the computer vision system.
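As a rough illustration of point b) above (not the paper's own analysis), the sketch below uses a hypothetical linear classifier to show how a categorical decision makes small input changes flip the predicted label. The weights, input, and the `minimal_flip` helper are assumptions made up for this example.

```python
def score(w, b, x):
    """Linear decision score: positive means class 1, otherwise class 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def minimal_flip(w, b, x, margin=1.01):
    """Return a perturbed input that crosses the decision boundary.

    Every coordinate moves by the same amount eps against the current score.
    For a linear model this changes the score by eps * sum(|w_i|), so
    eps = margin * |score| / sum(|w_i|) is just enough to flip the label.
    If the score is exactly zero, eps is zero and nothing changes.
    """
    s = score(w, b, x)
    l1_norm = sum(abs(wi) for wi in w)
    eps = margin * abs(s) / l1_norm
    direction = -1 if s > 0 else 1
    x_adv = [xi + direction * eps * sign(wi) for xi, wi in zip(x, w)]
    return x_adv, eps


# Hypothetical toy values, chosen only for illustration.
w = [0.5, -1.0, 2.0]
b = -0.1
x = [1.0, 0.5, 0.25]

x_adv, eps = minimal_flip(w, b, x)
print(predict(w, b, x), predict(w, b, x_adv), round(eps, 4))
```

The closer an input sits to the decision boundary, the smaller the perturbation needed to change its class, which is one informal way to read the claim that a categorical output forces some instability.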


Optimization on a Budget: A Reinforcement Learning Approach

Neural Information Processing Systems

Many popular optimization algorithms, like the Levenberg-Marquardt algorithm (LMA), use heuristic-based "controllers" that modulate the behavior of the optimizer during the optimization process. For example, in the LMA a damping parameter is dynamically modified based on a set of rules that were developed using various heuristic arguments. Reinforcement learning (RL) is a machine learning approach to learn optimal controllers by examples and thus is an obvious candidate to improve the heuristic-based controllers implicit in the most popular and heavily used optimization algorithms. Improving the performance of off-the-shelf optimizers is particularly important for time-constrained optimization problems. For example, the LMA has become popular for many real-time computer vision problems, including object tracking from video, where only a small amount of time can be allocated to the optimizer on each incoming video frame.
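To make the damping heuristic concrete, here is a minimal sketch of a Levenberg-Marquardt loop with a common textbook rule: shrink the damping after a step that lowers the cost, grow it after a step that does not. This is not the paper's RL controller. The model `y = a * exp(b * x)`, the factor of 10, the damping bounds, and all function names are assumptions chosen for illustration.

```python
import math


def residuals(params, xs, ys):
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    # Partial derivatives of each residual with respect to a and b.
    a, b = params
    return [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]


def cost_of(params, xs, ys):
    return sum(r * r for r in residuals(params, xs, ys))


def levenberg_marquardt(xs, ys, params, damping=1e-2, iters=50):
    cost = cost_of(params, xs, ys)
    for _ in range(iters):
        r = residuals(params, xs, ys)
        J = jacobian(params, xs)
        # Normal-equation terms: A = J^T J (2x2) and g = J^T r.
        a11 = sum(j[0] * j[0] for j in J)
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        # Damped system: (A + damping * diag(A)) * step = -g.
        d11 = a11 * (1.0 + damping)
        d22 = a22 * (1.0 + damping)
        det = d11 * d22 - a12 * a12
        if det == 0.0:
            break
        # Solve the 2x2 system with Cramer's rule.
        s1 = (-g1 * d22 + a12 * g2) / det
        s2 = (-g2 * d11 + a12 * g1) / det
        trial = (params[0] + s1, params[1] + s2)
        trial_cost = cost_of(trial, xs, ys)
        if trial_cost < cost:
            # Heuristic controller: the step helped, so trust the model more.
            params, cost = trial, trial_cost
            damping = max(damping / 10.0, 1e-12)
        else:
            # The step hurt: reject it and fall back toward gradient descent.
            damping = min(damping * 10.0, 1e10)
    return params, cost


# Noise-free synthetic data from a = 2.0, b = 0.5 (illustrative values).
xs = [i * 0.25 for i in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
fit, final_cost = levenberg_marquardt(xs, ys, (1.0, 0.0))
print(fit, final_cost)
```

The `if`/`else` branch that rescales `damping` is the kind of hand-designed controller the abstract refers to; a learned policy would take its place and choose the next damping value from the optimizer's state.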


A Guide to Human Pose Estimation for AI

#artificialintelligence

Human pose estimation and tracking is a computer vision task that includes detecting, associating, and tracking semantic key points. Examples of semantic key points are "right shoulders," "left knees," or the "left brake lights of vehicles." Tracking semantic key points in live video footage requires substantial computational resources, which has limited the accuracy of pose estimation. With the latest advances, new applications with real-time requirements become possible, such as self-driving cars and last-mile delivery robots. Today, the most powerful image processing models are based on convolutional neural networks (CNNs).


Deep Learning For Computer Vision

#artificialintelligence

Deep learning is seeing tremendous adoption in different industries. One specific area where deep learning has shown great potential is Computer Vision. I personally graduated from a computer vision master's program and went immediately to work in the industry. So what follows is my take on different trends that I am seeing in companies that are using deep learning to tackle challenging computer vision problems. So going back to my studies, in the middle of the master's program, I did an internship in a company in Luxembourg that makes large scanners of wood!


Supercharge your computer vision models with synthetic datasets built by Unity - Unity Technologies Blog

#artificialintelligence

Is your limited dataset holding back the performance of your computer vision model? Using the power of the Unity Computer Vision Perception Package, Unity can unlock the potential of your computer vision model by generating custom datasets tailored to your specific requirements. Today, Unity Computer Vision Datasets are available to customers worldwide. Find out more about this new offering. Building a quality synthetic dataset is both an art and a science.


H2O.ai Prague Meetup Number 4

#artificialintelligence

This meetup was recorded in Prague on September 19.
Talk 1: Customized Loss Function in Gradient Boosting Machine by Veronika Maurerova
About Veronika:
* Software Engineer at H2O.ai
* https://twitter.com/MaureVer
Talk 3: General pipeline for Computer Vision problems by Yauhen Babakhin
In this talk, we will consider the whole process of addressing Computer Vision problems, proceeding to the training process accompanied by some recent methods in Deep Learning, and finishing with some practical tips and tricks that could help to increase the quality of the model.


How to Develop and Demonstrate Competence With Deep Learning for Computer Vision

#artificialintelligence

Computer vision is perhaps one area that has been most impacted by developments in deep learning. It can be difficult to both develop and to demonstrate competence with deep learning for problems in the field of computer vision. It is not clear how to get started, what the most important techniques are, and the types of problems and projects that can best highlight the value that deep learning can bring to the field. One approach is to systematically develop, and at the same time demonstrate competence with, data handling, modeling techniques, and application domains and present your results in a public portfolio of completed projects. This approach allows you to compound your skills from project to project.