AITopics

Country: North America > United States > Maryland > Montgomery County > Potomac (0.06)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military > Air Force (1.00)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.65)

#artificialintelligence Jun-24-2021, 14:50:09 GMT

Lip-Reading AI is Under Development, Under Watchful Eyes - AI Trends

A lip-reading app from Irish startup Liopa is said to represent a breakthrough in the field of visual speech recognition (VSR), which trains AI to read lips without any audio input. Liopa's product, SRAVI (Speech Recognition App for the Voice Impaired), is a communication aid for speech-impaired patients. It is likely to be the first lip-reading AI app available for public purchase, according to an account from Vice/Motherboard. Researchers, driven by a range of potential commercial applications including surveillance tools, have been working for years to teach computers to lip-read, and it has proven a challenging task. Liopa is working to certify SRAVI as a Class I medical device in Europe, hoping to complete the certification by August.

liopa, lip-reading ai, motherboard, (14 more...)

Country:

North America > United States (0.16)
Europe > United Kingdom > Northern Ireland > County Down > Belfast (0.05)
Europe > United Kingdom > Northern Ireland > County Antrim > Belfast (0.05)
Asia > India (0.05)

Industry:

Information Technology (0.93)
Health & Medicine (0.72)
Education > Curriculum > Subject-Specific Education (0.37)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

arXiv.org Artificial Intelligence May-30-2021

ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX

Kayal, Pratik, Anand, Mrinal, Desai, Harsh, Singh, Mayank

Tables present important information concisely in many scientific documents. Visual features like mathematical symbols, equations, and spanning cells make structure and content extraction from tables embedded in research documents difficult. This paper discusses the dataset, tasks, participants' methods, and results of the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX. Specifically, the task of the competition is to convert a tabular image to its corresponding LaTeX source code. We proposed two subtasks. In Subtask 1, we ask the participants to reconstruct the LaTeX structure code from an image. In Subtask 2, we ask the participants to reconstruct the LaTeX content code from an image. This report describes the datasets and ground truth specification, details the performance evaluation metrics used, presents the final results, and summarizes the participating methods. Submission by team VCGroup got the highest Exact Match accuracy score of 74% for Subtask 1 and 55% for Subtask 2, beating previous baselines by 5% and 12%, respectively. Although improvements can still be made to the recognition capabilities of models, this competition contributes to the development of fully automated table recognition systems by challenging practitioners to solve problems under specific constraints and sharing their approaches; the platform will remain available for post-challenge submissions at https://competitions.codalab.org/competitions/26979.
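The Exact Match accuracy mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a prediction counts as correct only when its whitespace-separated token sequence equals the reference exactly; the competition's own tokenization and normalization rules are not given in this excerpt, and the function name and sample strings are hypothetical.

```python
def exact_match_accuracy(predictions, references):
    """Return the fraction of predictions that match their reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Assumption: tokens are separated by whitespace. The competition's actual
    # tokenization and normalization may differ.
    matches = sum(
        pred.split() == ref.split()
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Hypothetical sample data, for illustration only.
references = [
    r"\begin{tabular}{ll} a & b \\ \end{tabular}",
    r"\begin{tabular}{c} x \\ \end{tabular}",
]
predictions = [
    r"\begin{tabular}{ll}  a & b \\ \end{tabular}",  # differs only in spacing
    r"\begin{tabular}{c} y \\ \end{tabular}",        # wrong cell content
]

accuracy = exact_match_accuracy(predictions, references)
print(f"Exact Match accuracy: {accuracy:.2f}")
```

Because a single wrong token makes the whole sequence count as a miss, this kind of metric is strict for long LaTeX outputs.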

competition, icdar 2021, scientific table image recognition, (1 more...)

2105.14426

Genre: Research Report (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.60)

#artificialintelligence May-29-2021, 15:50:13 GMT

How image search works at Dropbox

Image classification lets us automatically understand what's in an image, but by itself this isn't enough to enable search. Sure, if a user searches for "beach" we could return the images with the highest scores for that category, but what if they instead search for "shore"? What if instead of "apple" they search for "fruit" or "granny smith"? We could collate a large dictionary of synonyms, near-synonyms, and hierarchical relationships between words, but this quickly becomes unwieldy, especially if we support multiple languages.

Word vectors

So let's reframe the problem.
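One way word vectors could address the synonym problem is sketched below. This is an illustrative assumption, not necessarily the method the article goes on to describe: the query word is compared with each category name in word-vector space, and the resulting weights are combined with each image's classifier scores. The three-dimensional vectors, file names, and function names are all hypothetical.

```python
import math

# Hypothetical 3-dimensional word vectors. Real systems use learned embeddings
# with hundreds of dimensions.
WORD_VECTORS = {
    "beach": [0.9, 0.1, 0.0],
    "shore": [0.8, 0.2, 0.1],
    "apple": [0.0, 0.9, 0.2],
    "fruit": [0.1, 0.8, 0.3],
}
CATEGORIES = ["beach", "apple"]  # labels the image classifier knows about

# Hypothetical classifier output: one score per category, in CATEGORIES order.
IMAGE_SCORES = {
    "sunset.jpg": [0.95, 0.02],
    "orchard.jpg": [0.05, 0.90],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def query_to_category_space(word):
    """Express a query word as one similarity weight per known category."""
    query_vector = WORD_VECTORS[word]
    return [cosine(query_vector, WORD_VECTORS[c]) for c in CATEGORIES]


def rank_images(word, image_scores):
    """Order images by the dot product of query weights and category scores."""
    weights = query_to_category_space(word)
    relevance = {
        name: sum(w * s for w, s in zip(weights, scores))
        for name, scores in image_scores.items()
    }
    return sorted(relevance, key=relevance.get, reverse=True)


shore_ranking = rank_images("shore", IMAGE_SCORES)
fruit_ranking = rank_images("fruit", IMAGE_SCORES)
print(shore_ranking)
print(fruit_ranking)
```

With these toy vectors, "shore" and "fruit" rank the right image first even though neither word is a classifier category, and no synonym dictionary is needed.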

category space, image search work, vector, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

#artificialintelligence May-24-2021, 20:51:03 GMT

Image Search -- Transfer Learning with CNN (Convolutional Neural Network)

The goal is to build an image search engine that retrieves the most similar images from the database based on specific target images. Given a query image (containing a specific instance) and a collection of images with different contents, we want to find the images in the collection that contain the same query instance. The images below are two examples of query images (original cropped). The image below is the query result using ResNet transfer learning. Since I have ten query images, there are ten rows of images, with each row containing the ten most similar images to the query image.
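The retrieval step described above can be sketched as a nearest-neighbour search over feature vectors. The sketch assumes each image has already been turned into a feature vector by a pretrained CNN such as ResNet; the extraction itself is omitted, and the four-dimensional vectors, file names, and function names are hypothetical. Cosine similarity is one common choice of similarity measure, and the post may use a different one.

```python
import heapq
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k_similar(query_features, database, k=10):
    """Return the names of the k database images most similar to the query.

    Similarity is the cosine similarity between feature vectors.
    """
    query = l2_normalize(query_features)
    scored = (
        (sum(a * b for a, b in zip(query, l2_normalize(features))), name)
        for name, features in database.items()
    )
    return [name for _, name in heapq.nlargest(k, scored)]


# Hypothetical feature vectors. In practice these would come from the
# penultimate layer of a pretrained CNN and have hundreds of dimensions.
DATABASE = {
    "cat_1.jpg": [0.9, 0.1, 0.0, 0.2],
    "cat_2.jpg": [0.8, 0.2, 0.1, 0.1],
    "car_1.jpg": [0.0, 0.1, 0.9, 0.7],
}
query = [1.0, 0.1, 0.0, 0.1]

results = top_k_similar(query, DATABASE, k=2)
print(results)
```

Running `top_k_similar` once per query image with `k=10` would give the ten-by-ten grid of results the post describes.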

convolutional neural network, query image, transfer learning, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

#artificialintelligence May-19-2021, 19:10:15 GMT

Image Recognition AI: Algorithms And Applications

Machine learning began with humans feeding information to computers through keyboards so that the machines could develop learned patterns. This process relied heavily on the human's ability to enter the correct information and help the computer develop its patterns. This breakthrough does not require someone to feed the information to the computer or be its eyes, so to speak, because the new technique allows machines to interpret and categorize whatever they see in images or videos. In other words, computers now have their own eyes.

algorithm and application, image recognition ai, information, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

#artificialintelligence May-15-2021, 21:10:11 GMT

Deep Residual Learning for Image Recognition (2015)

Short summaries (1–2 minutes reading time) to help you (and me) understand and remember important papers/concepts about machine learning and related topics. "If you can't explain it simply, you don't understand it well enough" -- Einstein, maybe.

deep residual learning, image recognition

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

arXiv.org Artificial Intelligence May-5-2021

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

Ding, Xiaohan, Zhang, Xiangyu, Han, Jungong, Ding, Guiguang

We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers. Compared to convolutional layers, FC layers are more efficient, better at modeling the long-range dependencies and positional patterns, but worse at capturing the local structures, hence usually less favored for image recognition. We propose a structural re-parameterization technique that adds local prior into an FC to make it powerful for image recognition. Specifically, we construct convolutional layers inside a RepMLP during training and merge them into the FC for inference. On CIFAR, a simple pure-MLP model shows performance very close to that of a CNN. By inserting RepMLP into a traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs. Our intriguing findings highlight that combining the global representational capacity and positional perception of FC with the local prior of convolution can improve the performance of neural networks, with faster speed, on both tasks with translation invariance (e.g., semantic segmentation) and those with aligned images and positional patterns (e.g., face recognition). The code and models are available at https://github.com/DingXiaoH/RepMLP.
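The idea of merging a convolution into an FC layer can be illustrated with a heavily simplified sketch. It assumes a one-dimensional, single-channel input, a zero-padded convolution, and no batch normalization, and it is not the authors' implementation; all function names and values are hypothetical. Since a convolution is linear, applying it to each basis vector yields the columns of an equivalent matrix, which can then be added to the FC weights.

```python
def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def conv_as_fc(kernel, n):
    """Build the n-by-n matrix equivalent to the convolution.

    Column j is the convolution's response to the j-th basis vector.
    """
    columns = [
        conv1d_same([1.0 if i == j else 0.0 for i in range(n)], kernel)
        for j in range(n)
    ]
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def merge_conv_into_fc(fc_weight, kernel):
    """Add the convolution's equivalent matrix to the FC weights."""
    n = len(fc_weight)
    conv_matrix = conv_as_fc(kernel, n)
    return [
        [fc_weight[i][j] + conv_matrix[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical weights and input, for illustration only.
fc_weight = [
    [0.5, -1.0, 0.0, 2.0],
    [1.0, 0.0, 0.5, -0.5],
    [0.0, 1.5, -1.0, 0.0],
    [2.0, 0.0, 0.0, 1.0],
]
kernel = [0.25, 0.5, -0.25]
x = [1.0, 2.0, -1.0, 3.0]

# Training-time form: FC branch plus parallel convolution branch.
train_out = [a + b for a, b in zip(matvec(fc_weight, x), conv1d_same(x, kernel))]
# Inference-time form: a single merged FC layer.
infer_out = matvec(merge_conv_into_fc(fc_weight, kernel), x)

max_diff = max(abs(a - b) for a, b in zip(train_out, infer_out))
print(f"largest difference between the two forms: {max_diff:.2e}")
```

The two forms agree up to floating-point error, which is why the extra convolution branch adds no cost at inference time in this simplified setting.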

artificial intelligence, machine learning, pattern recognition, (17 more...)

2105.01883

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence May-5-2021

Attention for Image Registration (AiR): an unsupervised Transformer approach

Wang, Zihao, Delingette, Hervé

Image registration, an important basis for signal processing tasks, often encounters problems of stability and efficiency. Non-learning registration approaches rely on optimizing similarity metrics between the fixed and moving images. Yet those approaches are usually costly in both time and space complexity. The problem can be worse when the image is large or the deformations between the images are severe. Recently, deep learning, or more precisely convolutional neural network (CNN) based image registration methods, have been widely investigated in the research community and show promising effectiveness in overcoming the weaknesses of non-learning-based methods. To explore advanced learning approaches to the image registration problem for solving practical issues, we present in this paper a method that introduces an attention mechanism into the deformable image registration problem. The proposed approach is based on learning the deformation field with a Transformer framework (AiR) that does not rely on a CNN but can also be trained efficiently on GPGPU devices. Put more vividly, we treat the image registration problem as a language translation task and introduce a Transformer to tackle it. Our method learns to generate a deformation map without supervision and is tested on two benchmark datasets. The source code of AiR will be released on GitLab.

image registration, registration, transformer, (12 more...)

2105.02282

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.71)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence May-4-2021

PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Table Image Recognition to Latex

He, Yelin, Qi, Xianbiao, Ye, Jiaquan, Gao, Peng, Chen, Yihao, Li, Bingcong, Tang, Xin, Xiao, Rong

Converting a table image into LaTeX code is challenging due to the complexity and diversity of table structures and the long-sequence problem compared to traditional OCR. The challenge aims at assessing the ability of state-of-the-art methods to convert scientific table images into LaTeX code. In this competition, there are two subtasks with different levels of difficulty. Subtask I, Table Structure Reconstruction, is to reconstruct the structure of a table image in the form of LaTeX code while ignoring the content of the table. Subtask II, Table Content Reconstruction, is to reconstruct the structure and the content of a table image simultaneously in the form of LaTeX code.

accuracy, competition, prediction accuracy, (12 more...)

2105.01846

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.42)