AITopics

Country: Asia > India (0.25)

Industry: Materials > Metals & Mining > Diamonds (1.00)

Technology:

Information Technology > Cloud Computing (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.30)

arXiv.org Artificial Intelligence Aug-26-2022

Temporal Fuzzy Utility Maximization with Remaining Measure

Wan, Shicheng, Ye, Zhenqiang, Gan, Wensheng, Chen, Jiahui

High utility itemset mining approaches discover hidden patterns from large amounts of temporal data. However, an inescapable problem of high utility itemset mining is that its results hide the quantities of the patterns, which causes poor interpretability. The results reflect only the shopping trends of customers and cannot help decision makers quantify the collected information. In linguistic terms, computers use mathematical or programming languages that are precisely formalized, but the language used by humans is always ambiguous. In this paper, we propose a novel one-phase temporal fuzzy utility itemset mining approach called TFUM. It revises temporal fuzzy-lists to keep less, but the most important, information about potential high temporal fuzzy utility itemsets in memory, and then discovers the complete set of truly interesting patterns in a short time. In particular, this paper is the first to adopt the remaining measure in the temporal fuzzy utility itemset mining domain. The remaining maximal temporal fuzzy utility is a tighter and stronger upper bound than those adopted in previous studies. Hence, it plays an important role in pruning the search space in TFUM. Finally, we evaluate the efficiency and effectiveness of TFUM on various datasets. Extensive experimental results indicate that TFUM outperforms the state-of-the-art algorithms in terms of runtime cost, memory usage, and scalability. In addition, the experiments show that the remaining measure can significantly prune unnecessary candidates during mining.
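To make the idea of a "remaining" upper bound concrete, here is a minimal sketch of the classical remaining-utility bound used in list-based utility mining. It is not TFUM's temporal fuzzy measure: the function name `remaining_utility_bound`, the dict-based transaction layout, and the fixed item `order` are all assumptions made for illustration, and fuzzy membership and time periods are left out entirely.

```python
def remaining_utility_bound(transactions, itemset, order):
    """Return (utility, upper_bound) for `itemset` (illustrative only).

    transactions: list of dicts mapping item -> utility in that transaction.
    order: total order on items; extensions only append items that come
           after the last item of `itemset` in this order.
    """
    rank = {item: i for i, item in enumerate(order)}
    last = max(rank[item] for item in itemset)
    utility = 0
    bound = 0
    for t in transactions:
        if not all(item in t for item in itemset):
            continue  # the itemset does not occur in this transaction
        u = sum(t[item] for item in itemset)
        # Remaining utility: items in the transaction ranked after the itemset.
        ru = sum(v for item, v in t.items() if rank[item] > last)
        utility += u
        bound += u + ru
    return utility, bound


transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 4},
    {"b": 6, "c": 2},
]
utility, bound = remaining_utility_bound(transactions, {"a"}, ["a", "b", "c"])
```

In this toy data, `{"a"}` has utility 8 and bound 15. With a minimum utility threshold above 15, no extension of `{"a"}` could qualify, so that branch of the search could be pruned without being explored.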

algorithm, itemset, transaction, (14 more...)

2208.12439

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
(3 more...)

arXiv.org Artificial Intelligence Aug-26-2022

A Generic Algorithm for Top-K On-Shelf Utility Mining

Chen, Jiahui, Guo, Xu, Gan, Wensheng, Wan, Shichen, Yu, Philip S.

On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility in their selling time period. Compared with traditional utility mining, OSUM can find more practical and meaningful patterns in real-life applications. However, there is a major drawback to traditional OSUM. For normal users, it is hard to define a minimum threshold minutil that yields the right number of on-shelf high utility itemsets. On one hand, if the threshold is set too high, too few patterns are found. On the other hand, if the threshold is set too low, too many patterns are discovered, wasting time and memory. To address this issue, the user can instead directly specify a parameter k, so that only the top-k high relative utility itemsets are considered. Therefore, in this paper, we propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns. TOIT applies a novel strategy to raise the minutil based on the on-shelf datasets. In addition, two novel upper-bound strategies, named subtree utility and local utility, are applied to prune the search space. By adopting these strategies, the TOIT algorithm can narrow the search space as early as possible, improve mining efficiency, and reduce memory consumption, so it can obtain better performance than other algorithms. A series of experiments has been conducted on real datasets of different types to compare TOIT with the state-of-the-art KOSHU algorithm. The experimental results show that TOIT outperforms KOSHU in both running time and memory consumption.
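The general idea of replacing a user-supplied minutil with a parameter k can be sketched with a min-heap whose smallest entry becomes the internal threshold. This is a generic top-k illustration, not the TOIT algorithm: the function name `top_k_by_utility` and the input format (precomputed itemset and utility pairs) are assumptions, and TOIT's threshold-raising, subtree utility, and local utility strategies are not reproduced here.

```python
import heapq


def top_k_by_utility(candidates, k):
    """Keep the k highest-utility itemsets from (itemset, utility) pairs.

    Returns (ranked, minutil): `ranked` lists (items_tuple, utility) from
    highest to lowest utility, and `minutil` is the internal threshold
    after all candidates have been seen.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    heap = []    # min-heap of (utility, sorted item tuple)
    minutil = 0  # internal threshold; starts low and only rises
    for itemset, utility in candidates:
        if utility < minutil:
            continue  # cannot enter the current top-k
        entry = (utility, tuple(sorted(itemset)))
        if len(heap) < k:
            heapq.heappush(heap, entry)
        else:
            heapq.heappushpop(heap, entry)  # evict the current minimum
        if len(heap) == k:
            minutil = heap[0][0]  # raise the threshold to the k-th best
    ranked = [(items, u) for u, items in sorted(heap, reverse=True)]
    return ranked, minutil


candidates = [({"a"}, 10), ({"b"}, 25), ({"a", "b"}, 40), ({"c"}, 5), ({"b", "c"}, 30)]
ranked, minutil = top_k_by_utility(candidates, 2)
```

Here the threshold rises from 0 to 10, then 25, then 30 as better itemsets arrive, so the candidate with utility 5 is rejected without touching the heap. In a real miner the rising threshold would also feed the upper-bound pruning of the search space.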

algorithm, itemset, transaction, (15 more...)

2208.1423

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.47)

arXiv.org Artificial Intelligence Aug-26-2022

An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics

Mostafa, Aly, Mohamed, Omar, Ashraf, Ali, Elbehery, Ahmed, Jamal, Salma, Salah, Anas, Ghoneim, Amr S.

This research is the second phase in a series of investigations on developing Optical Character Recognition (OCR) for Arabic historical documents and examining how different modeling procedures interact with the problem. The first study examined the effect of Transformers on our custom-built Arabic dataset. One of its downsides was the size of the training data, a mere 15,000 of our 30 million images, due to a lack of resources. In this phase, we also add an image enhancement layer, time and space optimization, and a post-correction layer to help the model predict the correct word for the context. Notably, we propose an end-to-end text recognition approach that uses a Vision Transformer, namely BEIT, as the encoder and a vanilla Transformer as the decoder, eliminating CNNs for feature extraction and reducing the model's complexity. The experiments show that our end-to-end model outperforms convolutional backbones. The model attained a character error rate (CER) of 4.46%.
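For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between the predicted text and the reference, divided by the reference length. The sketch below follows that common definition; the abstract does not say how the authors computed their figure, so the function names and the choice to count each Unicode code point (including Arabic diacritics) as one character are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


rate = cer("abcd", "abed")  # one substitution over four characters
```

Under this definition, a CER of 4.46% corresponds to roughly 4 to 5 character edits per 100 reference characters.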

dataset, neural network, segmentation, (14 more...)

2208.11484

Country:

Africa > Middle East > Egypt (0.04)
Europe > Switzerland > Fribourg > Fribourg (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.89)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.89)

arXiv.org Artificial Intelligence Aug-25-2022

2nd Place Solutions for UG2+ Challenge 2022 -- D$^{3}$Net for Mitigating Atmospheric Turbulence from Images

Khowaja, Sunder Ali, Lee, Ik Hyun, Yoon, Jiseok

This technical report briefly introduces the D$^{3}$Net proposed by our team "TUK-IKLAB" for Atmospheric Turbulence Mitigation in the UG2+ Challenge at CVPR 2022. In light of the test and validation results, on textual images for improving text recognition performance and on hot-air balloon images for image enhancement, we can say that the proposed method achieves state-of-the-art performance. Furthermore, we provide a visual comparison of the proposed work with publicly available denoising, deblurring, and frame averaging methods. The proposed method ranked 2nd on the final leaderboard of the challenge in the testing phase.

atmospheric turbulence mitigation, hot-air balloon image, mitigating atmospheric turbulence, (10 more...)

2208.12332

Country:

Asia > South Korea (0.06)
Asia > Pakistan > Sindh (0.05)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.50)

arXiv.org Artificial Intelligence Aug-24-2022

A Survey of Open Source Automation Tools for Data Science Predictions

Hoell, Nicholas

We present an expository overview of technical and cultural challenges to the development and adoption of automation at various stages in the data science prediction lifecycle, restricting focus to supervised learning with structured datasets. In addition, we review popular open source Python tools implementing common solution patterns for the automation challenges and highlight gaps where we feel progress still needs to be made.

machine learning, pattern recognition, programming language, (21 more...)

2208.11792

Country: North America > United States (0.67)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy > Oil & Gas (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(8 more...)

arXiv.org Artificial Intelligence Aug-24-2022

Bugs in the Data: How ImageNet Misrepresents Biodiversity

Luccioni, Alexandra Sasha, Rolnick, David

ImageNet-1k is a dataset often used for benchmarking machine learning (ML) models and evaluating tasks such as image recognition and object detection. Wild animals make up 27% of ImageNet-1k but, unlike classes representing people and objects, these data have not been closely scrutinized. In the current paper, we analyze the 13,450 images from 269 classes that represent wild animals in the ImageNet-1k validation set, with the participation of expert ecologists. We find that many of the classes are ill-defined or overlapping, and that 12% of the images are incorrectly labeled, with some classes having >90% of images incorrect. We also find that both the wildlife-related labels and images included in ImageNet-1k present significant geographical and cultural biases, as well as ambiguities such as artificial animals, multiple species in the same image, or the presence of humans. Our findings highlight serious issues with the extensive use of this dataset for evaluating ML systems, the use of such algorithms in wildlife-related tasks, and more broadly the ways in which ML datasets are commonly created and curated.

annotator, dataset, imagenet-1k, (15 more...)

2208.11695

Country:

North America > United States (0.15)
Asia (0.04)
Africa (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.34)

#artificialintelligence Aug-23-2022, 02:29:40 GMT

OpenCV 4 Computer Vision Application Programming Cookbook: Build complex computer vision applications with OpenCV and C++, 4th Edition

Millan Escriva, David, Laganiere, Robert

ISBN: 9781789340723 (Amazon.com)

David Millán Escrivá was eight years old when he wrote his first program in BASIC on an 8086 PC, which enabled the 2D plotting of basic equations. From there he went on to develop many applications and games. In 2005, he completed his studies in IT at the Universitat Politécnica de Valencia with honors in human-computer interaction supported by Computer Vision with OpenCV (v0.96). His final project was based on this subject, and he published it at the HCI Spanish Congress. In 2014, he completed his Master's degree in artificial intelligence, computer graphics, and pattern recognition, focusing on pattern recognition and computer vision.

build complex computer vision application, computer vision application programming cookbook, pattern recognition, (4 more...)

Industry: Retail > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.73)

#artificialintelligence Aug-14-2022, 03:18:39 GMT

SAS Predictive Modeling

What you'll learn: the value of predictive modeling with SAS Enterprise Miner, along with skills such as analyzing data to spot complex patterns, coding, and a strong understanding of the concepts. Predictive modeling is the process of studying data models. Prediction relies on a variety of statistical methods, and SAS Enterprise Miner provides several tools for predictive modeling. By the end of this course, you will have a thorough knowledge of predictive modeling with SAS Enterprise Miner.

predictive modeling, sas enterprise miner, trainee, (7 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.40)

#artificialintelligence Aug-12-2022, 22:57:30 GMT

Lens AI Is Now Used Everywhere For Google Image Search

Google Lens has been around for some time now as the search giant's de facto AI search for images and image-based text. Now, following rumors that suggested Lens for desktop platforms might be coming, searching Google via an image upload uses the Assistant-related feature too. That's based on recent reports following a roll-out on the company's search page. For clarity, that's searches found at images.google.com. The site is effectively Google's solution for reverse searching images.

google image search, google lens, lens ai, (1 more...)

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.62)