AITopics | mathematical expression recognition

Link prediction Graph Neural Networks for structure recognition of Handwritten Mathematical Expressions

Nguyen, Cuong Tuan, Nguyen, Ngoc Tuan, Dao, Triet Hoang Minh, Nhat, Huy Minh, Dinh, Huy Truong

arXiv.org Artificial Intelligence, Nov-5-2025

We propose a Graph Neural Network (GNN)-based approach for Handwritten Mathematical Expression (HME) recognition by modeling HMEs as graphs, where nodes represent symbols and edges capture spatial dependencies. A deep BLSTM network is used for symbol segmentation, recognition, and spatial relation classification, forming an initial primitive graph. A 2D-CFG parser then generates all possible spatial relations, while the GNN-based link prediction model refines the structure by removing unnecessary connections, ultimately forming the Symbol Label Graph. Experimental results demonstrate the effectiveness of our approach, showing promising performance in HME structure recognition.
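The refinement step described in this abstract, where a link prediction model removes unnecessary connections from an over-complete set of candidate relations, might be sketched roughly as follows. This is only an illustration under stated assumptions: the edge scores are invented, the function `prune_edges` and the 0.5 threshold are hypothetical, and the abstract does not say which decision rule the authors actually use.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (parent symbol index, child symbol index)


def prune_edges(scores: Dict[Edge, float], threshold: float = 0.5) -> List[Edge]:
    """Keep only the candidate edges whose link score reaches the threshold.

    In the paper the scores would come from the GNN link predictor; here
    they are supplied directly (an assumption made for illustration).
    """
    return sorted(edge for edge, score in scores.items() if score >= threshold)


# Hypothetical candidates for "x^2 + 1" with symbols 0:'x', 1:'2', 2:'+', 3:'1'.
candidate_scores = {
    (0, 1): 0.94,  # x -> 2 (superscript)
    (0, 2): 0.88,  # x -> + (right)
    (2, 3): 0.91,  # + -> 1 (right)
    (1, 2): 0.12,  # 2 -> + (spurious candidate)
    (0, 3): 0.07,  # x -> 1 (spurious candidate)
}

kept = prune_edges(candidate_scores)
print(kept)
```

The surviving edges form a tree over the symbols, which is the shape a symbol-level structure graph is expected to have; a fixed threshold alone does not guarantee that, so a real system would likely need an additional consistency check.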

artificial intelligence, machine learning, recognition, (14 more...)

arXiv: 2511.02288

Country:

Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

VEHME: A Vision-Language Model For Evaluating Handwritten Mathematics Expressions

Nguyen, Thu Phuong, Nguyen, Duc M., Jeon, Hyotaek, Lee, Hyunwook, Song, Hyunmin, Ko, Sungahn, Kim, Taehwan

arXiv.org Artificial Intelligence, Oct-28-2025

Automatically assessing handwritten mathematical solutions is an important problem in educational technology with practical applications, but it remains a significant challenge due to the diverse formats, unstructured layouts, and symbolic complexity of student work. To address this challenge, we introduce VEHME (a Vision-Language Model for Evaluating Handwritten Mathematics Expressions), designed to assess open-form handwritten math responses with high accuracy and interpretable reasoning traces. VEHME integrates a two-phase training pipeline: (i) supervised fine-tuning using structured reasoning data, and (ii) reinforcement learning that aligns model outputs with multi-dimensional grading objectives, including correctness, reasoning depth, and error localization. To enhance spatial understanding, we propose an Expression-Aware Visual Prompting Module, trained on our synthesized multi-line math expressions dataset to robustly guide attention in visually heterogeneous inputs. Evaluated on AIHub and FERMAT datasets, VEHME achieves state-of-the-art performance among open-source models and approaches the accuracy of proprietary systems, demonstrating its potential as a scalable and accessible tool for automated math assessment. Our training and experiment code is publicly available at our GitHub repository.

large language model, machine learning, natural language, (19 more...)

arXiv: 2510.22798

Country:

Europe > Austria > Vienna (0.14)
Asia > South Korea > Ulsan > Ulsan (0.04)
North America > United States (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.93)

Industry:

Education > Educational Technology > Educational Software (0.46)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

InkFM: A Foundational Model for Full-Page Online Handwritten Note Understanding

Fadeeva, Anastasiia, Coriou, Vincent, Antognini, Diego, Musat, Claudiu, Maksai, Andrii

arXiv.org Artificial Intelligence, Mar-29-2025

Tablets and styluses are increasingly popular for taking notes. To optimize this experience and ensure a smooth and efficient workflow, it's important to develop methods for accurately interpreting and understanding the content of handwritten digital notes. We introduce a foundational model called InkFM for analyzing full pages of handwritten content. Trained on a diverse mixture of tasks, this model offers a unique combination of capabilities: recognizing text in 28 different scripts, recognizing mathematical expressions, and segmenting pages into distinct elements like text and drawings. Our results demonstrate that these tasks can be effectively unified within a single model, achieving state-of-the-art out-of-the-box text line segmentation quality that surpasses public baselines like docTR. Fine-tuning or LoRA-tuning our base model on public datasets further improves the quality of page segmentation and achieves state-of-the-art text recognition (DeepWriting, CASIA, SCUT, and MathWriting datasets) and sketch classification (QuickDraw). This adaptability of InkFM provides a powerful starting point for developing applications with handwritten input.

large language model, machine learning, pattern recognition, (19 more...)

arXiv: 2503.23081

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)

PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition

Liu, Hongen, Cui, Cheng, Du, Yuning, Liu, Yi, Pan, Gang

arXiv.org Artificial Intelligence, Mar-24-2025

Formula recognition is an important task in document intelligence. It involves converting mathematical expressions from document images into structured symbolic formats that computers can easily work with. LaTeX is the most common format used for this purpose. In this work, we present PP-FormulaNet, a state-of-the-art formula recognition model that excels in both accuracy and efficiency. To meet the diverse needs of applications, we have developed two specialized models: PP-FormulaNet-L, tailored for high-accuracy scenarios, and PP-FormulaNet-S, optimized for high-efficiency contexts. Our extensive evaluations reveal that PP-FormulaNet-L attains accuracy levels that surpass those of prominent models such as UniMERNet by a significant 6%. Conversely, PP-FormulaNet-S operates at speeds that are over 16 times faster. These advancements facilitate seamless integration of PP-FormulaNet into a broad spectrum of document processing environments that involve intricate mathematical formulas. Furthermore, we introduce a Formula Mining System, which is capable of extracting a vast amount of high-quality formula data. This system further enhances the robustness and applicability of our formula recognition model. Code and models are publicly available at PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) and PaddleX (https://github.com/PaddlePaddle/PaddleX).

formula, machine learning, natural language, (19 more...)

arXiv: 2503.18382

Country: Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)

SemiHMER: Semi-supervised Handwritten Mathematical Expression Recognition using pseudo-labels

Chen, Kehua, Shen, Haoyang

arXiv.org Artificial Intelligence, Feb-10-2025

In recent years, deep learning with Convolutional Neural Networks (CNNs) has achieved remarkable results in the field of HMER (Handwritten Mathematical Expression Recognition). However, it remains challenging to improve performance with limited labeled training data. This paper presents, for the first time, a simple yet effective semi-supervised HMER framework by introducing dual-branch semi-supervised learning. Specifically, we simplify the conventional deep co-training from consistency regularization to cross-supervised learning, where the prediction of one branch is used as a pseudo-label to supervise the other branch directly end-to-end. Considering that the learning of the two branches tends to converge in the later stages of model optimization, we also incorporate a weak-to-strong strategy by applying different levels of augmentation to each branch, which effectively expands the training data and improves the quality of network training. In addition, we propose a novel module, the Global Dynamic Counting Module (GDCM), to enhance the performance of the HMER decoder, which alleviates recognition inaccuracies in long-distance formula recognition and the occurrence of repeated characters. We release our code at https://github.com/chenkehua/SemiHMER.
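The cross-supervision idea in this abstract, where each branch's prediction serves as a pseudo-label for the other branch, could look roughly like the sketch below for a single output token. It is a simplified illustration, not the authors' implementation: the helper names are hypothetical, the probabilities are made up, and the weak and strong augmentations as well as the stop-gradient on pseudo-labels that a real training framework would need are left out.

```python
import math
from typing import List


def argmax(probs: List[float]) -> int:
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


def cross_entropy(probs: List[float], label: int) -> float:
    """Negative log-likelihood of `label` under `probs` (clamped for safety)."""
    return -math.log(max(probs[label], 1e-12))


def cross_supervised_loss(probs_a: List[float], probs_b: List[float]) -> float:
    """Each branch is supervised by the other branch's hard pseudo-label."""
    pseudo_a = argmax(probs_a)  # pseudo-label produced by branch A
    pseudo_b = argmax(probs_b)  # pseudo-label produced by branch B
    return cross_entropy(probs_b, pseudo_a) + cross_entropy(probs_a, pseudo_b)


# Hypothetical per-token class probabilities from the two branches.
agree = cross_supervised_loss([0.7, 0.2, 0.1], [0.6, 0.3, 0.1])
disagree = cross_supervised_loss([0.7, 0.2, 0.1], [0.2, 0.7, 0.1])
print(f"agree={agree:.4f} disagree={disagree:.4f}")
```

When the branches agree the loss is small, and when they disagree each branch is pulled toward the other's prediction, which is the behaviour the abstract attributes to cross-supervised learning.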

artificial intelligence, machine learning, recognition, (17 more...)

arXiv: 2502.07172

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MNIST-Fraction: Enhancing Math Education with AI-Driven Fraction Detection and Analysis

Ahadian, Pegah, Feng, Yunhe, Kosko, Karl, Ferdig, Richard, Guan, Qiang

arXiv.org Artificial Intelligence, Dec-11-2024

Mathematics education, a crucial and basic field, significantly influences students' learning in related subjects and their future careers. The use of artificial intelligence to interpret and comprehend math problems in education has not yet been fully explored, owing to the scarcity of quality datasets and the intricacies of processing handwritten information. In this paper, we present a novel contribution to the field of mathematics education through the development of MNIST-Fraction, a dataset inspired by the renowned MNIST and specifically tailored for the recognition and understanding of handwritten math fractions. Our approach uses deep learning, specifically Convolutional Neural Networks (CNNs), to detect and analyze handwritten fractions along with their numerators and denominators. This capability is pivotal in calculating the value of fractions, a fundamental aspect of math learning. The MNIST-Fraction dataset is designed to closely mimic real-world scenarios, providing a reliable and relevant resource for AI-driven educational tools. Furthermore, we conduct a comprehensive comparison of our dataset with the original MNIST dataset using various classifiers, demonstrating the effectiveness and versatility of MNIST-Fraction in both detection and classification tasks. This comparative analysis not only validates the practical utility of our dataset but also offers insights into its potential applications in math education. To foster collaboration and further research within the computational and educational communities, our work aims to bridge the gap in high-quality educational resources for math learning, offering a valuable tool for both educators and researchers in the field.

artificial intelligence, deep learning, machine learning, (16 more...)

doi: 10.1145/3603287.365122

arXiv: 2412.08633

Country:

North America > United States > Georgia > Cobb County > Marietta (0.15)
North America > United States > Ohio > Portage County > Kent (0.05)
North America > United States > Texas > Denton County > Denton (0.04)
(9 more...)

Genre: Research Report > Promising Solution (0.46)

Industry:

Education > Curriculum > Subject-Specific Education (0.95)
Education > Educational Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Automated LaTeX Code Generation from Handwritten Math Expressions Using Vision Transformer

Sundararaj, Jayaprakash, Vyas, Akhil, Gonzalez-Maldonado, Benjamin

arXiv.org Artificial Intelligence, Dec-7-2024

Transforming mathematical expressions into LaTeX poses a significant challenge. In this paper, we examine the application of advanced transformer-based architectures to the task of converting handwritten or digital mathematical expression images into the corresponding LaTeX code. As a baseline, we utilize the current state-of-the-art CNN encoder and LSTM decoder. Additionally, we explore enhancements to the CNN-RNN architecture by replacing the CNN encoder with the pretrained ResNet50 model, modified to suit grayscale input. Further, we experiment with a vision transformer model and compare it with the baseline and the CNN-LSTM model. Our findings reveal that the vision transformer architectures outperform the baseline CNN-RNN framework, delivering higher overall accuracy and BLEU scores while achieving lower Levenshtein distances. Moreover, these results highlight the potential for further improvement through fine-tuning of model parameters. To encourage open research, we also provide the model implementation, enabling reproduction of our results and facilitating further research in this domain.
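For readers unfamiliar with the Levenshtein distance reported in this abstract, the following is a minimal sketch of the standard dynamic-programming computation applied to LaTeX token sequences. The tokenization (splitting on whitespace) and the example strings are assumptions for illustration; the abstract does not state how the authors tokenize or normalize their outputs.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a yields the empty prefix of b
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical prediction and reference, tokenized on whitespace.
predicted = r"\frac { a } { c }".split()
reference = r"\frac { a } { b }".split()
distance = levenshtein(predicted, reference)
print(distance)
```

A lower distance means the predicted LaTeX is closer to the reference, so a single wrong symbol costs one edit regardless of the expression's length.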

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2412.03853

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Local and Global Graph Modeling with Edge-weighted Graph Attention Network for Handwritten Mathematical Expression Recognition

Xie, Yejing, Zanibbi, Richard, Mouchère, Harold

arXiv.org Artificial Intelligence, Oct-24-2024

Compared with typed markup (e.g., LaTeX), handwritten mathematical expressions offer greater ease of use for humans but pose a greater challenge for machine recognition due to variations in individual writing styles and habits. Handwritten Mathematical Expression Recognition (HMER), which involves converting handwritten math into markup language for easier computer processing and rendering, is a challenging but promising field with a variety of potential applications. Compared to Optical Character Recognition (OCR), recognizing handwritten manuscripts is more challenging due to the wide variation in handwriting styles. HMER not only faces the common challenges of handwriting recognition but also has to deal with the added complexity of interpreting the 2D structure of mathematical expressions. Depending on the type of input, HMER can be categorized into online HMER and offline HMER. Online HMER processes a sequence of temporal trajectories captured by digital devices such as tablets and digital pens. Online data is segmented into individual strokes based on pen-down and pen-up events. Offline expressions, by contrast, are static images collected by a scanner, camera, or smartphone.
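The stroke segmentation mentioned here for online data, where a pen trajectory is cut into strokes at pen-up events, can be illustrated with a short sketch. The sample format `(x, y, pen_down)` and the function name `split_strokes` are assumptions made for this example; real devices and datasets use their own ink formats.

```python
from typing import List, Tuple

Sample = Tuple[float, float, bool]  # (x, y, pen_down); hypothetical format
Stroke = List[Tuple[float, float]]


def split_strokes(samples: List[Sample]) -> List[Stroke]:
    """Group consecutive pen-down samples into strokes, cutting at pen-up samples."""
    strokes: List[Stroke] = []
    current: Stroke = []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen lifted: close the stroke that was being drawn.
            strokes.append(current)
            current = []
    if current:
        # The trajectory ended while the pen was still down.
        strokes.append(current)
    return strokes


trajectory = [
    (0.0, 0.0, True), (1.0, 1.0, True),  # first stroke
    (1.0, 1.0, False),                   # pen-up
    (2.0, 0.0, True), (3.0, 0.0, True),  # second stroke
]
strokes = split_strokes(trajectory)
print(len(strokes), strokes)
```

An offline recognizer has no such event stream and would have to recover symbol boundaries from pixels alone.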

machine learning, pattern recognition, recognition, (15 more...)

arXiv: 2410.18555

Country:

Asia > South Korea (0.14)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.88)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.63)

Enhancing Complex Formula Recognition with Hierarchical Detail-Focused Network

Wang, Jiale, Yu, Junhui, Liu, Huanyong, Kong, Chenanran

arXiv.org Artificial Intelligence, Sep-17-2024

Hierarchical and complex Mathematical Expression Recognition (MER) is challenging due to multiple possible interpretations of a formula, complicating both parsing and evaluation. In this paper, we introduce the Hierarchical Detail-Focused Recognition dataset (HDR), the first dataset specifically designed to address these issues. It consists of a large-scale training set, HDR-100M, which offers unprecedented scale and diversity with one hundred million training instances, and a test set, HDR-Test, which includes multiple interpretations of complex hierarchical formulas for comprehensive model performance evaluation. Additionally, the parsing of complex formulas often suffers from errors in fine-grained details. To address this, we propose the Hierarchical Detail-Focused Recognition Network (HDNet), an innovative framework that incorporates a hierarchical sub-formula module focused on the precise handling of formula details, thereby significantly enhancing MER performance. Experimental results demonstrate that HDNet outperforms existing MER models across various datasets.

artificial intelligence, formula, machine learning, (15 more...)

arXiv: 2409.11677

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > District of Columbia > Washington (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Industry: Media (0.60)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition

Liu, Chenyu, Pan, Jia, Hu, Jinshui, Yin, Baocai, Yin, Bing, Chen, Mingjun, Liu, Cong, Du, Jun, Liu, Qingfeng

arXiv.org Artificial Intelligence, Jul-16-2024

Recently, Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding. Current methods typically approach HMER as an image-to-sequence generation task within an autoregressive (AR) encoder-decoder framework. However, these approaches suffer from several drawbacks: 1) a lack of overall language context, limiting information utilization beyond the current decoding step; 2) error accumulation during AR decoding; and 3) slow decoding speed. To tackle these problems, this paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER. NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD). Initially, the VAT tokenizes visible symbols and local relations at a coarse level. Subsequently, the PGD refines all tokens and establishes connectivities in parallel, leveraging comprehensive visual and linguistic contexts. Experiments on the CROHME 2014/2016/2019 and HME100K datasets demonstrate that NAMER not only outperforms the current state-of-the-art (SOTA) methods on ExpRate by 1.93%/2.35%/1.49%/0.62%, but also achieves significant speedups of 13.7x in decoding time and 6.7x in overall FPS, proving the effectiveness and efficiency of NAMER.
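ExpRate, the metric quoted in this abstract, is the percentage of expressions recognized entirely correctly. A minimal sketch of computing it over LaTeX strings is given below; comparing whitespace-normalized token sequences is an assumption made here for simplicity, and official benchmark tooling may compare expressions differently. The sample predictions are invented.

```python
from typing import Sequence


def exp_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of predictions whose token sequence matches the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Splitting on whitespace ignores differences in spacing between tokens.
    matches = sum(p.split() == r.split() for p, r in zip(predictions, references))
    return 100.0 * matches / len(references)


# Hypothetical model outputs and ground-truth LaTeX strings.
preds = ["x ^ 2", "a + b", r"\frac { 1 } { 2 }", "y"]
refs = ["x ^ 2", "a - b", r"\frac  { 1 } { 2 }", "y"]
rate = exp_rate(preds, refs)
print(rate)
```

Because a single wrong token makes the whole expression count as incorrect, ExpRate is a strict metric, and improvements of one or two points on it can correspond to many corrected symbols.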

mathematical expression recognition, namer, recognition, (13 more...)

arXiv: 2407.1138

Country:

Asia > China > Anhui Province > Hefei (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)
