Ensemble Learning
Machine Learning-Based Manufacturing Cost Prediction from 2D Engineering Drawings via Geometric Features
Arıkan, Ahmet Bilal, Özönder, Şener, Koçyiğit, Mustafa Taha, Altun, Hüseyin Oktay, Küçükkartal, H. Kübra, Arslanoğlu, Murat, Çağırankaya, Fatih, Ayvaz, Berk
We present an integrated machine learning framework that transforms how manufacturing cost is estimated from 2D engineering drawings. Unlike traditional quotation workflows that require labor-intensive process planning, our approach extracts about 200 geometric and statistical descriptors directly from 13,684 DWG drawings of automotive suspension and steering parts spanning 24 product groups. Gradient-boosted decision tree models (XGBoost, CatBoost, LightGBM) trained on these features achieve a mean absolute percentage error of nearly 10% across groups, demonstrating robust scalability beyond part-specific heuristics. By coupling cost prediction with explainability tools such as SHAP, the framework identifies geometric design drivers, including rotated dimension maxima, arc statistics, and divergence metrics, offering actionable insights for cost-aware design. This end-to-end CAD-to-cost pipeline shortens quotation lead times, ensures consistent and transparent cost assessments across part families, and provides a deployable pathway toward real-time, ERP-integrated decision support in Industry 4.0 manufacturing environments.
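The headline metric above, mean absolute percentage error (MAPE) reported across product groups, can be illustrated with a minimal sketch. The function name `mape_by_group`, the group labels, and the cost values are hypothetical and chosen only for illustration; the abstract does not say how the authors aggregate errors across groups, so this shows just one plausible per-group computation.

```python
from collections import defaultdict


def mape_by_group(y_true, y_pred, groups):
    """Return {group: MAPE in percent} for paired actual and predicted costs."""
    errors = defaultdict(list)
    for actual, predicted, group in zip(y_true, y_pred, groups):
        if actual == 0:
            # The percentage error divides by the actual value.
            raise ValueError("MAPE is undefined when an actual cost is zero")
        errors[group].append(abs(actual - predicted) / abs(actual))
    # Average the relative errors within each group and express as a percentage.
    return {g: 100.0 * sum(e) / len(e) for g, e in errors.items()}


# Hypothetical costs for two made-up product groups (not data from the paper).
actual = [100.0, 200.0, 50.0, 80.0]
predicted = [110.0, 180.0, 45.0, 88.0]
groups = ["tie_rod", "tie_rod", "ball_joint", "ball_joint"]

scores = mape_by_group(actual, predicted, groups)
for group, score in scores.items():
    print(f"{group}: {score:.1f}% MAPE")
```

Reporting the metric per group rather than as a single pooled number makes it easier to see whether one part family dominates the error, which is relevant when a single model serves many product groups.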
xRFM: Accurate, scalable, and interpretable feature learning models for tabular data
Beaglehole, Daniel, Holzmüller, David, Radhakrishnan, Adityanarayanan, Belkin, Mikhail
Tabular data (collections of continuous and categorical variables organized into matrices) underlies all aspects of modern commerce and science, from airplane engines to biology labs to bagel shops. Yet, while Machine Learning and AI for language and vision have seen unprecedented progress, the primary methodologies of prediction from tabular data have been relatively static, dominated by variations of Gradient Boosted Decision Trees (GBDTs), such as XGBoost [7]. Nevertheless, hundreds of tabular datasets have been assembled to form extensive regression and classification benchmarks [11, 12, 16, 35, 37], and, recently, there has been renewed interest in building state-of-the-art predictive models for tabular data [15, 18, 19]. Notably, given the remarkable effectiveness of large "foundation" models for text, there has been much excitement about developing similar models for tabular data, and recent effort has led to TabPFN-v2, a foundation model for tabular data published in Nature [18]. Despite this progress, tabular data remains an active area for model development, and building scalable, effective, and interpretable machine learning models in this domain remains an open challenge. In this work, we introduce xRFM, a tabular predictive model that combines recent advances in feature learning kernel machines with an adaptive tree structure, making it effective, scalable, and interpretable.