Maya: Optimizing Deep Learning Training Workloads using Emulated Virtual Accelerators

arXiv.org Artificial Intelligence

Training large foundation models costs hundreds of millions of dollars, making deployment optimization critical. Current approaches require machine learning engineers to manually craft training recipes through error-prone trial-and-error on expensive compute clusters. To enable efficient exploration of training configurations, researchers have developed performance modeling systems. However, these systems force users to translate their workloads into custom specification languages, introducing a fundamental semantic gap between the actual workload and its representation. This gap creates an inherent tradeoff: systems must either support a narrow set of workloads to maintain usability, require complex specifications that limit practical adoption, or compromise prediction accuracy with simplified models. We present Maya, a performance modeling system that eliminates these tradeoffs through transparent device emulation. By operating at the narrow interface between training frameworks and accelerator devices, Maya can capture complete workload behavior without requiring code modifications or translations. Maya intercepts device API calls from unmodified training code to directly observe low-level operations, enabling accurate performance prediction while maintaining both ease of use and generality. Our evaluation shows Maya achieves less than 5% prediction error across diverse models and optimization strategies, identifying configurations that reduce training costs by up to 56% compared to existing approaches.


Modeling Systems with Machine Learning based Differential Equations

arXiv.org Artificial Intelligence

The prediction of behavior in dynamical systems frequently depends on the design of models. When a time series obtained from observing the system is available, the model can be designed either from these observations alone, without additional assumptions, or by assuming a preconceived structure for the model with the help of additional information about the system. In the second case, the task is to combine theory with observations adequately and then optimize the mixture. In this work, we propose the design of time-continuous models of dynamical systems as solutions of differential equations, from non-uniformly sampled or noisy observations, using machine learning techniques. The performance of the strategy is shown with both several simulated data sets and experimental data from the Hare-Lynx population and the Coronavirus 2019 outbreak. Our results suggest that this approach to modeling systems can be a useful technique for both synthetic and experimental data.


On the Efficiency of the Neuro-Fuzzy Classifier for User Knowledge Modeling Systems

arXiv.org Artificial Intelligence

User knowledge modeling systems are used as the most effective technology for attracting new users' attention. Moreover, these intelligent services increase the quality of service (QoS). This paper proposes two user knowledge classifiers based on artificial neural networks, which serve as one of the influential parts of knowledge modeling systems. We employed a multi-layer perceptron (MLP) and an adaptive neural fuzzy inference system (ANFIS) as the classifiers. As classifier inputs, we used real data containing the users' degree of study time, repetition number, exam performance, and learning percentage. Compared with well-known methods such as KNN and Bayesian classifiers used in other research on the same data sets, our experiments show better performance. Although the number of samples in the training set is not large, the performance of the neuro-fuzzy classifier on the test set is 98.6%, the best result among the methods compared. The MLP performs worse than ANFIS, although it still outperforms other methods such as Bayesian and KNN classifiers. As our goal is to evaluate and report the efficiency of a neuro-fuzzy classifier for user knowledge modeling systems, we used several evaluation metrics, including the Receiver Operating Characteristic and the area under its curve, total accuracy, and the Kappa statistic.
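Two of the metrics named in this abstract, total accuracy and the Kappa statistic, are easy to illustrate. The abstract does not define them, so the minimal sketch below assumes the standard definitions: total accuracy is the fraction of correct predictions, and Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the label frequencies. The function names and the knowledge-level labels are hypothetical and are not taken from the paper or its data set.

```python
from collections import Counter


def total_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    p_o = total_accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both label a sample as class c,
    # summed over classes, assuming independent marginals.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when only one class occurs
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical knowledge-level labels, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "low", "mid", "high", "high", "high"]

print(round(total_accuracy(y_true, y_pred), 3))
print(round(cohen_kappa(y_true, y_pred), 3))
```

In this toy example five of six predictions are correct, so kappa comes out lower than the raw accuracy because part of that agreement would be expected by chance.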


Can Machine Learning Help Lift China's Smog?

#artificialintelligence

From the street, through Beijing's heavy smog, it can sometimes be hard to make out IBM's Chinese headquarters: a towering office building with a distinctive undulating architectural flourish and a large company logo at the top. But just a short distance away, on the northeast outskirts of the capital, IBM computer scientists are using artificial intelligence to develop what they think will be a way to manage China's notorious and chronic pollution problem more successfully. The team is using complex computer models and machine learning to calculate how pollution will spread across the city. The researchers can now produce pollution forecasts, with a resolution of one square kilometer, up to 10 days in advance. These predictions can also tell the government how it might act to avoid the worst scenarios, for instance by shutting certain factories or by reducing the number of cars on the road.