flint
Predicting Training Time Without Training: Supplementary Material
In both cases we observe that the predicted curve is reasonably close to the actual curve, more so at the beginning of training (which is expected, since the linear approximation is more likely to hold there). Point-wise similarity of predicted and observed loss curves. Up to now we focused on prediction error rates. We started by defining training time as the first time the (smoothed) loss falls below a given threshold (which we then normalized w.r.t. the asymptotic loss value). In Section 4 we suggest that, in the case of MSE loss, it is possible to predict the training time on a large dataset using a subset of the samples. However, since our training time definition measures the time to reach the asymptotic value (which is what is most useful in practice) rather than the time to reach an absolute threshold, this does not affect the accuracy of the prediction (see Appendix C).
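The training-time measurement described above — the first step at which the smoothed loss comes within a margin of its asymptotic value — can be sketched as follows. The moving-average window, the relative-threshold form, and the function name are illustrative choices, not the paper's exact procedure.

```c
#include <math.h>

/* First index at which the moving-average-smoothed loss falls within
   rel_threshold of its asymptotic (final smoothed) value; -1 if never.
   Window size, threshold form, and names are illustrative, not the
   paper's exact definition. */
int training_time(const double *loss, int n, int window, double rel_threshold)
{
    int last = n - window;                 /* last valid smoothing position */
    double asym = 0.0;
    for (int j = 0; j < window; j++)
        asym += loss[last + j];
    asym /= window;                        /* asymptote: final smoothed value */

    double target = rel_threshold * asym;  /* threshold relative to asymptote */
    for (int i = 0; i <= last; i++) {
        double s = 0.0;
        for (int j = 0; j < window; j++)
            s += loss[i + j];
        s /= window;                       /* smoothed loss at step i */
        if (s <= target)
            return i;                      /* first crossing of the threshold */
    }
    return -1;
}
```

Measuring the time to reach the asymptote (rather than an absolute loss level) is what makes the definition robust when the asymptotic loss itself shifts, e.g. when training on a subset of the samples.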
- Europe > France (0.05)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Machine Learning for Scientific Visualization: Ensemble Data Analysis
Scientific simulations and experimental measurements produce vast amounts of spatio-temporal data, yet extracting meaningful insights remains challenging due to high dimensionality, complex structures, and missing information. Traditional analysis methods often struggle with these issues, motivating the need for more robust, data-driven approaches. This dissertation explores deep learning methodologies to improve the analysis and visualization of spatio-temporal scientific ensembles, focusing on dimensionality reduction, flow estimation, and temporal interpolation. First, we address high-dimensional data representation through autoencoder-based dimensionality reduction for scientific ensembles. We evaluate the stability of projection metrics under partial labeling and introduce a Pareto-efficient selection strategy to identify optimal autoencoder variants, ensuring expressive and reliable low-dimensional embeddings. Next, we present FLINT, a deep learning model for high-quality flow estimation and temporal interpolation in both flow-supervised and flow-unsupervised settings. FLINT reconstructs missing velocity fields and generates high-fidelity temporal interpolants for scalar fields across 2D+time and 3D+time ensembles without domain-specific assumptions or extensive finetuning. To further improve adaptability and generalization, we introduce HyperFLINT, a hypernetwork-based approach that conditions on simulation parameters to estimate flow fields and interpolate scalar data. This parameter-aware adaptation yields more accurate reconstructions across diverse scientific domains, even with sparse or incomplete data. Overall, this dissertation advances deep learning techniques for scientific visualization, providing scalable, adaptable, and high-quality solutions for interpreting complex spatio-temporal ensembles.
- Asia (0.28)
- North America > United States > California (0.27)
A Framework to Learn with Interpretation
This is achieved by a dedicated architecture and well-chosen regularization penalties. We seek a small dictionary of high-level attribute functions that take as inputs the outputs of selected hidden layers and whose outputs feed a linear classifier. We impose strong conciseness on the activation of attributes with an entropy-based criterion while enforcing fidelity to both inputs and outputs of the predictive model. A detailed pipeline to visualize the learnt features is also developed.
Flint water crisis led to spike in children with special needs and drop in school grades a decade later, according to research that likens fallout from disaster to Chernobyl
The Flint water crisis has resulted in all-time-high numbers of children with special needs and poor performance in school. More than 12,000 children were exposed to toxic levels of lead in 2014, when the city switched its public water source to the Flint River, whose water is considerably more acidic. This corroded the city's lead pipes, leaching lead into the tap water and the drinking supply. Lead exposure has been linked to behavioral and cognitive problems, mental illness, and underdeveloped brains. Now, researchers from Michigan and New Jersey have reported that the rate of young children diagnosed with special needs increased by eight percent after 2014, while performance in math class dropped.
- Europe > Ukraine > Kyiv Oblast > Chernobyl (0.41)
- North America > United States > Michigan (0.30)
- North America > United States > New Jersey (0.25)
- Education (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.92)
- Water & Waste Management > Water Management > Water Supplies & Services (0.70)
FLINT: A Platform for Federated Learning Integration
Wang, Ewen, Kannan, Ajay, Liang, Yuefeng, Chen, Boyi, Chowdhury, Mosharaf
Cross-device federated learning (FL) has been well-studied from algorithmic, system scalability, and training speed perspectives. Nonetheless, moving from centralized training to cross-device FL for millions or billions of devices presents many risks, including performance loss, developer inertia, poor user experience, and unexpected application failures. In addition, the corresponding infrastructure, development costs, and return on investment are difficult to estimate. In this paper, we present a device-cloud collaborative FL platform that integrates with an existing machine learning platform, providing tools to measure real-world constraints, assess infrastructure capabilities, evaluate model training performance, and estimate system resource requirements to responsibly bring FL into production. We also present a decision workflow that leverages the FL-integrated platform to comprehensively evaluate the trade-offs of cross-device FL and share our empirical evaluations of business-critical machine learning applications that impact hundreds of millions of users.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Michigan (0.04)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- Law > Statutes (0.67)
FLInt: Exploiting Floating Point Enabled Integer Arithmetic for Efficient Random Forest Inference
Hakert, Christian, Chen, Kuan-Hsun, Chen, Jian-Jia
In many machine learning applications, e.g., tree-based ensembles, floating point numbers are extensively utilized due to their expressiveness. Nowadays, performing data analysis on dynamic data masses directly on embedded devices is becoming feasible, but such systems often lack hardware capabilities to process floating point numbers, introducing large overheads for their processing. Even if such hardware is present in general computing systems, using integer operations instead of floating point operations promises to reduce operation overheads and improve performance. In this paper, we provide FLInt, a full-precision floating point comparison for random forests, using only integer and logic operations. To ensure that the same functionality is preserved, we formally prove the correctness of this comparison. Since random forests only require comparisons of floating point numbers during inference, we implement FLInt in low-level realizations and thereby eliminate the need for floating point hardware entirely, while keeping the model accuracy unchanged. The usage of FLInt basically boils down to a one-by-one replacement of conditions: for instance, the comparison statement in C if(pX[3]<=(float)10.074347) becomes if((*(((int*)(pX))+3))<=((int)(0x41213087))). Experimental evaluation on X86 and ARMv8 desktop and server class systems shows that the execution time can be reduced by up to $\approx 30\%$ with our novel approach.
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Guo, Cong, Zhang, Chen, Leng, Jingwen, Liu, Zihan, Yang, Fan, Liu, Yunxin, Guo, Minyi, Zhu, Yuhao
Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use fixed-point integer or floating-point types, which have limited benefits, as both require more bits to maintain the accuracy of original models. On the other hand, variable-length quantization uses low-bit quantization for normal values and high-precision for a fraction of outlier values. Even though this line of work brings algorithmic benefits, it also introduces significant hardware overheads due to variable-length encoding and decoding. In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads. Our data type ANT leverages two key innovations to exploit the intra-tensor and inter-tensor adaptive opportunities in DNN models. First, we propose a particular data type, flint, that combines the advantages of float and int for adapting to the importance of different values within a tensor. Second, we propose an adaptive framework that selects the best type for each tensor according to its distribution characteristics. We design a unified processing element architecture for ANT and show its ease of integration with existing DNN accelerators. Our design results in 2.8$\times$ speedup and 2.5$\times$ energy efficiency improvement over the state-of-the-art quantization accelerators.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (4 more...)