Materials
Overfitting Can Be Harmless for Basis Pursuit: Only to a Degree
Ju, Peizhong, Lin, Xiaojun, Liu, Jia
Recently, there has been significant interest in studying the generalization power of linear regression models in the overparameterized regime, with the hope that such analysis may provide the first step towards understanding why overparameterized deep neural networks generalize well even when they overfit the training data. Studies on min $\ell_2$-norm solutions that overfit the training data have suggested that such solutions exhibit the "double-descent" behavior, i.e., the test error decreases with the number of features $p$ in the overparameterized regime when $p$ is larger than the number of samples $n$. However, for linear models with i.i.d. Gaussian features, for large $p$ the model errors of such min $\ell_2$-norm solutions approach the "null risk," i.e., the error of a trivial estimator that always outputs zero, even when the noise is very low. In contrast, we study the overfitting solution of min $\ell_1$-norm, which is known as Basis Pursuit (BP) in the compressed sensing literature. Under a sparse true linear model with i.i.d. Gaussian features, we show that for a large range of $p$ up to a limit that grows exponentially with $n$, with high probability the model error of BP is upper bounded by a value that decreases with $p$ and is proportional to the noise level. To the best of our knowledge, this is the first result in the literature showing that, without any explicit regularization in such settings where both $p$ and the dimension of data are much larger than $n$, the test errors of a practical-to-compute overfitting solution can exhibit double-descent and approach the order of the noise level independently of the null risk. Our upper bound also reveals a descent floor for BP that is proportional to the noise level. Further, this descent floor is independent of $n$ and the null risk, but increases with the sparsity level of the true model.
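For reference, the two overfitting solutions compared in this abstract can be written as the optimization problems below. The notation is an assumption and does not appear in the abstract: $X \in \mathbb{R}^{n \times p}$ stands for the training feature matrix and $y \in \mathbb{R}^n$ for the training outputs, with $p > n$.

```latex
\begin{align*}
\hat{w}_{\ell_2} &= \arg\min_{w \in \mathbb{R}^p} \|w\|_2 \quad \text{subject to } Xw = y, \\
\hat{w}_{\mathrm{BP}} &= \arg\min_{w \in \mathbb{R}^p} \|w\|_1 \quad \text{subject to } Xw = y.
\end{align*}
```

Both estimators fit the training data exactly and differ only in the norm being minimized. When $X$ has full row rank, the first has the closed form $X^\top (X X^\top)^{-1} y$, while the second can be solved as a linear program.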
Scalable bundling via dense product embeddings
Kumar, Madhav, Eckles, Dean, Aral, Sinan
Bundling, the practice of jointly selling two or more products at a discount, is a widely used strategy in industry and a well examined concept in academia. Historically, the focus has been on theoretical studies in the context of monopolistic firms and assumed product relationships, e.g., complementarity in usage. We develop a new machine-learning-driven methodology for designing bundles in a large-scale, cross-category retail setting. We leverage historical purchases and consideration sets created from clickstream data to generate dense continuous representations of products called embeddings. We then put minimal structure on these embeddings and develop heuristics for complementarity and substitutability among products. Subsequently, we use the heuristics to create multiple bundles for each product and test their performance using a field experiment with a large retailer. We combine the results from the experiment with product embeddings using a hierarchical model that maps bundle features to their purchase likelihood, as measured by the add-to-cart rate. We find that our embeddings-based heuristics are strong predictors of bundle success, robust across product categories, and generalize well to the retailer's entire assortment.
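The abstract does not spell out the heuristics, so the following is only a minimal sketch of one common way to score product pairs from embeddings: cosine similarity, with the most similar products proposed as bundle candidates. The product names, the 3-dimensional vectors, and the helper names `cosine_similarity` and `rank_partners` are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_partners(
    focal: str, embeddings: Dict[str, List[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Rank every other product by similarity to the focal product."""
    scores = [
        (name, cosine_similarity(embeddings[focal], vec))
        for name, vec in embeddings.items()
        if name != focal
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy embeddings; real ones would be learned from purchase
# or clickstream data and have many more dimensions.
toy_embeddings = {
    "pasta": [0.9, 0.1, 0.0],
    "pasta_sauce": [0.8, 0.3, 0.1],
    "shampoo": [0.0, 0.2, 0.9],
    "conditioner": [0.1, 0.1, 0.95],
}

partners = rank_partners("pasta", toy_embeddings)
```

Whether a high score signals complementarity or substitutability depends on the data the embeddings were trained on, which is why a score like this is only a starting point for a heuristic.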
Machine Learning in Thermodynamics: Prediction of Activity Coefficients by Matrix Completion
Jirasek, Fabian, Alves, Rodrigo A. S., Damay, Julie, Vandermeulen, Robert A., Bamler, Robert, Bortz, Michael, Mandt, Stephan, Kloft, Marius, Hasse, Hans
Activity coefficients, which are a measure of the non-ideality of liquid mixtures, are a key property in chemical engineering with relevance to modeling chemical and phase equilibria as well as transport processes. Although experimental data on thousands of binary mixtures are available, prediction methods are needed to calculate the activity coefficients in many relevant mixtures that have not been explored to date. In this report, we propose a probabilistic matrix factorization model for predicting the activity coefficients in arbitrary binary mixtures. Although no physical descriptors for the considered components were used, our method outperforms the state-of-the-art method that has been refined over three decades while requiring much less training effort. This opens perspectives to novel methods for predicting physico-chemical properties of binary mixtures with the potential to revolutionize modeling and simulation in chemical engineering.
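To illustrate the matrix completion idea, here is a minimal sketch that fills one missing entry of a small matrix by factorizing the observed entries. It uses plain stochastic gradient descent on squared error, which is a simpler stand-in for the probabilistic (Bayesian) model the abstract describes. The toy matrix, the function names, and all hyperparameters are assumptions chosen for illustration.

```python
import random
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def factorize(
    observed: Dict[Tuple[int, int], float],
    n_rows: int,
    n_cols: int,
    rank: int = 1,
    lr: float = 0.01,
    epochs: int = 2000,
    seed: int = 0,
) -> Tuple[Matrix, Matrix]:
    """Fit row factors U and column factors V so that U[i] . V[j]
    approximates every observed entry (i, j)."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.5, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(0.5, 1.0) for _ in range(rank)] for _ in range(n_cols)]
    cells = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(cells)
        for (i, j), value in cells:
            err = value - sum(u * v for u, v in zip(U[i], V[j]))
            for k in range(rank):
                u_ik, v_jk = U[i][k], V[j][k]
                # Gradient step on 0.5 * err**2 for both factors.
                U[i][k] += lr * err * v_jk
                V[j][k] += lr * err * u_ik
    return U, V


def predict(U: Matrix, V: Matrix, i: int, j: int) -> float:
    """Predicted value for cell (i, j), observed or not."""
    return sum(u * v for u, v in zip(U[i], V[j]))


# Toy rank-1 "mixture" matrix: rows and columns stand for the two components.
row_scale = [1.0, 2.0, 3.0]
col_scale = [1.0, 0.5, 2.0]
observed = {
    (i, j): row_scale[i] * col_scale[j]
    for i in range(3)
    for j in range(3)
    if (i, j) != (2, 2)  # hold out one entry as the "unmeasured" mixture
}

U, V = factorize(observed, n_rows=3, n_cols=3)
estimate = predict(U, V, 2, 2)  # the held-out true value in this toy is 6.0
```

No descriptor of the components enters the model: each row and column gets its factors purely from the observed entries, which mirrors the abstract's point that no physical descriptors were used.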
Robotics in architecture and construction: An industry shift
Robotics in architecture and construction is transforming the way architects approach their designs. This technology isn't just a flash in the pan--it will soon become a fundamental part of the architectural process. Just as the invention of ultra-strong Portland cement and innovative Building Information Modeling (BIM) software dramatically improved the way we design and construct buildings, robotics will have an equally integral role in our industry. Architects who embrace this intriguing and dynamic technology now will be better equipped to design the most efficient buildings of the future. Robotics are already being used in virtually every step of the building design process, from initial site analysis to construction.
Beyond IT-OT integration - Transforming to Cloud & Edge Computing to enable Industry 4.0, AI & ML Capabilities
IT-OT integration is at the core of Industry 4.0, as many use cases require combining and reasoning with data from both OT and IT systems, using data science models, advanced analytics, machine learning and AI to enable insights-based cognitive and digital ways of working. As part of the digital transformation, a few leading companies in industrial products, oil and gas, and downstream and chemicals manufacturing have already embarked on this journey by initiating data engineering and data integration efforts, developing or implementing data information management systems, and building massive plant and enterprise data lakes. These will facilitate the implementation of advanced analytics and AI use case pilots / MVPs for integrated and collaborative operations, and their scale-up to production to realize the proposed business benefits. At the same time, many enterprise and industrial systems have been or are being migrated to public and private clouds / datacenters because of the cost, efficiency and strategic advantages. Given this context, the leading companies need to start thinking in terms of "Cloud" and "Edge" computing capabilities with the objective to "centralize where you can in public & private clouds, distribute when you have to the edge".
Freeport to invest in data science, AI programs at North/South America mines - International Mining
After carrying out a successful pilot at its Bagdad copper operation, Freeport McMoRan says it is rolling out a program across its North America and South America mines involving the use of data science, machine learning and integrated functional teams. The program, aimed at addressing bottlenecks, providing cost benefits and driving improved overall performance, was announced in its December quarter results this week. It said: "During 2019, FCX (Freeport) advanced initiatives in its North America and South America mining operations to enhance productivity, expand margins and reduce the capital intensity of the business through the utilisation of new technology applications in combination with a more interactive operating structure." It said the Bagdad mine (Arizona, USA) pilot program, initiated in late 2018, was "highly successful" in utilising these innovative technologies and it would build on this for the implementation across its other mines in North and South America. According to a report in the Financial Times, the system at Bagdad found that the mine was producing seven distinct types of ore and that the processing method, which involves flotation, could recover more copper if the pH level were adjusted.
Detection of Surface Cracks in Concrete Structures using Deep Learning
We use Adam as the optimizer and train the model for 6 epochs. Using transfer learning, we train the model on the training data set while measuring loss and accuracy on the validation set. The loss and accuracy numbers show that the model trains very quickly: after the 1st epoch, train accuracy is 87% and validation accuracy is 97%! This is the power of transfer learning. Our final model has a validation accuracy of 98.4%.
Semi-Autoregressive Training Improves Mask-Predict Decoding
Ghazvininejad, Marjan, Levy, Omer, Zettlemoyer, Luke
The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach. We introduce a new training method for conditional masked language models, SMART, which mimics the semi-autoregressive behavior of mask-predict, producing training examples that contain model predictions as part of their inputs. Models trained with SMART produce higher-quality translations when using mask-predict decoding, effectively closing the remaining performance gap with fully autoregressive models.
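As background for the decoding procedure this abstract builds on, the following is a minimal sketch of a mask-predict style loop: start from a fully masked output, predict every masked position, then repeatedly re-mask the least confident tokens and predict them again, with the number of re-masked tokens shrinking linearly. The `predict` callable is a hypothetical stand-in for a conditional masked language model, `toy_model` is a stub rather than a trained model, and the target length is passed in by hand.

```python
from typing import Callable, List, Tuple

MASK = "<mask>"
Prediction = Tuple[str, float]  # (token, model confidence)


def mask_predict(
    predict: Callable[[List[str]], List[Prediction]],
    length: int,
    iterations: int,
) -> List[str]:
    """Iteratively refine a fully masked sequence."""
    tokens = [MASK] * length
    confidences = [0.0] * length
    for step in range(1, iterations + 1):
        # Predict: fill every currently masked position in parallel.
        for i, (token, prob) in enumerate(predict(tokens)):
            if tokens[i] == MASK:
                tokens[i] = token
                confidences[i] = prob
        # Mask: re-mask the least confident tokens; the count decays
        # linearly and reaches zero on the final iteration.
        n_mask = length * (iterations - step) // iterations
        if n_mask == 0:
            break
        lowest = sorted(range(length), key=lambda i: confidences[i])[:n_mask]
        for i in lowest:
            tokens[i] = MASK
    return tokens


TARGET = ["the", "cat", "sat", "down"]


def toy_model(tokens: List[str]) -> List[Prediction]:
    """Stub model: always proposes the target token and grows more
    confident as more of the sequence is already visible."""
    visible = sum(tok != MASK for tok in tokens)
    return [(TARGET[i], 0.5 + 0.1 * visible + 0.01 * i) for i in range(len(tokens))]


decoded = mask_predict(toy_model, length=4, iterations=3)
```

From the second iteration on, the model's inputs contain its own earlier predictions, which is the behavior the abstract says SMART reproduces during training.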
Intelligent Road Inspection with Advanced Machine Learning; Hybrid Prediction Models for Smart Mobility and Transportation Maintenance Systems
Karballaeezadeh, Nader, Zaremotekhases, Farah, Shamshirband, Shahaboddin, Mosavi, Amir, Nabipour, Narjes, Csiba, Peter, Varkonyi-Koczy, Annamaria R.
School of the Built Environment, Oxford Brookes University, Oxford OX3 0BP, UK; a.mosavi@brookes.ac.uk Abstract: Prediction models in mobility and transportation maintenance systems have been dramatically improved through using machine learning methods. The traditional road inspection systems based on the pavement condition index (PCI) are often associated with critical safety, energy and cost issues. Alternatively, the proposed models utilize surface deflection data from falling weight deflectometer (FWD) tests to predict the PCI. Machine learning methods are the single multi-layer perceptron (MLP) and radial basis function (RBF) neural networks as well as their hybrids, i.e., Levenberg-Marquardt (MLP-LM), scaled conjugate gradient (MLP-SCG), imperialist competitive (RBF-ICA), and genetic algorithms (RBF-GA). Furthermore, the committee machine intelligent systems (CMIS) method was adopted to combine the results and improve the accuracy of the modeling. The results of the analysis have been verified through using four criteria of average percent relative error (APRE), average absolute percent relative error (AAPRE), root mean square error (RMSE), and standard error (SD). The CMIS model outperforms other models with the promising results of APRE 2.3303, AAPRE 11.6768, RMSE 12.0056, and SD 0.0210. Introduction In road transportation, pavement plays a vital role as the part of the road that is in direct contact with vehicles. Users' judgment about the quality of road service is primarily predicated upon pavement conditions. The Maintenance, Rehabilitation, and Reconstruction (MR&R) program of pavement network is a multidimensional decision-making process that takes into account several considerations.
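The four criteria named in the abstract above can be sketched as follows. The formulas are the definitions commonly used for these acronyms and are an assumption here, since the excerpt does not state them; in particular, the sign convention for APRE (observed minus predicted) and the $n - 1$ divisor for SD may differ from the paper. The sample PCI values are made up.

```python
import math
from typing import Sequence


def apre(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average percent relative error (signed, so errors can cancel)."""
    n = len(observed)
    return 100.0 / n * sum((o - p) / o for o, p in zip(observed, predicted))


def aapre(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute percent relative error."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the units of the target."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def sd(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Standard error of the relative deviations (assumed n - 1 divisor)."""
    n = len(observed)
    return math.sqrt(
        sum(((o - p) / o) ** 2 for o, p in zip(observed, predicted)) / (n - 1)
    )


# Made-up PCI values, for illustration only.
observed_pci = [50.0, 80.0]
predicted_pci = [45.0, 88.0]

metrics = {
    "APRE": apre(observed_pci, predicted_pci),
    "AAPRE": aapre(observed_pci, predicted_pci),
    "RMSE": rmse(observed_pci, predicted_pci),
    "SD": sd(observed_pci, predicted_pci),
}
```

In this toy case the two relative errors are +10% and -10%, so APRE cancels to zero while AAPRE is 10, which shows why the signed and absolute criteria are reported together.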
Google Nest Mini review: better bass and recycled plastic
The second generation of Google's smallest smart speaker gets a new name and is more eco-friendly and a little smarter, with more bass. The £49 Nest Mini replaces the Google Home Mini as part of a revamped and renamed line of Google smart home products under the Nest brand, pushing its predecessor to a clearance price of only £19. From the outside you would be hard pushed to see what has changed. The Nest Mini sticks with the same pincushion design with a fabric top and nonslip rubber pad on the bottom. The top contains three far-field microphones and is touch sensitive.