residual plot
Predicting the Impact of Scope Changes on Project Cost and Schedule Using Machine Learning Techniques
In the dynamic landscape of project management, scope changes are an inevitable reality that can significantly impact project performance. These changes, whether initiated by stakeholders, external factors, or internal project dynamics, can lead to cost overruns and schedule delays. Accurately predicting the consequences of these changes is crucial for effective project control and informed decision-making. This study aims to develop predictive models to estimate the impact of scope changes on project cost and schedule using machine learning techniques. The research utilizes a comprehensive dataset containing detailed information on project tasks, including the Work Breakdown Structure (WBS), task type, productivity rate, estimated cost, actual cost, duration, task dependencies, scope change magnitude, and scope change timing. Multiple machine learning models are developed and evaluated for this purpose: Linear Regression, Decision Tree, Ridge Regression, Random Forest, Gradient Boosting, and XGBoost. The dataset is split into training and testing sets, and the models are trained on the preprocessed data. Model robustness and generalization are assessed using cross-validation. Model performance is evaluated with Mean Squared Error (MSE) and R². Residual plots are generated to assess goodness of fit and to identify patterns or outliers. Hyperparameter tuning is performed to optimize the XGBoost model and improve its predictive accuracy. The study identifies the project attributes that most influence the magnitude of cost and schedule deviations caused by scope modifications. Productivity rate, scope change magnitude, task dependencies, estimated cost, actual cost, duration, and specific WBS elements emerge as powerful predictors.
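The workflow this abstract describes (train/test split, cross-validation, MSE and R², residuals for plotting) can be sketched roughly as follows. This is a minimal illustration with scikit-learn on synthetic data, not the study's code: the dataset, the three feature columns, and all hyperparameters are assumptions, and XGBoost is left out to avoid an extra dependency.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the study's project dataset is not available here.
rng = np.random.default_rng(0)
n = 400
# Hypothetical columns: productivity rate, scope change magnitude, scope change timing.
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 - 0.5 * X[:, 2] + rng.normal(scale=0.3, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated R² on the training set only.
    cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "cv_r2": cv_r2,
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
        "residuals": y_test - pred,  # plot these against `pred` for a residual plot
    }

for name, r in results.items():
    print(f"{name:>8}: CV R2={r['cv_r2']:.3f}  test MSE={r['mse']:.3f}  test R2={r['r2']:.3f}")
```

Plotting each model's `residuals` against its predictions gives the residual plot mentioned in the abstract; visible curvature or a funnel shape would point to a poor fit.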
Automated Assessment of Residual Plots with Computer Vision Models
Li, Weihao, Cook, Dianne, Tanaka, Emi, VanderPlas, Susan, Ackermann, Klaus
Plotting the residuals is a recommended procedure for diagnosing deviations from linear model assumptions, such as non-linearity, heteroscedasticity, and non-normality. The presence of structure in residual plots can be tested using the lineup protocol for visual inference. There are a variety of conventional residual tests, but the lineup protocol, used as a statistical test, performs better for diagnostic purposes because it is less sensitive and applies more broadly to different types of departures. However, the lineup protocol relies on human judgment, which limits its scalability. This work presents a solution by providing a computer vision model to automate the assessment of residual plots. It is trained to predict a distance measure, based on Kullback-Leibler divergence, that quantifies the disparity between the residual distribution of a fitted classical normal linear regression model and the reference distribution. In extensive simulation studies, the computer vision model exhibits lower sensitivity than conventional tests but higher sensitivity than human visual tests. It is slightly less effective on non-linearity patterns. Several examples from classical papers and contemporary data illustrate the new procedures, highlighting their usefulness in automating the diagnostic process and supplementing existing methods.
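To make the lineup protocol concrete, here is a rough sketch of how a lineup of residual plots might be assembled: the observed residuals are hidden among null residuals generated under the assumption that the fitted model is correct. The helper names `fit_ols` and `make_lineup` are hypothetical, and generating null data by parametric simulation from the fitted line is an assumption, not necessarily the paper's procedure. The paper's computer vision model is not reproduced here.

```python
import numpy as np


def fit_ols(x, y):
    """Return fitted values and residuals from a simple least-squares line."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    return fitted, y - fitted


def make_lineup(x, y, n_plots=20, seed=0):
    """Build (fitted, residuals) panels: one observed, the rest simulated nulls."""
    rng = np.random.default_rng(seed)
    fitted, resid = fit_ols(x, y)
    # Residual standard error with two estimated coefficients.
    sigma = np.sqrt(resid @ resid / (len(x) - 2))
    true_pos = int(rng.integers(n_plots))
    panels = []
    for i in range(n_plots):
        if i == true_pos:
            panels.append((fitted, resid))
        else:
            # Null data: simulate a response from the fitted line with
            # normal errors, then refit and keep the new residuals.
            y_null = fitted + rng.normal(scale=sigma, size=len(x))
            panels.append(fit_ols(x, y_null))
    return panels, true_pos


# Toy data with a quadratic term that a straight-line fit cannot capture.
data_rng = np.random.default_rng(1)
x = data_rng.uniform(-1.0, 1.0, size=100)
y = 1.0 + 2.0 * x + 3.0 * x**2 + data_rng.normal(scale=0.3, size=100)

panels, true_pos = make_lineup(x, y)
```

In a human lineup test, each panel would be drawn as a scatter of residuals against fitted values and an observer asked to pick the odd one out. If the observed panel is picked far more often than 1 in `n_plots`, that is evidence against the model.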
10 Amazing Machine Learning Visualizations You Should Know in 2023 - KDnuggets
Data visualization plays an important role in machine learning. Visualizations tied directly to the key steps of a machine learning workflow are called machine learning visualizations. Creating them can be complicated because it often takes a lot of code, even in Python. Thanks to Python's open-source Yellowbrick library, however, even complex machine learning visualizations can be created with less code. The library extends the Scikit-learn API and provides high-level functions for visual diagnostics that Scikit-learn itself does not offer.
Learn Excel's Powerful Tools for Linear Regression
Additionally, ggplot2 is a powerful visualization library that allows us to easily render the scatterplot and the regression line for a quick inspection. If you're interested in producing similar results in Python, the best way is to use the OLS (Ordinary Least Squares) model from statsmodels. Its output is the closest to that of R's base lm function, producing a similar summary table. We'll start by importing the packages we need to run the model. Next, let's prepare our data.
The Importance of Analyzing Model Assumptions in Machine Learning
With this summary, we can see important values such as R², the F-statistic, and many others. You can also analyze a model using a graphical diagnostic such as plotting the residuals against the fitted/predicted values. Above is the fitted versus residual plot for our weight-height dataset, using height as the predictor. For the most part, the points are randomly scattered. However, as fitted values increase, so does the range of residuals, which suggests non-constant variance (heteroscedasticity).
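The widening pattern described here can also be checked numerically. The sketch below simulates height-weight data whose noise grows with height (an assumption made purely for illustration, not the article's dataset), fits a line, and compares the residual spread in the lower and upper halves of the fitted values.

```python
import numpy as np

# Simulated data (assumption): noise standard deviation grows with height,
# so the data are deliberately heteroscedastic.
rng = np.random.default_rng(1)
height = rng.uniform(150.0, 200.0, size=500)
noise_sd = 0.1 * (height - 140.0)
weight = -80.0 + 0.9 * height + rng.normal(scale=noise_sd, size=500)

# Straight-line fit; np.polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(height, weight, deg=1)
fitted = intercept + slope * height
residuals = weight - fitted

# Sort residuals by fitted value and split them into a lower and an upper half.
order = np.argsort(fitted)
low_half, high_half = np.array_split(residuals[order], 2)

# A ratio well above 1 means residuals spread out as fitted values increase.
spread_ratio = high_half.std() / low_half.std()
print(f"residual spread ratio (high/low fitted): {spread_ratio:.2f}")
```

A ratio close to 1 would be consistent with constant variance. This split-half comparison is only a quick heuristic and not a formal test of heteroscedasticity.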
Beginners Guide to Regression Analysis and Plot Interpretations Tutorials & Notes Machine Learning HackerEarth
The road to machine learning starts with regression. If you aspire to become a data scientist, regression is the first algorithm you need to learn and master, not just to clear job interviews but to solve real-world problems. To this day, many consultancy firms continue to use regression techniques at scale to help their clients. It is undoubtedly one of the easiest algorithms to learn, but reaching a master level requires persistent effort.
How to use data analysis for machine learning, part 2 - SHARP SIGHT LABS
In part 1, we went over how to use data visualization and data analysis prior to machine learning. For example, we discussed how to visualize the data to identify potential issues in the dataset, examine the variable distributions, etc. In this blog post, we'll continue by building a very simple model and using data visualization to examine that model. Just a quick reminder: as I noted in part 1, we're working with a very simple model. This is deliberately a "toy" model, which allows us to focus on the visualization/analysis aspect of the task without the added level of complexity that we'd inject by using a more advanced machine learning algorithm.