Managing Expectations — Part 2
If you didn’t read my post Managing Expectations — Part 1, I encourage you to do so to have better context for this one.
Regression analysis is a statistical method that helps us understand how changes in one variable (the independent variable) are associated with changes in another (the dependent variable). There are various types of regression analysis, including linear regression, logistic regression, polynomial regression, and others, each suited to different types of data and research questions. Linear regression, for instance, assumes a linear relationship between the variables, while logistic regression is used when the dependent variable is binary (e.g., yes/no, true/false). In linear regression, the result is a linear function: the straight line that passes as close as possible to the observed data points in a scatter plot.
As you can see in the snapshot above, the function doesn’t fit the data perfectly: it has an error term. In practice, there will always be some error in the equation. Neither linear regression nor any of the other types will generate a perfect prediction.
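To make the error term concrete, here is a minimal sketch of a least-squares fit using only the Python standard library. The data points (hours studied against test grade) and the helper name `fit_line` are made up for illustration; they do not come from a real project.

```python
# Hypothetical data, invented for this example:
# hours studied (independent) and test grade (dependent).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grades = [52.0, 61.0, 63.0, 74.0, 80.0]


def fit_line(xs, ys):
    """Ordinary least squares for one independent variable.

    Returns (slope, intercept) of the line that minimizes the sum of
    squared vertical distances to the points.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, grades)

# Predictions on the same points, and the error (residual) for each one.
predicted = [intercept + slope * x for x in hours]
residuals = [y - p for y, p in zip(grades, predicted)]

for x, y, p, r in zip(hours, grades, predicted, residuals):
    print(f"hours={x:.0f} actual={y:.1f} predicted={p:.1f} error={r:+.1f}")
```

Even on this tiny data set, the fitted line misses most of the points by a small amount. Those misses are the error the model carries with it.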
In project/program management, this statistical method can help us predict the outcome we want to achieve, within an acceptable degree of error. This deviation can also be treated as risk.
Suppose you aim to excel in a physics test. In this scenario, the independent variables include:
- Number of hours of sleep.
- Number of practice exercises completed successfully.
- Hours dedicated to studying.
Although additional variables could be considered, for the sake of simplicity, let’s focus on these three. Each variable contributes to predicting our final grade, yet predictions are never exact. The model, as a representation of reality, is built by focusing on a specific set of variables while holding others constant, even though they may vary in reality. Moreover, we cannot control all independent variables. For instance, imagine your physics teacher encounters an unforeseen issue and delegates the test creation to a colleague. This change might lead to a different test approach and/or exercises than anticipated. Such unpredictability is incorporated into the error or risk of the prediction.
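One way to write the physics test example down, assuming the relationship is linear, is the equation below. The symbols are my own notation for illustration: each β is a weight the regression would estimate from data, and ε is the error term that absorbs everything the model leaves out, such as the change of teacher.

```latex
\text{grade} = \beta_0
  + \beta_1 \,\text{sleep hours}
  + \beta_2 \,\text{exercises completed}
  + \beta_3 \,\text{study hours}
  + \varepsilon
```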
The question that needs to be answered is: how can we minimize errors?
- Collecting High-Quality Data: Use accurate and relevant data.
- Feature Selection: Choose important variables and ignore irrelevant ones.
- Transforming Variables: Change the way variables are measured to fit better with the model.
- Addressing Multicollinearity: Avoid using variables that are highly correlated with each other.
- Model Specification: Pick the right type of model for the data.
- Cross-Validation: Check how well the model works with new data.
- Residual Analysis: Look for patterns in the differences between predicted and actual values.
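As a rough illustration of the last two items, the sketch below runs a leave-one-out cross-validation and compares it with the in-sample residuals, again using only the Python standard library. The data and the function names are hypothetical, and leave-one-out is just one of several cross-validation schemes.

```python
# Hypothetical data, invented for this example.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grades = [52.0, 61.0, 63.0, 74.0, 80.0]


def fit_line(xs, ys):
    """Least-squares slope and intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def in_sample_residuals(xs, ys):
    """Residual analysis: actual minus predicted, on the data used to fit."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def leave_one_out_errors(xs, ys):
    """Cross-validation: hold out each point, fit on the rest, predict it."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(ys[i] - (intercept + slope * xs[i]))
    return errors


def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)


residuals = in_sample_residuals(hours, grades)
loo_errors = leave_one_out_errors(hours, grades)

print(f"mean absolute error on seen data:   {mean_abs(residuals):.2f}")
print(f"mean absolute error on unseen data: {mean_abs(loo_errors):.2f}")
```

The error on held-out points is the more honest number to share with stakeholders, because it describes how the model behaves on data it has not seen.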
When running a project, time is most likely not on your side. You’ll have to opt for the approach that best fits the trade-offs your stakeholders are willing to accept. For example, collecting high-quality data for a new development that hasn’t been done before may not be feasible, so feature selection may be the better option. Not having enough data will produce estimates that carry a higher degree of risk.
Wrapping up: while errors can be minimized through techniques like the ones listed, implementing them in a project is initially an art that can later transition into a scientific method. Throughout this process, expectations should be properly set and communicated to the stakeholders. Deviations from reality will occur, but if we know they could happen, the impact is usually smaller.