# So Now What am I Supposed to Do?
By:: [[Ross Jackson]]
2022-09-13
Analysts seldom work with complete datasets. When data spans many years, it is common for variables to change. Something that used to be measured stops being measured, and new variables emerge partway through the dataset. Life and its corresponding data are messy. When examining a compiled dataset, one might confront the ubiquitous and dreaded “NA.” At that point, the analyst must decide how to proceed.
There is no clear answer as to how one “should” proceed. Some might argue to replace the “NAs” with the average value. Others might argue to replace them with the minimum or the maximum [[value]]. Still others might argue to omit every observation that contains an “NA” value and run the analysis only on the observations that contain all required numeric values. Without a clear, easy, or universally accepted way forward, it is left to the analyst to decide what to do. And although there is no rule, rest assured that if those in power do not like the answer generated, there will be no shortage of reasons why an alternative approach would have been “better” than the one the analyst selected.
So, if the analyst has the responsibility to decide, but everyone consuming the results gets to be a critic, what is the best course of action? If time permits, the analyst is often well served to run the analysis under each alternative and compare the outcomes to determine whether the selected approach influences the results. Going back to the example above, an analyst might not know if it is “best” to use the minimum, average, or maximum value or to omit the observations that include NA. What, after all, does “best” even mean? If the analyst conducts the analysis in each of those four ways and comes to the same general outcome, the analyst can inform the decision-makers that the results are insensitive to that choice, and therefore the choice doesn’t really need to be made at all. If the analyst finds the results are highly sensitive to the choice, the analyst can open the topic up to the decision-makers for them to make the determination.
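The four-way comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a recommended method: the function names (`impute`, `sensitivity`, `is_insensitive`), the sample data, and the tolerance are all hypothetical, `None` stands in for “NA,” and the statistic being compared is simply the mean of a single variable. In a real dataset, “omit” would drop whole observations rather than single values.

```python
from statistics import mean

def impute(values, strategy):
    """Return a copy of values with None (standing in for NA) handled per strategy."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to work from")
    if strategy == "omit":
        # Drop the missing entries entirely.
        return observed
    # Fill every missing entry with the min, mean, or max of the observed values.
    fill = {"min": min, "mean": mean, "max": max}[strategy](observed)
    return [fill if v is None else v for v in values]

def sensitivity(values, statistic=mean, strategies=("min", "mean", "max", "omit")):
    """Compute the same statistic once per NA-handling strategy."""
    return {s: statistic(impute(values, s)) for s in strategies}

def is_insensitive(results, tolerance):
    """True when the spread across strategies is within the (assumed) tolerance."""
    return max(results.values()) - min(results.values()) <= tolerance

# Hypothetical data: six observations, two of them missing.
data = [10.0, None, 12.0, 11.0, None, 13.0]
results = sensitivity(data)
for strategy, value in results.items():
    print(f"{strategy:>4}: {value}")
print("insensitive at tolerance 1.0:", is_insensitive(results, 1.0))
```

What counts as “the same general outcome” is itself a judgment call; here it is reduced to a single tolerance on the spread, which the analyst would need to choose and be prepared to defend.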
If the purpose of the analysis is to inform decision-makers, this approach is beneficial. One can tell the decision-makers whether a given determination is essential or inconsequential and, if it is essential, what its ultimate consequence is. In cases where analytic technique suggests one approach over another, the analyst can inform those concerned of that as well. [[Understanding]] when a decision is necessary is important. Often one doesn’t need to decide as much as one needs to provide the boundary conditions associated with a given decision. Informed by this understanding, the analyst is better able to determine what one is supposed to do now.
#### Related Items
[[Analytics]]
[[Decision-making]]
[[Statistics]]
[[Data]]
[[Sensitivity Analysis]]