Principled approaches to shifting data distributions with incorrect sensor calibrations
I am working with feature data consisting of sensor measurements. However, some of these sensors have been recalibrated using different base points over time, so there are distribution shifts in my data, such as the one in the example graph in this post.
I have no data on the base points used for these sensors, but I have enough business knowledge to detect a sensor recalibration and shift the data back into the common distribution accordingly.
However, is anyone aware of more principled approaches to shifting parts of the data? I am not referring to things like change point detection, but rather to the actual shifting of the data.
Looking forward to any insights and experiences others have had with similar problems!
[Example of distribution shift](https://preview.redd.it/dqh24wjz7zza1.png?width=476&format=png&auto=webp&v=enabled&s=76d965b07f30894f591ca4313ad3404bc5d6f7da)
Comments ( 6 )
This is just like when a company pays a dividend! Typically, the dividend amount is subtracted from all earlier values in the time series. That second jump is positive, so you'll be subtracting a negative (making the earlier values more positive). My hunch is that the same solution applies here. Note: this is best done in a reverse for loop.
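A minimal sketch of that dividend-style back-adjustment, assuming the recalibration indices and their signed step sizes are already known (for example from business knowledge). The function name `back_adjust` and the `jumps` mapping are hypothetical, made up for illustration. The sketch adds the signed step to every earlier value, so everything ends up on the level of the most recent calibration; the opposite sign convention works the same way.

```python
def back_adjust(values, jumps):
    """Shift earlier values onto the level of the most recent calibration.

    values: list of observed measurements.
    jumps:  dict mapping index i to the signed calibration step that
            happened between values[i - 1] and values[i] (assumed known).
    """
    adjusted = list(values)
    offset = 0.0
    # Walk backwards so each step is applied to everything before it.
    for i in range(len(values) - 1, -1, -1):
        adjusted[i] = values[i] + offset
        # Crossing the recalibration at i: all earlier points also need
        # this step added to line up with the later data.
        offset += jumps.get(i, 0.0)
    return adjusted


observed = [10, 11, 12, 7, 8, 9, 15, 16]
print(back_adjust(observed, {3: -6.0, 6: 5.0}))
```

The reverse loop lets the offsets accumulate naturally: a point that sits before two recalibrations receives the sum of both steps.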
If we consider the simple model where the “true” measurements have some time-varying functional form, f(t), then your observed data can be modeled as Y(t) = Baseline(t) + f(t), where Baseline(t) is the step function that models your recalibrations. If you can specify when these occur, then your inferences can just be on f(t).
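As a rough sketch of this decomposition, assume the change points are given as sample indices and that f(t) moves by a roughly constant amount per sample near each recalibration. The step size at each change point is then the observed difference minus the typical difference seen elsewhere. The name `remove_step_baseline` and the median-based estimate are assumptions for illustration, not something prescribed by the model above. Note that Baseline(t) is only identified up to a constant, so the first segment is pinned to zero here.

```python
from statistics import median


def remove_step_baseline(y, change_points):
    """Split y into an estimated step baseline and the remainder f_hat.

    y:             list of observed values Y(t) at unit time steps.
    change_points: indices i where a recalibration occurs between
                   y[i - 1] and y[i] (assumed known).
    """
    cps = set(change_points)
    diffs = [y[i] - y[i - 1] for i in range(1, len(y))]
    # Typical one-step change of f(t), estimated away from recalibrations.
    normal = [d for i, d in enumerate(diffs, start=1) if i not in cps]
    typical = median(normal) if normal else 0.0

    baseline = []
    level = 0.0  # the first segment defines the reference level
    for i in range(len(y)):
        if i in cps and i > 0:
            # Whatever the typical change does not explain is the step.
            level += (y[i] - y[i - 1]) - typical
        baseline.append(level)

    f_hat = [yi - bi for yi, bi in zip(y, baseline)]
    return f_hat, baseline


f_hat, baseline = remove_step_baseline([10, 11, 12, 7, 8, 9, 15, 16], [3, 6])
print(baseline)
print(f_hat)
```

Any inference on f(t) can then run on `f_hat`. If f(t) is noisy or changes quickly, a step estimate based on several samples on each side of the change point would likely be more robust than this single-difference version.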
Maybe you can look at the first derivative of your data. Then your jumps become rather narrow peaks, and you could threshold to filter those time points out.
As others said, there are two parts. First, to detect the recalibrations you can calculate the first derivative. Given the shape of the recalibrations compared to the normal curve, you should be able to simply define a hard cutoff for the value of the first derivative, let's say D.
Any time you identify an interval of points whose derivative is greater than D, remove those points. Then calculate the difference S between the maximum and minimum Y values in that interval. All subsequent data is then translated along the Y axis by S units (could be positive or negative). Finally, since there is now a gap in your time series, you can use Lagrange interpolation to reconnect the line, then simply fill in points along your time axis on that line.
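The steps above could be sketched roughly as follows, assuming evenly spaced samples so that the first difference stands in for the derivative. The function names and the `support` parameter are hypothetical. Two choices here are assumptions rather than part of the description: the cutoff is applied to the absolute difference so that downward jumps are caught too, and S is taken as the signed height across the flagged interval (for a monotone jump its magnitude equals max minus min).

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        weight = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                weight *= (x - xm) / (xj - xm)
        total += yj * weight
    return total


def correct_jumps(y, D, support=2):
    """Remove jump points, shift the data after them, and refill the gaps."""
    n = len(y)
    # Flag point i when the first difference into it exceeds the cutoff D.
    flagged = [False] + [abs(y[i] - y[i - 1]) > D for i in range(1, n)]

    out = [None] * n  # None marks a removed point
    shift = 0.0
    i = 0
    while i < n:
        if not flagged[i]:
            out[i] = y[i] + shift
            i += 1
            continue
        start = i
        while i + 1 < n and flagged[i + 1]:
            i += 1
        # Signed jump height S across the flagged interval [start, i].
        S = y[i] - y[start - 1]
        shift -= S  # all later points are translated by -S
        i += 1

    # Refill each gap from up to `support` kept points on either side.
    kept = [k for k in range(n) if out[k] is not None]
    for k in range(n):
        if out[k] is None:
            before = [j for j in kept if j < k][-support:]
            after = [j for j in kept if j > k][:support]
            xs = before + after
            out[k] = lagrange_eval(xs, [out[j] for j in xs], k)
    return out


print(correct_jumps([10, 11, 12, 7, 8, 9, 15, 16], D=3))
```

Keeping `support` small matters because high-degree Lagrange polynomials tend to oscillate. Also note that S absorbs any genuine change in the signal across the flagged interval, so a strong underlying trend is slightly flattened at each corrected jump.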