Often (e.g., especially in process control apps), there are big delays (i.e., seconds, minutes and, at times, even hours) inherent between the measurement of an observed variable and the control or response associated with it.
In continuous processes, it's not always possible to have sampled a variable of interest exactly when it is "interesting". Especially if the decision as to *when* it was interesting happens as a result of some FUTURE knowledge!
So, I often end up with a deep queue of sampled variables into which I peek to obtain the value at the time of interest.
This usually means interpolating between samples that are present in the queue and deriving a representative sample of the variable in question.
If you know the characteristics of the parameter being monitored, you can exploit that to minimize the error in the derived value.
Two common strategies I use are linear interpolation (i.e., if observations v1 and v2 happen at t1 and t2, then the value at t1
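A minimal sketch of the queue-and-interpolate idea, in Python for brevity. The `SampleQueue` class and its method names are made up for illustration. It assumes samples arrive in increasing time order and that the variable behaves roughly linearly between neighbouring samples.

```python
from bisect import bisect_left
from collections import deque


class SampleQueue:
    """Bounded history of (time, value) samples, oldest first (hypothetical helper)."""

    def __init__(self, maxlen=1024):
        # Oldest samples fall off the front once maxlen is reached.
        self._samples = deque(maxlen=maxlen)

    def push(self, t, v):
        # Assumption: samples arrive in strictly increasing time order.
        if self._samples and t <= self._samples[-1][0]:
            raise ValueError("samples must arrive in increasing time order")
        self._samples.append((t, v))

    def value_at(self, t):
        """Linearly interpolate the value at time t from the bracketing samples."""
        times = [s[0] for s in self._samples]
        if not times or t < times[0] or t > times[-1]:
            raise ValueError("t is outside the recorded interval")
        i = bisect_left(times, t)      # first sample with time >= t
        t2, v2 = self._samples[i]
        if t2 == t:
            return v2                  # exact hit, no interpolation needed
        t1, v1 = self._samples[i - 1]  # sample just before t
        return v1 + (v2 - v1) * (t - t1) / (t2 - t1)
```

Once "the time of interest" becomes known, you would call `value_at()` with that time. The samples need not be periodic.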
Why not? Is the process such that these transit times are variable as well?
No, the knowledge is already there. Unless your description in the first paragraph is misleading me, the control action is made and then, a time delay later, the response can be measured. So in essence it's about knowing what to expect, and roughly when, while "rooting for" the control action to have been 'correct'; if it wasn't, you need to correct it.
If you decide to take action at some multiple of the time delay you don't need such a deep queue, just an appropriately designed one.
What does this buy you? The action occurred in the past and, after a delay, you measure the result, right?
IIUC, which I still doubt, all this strategy could be replaced by a triggering mechanism based on the variable delay of the response.
To make this less of a deaf-mute dialog, it would help if you could describe the process more explicitly.
But from your descriptions it seems you're complicating the problem somewhat.
This is a problem in automatic control, which has specific solutions. For example, the transit delay could be transformed into a 'linear delay': if the result that is delayed in time is the strength of a solution, or a mixture of two liquids at different temperatures, you can interpose a mixing tank (sometimes with a stirrer).
Other times you can use the delayed control loop in a cascaded approach and have the control action done by feedforward control, so that it is temporally near to the control action, etc.
Details of the process to be controlled would avoid lengthy discussions on how to build castles in the clouds...
If you know the dynamic properties of the thing you're trying to observe, and the rough magnitudes of the disturbances and measurement noise, if you have some time to work through some math, and especially (but not necessarily) if you have a pretty good knowledge of the inputs to the process, then what you are asking for is a Kalman filter, or perhaps an H-infinity filter.
In principle Kalman filters are easy, especially if the plant is reasonably close to being linear with lumped states, and you are sampling at uniform intervals. Their big downside is that their performance degrades fast when the model from which you design them departs to any great degree from the plant's actual behavior.
H-infinity filters are much more robust to model discrepancies, and they are quite doable, although the design step is compute-intensive. However, if the plant isn't reasonably close to being linear with lumped states, or if you aren't sampling at uniform intervals, then you have to do that compute-intensive design step at run-time, instead of ahead of time. Nevertheless, if you've got the right combination of slow plant and fast processor, it may be a way to go.
For both the Kalman and H-infinity filters, you get better estimates of states if you can collect data in advance of your reasoning, which is exactly what you're doing.
There is a difference between interpolating between existing values, and extrapolating future values. As Tim said, things like Kalman filters are often used for predicting future values.
For interpolation, if a simple linear interpolation is not accurate enough, go for cubic interpolation. There are different ways to fit a cubic spline to a set of points, but they give a good balance between accuracy and computational effort for many curve-fitting applications. You can also use them to extrapolate a little, but beware of trying to look too far into the future!
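As a rough illustration of stepping up from linear to cubic, here is a sketch that passes a single cubic through four neighbouring samples (Lagrange form). Note this is a local cubic, not a full spline fit, and the function name `cubic_interpolate` is just something picked for the example. The sample times do not have to be evenly spaced, but they must be distinct.

```python
def cubic_interpolate(points, t):
    """Evaluate at t the unique cubic through four (time, value) points."""
    if len(points) != 4:
        raise ValueError("exactly four points are required")
    total = 0.0
    for i, (ti, vi) in enumerate(points):
        # Lagrange basis weight: 1 at ti, 0 at the other three sample times.
        weight = 1.0
        for j, (tj, _) in enumerate(points):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += vi * weight
    return total
```

Evaluating at a `t` outside the four sample times extrapolates, and the error there can grow quickly, which is the "don't look too far into the future" warning above.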
A good place to start with spline interpolation is the GNU plotutils package. I've used it at command line level to interpolate altitude tables, initially stored at 100 ft intervals in the range -2000 to +30000 ft. When modelled in a spreadsheet, the result showed less than 0.5 ft error over the whole range, with a resolution of 1 ft. You can use it to extrapolate a few values outside the defined range as well. There's also the least squares method, which I believe is quite widely used in embedded work for interpolation / curve fitting because it's not too computationally intensive.
You hear a lot about Kalman filters, almost fashion of the month, but apparently not always the best approach and very computationally intensive...
Kalman filters are -- in theory -- the optimal way to interpolate between past samples, too. Whether they're enough better than some blind method (like linear interpolation or cubic) to justify the extra design work depends on the problem at hand.
Again, one thing that would tip the balance is if you have some knowledge of the inputs that drive the dynamic behavior -- if you do, and if they make much difference, then the Kalman filter provides a structured framework for using the information instead of ignoring it or flailing around trying to apply some ad-hoc method.
I was smiling when I recommended the OP look into them, because there's a _lot_ of people out there that say "Kalman filter" with the same expectations that they would have if they said "Magic filter". Then when you try to explain the drawbacks they respond with a certain stubborn wishfulness that confirms that what they _really_ want is indeed a magic filter that extracts more information out of their data than was there in the first place.
So, no, a Kalman filter _isn't_ always the best approach. But it could be in this case (I'm waiting for the OP to get back), and if it isn't an H-infinity _might_ be.
For certain conditions (a plant that is only mildly nonlinear, and which is sampled at a fixed rate), the on-line computational complexity of a Kalman is pretty mild. You have to do some math up front, at design time, but not when things are actually running. Ditto an H-infinity filter, but the math to design an H-infinity filter sucks the processor down to a much greater extent.
In fact, for the above conditions, and a plant with just one input and one output, and a choice to use a steady-state Kalman, your filter ends up being a pretty mundane IIR. Only the math that you used to arrive at it makes it special.
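To make the "mundane IIR" point concrete, here is a sketch for the simplest case I can think of: a scalar random-walk model (state drifts with process-noise variance `q`, measurement has noise variance `r`). That model, and the function names, are assumptions chosen for the example, not anything the OP described. The design-time part iterates the Riccati recursion to get the steady-state gain; the run-time part is a one-line first-order IIR.

```python
import math


def steady_state_gain(q, r, iterations=200):
    """Design time: run the scalar Riccati recursion a fixed number of steps."""
    p = 1.0  # initial estimate variance; for q > 0 and r > 0 the gain converges regardless
    k = 0.0
    for _ in range(iterations):
        p_pred = p + q             # predict: variance grows by the process noise
        k = p_pred / (p_pred + r)  # Kalman gain for this step
        p = (1.0 - k) * p_pred     # update: variance shrinks after the measurement
    return k


def iir_filter(measurements, k, x0=0.0):
    """Run time: x[n] = (1 - k) * x[n-1] + k * z[n], a plain first-order IIR."""
    x = x0
    out = []
    for z in measurements:
        x += k * (z - x)
        out.append(x)
    return out
```

For this model the gain can also be had in closed form: the steady-state predicted variance is `(q + math.sqrt(q*q + 4*q*r)) / 2`, and the gain is that value divided by itself plus `r`.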
Only if you have a seriously nonlinear process that forces you to use an extended Kalman filter (which, in turn, forces you to repeat the design calculations at each step), do you start hitting real computational complexity.
Disclaimer - math is my weak point and I'm not even clear I understand what you are asking for..
Years ago we often used quadratic equations to linearize sensor outputs. We had some very simple viewing programs that would take the sensor output curve and allow us to see the result of the linearization as we tweaked the quadratic equation coefficients to create an ad hoc correction.
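Something along these lines, perhaps: a sketch of applying a quadratic correction and checking the worst residual against calibration points while the coefficients are tweaked. The names `correct` and `worst_residual`, and the calibration numbers in the usage note, are invented for illustration.

```python
def correct(raw, c0, c1, c2):
    """Apply the quadratic correction c0 + c1*raw + c2*raw^2 to a raw reading."""
    return c0 + c1 * raw + c2 * raw * raw


def worst_residual(calibration, c0, c1, c2):
    """Largest absolute error over (raw, true) calibration pairs for these coefficients."""
    return max(abs(correct(raw, c0, c1, c2) - true) for raw, true in calibration)
```

With made-up calibration pairs such as `[(0.0, 2.0), (2.0, 4.0), (4.0, 8.0)]`, you would adjust `c0`, `c1` and `c2` until `worst_residual()` drops below whatever the spec allows.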
I just tend to look for the least complex solution that gets the job done to spec. Call it lazy, but there's no point in over-embellishing a solution because it looks interesting :-). That sort of stuff can be done in quiet time. I know Kalman filtering is applied very successfully to, for example, inertial nav systems, but a paper I read recently suggests that it's not so good with noisy inputs, say MEMS sensors.
Having said that, it's pretty academic here, as I suspect that I would need to do a math refresher course to handle the theory. Too many years of embedded system design and lack of use dims the mind :-)...
I couldn't agree more. If something simpler will work, then use it. In theory, the OP could cut down his ultimate sampling rate by using a Kalman filter for reconstruction -- but whether he can cut it down by 1% or 99% is way up in the air, and depends on quite a few conditions that he has not stated.
That's actually a good part of my motivation to vent about folks wanting magic filters. An inertial navigation system with MEMS sensors will be at its best if you use a Kalman filter to process the data -- but it may not be nearly as good as you'd like it to be when you compare its performance to systems that use better inertial sensors.
I've actually done extensive work recently on a Kalman filter that's processing data from MEMS sensors, and it is working pretty darn well -- not magically, but astonishingly well. The only thing to remember is that given a certain set of data going in, any filter can only do so well. The Kalman filter design process (or one of its derivatives) provides a way to find the best filter (or the best starting point) for the data, but insufficient data is still insufficient data.
I think the reason that "Kalman filter" does so often get mistaken for "Magical filter" is because the Kalman process _can_ work so well, even when one's intuition is telling one that there's not enough signal and too much noise to be able to pull things out. The Kalman design process is like flying IFR: you put your intuition on the shelf and you just trust the math, and that gives you the best shot at things working.
Well, then you get a consultant to do the heavy lifting for just that part. Like (modest blush :-)) -- me.
Seriously, the math can be a bit hairy, and knowing _which_ math to use takes some experience. But it can be done.
Sorry for the delay
> Often (e.g., especially in process control apps), there are big
In looking over the responses (thanks!), my mention of process control applications AS AN EXAMPLE seems to have caused folks to focus on that -- incorrectly. :<
[this appears to be a characteristic of the human brain -- wanting to focus on specifics instead of more general solutions. There are some interesting texts describing classic experiments that illustrate this behavior :-( ]
The problem I posed is essentially one of data compression. I.e., when *given* samples of a continuous variable at particular (not necessarily periodic!) points in time and *knowing* something about how that variable behaves, how can you reduce the representation of that variable's values over time to a more condensed form -- FROM WHICH you can later extract its value (within a given level of accuracy) at *any* time in that bounded interval.
E.g., if I dropped a rock from a building and sampled its velocity at 1.0374 seconds, 3.7777 seconds and 4.002 seconds after its release, I could VERY ACCURATELY tell you its relative position at any time from t=0 to the point at which it meets an immovable object. (d'uh... example chosen for its obviousness). I wouldn't have to sample AND STORE its position every millisecond in order to be able to *tell* you its position (after the fact) at any point in that interval.
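The rock example can be sketched as follows, assuming constant acceleration and no drag, so that velocity is a straight line in time and two fitted numbers replace the whole history. The velocity readings are synthetic (generated from an assumed g of 9.81 m/s^2), and `fit_line` / `position` are names made up for the example.

```python
def fit_line(ts, vs):
    """Least-squares fit of v = a*t + b; returns (a, b)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    stv = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs))
    a = stv / stt
    b = mean_v - a * mean_t
    return a, b


def position(a, b, t):
    """Integrate v = a*t + b from 0 to t: distance fallen relative to the release point."""
    return 0.5 * a * t * t + b * t


G = 9.81                                    # m/s^2, assumed for the synthetic data
sample_times = [1.0374, 3.7777, 4.002]      # the sample times quoted above
velocities = [G * t for t in sample_times]  # noise-free synthetic readings
a, b = fit_line(sample_times, velocities)   # the entire "compressed" record
```

After this, `position(a, b, t)` answers "where was it at time t?" for any `t` in the interval, without any per-millisecond storage.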
This is true of many "variables" -- *if* you put some constraints on how it is *allowed* to behave "between observations" (or, alternatively, on how often you observe it given a particular type of allowed behavior).
E.g., if I measure a child's height while growing, I can be reasonably sure that if he/she was H1 at T1 and H2 at T2, then all Hi in [T1,T2] will satisfy H1 <= Hi <= H2 [...] to reduce the resources set aside to track these parameters
I.e., to be able to fold (compress) observations into other observations by exploiting knowledge of the variable's dynamic behavior. If I can eliminate observation X, then there is also the possibility of *making* observation X!
I posed this to a fellow, here, "at the water cooler". This quickly escalated to a considerable "diversion" as others got drawn into the discussion.
[Gotta love math majors -- the world is nice and simple for them! Equations can be massaged in ways that intuition wouldn't take as obvious. And, it's amusing to watch them "zone out" into their own techno-babble -- (sigh) if only all communications could be that unambiguous! :< ]
Two days later, the key result of these exchanges was the realization that you can pick and choose between suitable models AT WILL since you are modeling the *past* and not trying to predict the future! And, that you can freely switch between models at different points in time. E.g., replacing higher order models with simpler ones in certain regions.
So, now I'll look at efficient ways to fit multiple models to the same data to choose between them (picking the one that requires the least amount of resources for a given level of reproduction accuracy).
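A sketch of what "fit several models and keep the cheapest one that is accurate enough" might look like, under the assumption that the candidate models are polynomials of increasing degree and that "resources" means the number of stored coefficients. The functions below are illustrative names, and the error is only checked at the sample points.

```python
def polyfit(ts, ys, degree):
    """Least-squares coefficients c[0] + c[1]*t + ... via the normal equations."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [[sum(t ** (i + j) for t in ts) for j in range(n)]
         + [sum(y * t ** i for t, y in zip(ts, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [x - f * p for x, p in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def polyval(coeffs, t):
    """Evaluate the polynomial at t using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def cheapest_model(ts, ys, tolerance, max_degree=3):
    """Return (coeffs, worst_error) for the lowest degree within tolerance, else None."""
    for degree in range(min(max_degree, len(ts) - 1) + 1):
        coeffs = polyfit(ts, ys, degree)
        worst = max(abs(polyval(coeffs, t) - y) for t, y in zip(ts, ys))
        if worst <= tolerance:
            return coeffs, worst
    return None  # no candidate met the tolerance; keep the raw samples instead
```

Because the past is being modelled, nothing stops you from running this separately on different stretches of the data and ending up with a different degree in each region.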
I'll try to write up my results (instead of just publishing the code) as I suspect the thinking behind the model derivations will be the more valuable issue.
I didn't say that "everyone" missed the mark. Several posts (even *yours*!) commented on the curve-fitting aspect. What seems to have been missed is using that activity to *reduce* data (storage) requirements. And leveraging the knowledge of the variable's behavioral *constraints* to choose a modeling technique.
That's entirely possible -- my verbal skills are abysmal! But, this is the only forum that always seems to want "examples" (instead of just "solving" a general question as posed) so that complicates my task.
Given this clarification, can you suggest a better way for me to have posed the question that would have *avoided* the misunderstanding (and NOT evoked the "Why do you want to do that?" response)?
My ineptitude here doesn't factor into how people react in controlled experiments elsewhere :>
In a sense, "yes"... but, MP3 is lossy. And, AFAIK, there is no way to quantitatively pick the degree of "inaccuracy" in its reproduction of the input signal. I.e., you can't say "This representation is guaranteed to limit the instantaneous error between original signal and reproduced signal to X for any time in the interval represented" (obviously, I need to be able to know how (in)accurate the reproduction is when I reconstruct particular values). Note that some coding techniques will allow you to bound the error at particular points (e.g., fit a series of piecewise linear segments to each pair of consecutive samples and you *know* the samples are reproduced exactly).
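One way to push the piecewise-linear idea a step further is a greedy pass that drops any sample a straight line between its kept neighbours already reproduces within a chosen tolerance. This is a sketch under that assumption, with made-up function names. Kept samples are reproduced exactly and dropped samples to within `tolerance`; nothing is guaranteed *between* samples unless the variable's behavior is constrained there.

```python
def _fits(samples, i, j, tolerance):
    """True if the line from samples[i] to samples[j] is within tolerance of every sample between."""
    t1, v1 = samples[i]
    t2, v2 = samples[j]
    for t, v in samples[i + 1:j]:
        predicted = v1 + (v2 - v1) * (t - t1) / (t2 - t1)
        if abs(predicted - v) > tolerance:
            return False
    return True


def compress(samples, tolerance):
    """Greedily keep only the samples needed as endpoints of linear segments."""
    if len(samples) < 3:
        return list(samples)
    kept = [samples[0]]
    anchor = 0  # index of the most recently kept sample
    end = 1     # furthest index known to work as the segment's other endpoint
    while end < len(samples) - 1:
        candidate = end + 1
        if _fits(samples, anchor, candidate, tolerance):
            end = candidate            # segment can be stretched one sample further
        else:
            kept.append(samples[end])  # close the segment at the last good endpoint
            anchor = end
            end = anchor + 1
    kept.append(samples[-1])
    return kept
```

Reconstruction is then ordinary linear interpolation between the kept samples.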
Apparently, it's not a hard problem to solve -- you just have to throw CPU cycles at it and have a good selection of models to choose from. Unfortunately, it is really hard to figure out how to "factor" computations that affect the fitting of multiple models to some common computation so they aren't needlessly repeated. E.g., you need some transparent way of caching intermediate results without the code being aware of them. :<
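For polynomial models, at least, the shared work is easy to identify: every degree needs the same power sums of the sample times, so memoizing them lets the fitting code stay unaware of the reuse. A sketch, assuming polynomial least-squares candidates; `make_moments` and `normal_equations` are hypothetical helpers, and the call counter exists only to show the saving.

```python
from functools import lru_cache


def make_moments(ts, ys):
    """Return memoized power-sum functions shared by fits of every degree."""
    calls = {"count": 0}  # counts real computations, for illustration only

    @lru_cache(maxsize=None)
    def sum_t(power):
        calls["count"] += 1
        return sum(t ** power for t in ts)

    @lru_cache(maxsize=None)
    def sum_ty(power):
        calls["count"] += 1
        return sum(y * t ** power for t, y in zip(ts, ys))

    return sum_t, sum_ty, calls


def normal_equations(degree, sum_t, sum_ty):
    """Build the least-squares normal equations; repeated sums come from the cache."""
    n = degree + 1
    matrix = [[sum_t(i + j) for j in range(n)] for i in range(n)]
    rhs = [sum_ty(i) for i in range(n)]
    return matrix, rhs
```

Building the degree-1 system and then the degree-2 system from the same `sum_t` / `sum_ty` only computes the sums the second system adds; the rest are cache hits. Other model families would need their own notion of a shared intermediate.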
I once worked in an area where signal detection was a little complex. While I personally don't have the math to follow everything here, from a conceptual viewpoint, I can say that we had good results once we qualified the aspect(s) that were important, then built a system that could both detect specific features AND use the 'better' signals to build a 'model' which was used to 'auto-correlate' any following signal. The system even had an arming feature and a 'window' of acceptability (built with timing and level thresholds, etc.).