Does it differentiate?

Greg is not so sure. I'll give some examples. First, here is a graph of some of the windows. I've added a step function, to show that in fact virtually any odd function will differentiate. The one called Welch^2 is the linear window multiplied by the square of a Welch taper. There is information about the Welch taper, and why it arises naturally in this context, here.
These are scaled to give the correct derivative for a line. On the legend, I have written beside each the RMS value, that is, the square root of the integral of the square. That is the factor by which you would expect white noise to be multiplied, as follows from the variance of a weighted sum. Ordinary regression has the least value. That is an aspect of its optimal quality. Better smoothing of HF comes at a cost.
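The scaling and the RMS comparison can be sketched in a few lines. This is only an illustration under stated assumptions: the half-width `n = 10`, the discrete sum standing in for the integral, and the taper normalisation `j / (n + 1)` are choices made here, not values read off the plot.

```python
import numpy as np

def odd_windows(n):
    """Derivative windows on offsets -n..n, each scaled so that sum(w * j) == 1."""
    j = np.arange(-n, n + 1, dtype=float)
    # Assumed normalisation: the Welch taper 1 - x^2 reaches zero one step outside the footprint.
    taper = 1.0 - (j / (n + 1)) ** 2
    raw = {
        "OLS": j,                    # ordinary regression slope
        "Welch": j * taper,          # linear window times Welch taper
        "Welch^2": j * taper ** 2,   # linear window times squared Welch taper
        "Step": np.sign(j),          # step function
    }
    # Dividing by sum(w * j) makes each window return slope b for the line a + b*j.
    return j, {name: w / np.sum(w * j) for name, w in raw.items()}

j, windows = odd_windows(10)
# Discrete analogue of the RMS value: square root of the sum of squared weights.
rms = {name: float(np.sqrt(np.sum(w ** 2))) for name, w in windows.items()}
for name, value in rms.items():
    print(f"{name:8s} RMS = {value:.4f}")
```

Among all windows with `sum(w * j) == 1`, the one proportional to `j` has the smallest sum of squares (a Cauchy-Schwarz argument), which is why OLS comes out lowest.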
So now to apply these to a sinusoid. In the following plot the yellow is the sine to be differentiated, the continuous curve "Deriv" is the exact derivative, and I have applied the three windows with three different footprints (line types). At each end, a region equal to half a footprint is lost - I have marked those parts with a horizontal line, which tells you how long each footprint was.
The first thing to note is that each gives a sinusoid, with the correct 90° phase shift. This is a consequence of being an odd function. Second is that as the footprint broadens, the amplitude shrinks. The inaccuracy is because a wider footprint is more affected by higher derivatives of the sine. In effect, it smooths the result, to no good purpose here. But when there is noise, the smoothing is needed. That is the basic trade-off that we will encounter.
You'll notice, too, that both the Welch and step do better than regression. This is basically because they are weighted to favor central values, rather than any more subtle merit.
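Both observations can be checked without plotting. For an odd window w applied to sin(ωt), the sine terms cancel and the output is cos(ωt) times sum(w_j sin(ωj)), so the phase is exact and only the amplitude is off. The sketch below computes the ratio of that amplitude to the true value ω. The sine period of 100 samples, the half-widths, and the taper normalisation are assumptions chosen for illustration.

```python
import numpy as np

def sine_gain(w, j, omega):
    """Estimated / exact derivative amplitude for sin(omega * t) under an odd window w."""
    return float(np.sum(w * np.sin(omega * j)) / omega)

omega = 2 * np.pi / 100          # assumed: sine with a period of 100 samples
gains = {}
for n in (5, 10, 20):            # assumed half-widths; footprint is 2n + 1 points
    j = np.arange(-n, n + 1, dtype=float)
    taper = 1.0 - (j / (n + 1)) ** 2
    for name, raw in (("OLS", j), ("Welch", j * taper), ("Step", np.sign(j))):
        w = raw / np.sum(raw * j)            # scale to give the correct slope for a line
        gains[(name, n)] = sine_gain(w, j, omega)

for (name, n), g in gains.items():
    print(f"{name:6s} n={n:2d} gain={g:.4f}")
```

A gain of 1 would be a perfect derivative; values below 1 are the amplitude shrinkage described above.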
Now for something different, to differentiate white noise. There should be no real trend. I've dropped the step filter. Again the horizontal end sections indicate the half-footprint.
Now what shows is a marked oscillation, with period about equal to the footprint. The Welch filter is good at damping frequencies beyond this range; however, the actual amplitude of the response is much higher. That is associated with the higher RMS value noted on the first figure, and is commensurate with it.
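The link between the RMS value and the noise response can be tested numerically. A minimal sketch, assuming unit-variance Gaussian white noise, a half-width of 10, and a fixed seed (none of which come from the plot): the standard deviation of the filtered noise should be close to the window's RMS value.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed, chosen arbitrarily
noise = rng.standard_normal(20000)         # white noise with unit standard deviation

n = 10                                     # assumed half-width
j = np.arange(-n, n + 1, dtype=float)
taper = 1.0 - (j / (n + 1)) ** 2           # assumed Welch taper normalisation

results = {}
for name, raw in (("OLS", j), ("Welch", j * taper)):
    w = raw / np.sum(raw * j)              # scale to give the correct slope for a line
    # np.correlate gives out[i] = sum_m noise[i + m] * w[m], i.e. the window slid along the data;
    # mode="valid" drops the half-footprint at each end.
    out = np.correlate(noise, w, mode="valid")
    results[name] = (float(np.sqrt(np.sum(w ** 2))), float(out.std()))

for name, (rms, observed) in results.items():
    print(f"{name:6s} window RMS = {rms:.4f}  filtered noise std = {observed:.4f}")
```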
Integrating with noise

OK, so what happens if we estimate the derivative of a sinusoid with noise? The next fig has sine with Gaussian noise of equal amplitude added. Can we recover the derivative?
The pale grey is the original, sine and total. The blue is the derivative of sine. The red OLS regression tracks better than the smoother purple, but has more residual HF. Again the half-footprint is shown by the level sections at each end.
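The trade-off between smoothing and accuracy can be made concrete by sweeping the footprint. This sketch uses only the OLS window, and it assumes a sine period of 200 samples, noise with standard deviation equal to the sine amplitude, and a fixed seed, so it does not reproduce the figure. Because the filter is linear, the error splits into the amplitude loss on the clean sine plus the filtered noise.

```python
import numpy as np

rng = np.random.default_rng(1)                      # fixed seed, chosen arbitrarily
t = np.arange(2000, dtype=float)
omega = 2 * np.pi / 200                             # assumed: period of 200 samples
signal = np.sin(omega * t)
noisy = signal + rng.standard_normal(t.size)        # assumed: noise std equals sine amplitude
exact = omega * np.cos(omega * t)                   # true derivative of the clean sine

def ols_derivative(y, n):
    """Sliding regression slope over 2n + 1 points; output loses n points at each end."""
    j = np.arange(-n, n + 1, dtype=float)
    w = j / np.sum(j ** 2)
    return np.correlate(y, w, mode="valid")

errors = {}
for n in (10, 20, 40, 80):
    estimate = ols_derivative(noisy, n)
    errors[n] = float(np.sqrt(np.mean((estimate - exact[n:-n]) ** 2)))
    print(f"half-width {n:3d}: RMS error of derivative = {errors[n]:.4f}")
```

With these settings a narrow footprint is dominated by noise and a very wide one by amplitude loss, so the error is smallest somewhere in between.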
Improving the derivative formula

Pekka, on another thread, recommended taking pairs of points, differencing, and forming some kind of optimal derivative as a weighted sum. Odd function windows automatically give such a weighted sum. The idea of improving the derivative (higher order) is attractive, because it allows the footprint to be expanded, with better noise damping, without loss of derivative accuracy.
A way to do this is to take a family of windows, and make them orthogonal to higher powers by Gram-Schmidt orthogonalisation. I did that using OLS, OLS with Welch taper, and OLS with Welch^2 (W0, W1, W2). Here are the resulting windows. W1 is guaranteed to be accurate for a cubic, and W2 for fifth order:
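One way to build windows with these properties is sketched below. It is not necessarily the exact procedure used for the plot: instead of stepwise Gram-Schmidt it solves a small linear system for the combination of the first k+1 family members whose higher odd moments vanish, which imposes the same orthogonality conditions. The half-width of 10 and the taper normalisation are assumptions.

```python
import numpy as np

def higher_order_windows(n):
    """Return offsets j and windows W0, W1, W2 exact for degree 1, 3 and 5 polynomials."""
    j = np.arange(-n, n + 1, dtype=float)
    x = j / (n + 1)                          # normalised offsets keep the solve well conditioned
    taper = 1.0 - x ** 2                     # assumed Welch taper normalisation
    family = np.column_stack([x, x * taper, x * taper ** 2])   # OLS, Welch, Welch^2
    windows = []
    for order in range(3):
        basis = family[:, : order + 1]
        powers = np.column_stack([x ** (2 * k + 1) for k in range(order + 1)])
        moments = powers.T @ basis           # moments[k, m] = sum(x^(2k+1) * basis_m)
        target = np.zeros(order + 1)
        target[0] = 1.0                      # first moment nonzero, higher odd moments zero
        w = basis @ np.linalg.solve(moments, target)
        windows.append(w / np.sum(w * j))    # scale to give the correct slope for a line
    return j, windows

j, (w0, w1, w2) = higher_order_windows(10)
cubic = 2.0 + 3.0 * j - j ** 2 + 0.5 * j ** 3      # exact derivative at j = 0 is 3
quintic = cubic + 0.01 * j ** 5                    # derivative at j = 0 is still 3
for name, w in (("W0", w0), ("W1", w1), ("W2", w2)):
    print(name, "cubic:", np.sum(w * cubic), "quintic:", np.sum(w * quintic),
          "RMS:", np.sqrt(np.sum(w ** 2)))
```

Even powers drop out automatically because every window is odd, so only the odd moments need to be constrained.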
Again, they are scaled to give the same derivative. We see a more extreme version of the RMS inflation of the first fig. The higher order accuracy is got at a cost of much larger coefficients. They do indeed differentiate better, as the next plot shows:
W2 seems suspiciously perfect. But each result is just the correct sinusoid multiplied by a scalar, and I think that scalar approaches 1 in an oscillatory way, so W2 just happens to sit near a zero of the error. Anyway, here higher order certainly works. What about with white noise?
Here is the downside. The higher orders amplify noise, in line with their RMS integrals.
Higher order may sometimes work. If you have an exact cubic, then you can expand the footprint as far as data permits without loss of derivative accuracy, and thus maybe overcome the RMS loss. But generally not.
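To put a rough number on how much wider the footprint has to be, the sketch below compares the RMS of the OLS window at one half-width with the RMS of a cubic-exact window at increasing half-widths. The cubic-exact window here is the combination of j and j^3 with the right moments, which spans the same space as OLS and Welch-tapered OLS. The base half-width of 10 and the search limit are assumptions.

```python
import numpy as np

def rms_ols(n):
    """RMS of the regression-slope window on offsets -n..n."""
    j = np.arange(-n, n + 1, dtype=float)
    w = j / np.sum(j ** 2)
    return float(np.sqrt(np.sum(w ** 2)))

def rms_cubic_exact(n):
    """RMS of the window a*x + b*x^3 with sum(w*j) = 1 and sum(w*j^3) = 0."""
    j = np.arange(-n, n + 1, dtype=float)
    x = j / n                                        # normalised offsets for conditioning
    basis = np.column_stack([x, x ** 3])
    # basis.T @ basis holds the first and third moments of each basis function.
    coeffs = np.linalg.solve(basis.T @ basis, np.array([1.0, 0.0]))
    w = basis @ coeffs
    w /= np.sum(w * j)                               # scale to give the correct slope for a line
    return float(np.sqrt(np.sum(w ** 2)))

base = 10                                            # assumed OLS half-width
target = rms_ols(base)
# Smallest half-width at which the cubic-exact window is no noisier than OLS at `base`.
wider = next(n for n in range(base, 200) if rms_cubic_exact(n) <= target)
print(f"OLS half-width {base}: RMS {target:.4f}")
print(f"cubic-exact window needs half-width {wider}: RMS {rms_cubic_exact(wider):.4f}")
```

Whether the extra width is affordable depends on how far the data really do follow a cubic.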