New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestion: Sliding Window Function #7753
Comments
While getting a sliding window with stride tricks is very cool, |
I'd agree that there is no real need -- e.g., the example is more efficiently done with |
The example was obviously only chosen to keep the example as small as possible. I am not sure I understand why "almost any function on top of that would it be inefficient". Another example, calculating the spectrogram of a signal, where the sliding window is >2x faster than a for loop (plus the sliding window implementation being easier to read and debug): import time
import numpy
import scipy.signal
import matplotlib.pyplot as plt
winlen = 256
stepsize = winlen // 2
siglen = 44100 * 10
############# Using sliding window ############
start = time.clock()
a = numpy.sin(numpy.arange(siglen) * 2 * numpy.pi * 16000 / 44100.0)
A = sliding_window(a, size=winlen, stepsize=stepsize)
B = numpy.fft.fft(A * scipy.signal.windows.hamming(winlen), axis=1)
print time.clock() - start # 0.037622 on my machine
############# Using for loop ############
start = time.clock()
a = numpy.sin(numpy.arange(siglen) * 2 * numpy.pi * 16000 / 44100.0)
idxlist = range(0, len(a)-winlen, stepsize)
C = numpy.zeros((len(idxlist), winlen), dtype=complex)
for o, i in enumerate(idxlist):
C[o, :] = numpy.fft.fft(a[i:i+winlen] * scipy.signal.windows.hamming(winlen))
print time.clock() - start # 0.080116 on my machine
##########################################
print numpy.allclose(B, C) # True
plt.imshow(numpy.abs(B))
plt.figure()
plt.imshow(numpy.abs(C))
plt.show() |
It's not the same as your example, but say you have an array with n items
and all of a sudden you are computing each new FFT in time O(m) instead of O(m Your half overlapping FFTs are actually the last step of the FFT algorithm, For many problems on a sliding window there are similarly clever approaches |
Yes, you may be able to optimize an overlapping FFT that way. Again, I am not talking about a specific usecase here. It pains me to say it but I am looking at things from the "rapid prototyping scientific code" side of things. I have yet to come across scientific code that actually does such optimizations and doesn't just go down the easy "slice and vectorize" road. When I am trying out some random idea I had I am not interested in prematurely optimizing my lapped operations, I just want it to be reasonably fast and maintainable for as little cost as possible.
When it goes to actual production code with more maintainers and tests and some inner loops done in C your implementation of course makes more sense. |
Considering that both scipy and matplotlib implement their spectrogram-related functions using exactly this approach rather than some more efficient approach, it seems we have already gone down this path. I think there are two issues with using more efficient approaches. First, it requires different approaches for any calculation you might want to do. Second, it requires someone to actually implement all of these special cases. Considering that even in the supposedly obvious case with FFT no one has stepped up to do this, it doesn't seem that this is happening. I think the simplest approach would not be so much to create special functions for each calculation you might want, but rather create a single function that takes the window length, overlap, a possible window, and a function. It would then use strides to create the correct shape, then apply the window (if any), then apply the function. matplotlib and scipy are already doing the first two steps, so it would be easy to translate their code into a general function that could work with any function that takes a vector. To be completely open, I implemented the strided approach in matplotlib, but before that it was using the same algorithm implemented using a loop, so this was an improvement over what existed before. |
This functionality is also replicated in scikit-image ( Also, if a new convenience function would live under |
Hiding it in I'm +1 on adding such a function |
A possible api extending the res = np.lib.stride_tricks.sliding_window_view(arr, shape, axis=None) with semantics:
The matlab example in #10754 would become: a = np.linspace(0,1,16).reshape(4, 4)
b = np.lib.stride_tricks.sliding_window_view(arr, (2,2))
new_a = b.mean(axis=(-2,-1))
|
+1 for this feature. It is quite useful in rapid prototyping. I have to copy&paste the moving window function frequently. It will be great if it is an std method in Numpy. |
scikit-image does this for any dimensions in |
@nils-werner Thanks for the info. That's a handy function to have. But, it will still be better to have something similar in Numpy to avoid additional dependency of the skimage package. After all, I don't think moving window function is a special need for image processing, but a common practice in many signal/data processing which Numpy/Scipy package is targeted at. |
@eric-wieser , @lijunzh def sliding_window_view(arr, shape):
n = np.array(arr.shape)
o = n - shape + 1 # output shape
strides = arr.strides
new_shape = np.concatenate((o, shape), axis=0)
new_strides = np.concatenate((strides, strides), axis=0)
return np.lib.stride_tricks.as_strided(arr ,new_shape, new_strides) Test case1arr = np.arange(12).reshape(3,4)
shape = np.array([2,2])
sliding_window_view(arr, (2,2)) input:
output:
###Test case 2 arr = np.arange(9).reshape(3,3)
shape = np.array([1,2])
sliding_window_view(arr, shape) input:
output:
|
I like this API, but I would suggest that |
Hmm, I see where you're coming from there. There seems to be a lot of precedent for Are there any existing function that take both arguments already? |
Regarding Furthermore, the
Furthermore, slicing lets an offset be specified, whereas a |
For an algorithm library, I needed rolling window standard deviations. Using the sliding window as suggested here was the only approach I found that was fast, did not require additional libraries and did not have (albeit very small) rounding errors. (Even pandas rolling function had small errors up to 3e-7 in specific cases - see pandas-dev/pandas#27593 for details.) Even if some functions (the examples listed before) can be done more optimized, it seems wrong to ignore the benefit it might have for other cases. |
@DieterDePaepe - there is a PR implementing this - #10771 - which seems to have stalled. it may be more useful to comment there (and check that it would do what you want it to!). |
Nowadays I am just using scikit image's |
Just a note, since this is a popular issue and inactive for a while. I think we decided that we are happy to put it in, initially only in |
Have you had a look at the Their API is quite clean, understandable and flexible, and their implementation is quite simple, too. So if this functionality is really such great demand, they could be "upstreamed" into NumPy? Of course in coordination and with agreement from skimage-authors. |
Coming back to this, I realize I missed something in this comment
When slicing between windows, chances are |
Test cases are shown in the issue page.
* implement sliding_window_view #7753 Test cases are shown in the issue page. * Add Example Cases * Add step_size and N-dim support * Add shape and step_size check. Remove warning. * Remove shape default Add step_size default's description. * Give proper parameter name 'step' * fix a parameter description mistake * implement test function for sliding_window_view() * implement test function for sliding_window_view() * Fix according to @eric-wieser comments * Change arange to ogrid in Examples * remove np.squeeze on return line * Clarify document to avoid parameter confusion. * add `writable` and more explanation in docs * resolve a write conflit * fixes according to @seberg review * resolve write hazard * remove outdated docs. * change referring according to @mattip. change 'writeable' to 'readonly' as @seberg suggest. remove 'step' as @eric-wieser request * fix test minor error * DOC: Grammar fixes * STY: Add missing line break required by PEP8 * + Change readonly parameter to writeable. + Update writeable description. + Fix a few parameter checks. + Other minor improvements. * Move to new api as proposed by @eric-wieser - Change api to follow suggestion by Eric Wieser in #10771 (comment) - Update docstring - Add more tests * Improve documentation * Add sliding_window_view to documentation index * Apply suggestions from code review Co-authored-by: Eric Wieser <wieser.eric@gmail.com> * Fix window shape check * Add `sliding_window_view` to __all__ * Add tests for error cases * Add array_function dispatching * Change dispatcher argument defaults to None * Simplify array function dispatching * Added "np." prefix to doctests * Split tests * Improved docstring * Add release note * Fix docstring formatting * Fix doctest * Remove namespacing in documentation indexing * Improve docstring * Improved docstring * Simplified docstring * Improve docstring to make pseudo code stand out * Improve docstring * Add simple application example * Correct release note * Improve link with as_strides * Add note about performance * Tidy up main doc string * Make language on performance warning stronger * Bugfix: pass subok and writeable to as_strided * Add writeable test * Add subok test * Change subok test to use custom array subclass instead of unsupported MaskedArray * Add version added information Co-authored-by: Eric Wieser <wieser.eric@gmail.com> Co-authored-by: Fanjin <fjzeng@ucsd.edu> Co-authored-by: Fanjin Zeng <Fnjn@users.noreply.github.com> Co-authored-by: Eric Wieser <wieser.eric@gmail.com> Co-authored-by: fanjin <fjzeng@outlook.com> Closes gh-7753
Using
np.lib.stride_tricks.as_stride
one can very efficiently create a sliding window that segments an array as a preprocessing step for vectorized applications. For example a moving average of a window length3
, stepsize1
:This is very performant but very hard to do, as the
shape
andstrides
parameters are very hard to understand.I suggest the implementation of a "simple
sliding_window
function" that does the job of figuring out those two parameters for you.The implementation I have been using for years allows the above to be replaced by
@teoliphant also has an implementation that would change the above to
both making it much more readable.
Seeing it is a common usecase in vectorized computing I suggest we put a similar function into NumPy itself.
Regarding to which implementation to follow, they are both assume different things but allow you to do the same thing eventually:
sliding_window
copy
parameter that can be removed and replaced by appending.copy()
after the callarray_for_sliding_window
tuple
parameteraxis[n]
requires you set argumentwshape[n] = 1
orwshape[n] = a.shape[n]
This means for flexible
stepsize
the following are equivalent (with some minor bug insliding_window
there):for sliding over one
axis
the following are equivalent (with some transposing and squeezing):and for sliding over two
axis
the following are equivalent:This issue is about sparking discussion about
After discussion I am willing to draft up a pull request with an implementation we agreed on.
The text was updated successfully, but these errors were encountered: