Recognizing gestures by the sound they make (DTW and PureData)

This prototype follows the idea of [1], where Harrison uses a microphone to record and recognize a set of predefined gestures. In this implementation we follow the same approach, using the Dynamic Time Warping (DTW) algorithm [2,3], but propose it in a different setting: the PureData programming environment [4].

This research, carried out for the IMMI course at IST, will result in a small decoupled system that recognizes a set of gestures and sends each recognized gesture as an OSC-formatted message, which any connected system can receive over the network (either locally or remotely), thus achieving modularity. In our experiments we will use it to recognize gestures that a DJ performs with his foot to control a DJ setup, but it can be used for many other applications (such as those proposed by Harrison in [1]).
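To make the OSC output concrete, here is a minimal sketch in C of how a recognized gesture index could be packed into an OSC 1.0 message with a single int32 argument. The address "/gesture" and the function names are hypothetical, chosen only for illustration; they are not taken from the actual external. Sending the resulting bytes (typically over UDP) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy s into out as an OSC-string: NUL-terminated and zero-padded
 * to a multiple of 4 bytes. Returns the number of bytes written. */
static size_t osc_pad_string(unsigned char *out, const char *s)
{
    size_t len = strlen(s);
    size_t padded = (len + 4) & ~(size_t)3; /* len + NUL, rounded up to 4 */
    memset(out, 0, padded);
    memcpy(out, s, len);
    return padded;
}

/* Encode an OSC message carrying one int32 argument.
 * "address" is a hypothetical OSC address such as "/gesture".
 * Returns the message size in bytes, or 0 if the address does not
 * start with '/' or the buffer is too small. */
size_t osc_encode_int(unsigned char *buf, size_t cap,
                      const char *address, int32_t value)
{
    size_t addr_len = (strlen(address) + 4) & ~(size_t)3;
    size_t total = addr_len + 4 + 4; /* address + ",i" tag + int32 */
    if (address[0] != '/' || total > cap)
        return 0;

    size_t pos = osc_pad_string(buf, address);
    pos += osc_pad_string(buf + pos, ",i"); /* type tag: one int32 */

    /* OSC integers are 32-bit big-endian. */
    uint32_t u = (uint32_t)value;
    buf[pos++] = (unsigned char)(u >> 24);
    buf[pos++] = (unsigned char)(u >> 16);
    buf[pos++] = (unsigned char)(u >> 8);
    buf[pos++] = (unsigned char)u;
    return pos;
}
```

For example, calling osc_encode_int(buf, sizeof buf, "/gesture", 3) fills buf with a 20-byte message: 12 bytes of padded address, 4 bytes of type tag, and 4 bytes of argument.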

Currently there is no implementation of the DTW algorithm for the PureData environment. One has been proposed by Todoroff and Bettens [5], but it has not yet been ported to PureData. Our implementation is based on Andrew Slater and John Coleman’s DTW [6], ported to PureData with several necessary modifications. The DTW object is still in alpha, but it already works and is available for public use here [7]; the official release will be published once the API is fully defined.

Dynamic Time Warping in PureData (alpha) from PedroLopes on Vimeo.

The video above shows a possible implementation of DTW as a pd external, written in ANSI C. In this demo, 8 samples are used as gesture patterns, and the algorithm tries to find the pattern that best matches the sampled input.
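The matching step can be sketched as follows. This is not the code of the external itself, only a minimal textbook DTW in C (C99), under the assumption that each gesture has already been reduced to a one-dimensional feature sequence (for instance an amplitude envelope). The function names and the absolute-difference local cost are illustrative choices.

```c
#include <math.h>   /* INFINITY */
#include <stddef.h>
#include <stdlib.h>

/* Smallest of three values. */
static double min3(double a, double b, double c)
{
    double m = a < b ? a : b;
    return m < c ? m : c;
}

/* DTW distance between sequences a[0..n-1] and b[0..m-1], using
 * |a[i] - b[j]| as local cost. Only two rows of the cost matrix are
 * kept in memory. Returns INFINITY on empty input or failed allocation. */
double dtw_distance(const double *a, size_t n, const double *b, size_t m)
{
    if (n == 0 || m == 0)
        return INFINITY;

    double *prev = malloc((m + 1) * sizeof *prev);
    double *curr = malloc((m + 1) * sizeof *curr);
    if (!prev || !curr) {
        free(prev);
        free(curr);
        return INFINITY;
    }

    /* Row 0: only the empty-vs-empty alignment costs nothing. */
    prev[0] = 0.0;
    for (size_t j = 1; j <= m; j++)
        prev[j] = INFINITY;

    for (size_t i = 1; i <= n; i++) {
        curr[0] = INFINITY;
        for (size_t j = 1; j <= m; j++) {
            double d = a[i - 1] - b[j - 1];
            double cost = d < 0 ? -d : d;
            /* Extend the cheapest of: insertion, deletion, match. */
            curr[j] = cost + min3(prev[j], curr[j - 1], prev[j - 1]);
        }
        double *tmp = prev; /* the finished row becomes "prev" */
        prev = curr;
        curr = tmp;
    }

    double result = prev[m];
    free(prev);
    free(curr);
    return result;
}

/* Index of the template with the smallest DTW distance to the input,
 * or -1 if no template yields a finite distance. */
int best_template(const double *input, size_t n,
                  const double *const *templates,
                  const size_t *lengths, size_t count)
{
    int best = -1;
    double best_dist = INFINITY;
    for (size_t k = 0; k < count; k++) {
        double d = dtw_distance(input, n, templates[k], lengths[k]);
        if (d < best_dist) {
            best_dist = d;
            best = (int)k;
        }
    }
    return best;
}
```

With 8 stored patterns, best_template would be called with count set to 8, and the returned index identifies the recognized gesture. A real recognizer would likely also reject inputs whose best distance exceeds some threshold, which this sketch does not do.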

Hardware: lo-fi built-in microphone (very bad!)
Software: PureData 0.41 (also works with Pd-extended); Ubuntu 9.10; JACK audio server

Work by: Pedro Lopes and Guilherme Fernandes

References

[1] Harrison, Chris and Hudson, Scott E. Scratch Input: Creating Large, Inexpensive, Unpowered and Mobile Finger Input Surfaces. In Proceedings of the 21st Annual ACM Symposium on User Interface Software and Technology. UIST ’08. ACM, New York, NY, 205-208.

[2] Salvador, S. and Chan, P. Toward Accurate Dynamic Time Warping in Linear Time and Space. Intelligent Data Analysis, 11(5):561-580, 2007.

[3] Salvador, S. and Chan, P. FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space. KDD Workshop on Mining Temporal and Sequential Data, pp. 70-80, 2004.

[4] Puckette, Miller Smith (2007). The Theory and Technique of Electronic Music. World Scientific Press, Singapore. ISBN 978-9812705419.

[5] Todoroff, Todor and Bettens, Frédéric. Real-Time DTW-Based Gesture Recognition External Object for Max/MSP and PureData. Sound Music Computing 2009, Oporto. Faculty of Engineering (FPMs), TCTS Lab.

[6] Andrew Slater and John Coleman’s DTW: http://www.phon.ox.ac.uk/files/slp/Extras/dtw.html

[7] GitHub repository for Pedro Lopes: http://github.com/PedroLopes/PD-externals
