Phase vocoder
{{Short description|Vocoder algorithm}}
[[File:GeneralizedPrinciple_TSM.png|thumb|right|450px|Decomposition of an audio signal into frames. Frames are then processed and reassembled.]]
A '''phase vocoder''' is a type of [[vocoder]] algorithm that can [[Interpolation|interpolate]] information present in the [[frequency]] and [[time domain]]s of audio signals by using [[Phase (waves)|phase]] information extracted from a frequency transform.<ref>{{cite web |last1=Sethares |first1=William |title=A Phase Vocoder in Matlab |url=https://sethares.engr.wisc.edu/vocoders/phasevocoder.html |website=sethares.engr.wisc.edu |access-date=6 December 2020}}</ref> The [[algorithm]] allows [[frequency-domain]] modifications to a digital sound file (typically [[Audio timescale-pitch modification|time expansion/compression and pitch shifting]]).

At the heart of the phase vocoder is the [[short-time Fourier transform]] (STFT), typically implemented using [[fast Fourier transform]]s. The STFT converts a [[time domain]] representation of sound into a [[time-frequency representation]] (the "analysis" phase). This allows the amplitudes or phases of specific frequency components of the sound to be modified before the time-frequency representation is resynthesized into the time domain by the inverse STFT. Changing the time positions of the STFT frames before resynthesis changes the time evolution of the resynthesized sound, which allows time-scale modification of the original sound file.

== Phase coherence problem ==
The main problem that must be solved in any manipulation of the STFT is that individual signal components (sinusoids, impulses) are spread over multiple frames and multiple STFT frequency locations (bins). This is because the STFT analysis is done using overlapping [[Window function|analysis windows]].
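As a rough illustration of the overlapping-window analysis and resynthesis described so far, the following Python sketch implements an STFT, an overlap-add inverse, and a basic time stretch that reads analysis frames at a fractional step and propagates each bin's phase from frame to frame. It is a minimal sketch under stated assumptions, not a reference implementation: the function names (<code>stft</code>, <code>istft</code>, <code>time_stretch</code>), the use of NumPy, the periodic Hann window, and the defaults <code>n_fft=1024</code> and <code>hop=256</code> are all choices made for this example and do not come from the cited sources. Only the phase of each bin over time is tracked; nothing is done to keep adjacent bins consistent with each other.

```python
import numpy as np


def _hann(n_fft):
    # Periodic Hann window (assumed choice for this sketch).
    return np.hanning(n_fft + 1)[:-1]


def stft(x, n_fft=1024, hop=256):
    """Analysis: overlapping Hann-windowed frames -> one complex spectrum per frame."""
    window = _hann(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=1024, hop=256):
    """Resynthesis: inverse FFT per frame, synthesis window, overlap-add.

    Assumes hop == n_fft // 4, for which the overlapped squared Hann windows
    sum to a constant away from the signal edges.
    """
    window = _hann(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(n_fft + hop * (spec.shape[0] - 1))
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window
    gain = np.sum(window ** 2) / hop  # constant overlap gain in the interior
    return out / gain


def time_stretch(x, rate, n_fft=1024, hop=256):
    """Time-scale x by 1/rate (rate < 1 lengthens, rate > 1 shortens)."""
    spec = stft(x, n_fft, hop)
    n_bins = spec.shape[1]
    # Phase advance per hop expected for a sinusoid at each bin's centre frequency.
    omega = 2 * np.pi * hop * np.arange(n_bins) / n_fft
    # Fractional analysis-frame positions that are read for each output frame.
    positions = np.arange(0, spec.shape[0] - 1, rate)
    phase = np.angle(spec[0])
    out = np.empty((len(positions), n_bins), dtype=complex)
    for k, pos in enumerate(positions):
        i = int(pos)
        frac = pos - i
        # Interpolate magnitudes between the two neighbouring analysis frames.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        out[k] = mag * np.exp(1j * phase)
        # Measured phase advance between the frames, as a deviation from omega
        # wrapped to [-pi, pi], then accumulated into the running phase.
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi
    return istft(out, n_fft, hop)
```

For example, <code>time_stretch(x, 0.5)</code> returns a signal roughly twice as long as <code>x</code>, while <code>istft(stft(x))</code> reproduces <code>x</code> away from the first and last window lengths. Because the overlap gain is treated as a constant, the first and last few hundred samples of any output fade in and out.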
The windowing results in [[spectral leakage]], so that the information of individual sinusoidal components is spread over adjacent STFT bins. To avoid border effects caused by the tapering of the analysis windows, STFT analysis windows overlap in time. Because of this overlap, adjacent STFT analyses are strongly correlated (a sinusoid present in the analysis frame at time ''t'' will be present in the subsequent frames as well). Signal transformation with the phase vocoder therefore requires that all modifications made in the STFT representation preserve the appropriate correlation between adjacent frequency bins (vertical coherence) and time frames (horizontal coherence). Except in the case of extremely simple synthetic sounds, these correlations can be preserved only approximately. Since the invention of the phase vocoder, research has been mainly concerned with finding algorithms that preserve the vertical and horizontal coherence of the STFT representation after modification. The phase coherence problem was studied for many years before appropriate solutions emerged.

== History ==
The phase vocoder was introduced in 1966 by Flanagan as an algorithm that would preserve horizontal coherence between the phases of bins that represent sinusoidal components.<ref>{{cite journal|doi=10.1002/j.1538-7305.1966.tb01706.x|author=Flanagan J.L. and Golden, R. M.|title=Phase vocoder|journal=Bell System Technical Journal|volume=45|issue=9|pages=1493–1509|year=1966}}</ref> This original phase vocoder did not take into account the vertical coherence between adjacent frequency bins, and time stretching with this system therefore produced sound signals that lacked clarity.

A method for the optimal reconstruction of a sound signal from its STFT after amplitude modifications was proposed by Griffin and Lim in 1984.<ref>{{cite journal|doi=10.1109/TASSP.1984.1164317|author=Griffin D.
and Lim J.|title=Signal Estimation from Modified Short-Time Fourier Transform|journal=IEEE Transactions on Acoustics, Speech, and Signal Processing|volume=32|issue=2|pages=236–243|year=1984|citeseerx=10.1.1.306.7858}}</ref> This algorithm does not address the problem of producing a coherent STFT, but it does find the sound signal whose STFT is as close as possible to the modified STFT, even if the modified STFT is not coherent (does not represent any signal).

Vertical coherence remained a major issue for the quality of time-scaling operations until 1999, when Laroche and Dolson<ref>{{Cite journal | author = J. Laroche and M. Dolson | title = Improved Phase Vocoder Time-Scale Modification of Audio | journal = [[IEEE Transactions on Speech and Audio Processing]] | volume = 7 | issue = 3 | pages = 323–332 | year = 1999 | url = http://www.cmap.polytechnique.fr/~bacry/MVA/getpapers.php?file=phase_vocoder.pdf&type=pdf | doi = 10.1109/89.759041 | url-access = subscription }}</ref> proposed a means to preserve phase consistency across spectral bins. Their proposal can be seen as a turning point in phase vocoder history: it showed that ensuring vertical phase consistency makes very high-quality time-scaling transformations possible.

The algorithm proposed by Laroche and Dolson did not preserve vertical phase coherence at sound onsets (note onsets). A solution to this problem was proposed by Roebel.<ref>Roebel A., "A new approach to transient processing in the phase vocoder", DAFx, 2003.
[http://www.ircam.fr/equipes/analyse-synthese/roebel/paper/dafx2003.pdf pdf] {{webarchive|url=https://web.archive.org/web/20040617224423/http://www.ircam.fr/equipes/analyse-synthese/roebel/paper/dafx2003.pdf |date=2004-06-17 }}</ref>

[[Ircam]]'s SuperVP is an example of a software implementation of phase-vocoder-based signal transformation that uses techniques similar to those described here to achieve high-quality results.<ref>"[http://anasynth.ircam.fr/home/english/software/supervp SuperVP]", ''Ircam.fr''.</ref>{{verification needed|date=July 2011}}

== Use in music ==
British composer [[Trevor Wishart]] used phase vocoder analyses and transformations of a human voice as the basis for his composition ''Vox 5'' (part of his larger [[Vox Cycle]]).<ref>Wishart, T. "The Composition of Vox 5". Computer Music Journal 12/4, 1988</ref> ''[[Transfigured Wind]]'' by American composer [[Roger Reynolds]] uses the phase vocoder to time-stretch flute sounds.<ref>Serra, X. '[http://mtg.upf.edu/node/304 A System for Sound Analysis/Transformation/Synthesis based on Deterministic plus Stochastic Decomposition]', p.12 (PhD Thesis 1989)</ref> The music of [[JoAnn Kuchera-Morin]] makes some of the earliest and most extensive use of phase vocoder transformations, such as in ''Dreampaths'' (1989).<ref>[[Curtis Roads|Roads, Curtis]] (2004). ''Microsound'', p.318. MIT Press.
{{ISBN|9780262681544}}.</ref>

== See also ==
* [[Audio time stretching and pitch scaling]]

== References ==
{{reflist}}

== External links ==
{{wikibooks|MATLAB Programming|Phase Vocoder and Encoder|Phase vocoder and encoder}}
* [http://www.panix.com/~jens/pvoc-dolson.par The Phase Vocoder: A Tutorial] - A good description of the phase vocoder
* [http://www.ee.columbia.edu/~dpwe/papers/LaroD99-pvoc.pdf New Phase-Vocoder Techniques for Pitch-Shifting, Harmonizing and Other Exotic Effects]
* [https://web.archive.org/web/20040617224423/http://www.ircam.fr/equipes/analyse-synthese/roebel/paper/dafx2003.pdf A new Approach to Transient Processing in the Phase Vocoder]
* [http://www.guitarpitchshifter.com/algorithm.html#33 Phase Vocoder] - Phase vocoder description with figures and equations

{{Speech synthesis}}

[[Category:Signal processing]]
[[Category:Speech synthesis]]