
== Introduction ==
Ambisonics can be understood as a three-dimensional extension of [[Microphone practice#M.2FS technique: Mid.2FSide stereophony|M/S (mid/side) stereo]], with additional difference channels for height and depth. The resulting signal set is called ''B-format''. Its component channels are labelled <math>W</math> for the sound pressure (the M in M/S), <math>X</math> for the front-minus-back sound pressure gradient, <math>Y</math> for left-minus-right (the S in M/S) and <math>Z</math> for up-minus-down.<ref group="note">The traditional B-format notation is used in this introductory paragraph, since it is assumed that the reader may have come across it already. For higher-order Ambisonics, use of the [[Ambisonic data exchange formats|ACN notation]] is recommended.</ref>

The <math>W</math> signal corresponds to an omnidirectional microphone, whereas <math>XYZ</math> are the components that would be picked up by [[Microphone#Polar patterns|figure-of-eight]] capsules oriented along the three spatial axes.

=== Panning a source ===
A simple Ambisonic panner (or ''encoder'') takes a source signal <math>S</math> and two parameters, the horizontal angle <math>\theta</math> and the elevation angle <math>\phi</math>. It positions the source at the desired angle by distributing the signal over the Ambisonic components with different gains:

:<math>W=S \cdot  \frac{1}{\sqrt{2}}</math>
:<math>X=S \cdot \cos\theta\cos\phi</math>
:<math>Y=S \cdot \sin\theta\cos\phi</math>
:<math>Z=S \cdot \sin\phi</math>

Being omnidirectional, the <math>W</math> channel always gets the same constant input signal, regardless of the angles. So that it has more-or-less the same average energy as the other channels, W is attenuated by about 3&nbsp;dB (precisely, divided by the square root of two).<ref>{{cite conference |title=Practical Periphony |first=M.A. |last=Gerzon |author-link=Michael Gerzon |date=February 1980 |conference=65th Audio Engineering Society Convention |publisher=[[Audio Engineering Society]] |location=London |id=Preprint 1571 |quote=In order to make B-format signals carry more-or-less equal average energy, X,Y,Z have a gain of {{sqrt|2}} in their directions of peak sensitivity. |page=7}}</ref> The terms for <math>XYZ</math> actually produce the polar patterns of figure-of-eight microphones (see illustration on the right, second row). We take their value at <math>\theta</math> and <math>\phi</math>, and multiply the result by the input signal. As a result, the input ends up in each component exactly as loud as the corresponding microphone would have picked it up.
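
The panning equations above can be sketched in a few lines of Python. This is only an illustration, not a reference encoder: the function name <code>encode_first_order</code> is hypothetical, angles are assumed to be in radians, and a single sample is processed at a time.

```python
import math

def encode_first_order(s, theta, phi):
    """Encode one sample s into traditional B-format (W, X, Y, Z).

    theta is the horizontal angle and phi the elevation, both in radians
    (the unit is an assumption of this sketch).
    """
    w = s / math.sqrt(2.0)                   # omnidirectional, about 3 dB down
    x = s * math.cos(theta) * math.cos(phi)  # front-minus-back
    y = s * math.sin(theta) * math.cos(phi)  # left-minus-right
    z = s * math.sin(phi)                    # up-minus-down
    return w, x, y, z

# A unit source straight ahead ends up only in W and X.
w, x, y, z = encode_first_order(1.0, 0.0, 0.0)
```

Since <math>Y</math> is left-minus-right, a horizontal angle of 90° in this sketch places the source fully to the left.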

=== Virtual microphones ===
[[Image:Virtual Microphone Animation.gif|thumb|left|200px|Morphing between different virtual microphone patterns]]
The B-format components can be combined to derive ''virtual [[microphone]]s'' with any first-order polar pattern (omnidirectional, cardioid, hypercardioid, figure-of-eight or anything in between) pointing in any direction. Several such microphones with different parameters can be derived at the same time, to create coincident stereo pairs (such as a [[Blumlein pair|Blumlein]]) or surround arrays.
{| class="wikitable floatright" style="text-align:center"
! <math>p</math> !! Pattern
|-
| <math>0</math> || Figure-of-eight
|-
| <math>(0,0.5)</math>|| Hyper- and Supercardioids
|-
| <math>0.5</math> || Cardioid
|-
| <math>(0.5,1.0)</math> || Wide cardioids
|-
| <math>1.0</math> || Omnidirectional
|}
A horizontal virtual microphone at horizontal angle <math>\Theta</math> with pattern <math>0 \leq p \leq 1</math> is given by

:<math>M(\Theta, p) = p\sqrt{2} W + (1-p)(\cos\Theta X + \sin\Theta Y)</math>.

This virtual mic is ''free-field normalised'', which means it has a constant gain of one for on-axis sounds. The illustration on the left shows some examples created with this formula.
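
As a rough check of the free-field normalisation, the formula can be evaluated for a unit source encoded in the horizontal plane. The helper names <code>horizontal_b_format</code> and <code>virtual_mic</code> are hypothetical and angles are assumed to be in radians.

```python
import math

def horizontal_b_format(s, theta):
    """Encode sample s at horizontal angle theta (elevation 0) as (W, X, Y)."""
    return s / math.sqrt(2.0), s * math.cos(theta), s * math.sin(theta)

def virtual_mic(w, x, y, big_theta, p):
    """Horizontal virtual microphone pointing at big_theta with pattern p (0..1)."""
    return p * math.sqrt(2.0) * w + (1.0 - p) * (
        math.cos(big_theta) * x + math.sin(big_theta) * y
    )

# A cardioid (p = 0.5) pointing to the front.
front = virtual_mic(*horizontal_b_format(1.0, 0.0), 0.0, 0.5)     # on-axis source
rear = virtual_mic(*horizontal_b_format(1.0, math.pi), 0.0, 0.5)  # source behind
```

In this sketch the on-axis source is picked up with unit gain, while the source directly behind the cardioid falls into its null.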

Virtual microphones can be manipulated in post-production: desired sounds can be picked out, unwanted ones suppressed, and the balance between direct and reverberant sound can be fine-tuned during mixing.
{{clear}}

=== Decoding ===
[[Image:Naive Ambisonic Square Decoder Example.png|thumb|200px|right|Naive single-band in-phase decoder for a square loudspeaker layout]]
A basic Ambisonic ''decoder'' is very similar to a set of virtual microphones. For perfectly regular layouts, a simplified decoder can be generated by pointing a virtual cardioid microphone in the direction of each speaker. Here is a square:
:<math>LF = (\sqrt{2}W + X + Y)\sqrt{8}</math>
:<math>LB = (\sqrt{2}W - X + Y)\sqrt{8}</math>
:<math>RB = (\sqrt{2}W - X - Y)\sqrt{8}</math>
:<math>RF = (\sqrt{2}W + X - Y)\sqrt{8}</math>
The signs of the <math>X</math> and <math>Y</math> components are the important part; the rest are gain factors. The <math>Z</math> component is discarded, because it is not possible to reproduce height cues with just four loudspeakers in one plane.
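
The square decoder equations translate directly into code. The following is a minimal sketch of this naive decoder only, without any of the psycho-acoustic optimisations a practical decoder needs; the function name <code>decode_square</code> and the <code>gain</code> parameter are hypothetical, with the default gain taken from the formulas above.

```python
import math

def decode_square(w, x, y, gain=math.sqrt(8.0)):
    """Naive decoder for a square layout; Z is not used.

    Returns the four speaker feeds keyed by the labels used in the text.
    """
    r2 = math.sqrt(2.0)
    return {
        "LF": (r2 * w + x + y) * gain,
        "LB": (r2 * w - x + y) * gain,
        "RB": (r2 * w - x - y) * gain,
        "RF": (r2 * w + x - y) * gain,
    }

# Unit source in the horizontal plane, midway between front and left.
a = math.radians(45.0)
feeds = decode_square(1.0 / math.sqrt(2.0), math.cos(a), math.sin(a))
loudest = max(feeds, key=lambda name: abs(feeds[name]))
```

For this source direction the left-front speaker receives the strongest feed, and the two adjacent speakers receive equal feeds.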

In practice, a real Ambisonic decoder requires a number of psycho-acoustic optimisations to work properly.<ref>Eric Benjamin, Richard Lee, and Aaron Heller, [http://www.ai.sri.com/ajh/ambisonics/BLaH3.pdf ''Is My Decoder Ambisonic?''], 125th AES Convention, San Francisco 2008</ref>

Currently, the All-Round Ambisonic Decoder (AllRAD) can be regarded as the standard solution for loudspeaker-based playback,<ref>Franz Zotter and Matthias Frank, [https://aes2.org/publications/elibrary-page/?id=16554 ''All-Round Ambisonic Panning and Decoding'']. Journal of the Audio Engineering Society, 2012, 60(10): 807-820.</ref> and Magnitude Least Squares (MagLS)<ref>Christian Schörkhuber and Markus Zaunschirm, [https://pub.dega-akustik.de/DAGA_2018/data/articles/000301.pdf ''Binaural Rendering of Ambisonic Signals via Magnitude Least Squares'']. Fortschritte der Akustik, DAGA, Munich, 2018.</ref> for binaural decoding, as implemented for instance in the IEM and SPARTA Ambisonic production tools.<ref name="IEMPI">Daniel Rudrich et al., [https://plugins.iem.at ''IEM Plug-in Suite'']. 2018 (accessed 2024)</ref><ref name="SPARTA">Leo McCormack, [https://leomccormack.github.io/sparta-site/ ''Spatial Audio Real-Time Applications'']. 2019 (accessed 2024)</ref>

Frequency-dependent decoding can also be used to produce binaural stereo; this is particularly relevant in Virtual Reality applications.

=== Higher-order Ambisonics ===
[[Image:Spherical Harmonics deg3.png|right|thumb|300px|Visual representation of the Ambisonic B-format components up to third order. Dark portions represent regions where the polarity is inverted. Note how the first two rows correspond to omnidirectional and figure-of-eight microphone polar patterns.]]
The spatial resolution of first-order Ambisonics as described above is quite low. In practice, that translates to slightly blurry sources, but also to a comparatively small usable listening area or ''sweet spot''. The resolution can be increased and the sweet spot enlarged by adding groups of more selective directional components to the B-format. These no longer correspond to conventional microphone polar patterns, but rather look like clover leaves. The resulting signal set is then called ''Second-'', ''Third-'', or collectively, ''Higher-order Ambisonics''.

For a given order <math>\ell</math>, full-sphere systems require <math>(\ell+1)^2</math> signal components, and <math>2\ell+1</math> components are needed for horizontal-only reproduction.
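
The two channel-count formulas can be tabulated with a short sketch; the function names are hypothetical.

```python
def full_sphere_components(order):
    """Number of signal components for full-sphere Ambisonics of a given order."""
    return (order + 1) ** 2

def horizontal_components(order):
    """Number of signal components for horizontal-only Ambisonics of a given order."""
    return 2 * order + 1

# (order, full-sphere count, horizontal-only count) for orders 1 to 3
counts = [(n, full_sphere_components(n), horizontal_components(n)) for n in (1, 2, 3)]
```

First order thus needs 4 components for the full sphere and 3 for horizontal-only, second order 9 and 5, and third order 16 and 7.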

{{See also|Mixed-order Ambisonics}}

Historically there have been several different format conventions for higher-order Ambisonics; for details see [[Ambisonic data exchange formats]].

=== Comparison to other surround formats ===
Ambisonics differs from other surround formats in a number of aspects:
* It requires only three channels for basic horizontal surround, and four channels for a full-sphere soundfield. Basic full-sphere replay requires a minimum of six loudspeakers (a minimum of four for horizontal).
* The same program material can be decoded for varying numbers of loudspeakers. Moreover, a with-height mix can be played back on horizontal-only, stereo or even mono systems without losing content entirely (it will be folded to the horizontal plane and to the frontal quadrant, respectively). This allows producers to embrace with-height production without worrying about loss of information.
* Ambisonics can be scaled to any desired spatial resolution at the cost of additional transmission channels and more speakers for playback. Higher-order material remains downwards compatible and can be played back at lower spatial resolution without requiring a special downmix.
* The core technology of Ambisonics is free of patents, and a complete tool chain for production and listening is available as [[free software]] for all major [[operating system]]s.

On the downside, Ambisonics is:
* Prone to strong coloration from [[comb filter]]ing artifacts due to high coherence of neighbouring loudspeaker signals at lower orders
* Unable to deliver the particular spaciousness of spaced omnidirectional microphones preferred by many classical sound engineers and listeners
* Not supported by any major record label or media company, although a number of [[Ambisonic UHJ format]] (UHJ) encoded tracks (principally classical) can be located, if with some difficulty, on services such as [[Spotify]].<ref>{{cite web | url=http://www.surrounddiscography.com/uhjdisc/uhjhtm.htm | title=Ambisonic UHJ Discography "Complete List" of record labels }}</ref>
* Conceptually difficult for people to grasp, as opposed to the conventional ''"one channel, one speaker"'' paradigm.
* More complicated for the consumer to set up, because of the decoding stage.
* Limited to a sweet spot, which is not found in other forms of surround sound such as VBAP
* Worse at localising point sources than amplitude panning, with counter-phase signals blurring the imaging
* Much more sensitive to speaker placement than other forms of surround sound that use amplitude panning