Home Computer Audio Asylum

Music servers and other computer based digital audio technologies.

Temporal resolution and RBCD

Thank you for bringing this work to our attention.

Temporal resolution of human hearing at 5.6 μs (equivalent to a sample rate of roughly 178 kHz) or better is not a surprise. It’s great to see time-domain research being done to this level of accuracy. However, it is not clear that the paper’s conclusion follows, namely that such a high recording bandwidth and format is required for full fidelity.

Let’s start here:

It also appears that the cochlea may sense ultrasonic stimulation if the latter manages to reach the cochlea in sufficient intensity, both when presented through the air (Henry and Fast, 1984; Ashihara et al., 2006) but especially when presented through bone conduction (Corso, 1963; Deatherage et al., 1954; Lenhardt et al., 1991; Lenhardt, 1998). It has also been conjectured that such high level ultrasound may possibly change the perception of timbre when superimposed on audible harmonics (Oohashi et al., 1991; Yoshikawa et al., 1995).

That is, human hearing is capable of (indirectly) perceiving ultrasonic frequencies. This is further described in the experiment as follows:

In the main experiment, subjects try to discern differences between a (7 kHz approximately square-wave shaped) signal with finite low-pass filtering versus a control signal with no filtering (waveforms depicted in Fig. 3). The control tone was perceived to have a sharper or brighter timbre whereas the filtered one had a duller quality (no difference in loudness was perceived except for the largest setting of τ=30 μs).

Indirect perception of ultrasonics is not in dispute. However, a similar time resolution is observed by displacing the speaker:

In a sister experiment (Kunchur, 2007), where signals were temporally altered by spatial displacement of speakers instead of by electronic means, a similar threshold of 6 μs was found. That work also makes a rudimentary neurophysiological estimate for the temporal resolution for transient stimuli that is in the 2–16 μs range.

In this test the low-pass filter remains unchanged (i.e. the same ultrasonic information is presented), yet differences down to 6 μs are still observed, this time through changes in speaker time alignment.
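To put that 6 μs figure in physical terms, here is a rough sketch converting between a time offset and the speaker displacement that would produce it. The speed of sound (343 m/s, dry air at about 20 °C) and the function names are my own assumptions, not values taken from the paper.

```python
# Assumption: speed of sound in dry air at roughly 20 degrees C.
SPEED_OF_SOUND_M_PER_S = 343.0


def displacement_for_delay_mm(delay_us, c=SPEED_OF_SOUND_M_PER_S):
    """Path-length change (mm) that shifts arrival time by delay_us microseconds."""
    return delay_us * 1e-6 * c * 1e3


def delay_for_displacement_us(displacement_mm, c=SPEED_OF_SOUND_M_PER_S):
    """Arrival-time shift (microseconds) caused by moving a speaker displacement_mm."""
    return displacement_mm * 1e-3 / c * 1e6


if __name__ == "__main__":
    for delay in (6.0, 22.7):
        mm = displacement_for_delay_mm(delay)
        print(f"{delay:5.1f} us  <->  {mm:.2f} mm of path difference")
```

Under that assumption, 6 μs works out to about 2 mm of path difference, so the threshold corresponds to a very small change in speaker position.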

By upsampling 44.1k to 192k we improve the time resolution from 22.7 μs to 5.2 μs. Of course 44.1k is not ideal (as its content is limited to 22.05 kHz, which leaves no room for ultrasonics), but with correct upsampling (i.e. bandlimited interpolation) we can get very close. Using Dr Kunchur’s words, the “23 μs rise/fall times that characterize the 44.1 kHz sampling rate of the digital compact-disk” are transformed to 5.2 μs rise/fall times (which is better than the desired time resolution of 5.6 μs).
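For anyone who wants to check the numbers, here is a minimal sketch of the arithmetic plus a toy bandlimited interpolator. The Hann-windowed sinc kernel, its 64-sample half-width, and the 1 kHz test tone are all choices of mine for illustration; real upsamplers use their own filter designs. Note that this only shows that values between the 44.1k samples can be recovered for a signal already bandlimited below 22.05 kHz; it does not restore anything above that.

```python
import math

CD_RATE_HZ = 44_100
HIRES_RATE_HZ = 192_000


def sample_period_us(rate_hz):
    """Spacing between consecutive samples, in microseconds."""
    return 1e6 / rate_hz


def windowed_sinc(x, half_width):
    """Hann-windowed sinc kernel; x is measured in input-sample periods."""
    if abs(x) >= half_width:
        return 0.0
    if x == 0.0:
        return 1.0
    window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
    return window * math.sin(math.pi * x) / (math.pi * x)


def interpolate(samples, t, half_width=64):
    """Estimate the bandlimited signal at fractional input-sample index t."""
    centre = math.floor(t)
    lo = max(0, centre - half_width + 1)
    hi = min(len(samples), centre + half_width + 1)
    return sum(samples[n] * windowed_sinc(t - n, half_width) for n in range(lo, hi))


# Hypothetical test signal: a 1 kHz sine sampled at 44.1 kHz.
TONE_HZ = 1_000
samples = [math.sin(2 * math.pi * TONE_HZ * n / CD_RATE_HZ) for n in range(400)]

# Evaluate on the 192 kHz grid, staying away from the ends of the buffer
# so that every kernel tap has a sample to work with.
ratio = CD_RATE_HZ / HIRES_RATE_HZ  # input samples per output sample
worst_error = 0.0
for k in range(700, 1000):
    t = k * ratio  # position expressed in 44.1 kHz sample units
    exact = math.sin(2 * math.pi * TONE_HZ * t / CD_RATE_HZ)
    worst_error = max(worst_error, abs(interpolate(samples, t) - exact))

print(f"44.1 kHz period: {sample_period_us(CD_RATE_HZ):.1f} us")
print(f"192 kHz period:  {sample_period_us(HIRES_RATE_HZ):.1f} us")
print(f"worst interpolation error: {worst_error:.2e}")
```

The two periods come out at 22.7 μs and 5.2 μs, matching the figures above, and the interpolated points on the 192k grid track the original sine closely.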

Unfortunately, upsampling is not free from artifacts, as we have ringing and overshoot to deal with. Interestingly, neither is high-res:

Additionally, restricting the bandwidth by low-pass filtering necessarily attenuates all frequencies to some extent, and hence spectral amplitude changes can never be avoided absolutely (even when 1/τ ≫ fmax); how those amplitude changes affect timbre will depend on their magnitudes relative to the relevant just noticeable differences. For these reasons it can be expected that limiting the bandwidth of an audio signal by low-pass filtering may produce an audible change, even when the high-frequency cutoff (or equivalently [2πτ]^−1) is well above fmax.
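To get a feel for the size of those amplitude changes, here is a small sketch that treats the filter as a simple first-order (RC-style) low-pass with time constant τ. That filter model is my assumption for illustration; the quote only tells us the cutoff is [2πτ]^−1. The 7 kHz fundamental and the τ values of 5.6 μs and 30 μs come from the passages quoted earlier.

```python
import math


def cutoff_hz(tau_s):
    """-3 dB cutoff of a first-order low-pass with time constant tau_s."""
    return 1.0 / (2.0 * math.pi * tau_s)


def gain_db(freq_hz, tau_s):
    """Magnitude response in dB of the assumed first-order low-pass."""
    x = 2.0 * math.pi * freq_hz * tau_s
    return -10.0 * math.log10(1.0 + x * x)


if __name__ == "__main__":
    fundamental_hz = 7_000
    for tau_us in (5.6, 30.0):
        tau_s = tau_us * 1e-6
        print(f"tau = {tau_us} us, cutoff = {cutoff_hz(tau_s) / 1e3:.1f} kHz")
        # A square wave carries only odd harmonics: 7, 21, 35 kHz.
        for harmonic in (1, 3, 5):
            f = harmonic * fundamental_hz
            print(f"  {f / 1e3:4.0f} kHz: {gain_db(f, tau_s):6.2f} dB")
```

With this model, τ = 5.6 μs puts the cutoff near 28 kHz yet still trims the 7 kHz fundamental by about a quarter of a dB, which illustrates the point in the quote: some attenuation reaches well below the cutoff.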

Some compromises will always be needed.


