Vienna Symphonic Library Forum

  • MIR processing puzzlement

    I'm in the dark about some (well, quite a few) of the "black box" functions in MIR Pro 3D.

    Currently, uppermost in my puzzlement is this: what sort of decoding does MIR apply to the wonderfully useful "Stereo - MIR 3D HOA Capsules" virtual mic preset? 

    1.   I'm puzzled because I'm hearing what seems to me a nice, wide HOA spatialisation from this mic when listening on headphones, yet when I then pass the mix through dearVR Monitor, the sound field narrows substantially to simulate sound from two frontal stereo loudspeakers. So what does MIR do to decode this particular mic preset before passing stereo signals to my DAW's output? Is the decoding different to that applied to the other virtual stereo mic presets? What exactly am I getting in my headphones from MIR without dearVR Monitor? Does anyone actually need dearVR Monitor if we're interested only in headphone mixes - not in simulations of loudspeaker reproduction of mixes? If so, why? Wherever possible I'd rather not lose any spatialisation quality by being forced to use an unwanted stage of loudspeaker array simulation.

    2.   Also, dearVR Monitor (according to its manual) simulates the left-right crosstalk the listener would receive from a stereo pair of real speakers. This of course tends to blur or otherwise degrade spatialisation cues (ITD and ILD, the interaural time and level differences). But furthermore, alas, I'm noticing some very tight echo effects from my instruments placed on stage in MIR. There is no hint of these echo effects when listening without dearVR Monitor. (I'm using only the Analytical Dry simulated room, Focus at nominal and headphone 'correction' switched off.)

    I'm thinking about this in terms of constructing my own virtual 'stereo' mics in MIR in order to experiment further with HOA spatialisation for my own use. But without knowing anything about MIR's decoding I don't feel equipped to make any meaningful experiments. Of course it might just be me being stupid, and/or perhaps it's my tendency (acquired over many years) to avoid making unsafe assumptions like the plague when dealing with real or conceptual "black boxes" in development situations.

    Any way you can help me in this, Dietz? I have quite a few of the notable papers on HOA and on spatialisation more broadly; any citations you think might be helpful would also be much appreciated.


  • I will look forward to hearing anything and everything from Dietz, as always...

    In the meantime, about dearVR Monitor. This is mainly designed to be a monitoring plugin, not so much a tool to bounce your final mix through. You'd have to label your final bounce with something like "ONLY LISTEN WITH HEADPHONES" for it to be of any use. Yes, if you only plan to listen through headphones and you want your listeners to always use headphones, then you could definitely render out the final mix through dearVR Monitor in order to obtain a binaural mix.

    Now the question is: do you really need or want to do that if you are working primarily in stereo? Probably not. As you said, dearVR Monitor is introducing crosstalk, by design, to simulate sitting in front of real monitors, where both ears hear some audio from both speakers, unlike headphones, where each ear hears one side of the stereo mix in isolation.
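    To make that crosstalk concrete, here's a toy sketch with made-up delay and gain numbers (a real binauralizer like dearVR Monitor uses full HRTFs, so treat this as illustration only):

    ```python
    import numpy as np

    def simulate_speaker_crosstalk(left, right, sr=48000,
                                   itd_s=0.00025, ild_gain=0.35):
        """Each ear gets the opposite speaker's signal slightly delayed
        (interaural time difference) and attenuated (interaural level
        difference). A real binauralizer uses measured HRTFs instead."""
        d = int(round(itd_s * sr))                    # ~0.25 ms delay
        pad = np.zeros(d)
        ear_l = left + np.concatenate([pad, right])[:len(left)] * ild_gain
        ear_r = right + np.concatenate([pad, left])[:len(right)] * ild_gain
        return ear_l, ear_r

    # A hard-panned click now leaks (quieter and later) into the far ear,
    # which is exactly the "blurring" of ITD/ILD cues described above.
    click = np.zeros(4800); click[0] = 1.0
    ear_l, ear_r = simulate_speaker_crosstalk(click, np.zeros_like(click))
    ```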

    If your intended audience will only be using headphones, and if you are mixing in stereo, then I see no point in using dearVR Monitor at all.

    If you want your audience to listen with headphones but also allow non-headphone listening, then you might want to monitor your mix through dearVR Monitor in order to hear the stereo separation that your non-headphone listeners will eventually be experiencing. But do not render the final mix through dearVR Monitor; only use it to monitor your mix. dearVR Monitor is mainly meant as a monitoring device to simulate a variety of loudspeaker listening environments.

    EXCEPT for one possible situation: if you are doing 2D and 3D mixes for headphones, then you need something like dearVR Monitor in order to hear the immersive audio in your headphones. And if you render the final mix through dearVR Monitor and label it "ONLY LISTEN ON HEADPHONES", your listeners will hear the same thing you hear.

    For the most part, people are not going to distribute 2D/3D binaural renditions as such; they will be distributing Dolby Atmos MP4 renditions, which have to be played back on systems that support Dolby Atmos, such as Apple AirPods Pro or Sonos systems. Other than VR gaming applications, the main point of using dearVR Monitor's binaural output is strictly monitoring, in order to simulate real-world non-headphone listening environments.


  • There's no "decoding blackbox" in MIR 3D at all, just state-of-the-art Ambisonics decoding, wrapped up in good (I hope!) audio engineering. Just open the Output Format Editor, and you'll see that every parameter is accessible.

    However, as soon as you listen to a 3D setup on stereo headphones without manual downmix parameters or a binauralizer, you will indeed just hear something arbitrary. It's hard to say what happens in your DAW when you do that: maybe you're just listening to the left and right channels, maybe to an uncontrolled mixture of all channels, maybe to something created by an automatic mixdown process - no idea. If you like the sound, just use it, but I can't tell from a distance what it is.

    Of course dearVR Monitor narrows this uncontrolled image with its hard left/right panning. That's actually one of the strongest points for this kind of processing: you get rid of that silly "in-head localization", even when your source is just stereo. ... The fact that we can hear sources from the back and above to a certain extent by means of binauralization comes as an added benefit, actually. :-)

    No idea where the echoes you seem to experience come from. Maybe time to get in contact with the nice people of Dear VR directly ...?

    Kind regards,


    /Dietz - Vienna Symphonic Library
  • @Dietz said:

    No idea where the echoes you seem to experience come from. Maybe time to get in contact with the nice people of Dear VR directly ...?

    He must be referring to the ambience added to the listening environments in dearVR Monitor, excluding the "dry" and "directional" ones. They are there by design.


  • I don't think so. Macker wrote: 

    I'm using only the Analytical Dry simulated room, Focus at nominal and headphone 'correction' switched off.


    /Dietz - Vienna Symphonic Library
  • Ah, OK ... missed that. Not sure then. I have not noticed any echoes here with my use of dearVR Monitor.


  • Dietz, thanks for your reply.

    (I think we may have some nomenclature issues here - probably my fault.

    I use the engineering jargon term "black box" to mean any piece of hardware or software for which the transfer function - or at least the 'functional processing' - between its inputs and outputs is not known or well defined. Also, I appreciate that the term "decoding" tends to mean something quite specific to many people wherever multi-speaker setups are involved. I should instead use the term "processing" to avoid any confusion with this popular usage. I was really asking whether or not there is any sort of intermediate ambisonics channel-processing prior to 'mapping' [i.e. 'decoding'] ambisonics channels to loudspeaker channels.)
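    (For concreteness, here's the sort of 'mapping' I mean, as a toy first-order sampling decoder. This is purely illustrative on my part, not a claim about MIR's actual internals:)

    ```python
    import numpy as np

    def fo_sampling_decoder(speaker_azimuths_deg):
        """Naive first-order horizontal sampling decoder: one row per
        loudspeaker, columns = [W, Y, Z, X] (ACN order, SN3D). Real
        decoders (AllRAD, mode-matching, ...) are more sophisticated."""
        rows = []
        for az in np.radians(speaker_azimuths_deg):
            # Evaluate the spherical harmonics in each speaker's direction.
            rows.append([1.0, np.sin(az), 0.0, np.cos(az)])
        return np.array(rows) / len(speaker_azimuths_deg)

    D = fo_sampling_decoder([45, -45, 135, -135])  # quad layout
    b = np.array([1.0, 0.0, 0.0, 1.0])             # source straight ahead
    feeds = D @ b                                  # loudspeaker signals
    print(feeds)  # front pair comes out louder than the rear pair
    ```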

    You're saying that for this particular virtual mic there might perhaps be "an uncontrolled mixture of all [7 ambisonic] channels"? In other words, this mic is not subject to the same many-to-2 channel processing (or 'decoding') that appears to be obligatory for all other "stereo" virtual mics in MIR? Well, you call this mic "unique" in your little screed about it in MIR. (7 channels is, I believe, the channel count for horizontal-only 3rd-order ambisonics virtual mics, whereas there are 16 ambisonics channels for full-sphere 3rd-order virtual mics, both regardless of the selected loudspeaker configuration.) Your remark is indeed puzzling for me.
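    (For what it's worth, those counts follow from the standard formulas: (N+1)^2 channels for full-sphere order N, and 2N+1 for horizontal-only. A quick sanity check:)

    ```python
    def ambisonic_channels(order, full_sphere=True):
        """Channel count of an order-N Ambisonics signal."""
        return (order + 1) ** 2 if full_sphere else 2 * order + 1

    assert ambisonic_channels(3, full_sphere=False) == 7   # 2D, 3rd order
    assert ambisonic_channels(3, full_sphere=True) == 16   # 3D, 3rd order
    assert ambisonic_channels(1) == 4                      # classic B-format
    ```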

    I guess you've answered my question about whether or not it's first obligatory to 'map' explicitly to a more-than-2 loudspeaker configuration in order to get a full horizontal or spherical experience on headphones, by telling me to open MIR's Output Format Editor - which appears to be for loudspeakers only.

    Looks like I'll have to ask VSL Support to see if they can get a dev to spill some beans on this issue that's causing my puzzlement and deterring me from experimenting sensibly with my own speculative capsule configurations in MIR. As for the dearVR Monitor echo issue, I might ask their support crew - or I might not, given that I really don't want (nor, I hope, need) a simulation of loudspeakers in a simulated room, getting in the way of my headphone mixes.

    Anyway, I appreciate your taking the time to reply.


  • @Another User said:

    As for the dearVR Monitor echo issue, I might ask their support crew - or I might not, given that I really don't want (nor, I hope, need) a simulation of loudspeakers in a simulated room, getting in the way of my headphone mixes.

    Binauralization is not about loudspeaker simulation; it's first and foremost about the so-called head-related transfer function (HRTF, i.e. the very specific way our head determines spatial perception). Everything else, like room and/or speaker simulation, head tracking, headphone profiles and so on, is just an add-on. To put it straight and simple: If you want to make proper use of 3D audio, you need either 3D monitoring or binauralization for headphones.
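    In signal-processing terms, binauralization boils down to convolving each source or speaker feed with the HRTF's impulse-response pair for its direction and summing the results. A minimal sketch (the HRIR arrays are assumed inputs; measured sets, e.g. for the Neumann KU100 dummy head, are freely available):

    ```python
    import numpy as np

    def binauralize(feeds, hrirs_left, hrirs_right):
        """Convolve every feed with its direction's HRIR pair and sum.
        feeds / hrirs_*: lists of 1-D arrays, one entry per direction;
        left and right HRIRs are assumed to have equal lengths."""
        n = max(len(f) + len(h) - 1 for f, h in zip(feeds, hrirs_left))
        ear_l, ear_r = np.zeros(n), np.zeros(n)
        for feed, hl, hr in zip(feeds, hrirs_left, hrirs_right):
            yl, yr = np.convolve(feed, hl), np.convolve(feed, hr)
            ear_l[:len(yl)] += yl
            ear_r[:len(yr)] += yr
        return ear_l, ear_r
    ```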


    /Dietz - Vienna Symphonic Library
  • Ok. Thanks to your detailed response I realise I must now do more experimenting in MIR, and read some more papers on Ambisonics (Graz paradigm mainly, but I've also picked up some interesting papers from Helsinki). No point in rushing in like a fool, lol.

    I've been mixing binaurally for headphones since I bought the Logic Pro 9 Studio box set a long time ago; I'm now pretty familiar with that and its underlying concepts. However, I've also picked up a recent-ish paper on processing Ambisonic signals binaurally. Ahh what endless fun, lol.

    The "black box" thing is relative. For example, what you treat as the clearly understood functionality of some particular subsystem or component, I might have to treat as a black box unless and until I can develop my own reasonably well-grounded understanding of it. Anyway, you've cleared up one part of my curiosity on the black box front: I had wondered if the Graz crew's contribution to MIR includes some UMPA-IEM 'confidential ingredients'; but you've said that all is open and clear from your standpoint. All good.

    I must say I find your English not only fluent but of a high standard even among native English speakers. No worries, I'll always endeavour to bear with you linguistically - and try to avoid becoming a typical crusty and lazy old imperialist Brit, lolol.

    Later.


  • The only "secret sauce" is the upscaling process the genius people at IEM Graz developed to raise our 1st order RoomPacks to 3rd Order (... they suggest 7th order, actually, but then it would be like back in the days when a Mac could process two or three instances of AltiVerb in real-time :-D ...). I know in principle what they did, and I know a little about the sonic possibilities as well as the potential pitfalls of their procedures, having helped them evaluate different approaches acoustically, but the math involved is _far_ beyond my comprehension. 8-/

    .... an important sidenote to the occasional reader of our conversation: While some basic knowledge of Ambisonics can be quite helpful in better understanding MIR 3D, it is by no means a prerequisite. :-)


    /Dietz - Vienna Symphonic Library
  • I have an Output Format question. The manual talks about the mic array vs. the loudspeaker sections, and I am wondering what happens when the faders are up on both sections. Does that mean that the mic array results are summed together with the coefficients fader results (if and when coefficients exist)?

    Also, another question: in the mic array section there is a slider called "Wet Volume Offset". What does that do, exactly, as an "offset"?


  • @Dietz said:

     If you want to make proper use of 3D audio, you need either 3D monitoring or binauralization for headphones. 



    Dietz, how can I achieve the binauralization you have mentioned using dearVR Monitor? I am using Digital Performer. I assume that I can achieve that using dearVR Monitor in any DAW, right?
    dearVR Monitor should be inserted on the master fader channel, with which preset? And it must be ON when bouncing, right?


  • Rubens, FYI, I did not get very good results with DP11 and surround audio. It cannot support 3D audio at all, only 2D, and I managed to crash it several times while attempting it. I cannot recommend DP at this time for surround work of any kind - and mind you, that is my preferred DAW. I have more or less decided to just do my MIDI work in DP, and once I have that done, I can bounce the tracks, take them over to another DAW with better surround support such as Logic Pro, Cubase or even Pro Tools, and mix from there. Make sure to send feature requests to MOTU to add Atmos support like everyone else. They are way behind the curve on that one.

    That being said... The way to use dearVR Monitor in general is to put it on the master bus of a surround project. In DP that means you create a surround bundle, such as 5.1, use that bundle as the output of the project, and have all the instrument channels send to that bundle. Put MIR Pro 3D on each instrument channel after you have established that they are surround channels rather than stereo. Make sure you're using the 5.1 version of the MIR Pro 3D plugin each time. Use one of the 5.1 Output Formats in MIR Pro 3D. On the master channel, put dearVR Monitor, which will convert the six channels of 5.1 audio into two channels of binaural. That's it. Use it for monitoring through headphones, not so much for bouncing.
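    If it helps to see that last fold-down step as code, here is a toy 5.1-to-binaural sketch using a crude delay/gain head model per speaker position. The channel names and ITU azimuths are the standard ones, but the "HRIRs" are stand-ins of my own, not what dearVR Monitor actually does:

    ```python
    import numpy as np

    SR = 48000
    # Nominal ITU-R BS.775 azimuths for 5.1, in degrees (left positive).
    AZIMUTHS = {"L": 30, "R": -30, "C": 0, "LFE": 0, "Ls": 110, "Rs": -110}

    def toy_hrir(az_deg, ear):
        """Single delay+gain per ear from a crude head model. A real
        binauralizer would use a measured HRIR for this direction."""
        az = np.radians(az_deg if ear == "left" else -az_deg)
        delay = int(round(SR * 0.00025 * (1.0 - np.sin(az))))
        h = np.zeros(delay + 1)
        h[delay] = 0.7 + 0.3 * np.sin(az)   # head shadowing, very roughly
        return h

    def binauralize_51(bus):
        """bus: dict channel-name -> 1-D signal. Returns (left, right)."""
        n = max(len(x) for x in bus.values()) + 64   # delay headroom
        ears = {"left": np.zeros(n), "right": np.zeros(n)}
        for name, sig in bus.items():
            for ear in ears:
                y = np.convolve(sig, toy_hrir(AZIMUTHS[name], ear))
                ears[ear][:len(y)] += y
        return ears["left"], ears["right"]
    ```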


  • @Another User said:

    Also, another question, on the mic array section there is a slider called "Wet Volume Offset".  What does that do exactly as an "offset"?

    Dry and Wet Volume Offset allow for different gain settings after the decoding, with respect to the capsule's Volume set by the top-most fader (0 means "no offset", i.e. unity gain). I used this a lot in 1st-order surround setups to create some bias towards the front speakers for the dry signal, for example.
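    In plain numbers, the idea is trivial (a sketch of my description above, nothing more):

    ```python
    def path_gain_db(capsule_volume_db, offset_db=0.0):
        """Effective post-decoding gain of the dry or wet path:
        capsule Volume plus its offset; offset 0 dB = unity, no change."""
        return capsule_volume_db + offset_db

    def db_to_linear(db):
        return 10.0 ** (db / 20.0)

    # Capsule Volume -3 dB with a Wet Volume Offset of -6 dB:
    assert round(db_to_linear(path_gain_db(-3.0, -6.0)), 3) == 0.355
    ```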


    /Dietz - Vienna Symphonic Library
  • @Rubens said:

    @Dietz said:

    If you want to make proper use of 3D audio, you need either 3D monitoring or binauralization for headphones.

    Dietz, how can I achieve the binauralization you have mentioned using dearVR Monitor? I am using Digital Performer. I assume that I can achieve that using dearVR Monitor in any DAW, right?
    dearVR Monitor should be inserted on the master fader channel, with which preset? And it must be ON when bouncing, right?

    Like Dewdman42, I was under the impression that Digital Performer isn't 3D-audio compatible. But in any case: a binauralization device like dearVR Monitor should be used for monitoring (as the name implies). In Cubase/Nuendo there's a dedicated Control Room section for this task, and I created a setup that mimics this concept in Pro Tools, too. The routing principle is always the same: 3D (or surround, or stereo) input -> binauralizer with the fitting input settings -> stereo monitoring output.

    .... if you want to bounce this binauralized signal as a file, you would use the same approach in your master channel.


    /Dietz - Vienna Symphonic Library
  • @Dietz said:

    They are simply summed, yes. After all, they are just different methods to derive output from the same source. Several factory presets use this approach to strengthen the perceived position of the dry signals in an otherwise predominantly coefficient-based wet signal environment.

    Dry and Wet Volume Offset allow for different gain settings after the decoding, with respect to the capsule's Volume set by the top-most fader (0 means "no offset", i.e. unity gain). I used this a lot in 1st-order surround setups to create some bias towards the front speakers for the dry signal, for example.

    Thanks for that explanation!  Makes sense.