Lauri Savioja, Jyri Huopaniemi, Tapio Lokki, and Riitta Väänänen
Helsinki University of Technology
Laboratory of Acoustics and Audio Signal Processing
P.O. Box 3000, FIN-02015 HUT, Finland
Helsinki University of Technology
Laboratory of Telecommunications Software
P.O. Box 1100, FIN-02015 HUT, Finland
Lauri.Savioja@hut.fi, Jyri.Huopaniemi@hut.fi, Tapio.Lokki@hut.fi,
Riitta.Vaananen@hut.fi
http://www.tcm.hut.fi/Research/DIVA/
At ICAD'96, a real-time virtual audio reality model was presented, which included model-based sound synthesizers, geometric room acoustics modeling, binaural auralization for headphone and loudspeaker listening, and high-quality animation. The DIVA environment is an integrated implementation of a virtual reality system currently aiming at a virtual symphony orchestra performance. In the current version of the software, multiple sound sources (physical models of musical instruments) are conducted by a virtual conductor (controlled by a position tracker with 3 transmitters). The real-time calculation of auralization has been enhanced by accurate HRTF approximations, a new late reverberation model, and by an efficient image source method.
The basis of the image source method we use is presented thoroughly in many articles [1] [3]. The real-time communication and the updating rules of the image sources are described in our earlier article [4]. In this section we concentrate on some performance issues and on the handling of multiple sound sources in the image source method.
The major problems of the image source method are its computational and memory requirements. Figure 1 represents a geometric model of Marienkirche, a 13th-century Gothic cathedral in Neubrandenburg, northern Germany, which is being rebuilt as a 1200-seat concert hall. The geometry has 480 surfaces that may produce valid image sources. In a typical situation with this model, only 5-10 first-order image sources are visible.
To avoid unnecessary calculation, only those image sources that might become visible during the first reflections need to be searched. To achieve this, we make a preprocessing run with ray tracing to check the visibility of all surface pairs. During the image source calculation, an image source is then reflected only against surfaces that are visible from the original reflecting surface, instead of against all surfaces whose normal points toward the source. This technique reduces the number of second-order image sources by about 65 % and the number of third-order image sources by about 85 %. For example, in the case of Fig. 1 the number of possible second-order image sources drops from 110,000 to about 40,000. Of these possible image sources, approximately 20-30 are visible to the listener simultaneously. Another computationally demanding task in the image source method is the visibility checking itself, which in geometrical terms requires a large number of intersection calculations between surfaces and lines. To reduce this cost we use an advanced geometrical directory, EXCELL [13].
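The mirroring step at the core of the method is itself simple; the following Python sketch (the function name and the point-plus-unit-normal surface representation are illustrative choices of ours, not the DIVA implementation) constructs a first-order image source by reflecting the source point across the plane of a reflecting surface:

```python
# Sketch of first-order image-source construction: the source point is
# mirrored across the plane of the reflecting surface. The surface is
# given by any point on its plane and its unit normal.

def mirror_source(src, plane_point, plane_normal):
    """Reflect the point `src` across the plane defined by `plane_point`
    and the unit normal `plane_normal`; the result is the image source."""
    # signed distance from the source to the plane, along the normal
    d = sum((s - p) * n for s, p, n in zip(src, plane_point, plane_normal))
    # move the source twice that distance against the normal
    return tuple(s - 2.0 * d * n for s, n in zip(src, plane_normal))

# Example: a source 1 m above the floor plane z = 0 mirrors to z = -1
image = mirror_source((0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
# image == (0.0, 0.0, -1.0)
```

Higher-order image sources are obtained by applying the same mirroring to already-mirrored sources, which is why pruning invisible surface pairs early cuts the combinatorial growth so effectively.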
Figure 1: The example geometry of Marienkirche in Neubrandenburg, Germany, which has 480 surfaces.
To avoid audible transients in the dynamic auralization process, the auralization parameters have to be interpolated. A typical update rate of the direct sound and image-source parameters is 20-30 Hz. Interpolation is carried out for the 1/r distance gain, the propagation delay, the ITD, and the minimum-phase HRTFs. To obtain the exact distance of each image source, fractional delays are used [10]. In an updated version of the DIVA system it is possible to use multiple sound sources. The direct sound emanating from each source is calculated individually, but if the sources are close to one another, the image sources (reflections) are shared by all of them. These common image sources are calculated from the average position of the individual sources. Naturally, every new sound source increases the computational load.
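The fractional-delay idea can be illustrated with a minimal sketch. DIVA uses the fractional delay filters described in [10]; linear interpolation between the two neighbouring delay-line samples, shown below, is the simplest first-order case (the function name and buffer convention are ours):

```python
# Sketch of a fractional-delay read from a delay line, as needed to
# render the exact (non-integer-sample) propagation distance of each
# image source. Linear interpolation is the simplest first-order
# fractional delay filter.

def read_fractional(buf, delay):
    """Read `buf` delayed by a non-integer number of samples.
    buf[0] is the most recent sample, buf[1] one sample older, etc."""
    i = int(delay)          # integer part of the delay
    frac = delay - i        # fractional part in [0, 1)
    return (1.0 - frac) * buf[i] + frac * buf[i + 1]

# A delay of 1.5 samples reads halfway between buf[1] and buf[2]:
buf = [0.0, 1.0, 0.0, 0.0]
print(read_fractional(buf, 1.5))   # -> 0.5
```

For example, a propagation distance of 3.4 m at c = 340 m/s and a 44.1 kHz sample rate corresponds to a delay of 441.0 samples; as the image source moves, the delay value changes continuously and the interpolation avoids clicks.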
In the DIVA setup, both headphone (binaural) and two-loudspeaker (transaural) output formats are supported. Efficient methods for HRTF filter design and implementation have been considered in [5]. Based on these results, the use of 30-tap FIR or even 14th-order warped IIR realizations of HRTFs is justified when minimum-phase reconstruction is applied. Depending on the available calculation capacity, HRTFs can be applied to the image sources as well as to the direct sound. A more detailed discussion can be found in [12] and [5].
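The structure of the binaural processing — an interaural time delay followed by a short minimum-phase filter per ear — can be sketched as follows. The 3-tap coefficients and function names below are arbitrary placeholders for illustration, not measured HRTF data; the actual DIVA filters are 30-tap FIRs or 14th-order warped IIRs [5]:

```python
# Sketch of binaural rendering: delay the contralateral ear by the ITD,
# then apply each ear's minimum-phase HRTF filter.

def fir(signal, coeffs):
    """Direct-form FIR convolution, output truncated to input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out

def binauralize(mono, itd_samples, hrtf_l, hrtf_r):
    """Render a mono signal to a (left, right) pair. The ITD is an
    integer sample count here; in practice a fractional delay is used."""
    delayed = [0.0] * itd_samples + mono[:len(mono) - itd_samples]
    left = fir(mono, hrtf_l)        # ipsilateral ear: no extra delay
    right = fir(delayed, hrtf_r)    # contralateral ear: ITD + HRTF
    return left, right
```

Because the HRTFs are reduced to minimum phase, all interaural delay is concentrated in the single explicit ITD element, which is what makes the short filter lengths feasible.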
The revised auralization unit of the DIVA system is shown in Fig. 2. The figure shows the process for auralizing one sound source, but the system is expandable to multiple sources as described in the previous section. In the following, the blocks concerning real-time auralization are discussed in greater detail.

In Figure 2, one set of filters implements the air absorption (including distance attenuation), the material absorption of the walls [6], and the directivity of the sound source [9]. A second set of filters implements the binaural processing by adding a direction-dependent interaural time delay to the direct sound and to each reflection, and by using minimum-phase HRTFs for directional filtering [5]. These binaural early reflections are fed to the output of the system together with the late reverberation. In addition, a cross-talk cancellation module (not shown in the figure) can be attached to the output to obtain transaural output.

In the late reverberation module, the gain b(r) depends on the distance between the sound source and the listener. It adjusts the level of the early reflections fed to the late reverberation unit so that the level of the late reverberation remains approximately constant, a property of a diffuse sound field. The late reverberation unit consists of several parallel feedback loops, each containing a delay line DLk, a comb-allpass filter, and a lowpass filter [16]. The lowpass filters, typically implemented as one-pole lowpass filters, realize the frequency-dependent reverberation time (similarly as in [7]). The comb-allpass filters diffuse the recirculating signal in the feedback loops, increasing the reflection density of the reverberator output. The feedback connection resembles that of a feedback delay network (FDN), in which the output of each delay line is connected to the inputs of the delay lines through a feedback matrix [7] [8] [11].
The feedback connection in our system corresponds to a feedback matrix that is unitary and circulant, with one value on the diagonal and another elsewhere in the matrix. Compared to a general FDN, our reverberator requires less computation, and the reflection density grows faster as a function of time. An advantage of the FDN, shared by the proposed reverberator, is that the output of each delay line contains energy from the modes of all the other delay lines, and the outputs are mutually incoherent. This makes it possible to feed different delay-line outputs to different output channels, which yields a pseudostereophonic effect because the late reverberation is incoherent between channels [7].
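The role of one comb-allpass section inside such a feedback loop can be illustrated with a short sketch (the delay length and coefficient below are arbitrary example values, not DIVA's parameters):

```python
# Sketch of one Schroeder comb-allpass section, the diffusing element in
# each feedback loop of the late reverberator. The difference equation
#   y[n] = -g*x[n] + x[n-M] + g*y[n-M]
# passes all frequencies at unit magnitude while smearing an impulse
# into a decaying pulse train, which raises the reflection density.

def comb_allpass(x, M, g):
    """Run an allpass with delay M samples and coefficient g over x."""
    y = []
    for n, xn in enumerate(x):
        x_del = x[n - M] if n >= M else 0.0
        y_del = y[n - M] if n >= M else 0.0
        y.append(-g * xn + x_del + g * y_del)
    return y

# A unit impulse comes out as -g at n=0, then (1 - g*g) * g**k at
# successive multiples of M:
out = comb_allpass([1.0] + [0.0] * 9, M=3, g=0.5)
# out[0] == -0.5, out[3] == 0.75, out[6] == 0.375, ...
```

Placing one such section after each delay line, rather than using a dense feedback matrix, is what lets the proposed structure reach a high echo density with fewer multiplications than a general FDN.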
Figure 2: Illustration of the revised auralization scheme in the DIVA system.
A number of applications for virtual acoustic environment simulation have been considered in the course of this research. First, the system has proven useful to both acousticians and architects in the design, auralization, and animation of planned concert halls. It is also a valuable research tool for acousticians and DSP experts studying 3-D sound, room acoustics, and auralization. In the following, an example that involves both conductor and listener interaction with the virtual environment is presented.
Other features of the DIVA system are real-time animation of the players, conductor following, and real-time sound synthesis with physical models of instruments. When the entire system is running, the maestro (conductor) conducts the virtual orchestra using a baton connected to a motion tracker. The virtual players play their instruments with realistic fingerings, animated according to the strokes of the conductor. Simultaneously, another user (the listener) can fly around the concert hall, see the animated virtual orchestra, and hear the auralized sound output. A prominent application of this case study is the teaching of conductors, and the feedback from experienced conductors who have used the system has indeed been very positive and encouraging. In the current version, four instruments are implemented: flute, guitar, and double bass with physical models, and a MIDI drum set, each as an individual sound source. All this computation is carried out on three Silicon Graphics workstations (two Octanes and one O2) that communicate over an Ethernet network [4]. In Fig. 3, the setup of the system used in the case study is illustrated.
Recent developments in the DIVA virtual environment simulation system have been presented. The case study presented in this paper (see http://www.tcm.hut.fi/Research/DIVA/ and http://www.medios.fi/pes/neu/index.htm for details) was featured as an interactive real-time performance in the Electric Garden at SIGGRAPH'97 (http://www.siggraph.org/s97). A demonstration illustrating the current system and the results of the case study will be given at ICAD'97.
Figure 3: Overview of the DIVA system configuration.