Basic Understanding of Virtual Reality Fundamentals





























Virtual Reality Training Manual




Version 1.0



Mike Brown














1        Table of Contents



1     Table of Contents
2     Introduction
2.1     Background
2.2     Goals
3     Training Agenda
3.1     Understanding Basic Principles
3.2     Modeling Techniques - I
3.3     Modeling Techniques – II
3.4     Scene (Run-time) Development
3.5     Behavioral Development
3.6     Facility Roles and Responsibilities
4     Fundamentals of Human Perception
4.1     Visual
4.2     Audio
4.3     Tactile / Haptics
4.4     Smell (Olfactory)
4.5     Taste
4.6     Vestibular
5     Fundamentals of Virtual Reality
5.1     A definition of Virtual Reality (What is VR?)
5.1.1       Virtual Environments
5.1.2       Augmented Reality terminology
5.1.3       Interactive Computer Graphics vs. Virtual Reality
5.2     History of VR
5.3     Photo-based VR
5.3.1       QTVR Panoramic Movies
5.3.2       QTVR Object Movies
5.4     Model-based VR Geometry
5.4.1       Modeling Philosophy
5.4.2       Modeling Techniques
5.5     Model-based VR Rendering
5.6     VR Resources
5.7     VR System Diagram
5.8     VR Hardware
5.8.1       Image Generators
5.8.2       Display Systems
5.8.3       Tracking Systems
5.8.4       Sound Systems
5.8.5       Interactive Devices
5.8.6       Computing Environment
5.8.7       Network Capability
5.9     VR Software
6     VR Applications / Industry
6.1     Architecture
6.2     Manufacturing
6.3     Advanced Vehicle to Terrain Simulation
6.4     Medical
6.5     GeoScience
6.6     Chemical & Specific sciences
6.7     Aerospace
6.8     Product Evaluation
6.9     Training & Education
6.10       Entertainment
6.11       Augmented Reality
7     VR Market and Demographics
8     3D Modeling
8.1     Spatial Modeling
8.2     Visualization
8.2.1       Image (or Texture) Mapping
8.2.2       Reflection Mapping
8.2.3       Bump Mapping
8.2.4       Opacity and Other Mappings
8.2.5       Lighting
8.3     Tessellation
9     Data Translation
9.1     Translation Methods
9.2     Model Optimization
10      Scene Development
10.1       Scene Graph Basics
10.2       Scene Graph Construction
10.3       Scene (Run-time) Optimization
10.3.1      Tips on Lowering Polygon Count
10.3.2      Lower the Virtual Texture Resolution
10.4       Level-Of-Detail (LOD)
10.5       Visibility Culling
10.6       Collision
10.7       Audio
10.8       Behavioral Development
10.9       Camera / Animations
10.10     Multiprocessing Presentations
10.11     Collaboration and Distance Learning
10.12     Scene Description Language
10.12.1        OpenGL
10.12.2        VRML
10.12.3        X3D
10.12.4        JTOpen
10.12.5        Others
10.12.6        Shaders
11      VR Facility Roles and Responsibilities
11.1       VR Facilitator
11.2       Translation Specialist
11.3       Modeling Specialist
11.4       Simulation Specialist
11.5       Application Development Specialist
11.6       System Administration
11.7       Maintenance / Facility Administration (Production Support)
11.8       Training Specialist
12      Appendix A: Glossary
13      Appendix B: Eye Terminology
14      Appendix C: Ear Terminology
15      Reference Bibliography



2        Introduction


2.1      Background


This training course uses lectures and exercises to teach how Virtual Reality can benefit many tasks in today's business environment.  The course can be divided into sessions that together build a comprehensive understanding of how the technology can be used.


Virtual Reality (VR) is a powerful context in which you can control time, scale, and physical laws. Participants have unique capabilities, such as the ability to fly through the virtual world, to occupy any object as a virtual body, and to observe the environment from many perspectives. The ability to understand multiple perspectives is both a conceptual and a social skill; enabling users to practice this skill in ways we cannot achieve in the physical world may be an especially valuable attribute of VR.


What is VR?


Here is a quick look at how several researchers have defined VR in their own terms.


“Virtual Reality is a high-end user interface that involves real-time simulation and interactions through multiple sensorial channels.  These sensorial modalities are visual, auditory, tactile, smell, taste, etc . . .”

 - Grigore Burdea (Doctor of Computer Science at Rutgers University, Computer Science Department)


“. . . Multi-sensorial interaction produces immersion or the ‘suspension of disbelief.’ Imagination is related to the brain ability to compensate for system shortcomings and to the developer ability to create useful VR applications.” – Burdea summarizes this view of what VR is in his VR Triangle, or I3 (Immersion, Interaction, Imagination).


[Figure: Virtual Reality Triangle, with Immersion, Interaction, and Imagination at its corners]


“Virtual Reality is a way for humans to visualize, manipulate and interact with computers and extremely complex data” – Jerry Isdale (Research Staff Scientist - Human Centered Systems Lab of HRL Labs)


“The immersion of people into a computer-generated environment.” – Resource for Science Education Program




While there are many ways to articulate a definition of Virtual Reality, VR should not be seen as a single tool or process that answers every problem.  In this course, we will review the history and thinking behind the development of virtual environments, which can offer a new perspective on specific industrial knowledge or even on “life” itself.  We will focus on the basic fundamentals of creating a successful VR system: understanding its specific components and recognizing the value of particular applications across many industries.


While VR technologies and processes are constantly developing, progress has at times been slowed by barriers.  During the early 1990s VR was clearly oversold, leading to customer dissatisfaction.  Many small VR companies disappeared or were absorbed by very large corporations that thought they understood how this new technology could create more business and profit for them.  However, new developments in PC hardware, reductions in cost, progress in input/output devices, and the introduction of large volume displays led to a rebirth during the late 1990s.


Why use VR?


While there are many questions about how this technology works and how it can help you, we often forget the question that should be asked first: Why?


There are indeed many good technologies and techniques, usable within any given industry, that do not require any form of human interaction.  Simulation, for example, predicts the outcome of complex situations and can be used on a daily basis without human involvement.  So why do we need VR?


Would a student learning to drive gain an advantage from VR over the kind of driving simulation that some American teenagers used back in the 1970s?  As the author, I had first-hand experience in 1977 with a driving simulator that gave me visual cues about what to look out for in certain driving situations.  I drive rather well now, without ever having used VR.  Would VR have made me a better driver?  Who knows?


Some simulations can get very expensive.  For example, new personnel for ocean oil rigs were once trained on an oil rig in a Louisiana lake that cost millions of dollars to maintain and yet reproduced only a few aspects of working on a rig at sea.  A virtual environment of an oil rig lets the same trainees experience many different situations and ocean conditions at far less cost than a full-scale model.


Interactive training can give trainees the sense that their actions change the outcome.  This is especially valuable for dangerous or cumbersome exercises: it can reduce the time technicians are exposed to radiation, it can replace practice on the real equipment when driving or maintaining a large vehicle or piloting a ship, and it may help trainees develop a “mental model” of complex situations.


Having said all of that - VR has been quite successful with many applications in specific industries. 


“You probably can not buy a vehicle that was not designed in VR or fill it up with gas without the aid of VR.” - Jaron Lanier (Research Scientist for VPL, coined the phrase “Virtual Reality”, 1982)


One example of a successful application: transportation corporations are progressively reducing design costs by using VR methods to gradually eliminate expensive physical prototypes.


VR has given us a new way to observe specific problems.  Perhaps, as the physical constraints on our thinking are progressively lifted, there will be more opportunities to expand our knowledge and so discover new methods and processes for delivering a better product, process, or solution.



Where to use VR?


Virtual Reality can be quite a useful tool as long as the customer or application does not have a preconceived notion that this technology can do everything and anything.


Using VR allows for exploration of:


·         places that would otherwise be inaccessible because of location or danger, for example when studying volatile weather patterns, mining exploration, nuclear energy catastrophes, and oceanic or atmospheric research,


·         abstract concepts that are otherwise difficult to conceptualize,


·         real systems that could not be examined without altering scale and time.





Next, let us understand what the goals of this course are with the following section.


2.2      Goals


The objective of this course is for participants to understand, and be able to demonstrate, basic Virtual Reality techniques and methods that can be applied repeatedly to specific industrial applications.


One of the many goals is to understand the rudiments of Human perception.  We will explore the different senses that we humans use while considering our perceptions and how we use them in our surroundings.


Next we will examine the fundamentals of Virtual Reality systems by defining in more detail what VR is and what makes up a virtual environment.  By taking some of the technology out of the research laboratory and into our lives, we can see how Virtual Reality may augment other applications through what is known as “Augmented Reality”.  We will also take a quick look at the differences between interactive computer graphics applications and virtual reality.


Understanding the history of virtual reality, and the thinking behind its development, can help us grasp why virtual reality was started in the first place.


We will briefly look at different VR modeling approaches, techniques, and processes, and at the languages developed for these virtual environments.


We will learn what VR resources are available today, and possibly in the near future, by looking at possible system architectures and the many components that make VR exciting not only to see but to be immersed in.


We will show some examples of how you can apply this knowledge to many applications in many industries.


By that point, we should have a solid understanding of how, why and where VR can be deployed.  Developing a 3D model and a 3D environment can be cumbersome and thought provoking, so we will look at different 3D modeling techniques and at how to visualize the resulting models.  We will also look into optimizing these models and environments, which can significantly change the performance of the environments.


An environment may go by different names, such as a Scene Graph.  We will investigate different construction and optimization approaches for developing an efficient, effective and inexpensive environment.


Prior to creating the final presentation of a specific application, you will want to understand some of the scene modeling techniques that can make a great impact on the outcome of your presentation.  We will discuss different behaviors that can be applied to most applications and see how to make animations or simulations of your virtual world.


Some applications require a large team to handle the complexity and wealth of information.  We will look at the different roles that can make a rather large system controllable and efficient.











The appendix sections go into more detail on specific terminology that you may want to investigate.

3        Training Agenda


Discuss how each topic will be divided into several sessions over a predetermined amount of time, using lecture-style teaching and exercises on the material discussed.


3.1      Understanding Basic Principles

Discuss Fundamentals of Human Perception.  Define what Virtual Reality (VR) is.  Demonstrate different techniques and processes using VR tactics and machinery.  Discuss how VR can be marketable and show how specific industries are using these products and techniques.

3.2      Modeling Techniques - I

Discuss basic 3D modeling techniques such as spatial modeling, visualization characteristics, and optimization processes.

3.3      Modeling Techniques – II

Expand on visualization techniques such as textures, lighting, and shaders.

3.4      Scene (Run-time) Development

Discuss the development of large-scale environments, including material properties and how materials behave toward one another, physics attributes of static and dynamic machinery, and optimization.

3.5      Behavioral Development

Discuss techniques for developing behaviors designed for a specific purpose and demonstrate possible methods for developing them.

3.6      Facility Roles and Responsibilities

 Discuss the current roles and responsibilities that make the facility work and show how they can be integrated with real-world industrial roles such as designing vehicles or developing a digital manufacturing plan.



4        Fundamentals of Human Perception


Discuss some of the intricacies of human perception, including vision, hearing, touch, smell, taste, and the vestibular sense of balance.

4.1      Visual

Anatomy of the Eye  


The eye is a complex sensory organ specialized for the gathering of visual information. Each eye includes a lens system to focus the image, a layer of photosensitive cells, and a network of cells and nerves that collect, process and transmit visual information to the brain, all surrounded by a fibrous protective globe. The eyes are housed in protective bony structures of the skull called the orbits. Each eye is composed of a tough outer layer, the sclera and cornea; a middle layer, the choroid, ciliary body and iris; and the inner layer of nerve tissue called the retina. The photosensitive retina connects to the brain via the optic nerve.


Illustration by Mark Erickson


When you look at an object, light rays are reflected from the object to the cornea, which is where the miracle begins.  The light rays are bent, refracted and focused by the cornea, lens, and vitreous. The lens' job is to make sure the rays come to a sharp focus on the retina. The resulting image on the retina is upside-down.   Here at the retina, the light rays are converted to electrical impulses that are then transmitted through the optic nerve, to the brain, where the image is translated and perceived in an upright position. [1]


For a more descriptive understanding of how the eye works and its terminology look at Appendix B.


Dr. Oliver Staadt at the University of California, Davis has written on many topics in human factors and perception.  Here are some of his conclusions on visual perception. [2]


Light enters the cornea, passes through the pupil, lens, vitreous humor, and nerve fibers and blood vessels at the front of the retina.  Photons trigger nerve impulses in our photoreceptors, also known as Rods and Cones.


Our cones perceive color across the visible spectrum, from 400 to 700 nm.  In low-light conditions our visual field appears colorless.


Our rods perceive brightness in low light, which is described as scotopic vision.  Our eyes adjust more quickly to bright light than to dim light: adapting to bright conditions can take just a few minutes, while reaching the greatest sensitivity in the dark may take up to an hour.


The interpupillary distance (IPD) for most male adults falls between 50-76 mm.  This is an important topic when discussing Virtual Reality’s visual technologies.
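
To illustrate why IPD matters for stereo displays, here is a minimal sketch in Python. It assumes a simple model in which each virtual eye sits half the IPD to either side of the head center, along the viewer's right-pointing axis. The function name, the coordinate convention, and the 64 mm example value are illustrative assumptions, not taken from the course material.

```python
def eye_positions(head_center, right_axis, ipd_mm):
    """Return (left_eye, right_eye) positions in meters.

    head_center: (x, y, z) midpoint between the eyes, in meters.
    right_axis:  unit vector pointing to the viewer's right.
    ipd_mm:      interpupillary distance in millimeters.
    """
    half = (ipd_mm / 1000.0) / 2.0  # half the IPD, converted to meters
    left = tuple(c - half * a for c, a in zip(head_center, right_axis))
    right = tuple(c + half * a for c, a in zip(head_center, right_axis))
    return left, right


# Example: a viewer whose eyes are 1.7 m above the floor, with a 64 mm IPD
# (an assumed value inside the 50-76 mm range quoted above).
left_eye, right_eye = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0), ipd_mm=64)
print(left_eye, right_eye)
```

Rendering the scene once from each of these two positions produces the pair of slightly different images that the visual system fuses into a single image with depth.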


Within the field of view, the eyes can rotate comfortably about 40 degrees from center.  A single eye covers about 150 degrees horizontally and about 120 degrees vertically.  Both eyes together cover around 180 to 200 degrees horizontally, with about 30 to 35 degrees at each side seen by one eye only (monocular vision).


Our eyes have a resolution of about 1 arc minute and can discern 0.01-inch detail at a distance of 3 feet.
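
The two figures above are consistent with each other, which a short calculation can confirm. The following is a minimal sketch in Python (the function name is illustrative, not from the course material) that converts an angular resolution into the smallest detail visible at a given viewing distance.

```python
import math


def smallest_detail(distance, acuity_arcmin=1.0):
    """Smallest resolvable detail, in the same units as `distance`.

    acuity_arcmin: angular resolution of the eye in arc minutes
                   (1 arc minute = 1/60 of a degree).
    """
    angle_rad = math.radians(acuity_arcmin / 60.0)
    # Width subtended by the angle at the given viewing distance.
    return 2.0 * distance * math.tan(angle_rad / 2.0)


detail_inches = smallest_detail(36.0)  # 3 feet = 36 inches
print(round(detail_inches, 4))
```

For a viewing distance of 3 feet this gives roughly 0.0105 inch, in line with the 0.01-inch figure quoted above.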


Human vision is sensitive to flickering images; flicker typically fuses into a steady image at around 60 Hz.  Any screen refresh rate should meet or exceed this value, which is necessary for visual fusion.


Many studies have tried to explain visual depth perception.  Depth cues are usually grouped into categories: monocular (extrastereoscopic) cues, oculomotor cues, stereopsis, and additional cues such as motion.


Monocular depth cues include linear perspective and the change in size of a moving object.  Interposition covers occlusion, obscuration, and contour interruption.  Textural gradients are changes in the apparent density of surface patterns on an object.  Proximity-luminance covariance means that brightly illuminated objects are generally perceived as closer than dimly illuminated objects.


Other monocular cues include aerial perspective, the color attenuation caused by scattering and absorption of light by the atmosphere: nearer objects have more color saturation and brightness contrast.  Shadows convey depth to a viewer.  Highlighting is light reflected off certain curved surfaces.  Another cue is the difference in apparent size of two objects believed to be similar or the same, or knowledge of an object’s true size.  Also, objects higher in the visual field are commonly farther away.


Oculomotor cues come from the muscles of the eye.  Accommodation is the physical stretching and relaxing of the lens in the eye; the range of accommodation decreases with age.  Vergence is the rotation of the eyes in opposite directions.  Convergence is the inward rotation of the eyes, corresponding to viewing an object moving toward the viewer.  Muscular feedback is also considered an oculomotor cue.


Stereopsis relies on binocular disparity: because the eyes are separated, a different image arrives at each eye, and the difference is used to extract depth information.  If the images aren’t properly fused, binocular rivalry may occur and the visual system may suppress one of the two images.


Motion is an additional cue that provides information about the distance of objects, especially when stereopsis doesn’t provide any.  Motion parallax is one example: when driving, near objects viewed from the side window seem to move faster than far objects.  Rotating objects provide another motion cue.


Image Courtesy of The Journal of Neuroscience




Here is a test of viewing stereo without any device.  Stare at the two red dots until you see a new red dot between them, then focus on that center dot.  You should be able to see three pictures, with the fused center picture in focus and the other two in your periphery.  If you are left-eye dominant, the top two pictures should suit you best; if you are right-eye dominant, the bottom two should work best.  Try both to test your ability to see stereopsis.


[Figure: stereo image pairs for the free-viewing test, each made up of a left image and a right image]


California Dreaming (September 26th, 1990)
Photo by Alexander Klein                                                                               



4.2      Audio

Discuss the properties of human hearing.



Anatomy of the Ear [3]                                                                                                                                                

The ear is the organ of hearing and balance and consists of three parts: the outer ear, the middle ear, and the inner ear. The outer ear and middle ear are the apparatus for the collection and transmission of sound. The inner ear is responsible for analyzing sound waves, and also contains the mechanism by which the body keeps its balance.


Based on the findings of Dr. Oliver Staadt of University of California, Davis we see that auditory perceptions can be defined as the following.


Air vibrations (rapid changes in air pressure) are converted to mechanical vibrations in the middle ear.  The acoustical characteristics of sound are its amplitude (the magnitude of the pressure variation), its frequency (the rate of the pressure variation) and its phase.  The acoustic reflex is an adaptation to high-intensity sounds in which auditory sensitivity is temporarily reduced.  Acoustic stimuli also need a temporal component: constant sounds drop out of conscious awareness.  Sounds are usually perceived from sources in all directions.


Another aspect of audio perception is auditory localization, the process by which we determine the location of a sound’s source.  The interaural level difference arises because sound is louder in the ear nearest the source, while the head obstructs and attenuates the sound waves reaching the far ear.


The Interaural time difference and phase difference are simply the difference in sound arrival times at the two ears and the difference in sound wave phase at each ear.
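
As a rough illustration of these two cues, the following Python sketch estimates the interaural time and phase differences for a distant source. It assumes a simple straight-path model (path difference = ear separation × sin of the azimuth), a speed of sound of 343 m/s, and an ear separation of 0.18 m. The model, both values, and the function names are assumptions made for illustration and do not come from the course material.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature


def interaural_time_difference(azimuth_deg, ear_separation_m=0.18):
    """Arrival-time difference between the ears, in seconds.

    azimuth_deg: 0 = source straight ahead, 90 = source directly to one side.
    ear_separation_m: assumed distance between the two ears.
    """
    path_difference = ear_separation_m * math.sin(math.radians(azimuth_deg))
    return path_difference / SPEED_OF_SOUND_M_S


def interaural_phase_difference(azimuth_deg, frequency_hz, ear_separation_m=0.18):
    """Phase difference of a pure tone between the two ears, in radians."""
    itd = interaural_time_difference(azimuth_deg, ear_separation_m)
    return 2.0 * math.pi * frequency_hz * itd


itd_side = interaural_time_difference(90.0)
print(f"ITD for a source directly to one side: {itd_side * 1000:.2f} ms")
```

Under these assumptions, a source directly to one side gives a time difference of about half a millisecond, while a source straight ahead gives none.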


Pinna filtering means that the outer ear shapes incoming sound in ways that depend on its direction and distance, and the resulting distortions carry directional and distance information.  As a result, partially deaf individuals (with one functioning ear) can still localize sounds.


Motion cues are other examples of auditory cues.  They include the “Doppler Effect” (a frequency shift resulting from relative motion between the sound source and the observer) and changing volume (a sound is perceived as approaching when its volume gradually increases and as receding when it decreases).
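
The Doppler frequency shift can be sketched with the standard formula for sound in still air, f_observed = f_source × (c + v_observer) / (c − v_source), where speeds are positive when source and observer move toward each other. The Python below is a minimal illustration; the function name and the 343 m/s speed of sound are assumptions, not part of the course material.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature


def doppler_frequency(source_hz, source_speed=0.0, observer_speed=0.0,
                      c=SPEED_OF_SOUND_M_S):
    """Frequency heard by the observer, in Hz.

    source_speed:   speed of the source toward the observer (m/s),
                    negative if it is moving away.
    observer_speed: speed of the observer toward the source (m/s),
                    negative if moving away.
    """
    if abs(source_speed) >= c:
        raise ValueError("source speed must be below the speed of sound")
    return source_hz * (c + observer_speed) / (c - source_speed)


approaching = doppler_frequency(440.0, source_speed=20.0)
receding = doppler_frequency(440.0, source_speed=-20.0)
print(round(approaching, 1), round(receding, 1))
```

With these assumed values, a 440 Hz source approaching at 20 m/s is heard at roughly 467 Hz, and the pitch drops below 440 Hz once the source moves away.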





4.3      Tactile / Haptics

Discuss tactile feedback through the human sense of touch.


Touch is often referred to as a haptic sense, from the Greek word “haptesthai”, meaning to grasp or touch.  Another Greek word, “haptikos”, means “being able to come into contact with”.


In psychophysics, the haptic system is defined as a perceptual channel based on the combined input from the skin and from the joints. The haptic system is an apparatus by which the individual gets information about the environment and its body; the person feels an object relative to the body and the body relative to the object. [4]

The haptic system, hence, can be understood as the union of the tactile and kinesthetic senses used in mechanical interaction with one's environment.

In human-computer interaction, haptic feedback means both tactile and force feedback.  Tactile, or touch feedback is the term applied to sensations felt by the skin.  Tactile feedback allows users to feel things such as the texture of surfaces, temperatures and vibration.  Force feedback reproduces directional forces that can result from solid boundaries, the weight of grasped virtual objects, mechanical compliance of an object and inertia.[5]

Five distinct skin senses can be distinguished: the senses of touch, pressure, vibration, coldness, and warmth.

Two more senses, the “sense of position” and the “sense of force” are related to the proprioceptors.  The proprioceptors are receptors (special nerve-cells receiving stimuli) within the human body. They are attached to muscles, tendons, and joints. They measure for example the activity of muscles, the stressing of tendons, and the angle position of joints.

Kinesthesis is the perception that enables a person to perceive the movements of his or her own body. It is based on feedback reported to the brain about:

·         angle of joints

·         activities of muscles

·         head movements (reported by the vestibular organ within the inner ear)

·         position of the skin, relative to the touched surface

·         movements of the person within the environment (visual kinesthesis)

Kinesthesis supports the perception of the sense organs. If information delivered by a sense organ and by kinesthesis is contradictory, the brain will prefer the information coming from the sense organ.

In order to correctly design a haptic interface for a human, the anatomy and physiology of the body must be taken into consideration.  In force feedback, the proportions and strengths of average joints must be considered. Since the hands are most often used for haptic interfaces, the properties of the hand should be considered when designing a new interface.  In tactile feedback, the interface must track several variables of the human sense of touch. The fingers are among the most sensitive parts of the surface of the skin, with up to 135 sensors per square centimeter at the fingertip.[6]  Also, the finger is sensitive to vibrations of up to 10,000 Hz when sensing textures, and is most sensitive at approximately 230 Hz.  The fingers cannot distinguish between two force signals above 320 Hz; they are just sensed as vibrations.  Forces on individual fingers should be less than 30-50 N total.  For the “average user”, the index finger can exert 7 N, the middle finger 6 N, and the ring finger 4.5 N without experiencing discomfort or fatigue.
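
One way a force-feedback interface might respect these comfort limits is to cap the force it asks each finger to resist. The Python below is a minimal sketch of that idea using the per-finger figures quoted above; the table name, the function name, and the clamping strategy itself are illustrative assumptions rather than a method described in the cited studies.

```python
# Sustained force (in newtons) each finger can exert without discomfort
# or fatigue, taken from the figures quoted in the paragraph above.
COMFORT_LIMIT_N = {"index": 7.0, "middle": 6.0, "ring": 4.5}


def clamp_finger_force(finger, requested_n):
    """Limit a requested feedback force magnitude to the finger's comfort level.

    finger:      one of the keys in COMFORT_LIMIT_N.
    requested_n: non-negative force magnitude the simulation asks for.
    """
    if requested_n < 0:
        raise ValueError("force magnitude must be non-negative")
    return min(requested_n, COMFORT_LIMIT_N[finger])


print(clamp_finger_force("index", 10.0))  # capped at the index-finger limit
print(clamp_finger_force("ring", 3.0))    # already within the limit
```

A real device would also need to consider how long a force is applied and how often it changes, which this sketch ignores.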

Humans are very adept at determining if a force is real or simulated. In an experiment conducted by Edin et al., [7] a device was used to determine how humans reacted when they sensed that an object they were holding began to slip. The device consisted of a solenoid attached to a metal plate, which was allowed to slide when the solenoid was turned off. None of the subjects were “tricked” into believing that the object was actually slipping. They all noted that “something was wrong with the object”, but none commented that the object behaved as if it were slippery.

Studies show that there is a strong link between the sensations felt by a human hand, such as an object slipping, and the motions the hand was going through to acquire that knowledge, such as holding an experimental apparatus.[8]  The human haptic system is made up of two sub-systems, the motor sub-system and the sensory sub-system. There is a strong link between the two systems. Unlike the visual system, it is not only important what the sensory system detects, but what motions were used to gain that information.

Humans use two different forms of haptic exploration: active and passive.  Active haptic exploration is when the user controls their actions.  Passive haptic exploration is when another person guides the hand or finger of the user. When the user is in control they often make mistakes. In the case of two-dimensional exploration the most common mistake is that of wandering off of a contour and the user must spend a large amount of effort to stay on the contour.  However, when the subject is being guided, the entire interest can be devoted to identifying the object represented.

Many features can be identified more readily with passive haptic exploration.  Experiments comparing the accuracy of active versus passive tactile stimulations show that passive haptics are more accurate at identifying features as a whole.  When a subject's finger was guided around a two-dimensional object, such as the profile of a swan, they were more likely to be able to identify the object.  Some studies point out that active observers make more distracting 'errors', and may have difficulty differentiating between the erroneous paths and the correct paths of exploration.[9]

Image Courtesy of Optical Engineering Reports, 1999






4.4      Smell (Olfactory)

Olfaction is defined as the act of smelling, “whereas to smell is to perceive the scent of something by means of the olfactory nerves. Odorants are substances whose characteristics can be determined by chemical analysis.” [10]  A person’s olfactory system operates in a fashion similar to other sensing processes in the body.  Airborne molecules of volatile substances come in contact with the membranes of receptor cells located in the upper part of the nostrils.  The olfactory epithelium, the smell organ, covers a 4-10 cm^2 area and consists of 6-10 million olfactory hairs, or cilia, that detect the smells of different compounds.  Excited receptors send pulses to the olfactory bulb, a part of the cortex, with a pattern of receptor activity indicating a particular scent. Because the airways are bent, the airflow past the receptors is normally low, so we sniff to get a better sensation (see figure below). [11]


Image Courtesy of University of Washington (Human Interface Lab)


Anatomy of the Olfactory system


In addition to the cilia, the fifth cranial nerve (trigeminal) has free nerve endings distributed throughout the nasal cavity. These nerve endings serve as chemoreceptors and react to irritating and burning sensations. The trigeminal nerve connects to different regions of the brain and provides the pathway for initiation of protective reflexes such as sneezing and interruption of inhalation.


“Basic units of smell are caprylic, fragrant, acid, and burnt”. [12]  Other terms that have been used are fruity, spicy, floral, and green. In general the description of odors is limited to adjectives, in contrast to the rich vocabulary used to report visual stimuli. “Olfaction is also similar to the visual and auditory modalities with regard to age-related changes, in that olfactory sensitivity deteriorates with age. Peak performance in olfactory identification is reported to occur between the third and fourth decades of life”. [13]


Humans fare pretty well with odor detection when other senses, like vision, are used alongside smell. Without cues from the other senses, subjects in one study failed to correctly identify two out of three odors. Furthermore, it is not uncommon to have a high rate of false detection, reporting smells that in fact are not present.

“Odors can be used to manipulate mood, increase vigilance, decrease stress, and improve retention and recall of learned material". [14]  Various scents have been shown to improve tasks performed by subjects. [15]  Even the suggestion of a smell can lead to reactions as if the odor was present.

4.5      Taste


Without going into too much detail on the complexity of the mouth and how it works with the brain to react to specific tastes, let us simply discuss what “mouth-feel” is by looking at an article describing the development of a virtual mouth.




This article discusses the possibilities of conquering the “Sense of Taste.”

Taste combines the feel of food in the mouth with chemical and even auditory cues.  Hiroo Iwata of the University of Tsukuba in Japan and colleagues call it the "last frontier of virtual reality".

But it is a frontier they have now crossed. "The food simulator is the first media technology that is put into the mouth," says Iwata.

Before simulating a foodstuff, the team first measures and records various phenomena associated with chewing it. One such parameter is the force required to bite through a piece of the food. A thin-film force sensor placed in a subject's mouth records these forces.

Biological sensors made of lipid and polymer membranes record the major chemical constituents of the food's taste. And a microphone records the audible vibrations produced in the jawbone while chewing.

These parameters serve as inputs to the food simulator. The mechanical part of the simulator that is inserted into the mouth has cloth and rubber covers, mainly for sanitary reasons, and is intended to resist the user's bite in a similar way to the real foodstuff. When someone chews the device, a sensor registers the force of the bite and a motor provides appropriate resistance.

To augment the experience, a thin tube squirts a mixture of flavorings onto the tongue. The chemicals stimulate the five basic taste sensations: sweet, sour, bitter, salty and umami - the taste of monosodium glutamate. Meanwhile, a tiny speaker plays back the sound of a chewing jawbone in the user's ear.

Iwata says that his team has successfully simulated many foods, including cheese, crackers, confectionery and Japanese snacks. One remaining step still to be tackled is to use a vaporizer to deliver appropriate smells to the nose.

The researchers say their device is perfect for people designing new foods, and may even allow young designers to experience the difficulty older people may face in chewing food.

The technology can also be entertaining - for the researcher at the controls, at least. By suddenly changing the properties of a food in mid-chew - from a cracker to a jelly, say - the result is uniquely funny, says Iwata.
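The bite-resistance loop described in the article can be illustrated with a short sketch. This is not the researchers' implementation: the force values, the idea of indexing the recorded profile by bite depth, and the names CRACKER_PROFILE and resistance_for_depth are all assumptions made for illustration only.

```python
from bisect import bisect_left

# Hypothetical recorded profile: (bite depth in mm, force in newtons).
# A cracker resists, then "snaps" and offers little resistance afterwards.
CRACKER_PROFILE = [(0.0, 0.0), (1.0, 12.0), (1.5, 2.0), (4.0, 1.0)]


def resistance_for_depth(profile, depth_mm):
    """Interpolate the recorded force at the current bite depth.

    The result is the resistance a motor would be asked to apply
    at this point in the chew.
    """
    depths = [d for d, _ in profile]
    forces = [f for _, f in profile]
    # Clamp outside the recorded range.
    if depth_mm <= depths[0]:
        return forces[0]
    if depth_mm >= depths[-1]:
        return forces[-1]
    # Find the two recorded samples that bracket the current depth.
    i = bisect_left(depths, depth_mm)
    d0, d1 = depths[i - 1], depths[i]
    f0, f1 = forces[i - 1], forces[i]
    t = (depth_mm - d0) / (d1 - d0)
    return f0 + t * (f1 - f0)


if __name__ == "__main__":
    for depth in (0.5, 1.0, 1.25, 3.0):
        print(depth, resistance_for_depth(CRACKER_PROFILE, depth))
```

Swapping in a different profile in mid-chew (a cracker for a jelly, say) would only require passing a different list to the same function.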



4.6      Vestibular

The vestibular system helps to resolve conflicts between other sensory systems.  Because there are no predefined limits on what a virtual environment may or may not be, there may be large discrepancies between the proprioceptive feedback and the visual input to the nervous system. The receptors in the utricle and saccule respond to the pull of gravity and to the inertial movement caused by linear acceleration and deceleration. [16]  The receptors in the ampullae of the semicircular canals respond to rotation of the head, i.e. angular acceleration and deceleration. The vestibular nuclei have connections with the cerebellum and reticular formation.

Balance Control (Image Courtesy of On Balance Clinical Information) [17]

Cyber sickness symptoms can be divided into two independent classes. The first group of symptoms arises from the disruption of perceptual and sensorimotor activities involving the vestibular system, such as disorientation, disequilibrium, and inappropriate vestibular-ocular or vestibular-spinal reflexes. [18]  The second group consists of a largely autonomic response, including drowsiness, salivation, sweating and vomiting, which also appears to have a perceptual origin since the virtual environment triggers it.

There are two theories concerning the causes of cyber sickness: the computational lag theory and the sensory conflict theory. They attribute cyber sickness to two different factors that act on the body and the nervous system in a very similar manner. However, even when the problems identified by one theory are resolved, some subjects still experience cyber sickness. The only individuals who never experience the problem are those with no functioning vestibular system. Because the limitations of computing power can be overcome, eliminating computational lag is only a matter of cost. Since cyber sickness persists even then, the sensory conflict theory is the more plausible of the two at this time.  The question, therefore, is whether we can train individuals to tolerate cyber sickness.

Almost all individuals eventually adapt to motions or situations that initially provoke sickness; continued exposure to a particular nauseogenic environment leads to a gradual reduction in the disorientation and associated symptoms.  Research has shown that both specific and general tolerance of motion environments, such as virtual reality, can be learned through individual training.

5        Fundamentals of Virtual Reality


In this section we will look at the history of virtual reality and at the technologies, techniques and processes that make up most current virtual reality systems.



Global Visualization Center(GVC)

Image Courtesy of General Motors (Global Visualization Center)



5.1      A definition of Virtual Reality (What is VR?)


Virtual Reality denotes a real-time graphical simulation with which the user interacts via some form of analog control, within a spatial frame of reference and with user control of the viewpoint's motion and view direction. This basic definition is often extended by provision for multimedia stimuli (sound, force feedback etc.), by consideration of immersive displays- that is, displays which monopolize one or more senses, and by the involvement of multiple users in a shared simulation. [19]


A typical VR System Architecture Diagram (University of Washington)


As a technology matures, the demands on the performance of key components increase. In the case of computer technology, we have passed through massive mainframes to personal computers to powerful personal workstations. A growth in complexity of software tasks has accompanied the growth of hardware capabilities. At the interface, we have gone from punch cards to command lines to windows to life-like simulation. Virtual reality applications present the most difficult software performance expectations to date. VR challenges us to synthesize and integrate our knowledge of sensors, databases, modeling, communications, interface, interactivity, autonomy, human physiology, and cognition -- and to do it in real-time.

VR software attempts to restructure programming tools from the bottom up, in terms of spatial, organic models. The primary task of a virtual environment operating system is to make computation transparent, to empower the participant with natural interaction. The technical challenge is to create mediation languages that enforce rigorous mathematical computation while supporting intuitive behavior. VR uses spatial interaction as a mediation tool. The prevalent textual interface of command lines and pull-down menus is replaced by physical behavior within an environment. Language is not excluded, since speech is a natural behavior. Tools are not excluded, since we handle physical tools with natural dexterity. The design goal for natural interaction is simply direct access to meaning, interaction not filtered by a layer of textual representation. This implies both eliminating the keyboard as an input device, and minimizing the use of text as output.

The figure below presents a functional architecture for a generic VR system. The architecture contains three subsystems: transducers, software tools, and computing system. Arrows indicate the direction and type of dataflow.  Participants and computer hardware are shaded with multiple boxes to indicate that the architecture supports any number of active participants and any number of hardware resources.  Naturally, transducers and tools are also duplicated for multiple participants.

Image Courtesy of University of Washington


Virtual Reality has gone by many names, e.g., Virtual Environments, Virtual Worlds, Computer Generated Environments, Computer Simulated Environments, Synthetic Environments, Spatial Immersion, CyberSpace, Virtual Presence, and Metaverse.  This section discusses some of these technologies through the history of computer graphics, along with the components and products that can help create a Virtual World, to give a solid understanding of what is available today.


We begin with a brief look at what is considered a Virtual Environment (VE) and the components that contribute to the perception of a different kind of world.  Next we discuss how VR can be used beyond the virtual environment and how VR compares to other existing technologies.


5.1.1      Virtual Environments


Most Virtual Reality Systems or Virtual Environments consist of these elements:


·         Three dimensional geometry

·         Real time Interactivity (time–crucial)

·         Viewer-Centered Perspective

·         Wide field of view

·         High-Resolution Stereoscopic Displays

·         Interactive Experiences

·         Multisensory Environment


5.1.2      Augmented Reality terminology




Telepresence


·         Immersion in an actual remote environment

·         Telerobotics: controlling robots at a distance

·         Telesurgery: performing surgical operations on humans remotely


Enhanced Reality


·         Computer graphics or text overlaid on a real image.

·         Assembly repair, or viewing obscured components of a design or assembly.

·         Superimposing sonogram or MRI data onto an image of a human body.


5.1.3      Interactive Computer Graphics vs. Virtual Reality


Virtual Reality systems are not Interactive Computer Graphics and here is why:


ICG                        VR

User is outside            User is inside
Screen = canvas            Display = Space
3D-to-2D environments      3D-to-3D environments
Tele-interactions          Direct manipulations
Single interactions        Simultaneous interactions
Visualizations             Sensorizations
Performance Tolerance      Performance is Critical
Interaction Awareness      Interaction Transparency
Tele-Scale                 True 1:1 Scale


So - how did VR get started?   Let us take a look at the history of this technology.

5.2      History of VR

The word "stereo" originates from the Greek "stereos", meaning "solid", and is used in the sense of "relating to space". Today, when we talk about stereo, we usually refer to stereophonic sound. Originally, the term was associated with stereoscopic pictures, which were either drawn or photographed. In order to avoid confusion with stereophonic sound, one now often talks about 3D pictures and especially 3D film, where 3D, of course, stands for three-dimensional.

A person lives in a three-dimensional, spatial environment. Without a feeling for space, we cannot move within it. Our perception of space is created almost exclusively by our eyes. There are many ways to orient oneself in space, e.g., by perspective, gradation of color, contrast and movement.

The lenses of the eyes in a healthy human being project two slightly different pictures onto the retinas, which are then transformed, by the brain, into a spatial representation. The actual stereoscopic spatial observation is a result of this perception through both eyes.

A number of otherwise healthy two-eyed people, however, have eye-defects since birth that make stereoscopic viewing impossible. As babies, they have, in the literal sense of the word, learned to "grasp" the world. They safely orient themselves in their environment by employing one of the other above-mentioned methods. Even a person with only one eye learns how to move around safely, using non-stereoscopic cues.

The normal picture on paper or film is only one-eyed. It is photographed with only one lens and can, therefore, not convey a true spatial perception. It is only a flat picture. But we do not have to abstain from the known natural effect. By taking two lenses and imitating the eyes, we can create such a space image.

When we examine a stereo picture created in this manner, with or without optical instruments, we form a similar perception of space in our mind.

The two necessary, somewhat different, single views can be generated by different methods. We can produce them as the old stereo artists did: first draw one single view, then the other. We may also take the two exposures one after the other with a normal single-lens camera. It is evident that the subject must not move during this procedure, otherwise the two pictures would be too different. A better approach is to imitate the head and mount both lenses in a common chassis. Now we have a true stereo camera. Basically it is only the joining of two mono-cameras. It is also possible to take stereo pictures with two coupled cameras. The two lenses can also be combined as interchangeable stereo optics in a single camera.

3D-Photography duplicates the way we view a 3D object or scene by taking a pair of photographs separated by a distance equal to the separation between a typical person's eyes. The two pictures then have a viewpoint similar to the view seen by the left and right eye. These images, if directed to the left and right eyes, are fused by the brain into a single image with the appearance of depth. Perhaps the most well known example of this is the View-Master™ many of us have played with as children (of all ages).
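The relationship between lens separation and perceived depth can be made concrete with a small calculation. The sketch below assumes an idealized pinhole stereo camera with parallel lenses; the 35 mm focal length is an arbitrary example value, the 63.5 mm baseline is simply 2.5 inches (a typical eye separation), and the function names are hypothetical.

```python
def disparity_mm(baseline_mm, focal_mm, depth_mm):
    """Horizontal offset on the film between the left and right image
    of a point at the given depth (pinhole model, parallel lenses)."""
    if depth_mm <= 0:
        raise ValueError("depth must be positive")
    return focal_mm * baseline_mm / depth_mm


def depth_from_disparity_mm(baseline_mm, focal_mm, disparity):
    """Inverse relation: recover depth from a measured disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_mm * baseline_mm / disparity


if __name__ == "__main__":
    baseline = 63.5   # 2.5 inches, roughly the distance between the eyes
    focal = 35.0      # assumed lens focal length in mm
    for depth in (500.0, 2000.0, 20000.0):
        print(depth, disparity_mm(baseline, focal, depth))
```

Nearby objects produce a large offset between the two pictures and distant objects almost none, which is why the two views of a stereo pair look nearly the same yet are not.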

Stereoscopy is the science and technology dealing with two-dimensional drawings or photographs that, when viewed with both eyes, appear to exist in three dimensions in space. A popular term for stereoscopy is 3D. Stereoscopic pictures are produced in pairs, the members of a pair showing the same scene or object from slightly different angles that correspond to the angles of vision of the two eyes of a person looking at the object itself.  Stereoscopy is possible only because of binocular vision, which requires that the left-eye view and the right-eye view of an object be perceived from different angles.  In the brain the separate perceptions of the eyes are combined and interpreted in terms of depth, of different distances to points and objects seen.  Stereoscopic pictures are viewed by some means that presents the right-eye image to the right eye and the left-eye image to the left.  An experienced observer of stereo pairs may be able to achieve the proper focus and convergence without special viewing equipment (e.g., a stereoscope); ordinarily, however, some device is used that allows each eye to see only the appropriate picture of the pair. To produce a three-dimensional effect in motion pictures, various systems have been employed, all involving simultaneous projection on the screen of left- and right-eye images distinguished by, for example, different color or polarization and the use by the audience of binocular viewing filters to perceive the images properly.  In holography the two eyes see two reconstructed images (light-interference patterns) as if viewing the imaged object normally, at slightly different angles. [20]



Here are some examples courtesy of Del Philips, from his article “A Visual History of the Stereoscope”, August 2000.



The stereopticon was in some ways the end-of-the-Nineteenth-century's equivalent of the end-of-the-Twentieth-century's VCR. Though not inexpensive, at least one of these entertainment devices was to be found in nearly every middle and upper class parlor of the time.



The basics of how the stereopticon and all other stereo viewing devices work were first laid out as far back as ancient Greece when Euclid explained the principles of binocular vision. He demonstrated that the right and left eyes see a slightly different version of the same scene and that it is the merging of these two images that produces the perception of depth.

Though some experiments in stereo viewing were conducted earlier (most notably pairs of "stereo" drawings made by the sixteenth century Florentine painter Jacopo Chimenti), the advent of photography really made widespread 3-D viewing possible. The first patented stereo viewer was Sir Charles Wheatstone's reflecting stereoscope in 1838. The device was a bulky and complicated contraption that utilized a system of mirrors to view a series of pairs of crude drawings. In 1844 a technique for taking stereoscopic photographs was demonstrated in Germany, and in Scotland David Brewster developed a much smaller and simpler viewer that utilized prismatic lenses.


By 1850 crude stereoscopes and glass views were available.  Sir David Brewster invented a box-shaped viewer that was popular at that time.  The Brewster viewer is one of the earliest means of viewing the stereograph. The design follows the requirements of the views being used at the time. The Brewster can be used for viewing tintypes, Daguerreotypes, glass views, tissue views and the early flat mount views made in quantity into the 1880s. Most Brewsters have an opaque glass at the back of the viewer to allow light to pass through glass and tissue views. The photo below shows the glass and the open top door with a mirror. The door can be adjusted as needed to point reflected light from the mirror onto non-transparent views.

                                                                          Figure History 1



Some views were hand tinted to provide color. Tissue views can be found tinted and pin pricked in places such as chandeliers or candle tips to simulate a flame. Some tissue views also contain unsuspected surprises when viewed at a light source, such as night and day or fire. The left half of the tissue view below is shown without light passing through the view, while the right half is viewed with a light source. On the right half, note the increase in quality: color has been added, and a chandelier at the left between the columns appears lit because the view has been pierced.

                                                                          Figure History 2

In 1859, Oliver Wendell Holmes developed a compact, hand-held viewer and Joseph Bates of Boston made improvements and manufactured them. With advances in photography a new industry and form of entertainment was created.  Stereo pictures are taken by means of a camera with two lenses. This provides two separate pictures 2.5 inches apart, about the distance between the eyes. Although the pictures appear the same, they are not. When looked at in a viewer, which has prismatic lenses, your eyes will blend the two views into one and the brain perceives it in three dimensions the same as normal vision.

                                                                      Figure History Holmes-Bates

In the 1860s, Coleman Sellers invented the Kinematoscope, which flashed a series of still photographs onto a screen.  A viewer turned a crank to view a sequence of image pairs that resulted in animated three-dimensional views.  Like David Brewster's double camera, the device simultaneously took two pictures of the same subject, but from slightly different perspectives. This gives the resulting photographs a suggestion of depth and dimension when they are observed through a special viewing device.


                                                                      Figure History Kinematoscope

Anaglyptic 3-D was discovered in the 1850s as the result of experiments by the Frenchman Joseph D'Almeida. Color separation took place using red/blue or red/green filters, and early anaglyphs were displayed using glass stereo lantern slides.  William Friese-Greene created the first 3-D anaglyptic motion pictures in 1889, which first went on show to the public in 1893.
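The color-separation idea behind an anaglyph can be sketched in a few lines. The example below is a simplified illustration, not a historical process: it assumes two equal-sized grayscale images stored as nested lists, puts the left view in the red channel and the right view in the green channel (which eye gets which color is an assumption), and uses hypothetical function names.

```python
def make_anaglyph(left, right):
    """Combine two grayscale images (rows of 0-255 values) into one
    image of (red, green, blue) pixels: left view in red, right in green."""
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("left and right images must have the same size")
    return [
        [(l, r, 0) for l, r in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


def view_through_filter(anaglyph, channel):
    """Return what one eye sees through an ideal colored filter.

    channel 0 models a red filter, channel 1 a green filter.
    """
    return [[pixel[channel] for pixel in row] for row in anaglyph]


if __name__ == "__main__":
    left = [[10, 20], [30, 40]]
    right = [[50, 60], [70, 80]]
    combined = make_anaglyph(left, right)
    print(view_through_filter(combined, 0))  # the left view is recovered
    print(view_through_filter(combined, 1))  # the right view is recovered
```

Because each filter passes only one channel, each eye recovers its own view from the single combined picture.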

The stereo craze of that time had already diminished by 1900 and was only stimulated by the so-called Kaiser-Panorama of Fuhrmann (figure History Panorama) from Berlin for a short period of time. This consisted of a set of many stereo viewers situated side by side in a circle. The stereo slides rotated step-wise on a drum at a certain speed from one viewer to the next.

Figure Panorama

Anaglyptic films, designated plasticons or plastigrams, enjoyed great success during the 1920s. They used a single strip of film with the green image emulsion on one side and the red image emulsion on the other. In 1922, an interactive plasticon titled "Movies of the Future" opened at the Rivoli Theater in New York. The film provided the viewer with an optional ending. The happy ending was viewed using the green filter whilst the tragic ending could be seen using the red filter.


Figure plastigram

In 1932, Edwin H Land patented a process for producing polarized filters that eventually led to the development of full color 3-D movies. This was possible because the left/right separation could be achieved using the polarizing filters rather than the color channel. Land also perfected a 3-D photographic process called vectography.


The View-Master was a popular toy for children from its introduction in 1939 through the 1970s.

Morton Heilig developed one of the very first Head Mounted Displays in 1960 under U.S. Patent 2,955,156.  It included wide peripheral effects, focusing controls and optics, stereophonic sound and the capability to include smell.


However, he was unable to secure funding for further development, and work had ceased by 1961.

In 1962 he produced the Sensorama Simulator under U.S. Patent 3,050,870, an arcade-style cabinet rather than a head-mounted device.  Morton Heilig was also the cinematographer for the films it showed.

                                                                                       Courtesy of Burdea and Coiffet

This product was intended to recreate an entire experience using 3D video captured by head-mounted stereo cameras.  The system possessed motion, color, and stereophonic sound.  Its lack of interactivity led to criticism about whether it was a true Virtual Reality device.  At the time it could be found in several amusement areas throughout California.


Dr. Ivan Sutherland proposed the “Ultimate Display” in 1965, with a Head Mounted Display serving as a viewer into a Virtual World.


He invented a head-mounted 3D stereoscopic Virtual Reality display in 1966.  Dr. Sutherland designed the apparatus with two Cathode Ray Tubes (CRTs) and supported it with a mechanical arm.

The tracking was provided by potentiometers in the mechanical arm.



Sutherland, Ivan 1968.  “Head-Mounted Three Dimensional Display”.




Companies throughout the world were developing other devices for tracking and input, such as the “Data Glove”, which was developed by Zimmerman, Lanier and Fisher from VPL in 1982.   Lanier added six degrees-of-freedom (DOF) tracking.



McGreevy, Humphries and Fisher, at the NASA Ames project, developed a Liquid Crystal Display (LCD) based HMD constructed from disassembled Sony “Watchman” television sets from 1981 through 1984.  By 1985 they had developed the “Virtual Interface Environment Workstation” (VIEW).  This included a Polhemus tracker, a LEEP-based HMD, 3D audio from Crystal River’s Convolvotron, gesture recognition with the VPL DataGlove, a boom-mounted CRT (Sterling Software) and a remote camera (Fake Space).

                                                                                          Image Courtesy of The College of the Arts, Ohio State University

In 1985 the U.S. Air Force developed the “Super Cockpit”, a research system that used visual, auditory and tactile feedback.  The head, eyes, speech and hands served as input devices.  It was designed to give the pilot a tremendous amount of information.



This developed into the “Virtual Cockpit” in 1987.




In 1991 a British company, Division Ltd, developed the first integrated commercial virtual reality workstation.


The Sense8 World Toolkit, developed in 1992, ran on platforms from PCs to high-end UNIX workstations and supported several input/output devices and immersive displays.  It also supported the most popular 3D and CAD formats.






During the late 1980’s several universities were trying to develop a visual display that allowed freer movement, improving on viewing and tracking methods that had been difficult to maneuver within.   By 1992, the University of Illinois at Chicago had designed the CAVE Automatic Virtual Environment, or CAVE, whose name is a reference to "The Simile of the Cave" found in Plato's Republic, in which the philosopher explores the ideas of perception, reality, and illusion. Plato used the analogy of a person facing the back of a cave alive with shadows that are his/her only basis for ideas of what real objects are.  The co-developer of the CAVE and director of the University of Illinois at Chicago’s Electronic Visualization Laboratory states that “In the CAVE, you are no longer on the outside looking in but on the inside looking out.”




Initially the CAVE was a multi-person, room-sized, high-resolution, 3D video and audio environment. In the current configuration, graphics are rear projected in stereo onto two-to-six walls and the floor, and viewed with stereo glasses. As a viewer wearing a location sensor moves within its display boundaries, the correct perspective and stereo projections of the environment are updated, and the image moves with and surrounds the viewer. The other viewers in the CAVE are like passengers in a bus, along for the ride!


Now the CAVE is a surround-screen, surround-sound, projection-based virtual reality (VR) system. The illusion of immersion is created by projecting 3D computer graphics into a 10' x 10' x 9' cube composed of display screens that completely surround the viewer. It is coupled with head and hand tracking systems to produce the correct stereo perspective and to isolate the position and orientation of a 3D input device. A sound system provides audio feedback. The viewer explores the virtual world by moving around inside the cube and grabbing objects with a three-button, wand-like device.
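The "correct stereo perspective" for a tracked viewer is commonly obtained with an off-axis (asymmetric) viewing frustum computed for each wall and each eye. The sketch below is one way to do this for the front wall only, not the CAVE library's own code. It assumes coordinates in feet with the origin at the center of the floor, x to the right, y up, and the front wall in the plane z = -5; the function names and the 0.2 ft eye separation are illustrative assumptions.

```python
def eye_positions(head, eye_separation):
    """Left and right eye positions for a viewer facing the front wall
    with a level head (eyes offset along the x axis only)."""
    hx, hy, hz = head
    half = eye_separation / 2.0
    return (hx - half, hy, hz), (hx + half, hy, hz)


def off_axis_frustum(eye, wall_x, wall_y, wall_z, near):
    """Frustum extents (left, right, bottom, top) at the near plane for
    an eye looking at a wall lying in the plane z = wall_z.

    wall_x and wall_y are (min, max) extents of the wall.
    """
    ex, ey, ez = eye
    distance = ez - wall_z          # eye-to-wall distance along -z
    if distance <= 0:
        raise ValueError("eye must be in front of the wall")
    scale = near / distance         # similar triangles: wall -> near plane
    left = (wall_x[0] - ex) * scale
    right = (wall_x[1] - ex) * scale
    bottom = (wall_y[0] - ey) * scale
    top = (wall_y[1] - ey) * scale
    return left, right, bottom, top


if __name__ == "__main__":
    wall_x, wall_y, wall_z = (-5.0, 5.0), (0.0, 9.0), -5.0
    head = (2.0, 5.5, 1.0)          # tracked head position, off-center
    for eye in eye_positions(head, 0.2):
        print(off_axis_frustum(eye, wall_x, wall_y, wall_z, near=0.5))
```

The four values have the form expected by a glFrustum-style projection call. When the viewer stands at the center the frustum is symmetric; as the tracked head moves, the extents become asymmetric, which is what keeps the projected image geometrically correct for the active viewer.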

Unlike users of the video-arcade type of VR system, CAVE dwellers do not wear helmets to experience VR. Instead, they put on lightweight stereo glasses and walk around inside the CAVE as they interact with virtual objects.

Multiple viewers often share virtual experiences and easily carry on discussions inside the CAVE, enabling researchers to exchange discoveries and ideas. One user is the active viewer, controlling the stereo projection reference point, while the rest of the users are passive viewers.

The CAVE was designed from the beginning to be a useful tool for scientific visualization; EVL's goal was to help scientists achieve discoveries faster, while matching the resolution, color and flicker-free qualities of high-end workstations. Most importantly, the CAVE can be coupled to remote data sources, supercomputers and scientific instruments via high-speed networks, a functionality that EVL, the National Center for Supercomputing Applications, and Argonne National Laboratory are jointly implementing.

Since envisioning the CAVE, its creators have designed immersion desks and walls as well.  We’ll explore these and many other products within the VR hardware sections.

5.3      Photo-based VR

Photo-based VR usually refers to a 360° panoramic view of a specific environment, often presented as a Virtual Tour.  There are several means of gathering the images that make up the one panoramic view.  Early in the development of this technology, a camera was set up at a fixed position and took several pictures, rotating by a fixed number of degrees between shots.  The resulting pictures were then “stitched” together to create the one encompassing panoramic view.  The digital media standard was developed by Apple back in the early 1990’s and was dubbed QuickTime Virtual Reality or QTVR.
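The rotation between shots follows from the lens's horizontal field of view and the overlap that stitching needs between neighboring pictures. The sketch below shows the arithmetic; the 60° field of view and 25% overlap are example values only, and the function names are hypothetical.

```python
import math


def shots_needed(hfov_deg, overlap):
    """Number of pictures needed to cover 360 degrees.

    hfov_deg: horizontal field of view of one picture, in degrees.
    overlap:  fraction of each picture shared with its neighbor (0 to <1).
    """
    if hfov_deg <= 0:
        raise ValueError("field of view must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in the range [0, 1)")
    new_coverage = hfov_deg * (1.0 - overlap)  # fresh degrees per shot
    # The small epsilon guards against floating-point round-up.
    return math.ceil(360.0 / new_coverage - 1e-9)


def rotation_step_deg(shots):
    """Equal rotation between consecutive shots, in degrees."""
    return 360.0 / shots


if __name__ == "__main__":
    n = shots_needed(60.0, 0.25)
    print(n, rotation_step_deg(n))
```

A wider lens or a smaller overlap reduces the number of pictures, at the cost of more distortion or fewer shared features for the stitching software to match.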


The most recent technology uses a camera designed to take one 360° panoramic view, eliminating the need to stitch successive pictures together and allowing the software to show the results easily and quickly in a panoramic viewer.  EasyPano, a company that has developed an easy-to-use QTVR toolkit, offers examples such as a tour of Shanghai or a tour of someone’s house.


There are QTVR hybrids that use the QTVR technology.


Many techniques for developing QTVR are documented by outside sources.


Several books cover these techniques in more depth:


QuickTime Toolkit Volume One
Basic Movie Playback and Media Types

Tim Monroe
Published: 6/21/2004
"When QuickTime application developers get stuck, one of the first places they look for help is example code from Tim Monroe. Finally, Tim's well-crafted examples and clear descriptions are available in book form—a must-have for anyone writing applications that import, export, display, or interact with QuickTime movies." —Matthew Peterson; University of California, Berkeley; the M.I.N.D. Institute; and author of Interactive QuickTime

QuickTime Toolkit Volume One is a programmer’s introduction to QuickTime, the elegant and potent media engine used by many of Apple's industry-leading services and products (such as the iTunes music store, iMovie, and Final Cut Pro) and also used by a large number of third-party applications....


QuickTime Toolkit Volume Two
Advanced Movie Playback and Media Types

Tim Monroe
Published: 6/30/2004
"Buried inside QuickTime are a host of powerful tools for creating, delivering, and playing digital media. The official QuickTime documentation explains 'what' each API function does. But knowing what each function does isn't enough to allow a developer to take full advantage of QuickTime. QuickTime Toolkit fills in the gap—providing plenty of practical examples of 'how' to use QuickTime to perform all kinds of useful tasks. More importantly, [this book] goes beyond 'how' and into 'why' —providing readers with a deeper understanding of QuickTime and how to benefit from using it in their own products....


Interactive QuickTime
Authoring Wired Media

Matthew Peterson
Published: 8/8/2003
Interactivity is one of the most captivating topics for today's online community. It is a fast-growing field pushed by the rapid development and dispersion of Java, Shockwave, Flash, and QuickTime. While several good books are available about the interactive capabilities of Java, Shockwave, and Flash, until now there hasn't been a book about QuickTime interactivity. A logical follow-up to QuickTime for the Web, this eagerly awaited book by Matthew Peterson details the power of QuickTime's wired media technology and provides a resource for professionals developing and deploying interactive QuickTime content....


5.3.1      QTVR Panoramic Movies


For much more information a web site dedicated to understanding what techniques are available is located at


One way to produce high-quality panoramic images is to stitch several photos together.  Several software tools have been developed specifically for stitching photos into a panoramic scene.  One example is PTGui, the Panorama Tools Graphical User Interface.


Source images


All images should be taken from the same viewpoint.


When photographing, try to rotate the camera around the 'nodal point' (the optical center of the camera/lens). Otherwise, parallax errors will occur, similar to the difference between the images you see with your left eye and right eye. This is less important if there are no objects near the camera (e.g. landscape scenes).


Lock the camera’s exposure and white balance.


Although exposure differences can be corrected during stitching, you will get the best results if there are no brightness or color differences in the images.

















It is not necessary to initially align the photos correctly.


After dragging all six images to the panorama editor, we roughly align the images.  This is done by clicking and dragging images with the mouse.  Perfect alignment is not important yet.


Aligning images in the panorama editor



Now the idea is to add control points.  Control points are used to tell Panorama Tools which points should coincide in the final panorama. Adding control points is as simple as clicking on a point in one image, and then clicking on the corresponding point in the other image.



Adding control points
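To make the role of control points concrete, here is a minimal sketch of how matched points can be turned into an alignment. It is not how Panorama Tools itself works: it assumes the two photos differ only by a sideways and vertical shift, and the pixel coordinates are hypothetical. The function names are invented for illustration.

```python
from statistics import mean

def estimate_offset(control_points):
    """Return the (dx, dy) shift that best maps image B onto image A.

    control_points is a list of ((xa, ya), (xb, yb)) pairs: the same
    scene feature clicked once in image A and once in image B.
    For a pure translation, the least-squares answer is the mean of
    the per-pair differences.
    """
    dx = mean(xa - xb for (xa, _), (xb, _) in control_points)
    dy = mean(ya - yb for (_, ya), (_, yb) in control_points)
    return dx, dy

def residuals(control_points, offset):
    """Distance in pixels by which each pair still disagrees after shifting B."""
    dx, dy = offset
    return [((xa - (xb + dx)) ** 2 + (ya - (yb + dy)) ** 2) ** 0.5
            for (xa, ya), (xb, yb) in control_points]

# Hypothetical pixel coordinates for three matched features.
pairs = [((820.0, 300.0), (20.0, 310.0)),
         ((900.0, 450.0), (102.0, 458.0)),
         ((860.0, 120.0), (58.0, 132.0))]
offset = estimate_offset(pairs)
worst = max(residuals(pairs, offset))
print(offset, round(worst, 2))
```

A large leftover residual suggests that a control point was misplaced, or that the photos differ by more than a simple shift (for example rotation or lens distortion).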




Once these control points have linked all images, the images can now act as one image through a grouping process.  If orientation is skewed, then you can rotate the new image until the desired perspective is generated.




Aligning the panorama



The panorama has been stitched. In this case we have done some additional editing by opening the result in a graphics program, cropping the image and enhancing its contrast and sharpness. Here's the result:



Stitched with PTGui


5.3.2      QTVR Object Movies


The difference between QTVR object movies and QTVR panoramas sometimes gets confusing. Think of a panorama as a scene in which you are placed in the middle. An object movie, on the other hand, involves you looking at something and turning it around to examine it closely.


Here is a quick snapshot of the creation process.  The images must be captured at precisely defined angles and planes, and then incorporated into the object movie using those same angles and planes.








Other techniques allow you to add audio and video components.


You can create movies by capturing audio and video from an external source, or by synthesizing sample data programmatically.



5.4      Model-based VR Geometry


We will briefly discuss a modeling philosophy and describe some modeling techniques; modeling concepts are covered in more detail in section 8.1, 3D Modeling, Spatial modeling techniques.

5.4.1      Modeling Philosophy


If you want the user to interact with an object or environment that is realistic enough to give the impression of existing within a virtual environment, model-based geometry is necessary, because it allows arbitrary interaction in real time.  While photo-realistic images look great, they can take about an hour to render.  For the human eyes and brain to digest visual cues and inputs, an interactive graphic world should take about 1/30 of a second to render; otherwise the experience will feel jerky and sluggish to the user.  Interactive graphics must therefore take shortcuts, the main ones being polygonal approximation of smooth surfaces and ad-hoc illumination and shading models.  As hardware and software improve, interactive graphics will tend to gravitate towards photo-realistic graphics.
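The gap between an hour-long render and a 1/30 second frame is easier to appreciate as arithmetic. The sketch below uses the two figures quoted above; the polygon throughput of 3 million polygons per second is an assumed number chosen only for illustration, and the function names are hypothetical.

```python
def frame_budget_ms(frames_per_second):
    """Time available to draw one frame, in milliseconds."""
    return 1000.0 / frames_per_second

def speedup_needed(offline_seconds, frames_per_second):
    """How many times faster than an offline render a real-time frame must be."""
    return offline_seconds * frames_per_second

def polygons_per_frame(polygons_per_second, frames_per_second):
    """Polygon budget per frame for a renderer with a given throughput."""
    return polygons_per_second // frames_per_second

budget = frame_budget_ms(30)               # time per frame at 30 frames per second
ratio = speedup_needed(60 * 60, 30)        # one hour versus 1/30 of a second
polys = polygons_per_frame(3_000_000, 30)  # 3M polygons/s is an assumed figure
print(f"{budget:.1f} ms per frame, {ratio:,}x faster, {polys:,} polygons per frame")
```

Under these assumptions each frame has roughly 33 ms and a fixed polygon budget, which is why the shortcuts described above are unavoidable.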


Compared with a photo-based Virtual Environment, a 3D model world differs in the following ways:


·         Actual physical scene need not previously exist

·         Allows interactions beyond the limited range possible with photo-based VR (going behind objects)

·         Can be cheaper and easier than photography in some special cases (undersea, outer space).


However, 3D models:


·         Can be tedious and expensive to generate.

·         Cannot achieve photo-realism in real time, so it must be sacrificed or approximated using ad-hoc shading methods and polygonal approximation of smooth surfaces.


Before we dive into understanding the development process of a basic model, let us first understand why we model at all.


Goals of Modeling:


“Modeling is a way of thinking and reasoning about systems.  The goal of modeling is to come up with a representation that is easy to use in describing systems in a mathematically consistent manner.”  [21]


Creating a model of a single object or event is not very difficult.  However, creating a model of an entire virtual world in which all of the objects and events are represented in a consistent and complementary way requires a great deal of mental and creative effort, as well as hard-earned expertise.


Most people begin modeling by creating representations of systems that are inconsistent or inefficient.  Through trial and error they learn better techniques.  Unfortunately, this trial-and-error process is extremely expensive and time consuming.  [22]


Avoid too Much Detail:


“The tendency is nearly always to simulate too much detail rather than too little. Thus, one should always design the model around the questions to be answered rather than imitate the real system exactly …”  [23]



Design a model to answer the question, not to mimic the real world.


Detail in the model is an attractive and seductive thing.  The modeler is enticed by the intricacy and beauty of the creation itself and loses sight of the purpose for which the model is being built.  Additional detail requires additional software, data, debugging, CPU time, and storage space.  It can also make you forget why you are creating the model to begin with.

5.4.2      Modeling Techniques


There are basically two types of modeling techniques that allow one to see a specific object within a virtual environment.  The manufacturing industry uses a point-to-point modeling system known as Computer Aided Design (CAD).  The models are then translated into specific machine languages so that the products can be manufactured, a process known as Computer Aided Manufacturing (CAM).  This form of modeling is known as “Vector modeling”, where a particular machine needs to know the exact location of specific points, within a tolerance zone, on the product.   In most virtual environments, the form used to describe a product’s features is known as “Model for Rendering” or Digital Content Creation (DCC).


Reference books that can help understand the modeling process are:


Inspired 3D Modeling & Texture Mapping, by Tom Capizzi, August 2002 (272 pages)

Texturing & Modeling: A Procedural Approach, by David S. Ebert, December 2002 (687 pages)


Computer modeling is the primary step in creating 3D computer graphics. If the model is not built to be optimized and compatible with the rest of the production tools, the rest of the digital process becomes more and more difficult. Texture mapping digital models helps to create the surface detail that can make a computer-generated image appear photo realistic. Inspired 3D Modeling and Texture Mapping examines the strategies for modeling and texture mapping used by the largest and most respected computer graphic studios in the world. Loaded with concepts, case studies, tutorials, interviews, and hundreds of graphic examples, this book is a valuable exploration of the tested practices of computer graphics industry veterans.



Several methods can be used to develop the 3D model you want to produce, including box, subdivision surface (sub-d), spline, NURBS, and patch modeling techniques.  Look at or perhaps to help you get more information.  Another wealth of information is at and RealSoft 3D.


The Box method simply starts with basic primitives and develops the model by adding polygons and removing unnecessary ones until you get the result you want.  (See figure below)  Primitives range from 2D shapes such as lines, curves, arcs, circles, and planes to 3D shapes such as cubes, spheres, cylinders, tubes, pyramids, and cones.


Image Courtesy of Mike Brown
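As a rough illustration of what a box primitive looks like as data, the sketch below stores a unit cube as a vertex list plus six four-sided faces, then stretches it by moving one face. The layout and helper names are assumptions made for this example, not the format of any particular modeling package, and face winding order is not checked.

```python
# Eight corner vertices of a unit cube; index = 4*x + 2*y + z.
vertices = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

# Six quad faces, each listing four vertex indices in order around the face.
faces = [
    [0, 1, 3, 2],  # x = 0
    [4, 5, 7, 6],  # x = 1
    [0, 1, 5, 4],  # y = 0
    [2, 3, 7, 6],  # y = 1
    [0, 2, 6, 4],  # z = 0
    [1, 3, 7, 5],  # z = 1
]

def edges_of(faces):
    """Collect the unique edges shared between the faces."""
    edges = set()
    for face in faces:
        for i, a in enumerate(face):
            b = face[(i + 1) % len(face)]
            edges.add((min(a, b), max(a, b)))
    return edges

def move_face(vertices, face, delta):
    """Return a new vertex list with one face's corners shifted by delta."""
    moved = list(vertices)
    for index in face:
        moved[index] = tuple(c + d for c, d in zip(vertices[index], delta))
    return moved

# A closed box satisfies Euler's formula: V - E + F = 2.
euler = len(vertices) - len(edges_of(faces)) + len(faces)

# Pull the y = 1 face outward by one unit to turn the cube into a taller box.
stretched = move_face(vertices, faces[3], (0, 1, 0))
print(euler, stretched[2])
```

Box modeling tools repeat this kind of step many times: select a face, move or subdivide it, and check that the mesh remains closed.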


The Spline method uses a set of points and specific types of curves that connect these points to create splines, which in turn can develop very complex shapes.  (See figures below)  Think of this as a quick sketch tool.  [24]





Images courtesy of Drawing With Splines 
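One common curve type that passes through a set of points is the Catmull-Rom spline. The sketch below is offered as an example of the general idea, on the assumption that an interpolating curve is what is wanted; the source does not say which curve type a given package uses, and the sample points are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the curve segment between p1 and p2 for t in [0, 1]."""
    def blend(a, b, c, d):
        return 0.5 * (2 * b
                      + (-a + c) * t
                      + (2 * a - 5 * b + 4 * c - d) * t ** 2
                      + (-a + 3 * b - 3 * c + d) * t ** 3)
    return tuple(blend(a, b, c, d) for a, b, c, d in zip(p0, p1, p2, p3))

def sample_spline(points, steps_per_segment=4):
    """Trace a smooth path through every point except the first and last."""
    path = []
    for i in range(len(points) - 3):
        for s in range(steps_per_segment):
            path.append(catmull_rom(*points[i:i + 4], s / steps_per_segment))
    path.append(tuple(float(c) for c in points[-2]))
    return path

# Hypothetical 2D points; the first and last only shape the end tangents.
points = [(0, 0), (1, 2), (3, 3), (5, 1), (6, 0)]
path = sample_spline(points)
print(len(path), path[0], path[4], path[-1])
```

The curve passes exactly through the interior points, so moving a single point reshapes only the nearby part of the curve, which is what makes splines feel like a sketching tool.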



The Surface method uses specific 2D primitives that are stitched together to make a surface, which in turn can create a fairly quick representation of your model.  (See figures below)



Images Courtesy of Mike Brown, 2004 and pSculpt


In the figures above, the third image shows how it is fairly easy to extract and manipulate a given point, then render the model again to see the results.


The NURBS (non-uniform rational B-spline) method shown here uses a reverse engineering approach.  For example, if you wish to model a particular item, you can scan an image into your modeling toolkit, add box-style segments, and then modify these segments until they match the desired model. (See figures below)







Images Courtesy of Lightwave 3D Tutorials


Here is an example of applying Boolean operations to solid models (See figures below).



Images Courtesy of RealSoft 3D
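Boolean operations on solids can be sketched by treating each solid as a test that answers "is this point inside?". The example below is a simplified illustration with invented helper names; real modelers such as the one credited above operate on surfaces and produce new geometry rather than just answering point queries.

```python
def sphere(cx, cy, cz, r):
    """Solid sphere as a point-membership test."""
    def inside(x, y, z):
        return (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r
    return inside

def box(xmin, ymin, zmin, xmax, ymax, zmax):
    """Solid axis-aligned box as a point-membership test."""
    def inside(x, y, z):
        return xmin <= x <= xmax and ymin <= y <= ymax and zmin <= z <= zmax
    return inside

def union(a, b):
    return lambda x, y, z: a(x, y, z) or b(x, y, z)

def intersection(a, b):
    return lambda x, y, z: a(x, y, z) and b(x, y, z)

def difference(a, b):
    return lambda x, y, z: a(x, y, z) and not b(x, y, z)

cube = box(-1, -1, -1, 1, 1, 1)
ball = sphere(0, 0, 0, 1.2)

hollowed = difference(cube, ball)   # cube with a spherical bite removed
rounded = intersection(cube, ball)  # cube trimmed to the sphere
merged = union(cube, ball)          # cube with bulging faces

print(hollowed(0, 0, 0), hollowed(0.9, 0.9, 0.9), merged(1.1, 0, 0))
```

The center of the cube lies inside the sphere, so it is removed by the difference, while the cube's corners lie outside the sphere and survive.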


The Patch method uses a minimum of two curves that are positioned in opposite directions (See figure below).



Another form of modeling uses reverse engineering technologies to scan a physical object into a 3D representation.

3D Data Scanners


Most of these products were initially designed for quality control and inspection of manufactured components and machinery.  During the 1980s the basic premise was to scan the real-life 3D model or aperture and test for abnormalities.  As the technology matured and scanning became easier with the advent of laser-based sensing, it was applied to recreating existing complex models.  The human body, specifically the human face, was very difficult to reproduce in a 3D modeling environment.  This technology has since migrated over to 3D modeling to create new modeling techniques.  Here are a few examples:


CyberWare Head & Face Color 3D Scanner

The Model 3030 3D scanner is an advanced, general-purpose implementation of Cyberware's proven 3D digitizing technology. It incorporates a rugged, self-contained optical range-finding system, whose dynamic range accommodates varying lighting conditions and surface properties.

CyberWare WholeBody Color 3D Scanner Model WB4

In as little as 17 seconds, the Cyberware WB4 captures the shape and color of the entire human body. The scanner's rapid acquisition speed freezes motion and makes it easy to scan many subjects or to capture different poses appropriate to the application at hand.


Raindrop’s Geomagic Studio

Since the advent of computer-aided design (CAD), researchers have tried to create products in digital form. But, despite major advances, 99% of things used or treasured in our daily life do not have a digital representation.

Combined with new 3D scanning hardware, Geomagic software quickly and easily captures 3D geometry of shapes and forms with accuracy and speed not possible with traditional design software.

Raindrop Geomagic’s technology delivers digital convergence that unites physical and digital worlds, freeform and mechanical shapes, and discrete and continuous mathematics. Digital convergence promises major changes in the manufacturing world:

·         Medical devices such as hearing aids, dental appliances and artificial knees can be custom-fit to the individual.

·         Designers can capture classic products and rework them in many styles, shapes and colors.

·         Product manufacturers can differentiate their offerings through design and customization.

·         As-built physical parts can be compared to as-designed digital counterparts to ensure quality and accuracy.

·         Digital realizations of physical parts can be used to ensure accuracy of engineering analyses in the automotive, aerospace and consumer product industries.

·         Digital inventories can be made of legacy parts without CAD data.


Check out the 3D Scanner Companies web site:

5.5      Model-based VR Rendering


Here are some examples of different types of rendering capabilities.


Images courtesy of ShutterBug and Pixar’s PhotoRealistic RenderMan Software


Color WireFrame Vectors

Hidden Line Removal

Hidden Surface Removal

Individually shaded polygons with diffused reflection

Gouraud shaded polygons with diffused reflection

Gouraud shaded polygons with specular reflection

Phong shaded polygons with specular reflection

Curved surfaces with specular reflections

Improved illumination model and multiple lights

Texture mapping

Displacement mapping

Reflection mapping
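Several of the rendering styles listed above differ mainly in how they compute diffuse and specular reflection. The sketch below evaluates both terms at a single surface point using the classic Lambert and Phong formulas; the function names and the shininess value of 32 are assumptions for illustration, and real renderers add color, multiple lights, and attenuation.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def diffuse(normal, to_light):
    """Lambertian term: brightness follows the cosine of the light angle."""
    return max(0.0, dot(normalize(normal), normalize(to_light)))

def specular(normal, to_light, to_viewer, shininess=32):
    """Phong highlight: strongest when the viewer looks along the mirror direction."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0  # the light is behind the surface
    reflected = tuple(2 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    return max(0.0, dot(reflected, v)) ** shininess

normal = (0, 0, 1)
head_on = diffuse(normal, (0, 0, 1))                # light straight above
slanted = diffuse(normal, (1, 0, 1))                # light at 45 degrees
highlight = specular(normal, (1, 0, 1), (-1, 0, 1)) # viewer in the mirror direction
off_axis = specular(normal, (1, 0, 1), (0, 0, 1))   # viewer straight above
print(round(head_on, 3), round(slanted, 3), round(highlight, 3), round(off_axis, 6))
```

Gouraud shading evaluates terms like these at the polygon vertices and interpolates the resulting colors, while Phong shading interpolates the normals and evaluates the terms at every pixel, which is why its highlights look sharper.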










5.6      VR Resources

For research information, check these web page links:

·         Chinese group of companies called TechTrend

·         SGI’s third party applications

·         Latest publications on VR

·         Study and research on “Being There”

·         History of Computer Graphics, Ohio State University

·         Digital Media and Animation History

·         Useful Links from VR Tech of Rutgers University

·         The Chinese U. Hong Kong Medical Imaging and Visualization research

·         An incredibly detailed summary of modeling humans










5.7      VR System Diagram




5.8      VR Hardware


There are many components that make up a Virtual Reality or Visualization Environment System. 



Image Courtesy of Silicon Graphics Incorporated, SGI



In this section we will look at the devices that are specific to virtual reality systems.  Several components make a system function as a virtual environment.  Most of the hardware designed for virtual immersion exists so that a human can interact with a given virtual world.  Some scenery can be rather complex and will need different peripherals to help the user engage with the environment.  Image generators and specific types of display systems are needed to render the images, because the human brain requires a minimum of about 15 rendered images, or frames, per second to make sense of what it is seeing.  Interactive devices coupled with tracking devices are required to navigate through this world, depending on the specific type of application.  Computing power and networked configurations are needed to support real-time decisions based on visual and auditory cues as one explores this new environment.  The computer hardware and software must be configured correctly to accommodate human perception without disorienting the subject(s).


The development of interactive devices depends on the technologies that are available.  Some devices allow the user to do more than just navigate.  Here we see how virtual menus can help the user navigate with new conceptual movements or actions.  The Virtual Tools can be quite an exciting adventure.


Images Courtesy of Virginia Polytechnic Institute & State University


In the following sections we will discuss image generators, display systems, tracking devices, sound systems, interactive devices, computer environments, and network capabilities.


Cybersphere graphic

Image Courtesy of WindoWatch, Virtual Reality, 2001; University of Warwick Manufacturing Group and VR Systems UK


As you will discover from the examples that follow in this section, most of the peripherals are joined together to create a new world.  Understanding what devices are available is certainly required, but once they are configured together, you need to use them cohesively for the best performance.


One way to think about choosing among these devices is as follows:


Many inherently simple tasks become more complex if an improper choice of input device is made.  For example, toggling a switch is inherently a one-degree-of-freedom (DOF) task (the switch is on or off).  Using an interaction technique that requires the user to place a tracked hand within a virtual button (a three-DOF task) makes it overly complex.  A simple discrete event device, such as a pinch glove, makes the task simpler.  Of course, one must trade off the reduced degrees of freedom against the arbitrary nature of the various positions the user must learn when using a pinch glove or other such device.  Also, such devices can generate only a finite number of events.  In general, we should strive to reduce unnecessary DOFs when it is practical.  [25]
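The guideline of avoiding unnecessary DOFs can be expressed as a simple selection rule. The sketch below is only an illustration: the device table is hypothetical, counting a pinch glove as one DOF is a simplification of "discrete event device", and the tracked wand entry is not from the text.

```python
# Hypothetical device table: name -> degrees of freedom the device reports.
devices = {
    "pinch glove": 1,   # discrete event device, treated here as one DOF
    "tracked hand": 3,  # position in x, y, z
    "tracked wand": 6,  # position plus orientation (assumed entry)
}

def pick_device(task_dofs, devices):
    """Choose the device with the fewest DOFs that still covers the task."""
    capable = {name: dofs for name, dofs in devices.items() if dofs >= task_dofs}
    if not capable:
        return None
    return min(capable, key=capable.get)

toggle_switch = pick_device(1, devices)  # an on/off task
place_object = pick_device(3, devices)   # a positioning task
print(toggle_switch, place_object)
```

A real selection would also weigh the learning cost mentioned above, since a device with fewer DOFs may require the user to memorize arbitrary gestures.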


So now we will simply look at the different types of peripherals and what each of these devices can do.






5.8.1      Image Generators


Whether you are designing a large-scale visualization center or simply demonstrating your visualization simulations on your computer monitor, a device known as an “Image Generator” is needed to render the graphics quickly and efficiently without too many additional tools to support this function.   While there are several different ways of displaying your visualization, through either projectors and screens or simple visual peripherals, how the images are run through your system may be crucial to the performance of the entire system.


Here are some examples of Projectors and Screens:


·         Active Matrix LCD

·         LCD Projector

·         Direct View CRT

·         CRT Projector

·         LED

·         Electroluminescent

·         AC & DC Plasma


Deciding how images will be presented on these types of displays requires considering such aspects as:


·         Resolution

·         Refresh Rate

·         Stereo Switching – Ghosting

·         Cost

·         Mean Time Between Failure
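Resolution, refresh rate, and stereo switching interact, and a little arithmetic shows how. The sketch below assumes a 1280 x 1024 display refreshed at 100 Hz, chosen as an example, and relies on the fact that active (shuttered) stereo alternates left-eye and right-eye images; the function names are hypothetical.

```python
def pixel_rate(width, height, refresh_hz):
    """Pixels the image generator must deliver each second."""
    return width * height * refresh_hz

def per_eye_rate(refresh_hz, active_stereo=True):
    """Active stereo alternates left and right images, halving the rate per eye."""
    return refresh_hz / 2 if active_stereo else refresh_hz

rate = pixel_rate(1280, 1024, 100)  # assumed example display mode
eye = per_eye_rate(100)             # images per second seen by each eye
print(f"{rate:,} pixels per second, {eye:.0f} images per second per eye")
```

Passing active_stereo=False models a setup in which each eye has its own continuous image stream, so the full refresh rate reaches both eyes.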


While some companies have developed all-inclusive systems that do not require additional equipment, other companies have developed rather inexpensive systems based on PC graphics cards.  Some of the most popular PC graphics cards are developed by companies such as NVIDIA, Evans & Sutherland, ATI (Radeon), 3Dlabs, and many more.


Image Courtesy of HotHardware Articles


Companies that develop all-inclusive image generating systems include Silicon Graphics Incorporated (SGI), Hewlett-Packard (HP), and Sun Microsystems, among others.

Silicon Graphics Incorporated


The Silicon Graphics® Onyx4™ UltimateVision™ family comes in three basic configurations--power, team, and extreme--and they are all compatible with today's suite of Onyx® applications. Each configuration provides fine-grain scalability so that you can create the perfect balance of computational, bandwidth, and rendering power to meet your requirements. And as your needs change, modules can be easily added to your existing system to provide you more visualization power in the future.

Image of the Silicon Graphics Onyx4 in the power configuration

Enabling Power Users
The Silicon Graphics Onyx4 UltimateVision power configuration is ideally designed for small teams and power users. This configuration delivers Onyx capability with a power-user price and form factor, enabling individuals to blaze through the most demanding data sets. Delivering 20M pixels of highly interactive graphics, Onyx4 has enough power to interactively drive two ultrahigh-resolution 10M pixel LCD displays, all in a system that can slip under a desk.


Image of the Silicon Graphics Onyx4 in the team configuration

Transforming Team Productivity
Today's workflow, data sets, and distributed teams provide tremendous challenges to all organizations. The Silicon Graphics Onyx4 UltimateVision team configuration brings the performance, power, and production capability of large-scale SGI® Reality Center® facilities to small team rooms. Team productivity is transformed by the ability to load data sets at multi-gigabytes-per-second rates, interact with 32GB of data in memory, and scale graphics capability seamlessly, all in a deskside form factor.


Image of the Silicon Graphics Onyx4 in the extreme configuration

Tackling Extreme Problems
The Silicon Graphics Onyx4 UltimateVision extreme configuration is designed to enable leaders in their fields to understand the world's most demanding problems. Breakthroughs and discoveries are possible with the Onyx4 system's ability to rapidly digest and visualize multi-terabyte data sets, transforming insurmountable problems into powerful assets. Imagine groundbreaking insights that can come from driving holographic displays and interactively roaming through terascale computational data sets.



System Specifications (Onyx4 UltimateVision Power, Team, and Extreme Configurations)

Processor quantity

Graphics pipes

Up to 16GB (Power); up to 64GB (Team); up to 128GB (Extreme)

I/O expansion: up to 8 PCI/PCI-X slots (Power); up to 32 PCI/PCI-X slots (Team); up to 64 PCI/PCI-X slots (Extreme)

Form factor: 19" rackmount or 17U tall 19" rack (Power); 19" rackmount or 17U tall 19" rack (Team); 39U tall 19" rack (Extreme)

Height (each U is 1.75 inches): 4U or 6U

Voltage per module: 110/220 VAC auto-sensing worldwide power supply (all configurations)

1200 W max. (Power); 2700 W max. (Team); 6600 W max. (Extreme)

Heat dissipation: 5000 BTU/hr max. (Power); 9000 BTU/hr max. (Team); 33000 BTU/hr max. (Extreme)

Electrical service type, racked systems: NEMA L6-30R, 208 VAC at 30 amp (rack PDU) (all configurations)

Visualization Features

Anti-aliasing multisample: 2, 4, or 6 subsample full-scene AA; 2x, 4x, 8x, or 16x anisotropic texture filters

Color fidelity: 32-bit RGBA rendered to the frame buffer; 96-bit RGBA floating-point rendered to off-screen buffers

Inter-pipe synchronization, with external source support

OpenGL® pipeline: OpenGL® 1.4 compliant, future compatibility with OpenGL® 2.0

Floating point buffers

24-bit Z buffer

Scalable Graphics Compositor: Yes


Hewlett-Packard



Invent in real time and real size

Imagine the power when teams are able to interact with their designs in real time and conduct reviews at full size. The HP Visualization Center sv7 pushes high-end visualization closer to your design teams with unlimited resolutions, image sizes, and performance levels.

What were once component level designs, integrated at manufacturing, have become complete digital prototypes-designed, tested, manufactured and sold practically before metal is even cut. With the success of the entire product design in the balance, accurate real-time visualization of complete digital prototypes is imperative.

Only the HP Visualization Center sv7 empowers designers and engineers to review, collaborate and visualize their designs as they exist in real life, with photo-realistic image quality, at a level of performance and interactivity that keeps pace with their thought processes. Not only is the HP Visualization Center sv7 a solution that grows as your needs grow, matching power to problem size, it is also a multi-use architecture. You can use idle render nodes as individual design stations or a compute cluster. Now, you can work concurrently and collaboratively across the extended enterprise to harvest the collective experience and insight of your entire organization.







Application visualization performance

Each graphics pipe has the whole bandwidth of the HP workstation xw8000 with nVIDIA FX 2000 graphics (maximum 8 systems per display). Each pipe renders in parallel only what is visible in its display region.

Truly interactive visualization viewing large, complex models. When model complexity is high, this performance makes the difference between a highly productive and immersive collaborative environment and an intolerably slow and frustrating experience.

Image quality

Full visual fidelity of fine grain detail and surface edges

Up to 64 independent pixel samples are computed and then blended down to the resolution capabilities of the display device.

A very high quality image without the "jaggies and crawlies" normally associated with computer graphics.

Resolution and image size

Each pipeline supports up to 1600 x 1200 resolution and can be configured in an m x n matrix to drive massive high resolution power walls.

Each pipeline can be overlapped to create a seamless display, or they may be edge matched or use a negative overlap to compensate for any frame that might exist between display panels.

Increased resolution improves image quality and accuracy; increased screen real estate enables individuals and teams to quickly absorb contextual information and cause and effect relationships. It also eliminates the loss of concentration when forced to zoom in to see detail and zoom out to see context.


Allows users to select the level of performance, image quality and resolution independent of each other.

Users can allocate graphics pipes on the fly to be used toward performance, image quality, or a combination of the two.

Users can interact with an image at high performance levels, then instantly switch to the highest image quality to study a particular line of sight.

Native Active and Passive Stereo

Native Active and Passive Stereo is supported per display channel and is synchronized across all graphics pipelines.

Active Stereo supports resolutions up to 1280 x 1024 per display channel (100 Hz refresh rate). Left and right eye Passive Stereo is supported at resolutions of 1280 x 1024 (up to 100 Hz refresh rate).

Visualize models with life-like 3D realism, immersing the team in their design.


HP Scalable Visualization Architecture

HP solution can scale to map a solution to your problem size

Architecture scales in performance, resolution and image quality. Users can start small and grow the solution as their requirements change.


Flexible multi-use architecture

Each render node adds a fully functional HP Workstation xw8000 to the network.

Maximize compute and visualization resources within the organization.

Transparent application support

Standard HP-UX OpenGL applications are supported. No code modifications are required. Qualified display list based applications will scale performance along with image quality and resolution.

Applications that know nothing about multi-pipe architectures and scalable scene antialiasing capabilities now support them transparently.

The same application tool sets that are deployed in your engineering and research environments are supported.


Sun Microsystems


Sun Fire V880z Visualization Server







1.2-GHz UltraSPARC III processor

Cache per processor

Primary: 64 KB data and 32 KB instruction on chip
Secondary: 8 MB external




  • Up to 48 GB with six UltraSPARC processors and a single Sun XVR-4000 graphics accelerator installed
  • Up to 32 GB with four UltraSPARC processors and dual Sun XVR-4000 graphics accelerators installed





Gigabit Ethernet and 10/100-BaseT Ethernet ports


FC-AL disk controller


Two RS-232C/RS-423 serial ports (DB25) via splitter cable


Two USB ports


Nine full-sized, hot-swappable PCI slots compliant with PCI specification Revision 2.1:

  • Two at 66 MHz, 64 bits wide, 3.3 volts
  • Seven at 33 MHz, 64 bits wide, 5 volts




Host adapters

Internal SCSI to support the internal DVD and two optional removable media devices

Up to nine PCI-to-dual FC-AL adapters or PCI-to-Ultra SCSI adapters

Internal tape

Optional 12 GB DDS-3 or 20 GB DDS-4 single drive

Internal disk

Up to twelve 3.5-in. x 1.0-in. 10,000 RPM FC-AL disks, 73 GB capacity




Operating System

Solaris 8 10/01 or later, and Solaris 9 4/03 or later


C, C++, Java, OpenGL



System and network management

Sun Management Center, Sun Remote System Control, Sun Cluster, Solaris PC NetLink, Solstice Jump Start, Solaris Web Start, Solaris AdminSuite, Solstice DiskSuite, Solstice Backup, and other Solstice products





Optional Sun monitors from 1995 onward (17-, FP18-, 19-, 20-, 21-, and 24-inch)

Frame buffer

Up to two Sun XVR-4000 graphics accelerators

Keyboard and mouse port

USB keyboard and mouse

PCI options

One 10/100/1000-Mbps Ethernet port, FC-AL, Sun Quad FastEthernet, token ring, ATM-155, ATM-622, SONET network interface, SSL accelerator, high-speed serial, Ultra SCSI with 10/100-Mbps Ethernet, dual-channel single-ended UltraSCSI, dual-channel differential UltraSCSI, combination dual Fast Ethernet, and dual SCSI




Sun Fire V880z system:

100 - 240 VAC, 47 - 63 Hz, 1.48 KVA

  • 14.8 A/power cord at 100VAC
  • 12.3 A/power cord at 120VAC
  • 6.15 A/power cord at 240VAC

Sun XVR-4000 graphics accelerator:

  • Maximum power: 315 W
  • Normal operation: 238.12 W




AC power

100 - 240 VAC, 47 - 63 Hz, 1.48 KVA

Operating temperature

5° C to 35° C (41° F to 95° F), 20% to 80% relative humidity, noncondensing, 27° C max. wet bulb

Nonoperating temperature

-20° C to 60° C (-4° F to 140° F), 5% to 93% relative humidity, noncondensing




Meets or exceeds the following requirements:


UL 60950, CSA C22.2 EN 60950 (from UL), TUV 60950, UL CD scheme IEC 950 (CE mark), GOST



RFI/EMI FCC Immunity

Class A: Australia/New Zealand AS/NZ 3548

Industry Canada ICES-003

European Community EN55022/CISPR22

Japan VCCI

Taiwan CNS 13438




US DHHS 21 CFR Subchapter J, PTB German X-ray Decree (for monitors)

Regulatory Markings





Sun Fire Visualization server:


714 mm (28.1 in.) with casters

700 mm (27.6 in.) without casters


480 mm (18.9 in.)


826 mm (32.9 in.)


88.1 kg (194.0 lb.) (minimum, approximate)

135.6 kg (299.0 lb.) (maximum, approximate)

Rack-mount enclosure

Fits into 19-, 23-, 24-in., and 600 mm standard racks

Sun XVR-4000 graphics accelerator:


8.3 cm (8.2 in.)


9.1 cm (1.8 in.)


20.3 cm (19.4 in.)




The Sun XVR-4000 graphics accelerator features:

  • Up to two Sun XVR-4000 graphics accelerators per Sun Fire V880z visualization server
  • 144 MB of frame buffer memory
  • 1 GB (4 x 256 MB) of texture memory
  • Anti-aliasing through dedicated, 5 x 5 programmable radial filter chips
  • Maximum 3-D resolution of 1920 x 1200 @ 75 Hz with 4 samples/pixel or 1280 x 1024 @ 75 Hz with 8 samples/pixel
  • Maximum 3-D stereo resolution of 1280 x 1024 @ 112 Hz with 4 samples/pixel
  • Multidisplay capability at dual 1280 x 1024 @ 75 Hz with 4 samples/pixel or dual 1280 x 1024 @ 112 Hz (non-aliased stereo)
  • Dynamic video resizing for guaranteed frame rates
  • S-video output for recording entire displays in NTSC/PAL format
  • Framelock for synchronizing multiple displays
  • Genlock for synchronizing to an external video source or for synchronizing buffer swaps between multiple graphic frame buffers
  • Support for two asynchronous video streams using dual 13W3 video output with 10-bit per color channel
  • Adjustable gamma correction (12-bit in and 10-bit out and one 3 x 1024 x 10-bit adjustable gamma correction table for each video stream)
  • Multiple hardware color maps (four 3 x 256 x 10-bit color maps for each video stream)




Other companies that develop image generators include IBM and Quantum3D.

5.8.2      Display Systems


Most display systems trace back to the 1960s, when Ivan Sutherland showed that using a separate monitor for each eye could create the stereoscopic vision needed for a sense of immersion within a 3D environment.  This section will briefly describe some of the visual systems that are being used today.  Expect to see information on Desktop displays, Head Mounted Display (HMD) systems, Immersive Desks, Immersive Rooms, Walls and Theatres.

Desktop VR


With Desktop VR the user views and interacts with computer-generated images on a traditional computer graphics screen, with the assistance of a viewing device that gives either a monoscopic or a stereoscopic view.  Contrary to popular conceptions, non-immersive desktop VR systems are by far the most common and least expensive form of VR. They consist of little more than a computer-generated VE delivered by a conventional, though usually high-powered, PC or workstation (such as a Silicon Graphics Indy). Although the graphical images are described in three dimensions, they convey no difference in relative effectiveness unless viewed stereoscopically, usually through special stereoscopic goggles. The main advantage of this form of VR is that it is relatively inexpensive (which is why it is popular among home-brew enthusiasts). The high resolution of the screen provides good-quality visualization of the graphical environment, in contrast to the significantly lower quality of many HMDs. The major disadvantage of this form of VR is the complete lack of any feeling of immersion on the part of the user. This is relevant to educational applications, since it is the quality of the immersive experience that determines the amount and rate of learning achieved.[26]


Crystal Eyes


SynthaGram™ Monitors
SynthaGram Monitors provide bright, crisp, and deep photographic, movie, video and computer generated Glasses-Free 3D™ images.

CrystalEyes® Eyewear with Emitters
CrystalEyes is comprised of liquid crystal eyewear that offers Stereo3D viewing. CrystalEyes 3 is designed for CAVEs, theaters and immersive environments. CrystalEyes Workstation is designed for the desktop.





Monitor ZScreen®
The Monitor ZScreen enables Stereo3D viewing on your existing CRT monitor. It provides true Stereo3D visualization capabilities when used with a pair of polarized glasses.

Projection ZScreen®
StereoGraphics Projection ZScreen works with CRT or DLP projectors to allow audiences to view stereoscopic content using polarized glasses.


The Panoram PV230 DSK is the first visualization display aimed at the "professional consumer" market and provides 2.4 Megapixels of working real estate featuring a fully articulated swing arm and Digital Direct inputs.                                                   

5DT (Fifth Dimension Technologies)


WindowVR is a friendlier approach to Virtual Reality. Historically, many believe that "real" virtual reality applications must include an HMD. WindowVR dispels this common misconception delivering a fully integrated VR solution complete with display, position tracking, and built-in navigation controls. Choose from a variety of flat panel display sizes, resolutions, and position tracking solutions.

WindowVR, unlike the HMD, is designed to allow natural, intuitive interaction with the virtual environment. The display requires no instruction. Users can walk up, grab the handles and instantly begin interacting. Others can follow the action by moving beside the primary user.


Reachin Desktop display




Medical Visualization with Haptic Feedback

The natural computer interface realized through Reachin Display makes the user feel, see and interact with objects just like in the real world. Reachin Display integrates a PHANTOM™ force feedback device from SensAble Technologies with stereo monitor and supporting systems.  The innovative use of a semi-transparent mirror creates an interface where graphics and haptics are co-located - the user can see and feel the object in the same place!

Developer Display
 Its standard tool is a stylus or thimble, with the option to integrate a customized end effector. This is the system of choice for advanced applications such as:
Medical simulation and training, Virtual dentistry, Assembly training and evaluation, Digital mock-ups, Human-Computer Interfaces, Advanced Research and Development.

Desktop Display
The result is a cost-effective solution with a generous workspace. The high precision and haptic performance make it suitable for applications such as: Industrial Design, 3D sketching, Data mining, Scientific visualization, Digital content creation, Education and training.

  Head Mounted Display (HMD)


What is an HMD?  Short for head-mounted display, a headset used with virtual reality systems.  An HMD can be a pair of goggles or a full helmet. In front of each eye is a tiny monitor.  Because there are two monitors, images appear as three-dimensional.  In addition, most HMDs include a head tracker so that the system can respond to head movements.  For example, if you move your head left, the images in the monitors will change to make it seem as if you're actually looking at a different part of the virtual environment.
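The head-tracking behavior described above can be sketched in a few lines. This is only an illustration, not any vendor's implementation: the function name `view_direction` and the axis convention (x to the right, y up, the user initially looking along -z, positive yaw meaning a turn to the left) are assumptions made for this example.

```python
import math


def view_direction(yaw_deg: float, pitch_deg: float = 0.0) -> tuple:
    """Return a unit view vector for a tracked head orientation.

    Assumed convention (not from the manual): x points right, y points up,
    the user initially looks along -z, and positive yaw turns the head left.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return (
        -math.sin(yaw) * math.cos(pitch),  # x: negative when looking left
        math.sin(pitch),                   # y: positive when looking up
        -math.cos(yaw) * math.cos(pitch),  # z: -1 when looking straight ahead
    )


# Looking straight ahead, then after turning the head 90 degrees to the left.
ahead = view_direction(0.0)
left = view_direction(90.0)
```

The image generator would re-render each eye's image along the returned direction every time the tracker reports a new orientation, which is what makes the scene appear to stay fixed while the head moves.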


Here are some examples of this apparatus:


Virtual Research


V8 sets a new standard in high performance, professional HMDs. New active matrix LCDs with true VGA ((640x3) x480) pixel resolution provide bright, vibrant color and a CRT quality image. A new mechanical design means the V8 is lighter, easier to use, and more durable. V8 dons quickly and comfortably using rear and top ratchets and a spring loaded forehead rest. Adjustments are quick and precise. The interpupillary adjustment doubles as an eye relief adjustment to accommodate glasses. High performance Sennheiser earphones swivel, rotate, and remove easily when not in use.

Inputs and outputs for audio, video, and power are handled through the V8 external control box. Red LEDs indicate Power On and Stereo modes. Standard 15 pin VGA type connectors accept VGA (640 x 480 60Hz) inputs, readily available on today’s graphics engines and workstations. Overall brightness and contrast adjustments are easily tuned from the front of the control box. External VGA monitors may be connected to the MONITOR OUT located on the control box.


Be sure to check out the HMD/headset/VR-helmet comparison chart from stereo3d, which covers many different models.



  Immersive Desks


Fakespace’s  ImmersaDesk® is a versatile, permanent or portable virtual modeling station ideal for development and engineering review applications. While small enough to fit into an office, the self-contained M1 offers a large 44" diagonal high-resolution visualization screen. The desktop angle is adjustable to suit any work style or viewing preferences. Optional head tracking facilitates the correct perspective of stereo images as you move naturally around the desk. The optional tracking systems also facilitate the use of various interaction devices such as a stylus, V-Wand™, and CubicMouse™. The folding design allows for easy room-to-room movement, storage or transport.

The M1's features include single-lever adjustment for variable screen angles, fast and reliable setup (less than 30 minutes), and compatibility with a variety of computing platforms.


Fakespace’s ImmersaDesk® R2 is a robust, proven performer that is ideal for internal data reviews, trade shows, special events and marketing presentations. The R2 is also available with digital DLP™ projection technology - the R2D. Designed with a rugged shipping container casing, an R2 in transport mode can be rolled through standard doors and easily sent to the next office or around the world. Yet, the R2 unfolds to fully operational status in less than thirty minutes. The large screen is perfect for group collaboration or audience captivation. Use the screen in vertical position for through-the-window stereoscopic visualization or presentations. In angled format, the R2's large screen places the user in an immersive bird's-eye view. Coupled with a standard motion tracking system and Fakespace interaction device, the ImmersaDesk R2 is a versatile, interactive visualization system with the added benefit of true portability and set-up reliability.

The R2's features include variable screen angle and height, fast and reliable setup (less than 30 minutes), and a rugged shipping container for easy transportation.













The Baron from Barco is a high-performance projection table.  It proves to be a great tool for cost-effective research in very diverse domains, ranging from scientific research and molecular modeling, to prototype designing. This high-performance projection table enables multiple users to simultaneously evaluate a 3D representation and make changes as a group, thereby reducing the time-to-market and the cost of development. 




Specially designed DAB (Diffuse and Bright) screen
• Higher brightness and color uniformity than standard diffuse screens
• Wide viewing angle, provides maximum image perception from any angle


Superb image quality
Based on the BarcoReality 908 and the specially designed DAB screen, the Baron renders high quality graphics images with unprecedented sharpness and quality.


Easy to Transport and Set Up
• Modular design of the Baron enables easy setup and disassembly in less than 30 minutes.
• All parts fit through any standard door opening.


Outstanding ergonomics
• Can be repositioned at the height and angle best suited for the operator; the angle and height can be read out and controlled in a user-friendly manner.
• Convenient space for the operator's feet.


VR stereoscopic projection
• Standard with 'fast phosphor' green CRT for stereoscopic projection.

• The polymer-composite materials are 'tracking device-friendly' for stereoscopic projection.


The Digital Imaging Table (DIT-1) from Panoram Technologies is designed for projecting super high-resolution (2300 x 1840 pixel) still and motion images on a tabletop imaging surface, allowing decision-makers to review and analyze the displayed information in a collaborative environment. The 65-inch diagonal "light table" has near print quality resolution intended for use in critical imaging applications such as mapping, intelligence, planning, navigation, command and control, medical and scientific computing.

The display uses four seamlessly blended projectors to create the 4.23 Megapixel image. It offers a combination of resolution and form factor never before available.

The DIT-1 offers four modes of operation, making it compatible with a wide array of sources. Mode 1 displays an external multi-channel computer video source at full resolution (2300 x 1840), Mode 2 displays one of six external single-channel computer sources at SXGA resolution (1280 x 1024), Mode 3 displays full resolution internally-stored stills and Mode 4 displays full resolution internally-stored graphic animations.
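The resolution figures quoted for the DIT-1 can be checked with simple arithmetic. The short sketch below is an illustration only; the helper name `megapixels` is made up for this example.

```python
def megapixels(width_px: int, height_px: int) -> float:
    """Return the pixel count of a display in millions of pixels."""
    return width_px * height_px / 1_000_000


# Full-resolution blended image of the DIT-1 (Modes 1, 3 and 4).
dit1_full = megapixels(2300, 1840)

# A single-channel SXGA source (Mode 2), for comparison.
sxga = megapixels(1280, 1024)

print(f"DIT-1 full resolution: {dit1_full:.2f} Megapixels")
print(f"SXGA source:           {sxga:.2f} Megapixels")
```

The full-resolution figure agrees with the 4.23 Megapixel value stated for the blended image, and it is a little over three times the pixel count of a single SXGA source.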


With global positioning satellite and wireless technologies providing reliable location information for key assets such as equipment and personnel, this kind of mapping/imaging/asset overlay technology becomes a very important tool for a variety of applications that include resource management, oil/gas exploration, disaster management, law enforcement, forest fire fighting, search and rescue, military operations and a whole host of scientific imaging uses.


The VisionStation by Elumens is a low-cost 3D immersive viewing system with a wide range of applications.



Standard flat-screen applications can display a field of view (FOV) of no more than 60°. The Elumens VisionStation allows for a fully immersive display of 160°. The VisionStation's ultra-wide FOV creates an amazing sense of space and depth, without need for goggles or glasses. The large size of the VisionStation screen (1.5 meters) also helps promote an excellent sense of immersive 3D.


SEOS has perfected a range of uniquely capable VR desks / workbenches.




  SEOS – Creator                                                                          SEOS – V-Desk 6

  Rooms


Stewart Filmscreen Corporation builds completely freestanding screen and frame systems to specific project requirements, using either front or rear projection.






The VisionDome by Elumens offers several configurations for audience sizes from 1 to 45 people. The projection system and 3D images are scalable across the product family. This means content can be developed once and used across the entire product family. Projection systems are interchangeable between models.


Model V3- With a ceiling height of 8 feet and audience capacity from 1-6 people, the V3 is a perfect fit for locations with small work groups or audience sizes. Highly affordable, the V3 should be considered as a financially attractive way to take 3D immersive display capability to the locations where the work is done.

Model V4- This model is the workhorse of the product family. It is used for audience sizes from 1-10 and is an ideal solution for mid-sized workgroups, small trade show venues or for entertaining small audiences. The V4 is also sized for many simulation applications.

Model V5- Larger audiences in customer events, trade shows, customer centers, entertainment or teaching environments will benefit from the roominess and large images of the Model V5. Large simulation devices also work well in the V5. The V5 accommodates 1 to 45 people.






Elumens' VisionDome is an immersive digital environment that, via proprietary projection and rendering technologies, creates a full color, high-resolution, interactive, 3D display in which a dozen or more people can simultaneously participate and collaborate without having to use restrictive head-mounted displays or goggles.

The VisionDome projects a 360 by 180 degree panorama on the interior of a dome, which is tilted from 0 to 90 degrees for viewing ease. Because of the Dome's parallax effect, how it engulfs viewers' fields of view, and Elumens' proprietary rendering technology, the display creates strong spatial 3D. This causes viewers to perceive 3D, and to feel completely immersed in the virtual environment.

The VisionDome is designed to be deployed for 3D applications in which many people need to participate jointly. Since there is only one projection source and nothing to wear, the VisionDome is the best solution for collaborative environments. Because of its simplicity, the VisionDome is a practical and cost-effective alternative to head-mounted displays.





Barco has designed a multi-purpose Virtual Environment that is portable and lets users reconfigure the system from a flat CADWall to a cubic immersive space environment, or to any position in between.


Cubic Immersive Space Environment (I-Space)
• Full immersive experience
• Single viewer typically
• Possibility to reconfigure into a sandwich configuration

Immersive Theater Configuration
• Typical angle of 45° between the modules
• Immersive feeling for about 20 viewers
• Collaborative work in a partial immersive environment




Flat Screen Configuration (CADWall)
• Typical large audience
• Collaboration
• Project reviews with large teams


Also, Barco delivers a projected Curved Screen with matching Virtual environment systems.   The BR Center is a complete integrated projection solution using Barco proprietary technology from projectors, screen structures to the peripherals. The combination of high quality hardware and our proven know-how of large projection technology ensure a successful integration and maximum return on investment. This multi-channel system is edge blended to provide one seamless image, projected onto a cylindrical screen, creating an immersive environment.

Application Area
This advanced immersive visualization environment is being used as a media facility in various industries from leading research institutes to commercial organizations. The systems are designed to support various business processes such as data analysis, design and modeling, training, planning and presentation.


PanoWall displays from Panoram Technologies are bright, large-scale, rear-projected image walls that are ideal for engineering, product development, presentation environments and Command and Control applications. They feature high-brightness DLP stereographic and non-stereographic projectors. The projectors are mounted behind the screen on Panoram's proprietary optically alignable projector assemblies featuring high-grade front-surface mirrors and MicroWarp alignment optics.


The basic stretched vinyl screen can be upgraded to a specialty coated acrylic material. Cabling is performance matched to the extreme resolution capabilities of the system. Image integration electronics include Video Panoram® edge blending, the exclusive Panoram Black Level technology to improve dark scene viewing, eight-channel source switching, and Panoram's incomparable Integrator 2000 drag-and-drop user interface.

The systems are scalable and come in a variety of formats to suit various applications and needs.  Each PanoWall system comes with Panoram's incomparable Integrator™ system control. This ties all parts of the system (projectors, edge blending, input selection, audio, screen formatting and presentation sequencing) into a unique, graphically intuitive, user-friendly interface that makes the PanoWall system an ideal presentation and collaboration tool with unprecedented resolution, screen real estate and flexibility to fit almost any need.

  Theaters


A most versatile simulation display technique is the segmented rear projection screen. Each screen panel can be manufactured in a variety of shapes and sizes to accommodate display scenarios ranging from an M-2 DART type system to a full vista 360-degree air traffic control (ATC) simulator.  Stewart Filmscreen’s proprietary cantilevering wedge screen framing system allows each image to be rear projected completely to the outer edge of the seam, which minimizes the apparent seam gap between adjacent images.


Whatever the project requirements, Stewart Filmscreen can meet your needs with its full line of rear projection screens.  Screen choices include: Stewart acrylic diffusion, fresnel lenticular lens or weight saving flexible diffusion.  When floor space is limited behind the screens, consider folding the projector optics with front surface mirrors.  Stewart Filmscreen can design, build and deliver turnkey front surface mirror systems, including projector stands and projector mounts.



Coupled with a Cyviz “Stereo 3D Convertor”, which enables high quality stereoscopic 3D visualization, this product can produce great results.  Cyviz also produces a PC stereo edge blending solution.


















The commercially available CAVE is the Fakespace Systems CAVE, which originated at the Electronic Visualization Laboratory at the University of Illinois at Chicago.





CAVE (an acronym for "CAVE Automatic Virtual Environment") is a room-sized advanced visualization solution that combines high-resolution, stereoscopic projection and 3D computer graphics to create the illusion of a complete sense of presence in a virtual environment. The CAVE was the first virtual reality technology in the world to allow multiple users to fully immerse themselves in the same virtual environment at the same time. Today, more CAVEs are installed in visualization facilities around the world than any other spatially immersive display system. The CAVE is available with four projected surfaces (three walls and a floor), five surfaces, or a fully enclosed six-surface configuration for complete virtual immersion.




SEOS' leading-edge technology and specialist knowledge of virtual reality has empowered us to design, supply, project manage and integrate some of the most complex V-Domes in use today. These sensational fully immersive environments are scalable from 6m/20ft diameter to 21m/69ft diameter hemispheres and beyond.

The technical challenges associated with the projection of real-time images across huge hemispheres are many times more complex than those presented by conventional VR environments. SEOS has created, developed and adapted specific technologies to solve these problems delivering cutting edge display technology and show automation controls for V-Domes.



  Future Display Products


Lightspace Technologies has a product known as the DepthCube Z1024 3D Display system. 


It is a rear-projection volumetric display in which a high-speed DLP™ video projector sends a series of 3D image slices into a 3D projection volume. The projection volume is composed of a physically deep stack of 20 electrically switchable liquid crystal scattering shutters. At any instant in time, 19 of the scattering shutters are transparent and only one is in a white scattering state. The display sequentially switches a single shutter into the scattering state and projects onto it the image slice corresponding to its physical depth. Since each image slice is stopped in the projection volume at the correct depth, the DepthCube produces a 3D image that is truly deep. A patented 3D anti-aliasing hardware algorithm virtually eliminates the visual discontinuities between layers so that the 3D image appears to be completely smooth and continuous, so continuous you won’t believe there are only 20 layers!

With the high-speed projector sending out 1000 image slices per second, the whole volume is refreshed 50 times a second. This is comparable to the field refresh rate of NTSC video in the US and PAL video in Europe (although the actual frame refresh rates of these are 30 Hz and 25 Hz respectively).
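The refresh figure follows directly from the slice rate and the number of shutter layers. The sketch below works through that arithmetic; the function and constant names are illustrative assumptions, and only the values 1000 and 20 come from the description above.

```python
SLICES_PER_SECOND = 1000  # image slices the projector sends each second
SHUTTER_LAYERS = 20       # liquid crystal shutters in the projection volume


def volume_refresh_hz(slices_per_second: float, layers: int) -> float:
    """Each full volume needs one slice per layer, so divide the slice rate."""
    return slices_per_second / layers


def slice_time_ms(slices_per_second: float) -> float:
    """Time available to switch one shutter and project one slice."""
    return 1000.0 / slices_per_second


refresh = volume_refresh_hz(SLICES_PER_SECOND, SHUTTER_LAYERS)
per_slice = slice_time_ms(SLICES_PER_SECOND)
print(f"Volume refresh: {refresh:.0f} Hz, time per slice: {per_slice:.1f} ms")
```

Under these figures each shutter has roughly one millisecond in its scattering state per pass.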

Due to the high speed digital interface between the computer and DepthCube Z1024 3D Display, a completely new 3D image can be written to the display nearly 20 times each second. Although not quite fast enough for Virtual Reality, this update rate is fast enough for real-time user interaction with the 3D image.  

A prime example of this product’s use is airport baggage screening, as mandated by the US Congress and by other countries, where screeners need quick, accessible information about a bag's contents without interrupting the flow of luggage traffic.


A 3D baggage scanner, such as the eXaminer 3DX® 6000 from L-3 Communications®, acquires true 3D information about the contents of a bag. However, all of the critical depth information must be discarded in order to display the baggage scan on a conventional 2D computer display. The screener must then attempt to recreate a 3D mental representation of the bag contents in order to detect contraband. Various computer tricks can be played that generate different perspective views or a rotating image, but these are still 2D projections of what is really 3D data.

The consequence of flattening 3D data onto a 2D display is to create ambiguity and uncertainty in the detection of contraband. These then lead to either an increase in viewing time with a resultant loss of throughput, or a reduction in accuracy resulting in an unacceptable increase in the volume of approved contraband.        

The DepthCube 3D Volumetric Display eliminates the need to convert 3D information into a 2D image. This critical step enables the screener to use all of their mental faculties to identify contraband instead of performing 2D-to-3D mental conversions. The result is greater throughput and accuracy and greater overall value obtained from expensive 3D baggage scanning equipment.








Perspecta Spatial 3D System includes a 20-inch dome displaying full-color and -motion images that occupy a volume in space, giving users a 360° god's-eye view - without goggles. Perspecta is plug-and-play, rendering all movement from widely used open-standard 3D applications. It features:

·      Unmatched 100 million-voxel ("volume pixels") resolution and brightness

·      High-speed interaction with supported applications

·      Hardware-accelerated dithering to display hundreds of colors

·      Tools for rapid application development based on the industry-standard OpenGL® API



Another form of fully immersive spherical projection system is known as the “CyberSphere”.  These types of VR systems share one important limitation: the inability to move around the virtual environment in a natural way. An observer is constrained either by the physical boundaries (as with the CAVE system) or by the range of a head tracking system. The evident need to remove this limitation has been demonstrated by the development, in the U.S., of a device similar to a stationary unicycle, which attempts to simulate the walking motion of a person sitting upon it. It is not the ideal solution, as it introduces its own restrictions upon freedom of movement.


The spherical projection system comprises a large, hollow, translucent sphere, 3.5 meters in diameter, supported by means of a low-pressure cushion of air. This air cushion enables the sphere to rotate in any direction.  An observer is able to enter the large hollow sphere by means of a closable entry hatch. Walking movements of the observer cause the large sphere to rotate.  See figure below.  [27]


Diagram Of Main sphere

Image Courtesy of VR-Systems – Ndtilda. UK


Rotational movement of the large sphere is transferred to a smaller secondary sphere, which is supported by means of a ring, mounted upon a platform, within which are mounted bearings (see figure below). The smaller sphere is pushed against the large projection sphere by means of spring-loaded supports. Rotational movement of the smaller sphere is measured by means of rotation sensors, pushed against the circumference of the sphere by means of spring-loaded supports.

Diagram of Small sphere

Image Courtesy of VR-Systems – Ndtilda. UK

5.8.3      Tracking Systems


Tracking devices were developed to help one maneuver through a virtual environment.  There are several types of tracking systems that use specific devices and configurations that integrate with existing virtual reality platforms and systems.  A study done by a committee of individuals introduced a summary comparison of these types of tracking systems.  The following sections include various inputs from that study from 1999.[28]

  Electromagnetic


Six degree of freedom (DoF) electromagnetic tracking is based on the application of orthogonal electromagnetic fields.[29] [30] [31]  The system consists of a magnetic field transmitter and a receiver coupled via driving circuits. The electromagnetic transmitter contains 3 orthogonal coils that are pulsed in sequence, and the receiver has 3 coils that measure the field generated by the transmitter. The strength of the received signals is compared to the strength of the sent pulses to determine the position, and the received signals are compared to each other to determine the orientation.  To date two varieties of electromagnetic position trackers have been implemented: one uses alternating current (AC) to generate the magnetic field, the other uses direct current (DC). In an AC system, a current is sent to the emitter coils in sequence so that 3 mutually perpendicular magnetic fields are generated. The field induces currents in the receiver, which also consists of 3 passive, mutually perpendicular coils. Sensor location and orientation are therefore computed from the 9 induced currents by calculating the small changes in the sensed coordinates and then updating the previous measurements. Carrier frequencies are typically in the 7 to 14 kHz range. The excitation pattern and processing are typically repeated at 30 to 120 Hz rates.


In contrast to the continuous wave generated by the AC systems, DC systems emit a sequence of DC pulses, which is in effect equivalent to switching the transmitter on and off. This design is intended to reduce the effect of the field distortion due to the eddy currents induced in nearby metals when the field is changing. The initial measurements are performed with all 3 antennas shut off so that the (x, y, z) components of the Earth’s magnetic field are measured. Next, each transmitter coil is pulsed in sequence and the induced current is recorded on each receiving coil after a short delay, allowing the eddy currents to die out. Earth’s magnetic field components are then subtracted from the 9 measured values generated in each receiver coil by each pulse, and the resulting 9 values are then used to compute the location and orientation of the receiver relative to the transmitter.
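The baseline-subtraction step of a DC tracker can be sketched as follows. This is a minimal illustration under stated assumptions, not any vendor's firmware: the function name `remove_earth_field`, the data layout (`raw[t][r]` is the reading on receiver coil `r` while transmitter coil `t` is pulsed), and the sample readings are all hypothetical.

```python
def remove_earth_field(raw, earth):
    """Subtract the Earth-field baseline from the 9 pulsed measurements.

    raw   -- 3x3 nested list; raw[t][r] is the reading on receiver coil r
             taken while transmitter coil t is pulsed (after eddy currents
             have died out)
    earth -- 3 readings, one per receiver coil, taken with the transmitter
             switched off
    """
    return [[raw[t][r] - earth[r] for r in range(3)] for t in range(3)]


# Hypothetical sensor readings in arbitrary integer units.
earth_baseline = [12, -7, 30]
pulsed_readings = [
    [512, 40, 95],   # transmitter x coil pulsed
    [60, 480, 22],   # transmitter y coil pulsed
    [18, -3, 450],   # transmitter z coil pulsed
]

corrected = remove_earth_field(pulsed_readings, earth_baseline)
print(corrected)
```

The 9 corrected values are what a position and orientation solver would then consume; that solver is outside the scope of this sketch.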



Electromagnetic trackers have several advantages over other technologies. They are popular and supported by many years of field use. The receivers are small and unobtrusive, typically smaller than a 1" cube, and can be fit onto an existing head mounted display with very little extra burden. Multiple object tracking is also available. The Flock of Birds transmitter made by Ascension, for instance, supports up to 16 receivers at one time.


There are also several disadvantages. First, the receivers are tethered. To transmit the information from the receivers, wires must be run from each one to the computer, limiting the motion of the user. Untethered magnetic trackers are now appearing on the market, but comparatively bulky IR transmitters and battery packs replace the wiring. Second, transmitters typically have a small range, with accurate measurements ending at a few feet. Part of this is due to the limited power of the transmitter, but a large part comes from side effects of environmental conditions. Electromagnetic interference from radios and other devices can cause erroneous readings. Also, large objects made of ferrous metal can disrupt the magnetic field if they’re too close. “Too close” is generally defined as being closer to the transmitter or receiver than the two are to each other. This means that for a magnetic tracking system to function optimally, the environment must be kept clear of metal objects. In an office environment where metal desks, filing cabinets, and computers can’t be removed, working range is limited. Another problem is that most buildings contain metal support structures in the floor, walls and ceiling that further reduce the working range of the tracker. In most practical environments, providing a metal-free area is virtually impossible.
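The rule of thumb for metal objects can be written as a simple distance check. The sketch below is an illustration only; the function name `is_too_close` and the example coordinates (in meters) are assumptions, and a real site survey would also consider the size and material of the object.

```python
import math


def is_too_close(metal, transmitter, receiver) -> bool:
    """Apply the rule of thumb: a metal object is too close when it is nearer
    to the transmitter or to the receiver than those two are to each other."""
    separation = math.dist(transmitter, receiver)
    nearest = min(math.dist(metal, transmitter), math.dist(metal, receiver))
    return nearest < separation


# Hypothetical layout, coordinates in meters.
transmitter = (0.0, 0.0, 0.0)
receiver = (1.0, 0.0, 0.0)

filing_cabinet = (0.5, 0.5, 0.0)  # about 0.71 m from both units
far_desk = (3.0, 0.0, 0.0)        # 2 m from the receiver, 3 m from the transmitter

print(is_too_close(filing_cabinet, transmitter, receiver))
print(is_too_close(far_desk, transmitter, receiver))
```

Note that the check depends on the current receiver position, so an object that is acceptable when the user stands near the transmitter can become a problem as the user walks away from it.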

Researchers have developed techniques to account for these environmental conditions and compensate for their effects. Warps in the magnetic field can be mapped allowing errors in measurements to be predicted and compensated for. Researchers at the University of Illinois came up with calibration matrices to correct for errors in their CAVE Extended Range Transmitter Flock Of Birds (ERTFOB).
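One simple way to picture field-warp compensation is a lookup table built from surveyed points. The sketch below is a hypothetical nearest-neighbor scheme, not the calibration matrices developed at the University of Illinois; the function names, the data layout and the sample values are all assumptions made for illustration.

```python
import math


def build_error_map(samples):
    """Build a table of (measured position, error vector) pairs.

    samples -- list of (true_position, measured_position) tuples collected
               by placing the sensor at surveyed locations
    """
    return [
        (measured, tuple(m - t for m, t in zip(measured, true)))
        for true, measured in samples
    ]


def compensate(reading, error_map):
    """Correct a reading using the error recorded at the nearest mapped point."""
    _, error = min(error_map, key=lambda entry: math.dist(entry[0], reading))
    return tuple(r - e for r, e in zip(reading, error))


# Hypothetical calibration data in centimeters: (true, measured).
calibration = [
    ((0, 0, 0), (0, 0, 0)),        # no distortion near the transmitter
    ((100, 0, 0), (104, -2, 1)),   # warped reading near a metal cabinet
]

error_map = build_error_map(calibration)
corrected = compensate((103, -2, 1), error_map)
print(corrected)
```

A practical system would interpolate between several neighboring calibration points and would correct orientation as well as position; nearest-neighbor lookup is used here only to keep the idea visible.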



Figure Polhemus series                                       Figure Polhemus tracking Configurator


Polhemus Inc. - LIBERTY is the latest development in electromagnetic tracking, with an unprecedented speed of 240 updates per second per sensor, up to 16 sensors, and accuracy of .03 RMS for X, Y, Z position and .15° RMS for orientation.


 LIBERTY makes it easy to track virtually anything that is non-metal. Within the Software Developer Kit (SDK), API libraries allow for easy integration into custom applications. The easy to use GUI (image below) allows four independent user definable profiles for setting system parameters such as filtering, output formats, coordinate rotations and many more, allowing multiple applications or users. New environmental calibration software is included that allows rapid calibration of most environments in minutes. Additional features include a data record/playback component, plus the capability to quickly export data via Microsoft® "Named Pipe".




                   Figure 5DT – Data Glove 5                                           Figure 5DT – Data Glove 16


The Fifth Dimension Technologies (5DT) Data Glove 5 measures finger flexure (1 sensor per finger on the left and 2 sensors per finger on the right) and the orientation (pitch and roll) of the user's hand. It can emulate a mouse and can be used as a baseless joystick. The system interfaces with the computer via a cable to the serial port (RS 232 - platform independent). It features 8-bit flexure resolution, extreme comfort, low drift and an open architecture. The 5DT Data Glove 5-W is the wireless (untethered) version of the 5DT Data Glove 5. The wireless system interfaces with the computer via a radio link (up to 20m distance) on the serial port (RS 232). Right- and left-handed models are available. One size fits many (stretch lycra). A USB adapter is available.



Immersion – CyberGlove (with a tracking device from either Polhemus or Ascension)




The CyberGlove® is a fully instrumented glove that provides up to 22 high-accuracy joint-angle measurements. It uses proprietary resistive bend-sensing technology to accurately transform hand and finger motions into real-time digital joint-angle data.


The CyberGlove has been used in a wide variety of real-world applications, including digital prototype evaluation, virtual reality, biomechanics, and animation. The CyberGlove has become the de facto standard for high-performance hand measurement and real-time motion capture.



Ascension Technologies – 3D Navigator



3D Navigator comes with an extended range transmitter (far left), a wireless electronics unit with sensors (center) that is easily mounted on a user, and a compact electronics unit (far right). The system includes:

·      Hip-mounted electronics unit in lightweight nylon pack with belt strap and holster for Wanda.

·      4 batteries and battery charger

·      Wanda with embedded 6DOF sensor for interactive pointing and navigation

·      6DOF sensor for head tracking

·      Compact base station with antenna and Ethernet interface

·      Extended Range Transmitter and Controller

·      42 inch (1.1 m) transmitter mounting pedestal

Wanda is designed into each 3D Navigator. It is a palm-sized, thumb-activated navigation and interaction tool with a joystick and 3 programmable buttons. It comfortably fits into your hand for pointing and manipulation of 3D objects in a virtual environment. This makes it an ideal tool for CAVE and large-scale visualization systems.

Its joystick-like control ball lets you control translational motion along both the "X" and "Y" axes. Three programmable, colored 'momentary' switches offer excellent tactile feedback for selection reassurance. In addition, its embedded 6 DOF magnetic sensor lets you track and mark locations within the virtual environment.







Ascension Technologies – Flock of Birds (Real-Time Motion Tracking)


  • Position and orientation tracking without restrictions. No need for a clear line of sight between sensors and transmitter; blocking is never an issue.
  • Simultaneous tracking of all sensors without degradation in measurement rates. Track head and hands at the same time without delay or lag.
  • Pulsed DC magnetic tracking lets you operate in environments precluding use of earlier AC electromagnetic trackers.
  • Multiple configurations available to address most tracking requirements.

Mechanical


Mechanical tracking systems range from simple armature devices to complex whole-body exoskeletons. They measure limb positions through mechanical linkages, where the relative position of each link is combined with the others to give final positions and orientations. Measurements are usually joint angles and lengths between joints. Potentiometers and linear variable differential transformers (LVDTs) are typically used as transducers, but a wide array of technologies can be used, including resistive bend sensors and fiber optics. Mechanical trackers can be ground-based or body-based. A tracker is called ground-based when it is attached to the ground, allowing the user to work only within its constraints. Body-based trackers attach the entire system to the user, allowing freedom of movement.
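The way joint angles and link lengths combine into a final position and orientation can be illustrated with a minimal sketch. It assumes a planar (2-D) chain of rigid links, and the function name, link lengths, and angles are hypothetical values chosen for illustration, not taken from any particular product.

```python
import math

def forward_kinematics(link_lengths_m, joint_angles_deg):
    """Return (x, y, heading_deg) at the end of a planar chain of links.

    Each joint angle is measured relative to the previous link, which is
    what a potentiometer mounted at the joint would report.
    """
    x = y = 0.0
    heading = 0.0  # accumulated orientation of the current link, in radians
    for length, angle in zip(link_lengths_m, joint_angles_deg):
        heading += math.radians(angle)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y, math.degrees(heading)

# Hypothetical two-link arm: 0.30 m upper arm raised 90 degrees,
# 0.25 m forearm bent back by 90 degrees at the elbow.
x, y, heading = forward_kinematics([0.30, 0.25], [90.0, -90.0])
print(f"end point: ({x:.2f} m, {y:.2f} m), heading {heading:.1f} deg")
```

Because each link position depends on every joint before it, a small error at a joint near the base shifts everything further along the chain.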



There are many advantages to mechanical trackers. They are impervious to external fields or materials and thus can be used in a wide range of environments. If properly attached to the body, the exoskeletons accurately reflect the user’s joint angles. The sensors are taken from well-developed technology in other fields and are inexpensive, accurate, and fast. These systems are inherently fast, so the sensing itself does not severely limit bandwidth or latency. Systems that are contained on the body are not limited to a confined workspace and can be used in many environments impractical for other tracking techniques.


Unfortunately, mechanical trackers are very cumbersome. They can be bulky and heavy, and they can severely limit the motion of the user. Devices are attached to soft tissue, trading comfort for accuracy -- a tight fit is necessary to maintain accuracy. Making a system feasible for marketing requires it to be robust enough to fit multiple users of different height, weight, and gender. This is a difficult problem at best! The machine-to-human connection is subject to shifting. Relative motion between the user and the device can occur, resulting in random error during use.

FakeSpace Systems



FakeSpace Boom 3C (left), FakeSpace FS2 (center), FakeSpace Pinch (right)



Another form of mechanical tracking appears in the telepresence control of robotic machinery.


Acoustic


Acoustic tracking is a fairly simple technique, typically employing ultrasonic frequencies so that the signal cannot be heard by humans. A transmitter emits ultrasonic sound, and a special microphone receives it. The time it takes the sound to reach the microphone is measured, and from that time the distance can be calculated. This technique is known as time-of-flight (TOF). One microphone and one transmitter allow determination of distance only, so triangulation of multiple signals is needed to find position. Three microphones or three transmitters are needed for this. To get both position and orientation, three transmitters and three microphones are needed.
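The time-of-flight calculation and the position fix from three microphones can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method of any specific product: the speed of sound is taken as a constant 343 m/s, the three microphones are assumed to lie in the plane z = 0 at (0, 0, 0), (d, 0, 0), and (i, j, 0), and the transmitter is assumed to be on the z ≥ 0 side of that plane (three distances alone cannot tell the two mirror-image solutions apart). All names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed constant (air at roughly 20 °C)

def tof_distance(time_s, speed=SPEED_OF_SOUND):
    """Distance travelled by the pulse during its time of flight."""
    return speed * time_s

def trilaterate(r1, r2, r3, d, i, j):
    """Locate a transmitter from its distances to three microphones.

    Microphones are assumed at (0, 0, 0), (d, 0, 0) and (i, j, 0).
    Returns the solution with z >= 0.
    """
    x = (r1**2 - r2**2 + d**2) / (2 * d)
    y = (r1**2 - r3**2 + i**2 + j**2) / (2 * j) - (i / j) * x
    z = math.sqrt(max(r1**2 - x**2 - y**2, 0.0))  # clamp small negative noise
    return x, y, z

# Simulate flight times for a transmitter at (0.3, 0.4, 1.2) metres.
mics = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
true_position = (0.3, 0.4, 1.2)
times = [math.dist(true_position, m) / SPEED_OF_SOUND for m in mics]

r1, r2, r3 = (tof_distance(t) for t in times)
estimate = trilaterate(r1, r2, r3, d=1.0, i=0.0, j=1.0)
print(estimate)
```

Recovering orientation as well would repeat the same position fix for three transmitters mounted at known offsets on the tracked object.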

Another less commonly used method is phase coherency. This technique measures the phase difference between the tracking signal and a reference signal. The difference is used to calculate changes in the position of the transmitter relative to the receiver. The problem with this method is that errors accumulate over time. Periodic re-calibration is necessary.
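The phase-based calculation can be illustrated with a short sketch. It assumes a single-frequency carrier, a constant speed of sound of 343 m/s, and phase changes smaller than one full cycle between successive measurements; the function names and the 40 kHz carrier are hypothetical choices for illustration.

```python
import math

def displacement_from_phase(delta_phase_rad, frequency_hz, speed=343.0):
    """Convert a change in phase into a change in distance.

    One full cycle (2*pi radians) corresponds to one wavelength.
    """
    wavelength = speed / frequency_hz
    return delta_phase_rad / (2 * math.pi) * wavelength

def track_relative_position(start_m, phase_changes_rad, frequency_hz):
    """Accumulate displacements; any per-step error accumulates with them."""
    position = start_m
    for delta in phase_changes_rad:
        position += displacement_from_phase(delta, frequency_hz)
    return position

# Four quarter-cycle phase changes on a 40 kHz carrier, starting at 1 m.
steps = [math.pi / 2] * 4
position = track_relative_position(1.0, steps, 40_000.0)
print(f"{position:.6f} m")
```

Because the position is a running sum of increments, a small error in any one phase measurement stays in the total, which is why periodic re-calibration against a known position is needed.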

A third approach is passive acoustic tracking. One such system uses a four-microphone array as the detector and the human voice as the transmitter [Omologo, 1996]. It applies crosspower spectrum phase analysis and, in a room with moderate noise and reverberation, shows a locational accuracy between 2 cm and 10 cm.


Acoustic tracking has several advantages. The technology is inexpensive, and the parts used to make an acoustic tracker are readily available. The microphones and transmitters are small and can easily be placed unobtrusively on the body. Acoustic tracking also has a longer range than most other technologies.


Acoustic tracking also has disadvantages. First, line of sight must be maintained between the transmitter and receiver. Second, the speed of sound in air varies with temperature, pressure, and humidity. Steps must be taken to account for the environment in order to avoid inaccurate calculations. Furthermore, ambient sounds and echoes can cause "ghost" signals to be received and can interfere with incoming signals. These effects can drastically reduce the accuracy and effective working area of acoustic trackers. Techniques to effectively address both ambient sounds and echoes are just starting to show up in the literature.
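The effect of temperature on a time-of-flight measurement can be estimated with a small sketch. It uses the common linear approximation for the speed of sound in dry air, c ≈ 331.3 + 0.606·T m/s with T in °C, and deliberately ignores pressure and humidity; the function names and the example values are assumptions made for illustration.

```python
def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c

def compensated_distance(time_s, temp_c):
    """Time-of-flight distance using a temperature-corrected sound speed."""
    return speed_of_sound(temp_c) * time_s

flight_time = 0.005  # a 5 ms flight, roughly 1.7 m

assumed = compensated_distance(flight_time, 20.0)  # tracker calibrated at 20 °C
actual = compensated_distance(flight_time, 30.0)   # room has warmed to 30 °C
error_mm = (actual - assumed) * 1000.0
print(f"uncorrected error: {error_mm:.1f} mm")
```

Under these assumptions a 10 °C change produces an error of about 30 mm over a 5 ms flight, which is large compared with the accuracy expected of a head or hand tracker.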

Logitech – 3D mouse and tracker



The 3D Mouse and Head Tracker systems operate as follows: the transmitter emits ultrasonic signals. The control unit, via the receiver, detects these signals and derives receiver position and orientation data. The control unit reports this data and 3D Mouse button activity to the host computer.

Infrared


A standard infrared (IR) tracking system usually includes several emitters in a fixed, rigid arrangement and cameras that receive the IR light; the position is triangulated from the camera data. IR tracking does not appear to be affected by metal, and it offers a high update rate and low latency.


Its disadvantages are that it requires a clear line of sight and that most IR systems can be affected by glare or high-intensity light shining directly on the tracking system.




The NDI Polaris System delivers precise, real-time tracking flexibility. Compact and easy to use, this innovative optical measurement system combines simultaneous tracking of both wired and wireless tools.

The Polaris System determines real-time position and orientation by measuring the 3D positions of markers affixed to both wired and wireless tools. The NDI family of renowned real-time 3D measurement systems continues a tradition of precise, accurate and versatile technology.


NDigital - Polaris                                                                       




For the Polaris position sensor, customer feedback has indicated a widely held misconception that tracking tools with passive markers is inherently less accurate than using tools equipped with active markers. This may arise, in part, from the inability to clean passive spheres in a repeatable manner, which limits them to single usage, and so discourages exhaustive testing with various appropriate configurations. But a major contribution stems from users often not fully understanding some of the key aspects of the representative statistics used to qualify the system’s performance. Many users have at best a rudimentary knowledge of some statistical principles based on ideal Gaussian distributions, but the specifications for spatial measurement systems are usually based on the distance error magnitudes, which are inherently non-Gaussian, and more importantly, are seldom distributed spatially in a uniform manner.


Optional Accessories for more accuracy



Autoclavable IRED markers with a 5 mm base (left). A rigid probe with pointer tip and attached passive marker mounting posts and passive spheres (right).

Inertial


Inertial trackers apply the principle of conservation of angular momentum. They attach directly to a moving body, and give an output signal proportional to their motion with respect to an inertial frame of reference. There are two types of inertial tracking devices: accelerometers and gyroscopes. Accelerometers sense and respond to translational accelerations, and gyroscopes sense and respond to rotational rates.



Accelerometer outputs need to be integrated once with respect to time to get velocity, and twice to get position. Many technologies are used to implement accelerometer designs, including piezoelectric, piezoresistive, and capacitive technologies.
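The double integration can be sketched numerically. The example below is a minimal illustration using simple rectangular (Euler) integration; the sample rate and the constant 0.004 g measurement error fed into it are illustrative assumptions, and the function name is hypothetical.

```python
G = 9.81  # m/s^2 per g

def integrate_twice(accelerations, dt):
    """Integrate acceleration samples to velocity, then to position."""
    velocity = 0.0
    position = 0.0
    for a in accelerations:
        velocity += a * dt         # first integration: acceleration -> velocity
        position += velocity * dt  # second integration: velocity -> position
    return velocity, position

# A stationary sensor whose output carries a constant error of 0.004 g,
# sampled at 1 kHz for 60 seconds.
dt = 0.001
samples = [0.004 * G] * 60_000
velocity, position = integrate_twice(samples, dt)
print(f"apparent velocity {velocity:.2f} m/s, apparent position {position:.1f} m")
```

Although the sensor never moves, the integrated position ends up roughly 70 m away, in line with the closed-form value ½·a·t² = 0.5 × 0.004 × 9.81 × 60² ≈ 70.6 m. The error grows with the square of time because it passes through both integrations.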

Piezoelectricity is a common pendulous sensing technique. Piezoelectric materials develop distributed electric charges when displaced or subjected to forces. Piezoelectric accelerometers employ a cantilever design with a piezoelectric film attached to the beam of the cantilever. When accelerated, the proof mass causes the beam to deflect, which in turn causes the piezoelectric film to stretch, resulting in an electric charge difference (the output of the sensor). Piezoelectric accelerometers are called active devices because they generate their own signals. Since these sensors require a time-varying input (physical work), they do not respond to steady-state inputs such as the acceleration of gravity, hence they are called AC responsive (sense only changing signals).

Another common transducer technology used for accelerometers is piezoresistivity. Piezoresistive materials change their electrical resistance under physical pressure or mechanical work. If a piezoresistive material is strained or deflected, its internal resistance will change, and will stay changed until the original position of the material is restored. Piezoresistive accelerometers can sense static signals and are thus called DC sensors.



There are two main branches of gyroscope design: mechanical gyros that operate using the inertial properties of matter, and optical gyros that operate using the inertial properties of light.

Mechanical gyroscope designs are commonly of the vibrating type. Vibrating gyroscopes use Coriolis acceleration effects to sense when they rotate. This is accomplished by establishing an oscillatory motion orthogonal to the input axis in a sensing element within the gyro. When the sensor is rotated about its input axis, the vibrating element experiences Coriolis forces in a direction tangential to the rotation (orthogonal to the vibratory and rotating axes).

Optical gyroscopes operate based on the "Sagnac Effect." These sensors use two light waves traveling in opposite directions around a fixed path. When the device is rotated, the light wave traveling against the direction of rotation will complete its revolution faster than the light wave traveling with the rotation. This effect is detected by comparison of the phase difference in the two light waves.
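The size of the phase difference can be estimated with the standard Sagnac relation for a fiber coil, Δφ = 8πNAΩ / (λc), where N is the number of turns, A the area enclosed by one turn, Ω the rotation rate, λ the wavelength, and c the speed of light. The sketch below simply evaluates that relation; the coil dimensions, wavelength, and rotation rate are hypothetical values chosen for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def sagnac_phase_shift(rotation_rate_rad_s, loop_area_m2, wavelength_m, turns=1):
    """Phase difference (radians) between the two counter-propagating waves."""
    return (8 * math.pi * turns * loop_area_m2 * rotation_rate_rad_s
            / (wavelength_m * SPEED_OF_LIGHT))

# Hypothetical coil: 1000 turns, 0.1 m diameter, 1550 nm light,
# rotating at 90 degrees per second (a brisk head turn).
area = math.pi * 0.05 ** 2
rate = math.radians(90.0)
phase = sagnac_phase_shift(rate, area, 1550e-9, turns=1000)
print(f"phase difference: {phase:.3f} rad")
```

The phase difference is proportional to the rotation rate, so the gyroscope output is a rate signal that must be integrated once to obtain orientation.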



Inertial trackers are small and potentially unobtrusive. AMP makes a model, the ACH-04-08, that has dimensions of 0.4 x 0.4 x 0.06 inches. Inertial trackers have an update rate that is one of the fastest of any tracking technology. Inertial trackers are unaffected by external fields and can be used in almost any environment.

InterSense markets a product, the IS-900 Mark 2 Precision Motion Tracker, that has an update rate of over 500 Hz.


InterSense – IS-900 Wireless Minitrax Head Receiver, Head Tracker and Transmitter.


The main disadvantage of inertial trackers is that they are subject to considerable drift. Positions are found by integrating the tracker signals over time (t), which also integrates any signal errors. As a result, position errors accumulate. To find position with an accelerometer, the raw data is integrated twice, so errors grow with t², and even small amounts of noise can lead to large errors. In just 60 seconds, an accelerometer with an output noise level of just 0.004 g yields a position uncertainty of about 70 meters. Gyroscope errors increase linearly with time and are not quite as susceptible to large errors. They drift approximately 10 degrees per minute and are also sensitive to vibrations. Because of this inherent buildup of absolute positional errors, inertial trackers are limited in usefulness to relative motion sensing or tracking over brief periods.

Visual tracking


Video tracking methodologies are radically different from other VR tracking schemes. Other technologies provide direct, simple input to the VR system. This input usually takes the form of a 1-D data-stream of position, orientation, or acceleration of the object being tracked. This is not the case with video tracking. Video output is a series of 2-D, time sequential images. The size, frame rate, view volume, optical distortion and other factors are all dependent on the type of video used, any filters applied, and the analysis techniques employed. Currently there are very few integrated video-tracking products available for virtual environment work. This is partially because video output is much more complex than other tracking technologies' output, requiring substantially more processing power for analysis. The number of possible combinations of technical factors in video tracking systems is vast, and typically each system must be customized to fit the intended application.

Using Environmental Markers

Currently, the most common approach to registering object position and orientation, including the human body, is by placing known markers in the physical environment or on the object itself. These markers have known qualities that can be more easily detected in the video image. The techniques being developed are varied and new approaches are common in the research literature. One method is to use the inside-out approach and place multiple markers in the scene. The markers can then be used to determine the position and orientation of the camera. In one system the MIT Media lab used three fixed fiducial markers each with unique color values and ensured that those hues did not appear elsewhere in the environment [State 1996]. As long as the markers remained in the image frame the position of the user's head (with mounted camera) could be calculated.
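The first step of such a system, locating a uniquely colored marker in the video frame, can be sketched as below. This is a simplified illustration and not the method of any cited system: the frame is a small nested list of RGB tuples rather than real camera data, and the function name, tolerance, and colors are hypothetical. Recovering the camera position and orientation from three or more marker centroids requires an additional pose-estimation step that is not shown.

```python
def find_marker_centroid(image, target_rgb, tolerance=30):
    """Return the (column, row) centroid of pixels close to target_rgb.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Returns None when no pixel matches, i.e. the marker left the frame.
    """
    sum_col = sum_row = count = 0
    for row_idx, row in enumerate(image):
        for col_idx, pixel in enumerate(row):
            if all(abs(c - t) <= tolerance for c, t in zip(pixel, target_rgb)):
                sum_col += col_idx
                sum_row += row_idx
                count += 1
    if count == 0:
        return None
    return sum_col / count, sum_row / count

# A 5x5 gray frame containing a 2x2 red marker at rows 1-2, columns 2-3.
GRAY, RED = (128, 128, 128), (250, 10, 10)
frame = [[GRAY] * 5 for _ in range(5)]
for r in (1, 2):
    for c in (2, 3):
        frame[r][c] = RED

centroid = find_marker_centroid(frame, target_rgb=(255, 0, 0))
print(centroid)
```

The approach only works if the marker color appears nowhere else in the scene, which is why the marker hues must be kept out of the rest of the environment.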

Color is not a requirement, however, and the same techniques can be applied to gray scale intensity values. This is the approach taken by Kato and Billinghurst in the UW HITL (University of Washington – Human Interface Technology Lab) Mixed Reality project. Here, fiducial markers of known intensity values and shape are placed in the environment. Again, the camera is attached to the user and the position and orientation of the fiducial markers is determined relative to the user's head coordinates; a transform to world coordinates is not necessary. This allows a unique twist in the work done by Kato and Billinghurst. The markers are not static in the world. The markers can be moved and any virtual overlay associated with a marker moves with the marker in the user's display.


Video Tracking with Natural Objects

A growing area of research for vision-based tracking is that of natural 3-D object tracking. This includes accurate registration of a 3-D object, by which we will be able to track the head position and orientation, given the camera-model parameters. The main advantages of this approach are that the vision-based tracking system will no longer be dependent on fiducial markers, allowing greater mobility and a deeper sense of natural interaction.

A number of vision techniques for 3-D object tracking have been proposed in the literature. These techniques can be divided into the following major categories: mesh-based modeling, neurofuzzy classification, simple shape fitting, feature-extraction-based tracking, and shape-volume approximation.


Video tracking has several advantages. It provides a flexible system that can be developed by instrumenting only the user, only the environment, or both. Video tracking allows many objects to be tracked by a single sensor without having to tag each object. This capability to track multiple objects without placing transmitters on them removes an encumbrance that has hampered many other tracking technologies. When users are required to wear cameras, the cameras are relatively small and light. Video tracking is also scalable. The same technology can be used to track the motion of any of the objects in a room, from hands to whole bodies. This scalability means that one technology has the potential to do the job previously accomplished only with several combined technologies.


Video tracking technology can be very complex. The output from a video camera is a 2D array, and the analysis method for that information is application dependent. The bounds and limitations of video tracking are not well understood at this time. Because there are no general purpose image tracking systems available, each application must develop its own video tracking system. Video tracking is still more of a research endeavor than an end user product.

Video tracking can also be significantly affected by environmental factors. Lighting conditions need to be controlled in order for the camera to see objects in the environment consistently. Significant changes in light, such as dimming the lights to present slides, can have a detrimental effect on the video data. A camera must also have an unoccluded view of the object it is tracking. Typically, redundant cameras are required to provide a continuous, unobstructed view.

Courtesy of an abstract for “Visual Tracking for a Virtual Environment”. [32]


Few companies have brought this technology to market. Research shows that, even though the process is understood, it has proven complex and difficult to turn into a product. However, studies suggest that this may happen within the next few years.


One research study has built a tracking setup around the VICON system using these components.



The workbench was fitted with six cameras: four were placed in front of the user, oriented towards the working area, and two were placed behind the user, looking over their shoulder (see Figure (left), where four of the cameras have been highlighted). The cameras can be fitted with either visible-light or infrared emitters for detection of the markers. As the workbench needs to be used in an unlit room, the infrared emitters were attached to the cameras. Components that are to be tracked are fitted with retro-reflective markers in sizes of 4 mm, 6.5 mm, and 12.5 mm (see Figure (right)).


For proper operation of the system, the cameras and the components within the real world need to be calibrated. Calibration of the cameras involves tracking a fixed-size wand, fitted with two retro-reflective markers at a known distance from one another, within the operating volume of the cameras; this takes about a minute to complete. The devices that are to be tracked also have retro-reflective markers attached. The markers were placed on the required devices, in this case the shutter glasses (see Figure 1 (right)) and the Intersense wand. Originally, five 12.5 mm balls were placed on the devices. The locations of the balls were measured with respect to the origin of the device, and using this information the system can calibrate the device for future real-time recognition. The 12.5 mm balls can be seen attached to the wrist strap on the left of Figure 1 (right). After successful testing of the setup, the markers were replaced with six 4 mm balls. These are much smaller and much less noticeable when placed on the devices in question. They were calibrated, tested, and compared with the larger balls and were found to produce results of the same accuracy.


To implement gesture recognition, we attach a number of 4mm markers on a fabric glove. As shown in Figure 2 (left), there is a small number of markers placed on the metacarpal and some on each individual finger. The latter markers are placed at the finger joints, so that the distance between immediately adjacent ones remains fixed; this allows the formation of rigid body segments as seen in Figure 2 (right). The metacarpal is considered a single almost rigid body, while on average there are three bodies per finger. 


The VICON system is used in two stages. In the first, the user repeatedly performs a series of gesticulations that span the entire human kinematic range at all possible finger positions and angles. Subsequently, manual labeling of the markers takes place, and the system calculates the kinematic information relating the trajectories of all markers over the entire motion trial to a predefined kinematic structure that corresponds to the hand marker and body model. This is an off-line procedure, and it only needs to take place once, unless the glove size and/or marker positions change. The second stage is the real-time operation of the system, which provides the spatio-temporal data while the user is performing a series of gestures. During this phase, the captured marker trajectories are best fit to the stored model information.



Marker attachments on a glove (left). Schematic placement of markers and formation of body segments in a typical glove marker set configuration (right).




   Other Tracking Options


Simply put, a hybrid tracking system is a combination of the aforementioned tracking systems. Combining them can extend the tracking range. One set of tracking data can also be used to correct the “dead reckoning” of the other components.
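One common way to combine a drifting dead-reckoned estimate with an absolute measurement is a complementary filter. The sketch below is an assumed illustration of that general idea, not the method of any system described in this manual: a gyroscope with a constant bias of 10 degrees per minute is integrated (dead reckoning) and gently pulled toward the reading of an absolute tracker, such as a magnetic one. The function name, the blend factor alpha, and the sample rate are hypothetical, and angle wraparound at 360 degrees is ignored.

```python
def fuse_heading(previous_deg, gyro_rate_dps, dt, absolute_deg, alpha=0.98):
    """Blend a dead-reckoned heading with an absolute heading measurement."""
    predicted = previous_deg + gyro_rate_dps * dt  # dead reckoning step
    return alpha * predicted + (1.0 - alpha) * absolute_deg

# The tracked object is actually stationary at 0 degrees. The gyro reports
# a spurious rate of 10 degrees per minute; the absolute tracker reads 0.
dt = 0.01                     # 100 Hz update rate
gyro_bias_dps = 10.0 / 60.0
gyro_only = 0.0
fused = 0.0
for _ in range(6000):         # 60 seconds
    gyro_only += gyro_bias_dps * dt
    fused = fuse_heading(fused, gyro_bias_dps, dt, absolute_deg=0.0)

print(f"gyro only: {gyro_only:.2f} deg, fused: {fused:.3f} deg")
```

In this sketch the gyro-only estimate drifts by 10 degrees in a minute, while the fused estimate stays within a tenth of a degree. The price is the added cost and complexity of running and aligning two tracking systems.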


Some disadvantages are the cost, the complexity, and the need to correlate measurements between the various systems.


With the latest advances in Global Positioning System (GPS) technology, tracking is evolving into other areas of interest. Tracking methods being researched and developed use GPS together with “Finger Tracking” from MIT and “Eye Tracking” from RCI.


Eye Tracking by RCI

Eye gaze is a natural, fast, and low-effort physical act that may prove to be effective as a human-computer interface (HCI). When using a traditional pointing device such as a mouse or stylus, a user's gaze typically arrives at a point of interest, and the device-controlled cursor lags behind. Thus, advantages of a gaze-based interface include potentially increased efficiency as well as hands-free interaction.

At RSC, a combination of COTS eye and head trackers and custom algorithms is being employed in a system that determines the gaze position of a freely moving user. Referring to the figure, the system's major software processes are implemented as network servers. In this design, each module may be modified and upgraded independently. Furthermore, each piece is available for use by other client applications (e.g. the 6DOF tracking server is utilized in 3D audio applications, and the CST server is utilized in tablet/stylus applications). All of the servers interact with clients, which may execute on any platform supporting Transmission Control Protocol/Internet Protocol (TCP/IP). [33]

Image Courtesy of Bauhaus-Universitat Weimar

Another eye-tracking device, the Polhemus VisionTrak™ head-mounted eye tracking system, is a time-tested, fully integrated turnkey solution for eye and target tracking. This robust system, developed by ISCAN, Inc. of Burlington, MA, collects pupil size, eye movement, and eye point-of-regard data from human subjects while allowing complete freedom of head movement. Distinctive system features allow for quick and accurate data collection and analysis for a variety of applications and environments.

Finger Tracking by MIT (Media Lab)


One of the simplest applications of this camera-based wearable computer is finger tracking. Many pen computer users appreciate the pen interface for its drawing capability. However, given that a computer can visually track the user's finger, there is no need to use a pen (see Figure below). Such an interface allows the user to replace normal computer pointing devices, such as the mouse, with his finger. The user can control the operating system in this manner or digitize an image and virtually annotate it.

Image Courtesy of MIT Media Lab


Research has given us many different types and methods of tracking, and combining these trackers with “Augmented Reality Technology” can produce very interesting results.


Hybrid Tracking Systems


Researchers at the University of North Carolina have developed a hybrid tracking scheme that combines the (static) registration accuracy of vision-based tracking systems with the robustness of magnetic tracking systems. This system works well for static scenes and for scenes in which only the camera moves. Other techniques for latency management are still needed to achieve dynamic registration.


The accompanying image demonstrates the accuracy of the tracker registration. The virtual teapot uses a reflection map from the real environment to simulate a chrome surface. Notice the reflection of the hand and the flashlight beam in the computer-generated object. In the real environment, a reflective sphere is sitting at the location of the teapot. In real time, the image of the sphere is grabbed in the AR head-mounted display (HMD). Only precise registration makes it possible to know where the sphere is located in the HMD image.

Image Courtesy of University of North Carolina, Chapel Hill


Show example of QTMovie here.

5.8.4      Sound Systems


This information is courtesy of “Sound in VR”.

Human beings have two, not one, ears at about equal height on both sides of the head. This well-known fact is the basis of many of the outstanding features of human auditory perception. Identifying faint signals in a noisy environment, comprehending a specific talker in a group of people all speaking at the same time, enjoying the "acoustics" of a concert hall, and perceiving "stereo" with our hi-fi system at home, would hardly be possible with only one ear. In their effort to understand and to take advantage of the basic principles of human binaural hearing, engineers have done the groundwork for a new branch of technology - now known as Binaural Technology. Binaural Technology is able to offer a great number of applications capable of having noticeable impact on society. One of these applications is the representation of the auditory sensory domain in so-called Virtual-Reality systems. To this end, physiologically adequate treatment of the prominent sensory modalities, including the auditory one, is mandatory.

Technically speaking, auditory representation in VR systems is implemented by means of a sound system. However, in contrast to conventional sound systems, the auditory representation is non-stationary and interactive, i.e., among other things, dependent on listeners' actions. This implies, for the auditory representation, that very complex, physiologically-adequate sound signals have to be delivered to the auditory systems of the listeners, namely to their eardrums.

One possible technical way to accomplish this is via transducers positioned at the entrances to the ear canals (headphones). Headphones are fixed to the head and thus move simultaneously with it. Consequently, head and body movements do not modify the coupling between transducers and ear canals (so-called head-related approach to auditory representation) - in contrast to the case where the transducers, e.g. loudspeakers, are positioned away from the head and where the head and body can move in relation to the sound sources (room-related approach). In any real acoustical situation the transmission paths from the sources to the eardrums will vary as a result of the listeners' movements in relation to the sound sources - the actual variation being dependent on the directional characteristics of both the sound sources and the external ears (skull, pinna, torso) and on the reflections and reverberation present.

Virtual Reality systems must take account of all these specific variations. Only if this task is performed with sufficient sophistication will the listeners accept their auditory percepts as real - and develop the required sense of presence and immersion.

At this point it makes sense to begin the technological discussion with the earliest, but still a very important category of application in Binaural Technology, namely, "binaural recording and authentic auditory reproduction." Authentic auditory reproduction is achieved when listeners hear exactly the same in a reproduction situation as they would hear in an original sound field, the latter existing at a different time and/or location. As a working hypothesis, Binaural Technology begins with the assumption that listeners hear the same in a reproduction situation as in an original sound field when the signals at the two eardrums are exactly the same during reproduction as in the original field. Technologically speaking, this goal is achieved by means of so-called artificial heads that are replicas of natural heads in terms of acoustics, i.e. they develop two self-adjusting ear filters like natural heads. Applications based on authentic reproduction exploit the capability of Binaural Technology to archive the sound field in a perceptually authentic way, and to make it available for listening at will, e.g., in entertainment, education, instruction, scientific research, documentation, surveillance, and telemonitoring. It should be noted here that binaural recordings could be compared in direct sequence (e.g., by A/B comparison), which is often impossible in the original sound situations.

Since the sound-pressure signals at the two eardrums are the physiologically adequate input to the auditory system, they are furthermore considered to be the basis for auditory-adequate measurement and evaluation, both in a physical and/or auditory way. Consequently, there is a further category of applications, namely "binaural measurement and evaluation." In physical binaural measurement, physically based procedures are used, whereas in the auditory cases human listeners serve as measuring and evaluating instruments. Current applications of binaural measurement and evaluation can be found in areas such as noise control, acoustic-environment design, sound-quality assessment (for example, in speech technology, architectural acoustics, and product-sound design), and in specific measurements on telephone systems, headphones, personal hearing protectors, and hearing aids. For some applications scaled-up or scaled-down artificial heads are in use, for instance, for evaluating architectural scale models.

Since artificial heads are basically just a specific way of implementing a set of linear filters, one may think of other ways of developing such filters, e.g., electronically. For many applications this adds additional degrees of freedom, as electronic filters can be controlled at will over a wide range of transfer characteristics. This idea leads to yet another category of applications: "binaural simulation and displays." There are many current applications in binaural simulation and displays, and their number will certainly further increase in the future. The following list provides examples: binaural mixing, binaural room simulation, advanced sound effects (for example, for computer games), provision of auditory spatial-orientation cues (e.g., in the cockpit or for the blind), auditory display of complex data, and auditory representation in teleconference, telepresence and teleoperator systems.

Image Courtesy of Cybertherapy, Sound in VR, Binaural technology in virtual auditory environments

Binaural-Technology Equipment of Different Complexity:

(a) probe-microphone system on a real head,

(b) artificial-head system,

(c) artificial-head system with signal-processing and signal-analysis capabilities,

(d) binaural room-simulation system with head-position tracker for virtual-reality applications. [34]


The task of an auditory renderer is to compute the spatial, temporal, and spectral properties of the sound field at the subject's position for a given virtual world model, taking into account all physical parameters that influence the sound field in physical reality. To this end, a sound field model needs to be established whose results can be auralized by the front end. A suitable form of describing these results is the spatial map of secondary sound sources (Lehnert & Blauert 1989). The sound field is modeled by a cloud of discrete sound sources that surrounds the listener in a free sound field. Recently, an attempt has been made (Møller 1993) to define a standard format for describing the spatial map of secondary sound sources.

The following events are relevant to the sound-field model, and each has to be treated in a different manner:

·         Rotation of the subject: Head rotation only changes the spatial characteristics of the sound field. Accordingly, only the binaural filters in the auralization processors need to be updated.

·         Translation of the subject: In principle, all of the secondary sources have to be recalculated, making it necessary to re-execute the sound field model.

·         Rotation of sound source: Rotations of the sound source are only of relevance if the directivity of the source is not omnidirectional. In this case the directivity filters need to be updated.

·         Translation of sound source: Translations of the sound source require a re-execution of the sound field model.

·         Modification in the directivity of the sound source: In these modifications the directivity filters and possibly also the directivity filter database in the auralization unit need to be updated.

·         Modification in a surface property: The reflectivity filters, which model the reflection properties of the surfaces, need to be updated.

·         Modifications in the geometry of the virtual environment: These modifications include changes in shape, movement, creation, and deletion of virtual objects. In general a recalculation of the sound field is required.
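The event list above can be read as a lookup from each event type to the part of the renderer that must be refreshed. The following is a minimal sketch of that idea, not the architecture of any particular system; the event names and update-step names are hypothetical labels chosen for illustration.

```python
from enum import Enum, auto


class Event(Enum):
    SUBJECT_ROTATION = auto()
    SUBJECT_TRANSLATION = auto()
    SOURCE_ROTATION = auto()
    SOURCE_TRANSLATION = auto()
    SOURCE_DIRECTIVITY_CHANGE = auto()
    SURFACE_PROPERTY_CHANGE = auto()
    GEOMETRY_CHANGE = auto()


# Hypothetical names for the update steps described in the list above.
BINAURAL = "update_binaural_filters"
DIRECTIVITY = "update_directivity_filters"
REFLECTIVITY = "update_reflectivity_filters"
SOUND_FIELD = "rerun_sound_field_model"

REQUIRED_UPDATES = {
    Event.SUBJECT_ROTATION: {BINAURAL},
    Event.SUBJECT_TRANSLATION: {SOUND_FIELD},
    Event.SOURCE_ROTATION: {DIRECTIVITY},
    Event.SOURCE_TRANSLATION: {SOUND_FIELD},
    Event.SOURCE_DIRECTIVITY_CHANGE: {DIRECTIVITY},
    Event.SURFACE_PROPERTY_CHANGE: {REFLECTIVITY},
    Event.GEOMETRY_CHANGE: {SOUND_FIELD},
}


def updates_for(events, source_is_omnidirectional=False):
    """Collect the update steps needed for the events seen in one frame."""
    needed = set()
    for event in events:
        # Rotating an omnidirectional source changes nothing audible.
        if event is Event.SOURCE_ROTATION and source_is_omnidirectional:
            continue
        needed |= REQUIRED_UPDATES[event]
    return needed
```

Collecting the steps into a set means that several events in the same frame (for example, a subject translation and a geometry change) trigger the expensive sound field model only once.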

Example of a general system architecture for sound in VR.

Image Courtesy of Cybertherapy, Sound in VR, Architecture of an auditory VR system

5.8.5      Interactive Devices

Interactive Devices are tools used to help navigate through a virtual environment. They include:

·         Input Wands

·         Multi-Degree-of-Freedom Input Devices

·         Palmtop/Tablet Computers

·         Seated Interfaces

·         Human (or Body movement) Locomotion Interfaces

·         Sensing Gloves

·         Force Feedback Interfaces

·         Temperature Feedback Interface

·         Motion Capture Technology

       Input Wands

Most input wands share these characteristics: a small number of buttons, an analog joystick or thumb pad, light weight, and a relatively small size.  Most are ergonomically designed to fit most human hands.  Some integrate six-degrees-of-freedom tracking.  Some have other features built into the wand itself, for example a laser pointer or a Radio Frequency (RF) mouse pointer device.

Here is an example[35] from the University of Toronto’s Department of Computer Science, which uses a buttonless 3D wand input system.  The VisionWand is a simple cylindrical piece of plastic with different colored ends.


Different wands can be distinguished by different colors of the bodies, the ends, or additional markers, allowing different wands to be tracked using the camera setup.  No buttons or wheels are attached to the wand.  A pair of Logitech QuickCam Pro 3000 cameras is used for tracking.  The cameras face a back-projected display. The user interacts with the display using the wand.


3D reconstruction of wand from two cameras.

Mapping of wand to screen.

The system displays the red and blue circles, which show the orthogonal projections of the wand ends on the screen. The black cross displayed by the system indicates the intersection of the 3D ray and the screen. This intersection denotes the screen position that the wand is pointing to. We display both colored circles simply to give the user an idea of how the wand is being tracked, while the black cross serves as a pointer. In addition to the spatial positions, we make use of the information of two angles: orientation, defined as the obliquity of the orthogonal projection of the 3D ray on the screen, and tilt, defined as the inclination between the 3D ray and the screen.
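The geometry described above can be sketched in a few lines. This is an illustrative reconstruction, not the VisionWand implementation: it assumes the screen is the plane z = 0, that both wand ends have already been reconstructed as (x, y, z) points with z measuring distance in front of the screen, and the function name `wand_pointer` is hypothetical.

```python
import math


def wand_pointer(end_a, end_b):
    """Return the pointer position, orientation and tilt for a tracked wand.

    end_a, end_b: (x, y, z) positions of the two wand ends.
    Assumption: the screen is the plane z = 0 and z >= 0 in front of it.
    """
    # The end nearer to the screen is the active end.
    active, far = (end_a, end_b) if end_a[2] <= end_b[2] else (end_b, end_a)
    dx, dy, dz = (active[i] - far[i] for i in range(3))

    # Orthogonal projections of the wand ends (the colored circles).
    projections = ((end_a[0], end_a[1]), (end_b[0], end_b[1]))

    # Orientation: angle of the projected ray within the screen plane.
    orientation_deg = math.degrees(math.atan2(dy, dx))
    # Tilt: inclination between the 3D ray and the screen plane.
    tilt_deg = math.degrees(math.atan2(abs(dz), math.hypot(dx, dy)))

    if dz == 0:
        hit = None  # parallel posture: the ray never reaches the screen
    else:
        t = -far[2] / dz  # ray parameter where z becomes 0
        hit = (far[0] + t * dx, far[1] + t * dy)  # the black cross
    return projections, hit, orientation_deg, tilt_deg
```

For a wand with ends at (0, 0, 2) and (1, 0, 1), the ray reaches the screen at (2, 0) with a tilt of 45 degrees; a wand held at constant z has no intersection, which corresponds to the parallel posture.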



VisionWand postures and gestures. (a) Pointing posture: point to a position on screen; the end that is nearer to the screen is defined as the active end. (b) Parallel posture: keep the wand approximately parallel to the screen, in any orientation. (c) Tilt gesture: starting from a parallel posture, tilt the wand in either direction. (d) Tap gesture: quickly move the active end away from the screen and back again. (e) Parallel tap gesture: from parallel posture, quickly move the entire wand away from the screen and back again. (f) Flip gesture: quickly flip the wand end to end, keeping the orientation and tilt approximately the same as before the gesture. (g, h) Push and Pull gestures: change the distance between the wand and the screen. (i) Rotate gesture: change the orientation of the wand while keeping it in a parallel posture.


A different approach, developed at Delft University of Technology[36] in Delft, The Netherlands, uses two input devices that work together as a probing and navigating set for interacting with a visualization environment.

They have developed a stylus, a Plexipad, and a combination of the two in use.

One-handed interaction: Either the stylus or the Plexipad is actively used to interact with the environment. The stylus is suitable for direct (0D) or ray-casting (1D) selection and manipulation, whereas the Plexipad allows direct control (positioning and orientation) of objects that are virtually attached to the Plexipad.


Double one-handed interaction: The stylus and the Plexipad are used to perform unrelated one-handed tasks. The Plexipad and stylus each have their own separate functionality: a direct coupling between the tools is absent. This scenario allows a combination of stylus-based and Plexipad-based one-handed interaction scenarios. Although this direct relation between the two interaction tasks is absent, usually there will be a higher-level goal that is pursued.


Symmetric two-handed interaction: Both hands perform identical tasks. This type of interaction task is not likely to be used in our scenarios, considering our use of two distinct input devices.  Systems that use two identical input devices like wands or gloves do support symmetric two-handed interaction tools [2].


Asymmetric two-handed interaction: The Plexipad sets the 2D reference plane for the stylus. This combination of the two tools exploits the familiarity with the “pen and pad” metaphor. In addition, the pad provides tactile feedback to the stylus’ movements.  The combination of the plane shaped panel and similarly shaped virtual object proves to be very intuitive. It feels as if you are holding the virtual tool in your hand.


The user holds the direct data slicer (Plexipad) in the non-dominant hand and can quickly probe through the volume to get an overview of the data values inside.


Concurrently the dominant hand can be used to operate the stylus for manipulation, zooming of the data, and the selection of new tools. This is an example of two one-handed tools.


The direct data slicer forms a two-dimensional reference plane for other tools. It not only provides a reference for asymmetric two-handed interaction and passive haptic feedback, but it also provides a visual reference plane in a volumetric dataset. This combined feedback assists in a more accurate and intuitive selection of points and lines in 3D space by using the stylus directly on the Plexipad.
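One way to picture the pad acting as a 2D reference plane for the stylus is to express the stylus tip in the pad's own coordinate frame. The sketch below is an assumption-laden illustration, not the Delft implementation: it assumes the tracker reports the pad origin, two orthogonal unit vectors lying in the pad, and its unit normal, and the function name and tolerance value are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def stylus_on_pad(tip, pad_origin, pad_u, pad_v, pad_normal, tolerance=0.005):
    """Express the stylus tip relative to the pad plane.

    pad_u, pad_v: orthogonal unit vectors lying in the pad (assumed tracked).
    pad_normal:   unit normal of the pad.
    tolerance:    hypothetical contact threshold in meters.
    Returns (2D coordinates on the pad, signed distance, touching flag).
    """
    rel = tuple(t - o for t, o in zip(tip, pad_origin))
    distance = dot(rel, pad_normal)             # height above the pad
    coords = (dot(rel, pad_u), dot(rel, pad_v)) # position within the pad
    touching = abs(distance) <= tolerance
    return coords, distance, touching
```

When the tip is within the tolerance of the plane, the 2D coordinates can be treated like pen input on a tablet, which is the "pen and pad" metaphor described above.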





Two-handed exploration tools in an immersive visualization of atmospheric simulation data.

       Multi-Degree-of-Freedom Devices

Most devices of this type are built around a custom-designed manipulator that supports interaction with 3D geometry.

Spacetec Spaceball 4000 and 3003



With the Spaceball FLX PowerSensor technology, you can easily move or rotate your 3D model by gently pushing, pulling or twisting the PowerSensor ball.  The model then moves in the direction of the force or twist applied.

The Spaceball 3D Motion controller also provides smooth and dynamic model manipulation in that the greater the pressure applied to the ball, the faster the model moves or rotates. Its ergonomically designed base lets you rest your hands in a very natural, relaxed position, reducing hand and arm stress and fatigue.

3Dconnexion’s 3D motion controller  SpaceBall 5000

The SpaceBall 5000 delivers the comfort and efficiency users have come to expect from 3Dconnexion motion controllers with even greater performance than before. With 12 programmable buttons, you can keep functions, keyboard modifiers, and macros right at your fingertips. Designed to enhance demanding 3D software applications at the highest level, the SpaceBall 5000 lets you grab onto the power of your software with both hands.


The SpaceCat and the quCat were developed together with ergonomics experts from the Institute of Hygiene and Applied Physiology at the Federal Institute of Technology (ETH) in Zurich, Switzerland.[37]

AxiGlaze has developed the SpaceCat.  Its inductive spring technology (IS3D) allows for human-machine interaction in six degrees of freedom (6DOF). It may be embedded in any computer application providing 3D visualization.

IS3D allows for direct, precise and fast control of the virtual object, with features such as:

·          a large movement range that has been adapted to the hand's movement.

·          an elastic suspension that provides rich sensory feedback, allowing both fast and precise manipulation.

·          a movement range and spring suspension that have been adapted to support both position and velocity control.



The support of both position and velocity control allows the user to either manipulate virtual objects (e.g., for actions like pan, zoom and rotate in a CAD system) or navigate through virtual worlds (moving an eyepoint / viewpoint in walk-through / fly-through applications). Velocity control works well for navigation and coarse manipulation. However, for precise manipulation, position control is advantageous. This opens up a range of new possibilities in applications such as medicine and animation. Moving the handle of an IS3D device in any way, e.g., with a combined translation and rotation, lets the object on the screen follow the movement one to one.
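The difference between the two control modes can be shown with a one-axis sketch. This is a generic illustration of position (one-to-one) control versus velocity (rate) control, not vendor code; the function names, the `gain` parameter and the time step `dt` are assumptions made for the example.

```python
def position_control(object_pos, handle_disp, previous_handle_disp):
    """Position control: the object follows the handle one to one.

    The object moves by exactly the change in handle displacement.
    """
    return object_pos + (handle_disp - previous_handle_disp)


def velocity_control(object_pos, handle_disp, gain, dt):
    """Velocity (rate) control: handle deflection sets the object's speed.

    gain: hypothetical scale factor from deflection to speed.
    dt:   time step of the update loop in seconds.
    """
    return object_pos + gain * handle_disp * dt
```

Under velocity control, holding the handle at a constant deflection keeps the object moving, and a larger deflection moves it faster, which suits navigation. Under position control the object stops as soon as the handle stops, which suits precise placement.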

SpaceOrb 360

Simultaneous Six-Degrees-of-Freedom Control (6D Control) technology allows users to strafe, rotate and freelook on any axis and to create complex moves like Circle Strafes and Barrel Rolls instantly and easily.

The SpaceOrb 360's advanced capabilities emanate from the Spaceball PowerSensor ball, which picks up all the pushes, pulls and twists that you apply, and translates these into smooth movement on-screen. Think of the PowerSensor as your head and mimic with your head what you want to do on-screen. Push with your thumb on the back of the ball to move forward. To move backward, pull on the front of the ball. To view the length of an entity, push the ball side-to-side across the unit.

The PHANTOM (from SensAble) 1.5/6DOF and 3.0/6DOF devices allow users to explore application areas that require six degrees of freedom (6DOF). Examples include virtual assembly, virtual prototyping, maintenance path planning, teleoperation, and molecular modeling.  The 1.5/6DOF has a range of motion approximating lower arm movement pivoting at the elbow. The 3.0/6DOF has a range of motion approximating full arm movement pivoting at the shoulder.

       Palmtop/Tablet Computers

Most of these devices have attributes such as touch-sensitive screens, wireless networking capabilities, multimedia capabilities, voice/text recognition, integrated video cameras, standard WIMP interface methods, and PC hardware interfaces like USB or Serial.

The Palmtop computer allows Visualization Environment (VE) designers to integrate standard UI components into the VE.  Palmtop computers are small and easy to manipulate. A palmtop can be connected to the Internet with a single cable.  Wireless radio frequency networking options are increasingly available, with transmitters providing ranges in the hundreds of feet at megabit-per-second data rates. The palmtop is designed for low power consumption and features a self-contained power source. These wireless capabilities make the palmtop ideal for integration into the virtual environment.[38]



Some of the interaction tasks that the Palmtop can be used for are:

·         Updating Alphanumeric information – the touch-sensitive display enables character recognition for text input or, simply put, replaces the keyboard.

·         Indirect Object Manipulation – uses graphical icons, sliders and gestures to modify object attributes.

·         Way Finding – displays map information about the VE for the user, as a “You are here” guide.

·         Help information – could provide a large amount of text information on how to use specific functions.

·         Navigation – could be used to select direction of motion or velocity / acceleration.

·         Operation Selection – buttons, menus and dials could be used to perform operations in the VE.

·         Appearance – object colors, textures, and material properties could be selected, enabled or disabled.

These elements, put together, give the VE designer more control with quick input response time.

       Seated Interfaces

There have been many uniquely developed interactive systems for specific environments ranging from the use of a bicycle to driving simulators, flight simulators and virtual treadmills.  Here are some examples of these systems:

Bicycle simulator

The basic appearance and design of the CyberGear bicycle is similar to a recumbent.  Steering is accomplished using two handles to the left and right of the rider’s seat. In practice, pulling the left handle and pushing the right handle achieves a rightward turn. Since pulling and pushing alone do not provide the subjects with reasonable feedback about the magnitude of their turns, the designers of the VR Bike have implemented a special mechanical feedback system in order to provide the subjects with a more realistic simulation of turning. For this purpose, they connected the two steering handles to the body of the bicycle in such a way that the bicycle is tilted along its longitudinal axis whenever the rider turns left or right. In the above example of turning rightwards, this means that the body of the bicycle is also tilted rightwards.


A large truncated-cone-shaped projection screen replaced the flat projection screen of the old setup. We decided to use a tilted (approx. 10 degrees), instead of a vertical, projection screen since we expected depth perception to improve under these conditions.  The new screen's diameter is 6 m at its bottom and 7 m at its top, with a height of 2.8 m. Assuming that the bicycle rider is sitting in the center of the truncated cone, the field of view of the current setup is 180 x 50 deg (old setup: 50 x 40 deg). The simulation scene, computed on a Silicon Graphics Onyx Reality Engine II, is displayed using three projectors (AMPRO-3600). A soft-edge blending technique (Panoram's Panomaker) is used to adjust the overlap zones between the projectors.[39]


Driving Simulators

The VR Driving Simulator at KMRREC is a computer system that allows individuals to "drive" through several computer generated virtual environments. Through the use of a computer and a head-mounted device the person enters the "virtual car" and uses a steering wheel and foot pedals to simulate driving tasks.[40]

Example of a person driving in the VR Driving Simulator. The image on the computer screen is what the individual sees through the HMD.

While in the VR Driving Simulator, the individual can control the "virtual car" much like a real car. They can make the "virtual car" turn, stop, go fast or slow down. In addition to being able to "see" the road in front, the driver will also hear the sounds of the engine roaring as he/she is driving through the virtual routes.

The VR Driving Simulator is unique because it can allow for the evaluation of driving capacity in different and complex situations, which may not be possible in the "real world" due to safety concerns. For example, with the VR Driving Simulator, nighttime driving skills can be evaluated safely. Also the driver can try different routes, such as highways, residential streets or commercial areas. [41]


Simulator Setup: Haul Truck Training Simulator
The mock-up cabin is situated in the midst of three high resolution, high brightness, projection screens for the left-, forward- and right view. The 3 screens provide the trainee operator with a wide field of view (180 degrees), similar to what he would have experienced in a real vehicle. The left side-view mirror can be seen on the left screen. The mock-up cabin is mounted on a high performance motion base that provides realistic motion cues to the trainee operator.[42]


The ReelMotion Simulator is a standalone PC application that allows you to animate a car. You simply import a terrain and vehicle model and use a mouse or joystick to drive your vehicle in real time, just like a game or flight simulator. User-friendly vehicle setup menus allow you to change vehicle constants and basic dynamic (suspension) characteristics.

Vehicle Setup panel

Joystick Input

The STISIM from Systems Technology, Inc. simulates real world driving situations and safety hazards.

STISIM - curve on a mountain road

STISIM - traffic in the residential area

STISIM - road construction with traffic

STISIM - highway

STISIM - busy

Image Courtesy of Systems Technology, Inc.

This system can be applied to many training and research scenarios, including human factors research, driver evaluation, cab design and ergonomics, medical research on human body fatigue, and much more.

STISIM systems

This fully functioning system includes interactive driving with visual and auditory feedback.  A speed-sensitive steering wheel provides a force feedback “feel”.  Simulation configurations are set up through a Windows user interface.  Driver and vehicle performance can be measured as well.






Flight Simulators

Flight simulation has certainly come a long way from the original flight simulators of the 1930s to the sophisticated versions that are produced by commercial aeronautical agencies today.

An early example is the Army’s Link Trainer D2, built around 1937 and fitted with moveable ailerons, rudder, elevators and many instruments.  This trainer was suitable for showing the principles of flying to the public. [43]

Image Courtesy of Wings of Liberation Museum Park

The Boeing Company trains pilots for its 777 aircraft using this form of high-technology flight simulator, considered the “Ultimate Flight Simulator”, which is totally immersive and a whole lot of fun (Boeing, 1995).


Images Courtesy of the Boeing Company.


Here are some example models created for the Flight Simulation gaming industry at


       Locomotion Interfaces

A problem unique to many large-scale visualization displays arises when the user must move under his or her own power and those movements must be tracked and mapped into the presentation, whether it is built around a CAVE or simply a wall display.[44] [45]

Research is focused on the Treadport, a unique locomotion interface created by Sarcos Research Corp. The Treadport comprises a treadmill, a mechanical tether, and a CAVE visual display. A second-generation Treadport is a redesign with a larger treadmill, a stronger mechanical tether, and three back-projection systems for surround vision.


This type of locomotion interaction requires an understanding of the information gathered from the device.  Locomotion rendering is the presentation of mechanical stimuli to simulate the various aspects of natural locomotion. Our research focuses on the use of the Treadport's active mechanical tether to achieve a variety of locomotion aspects.

·         Enforcing unilateral constraints. The mechanical tether will apply a force to prevent people from walking through walls, making the experience more realistic.

·         Inertial force display.  When people run on a treadmill with constant velocity, there is no difference from ground locomotion.  However, when a user accelerates on a treadmill, the body is stationary and there is no inertial (F = ma) force.  With the tether, we can provide an artificial inertial force.  Psychophysical experiments show that users prefer this inertial display in comparison to other force control strategies. [46]

·         Slope Display. Treadmill tilt is often too slow to display sudden slope changes. Pulling or pushing on a user can simulate an artificial gravity force. Psychophysical experiments show that tether force is a reasonable facsimile for treadmill tilt. In addition, measurements of the leg joint angles while walking on a real slope versus walking on a level surface with tether force showed that the biomechanics are the same. [47]
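The inertial and slope displays above both reduce to a force command for the tether. The sketch below is a textbook-physics illustration rather than the Treadport controller: it assumes the inertial term is F = ma, that the slope term is the component of gravity along the incline (m g sin θ), that positive force pulls the user backward, and that the clamp limit `max_force_n` is a hypothetical safety bound.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def inertial_force(mass_kg, acceleration_ms2):
    """Artificial inertial force for a user accelerating on the belt (F = ma)."""
    return mass_kg * acceleration_ms2


def slope_force(mass_kg, slope_deg):
    """Gravity component along an incline; positive for an uphill slope."""
    return mass_kg * G * math.sin(math.radians(slope_deg))


def tether_force(mass_kg, acceleration_ms2, slope_deg, max_force_n):
    """Combined tether command, clamped to a hypothetical safety limit."""
    total = inertial_force(mass_kg, acceleration_ms2) + slope_force(mass_kg, slope_deg)
    return max(-max_force_n, min(max_force_n, total))
```

For a 70 kg user accelerating at 2 m/s² on level ground, the inertial term alone is 140 N. A negative slope angle gives a negative (forward-pushing) force, which corresponds to walking downhill.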

Current research is focused on:

·         Displaying uneven terrain such as stairs by time varying tether force profiles.

·         Evaluation of the benefits of 3D force application to the body, rather than just the single axis direction of the current tether. Pulling from the side can in principle simulate side slope walking. Biomechanical similarity has been established between walking on a sideways tilted platform using ATR's ATLAS system, and walking on a level treadmill under the influence of a passive pulley-weight system.  This result supports the addition of another active degree of freedom to the tether for side forces. Taken together with the frontal slope results, torso force feedback is an attractive alternative to moving the treadmill platform.

Traveling on foot is an intuitive and natural practice in the real world. However, the problem of moving around in the virtual environment (VE) on foot is one of the major obstacles to be tackled in virtual reality research. In order to realize natural navigation in the VE, we have developed walking systems.

The Torus Treadmill [48] is a locomotion interface built around a specially arranged treadmill. It consists of ten belt conveyors that are connected side by side and driven in the perpendicular direction. In this way the Torus Treadmill can provide an infinite plane that creates the sense of walking.

GaitMaster is a locomotion interface that generates an omni-directional uneven surface. The core elements of the device are two 3 DOF motion-bases mounted on a turntable. A walker stands on top of the plate on the motion-base. Each motion-base is controlled so that it can trace positions of the foot, and the turntable traces the orientation of the walker.

       Sensing Gloves

This form of input is usually developed for tracking and hand gesturing.  Sensors are integrated into the glove.  Some gloves are able to determine information based on simple hand gestures and fingertip contacts.  Others combine this information with six-degrees-of-freedom tracking.  Some have even been augmented with haptic systems.


Two types of glove-based input devices have been developed: first, bend-sensing gloves, which measure finger joint movement, and second, the Pinch Glove, which detects electrical contacts between each of the fingertips.   Bend-sensing gloves are good at extracting geometrical information that enables them to represent the user’s hands in the virtual environment. They can be used to mimic interface widgets such as sliders and dials, [49] but do not have useful methods for signaling the activation or deactivation of the widget. Bend-sensing gloves are also used in conjunction with hand posture and gesture recognition, but it can be difficult to determine when one gesture begins and another ends without applying constraints to the user’s gesture space.[50]  Conversely, Pinch gloves provide a series of button widgets, placed on each fingertip, that allow for the extraction of topological data for interactions such as pinching postures. However, they have no way of determining the flexing of the fingers, and they make it difficult to represent the hand in a virtual environment.
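The complementary strengths of the two glove types can be illustrated by combining them: bend data identifies the posture, while a fingertip contact supplies the explicit activation signal that bend data lacks. This is a hypothetical sketch, not code from any glove SDK; the finger names, the 60-degree threshold and the thumb-to-middle-finger contact used as the trigger are assumptions for illustration.

```python
def is_pointing(bend_deg, threshold_deg=60.0):
    """Pointing posture: index finger extended, other fingers flexed.

    bend_deg: mapping from finger name to flexion angle in degrees
              (geometrical data, as a bend-sensing glove would supply).
    """
    others_flexed = all(bend_deg[f] >= threshold_deg
                        for f in ("middle", "ring", "pinky"))
    return bend_deg["index"] < threshold_deg and others_flexed


def selection_active(contacts):
    """Activation: thumb touching the middle finger.

    contacts: set of frozensets naming touching fingers
              (topological data, as a pinch glove would supply).
    """
    return frozenset({"thumb", "middle"}) in contacts


def glove_state(bend_deg, contacts):
    """Combine posture (bend data) with activation (contact data)."""
    if not is_pointing(bend_deg):
        return "idle"
    return "select" if selection_active(contacts) else "point"
```

Because the contact is a discrete on/off event, the start and end of the selection are unambiguous, which avoids the gesture-segmentation problem described above.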


A user wearing the Flex and Pinch input device is about to invoke the Head Crusher object selection technique on a round table. By placing his middle and index finger together, the user can activate the selection operation and move the table.[51]


A user pointing at and selecting a desk in the virtual environment. The user makes the selection by pressing the thumb to the right side of the middle finger.



Cyber Glove System

Immersion’s CyberGlove is a fully instrumented glove that provides up to 22 high-accuracy joint-angle measurements. It uses proprietary resistive bend-sensing technology to accurately transform hand and finger motions into real-time digital joint-angle data.


The basic CyberGlove system includes one CyberGlove, its instrumentation unit, a serial cable to connect to the host computer, and an executable version of Immersion’s VirtualHand graphic hand model display and calibration software.


Another Immersion product is the CyberTouch, which uses vibrotactile stimulation.


The CyberTouch glove (as shown below; $15,000 including the required CyberGlove) has arrays of SMAs on each finger and the palm. Each SMA can be individually controlled, so the SMAs can be vibrated together to create simple sensations such as vibration, or they can be used in combination to simulate complex textures. Systems that use this technology, such as CyberTouch, have the advantage of high update rates while still being lightweight.

       Force Feedback Interfaces

Many devices have been developed in the last 30 years in order to address the somatic senses of the human operator, but only a few have become widely available. The most probable reason is that the devices are either not very useful or really expensive (from US $10,000 up to more than US $1,000,000). By “devices for tactile/haptic output” [52] we mean devices that have been especially designed for this purpose.  In some sense, a standard keyboard and mouse also provide some kind of haptic feedback, namely the so-called breakaway force when a key or button has been pressed. Although this is important, as tests have shown that it increases the input rate in the case of a keyboard, we do not consider these devices within this section.

Devices with tactile, haptic, or force output address the somatic senses of the user. This can be done by the following methods:

Pneumatic stimulation

This can be achieved by air jets, air pockets, or air rings. Problems arise due to muscular fatigue and the pressure or squeezing effect, which means that the ability to sense is temporarily disabled. Another drawback of pneumatic devices is their low bandwidth.


Vibrotactile stimulation

Vibrations can be generated by blunt pins, voice coils, or piezoelectric crystals. These devices seem to be the best ones to address somatic senses because they can be built very small and lightweight and can achieve a high bandwidth.


Electrotactile stimulation

Small electrodes are attached to the user's fingers and provide electrical pulses. First results are promising, but further investigation is needed in this area.


Functional neuromuscular stimulation (FMS)

 In this approach, the stimulation is provided directly to the neuromuscular system of the operator. Although very interesting, this method is definitely not appropriate for the standard user.


Other methods do not address the somatic senses directly. For example, a force-reflecting joystick can be equipped with motors that apply forces in either of two directions. The same method is used for the Exoskeleton.

Nearly all devices with tactile output have been developed for either graphical or robotic applications. Many different design principles have been investigated, but the optimal solution has not been found yet.  Most probably, the increasing number and popularity of Virtual Reality systems will push the development of force feedback devices to a new dimension.[53]

Machine Haptics is the complementary study of machines, including the development of technology to mediate haptic communication between humans and computers as illustrated in the next figure.[54]

Image Courtesy of MIT Touch Lab

In the figure, a human (left) senses and controls the position of the hand; while a robot (right) exerts forces on the hand to simulate contact with a virtual object. Both systems have sensors (nerve receptors, encoders), processors (brain, computer), and actuators (muscles, motors). The type of application depends on how the computer, in turn, interacts with the rest of the world (not shown).

Force feedback provides direct perception of three-dimensional (3D) objects and directly couples input and output between the computer and user. It acts as a powerful addition to graphics display for problems that involve understanding of 3D structure, shape perception, and force fields. 
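A common way to make a device "exert forces on the hand to simulate contact with a virtual object" is a penalty (spring) model: when the tracked hand position penetrates the virtual surface, the device pushes back in proportion to the penetration depth. The sketch below illustrates that general technique under stated assumptions; it is not taken from any specific device API, and the stiffness and force-limit values are placeholders.

```python
def wall_force(hand_x, wall_x=0.0, stiffness=500.0, max_force=8.0):
    """Penalty-based force for a virtual wall occupying x < wall_x.

    hand_x:    sensed hand position along one axis, in meters.
    stiffness: hypothetical spring constant in N/m.
    max_force: hypothetical actuator limit in N.
    Returns the force (N) pushing the hand back out along +x.
    """
    penetration = wall_x - hand_x
    if penetration <= 0.0:
        return 0.0  # hand is in free space: no force
    return min(stiffness * penetration, max_force)
```

In a running system this function would be evaluated inside the sense-compute-actuate loop described above: the sensors report the hand position, the processor computes the force, and the actuators apply it.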

Humans have the ability to distinguish the feel of items by their texture (rough, smooth, etc.), by their give (hard, soft, springy, pliable …), and by their temperature (hot or cold, insulating or conductive).  Many items are recognized by a subtle combination of all three: consider a child’s plush toy versus a porcelain doll.  While we often take these senses for granted, trying to accurately recreate them virtually presents several engineering challenges, both mathematical and mechanical.

There are many variations of feedback in human-to-computer interfaces, but one that invariably stands out is force that is fed back to the user while in a Virtual Environment.  Most types of force feedback involve the hands.  Some of the products and research developments use additional devices such as:

·         Gloves – which can be outfitted with pneumatic pistons with variable piston pressures to simulate interaction with virtual objects.[55]

·         Exoskeletons – where the user is surrounded with specific attached hardware that resists the user’s motion. [56]

·         Telepresence – where robots place a real object at a location corresponding to the virtual world geometry.[57]




Here are some examples of hand and arm feedback.

The PHANTOM Premium 3.0 device (from SensAble) provides a range of motion approximating full arm movement pivoting at the shoulder. The device includes either a finger sled or a handle gimbal (choice of one) and provides 3 degrees of freedom positional sensing and 3 degrees of freedom force feedback.


The Mantis Haptic device (from Mimic Tech) provides three degrees of force feedback and six degrees of tracking. It is intended for professional users requiring top-grade components that maximize output force and resolution. At a customer's request, specialized tools, such as forceps or laparoscopic instruments for surgery simulation, can be attached to the Mantis device instead of the comfort grip.

Mimic provides an application programming interface (API) for software access to its devices.

Haptic Master (developed by U. Tsukuba) is a desktop force display.  Force sensation plays an important role in the recognition of virtual objects, and with this compact force-feedback device for desktop use, users can feel the rigidity or weight of virtual objects.  A six-degree-of-freedom manipulator employs a parallel mechanism to apply reaction forces to the fingers of the operator.  Most robotic manipulators have large-scale and high-cost hardware, which inhibits their application to human-computer interaction.

The Haptic Workstation from Immersion includes right & left handed CyberForce® whole-hand haptic-feedback systems mounted behind you on vertically adjustable columns that can be configured for both seated and standing applications.  Immersive 3D viewing is provided by a head-mounted display working with Immersion’s VirtualHand® software for CATIA or other 3D CAD models.

It can be configured to work with CAVE or screen environments.

Telepresence and Telerobotics give the ability to actually participate in real world applications while visiting a virtual environment.


The “Large Dextrous Arm” (developed by Sarcos) is used in high-strength applications where dexterity and long reach are required.  The arm has a 7-foot radius, lifts 300 pounds, swings at 30 MPH and handles objects as fragile as eggs.



The “SenSuit” allows direct interactive, real-time measurements of up to 32 DOF of an operator’s body.  Signals can be used to control full-body icons operating in synthetic environments, or any of the Sarcos mobility platforms or robotic systems.

The Gaming and Entertainment industry has certainly come up with some unique feedback interface devices, including this “Interactor Vest” by Aura.


Realistic action is provided by following the joystick motion: when you move the joystick, the R&R moves in that direction, and when the joystick centers, the R&R centers.

Temperature Feedback Interface

A small part of the haptic devices market and research effort is directed at temperature feedback. Two products are currently commercially available. The first is the TiNi Alloy Tactile Feedback system, which uses TFS memory metal that can be remotely heated using electric current. The TFS elements are positioned at several places in the hand and can therefore be used to give the impression of different heat levels throughout the hand. The other temperature feedback technology comes from CM research and uses thermodes along with DTSS (Displaced Temperature Sensing System). Thermodes consist of a thermoelectric heat pump, a temperature sensor, and a heat sink. The heat pump moves heat in or out of the heat sink, producing the sensations of hot and cold at the thermode's surface. These thermodes can be fitted to the fingertip (as in the illustration), and controlled via the DTSS X/10, which can command up to 8 thermodes. Hence the thermodes can be individually controlled to give heat sensation. [58]

Temperature Feedback example (Image Courtesy of Virtual Environments and Computer Group from University College London)

Smelling Feedback example (Image Courtesy of Symbolic Olfactory Display by Joseph Kaye, MIT, 1999)

Sensing Olfactory Interface

In a telepresent virtual reality system the sensing technologies would be limited to a predetermined set of odors. [59]

Electronic noses have been developed as automatic detection systems for various odors, vapors, and gasses. Detectors for odors such as natural gas and gasses such as carbon monoxide are used with safety alarms. In a virtual reality telepresence experience you want, in addition to sensing and identifying the smell, to transport a signal of the smell to a distant site and reproduce it. To accomplish this, the electronic nose must consist of a sensing unit and a pattern recognition system. The sensing element is built either as an array of several different elements measuring specific characteristics of the smell, or as one sensing device that outputs an array of measurements for each odor. This signature pattern is then compared to a pattern in a database to determine the odor. A unique mapping is produced for each odor. Another method to determine odors is to build an electronic nose with a unique sensor for every odor the nose must detect. This becomes difficult when there are numerous odors to be detected, and the highly selective chemical sensors required are expensive and not easily built. [60]
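The signature-matching step described above can be sketched as a nearest-neighbor lookup. This is only a minimal illustration under stated assumptions: the odor names, the four-element signatures, and the distance threshold are hypothetical values chosen for the example, and real electronic noses may use quite different pattern recognition methods.

```python
import math

# Hypothetical reference signatures: one reading per sensing element.
ODOR_DATABASE = {
    "natural gas": [0.9, 0.1, 0.3, 0.2],
    "carbon monoxide": [0.2, 0.8, 0.1, 0.4],
    "coffee": [0.4, 0.3, 0.9, 0.7],
}


def distance(a, b):
    """Euclidean distance between two equal-length signature vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same number of elements")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify_odor(reading, database=ODOR_DATABASE, max_distance=0.5):
    """Return the name of the closest stored odor, or None if nothing is close.

    max_distance is an assumed rejection threshold so that an unknown
    smell is not forced onto the nearest database entry.
    """
    best_name, best_dist = None, float("inf")
    for name, signature in database.items():
        d = distance(reading, signature)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= max_distance else None


if __name__ == "__main__":
    print(identify_odor([0.85, 0.15, 0.25, 0.2]))
    print(identify_odor([0.0, 0.0, 0.0, 0.0]))
```

Once an odor is identified, only its name (or database index) needs to be sent to the distant site, which is why a predetermined odor set keeps the transmitted signal small.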

Research is improving, but this area has not yet been a productive source of information.

Motion Capture

Although Motion Capture technology is not necessarily considered a tool for real-time interaction, it includes new technologies for capturing the dynamic movement of either humans or machinery.  This section covers some examples of this type of technology.

Alberto Menache defines motion capture as “The creation of a 3D representation of a live performance.” in the book Understanding Motion Capture for Computer Animation and Video Games. This is in contrast to animation that is created 'by hand' through a process known as keyframing.

Motion capture (a.k.a. MoCap) used to be considered a fairly controversial tool for creating animation.  In the early days (the 1980s), cleaning up motion capture data often took as long as having an animator create the animation from scratch.  MoCap has also been used for medical purposes, such as the study of physiological movement, and for design and engineering applications; one example is jet engine manufacturers using the immersive technology to study the performance of jet engines.  The great advantage of MoCap over traditional animation techniques such as keyframing and simulation is its ability to generate natural-looking animation.


There are several different types of Motion Capture systems, including:

·         Magnetic

·         Optical

·         Electro-Mechanical


Motion capture is typically accomplished by any of three technologies: Optical, Magnetic and Electro-mechanical. Though each technology has its strengths, there is not a single motion capture technology that is perfect for every possible use.




Magnetic motion capture systems utilize sensors placed on the body to measure the low-frequency magnetic field generated by a transmitter source. The sensors and source are cabled to an electronic control unit that correlates their reported locations within the field. The electronic control units are networked with a host computer that uses a software driver to represent these positions and rotations in 3D space.

Magnetic systems use 6 to 11 or more sensors per person to record body joint motion. The sensors report position and rotational information. Inverse kinematics (IK) is used to solve the angles for the various body joints, and compensate for the fact that the sensors are offset from the actual joint's center of rotation. The IK approach produces passable results from 6 sensor systems, but IK generally adds system overhead that can cause latency in real-time feedback. Although 6 sensor systems are less expensive, they are more likely to produce 'joint popping' since the IK solution needs to guess about a lot of the information it is receiving.
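To make the inverse kinematics step more concrete, here is a minimal sketch of a two-bone solver in a plane, for example solving shoulder and elbow angles from a wrist sensor position. It is a simplified illustration, not the solver any particular magnetic system uses: the function names, the planar simplification, and the choice of elbow bend direction are all assumptions. The fact that the solver has to pick one of two valid elbow solutions hints at why sparse sensor sets can produce 'joint popping'.

```python
import math


def two_bone_ik(x, y, upper_len, lower_len):
    """Solve a planar two-bone chain so its tip reaches (x, y).

    Returns (shoulder_angle, elbow_angle) in radians. The elbow angle is
    the bend relative to the upper bone (0 means a straight arm). Only one
    of the two mirror-image solutions is returned, which is an assumption
    a real solver would have to resolve from extra sensor data.
    """
    dist_sq = x * x + y * y
    dist = math.sqrt(dist_sq)
    if dist > upper_len + lower_len or dist < abs(upper_len - lower_len):
        raise ValueError("target is out of reach for these bone lengths")

    # Law of cosines gives the elbow bend.
    cos_elbow = (dist_sq - upper_len ** 2 - lower_len ** 2) / (
        2.0 * upper_len * lower_len
    )
    cos_elbow = max(-1.0, min(1.0, cos_elbow))  # guard against rounding
    elbow = math.acos(cos_elbow)

    # Shoulder angle: direction to target minus the offset caused by the bend.
    shoulder = math.atan2(y, x) - math.atan2(
        lower_len * math.sin(elbow), upper_len + lower_len * math.cos(elbow)
    )
    return shoulder, elbow


def forward(shoulder, elbow, upper_len, lower_len):
    """Forward kinematics, used here to check the IK answer."""
    tip_x = upper_len * math.cos(shoulder) + lower_len * math.cos(shoulder + elbow)
    tip_y = upper_len * math.sin(shoulder) + lower_len * math.sin(shoulder + elbow)
    return tip_x, tip_y


if __name__ == "__main__":
    angles = two_bone_ik(1.0, 1.0, 1.0, 1.0)
    print(angles, forward(*angles, 1.0, 1.0))
```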

Although magnetic systems may boast high frame rates, the need to filter noise from the data generally reduces the effective frame rate considerably and increases latency. The markers tend to move a bit during capture sessions, and require repeated readjustment and recalibration. Since each sensor requires its own (fairly thick) shielded cable, the tether used by non-wireless magnetic systems can be quite cumbersome.

Magnetic systems can also have issues with azimuth (tilt). If an actor is doing a push-up type posture, the system can get confused. Multiple actor magnetic setups may also have problems with two or more actors in close proximity. Sensors from the different actors will interfere with each other, providing distorted results.  [61]



There are two main technologies used in optical motion capture: Reflective and Pulsed-LED (light emitting diodes).

Optical motion capture systems tend to utilize proprietary video cameras to track the motion of reflective markers (or pulsed LED's) attached to joints of the actor's body. Single or dual camera systems are suitable for facial capture, while 3 to 16 (or more) camera systems are necessary for full-body capture. Reflective optical motion capture systems use Infra-red (IR) LED's mounted around the camera lens, along with IR pass filters placed over the camera lens. Optical motion capture systems based on Pulsed-LED's measure the Infra-red light emitted by the LED's rather than light reflected from markers.

The Motion Captor system is a reflective system. It uses widely available off-the-shelf hardware, resulting in very low cost and easy upgrading of system capabilities.

The centers of the marker images are matched from the various camera views using triangulation to compute their frame-to-frame positions in 3D space. Several problems often occur during the tracking process, including swapping of markers, noisy or missing data and false reflections.
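The triangulation step can be illustrated with two cameras. Each camera that sees a marker defines a ray from the camera position through the marker image; because of noise the two rays rarely intersect exactly, so a common approach is to take the midpoint of their closest approach. The sketch below assumes that approach and assumes the camera positions and ray directions are already known from calibration. The function name and sample coordinates are hypothetical, and commercial systems typically combine more than two views.

```python
def triangulate(origin1, dir1, origin2, dir2):
    """Estimate a marker position from two camera rays.

    Each ray is origin + s * direction. Returns the midpoint of the
    shortest segment joining the two rays as an (x, y, z) tuple.
    """
    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    w = [p - q for p, q in zip(origin1, origin2)]
    a = dot(dir1, dir1)
    b = dot(dir1, dir2)
    c = dot(dir2, dir2)
    d = dot(dir1, w)
    e = dot(dir2, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; marker cannot be triangulated")

    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2

    point1 = [o + s * k for o, k in zip(origin1, dir1)]
    point2 = [o + t * k for o, k in zip(origin2, dir2)]
    return tuple((p + q) / 2.0 for p, q in zip(point1, point2))


if __name__ == "__main__":
    # Two cameras 4 units apart, both seeing a marker at (1, 1, 5).
    print(triangulate((0, 0, 0), (1, 1, 5), (4, 0, 0), (-3, 1, 5)))
```

Marker swapping and false reflections arise before this step, when deciding which image point in one camera corresponds to which image point in the other.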

Most systems use a skeleton after the marker positions are captured. The Motion Captor system software uses a biomechanical skeleton during capture, allowing it to provide cleaner data with lower likelihood of noise, missing data or false reflections. The captured skeleton moves around the character's skeleton, which moves the mesh that makes up the skin of the character. This results in animation of the character. [62]




The Gypsy motion capture system (by Meta Motion) is an electro-mechanical technology suited to the majority of motion capture situations.


The Gypsy is a patented electro-mechanical system consisting of an exoskeleton made of lightweight aluminum rods that follow the motion of the performer's bones. Potentiometers (variable resistors) at the joints change voltage (varying resistance) based upon angular rotation of the rods. A gyroscope (mounted in the hip piece) is used to calculate the bearing (rotational direction) of the hips.
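As a rough illustration of how a joint potentiometer reading becomes a joint angle, the sketch below maps a measured voltage linearly onto an angular range. The voltage range, the angular range, and the assumption of a linear-taper potentiometer are hypothetical values chosen for the example, not specifications of the Gypsy system.

```python
def pot_angle_degrees(voltage, v_min=0.0, v_max=5.0,
                      angle_min=-150.0, angle_max=150.0):
    """Convert a potentiometer voltage to a joint angle in degrees.

    Assumes a linear potentiometer whose output sweeps from v_min to
    v_max as the joint rotates from angle_min to angle_max.
    """
    if not v_min <= voltage <= v_max:
        raise ValueError("voltage outside the calibrated range")
    fraction = (voltage - v_min) / (v_max - v_min)
    return angle_min + fraction * (angle_max - angle_min)


if __name__ == "__main__":
    for volts in (0.0, 2.5, 5.0):
        print(volts, pot_angle_degrees(volts))
```

In practice each joint would need its own calibration values, since the mounting of the rods determines which voltage corresponds to a neutral pose.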

Gypsy Gyro uses small solid-state inertial sensors ('gyros') attached to the body to measure the motions of the actor's skeleton.



Images Courtesy of Gypsy Motion Capture from Meta Motion

Any of these types of Motion Capture, coupled with motion software technologies such as Kaydara from Alias or Virtual Director from the National Center for Supercomputing Applications, can help create the desired effect from the input data received from these devices.

Virtual Director


The Virtual Director™ is a virtual reality interface that enables gestural motion capture and voice control of navigation, editing, and recording in the CAVE, ImmersaDesk™, and Infinity Wall™. The Virtual Director provides virtual choreography of multiple applications including Cave5D and also provides remote virtual collaboration (see section 10.12) capabilities, linking together CAVE devices and people represented as customized avatars. A patent has been filed for this technology with the Research and Technology Management Office, University of Illinois at Urbana-Champaign (UIUC).

The Virtual Director team is part of the Grid Technology Group in the NSF Partnerships for Advanced Computational Infrastructure (PACI) program at the National Center for Supercomputing Applications (NCSA), UIUC, and the leading-edge site of the Alliance.

Here is an example of a live 4-way collaborative session with four people at 3 different virtual devices and 1 workstation at remote locations.

Bob Patterson in the CAVE

Donna Cox at I-Wall

Glen Wheless at I-Desk 

Stuart Levy at SGI 

This image shows the map of the high-speed connection that links the above researchers.  The process of remote virtual collaboration over high-speed networks is called tele-immersion.


Image Courtesy of Virtual Director from NCSA

Another resource is FAVE (Framework Architecture for Virtual Environments), developed by the Christian Michelsen Research Center in Bergen, Norway, which uses the existing Internet and XML. [63]


For more on Motion Capture products, see Phoenix Technologies and Vicon.

5.8.6      Computing Environment


The computing environment requires an abundant amount of computing power and peripherals that work cohesively to create a virtual world.


Earlier we discussed VR hardware, specifically Image Generators, which embody all the necessary computing required.


Let’s discuss the actual specifics designed for this type of generation.


Any computing environment must be able to handle many tasks and functions in order to deliver a presentation that meets the client’s needs.  Some of these tasks / functions are:


High Frame Rate

Low Latency

High Resolution

Sound Generation

Development Tools

Operating Systems

Graphic Application Program Interfaces

Debugging

Compiler Updates

Technical Support

Stability



Most initial computing environments started as “Single Host” systems, where all of the above-mentioned functions are contained within a single system.  Another form of computing environment, pushed forward by the development of Microsoft Windows platforms, is known as a “Multi Host” or “Cluster” system.


Let’s compare these two types of computing environments.  Keep in mind that specific VR software toolkits work with each type of system, and that some of these systems can integrate Internet activities for collaboration.


Single Host


Multiple Processors > 4

Multiple Display Systems

Large System Memory

High Bandwidth Bus

Large Shared Storage Disks

Higher Costs to Initialize and Maintain

Requires Hardware Genlock to synchronize the activities / functions together


Multiple Host / Cluster


Fewer Processors

Small Number of Graphic Pipes

Smaller individual System memory

Local Disk Storage

Lower Costs

Networked / Software Genlock methods
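One way to picture a software genlock on a cluster is as a swap barrier: every render node finishes drawing its part of the frame, then waits until all other nodes have finished before any of them displays it. The sketch below is a simplified illustration under stated assumptions. Threads stand in for networked hosts, Python's `threading.Barrier` stands in for the network synchronization message, and the random sleep stands in for rendering time. The function names are hypothetical.

```python
import random
import threading
import time


def run_cluster(num_nodes=3, num_frames=4):
    """Simulate render nodes that swap buffers in lockstep.

    Returns a list of (frame, node_id) tuples in the order the swaps
    happened. With the barrier in place, no node swaps frame N+1 before
    every node has swapped frame N.
    """
    barrier = threading.Barrier(num_nodes)
    swap_log = []
    log_lock = threading.Lock()

    def render_node(node_id):
        rng = random.Random(node_id)
        for frame in range(num_frames):
            time.sleep(rng.uniform(0.001, 0.01))  # stand-in for rendering
            barrier.wait()  # software "genlock": wait for all nodes
            with log_lock:
                swap_log.append((frame, node_id))  # stand-in for buffer swap

    threads = [
        threading.Thread(target=render_node, args=(i,)) for i in range(num_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return swap_log


if __name__ == "__main__":
    for frame, node in run_cluster():
        print("frame", frame, "swapped on node", node)
```

The cost of this approach is that every frame runs at the speed of the slowest node plus the synchronization delay, which is one reason network latency matters so much for cluster systems.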








Graphics Processing Units (GPUs) carry out graphics operations independently of the host CPU.


Courtesy of SGI



The Geometry Engine takes as input geometric primitives such as triangles, lines, and points. It handles transformations and lighting. The Raster Manager owns the framebuffer. It executes a process called scan conversion – the process of computing the pattern of dots (pixels) that most closely matches the object to be displayed. The Display Generator handles digital to analog conversion for CRT monitors.[64]
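Scan conversion can be illustrated with a line segment. The sketch below uses Bresenham's line algorithm, a classic integer-only method for choosing the pixels that most closely follow a line. It is offered as a general illustration of the idea; it is not a description of how the SGI Raster Manager is implemented, and the function name is hypothetical.

```python
def scan_convert_line(x0, y0, x1, y1):
    """Return the list of pixel coordinates approximating a line segment.

    Uses Bresenham's algorithm with integer arithmetic only, and works
    for lines in any direction.
    """
    pixels = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy  # running error between the ideal line and the pixels

    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:  # step horizontally
            error += dy
            x0 += step_x
        if doubled <= dx:  # step vertically
            error += dx
            y0 += step_y
    return pixels


if __name__ == "__main__":
    print(scan_convert_line(0, 0, 5, 2))
```

Triangles are scan converted on the same principle, by deciding for each pixel whether it falls inside the three edges.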







5.8.7      Network Capability

When working on collaborative tasks and assignments, one must move a large amount of information, which imposes specific requirements such as high bandwidth and low latency.  Options available for meeting these requirements include Ethernet, Optical Fiber, and Myrinet.


Ethernet – comes in various products capable of 10 Mb/s, 100 Mb/s, or 1 Gb/s.


Optical Fiber – comes in various products such as Fibre Channel (133, 266, 530, and 1000 Mb/s over distances of up to 10 km), FDDI-II, SCSI (Small Computer Systems Interface), HIPPI (High Performance Parallel Interface), and the Gigabyte System Network (about 6.4 Gb/s).


Myrinet – was developed for clustering workstations and PCs, offering full-duplex Gb/s links, switch ports, and interface ports.  It can also scale to tens of thousands of hosts.
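To get a feel for why bandwidth matters, the arithmetic below estimates how long it takes to move one uncompressed image over the Ethernet speeds listed above. The 1280 x 1024 resolution and 24 bits per pixel are hypothetical example values, and the calculation ignores protocol overhead and latency, so real figures would be worse.

```python
def frame_bits(width, height, bits_per_pixel=24):
    """Size of one uncompressed image in bits."""
    return width * height * bits_per_pixel


def transfer_seconds(bits, link_megabits_per_second):
    """Ideal time to send the given number of bits over a link."""
    return bits / (link_megabits_per_second * 1_000_000)


if __name__ == "__main__":
    bits = frame_bits(1280, 1024)  # assumed example resolution
    for name, speed in (("10 Mb/s", 10), ("100 Mb/s", 100), ("1 Gb/s", 1000)):
        seconds = transfer_seconds(bits, speed)
        print(f"{name}: {seconds:.4f} s per frame, "
              f"at most {1 / seconds:.1f} frames per second")
```

Under these assumptions only the gigabit link approaches interactive rates for raw images, which is why clusters usually exchange compact state updates rather than rendered frames.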


PC Clusters


In recent years, advances in understanding and optimizing network capabilities, together with PC graphics hardware driven by the videogame industry, have allowed PCs to handle complex visual scenes at interactive rates. [65]


The virtual reality (VR) community has been traditionally associated with high-end and high-cost graphics systems, which required highly trained personnel to operate and large physical spaces with special temperature, power, and other requirements. This situation has been partially responsible for making virtual reality technology hard to afford in many disciplines.  However, virtual reality professionals are starting to look at PCs as an alternative to the high-end systems. Powerful graphics in the PC driven by the videogame industry mean that PCs are able to handle complex visual scenes at interactive rates.  The ability to have high-quality interactive graphics in the PC opens new and exciting opportunities for the VR community and many possibilities to bring VR to disciplines and markets that can’t afford current workstation-based technologies.


Challenges for PC Clustering


The main advantage of current VR systems is that a single shared-memory machine can handle a large number of graphics boards and processors, providing a relatively straightforward computing environment for complex applications.  In this environment, graphics and computations access a common memory area where the application data is located.


When moving VR to the realm of PCs, we face a much more complex system from a software perspective. There is no common shared location for the application data. Programmers need to be aware of the cluster configuration and provide ways to distribute the application, keep it synchronized and coherent across the PCs in the cluster, and minimize the performance impact of these actions. Many of these tasks are very difficult for average application programmers who may not have formal computer science training.  Thus, the seamless integration of the PC hardware and software components in a clustered environment for virtual reality presents a challenging infrastructure-engineering problem.

Even more challenging is the problem of creating a general environment to create, support, adapt, and maintain applications. Several groups around the world are currently handling these issues on an ad-hoc basis. But as technology advances and more application areas start using PC clusters for VR, the integration problem will become intractably complex unless we define some base guidelines, rules, or standards on VR frameworks.

Another challenge is the rapid rate of technological change in interactive technologies. Many current applications have a very short “life-span” because of compatibility issues. Minor changes in technology components can cause major, if not complete, rewrites of applications.

Finally, when VR applications are implemented, developers must be aware of the effects of the applications in the context in which they will be used. For example, would a humanities scholar be able to update the applications as new findings in his research are identified?
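The task of distributing application data and keeping it coherent across the cluster can be sketched as follows. Because there is no shared memory, one machine owns the application state and sends a numbered copy to every render node each frame; a node ignores any update that is older than what it already has. This is only a minimal sketch under stated assumptions: the class names, the JSON message format, and the camera-only state are hypothetical, and the messages are passed in-process here rather than over a real network.

```python
import json


class Master:
    """Owns the application state and publishes one update per frame."""

    def __init__(self):
        self.frame = 0
        self.state = {"camera": [0.0, 0.0, 0.0]}

    def advance(self, camera_delta):
        """Move the camera and return the serialized update message."""
        self.frame += 1
        self.state["camera"] = [
            c + d for c, d in zip(self.state["camera"], camera_delta)
        ]
        return json.dumps({"frame": self.frame, "state": self.state})


class RenderNode:
    """Holds a local copy of the state; accepts only newer updates."""

    def __init__(self, name):
        self.name = name
        self.frame = 0
        self.state = None

    def receive(self, message):
        update = json.loads(message)
        if update["frame"] <= self.frame:
            return False  # stale or duplicate update, ignore it
        self.frame = update["frame"]
        self.state = update["state"]
        return True


if __name__ == "__main__":
    master = Master()
    nodes = [RenderNode("left wall"), RenderNode("right wall")]
    message = master.advance([1.0, 0.0, 0.0])
    for node in nodes:
        node.receive(message)
        print(node.name, node.frame, node.state)
```

Even this toy version shows where the difficulty lies: every piece of application data that affects the picture has to be identified, serialized, and sent each frame, which a shared-memory machine provides for free.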

Furthermore, if we consider PC clusters in the area of collaborative virtual environments, the support infrastructure for them is very complex, and software tools to build them are lacking.

5.9        VR Software


There are several commercially available software APIs and toolkits that can help define the system architecture and rapidly develop scenarios within visual environments.  A distinction needs to be made between APIs and toolkits designed specifically for Internet functionality through the World Wide Web and those designed simply for virtual worlds.


Let’s first discuss the types of API available.  OpenGL and Open Inventor are certainly APIs, and there have been several developments within the PC industry, including Microsoft’s DirectX, NVidia products, and 3Dfx’s GLIDE (however, GLIDE is no longer supported by 3Dfx).


Here are a few toolkits, also known as SDKs (Software Development Kits), that have been developed either for specific industries or as general-purpose toolkits available for development in any industry:




VisMockup from UGS (Presentations for Automotive, and Manufacturing Industries)


TeamCenter Visualization from UGS (Collaboration and Design/Manufacturing Industries)


Opus Realizer from Opticore (Automotive, and Manufacturing Industries)


Division Mockup from Parametric Technologies Corporation (Automotive, and Manufacturing Industries)


EON Reality (General purpose; developed for the Internet)


Sense8 from Facit Visual Simulation (General purpose, cross many industries)


VEGA from Multigen-Paradigm of Computer Associates (Training Simulation Industry)


Tucan Series from Awaron (formerly known as REALAX)


Geoprobe by Magic Earth (owned by Halliburton) (Geoscience Industry)


Digital Mockup/VR by Ruecker (Automotive and Manufacturing Industries)


Vizard by WorldViz (General purpose)


Reachin API by Reachin (Medical and Geoscience Industries)


Gaia (Scene development tool) by Dynamic Animation Systems, Inc. (Terrain Simulation Industries)


Facit by Visual Simulation (Retail, Medical, Data Visualization, and Construction Industries)


Maelstrom Virtual Productions Ltd (Education and Training Simulation industries)


Virtalis (VR Consultants)


TerrainView by ViewTec (Terrain Simulation and Medical Industries)


Many Terrain tools and software packages at




Most of the software toolkits do not provide a way to model objects, so you will need to model using commercially available modeling toolkits that build Digital Content Creation (DCC) formats, such as:


Alias AutoStudio – from Alias


Maya – from Alias


3D Studio Max – from Discreet


FormZ – from auto-des-sys


Rhino - Rhinoceros


visConcept – from Unigraphics (mostly developed for large-scale presentations)





CAD modeling packages (these, however, require additional translation to the rendering model formats):




Pro/Engineer Wildfire – from Parametric Technologies Corporation

CATIA – from Dassault Systems

SolidWorks – from Dassault Systems






And many, many more . . .











6        VR Applications / Industry


This section describes how this technology can be applied to current industries, and its benefits.  VR has been used in many applications, such as:


·         Architecture,

·         Manufacturing,

·         Advanced Vehicle to Terrain Simulation,

·         Medical,

·         GeoScience,

·         Chemical and Specific Sciences,

·         Aerospace,

·         Product Evaluation,

·         Training and Education,

·         Entertainment, and

·         Augmented Realities.


Other industries that are looking into using VR technologies are:



Insurance Industry


·         Accident Reconstruction

·         Safety determination for Product defects


Marketing and Advertising Industry


·         Product demonstrations


Judicial and Government Entities


·         Forensic Animation and Simulations

·         Safety Issues


Security Industry


·         Police Training

·         Large-scale Criminal Simulation


Supply Chain Management


·         Network usage

·         Scheduling issues


Environmental Industry


·         Catastrophic planning of seismic disturbance


Weather Predictions and Phenomena


·         Catastrophic planning of Large-scale weather patterns

·         Oceanic disturbances





6.1      Architecture


One of the most obvious applications of Virtual Reality was the familiar architectural walkthrough.

Architecture is closely compatible with what a basic Virtual Reality system does: letting the user explore a 3D scene in real time, showing exactly what the user wants to see, on demand, without it having been computed in advance.  The real-time aspect of such systems proved to be much appreciated by users, as it finally enabled them to show their designs to others in much more detail and realism.

The communication power of these kinds of tools is a major point of interest that will surely keep this application among the most used and successful applications of Virtual Reality.

Image courtesy of The University of North Carolina at Chapel Hill, College of Arts and Sciences

Image courtesy of artist Chris LeBlanc


In the field of architecture, virtual reviews of designs are commonly done with rendered 3D animations. In some cases, interactive review is possible with high-end graphics machines. These renderings provide a centerpiece that is used for discussions amongst architects, builders, owners, and city planners.


6.2      Manufacturing


In manufacturing applications, we see all levels of 3D computer-aided design (CAD) systems, up to the full-blown Virtual Reality system using high-end head-mounted displays and VR gloves.


That is to say, 3D computer graphics has been in use for years in that field, but it wasn't until recently that the conventional desktop CAD system shifted toward more advanced techniques such as virtual prototyping of mechanical devices.

Image courtesy of The University of Michigan Virtual Reality Laboratory (VRL) at the College of Engineering


Image Courtesy of EDS Virtual Reality Center

Image courtesy of PTC Division Reality


6.3      Advanced Vehicle to Terrain Simulation



Image courtesy of Terrex, Terrain Experts, Inc.


There are many terrains that involve specific vehicle maneuverability and simulation.  Here are just a few examples.




Image courtesy of MetaVR

6.4      Medical


Medicine is one of the major fields for VR at the moment, and it was one of the first scientific applications of Virtual Reality. The uses are numerous and well suited to a VR system.

Sensitive surgical interventions can be made much less hazardous when complemented by a VR visualization system. Surgeons can wear head-mounted display helmets that provide them with diverse visual aids. The use of such a helmet can, among other things, let the operator see through opaque tissues of the patient's body and precisely manipulate surgical instruments while avoiding direct contact with critical zones inside the body.

More often than not, medical applications use augmented reality VR systems. Other medical applications are more oriented toward a fully synthetic virtual world and are used for training purposes.

This shows how much medicine can benefit from using various incarnations of Virtual Reality to enhance the efficiency of the treatments given.


Image courtesy of virtualvision Inc.

Image courtesy of Artma Biomedical Inc.






6.5      GeoScience


The ability to understand the relationship between seismic, well and cultural data is crucial to success in the development of oil and gas reserves.  In this window, the seismic data is displayed as a chair cut, instantaneous frequency is draped on an interpreted horizon, the productive limits of the D-Sand are outlined in yellow and the property boundaries are clearly shown in turquoise. The next likely drilling location is south of the well displayed in block 28.


Image courtesy of Geoscience gallery of TeraRecon, Inc.