Monday, June 1, 2009

THE HUMAN EYE


Perceiving Depth

In order to understand how current 3D imagery methods work, and why they are limited, it is important to have some basic knowledge of how the human visual system perceives depth. By depth I am referring to how far away an object is perceived to be. There are primary and secondary depth cues, or indicators, that the human visual system uses. I will cover the secondary depth cues first, as these are the ones already commonly used in games and in the movies.


Relative Size

As a very general and often inaccurate rule, larger objects are closer. The more of your field of view an object takes up, the closer it appears. This particularly applies if you have two of the same object. As shown below, one monitor looks closer than the other because it is larger. This example works well because we have two instances of the same object, which your brain assumes are the same size.

The above example causes our brain to make a couple of assumptions: first, that both monitors are the same size; second, that they are standing the right way up, not upside down, and that they are being viewed from above. Now let's break these assumptions by changing the assumed viewing angle.

We now see that the image doesn't look right. Your brain is confused because it can interpret the image in different ways: there is one very large monitor behind a much smaller monitor that is closer to you; or there are still two monitors of the same size, but they are hanging from the ceiling; or the larger one is floating in the air in front of us.
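To put a number on the relative size cue, here's a little C sketch of my own (a simple pinhole-camera model; the focal length and monitor size are made up): the projected height of an object is proportional to its real height divided by its distance, so the nearer of two identical monitors covers more of the view.

    #include <stdio.h>

    /* Apparent (on-screen) height of an object under a pinhole camera:
       the projected size is proportional to real size / distance.      */
    static double apparent_height(double real_height, double distance,
                                  double focal_length)
    {
        return real_height * focal_length / distance;
    }

    int main(void)
    {
        const double monitor_height = 0.4;  /* metres */
        const double focal = 500.0;         /* pixels, arbitrary */

        /* Two identical monitors at different distances: the nearer one
           covers more of the field of view, so the brain reads it as closer. */
        printf("near monitor: %.1f px\n", apparent_height(monitor_height, 1.0, focal));
        printf("far  monitor: %.1f px\n", apparent_height(monitor_height, 3.0, focal));
        return 0;
    }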


Linear Perspective

Here is a perspective drawing. The lines of the road fill more of your field of view near to you, growing narrower and narrower until they converge at a single point in the distance. This point is called a vanishing point and is labeled "vp" on the drawing. You probably drew pictures like these at school.

Your brain assumes that the road, the fence, and the lights are all uniform in size, so it interprets their narrowing towards the center of the picture as an effect of distance. Here is a Quake screenshot where I have highlighted the main vanishing point.
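To show where the vanishing point comes from mathematically, here's a rough C sketch of my own (not Quake's code; the focal length and road width are arbitrary). A perspective projection divides by distance, so the two parallel edges of a road both converge on the same screen point as z grows:

    #include <stdio.h>

    /* Perspective projection of a 3D point onto the screen plane:
       screen coordinates shrink in proportion to 1/z.             */
    static void project(double x, double y, double z,
                        double focal, double *sx, double *sy)
    {
        *sx = focal * x / z;
        *sy = focal * y / z;
    }

    int main(void)
    {
        const double focal = 400.0;
        /* Left and right road edges, 2 m either side of the camera,
           1.5 m below eye level, sampled at increasing distances.   */
        for (double z = 5.0; z <= 160.0; z *= 2.0) {
            double lx, ly, rx, ry;
            project(-2.0, -1.5, z, focal, &lx, &ly);
            project( 2.0, -1.5, z, focal, &rx, &ry);
            printf("z=%6.1f  left=(%7.1f,%6.1f)  right=(%6.1f,%6.1f)\n",
                   z, lx, ly, rx, ry);
        }
        /* Both edges approach (0, 0): the vanishing point. */
        return 0;
    }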

Size and perspective alone can be effective at representing a three-dimensional object, although they are usually used in conjunction with secondary methods such as shading.


Overlaying, Parallax and Speed

When one picture overlays another, the eye assumes the picture doing the overlaying is on top, or in front. Illustrated below is an example where one man appears to be standing behind the other, even though both are drawn the same height so there is no perspective cue.

Overlaying occurs in games too. If a wall were overlaid on top of a person standing in front of it, things would look very strange. Sometimes you see this sort of anomaly when a game messes up its z-buffer or you have your graphics card clocked too high. A common flavour of overlaying is parallax scrolling, which is typically seen in horizontally and vertically scrolling shoot-em-ups and platform games.
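For the curious, here's a simplified sketch of what a z-buffer does (my own illustration, not any real driver's code): each pixel remembers the depth of the nearest thing drawn so far, and a new pixel only gets written if it's closer. Corrupt those stored depths and the wall starts overlaying the person standing in front of it:

    #include <float.h>
    #include <stdio.h>

    #define WIDTH  640
    #define HEIGHT 480

    static float    depth_buffer[WIDTH * HEIGHT];
    static unsigned color_buffer[WIDTH * HEIGHT];

    /* Reset every pixel to "infinitely far away". */
    static void clear_depth(void)
    {
        for (int i = 0; i < WIDTH * HEIGHT; i++)
            depth_buffer[i] = FLT_MAX;
    }

    /* Plot a fragment only if it's nearer than whatever is
       already there; otherwise it stays hidden behind it.   */
    static void plot(int x, int y, float z, unsigned color)
    {
        int i = y * WIDTH + x;
        if (z < depth_buffer[i]) {   /* the depth test */
            depth_buffer[i] = z;
            color_buffer[i] = color;
        }
    }

    int main(void)
    {
        clear_depth();
        plot(100, 100, 10.0f, 0x808080u); /* wall, far away            */
        plot(100, 100,  2.0f, 0xFF0000u); /* person, near: wins the test */
        printf("pixel (100,100) = 0x%06X (the person, not the wall)\n",
               color_buffer[100 * WIDTH + 100]);
        return 0;
    }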

The following animated GIF demonstrates this (it may run slowly on some browsers, but you get the idea). In this example, it's obvious that the pillars are in front of the wall behind them. The brain doesn't have to perceive it this way, but with parallax scrolling it's nearly impossible to persuade your brain to perceive it any other way.

Also notice that the pillars move faster across the field of view than the wall behind them. This is another secondary cue that the human eye uses: as a general rule, if something moves quickly across your field of view, it's closer to you than something moving more slowly.
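Parallax scrolling boils down to a few lines of code. Here's a minimal C sketch (the layer names and factors are made up, and the actual blitting call is hypothetical): each layer scrolls at a fixed fraction of the camera's speed, so the near pillars sweep across the view faster than the distant wall:

    #include <stdio.h>

    /* Each layer scrolls at a fixed fraction of the camera's movement.
       A factor near 1.0 means "close to the player"; near 0.0 means
       "far away": the distant wall barely moves, the pillars race by. */
    struct layer {
        const char *name;
        double      parallax;  /* fraction of camera speed */
    };

    static const struct layer layers[] = {
        { "sky",     0.1 },
        { "wall",    0.4 },
        { "pillars", 1.0 },
    };

    int main(void)
    {
        double camera_x = 512.0;  /* how far the player has scrolled */

        for (int i = 0; i < 3; i++) {
            double offset = camera_x * layers[i].parallax;
            /* In a real game you'd blit the layer here, e.g.
               draw_layer(layers[i].name, offset);  (hypothetical call) */
            printf("%-8s scrolled by %6.1f px\n", layers[i].name, offset);
        }
        return 0;
    }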


Camera Focus And Depth Of Field

The human eye, like a low-cost camera, can only focus on one area at a time. If you are focused on one point, then objects closer to you or further away will be out of focus, or blurry. To see this for yourself, close one eye and hold your right thumb in front of you at arm's length. Next, hold your left thumb in front of you, close to your nose. Now if you focus on your right thumb, the left thumb goes blurry. If you focus on your left thumb, your right thumb goes blurry. See it? I asked you to close one eye because if you don't, the other thumb will not only go blurry, you will also see double (which relates to a topic I will discuss later).

The human eye is an active system: even when you think you are looking at one thing, your eye is bouncing around analyzing the surrounding environment. By seeing which objects blur, and by how much, when you focus on another object, your eye gets a secondary cue of depth. By deliberately throwing part of the image out of focus, we can try to convince the viewer that it is further away than, or closer than, the part we hope they are concentrating on, as these examples show.

This technique is often used in films and photography. It has very limited application to computer games because it restricts the viewer to focusing on one particular part of the image. Imagine playing Quake when all of a sudden your PC decides that the guy immediately in front of you with a machine gun should be your focal point, and everything else, including the guy with the railgun behind him, goes blurry. With an interactive environment you really can't force the user to focus on one particular spot. Depth of field remains a useful tool in films and photography where you want to draw attention to a particular part of the frame, and we may see it in real-time generated cut-scenes in future games.
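If you're wondering how a renderer would decide how much to blur each pixel, the usual approach is the thin-lens "circle of confusion" approximation. Here's a hedged C sketch (the standard thin-lens formula, with made-up lens numbers): blur is zero at the focal distance and grows the further a point is from it:

    #include <stdio.h>
    #include <math.h>

    /* Thin-lens circle of confusion: the diameter of the blur disc a
       point at object_dist makes when the lens is focused at focus_dist.
       Units: metres in, metres out.                                      */
    static double circle_of_confusion(double object_dist, double focus_dist,
                                      double focal_len, double aperture)
    {
        return aperture * focal_len * fabs(object_dist - focus_dist)
               / (object_dist * (focus_dist - focal_len));
    }

    int main(void)
    {
        const double focal_len = 0.050;  /* 50 mm lens       */
        const double aperture  = 0.025;  /* f/2, 25 mm pupil */
        const double focus_at  = 2.0;    /* focused 2 m away */

        /* Points at the focal distance are sharp; blur grows either side. */
        for (double d = 0.5; d <= 8.0; d *= 2.0)
            printf("object at %4.1f m -> blur %.2f mm\n",
                   d, 1000.0 * circle_of_confusion(d, focus_at,
                                                   focal_len, aperture));
        return 0;
    }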


Lighting And Shadows

By correctly rendering the lighting and shadows in an environment (light maps, etc.), a scene can look more three-dimensional. Here is a picture of a cube and a sphere, before and after shading and shadowing are applied. Here I have assumed that the light is being cast from the upper left-hand corner of the screen.

By grading the shade of the polygons in a linear way, we can create an illusion of distance, not necessarily from the viewer, but from the light source (unless the viewer is at the light source).
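The simplest version of this kind of shading is the standard Lambertian diffuse term (this sketch is my own, not any particular engine's lighting code): a face's brightness is the cosine of the angle between its normal and the direction to the light, so faces turned away from the upper left go dark:

    #include <stdio.h>
    #include <math.h>

    struct vec3 { double x, y, z; };

    static double dot(struct vec3 a, struct vec3 b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    static struct vec3 normalize(struct vec3 v)
    {
        double len = sqrt(dot(v, v));
        struct vec3 r = { v.x / len, v.y / len, v.z / len };
        return r;
    }

    /* Lambertian diffuse term: surfaces facing the light are bright,
       surfaces edge-on or facing away fall to zero.                  */
    static double lambert(struct vec3 normal, struct vec3 to_light)
    {
        double d = dot(normalize(normal), normalize(to_light));
        return d > 0.0 ? d : 0.0;
    }

    int main(void)
    {
        /* Light coming from the upper left, as in the cube example. */
        struct vec3 light = { -1.0, 1.0, 0.5 };

        struct vec3 top   = {  0.0, 1.0, 0.0 }; /* faces up        */
        struct vec3 left  = { -1.0, 0.0, 0.0 }; /* faces the light */
        struct vec3 right = {  1.0, 0.0, 0.0 }; /* faces away      */

        printf("top   face brightness: %.2f\n", lambert(top,   light));
        printf("left  face brightness: %.2f\n", lambert(left,  light));
        printf("right face brightness: %.2f\n", lambert(right, light));
        return 0;
    }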


Haze, Fog and Atmospheric Distortion

Haze refers to atmospheric distortion. The atmosphere we breathe isn't totally clean; there are little bits of dust and the like floating around in it. If a mountain is really 10 miles away, then there will be some atmospheric distortion between you and it: the mountain will look slightly less detailed, with slightly less color definition. Haze is essential for giving a computer-generated (Bryce, Vista, etc.) landscape that extra touch that makes it look realistic and believable. This Falcon 4 screenshot shows haze being used on the horizon.

A variation of haze is fog, which is the same as haze, only closer and more opaque. Some games simulate darkness by using black fog. With fogging, closer items are more visible, and objects further away are hidden in the fog. Racing games like Motorhead (shown below) often use fogging.

Fogging is also a convenient way of introducing objects into the immediate vicinity of the player within a game. Without fogging, in racing games like Motorhead, the surrounding world would either have to be rendered all the way to the horizon, which would be very CPU-intensive, or objects would simply pop into existence when you got close to them.
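The blending arithmetic behind classic linear fog is straightforward. Here's a C sketch of my own (the distances and colors are made up): each pixel's color is blended toward the fog color by a factor that runs from 0 at the fog's start distance to 1 at the far limit, beyond which objects are fully hidden and needn't be drawn at all:

    #include <stdio.h>

    /* Linear fog factor: 0.0 = fully visible, 1.0 = fully fogged.
       Objects beyond fog_end are completely hidden, which is what
       lets the engine skip drawing them entirely.                  */
    static double fog_factor(double dist, double fog_start, double fog_end)
    {
        if (dist <= fog_start) return 0.0;
        if (dist >= fog_end)   return 1.0;
        return (dist - fog_start) / (fog_end - fog_start);
    }

    /* Blend one color channel toward the fog color. */
    static double fog_blend(double object, double fog, double factor)
    {
        return object * (1.0 - factor) + fog * factor;
    }

    int main(void)
    {
        const double fog_start = 50.0, fog_end = 200.0; /* metres, made up */
        const double object_red = 1.0, fog_red = 0.5;   /* grey fog        */

        for (double d = 25.0; d <= 250.0; d += 75.0) {
            double f = fog_factor(d, fog_start, fog_end);
            printf("distance %5.1f m: fog %.2f, red channel %.2f\n",
                   d, f, fog_blend(object_red, fog_red, f));
        }
        return 0;
    }

Using black as the fog color gives the darkness effect mentioned above; a grey or blue tint, applied gently from far away, gives haze.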
