Desktop VR: Head tracking and asymmetric frustum with OpenCVSharp and Unity

Introduction

This TP requires Windows and Unity 5.

In this TP we will implement head tracking using OpenCVSharp (a C# wrapper for OpenCV), as well as a projection correction that compensates for dynamic changes of the viewpoint relative to the screen. Start by watching this video:

We don’t always notice it, but what is rendered on a screen when watching movies or playing games is always wrong to some degree. Most often, a large field of view (FOV) is compressed into a smaller angular size on our retina, causing rendered objects to seem smaller than their real counterparts (the opposite happens in a movie theater). Below we illustrate this mismatch:

[Figure: mismatches between the viewer’s real viewpoint and the rendering camera]

If we look at the projected images in the center, the rendered object becomes bigger on the screen as the user gets farther from it. This may sound counter-intuitive, but it happens because the screen now falls under a smaller portion of the user’s FOV. In practical terms, the objects are projected onto a smaller portion of the user’s FOV, even though they occupy a bigger portion of the screen. In the image on the right we see a distinct mismatch. In this case the viewpoint of the user is not centered relative to the screen, and the lines connecting the viewpoint with the screen have different lengths. Using a regular camera will render portions of the scene that should be occluded, and omit parts of the scene that should be visible.
To correct for these mismatches we should modify the camera frustum as follows:

  • Dynamically change the FOV so that the virtual camera matches the visual angle the screen subtends at the viewer’s head.
  • Dynamically change the shape of the frustum (deforming the usual pyramidal shape into an asymmetric shape).
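The first correction can be made concrete: the virtual camera’s vertical FOV should equal the visual angle the physical screen subtends at the viewer’s eye. A minimal Python sketch of this relation (an illustration of the math only, not code from this tutorial):

```python
import math

def required_vertical_fov_deg(screen_height_m, eye_distance_m):
    """Vertical FOV (degrees) the virtual camera needs so that the
    rendered image subtends the same visual angle as the physical screen."""
    return math.degrees(2.0 * math.atan(screen_height_m / (2.0 * eye_distance_m)))

# A 0.3 m tall screen viewed from 0.5 m subtends about 33.4 degrees;
# moving back to 1.0 m shrinks that to about 17.1 degrees.
print(required_vertical_fov_deg(0.3, 0.5))
print(required_vertical_fov_deg(0.3, 1.0))
```

Note that this angle shrinks as the viewer moves away, which is exactly why the rendered object must grow on the screen to keep its apparent size correct.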

The limitations of such corrections are:

  • A normal screen can only support one user with a corrected viewport; the image will look incorrect to all other viewers.
  • Prior knowledge of the screen size and of the camera position relative to the center of the screen is required.
  • We need to track the user’s head position.

Material

We will first download and load the packages containing the assets for this tutorial (textures/materials/scripts/scene)

You may find the TP5_HeadTracking.unitypackage at any of the following links:
https://drive.google.com/file/d/0BwA8OpKLTKR2NzZBc0hnVFVQUUU/view?usp=sharing
https://www.dropbox.com/s/dkqe60ta0vfnxzw/TP5_HeadTracking.unitypackage?dl=0

Now open Unity 5, create a new project named VR_TP5 and load the TP package ( Assets -> Import Package -> Custom Package … )


Make sure that all files are marked.

You will also need a calibration file for the features we are going to track in this tutorial. Download the file haarcascade_frontalface_alt.xml from: https://drive.google.com/file/d/0BwA8OpKLTKR2RGJqY3BwTEJWVHM/view?usp=sharing and place it in the project folder (i.e., in the same folder where the Assets, Library … folders are located).

Head Tracking

In this section we will learn how to use OpenCVSharp (OpenCV stands for Open Source Computer Vision; Sharp denotes the C# wrapper of OpenCV) to track the head position of a user.

It is subdivided in the following parts:

  1. Preparation.
  2. Gather webcam feed, render it as a textured plane, and convert to the openCV format cvMat.
  3. Create a Unity camera and fit the textured plane from step 2 on its Field of View (FOV).
  4. Track the face of the user using the Haar Cascade Classifier openCV algorithm.
  5. Transform the coordinate system from step 4 in order to overlay the textured plane with the head position.
  6. Transform the coordinate system from step 4 in order to present the head position in 3D.
  7. Improve the tracking reliability of step 6 with a smoothing algorithm.

1 Preparation

Open the scene HeadTracking (by double-clicking it in the Project tab), and start by creating a new C# file named HeadTracking.cs (Project tab -> Create -> C# Script). Now create a new camera by selecting GameObject -> Camera, set the new camera position to 0 in the Inspector, then rename the camera to unityCamera (it will serve as a virtual representation of the actual webcam).


Finally, drag and drop the newly created HeadTracking.cs file onto the unityCamera game object.

2 Gather webcam feed, render it as a textured plane, and convert to the openCV format cvMat

Open the file HeadTracking.cs and replace its content with the code below:

The class Webcam encapsulates what is required for gathering the webcam feed and rendering it to a plane in the Unity scene. The class OCVCam encapsulates the code for converting a Unity 2D texture into a cvMat, which is the image format used by openCV. You may optionally open the files where these classes are declared in order to find details on how this is done.

Save the HeadTracking.cs file and go back to the editor. Select the unityCamera in the Hierarchy tab. If you look in the Inspector, you will notice that there are many parameters you can set for the HeadTracking.cs component. These are variables from the Webcam and OCVCam classes exposed through our HeadTracking class:

  • Wc (Webcam)
    • Cam Frame Rate: frame rate request to the webcam (-1 for no request)
    • Cam Width: image width request to the webcam (-1 for no request)
    • Cam Height: image height request to the webcam (-1 for no request)
    • Cam FOV: you must set the vertical FOV of the webcam. Most webcams have a 40~45 degrees FOV
    • Flip Up Down Axis: flip image vertically?
    • Flip Left Right Axis: flip image horizontally?
  • Ocv (OCVCam)
    • Reduction Factor: factor to divide the webcam feed resolution before converting to the CvMat format (this will significantly improve the performance of the program, as this conversion is the main bottleneck of OpenCVSharp)
    • Parallel Conversion: use multiple threads to convert the webcam feed to CvMat format (unless it is causing problems, keep it checked as it significantly increases the execution speed)

Ideally you should set Cam Frame Rate to 30, and Cam Width and Cam Height to the smallest supported widescreen resolution (in my case it was 424 x 240). Mind that these parameters are only requests to the camera; if the camera isn’t compatible with the settings or doesn’t allow these requests, its default settings will be used.
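To illustrate why the Reduction Factor helps, the downsampling it performs before the conversion can be sketched as follows (a hypothetical pure-Python stand-in for the real Texture2D-to-CvMat path, not the OCVCam code):

```python
def downsample(pixels, reduction_factor):
    """Keep every Nth row and column of a 2D pixel grid, mimicking the
    Reduction Factor applied before converting the frame to CvMat.
    The conversion cost scales with the pixel count, so a factor of 2
    cuts the work roughly by 4."""
    return [row[::reduction_factor] for row in pixels[::reduction_factor]]

image = [[x + 10 * y for x in range(8)] for y in range(8)]  # fake 8x8 frame
small = downsample(image, 2)                                # 4x4 result
print(len(small), len(small[0]))
```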

In the Game tab, open the Stats display. Now hit the PLAY button. There is a new GameObject in the scene named “imgPlane”, which is rendering the camera feed:

[Screenshot: webcam feed rendered on the imgPlane]

The image should be rendered as a mirror (e.g., if you move to the left, your image in the object should also move to the left). If this is not the case, use the flip parameters to correct the image.

In the Stats display, check the frame rate (FPS) at which this scene is running. If it is below 40 FPS, reduce Cam Width and Cam Height, and increase the Reduction Factor until your frame rate is in the desired range.

3 Create a Unity camera and fit the textured plane from step 2 on its Field of View (FOV)

Currently we are presenting the webcam feed at an incorrect aspect ratio (1:1), and the imgPlane doesn’t span the whole unityCamera FOV. The following modifications to the code address these issues. Include the following variable declarations in the HeadTracking class:

Append the following additional code to the Start function

And add the following function to the end of the HeadTracking class:

Now hit the PLAY button, and the webcam image should have the correct aspect ratio and cover the whole Game tab screen (at least in the vertical axis).


Although not evident at this point, the main reason we want the imgPlane to span the whole FOV of the unityCamera is to ease the conversion from the screen coordinate system to the world coordinate system (seen in steps 5 and 6). Instead of applying these transformations ourselves, we will use the Camera’s built-in capability for that.
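The geometry of the fit itself is simple: a plane at distance d in front of a camera with vertical FOV θ must be 2·d·tan(θ/2) tall (and that height times the aspect ratio wide) to exactly fill the view. A Python sketch of this relation (an illustrative helper, not the tutorial’s C# code):

```python
import math

def plane_size_to_fill_fov(fov_vertical_deg, distance, aspect):
    """Width and height a quad must have, at 'distance' in front of the
    camera, to exactly fill a camera with the given vertical FOV --
    the idea behind fitting imgPlane to the unityCamera frustum."""
    height = 2.0 * distance * math.tan(math.radians(fov_vertical_deg) / 2.0)
    return height * aspect, height  # (width, height)

w, h = plane_size_to_fill_fov(60.0, 1.0, 16.0 / 9.0)
print(w, h)
```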

4 Track the face of the user using the Haar Cascade Classifier openCV algorithm.

We will first create the following variables in the HeadTracking class:

scaleFactor defines the tracking resolution of the distance from the webcam: the closer it is to 1, the higher the resolution. Mind that a low value will significantly affect the performance of the tracking algorithm.
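To see why scaleFactor matters for performance: a Haar cascade scans the image with a pyramid of detection window sizes, each one scaleFactor times the previous, so a factor close to 1 means many more scanning passes. An illustrative sketch of that scale pyramid (not OpenCV’s actual implementation; the size bounds below are made up):

```python
def detection_window_sizes(min_size, max_size, scale_factor):
    """Window sizes a multi-scale detector would scan: each scale
    multiplies the previous by scale_factor, so values near 1 give
    finer distance resolution but many more passes (slower)."""
    sizes, s = [], float(min_size)
    while s <= max_size:
        sizes.append(round(s))
        s *= scale_factor
    return sizes

print(len(detection_window_sizes(24, 240, 1.1)))  # 25 scale passes
print(len(detection_window_sizes(24, 240, 1.4)))  # 7 scale passes
```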

Append the following code to the Start function. It is used to load the face tracking calibration.

Replace the Update function with

Finally, copy the following function to the end of the HeadTracking class. It tracks a face in the cvMat and returns a Vector3 containing the approximate position of the center of the eyes in the x and y coordinates, and the approximate diameter of the face (all in the cvMat coordinate system).

Now hit PLAY. As you move your head in front of the webcam, the Console tab should be printing the tracked position in the cvMat coordinate system.


If at this point your FPS is below 25, you should consider reducing the resolution of the image (e.g. Reduction Factor, Cam Width and Cam Height) as well as increasing the scaleFactor of the Haar classifier.

5 Transform the coordinate system from step 4 in order to overlay the textured plane with the head position

We first implement a function to convert from the cvMat coordinate system to the screen coordinate system in the HeadTracking class. Mind the difference in the y axis between these coordinate systems: the cvMat image has its x=y=0 point at the top left corner, with +y pointing down, while the screen has its x=y=0 at the bottom left corner, with +y pointing up. We use Height - pos.y to compensate for this axis inversion.
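The conversion just described can be sketched as follows (a Python illustration of the mapping; the function name and signature are hypothetical, not the tutorial’s C# code):

```python
def cvmat_to_screen(cv_x, cv_y, cv_w, cv_h, screen_w, screen_h):
    """Map a point from the cvMat coordinate system (origin top-left,
    +y down) to the screen coordinate system (origin bottom-left,
    +y up). The y axis is inverted with height - y."""
    sx = cv_x * screen_w / cv_w
    sy = screen_h - cv_y * screen_h / cv_h
    return sx, sy

# The cvMat origin (top-left) lands at the screen's top-left corner:
print(cvmat_to_screen(0, 0, 320, 240, 1280, 720))    # (0.0, 720.0)
print(cvmat_to_screen(320, 240, 320, 240, 1280, 720))  # (1280.0, 0.0)
```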

Add the following variables to the HeadTracking class

Append the following to the end of the Start function

Replace the Update function with the following

Finally, paste the tracking function below into the HeadTracking class.

This function converts x,y coordinates from the cvMat coordinate system to the world coordinate system. Notice that we use the command camUnity.ScreenToWorldPoint to convert from the screen position to the world position, which is only possible due to the fact that the imgPlane perfectly spans the unityCamera FOV.

The obtained position is used to set the position of the controlledTr object. We also use the head diameter to rescale this object proportionally.
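Because the imgPlane exactly fills the camera FOV, what ScreenToWorldPoint does here reduces to a linear mapping from screen pixels onto the plane at its distance. A minimal Python sketch of that assumed mapping (screen_to_world_on_plane is an illustrative helper, not part of the Unity API):

```python
def screen_to_world_on_plane(sx, sy, screen_w, screen_h,
                             plane_w, plane_h, plane_dist):
    """Map a screen pixel onto a plane that exactly fills the camera
    view: pixel coordinates translate linearly into plane-local
    coordinates, centered on the plane, at the plane's distance."""
    wx = (sx / screen_w - 0.5) * plane_w
    wy = (sy / screen_h - 0.5) * plane_h
    return wx, wy, plane_dist

# The screen center maps to the plane center:
print(screen_to_world_on_plane(640, 360, 1280, 720, 2.0, 1.125, 1.0))
```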

Save the script. Create a new GameObject in the shape of a sphere (Game Object -> 3D Object -> Sphere), and assign it as the controlledTr in the Inspector of the unityCamera object.

Hit PLAY; you should see a flattened sphere overlaying your face in the Game tab.


6 Transform the coordinate system from step 4 in order to present the head position in 3D

As in step 5, we now want to track the position in the z axis too. That is, instead of setting the size of the controlledTr, we want to use the scale information to approximate the face’s distance from the webcam.

We use the difference between the obtained face diameter and the expected face diameter (the knownFaceSize variable) to estimate the z displacement.
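The underlying idea is the pinhole-camera relation: apparent size is inversely proportional to distance. A Python sketch of that relation (a simplified illustration; the tutorial’s exact formula and the focal-length value below are assumptions):

```python
def estimate_distance(measured_diameter_px, known_face_size_m, focal_length_px):
    """Pinhole-camera estimate of the face distance: an object of known
    physical size appears smaller (fewer pixels) the farther it is, so
    distance = real_size * focal_length / apparent_size."""
    return known_face_size_m * focal_length_px / measured_diameter_px

# A 0.16 m wide face imaged at 80 px with a 500 px focal length is 1 m away:
print(estimate_distance(80.0, 0.16, 500.0))  # 1.0
```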

To do so, copy the following function to the HeadTracking class:

In the Update function, comment out the call to TrackHeadOverImgPlane () and uncomment the call to TrackHead3D (). Now hit PLAY.

It is hard to perceive how the controlledTr navigates in 3D space. To make it clearer you can activate the SplitscreenDebugView game object, which will render 3 orthogonal viewpoints of the scene (plus a perspective viewpoint).


7 Improve the tracking reliability of item 6 with a smoothing algorithm.

As you might have noticed, the tracking is very unreliable and the sphere is very shaky. We address this issue using double exponential smoothing (http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc434.htm). Double exponential smoothing can greatly stabilize the tracked position while keeping a good response time. We do not use its prediction capability for the missing-data cases (i.e. frames where no face could be tracked).
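The update equations of double exponential smoothing can be sketched in Python as follows: one exponential term tracks the level, a second tracks the trend, which damps jitter without the lag of plain averaging. This is an illustration of the algorithm from the NIST reference above, not the tutorial’s C# implementation:

```python
def double_exponential_smooth(samples, alpha, beta):
    """Double exponential smoothing: 'level' follows the samples,
    'trend' follows the change in level; alpha and beta in (0, 1)
    trade smoothness against response latency."""
    level, trend = samples[0], samples[1] - samples[0]
    smoothed = [level]
    for x in samples[1:]:
        new_level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        smoothed.append(level)
    return smoothed

noisy = [0.0, 1.1, 0.9, 1.05, 0.95, 1.0]
print(double_exponential_smooth(noisy, alpha=0.5, beta=0.3))
```

In the tracker the same update would be applied per axis of the head position, with alpha controlling the quality/latency tradeoff mentioned below.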

First declare the following variables in the HeadTracking class

Then we add the TrackHead3DSmooth function to the HeadTracking class:

In the Update function, comment out the call to TrackHead3D () and uncomment the call to TrackHead3DSmooth (). Now hit PLAY.

A noticeable improvement in the quality of the tracking should be evident, with added tracking latency as a drawback. You can refine the smoothing alpha parameter to control the quality/latency tradeoff.

Now that our head tracking pipeline is ready, and you have refined the webcam/OpenCV/smoothing parameters, create a prefab of the unityCamera object (drag and drop it from the Hierarchy tab to the Project tab).

Asymmetric Camera Frustum

Now we will use our HeadTracking class in order to define the position of a camera, and deform the camera frustum in order to replicate the effect discussed in the introduction section.
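The math behind such an asymmetric (off-axis) frustum can be sketched as follows: given the head position relative to the screen center, the frustum bounds at the near plane follow from similar triangles between the near plane and the screen plane. This Python helper is illustrative only (it is not the AsymmetricFrustum.cs code, and the screen dimensions below are made up):

```python
def off_axis_frustum(eye, screen_half_w, screen_half_h, near):
    """Frustum bounds (left, right, bottom, top) at the near plane for
    a viewer at 'eye' = (x, y, z) relative to the screen center, with
    z the distance to the screen plane. Moving the head off-center
    skews the frustum asymmetrically."""
    ex, ey, ez = eye
    scale = near / ez  # similar triangles: near plane vs screen plane
    left = (-screen_half_w - ex) * scale
    right = (screen_half_w - ex) * scale
    bottom = (-screen_half_h - ey) * scale
    top = (screen_half_h - ey) * scale
    return left, right, bottom, top

# Head centered: symmetric frustum; head moved right: the frustum skews left.
print(off_axis_frustum((0.0, 0.0, 0.6), 0.24, 0.135, 0.1))
print(off_axis_frustum((0.1, 0.0, 0.6), 0.24, 0.135, 0.1))
```

With the head centered the bounds are symmetric (left = -right, bottom = -top), which recovers the usual pyramidal frustum; any lateral head motion breaks that symmetry, which is exactly the deformation demonstrated in the scenes below.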

We will test it in two different scenes, the FishtankVR and the WindowVR, which demonstrate distinct conceptual scenarios. Fishtank refers to scenes of very limited dimensions, and the fact that your reach is limited by a glass-like barrier (the computer screen). Window to virtual reality refers to scenes where the computer screen behaves as a window to a vast immaterial world of virtual reality.

1 FishtankVR

Open the FishtankVR scene. Create a new empty game object (GameObject -> Create Empty) and rename it to screenPlane. Set the x,y scale of the screenPlane to the size of your notebook screen (in meters), and set the y position to minus half of the y scale:

[Screenshot: screenPlane transform settings in the Inspector]

In the figure I use an approximation of my computer screen size.

In the FishtankVR object, set the Reference Transfer parameter (from the TransformScene.cs script) to screenPlane (drag and drop the object from the hierarchy tab to the Inspector tab).


The FishtankVR object will be scaled to fit the screenPlane size. It will also be translated/rotated to the same position/orientation.

Now create a new camera (GameObject -> Camera), rename it to asymFrustum. Attach the script AsymmetricFrustum.cs to it, and set the parameter Screen Plane with the screenPlane object (drag and drop).


Instantiate your unityCamera prefab (drag and drop it from the Project tab to the Hierarchy tab). Set the position of the new unityCamera to 0, and rotate it by 180 degrees around the y axis. Next disable the Camera component of the unityCamera object, and set the Controlled Tr parameter to the asymFrustum object (drag and drop). Finally, you will have to invert the Flip Left Right Axis option in the HeadTracking.cs component.


Save the scene (Ctrl + S) and hit PLAY. The asymFrustum camera should move according to your head position, and its frustum should be deformed so that its boundaries pass through the screenPlane limits. The dynamically changing frustum allows for motion parallax, which in turn gives relevant cues about the relative depth of the objects in the scene.

 

2 WindowVR

Open the WindowVR scene. Repeat the steps given for 1-FishtankVR; the only difference is that the TransformScene.cs script is located in the windowFrame object (before, it was located in the FishtankVR object).


11 thoughts on “Desktop VR: Head tracking and asymmetric frustum with OpenCVSharp and Unity”

  • Hi, noob here. In the HeadTracking script, at the Start function I’ve added this as you did, but there is an error

    Camera thisCam = this.GetComponent<Camera<();

    on this line, Unity says there is an unexpected symbol ‘)’

    I can’t seem to understand it. Can you help me?

    • Hello Ooi,
      Camera thisCam = this.GetComponent<Camera>();
      By the way, I have just noticed an issue with the < symbols in the tutorial, I will try to fix that. Best

      • Hi,
        Sorry again for the bother. I have another problem, but it is about exporting to an exe.
        After I export it to an exe, it doesn’t track and pops up this error:

        ArgumentNullException: Argument cannot be null.
        Parameter name: cascade

        please help

        • Hello Ooi, your problem might be related to the path to the haarcascade_frontalface_alt.xml file.
          The way it is being set now, this file has to be in the same folder as the .exe file.

  • Hello There

    I am stuck on step 4 and placing…

    // load the calibration for the Haar classifier cascade
    // docs.opencv.org/3.1.0/d7/d8b/tutorial_py_face_detection.html
    cascade = CvHaarClassifierCascade.FromFile("./haarcascade_fr" +
    "ontalface_alt.xml");
    }

    void Update () {
    ocv.UpdateOCVMat ();
    Vector3 cvHeadPos = new Vector3 ();
    if (HaarClassCascade (ref cvHeadPos))
    Debug.Log (cvHeadPos);
    }

    This code. It comes up with a number of problems. Just wondering how to fix this

    Thanks

  • Thanks for this code! I am having trouble on step 7. Everything works correctly up to that point. When I start using TrackHead3DSmooth() instead of TrackHead3D() the tracking stops working entirely. The sphere does not appear in the HeadTracking scene and stays at the origin, although it does change scale when the scene starts. Any idea what is wrong?

      • I think the issue is that because priorPos initializes to (0, 0, 0), the conditional "if ((cvHeadPos - priorPos).magnitude < 0.4f)" never returns true. I added in a check for priorPos.magnitude == 0 and now it does work. Still a bit jittery but better than the unsmoothed version.

  • Hi, I went through the above tutorial step by step and integrated the entire code, on macOS X with Unity 2017.
    After integration, when I hit Play, the errors below appear:

    Error 1:
    DllNotFoundException: opencv_core249
    OpenCvSharp.Utilities.PInvokeHelper.TryPInvoke ()
    Rethrow as OpenCvSharpException: opencv_core249
    *** An exception has occurred because of P/Invoke. ***
    Please check the following:
    (1) OpenCV’s DLL files exist in the same directory as the executable file.
    (2) Visual C++ Redistributable Package has been installed.
    (3) The target platform(x86/x64) of OpenCV’s DLL files and OpenCvSharp is the same as your project’s.
    System.DllNotFoundException: opencv_core249
    OpenCvSharp.Utilities.PInvokeHelper.TryPInvoke ()
    OpenCvSharp.Utilities.PInvokeHelper.DllImportError (System.Exception ex)
    OpenCvSharp.Utilities.PInvokeHelper.TryPInvoke ()
    OpenCvSharp.NativeMethods..cctor ()
    Rethrow as TypeInitializationException: An exception was thrown by the type initializer for OpenCvSharp.NativeMethods
    OpenCvSharp.CvMemStorage..ctor (Int32 blockSize)
    OpenCvSharp.CvMemStorage..ctor ()
    HeadTracking..ctor () (at Assets/Scenes/HeadTracking.cs:32)

    Error 2:
    NullReferenceException: Object reference not set to an instance of an object
    OCVCam.Texture2DToCvMat () (at Assets/Scripts/WebcamOpenCV.cs:157)
    OCVCam.UpdateOCVMat () (at Assets/Scripts/WebcamOpenCV.cs:141)
    HeadTracking.Update () (at Assets/Scenes/HeadTracking.cs:94)

    • Hi atul,
      as mentioned in the start of the tutorial, it only works on Windows.
      That is because the binaries for OpenCVSharp are only available for Windows.
      Alternative solutions would require a different way to use OpenCV, such as this paid package for unity https://www.assetstore.unity3d.com/en/#!/content/21088 ,
      or a separate opencv program that tracks the face and communicates to the Unity application.
      Best,
      Henrique
