# Spatial Reality Display Sample program manual

This is an explanation of the Spatial Reality Display sample program.

## Spatial Reality Display Sample overview

This sample draws a 3D object to Spatial Reality Display using OpenGL.
The OpenGL version used is 3.3 or later.

### Development environment

Use the following environment:

- Visual Studio 2019
- Spatial Reality Display SDK 2.5.0
- Spatial Reality Display Settings Installer 2.5.0
- Native API 2.0 (XR_API_2.4.zip)

### Libraries used

The sample uses the following external libraries:

|Name|Version|Content|URL|
|---|---|---|---|
|GLFW|3.3.9|OpenGL initialization|<https://www.glfw.org/>|
|GLM|0.9.9.8|3D math|<https://github.com/g-truc/glm>|

GL3W is used to load the OpenGL 3.3 core functions.

The GL3W sources are generated by a Python script.

- URL: <https://github.com/skaslev/gl3w>
- hash: 3a33275633ce4be433332dc776e6a5b3bdea6506
- Generated on: 2024/02/16

These libraries are placed in the third_party directory.

### Sources

|Source|Content|
|---|---|
|main.cpp|Sample main|
|GLManager.h/.cpp| OpenGL wrapper |
|StandardMaterialShader.h/.cpp| Surface shader code |
|QuadCopyShader.h/.cpp| Quad copy shader code |
|HomographyCopyShader.h/.cpp| Homography transformation shader code |

### Sample process

The sample runs the following steps.

1. Initialize Spatial Reality Display.
2. Check user input and update the mouse pointer object position.
3. Get head tracking information from Spatial Reality Display.
4. Create the View/Projection matrices for drawing from the head tracking information.
5. Draw the scene as seen from the left eye.
6. Draw the scene as seen from the right eye.
7. Apply the homography transformation to the left-eye and right-eye drawing results.
8. Submit the homography transformation results to Spatial Reality Display.
9. Update the drawing surface.

Steps 2 to 9 are repeated every frame.

## Initialization and termination

### Initializing Spatial Reality Display

Spatial Reality Display is initialized in `SampleApp::OnSRDEnable()`.

First, load the Spatial Reality Display runtime.

```cpp
result = LinkXrLibrary("Spatial Reality Display");
if (result != SonyOzResult::SUCCESS) {
    std::cout << "Runtime is not found." << std::endl;
    return false;
}
```

Enumerate the connected Spatial Reality Display devices and decide which one to use.
This sample uses the first enumerated device.

`SonyOzDeviceInfo` has information about Spatial Reality Display devices.
`target_monitor_rectangle` is used to place the drawing window on the Spatial Reality Display monitor.

```cpp
SonyOzDeviceInfo deviceInfo[3];

uint64_t deviceInfoSize = uint64_t(std::size(deviceInfo));
result = sony::oz::xr_runtime::EnumerateDevices("Spatial Reality Display", deviceInfoSize, deviceInfo);

if (result != SonyOzResult::SUCCESS || deviceInfoSize <= 0) {
    std::cout << "There are no SR Displays." << std::endl;
    return false;
}
```

Create a session with the Spatial Reality Display device and start it.

`CreateSession` returns a `SonyOzSessionHandle` through its last argument.
This handle is used to operate the Spatial Reality Display, beginning with `BeginSession`.

```cpp
result = CreateSession("Spatial Reality Display", &deviceInfo[0], RUNTIME_OPTION_IS_XR_CONTENT, PLATFORM_OPTION_NONE, &m_sessionHandle);

if (result != SonyOzResult::SUCCESS) {
    std::cout << "CreateSession Error." << std::endl;
    return false;
}

result = BeginSession(m_sessionHandle);

if (result != SonyOzResult::SUCCESS) {
    std::cout << "BeginSession Error." << std::endl;
    return false;
}

if (utility::WaitUntilRunningState(m_sessionHandle) == false) {
    std::cout << "WaitUntilRunningState Error." << std::endl;
    return false;
}
```

`GetDisplaySpec` returns the `SonyOzDisplaySpec`.

The specifications include the dimensions of the Spatial Reality Display (`display_size`) and the tilt of the screen (`display_tilt_rad`).

SR-1 and SR-2 have different dimensions. Adjust the size of the 3D scene based on the actual size of the Spatial Reality Display in use.

The dimensions and the screen tilt are used for drawing in 3D space and for the homography transformation.

The dimensions are given in meters.
The sample scene is built in centimeters, so the sample multiplies them by 100.

```cpp
if (GetDisplaySpec(m_sessionHandle, &m_srdSpec) != SonyOzResult::SUCCESS)
{
    std::cout << "GetDisplaySpec Error." << std::endl;
    return false;
}
```
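As a rough sketch of the unit conversion described above, the scene-space width, height, and depth in centimeters can be derived from the panel size and tilt. The names `PanelSpec`, `SceneExtentCm`, and `ToSceneExtentCm` are hypothetical stand-ins rather than SDK types, and the sketch assumes that the tilt is the angle between the panel and the horizontal plane.

```cpp
#include <cmath>

// Hypothetical stand-in for values read from SonyOzDisplaySpec (meters, radians).
struct PanelSpec {
    float widthM;   // panel width in meters
    float heightM;  // panel height measured along the tilted surface, in meters
    float tiltRad;  // assumed: angle between the panel and the horizontal plane
};

// Scene-space extents in centimeters.
struct SceneExtentCm {
    float width;
    float height;  // vertical rise of the tilted panel
    float depth;   // horizontal run of the tilted panel
};

inline SceneExtentCm ToSceneExtentCm(const PanelSpec& spec)
{
    constexpr float kMetersToCm = 100.0f;
    SceneExtentCm out{};
    out.width = spec.widthM * kMetersToCm;
    out.height = spec.heightM * std::sin(spec.tiltRad) * kMetersToCm;
    out.depth = spec.heightM * std::cos(spec.tiltRad) * kMetersToCm;
    return out;
}
```

For a 45 degree tilt the height and depth come out equal, so the choice between sine and cosine only matters for other tilt angles.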

`EnableStereo` starts stereo viewing.

```cpp
EnableStereo(m_sessionHandle, true);
```

### Terminating Spatial Reality Display

`SampleApp::OnSRDDisable()` implements Spatial Reality Display termination processing.

1. End stereo viewing (`EnableStereo`)
2. End the session (`EndSession`)
3. Destroy the session handle (`DestroySession`)

```cpp
EnableStereo(m_sessionHandle, false);
std::cout << "Session disconnected." << std::endl;

// terminating process.
result = EndSession(m_sessionHandle);

if (result != SonyOzResult::SUCCESS) {
    std::cout << "EndSession Error." << std::endl;
}
result = DestroySession(&m_sessionHandle);
if (result != SonyOzResult::SUCCESS) {
    std::cout << "DestroySession Error." << std::endl;
}
```

## Preparing 3D objects

`SampleApp::SetupScene()` prepares the 3D scene to be displayed on the Spatial Reality Display.

The scene consists of the following objects.

- Bottom plane
- Back plane
- Sample 3D object (cube)
- Mouse pointer object

![Sample scene](images/img_001.png)

The 3D scene is created according to the dimensions of the Spatial Reality Display in use.

The scene uses a right-handed coordinate system whose origin (0, 0, 0) is at the center of the front edge of the Spatial Reality Display.

![Scene coordinate](images/img_002.png)

## Mouse pointer object

The position of the 3D object that represents the mouse pointer in the 3D scene is updated in `SampleApp::CheckMouseInput()`. \
The mouse position is obtained in screen coordinates and then converted to 3D scene coordinates so that the object is displayed on the Spatial Reality Display panel surface. \
In addition, to check whether mouse input is detected correctly, the color of the pointer object changes while the left mouse button is pressed.

```cpp
void SampleApp::CheckMouseInput()
{
    // Get mouse position in screen coordinates
    double xpos, ypos;
    glfwGetCursorPos(m_window, &xpos, &ypos);

    double xPercent = xpos / (m_deviceWidth - 1);
    double yPercent = ypos / (m_deviceHeight - 1);

    // Convert to position on the SRD plane in world coordinates
    double worldX = xPercent * m_srdScreenWidthCm - m_srdScreenWidthCm / 2;
    double worldY = m_srdHeightCm - yPercent * m_srdHeightCm;
    double worldZ = yPercent * m_srdDepthCm - m_srdDepthCm;

    m_mousePointerObj->m_transform = glm::translate(glm::mat4(1), glm::vec3(worldX, worldY, worldZ));

    // Change object color when left button is being pressed
    if (glfwGetMouseButton(m_window, GLFW_MOUSE_BUTTON_LEFT) == GLFW_PRESS)
    {
        m_mousePointerObj->m_diffuse = glm::vec4(170.0f / 255.0f, 85.0f / 255.0f, 85.0f / 255.0f, 1.0f);
    }
    else if (glfwGetMouseButton(m_window, GLFW_MOUSE_BUTTON_LEFT) == GLFW_RELEASE)
    {
        m_mousePointerObj->m_diffuse = glm::vec4(170.0f / 255.0f, 170.0f / 255.0f, 170.0f / 255.0f, 1.0f);
    }
}
```

## Keyboard input

Key input is checked every frame in `SampleApp::CheckKeyboardInput()`.

```cpp
void SampleApp::CheckKeyboardInput()
{
    if (glfwGetKey(m_window, GLFW_KEY_ESCAPE) == GLFW_PRESS)
    {
        glfwSetWindowShouldClose(m_window, 1);
    }
}
```

- When the ESC key is pressed, the program closes.

## Head tracking

The View/Projection matrix is created based on the eye information tracked by Spatial Reality Display.

### Creating a View matrix

The View matrix is created from the eye pose information (`SonyOzPosef`) provided by the Spatial Reality Display.

First update the tracking data cache with `UpdateTrackingResultCache`.

Then get the pose of each eye with `GetCachedPose`.

```cpp
UpdateTrackingResultCache(m_sessionHandle);
GetCachedPose(m_sessionHandle, SonyOzPoseId::LEFT_EYE, &left_pose, &leftValid);
GetCachedPose(m_sessionHandle, SonyOzPoseId::RIGHT_EYE, &right_pose, &rightValid);
```

`SonyOzPosef` has a position (`position`) and an orientation (`orientation`).
The X and Z axes of the eye coordinate system are inverted relative to the scene coordinate system, as shown in the image below.

![Camera coordinate](images/img_003.png)

Create a View matrix with `MakeViewMatrix()`.

```cpp
static glm::mat4 MakeViewMatrix(const SonyOzPosef& pose)
{
    const auto inv_xz = glm::mat3(
        -1, 0, 0,
        0, 1, 0,
        0, 0, -1
    );

    auto camRot = glm::quat(pose.orientation.w, pose.orientation.x, pose.orientation.y, pose.orientation.z);
    auto camPos = glm::vec3(pose.position.x, pose.position.y, pose.position.z);
    camPos = (glm::mat4(inv_xz) * glm::translate(glm::mat4(1), camPos) * glm::mat4(inv_xz))[3];
    camRot = glm::quat_cast(inv_xz * glm::mat3_cast(camRot) * inv_xz);

    // Convert units from meters to centimeters.
    camPos = camPos * 100.0f;
    const auto view = glm::inverse(glm::translate(glm::mat4(1), camPos) * glm::mat4_cast(camRot));

    return view;
}
```

### Creating a Projection matrix

The Projection matrix is created from `SonyOzProjection`.
Get `SonyOzProjection` for each eye with `GetProjection`.

```cpp
GetProjection(m_sessionHandle, SonyOzPoseId::LEFT_EYE, &left_eye_projection);
GetProjection(m_sessionHandle, SonyOzPoseId::RIGHT_EYE, &right_eye_projection);
```

Create a Projection matrix with `MakeProjectionMatrix()`.

```cpp
static glm::mat4 MakeProjectionMatrix(const SonyOzProjection& proj, float nearClip, float farClip)
{
    const float left = nearClip * tanf(proj.half_angles_left);
    const float right = nearClip * tanf(proj.half_angles_right);
    const float top = nearClip * tanf(proj.half_angles_top);
    const float bottom = nearClip * tanf(proj.half_angles_bottom);
    return glm::frustumRH(left, right, bottom, top, nearClip, farClip);
}
```

## Rendering

The scene is drawn once as seen from the left eye and once as seen from the right eye.
Each drawing result must then be homography transformed.
Finally, the left and right drawing results are submitted side-by-side to the Spatial Reality Display runtime.

Prepare the drawing destinations for the left and right eyes (`SampleApp::OnSetupSRDFramebuffer`).

```cpp
m_leftTarget = m_glManager.CreateFrameBuffer(m_deviceWidth, m_deviceHeight);
m_rightTarget = m_glManager.CreateFrameBuffer(m_deviceWidth, m_deviceHeight);
m_leftHmTarget = m_glManager.CreateFrameBuffer(m_deviceWidth, m_deviceHeight, false);
m_rightHmTarget = m_glManager.CreateFrameBuffer(m_deviceWidth, m_deviceHeight, false);
m_sideBySideTarget = m_glManager.CreateFrameBuffer(m_deviceWidth * 2, m_deviceHeight, false);
m_compositeTarget = m_glManager.CreateFrameBuffer(m_deviceWidth, m_deviceHeight, false);
```

### Rendering (Left / Right eyes)

```cpp
// Render left view
m_sceneInfo.m_view = viewL;
m_sceneInfo.m_projection = projL;
glBindFramebuffer(GL_FRAMEBUFFER, m_leftTarget->m_fbo);
RenderScene();

// Render right view
m_sceneInfo.m_view = viewR;
m_sceneInfo.m_projection = projR;
glBindFramebuffer(GL_FRAMEBUFFER, m_rightTarget->m_fbo);
RenderScene();
```

### Homography transformation

The actual Spatial Reality Display screen is tilted, so if the drawing results are used as is, they appear distorted.

![Rendered results and actual appearance](images/img_004.png)

To compensate, the sample distorts the drawing result with a homography transformation so that it looks correct when actually viewed on the tilted screen.

![Homography transformation](images/img_005.png)

Create a matrix for the homography transformation with `MakeHomographyMatrix`.
This function creates a homography matrix that maps (0, 0), (0, 1), (1, 0), and (1, 1) to four arbitrary vertices.

```cpp
static glm::mat3 MakeHomographyMatrix(const glm::vec2 viewportPoints[4])
{
    const auto p00 = viewportPoints[0];
    const auto p01 = viewportPoints[1];
    const auto p10 = viewportPoints[2];
    const auto p11 = viewportPoints[3];

    const auto x00 = p00.x;
    const auto y00 = p00.y;
    const auto x01 = p01.x;
    const auto y01 = p01.y;
    const auto x10 = p10.x;
    const auto y10 = p10.y;
    const auto x11 = p11.x;
    const auto y11 = p11.y;

    const auto a = x10 - x11;
    const auto b = x01 - x11;
    const auto c = x00 - x01 - x10 + x11;
    const auto d = y10 - y11;
    const auto e = y01 - y11;
    const auto f = y00 - y01 - y10 + y11;

    const auto h13 = x00;
    const auto h23 = y00;
    const auto h32 = (c * d - a * f) / (b * d - a * e);
    const auto h31 = (c * e - b * f) / (a * e - b * d);
    const auto h11 = x10 - x00 + h31 * x10;
    const auto h12 = x01 - x00 + h32 * x01;
    const auto h21 = y10 - y00 + h31 * y10;
    const auto h22 = y01 - y00 + h32 * y01;

    return glm::mat3(h11, h12, h13, h21, h22, h23, h31, h32, 1.0f);
}
```

Next, calculate where the four vertices of the Spatial Reality Display's display surface appear on the rendered 2D screen.

`WorldToViewport` converts coordinates in the 3D scene to a position on the 2D screen.
In this function, the top left is (0, 0) and the bottom right is (1, 1).

```cpp
static glm::vec2 WorldToViewport(const glm::mat4& vp, const glm::vec3& pos)
{
    glm::vec4 screenPos = vp * glm::vec4(pos, 1);
    screenPos /= screenPos.w;
    return glm::vec2((screenPos.x + 1.0f) * 0.5f, 1.0f - (screenPos.y + 1.0f) * 0.5f);
}
```
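The last line of `WorldToViewport` maps normalized device coordinates, which range from -1 to 1 with Y pointing up, to the top-left-origin viewport range. As a small worked example, the same mapping in isolation looks as follows (`Vec2` and `NdcToViewport` are hypothetical names used only for this sketch):

```cpp
struct Vec2 { float x, y; };

// Map normalized device coordinates (-1..1, Y up) to viewport
// coordinates (0..1, Y down, origin at the top left).
inline Vec2 NdcToViewport(float ndcX, float ndcY)
{
    return {(ndcX + 1.0f) * 0.5f, 1.0f - (ndcY + 1.0f) * 0.5f};
}
```

The NDC corner (-1, 1) becomes (0, 0), the corner (1, -1) becomes (1, 1), and the center (0, 0) becomes (0.5, 0.5).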

The four vertices of the display surface are derived from the actual size of the Spatial Reality Display.

```cpp
glm::vec3 srdDispalyPlanePositions[4];
{
    const float x0 = -m_srdScreenWidthCm * 0.5f; // Left
    const float x1 = +m_srdScreenWidthCm * 0.5f; // Right
    const float y0 = m_srdHeightCm;              // Top
    const float y1 = 0.0f;                       // Bottom
    const float z0 = 0.0f;                       // Front
    const float z1 = -m_srdDepthCm;              // Back
    srdDispalyPlanePositions[0] = glm::vec3(x0, y0, z1); // Left Top
    srdDispalyPlanePositions[1] = glm::vec3(x0, y1, z0); // Left Bottom
    srdDispalyPlanePositions[2] = glm::vec3(x1, y0, z1); // Right Top
    srdDispalyPlanePositions[3] = glm::vec3(x1, y1, z0); // Right Bottom
}

// Left eye
const auto vpLeft = projL * viewL;
const glm::vec2 viewportLeft[] =
{
    WorldToViewport(vpLeft, srdDispalyPlanePositions[0]),
    WorldToViewport(vpLeft, srdDispalyPlanePositions[1]),
    WorldToViewport(vpLeft, srdDispalyPlanePositions[2]),
    WorldToViewport(vpLeft, srdDispalyPlanePositions[3]),
};
```

Calculate the homography matrix and its inverse.

```cpp
glm::mat3 homographyL = MakeHomographyMatrix(viewportLeft);
glm::mat3 inverseHomographyL = MakeInverseHomographyMatrix(homographyL);
```
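`MakeInverseHomographyMatrix` is not listed in this manual. As an illustration only, a 3x3 matrix can be inverted with the adjugate (cofactor) formula. The sketch below is an assumption about one way to do it in plain C++, and `Mat3`, `Invert3x3`, and `Multiply` are hypothetical names. With GLM, calling `glm::inverse` on a `glm::mat3` would serve the same purpose.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;  // m[row][column]

// Invert a 3x3 matrix with the adjugate formula.
// Returns false (and leaves `out` untouched) when the matrix is singular.
inline bool Invert3x3(const Mat3& m, Mat3& out)
{
    // Cofactors of the first row, also used for the determinant.
    const double c00 = m[1][1] * m[2][2] - m[1][2] * m[2][1];
    const double c01 = m[1][2] * m[2][0] - m[1][0] * m[2][2];
    const double c02 = m[1][0] * m[2][1] - m[1][1] * m[2][0];
    const double det = m[0][0] * c00 + m[0][1] * c01 + m[0][2] * c02;
    if (std::fabs(det) < 1e-12) {
        return false;
    }
    const double inv = 1.0 / det;
    Mat3 r{};
    r[0][0] = c00 * inv;
    r[0][1] = (m[0][2] * m[2][1] - m[0][1] * m[2][2]) * inv;
    r[0][2] = (m[0][1] * m[1][2] - m[0][2] * m[1][1]) * inv;
    r[1][0] = c01 * inv;
    r[1][1] = (m[0][0] * m[2][2] - m[0][2] * m[2][0]) * inv;
    r[1][2] = (m[0][2] * m[1][0] - m[0][0] * m[1][2]) * inv;
    r[2][0] = c02 * inv;
    r[2][1] = (m[0][1] * m[2][0] - m[0][0] * m[2][1]) * inv;
    r[2][2] = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) * inv;
    out = r;
    return true;
}

// Plain 3x3 matrix product, used to verify that m * inverse(m) is the identity.
inline Mat3 Multiply(const Mat3& a, const Mat3& b)
{
    Mat3 r{};
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            for (int k = 0; k < 3; ++k)
                r[row][col] += a[row][k] * b[k][col];
    return r;
}
```

A homography is only defined up to scale, so an inverse computed this way does not need to be renormalized before it is used for a perspective divide.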

In the shader for the homography transformation, the vertex shader applies a homography matrix to the quad.
The fragment shader uses the inverse of that matrix to look up pixels in the original image.

This program needs to distort the original image in advance so that the distortion cancels out when the tilted screen is actually viewed.
For this reason, the inverse homography matrix is set to the vertex shader to distort the drawing.

```cpp
CopyHomography(*m_leftHmTarget, m_leftTarget->m_texture, hmLQuad, inverseHomographyL, homographyL);
```

### Display in Spatial Reality Display

Prepare a texture in which the results of homography transformation are placed side-by-side.

```cpp
Quad copyLeftQuad;
copyLeftQuad.m_quad[0].m_position = glm::vec2(-1, 1);
copyLeftQuad.m_quad[1].m_position = glm::vec2(-1, -1);
copyLeftQuad.m_quad[2].m_position = glm::vec2(0, 1);
copyLeftQuad.m_quad[3].m_position = glm::vec2(0, -1);
CopyQuad(*m_sideBySideTarget, m_leftHmTarget->m_texture, copyLeftQuad);

Quad copyRightQuad;
copyRightQuad.m_quad[0].m_position = glm::vec2(0, 1);
copyRightQuad.m_quad[1].m_position = glm::vec2(0, -1);
copyRightQuad.m_quad[2].m_position = glm::vec2(1, 1);
copyRightQuad.m_quad[3].m_position = glm::vec2(1, -1);
CopyQuad(*m_sideBySideTarget, m_rightHmTarget->m_texture, copyRightQuad);
```
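The two quads differ only in their horizontal extent: the left image covers x from -1 to 0 and the right image covers x from 0 to 1 in normalized device coordinates. A minimal sketch of that layout in plain C++ follows. `Vec2`, `QuadCorners`, and `HalfQuad` are hypothetical names, and the corner order (left top, left bottom, right top, right bottom) follows the order used in the snippet above.

```cpp
#include <array>

struct Vec2 { float x, y; };

// Corner order: left top, left bottom, right top, right bottom.
using QuadCorners = std::array<Vec2, 4>;

// NDC quad covering the left or right half of the side-by-side target.
inline QuadCorners HalfQuad(bool rightHalf)
{
    const float x0 = rightHalf ? 0.0f : -1.0f;  // left edge of the half
    const float x1 = x0 + 1.0f;                 // right edge of the half
    return {{ {x0, 1.0f}, {x0, -1.0f}, {x1, 1.0f}, {x1, -1.0f} }};
}
```

Generating all four corners from one pair of edges keeps the two halves consistent and makes it harder to assign the same corner index twice.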

Submit the created side-by-side texture to Spatial Reality Display and get the composite result.

```cpp
SubmitOpengl(m_sessionHandle, m_sideBySideTarget->m_texture, false, m_compositeTarget->m_texture);
```

Draw the composite result to the default OpenGL framebuffer (FBO = 0).

```cpp
CopyQuad(0, m_deviceWidth, m_deviceHeight, m_compositeTarget->m_texture, copyCompositeQuad);
```
