**_Environment Mapping in Vulkan_**

[William Gunawan](https://www.williscool.com)

Written on 2024/06/25

# Introduction

Environment mapping is a popular, cheap, and fast solution for "Local Illumination", as described in the book Real-Time Rendering. Environment maps help add a constant environment shade to all surfaces through what is effectively pre-computed global illumination. The primary use case is as a constant background of the scene, representing objects/landscapes at an infinite distance. While that isn't a realistic representation, it is a practical one that works well enough in the context of a game scene.

The process of implementing environment maps can be fairly lengthy, as there are multiple steps needed to produce the final results seen in the image below. All code is publicly available on my [GitHub](https://github.com/Williscool13/WillEngine/tree/a285c02427cadb83146abc1a8092e8bf7f249059).

![Smoothness and Metallic](emapping/smoothMetallicBalls.png)

# Equirectangular Image

Rather than using 6 individual textures to form a cubemap, you will most likely be using an equirectangular texture instead. These are fairly common and can be found on websites such as [Polyhaven](https://polyhaven.com/). A higher resolution results in a sharper environment image, but these images come in HDR format, which requires 32 bits for each channel. This is significantly larger than typical texture formats, which only require 32 bits for an entire pixel (i.e. 8 bits per channel). The quality you choose will depend on your game/goals, but consider using a slightly lower resolution, as the results are typically acceptable. The resolution I chose is 4k (4096x2048x3).

## Load the image with STBI into a VulkanImage

STBI comes with a load function specifically for the HDR format (stbi_loadf).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
AllocatedImage _equiImage;
int width, height, channels;
// Request 4 channels so the data matches VK_FORMAT_R32G32B32A32_SFLOAT
float* data = stbi_loadf(path, &width, &height, &channels, 4);
if (data) {
    fmt::print("Loaded Equirectangular Image \"{}\"({}x{}x{})\n", path, width, height, channels);
    _equiImage = _creator->_resourceConstructor->create_image(
        data,
        width * height * 4 * sizeof(float),
        VkExtent3D{ (uint32_t)width, (uint32_t)height, 1 },
        VK_FORMAT_R32G32B32A32_SFLOAT,
        VK_IMAGE_USAGE_SAMPLED_BIT,
        true);
    stbi_image_free(data);
}
else {
    fmt::print("Failed to load Equirectangular Image ({})\n", path);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Convert the image into a cubemap

This step is more involved and requires the creation of a Vulkan cubemap image, which I will not elaborate on in this write-up. If you need a reference, feel free to take a look at the [Project on GitHub](https://github.com/Williscool13/WillEngine/tree/a285c02427cadb83146abc1a8092e8bf7f249059).

I decided that compute shaders are more than appropriate for this task, and for all tasks that process the equirectangular image/cubemap. The **`SampleSphericalMap`** function is taken directly from [LearnOpenGL](https://learnopengl.com/), which has a myriad of resources related to rendering, though as the name implies, it is primarily focused on implementing rendering features in OpenGL.

An important detail is that because we want to represent our environment map in HDR format, we must specify the qualifier rgba32f in the imageCube declaration. This must match the **`VK_FORMAT_R32G32B32A32_SFLOAT`** format specified during creation of the Vulkan cubemap image.
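On the host side, recording the dispatch for the shader below looks roughly like this. This is only a sketch: `equiToCubePipeline`, `equiToCubeLayout`, and the two descriptor sets are hypothetical stand-ins for your own abstractions, and the group counts assume the shader's 16x16 local size with one z-slice per cubemap face.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Hypothetical handles: set 0 samples the equirectangular image,
// set 1 is the cubemap bound as a storage image.
VkDescriptorSet sets[2] = { equiImageSet, cubemapSet };
vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, equiToCubePipeline);
vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_COMPUTE,
                        equiToCubeLayout, 0, 2, sets, 0, nullptr);
VkBool32 invertY = VK_TRUE;
vkCmdPushConstants(cmd, equiToCubeLayout, VK_SHADER_STAGE_COMPUTE_BIT,
                   0, sizeof(VkBool32), &invertY);
// local_size is 16x16; z indexes the 6 cubemap faces
uint32_t groups = (cubemapResolution + 15) / 16;
vkCmdDispatch(cmd, groups, groups, 6);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~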
### Shader Code

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 450

layout (local_size_x = 16, local_size_y = 16) in;

layout(set = 0, binding = 0) uniform sampler2D sourceImage;
layout(set = 1, binding = 0, rgba32f) writeonly uniform imageCube cubemap;

layout(push_constant) uniform PushConstants {
    bool invertY;
} pushConstants;

vec2 SampleSphericalMap(vec3 v)
{
    vec2 invAtan = vec2(0.1591, 0.3183); // 1 / (2 * pi), 1 / pi
    vec2 uv = vec2(atan(v.z, v.x), asin(v.y));
    uv *= invAtan;
    uv += 0.5;
    return uv;
}

void main()
{
    ivec3 storePos = ivec3(gl_GlobalInvocationID.xy, gl_GlobalInvocationID.z);
    vec2 uv = (vec2(storePos.xy) + 0.5) / vec2(imageSize(cubemap).xy); // Normalized coordinates
    uv = uv * 2.0 - 1.0; // Map to range [-1, 1]

    vec3 direction;
    switch (storePos.z) {
        case 0: direction = normalize(vec3(1.0, -uv.y, -uv.x)); break;  // +X
        case 1: direction = normalize(vec3(-1.0, -uv.y, uv.x)); break;  // -X
        case 2: direction = normalize(vec3(uv.x, 1.0, uv.y)); break;    // +Y
        case 3: direction = normalize(vec3(uv.x, -1.0, -uv.y)); break;  // -Y
        case 4: direction = normalize(vec3(uv.x, -uv.y, 1.0)); break;   // +Z
        case 5: direction = normalize(vec3(-uv.x, -uv.y, -1.0)); break; // -Z
        default: direction = vec3(0.0); break; // Should not happen
    }

    vec2 sphericalUV = SampleSphericalMap(direction);
    sphericalUV = clamp(sphericalUV, vec2(0.0), vec2(1.0));

    vec4 color;
    if (pushConstants.invertY) {
        color = texture(sourceImage, vec2(sphericalUV.x, 1.0 - sphericalUV.y));
    } else {
        color = texture(sourceImage, sphericalUV);
    }

    imageStore(cubemap, storePos, color);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Use the cubemap as environment

This generated cubemap can be sampled directly to produce the environment map/background. The code is straightforward and draws a fullscreen triangle (three vertices that cover the screen) to minimize resource use. Note that the z-component of my vertices is 0.001, as my renderer uses a reverse depth buffer. If you use a standard depth buffer, you must change this to 0.999.

### Shader Code

**`environment.vert`**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 450

vec3 positions[3] = vec3[](
    vec3(-1.0, -1.0, 0.001),
    vec3( 3.0, -1.0, 0.001),
    vec3(-1.0,  3.0, 0.001)
);

layout(set = 0, binding = 0) uniform GlobalUniform {
    mat4 view;
    mat4 proj;
    mat4 viewproj;
} sceneData;

layout(location = 0) out vec3 fragPosition;

void main()
{
    vec3 pos = positions[gl_VertexIndex];
    gl_Position = vec4(pos, 1.0);
    fragPosition = (inverse(sceneData.viewproj) * vec4(pos, 1.0)).xyz;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**`environment.frag`**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 450

layout(location = 0) in vec3 fragPosition;
layout(location = 0) out vec4 outColor;

layout(set = 1, binding = 0) uniform samplerCube environmentMap;

void main()
{
    // sample environment map
    vec3 direction = normalize(fragPosition);
    direction.y = -direction.y;
    vec3 envColor = texture(environmentMap, direction).rgb;

    // Reinhard tone map, then gamma correct (HDR -> sRGB)
    envColor = envColor / (envColor + vec3(1.0));
    envColor = pow(envColor, vec3(1.0 / 2.2));

    outColor = vec4(envColor, 1.0);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With the environment cubemap, we can now generate the 2 PBR maps necessary to produce the surface shade: the Diffuse Irradiance map and the Specular Prefiltered maps. The cubemap's data describes the integral of irradiance received by a PBR surface, and it is much simpler to compute this integral by splitting it into 2 parts: Diffuse and Specular.
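Concretely, in LearnOpenGL's notation (transcribed here as a sketch, not a derivation), the environment term of the reflectance equation splits as:

$$
L_o(v) = \int_{\Omega} f_r(l, v)\, L_i(l)\,(n \cdot l)\, dl \;\approx\; k_d\,\frac{c}{\pi}\int_{\Omega} L_i(l)\,(n \cdot l)\, dl \;+\; \int_{\Omega} f_s(l, v)\, L_i(l)\,(n \cdot l)\, dl
$$

The first integral is precomputed into the Diffuse Irradiance map, and the second, via Karis's split-sum approximation, into the Specular Prefiltered cubemap combined with a BRDF LUT.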
# Specular Image-Based Lighting

Let's first talk about the specular prefiltered maps, typically referred to as specular cubemaps. To represent the specular incoming light, we can simply use the cubemap directly. However, how the cubemap appears is altered drastically by the roughness of the surface. As the surface becomes rougher, the reflection appears more blurred, in a way that looks Gaussian at a glance (but is not).

A crucially important detail, one that significantly reduces the memory needed for these cubemaps, is that as the roughness of a surface increases, the cubemap that needs to be sampled exhibits fewer high-frequency details. This means the data can be represented fairly accurately at lower resolutions and still appear convincing. We can take advantage of the mipmap structure, which very conveniently supports this pattern.

The exact underlying mathematics of how the cubemaps are filtered is beyond the scope of this write-up, but is elaborated in much more detail in LearnOpenGL's article on [Specular IBL](https://learnopengl.com/PBR/IBL/Specular-IBL). It isn't necessary to know the mathematics to implement this prefiltering.

## Shader Code

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 450

layout (local_size_x = 8, local_size_y = 8) in;

layout(set = 0, binding = 0) uniform samplerCube sourceMap;
layout(set = 1, binding = 0, rgba32f) writeonly uniform imageCube specularPrefilteredMap;

const float PI = 3.14159265359;

layout(push_constant) uniform PushConstants {
    float roughness;
    uint width;
    uint height;
    uint sampleCount;
} pushConstants;

/* Non bit-shift version
float VanDerCorput(uint n, uint base)
{
    float invBase = 1.0 / float(base);
    float denom = 1.0;
    float result = 0.0;

    for (uint i = 0u; i < 32u; ++i) {
        if (n > 0u) {
            denom = mod(float(n), 2.0);
            result += denom * invBase;
            invBase = invBase / 2.0;
            n = uint(float(n) / 2.0);
        }
    }

    return result;
}
*/

float RadicalInverse_VdC(uint bits)
{
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return float(bits) * 2.3283064365386963e-10; // / 0x100000000
}
// ----------------------------------------------------------------------------
vec2 Hammersley(uint i, uint N)
{
    return vec2(float(i)/float(N), RadicalInverse_VdC(i));
}

vec3 ImportanceSampleGGX(vec2 Xi, vec3 N, float roughness)
{
    float a = roughness*roughness;

    float phi = 2.0 * PI * Xi.x;
    float cosTheta = sqrt((1.0 - Xi.y) / (1.0 + (a*a - 1.0) * Xi.y));
    float sinTheta = sqrt(1.0 - cosTheta*cosTheta);

    // from spherical coordinates to cartesian coordinates
    vec3 H;
    H.x = cos(phi) * sinTheta;
    H.y = sin(phi) * sinTheta;
    H.z = cosTheta;

    // from tangent-space vector to world-space sample vector
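    // Build an orthonormal basis around N; 'up' falls back to the X axis when
    // N is nearly parallel to +Z, avoiding a degenerate cross product.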
    vec3 up = abs(N.z) < 0.999 ? vec3(0.0, 0.0, 1.0) : vec3(1.0, 0.0, 0.0);
    vec3 tangent = normalize(cross(up, N));
    vec3 bitangent = cross(N, tangent);

    vec3 sampleVec = tangent * H.x + bitangent * H.y + N * H.z;
    return normalize(sampleVec);
}

void main()
{
    ivec3 storePos = ivec3(gl_GlobalInvocationID.xy, gl_GlobalInvocationID.z);
    if (storePos.x >= pushConstants.width || storePos.y >= pushConstants.height) {
        return;
    }

    vec2 uv = (vec2(storePos.xy) + 0.5) / vec2(imageSize(specularPrefilteredMap).xy); // Normalized coordinates
    uv = uv * 2.0 - 1.0; // Map to range [-1, 1]

    vec3 direction;
    switch (storePos.z) {
        case 0: direction = normalize(vec3(1.0, -uv.y, -uv.x)); break;  // +X
        case 1: direction = normalize(vec3(-1.0, -uv.y, uv.x)); break;  // -X
        case 2: direction = normalize(vec3(uv.x, 1.0, uv.y)); break;    // +Y
        case 3: direction = normalize(vec3(uv.x, -1.0, -uv.y)); break;  // -Y
        case 4: direction = normalize(vec3(uv.x, -uv.y, 1.0)); break;   // +Z
        case 5: direction = normalize(vec3(-uv.x, -uv.y, -1.0)); break; // -Z
        default: direction = vec3(0.0); break; // Should not happen
    }

    vec3 N = direction;
    vec3 R = N;
    vec3 V = R;

    float roughness = pushConstants.roughness;

    // for now, hardcode 1024 samples
    const uint SAMPLE_COUNT = 1024u;
    float totalWeight = 0.0;
    vec3 prefilteredColor = vec3(0.0);
    for (uint i = 0u; i < SAMPLE_COUNT; ++i) {
        vec2 Xi = Hammersley(i, SAMPLE_COUNT);
        vec3 H = ImportanceSampleGGX(Xi, N, roughness);
        vec3 L = normalize(2.0 * dot(V, H) * H - V);

        float NdotL = max(dot(N, L), 0.0);
        if (NdotL > 0.0) {
            prefilteredColor += texture(sourceMap, L).rgb * NdotL;
            totalWeight += NdotL;
        }
    }
    prefilteredColor = prefilteredColor / totalWeight;

    vec4 color = vec4(prefilteredColor, 1.0);
    imageStore(specularPrefilteredMap, storePos, color);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The results of this prefiltering operation can be seen in the images below: at higher roughness levels it produces a blurred reflection, while on a perfectly metallic and smooth ball you will see a clear, direct reflection of the environment map. Note that for values between the roughness levels in the mip chain, we use trilinear interpolation to obtain the specular cubemap at the "correct" roughness.

![Roughness: 0](emapping/specZero.png)![Roughness ~1.5](emapping/SpecOneFive.png)![Roughness 4](emapping/specFour.png)

## Pre-Computing the BRDF

To complete the specular term, we also need to pre-compute the BRDF's response as a function of the angle between the normal and the outgoing light direction, and of the roughness. Frankly, I do not know the underlying mathematics of the BRDF and the rendering equation in depth, so I will again refer you to the LearnOpenGL article on [Specular IBL](https://learnopengl.com/PBR/IBL/Specular-IBL). In short, to save a significant amount of time computing this response, we precompute the response values in a 2D Look-Up Table (LUT), indexed by the angle (n · v) on one axis and the roughness on the other; the R and G channels store the scale and bias applied to the surface's F0. [Brian Karis](https://cdn2.unrealengine.com/Resources/files/2013SiggraphPresentationsNotes-26915738.pdf) says that accuracy is important here and the data should be stored in an R16G16 format instead of the typical R8G8. My BRDF LUT is 512x512.
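Since the LUT depends only on the BRDF, it is generated once, with a single dispatch. A rough sketch of the host side (`brdfLutPipeline`, `brdfLutLayout`, and `brdfLutSet` are hypothetical names for your own handles), with group counts derived from the shader's 8x8 local size:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// 512x512 LUT with an 8x8 local size -> 64x64 workgroups
vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, brdfLutPipeline);
vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_COMPUTE,
                        brdfLutLayout, 0, 1, &brdfLutSet, 0, nullptr);
vkCmdDispatch(cmd, 512 / 8, 512 / 8, 1);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~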
![BRDF LUT](emapping/testlut.png)

### Shader Code

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 450

layout (local_size_x = 8, local_size_y = 8) in;

layout(set = 0, binding = 0, rg32f) writeonly uniform image2D lut;

const float PI = 3.14159265359;

float RadicalInverse_VdC(uint bits)
{
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return float(bits) * 2.3283064365386963e-10; // / 0x100000000
}
// ----------------------------------------------------------------------------
vec2 Hammersley(uint i, uint N)
{
    return vec2(float(i)/float(N), RadicalInverse_VdC(i));
}

vec3 ImportanceSampleGGX(vec2 Xi, vec3 N, float roughness)
{
    float a = roughness*roughness; // Disney reparameterization: alpha = roughness^2

    float phi = 2.0 * PI * Xi.x;
    float cosTheta = sqrt((1.0 - Xi.y) / (1.0 + (a*a - 1.0) * Xi.y));
    float sinTheta = sqrt(1.0 - cosTheta*cosTheta);

    // from spherical coordinates to cartesian coordinates
    vec3 H;
    H.x = cos(phi) * sinTheta;
    H.y = sin(phi) * sinTheta;
    H.z = cosTheta;

    // from tangent-space vector to world-space sample vector
    vec3 up = abs(N.z) < 0.999 ? vec3(0.0, 0.0, 1.0) : vec3(1.0, 0.0, 0.0);
    vec3 tangent = normalize(cross(up, N));
    vec3 bitangent = cross(N, tangent);

    vec3 sampleVec = tangent * H.x + bitangent * H.y + N * H.z;
    return normalize(sampleVec);
}

float GeometrySchlickGGX(float NdotV, float roughness)
{
    // note: k = a^2 / 2 here, the IBL variant (direct lighting uses (r+1)^2 / 8)
    float a = roughness;
    float k = (a * a) / 2.0;

    float nom = NdotV;
    float denom = NdotV * (1.0 - k) + k;

    return nom / denom;
}
// ----------------------------------------------------------------------------
float GeometrySmith(vec3 N, vec3 V, vec3 L, float roughness)
{
    float NdotV = max(dot(N, V), 0.0);
    float NdotL = max(dot(N, L), 0.0);
    float ggx2 = GeometrySchlickGGX(NdotV, roughness);
    float ggx1 = GeometrySchlickGGX(NdotL, roughness);

    return ggx1 * ggx2;
}

vec2 IntegrateBRDF(float NdotV, float roughness)
{
    vec3 V;
    V.x = sqrt(1.0 - NdotV*NdotV);
    V.y = 0.0;
    V.z = NdotV;

    float A = 0.0;
    float B = 0.0;

    vec3 N = vec3(0.0, 0.0, 1.0);

    const uint SAMPLE_COUNT = 1024u;
    for (uint i = 0u; i < SAMPLE_COUNT; ++i) {
        vec2 Xi = Hammersley(i, SAMPLE_COUNT);
        vec3 H = ImportanceSampleGGX(Xi, N, roughness);
        vec3 L = normalize(2.0 * dot(V, H) * H - V);

        float NdotL = max(L.z, 0.0);
        float NdotH = max(H.z, 0.0);
        float VdotH = max(dot(V, H), 0.0);

        if (NdotL > 0.0) {
            float G = GeometrySmith(N, V, L, roughness);
            float G_Vis = (G * VdotH) / (NdotH * NdotV);
            float Fc = pow(1.0 - VdotH, 5.0);

            A += (1.0 - Fc) * G_Vis;
            B += Fc * G_Vis;
        }
    }
    A /= float(SAMPLE_COUNT);
    B /= float(SAMPLE_COUNT);
    return vec2(A, B);
}
// ----------------------------------------------------------------------------
void main()
{
    ivec2 id = ivec2(gl_GlobalInvocationID.xy);
    ivec2 im = imageSize(lut).xy;
    if (id.x >= im.x || id.y >= im.y) return;

    vec2 TexCoords = (vec2(id) + vec2(0.5f)) / vec2(im);
    vec2 integratedBRDF = IntegrateBRDF(TexCoords.x, TexCoords.y);
    imageStore(lut, id, vec4(integratedBRDF, 0.0, 0.0));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Diffuse Irradiance Image-Based Lighting

Diffuse Irradiance Image-Based Lighting, commonly referred to as an Irradiance Map, is the diffuse component of the incoming light integral from the environment map. It is a single cubemap that describes the total incoming diffuse light from all directions in the initial cubemap. It resembles a blurred image, not unlike the specular prefiltered map, but it is not the same and has a different sampling technique. It is simpler in both generation and use, as you will see later in the PBR code.
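For context on the convolution in the shader below (again following LearnOpenGL's derivation; my transcription), the irradiance for a surface normal $N$ is the cosine-weighted integral of incoming radiance over the hemisphere, discretized in spherical coordinates as:

$$
E(N) = \int_{\Omega} L_i(\omega)\,(N \cdot \omega)\, d\omega \;\approx\; \frac{\pi}{n_1 n_2} \sum_{\phi}^{n_1} \sum_{\theta}^{n_2} L_i(\phi, \theta)\, \cos\theta\, \sin\theta
$$

The $\sin\theta$ factor is the Jacobian of the spherical parameterization (samples near the pole cover less solid angle), which is why the shader weights each sample by `cos(theta) * sin(theta)` and normalizes by `PI / nrSamples` at the end.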
## Shader Code

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 450

layout (local_size_x = 8, local_size_y = 8) in;

layout(set = 0, binding = 0) uniform samplerCube sourceMap;
layout(set = 1, binding = 0, rgba32f) writeonly uniform imageCube diffuseMap;

const float PI = 3.14159265359;

layout(push_constant) uniform PushConstants {
    float sampleDelta;
} pushConstants;

void main()
{
    ivec3 storePos = ivec3(gl_GlobalInvocationID.xy, gl_GlobalInvocationID.z);
    vec2 uv = (vec2(storePos.xy) + 0.5) / vec2(imageSize(diffuseMap).xy); // Normalized coordinates
    uv = uv * 2.0 - 1.0; // Map to range [-1, 1]

    vec3 direction;
    switch (storePos.z) {
        case 0: direction = normalize(vec3(1.0, -uv.y, -uv.x)); break;  // +X
        case 1: direction = normalize(vec3(-1.0, -uv.y, uv.x)); break;  // -X
        case 2: direction = normalize(vec3(uv.x, 1.0, uv.y)); break;    // +Y
        case 3: direction = normalize(vec3(uv.x, -1.0, -uv.y)); break;  // -Y
        case 4: direction = normalize(vec3(uv.x, -uv.y, 1.0)); break;   // +Z
        case 5: direction = normalize(vec3(-uv.x, -uv.y, -1.0)); break; // -Z
        default: direction = vec3(0.0); break; // Should not happen
    }

    vec3 irradiance = vec3(0.0);

    // Convolution Code
    vec3 up = vec3(0.0, 1.0, 0.0);
    if (abs(direction.y) > 0.999) {
        // Handle parallel case
        up = vec3(0.0, 0.0, 1.0);
    }
    vec3 right = normalize(cross(up, direction));
    up = normalize(cross(direction, right));

    float sampleDelta = pushConstants.sampleDelta;
    float nrSamples = 0.0;
    for (float phi = 0.0; phi < 2.0 * PI; phi += sampleDelta) {
        for (float theta = 0.0; theta < 0.5 * PI; theta += sampleDelta) {
            // spherical to cartesian (in tangent space)
            vec3 tangentSample = vec3(sin(theta) * cos(phi), sin(theta) * sin(phi), cos(theta));
            // tangent space to world
            vec3 sampleVec = tangentSample.x * right + tangentSample.y * up + tangentSample.z * direction;

            irradiance += texture(sourceMap, sampleVec).rgb * cos(theta) * sin(theta);
            nrSamples++;
        }
    }
    irradiance = PI * irradiance * (1.0 / float(nrSamples));

    vec4 out_irradiance = vec4(irradiance, 1.0);
    imageStore(diffuseMap, storePos, out_irradiance);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The results can be seen below, though it can be difficult to tell the difference between this and a high-roughness specular mip. Because irradiance maps don't exhibit high-frequency data, they can be stored at a significantly lower resolution. A common idea is to store the irradiance map in one of the lowest mip levels of the specular cubemap mip chain. In my implementation, I store it in the 5th mip level. Since the specular cubemap starts at 512x512, this means that at mip level 5, the irradiance map has a resolution of 16x16. Even at that low a resolution, the results are still fairly convincing, though ideally the resolution should be slightly larger than 16; 32 or 64 is adequate.

![Irradiance Map](emapping/diffuseMap.png)
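Because the irradiance result lives in mip 5 of the same cubemap, the compute pass needs a storage image view restricted to that mip. A minimal sketch using standard Vulkan (the `device` and `environmentCubemap` handles are placeholders for your own objects; the image must have been created with `VK_IMAGE_USAGE_STORAGE_BIT`):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// View over only mip 5 of the cubemap, bound as the imageCube storage target
VkImageViewCreateInfo viewInfo{};
viewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
viewInfo.image = environmentCubemap.image;
viewInfo.viewType = VK_IMAGE_VIEW_TYPE_CUBE;
viewInfo.format = VK_FORMAT_R32G32B32A32_SFLOAT;
viewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
viewInfo.subresourceRange.baseMipLevel = 5; // DIFF_IRRA_MIP_LEVEL
viewInfo.subresourceRange.levelCount = 1;
viewInfo.subresourceRange.baseArrayLayer = 0;
viewInfo.subresourceRange.layerCount = 6;   // all 6 cube faces

VkImageView irradianceMipView;
vkCreateImageView(device, &viewInfo, nullptr, &irradianceMipView);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~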
# Using IBL maps

The generated maps should now all reside in a single cubemap, contained in its different mip levels. The cubemap should resemble this structure (e.g. max specular mip 5, base resolution 512):

- Mip 0: Specular Roughness 0.00 (0 / 4) @512
- Mip 1: Specular Roughness 0.25 (1 / 4) @256
- Mip 2: Specular Roughness 0.50 (2 / 4) @128
- Mip 3: Specular Roughness 0.75 (3 / 4) @64
- Mip 4: Specular Roughness 1.00 (4 / 4) @32
- Mip 5: Diffuse Irradiance @16

I follow the convention used by LearnOpenGL and only use the first few mip levels. Truth be told, I am unsure why the lower mip levels are not used; perhaps at very low resolutions, the image clarity is simply insufficient to describe specular data. The PBR shader will also need to be supplied with the BRDF LUT we generated earlier.

## Shader Code

**`indirect_input_structures.glsl`**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
struct Vertex {
    vec3 position;
    //float uv_x;
    float pad;
    vec3 normal;
    //float uv_y;
    float pad2;
    vec4 color;
    vec2 uv;
    uint material_index; // only use X, can pack with other values in future
    float pad3;
};

struct Material {
    vec4 color_factor;
    vec4 metal_rough_factors;
    vec4 texture_image_indices;
    vec4 texture_sampler_indices;
    vec4 alpha_cutoff; // only use X, can pack with other values in future
};

struct Model {
    mat4 model;
    uint vertex_offset;
    uint index_count;
    uint mesh_index;
    float pad;
};

layout(buffer_reference, std430) readonly buffer VertexBuffer {
    Vertex vertices[];
};

layout(buffer_reference, std430) readonly buffer ModelData {
    Model models[];
};

layout(buffer_reference, std430) readonly buffer MaterialData {
    Material materials[];
};

layout(set = 0, binding = 0) uniform addresses {
    VertexBuffer vertexbufferDeviceAddress;
    MaterialData materialBufferDeviceAddress;
    ModelData modelBufferDeviceAddress;
} buffer_addresses;

layout(set = 1, binding = 0) uniform sampler samplers[32];
layout(set = 1, binding = 1) uniform texture2D textures[255];

layout(set = 2, binding = 0) uniform GlobalUniform {
    mat4 view;
    mat4 proj;
    mat4 viewproj;
    vec4 ambientColor;
    vec4 sunlightDirection; // w for sun power
    vec4 sunlightColor;
    vec4 cameraPos;
} sceneData;

const uint DIFF_IRRA_MIP_LEVEL = 5;
const bool FLIP_ENVIRONMENT_MAP_Y = true;
const float MAX_REFLECTION_LOD = 4.0;

layout(set = 3, binding = 0) uniform samplerCube environmentDiffuseAndSpecular;
layout(set = 3, binding = 1) uniform sampler2D lut;

vec3 DiffuseIrradiance(vec3 N)
{
    vec3 ENV_N = N;
    if (FLIP_ENVIRONMENT_MAP_Y) {
        ENV_N.y = -ENV_N.y;
    }
    return textureLod(environmentDiffuseAndSpecular, ENV_N, DIFF_IRRA_MIP_LEVEL).rgb;
}

vec3 SpecularReflection(vec3 V, vec3 N, float roughness, vec3 F)
{
    vec3 R = reflect(-V, N);
    if (FLIP_ENVIRONMENT_MAP_Y) {
        R.y = -R.y;
    }
    // don't need to skip mip 5 because the LOD never goes above 4
    vec3 prefilteredColor = textureLod(environmentDiffuseAndSpecular, R, roughness * MAX_REFLECTION_LOD).rgb;
    vec2 envBRDF = texture(lut, vec2(max(dot(N, V), 0.0f), roughness)).rg;
    return prefilteredColor * (F * envBRDF.x + envBRDF.y);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**`pbr.vert`**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 460

#extension GL_EXT_buffer_reference : require
#extension GL_GOOGLE_include_directive : require
#include "indirect_input_structures.glsl"

layout (location = 0) out vec3 outPosition;
layout (location = 1) out vec3 outNormal;
layout (location = 2) out vec3 outColor;
layout (location = 3) out vec2 outUV;
layout (location = 4) flat out uint outMaterialIndex;

void main()
{
    Model m = buffer_addresses.modelBufferDeviceAddress.models[gl_InstanceIndex];
    uint targetVertex = gl_VertexIndex + m.vertex_offset;
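    // Vertex pulling: no fixed-function vertex input is used. Vertices are
    // read straight from a buffer device address, offset by the instance's
    // vertex_offset.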
    Vertex v = buffer_addresses.vertexbufferDeviceAddress.vertices[targetVertex];

    vec4 _position = vec4(v.position, 1.0);
    gl_Position = sceneData.viewproj * m.model * _position;

    mat3 i_t_m = inverse(transpose(mat3(m.model)));

    outPosition = vec3(m.model * _position);
    outNormal = i_t_m * v.normal;
    outColor = v.color.xyz;
    outUV = v.uv; //vec2(v.uv_x, v.uv_y);
    outMaterialIndex = v.material_index;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**`pbr.frag`**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#version 460

#extension GL_EXT_buffer_reference : require
#extension GL_GOOGLE_include_directive : require
#include "indirect_input_structures.glsl"

// All calculations are done in world space
layout (location = 0) in vec3 inPosition;
layout (location = 1) in vec3 inNormal;
layout (location = 2) in vec3 inColor;
layout (location = 3) in vec2 inUV;
layout (location = 4) flat in uint inMaterialIndex;

layout (location = 0) out vec4 outFragColor;

vec3 lambert(vec3 kD, vec3 albedo)
{
    return kD * albedo / 3.14159265359;
}

float D_GGX(vec3 N, vec3 H, float roughness)
{
    // Brian Karis says the GGX NDF has a longer "tail" that appears more natural
    float a = roughness * roughness;
    float a2 = a * a; // disney reparameterization
    float NdotH = max(dot(N, H), 0.0f);
    float NdotH2 = NdotH * NdotH;

    float num = a2;
    float premult = NdotH2 * (a2 - 1.0f) + 1.0f;
    float denom = 3.14159265359 * premult * premult;
    return num / denom;
}

float G_SCHLICKGGX(float NdotX, float k)
{
    float num = NdotX;
    float denom = NdotX * (1.0f - k) + k;
    return num / denom;
}

float G_SCHLICKGGX_SMITH(vec3 N, vec3 V, vec3 L, float roughness)
{
    // Smith method with "Disney reduced hotness" remapping - Brian Karis
    float r = (roughness + 1.0f);
    float k = (r * r) / 8.0f;
    float NDotV = max(dot(N, V), 0.0f);
    float NDotL = max(dot(N, L), 0.0f);

    float ggx2 = G_SCHLICKGGX(NDotV, k);
    float ggx1 = G_SCHLICKGGX(NDotL, k);

    return ggx1 * ggx2;
}

float unreal_fresnel_power(vec3 V, vec3 H)
{
    float VDotH = dot(V, H);
    return (-5.55473 * VDotH - 6.98316) * VDotH;
}

vec3 F_SCHLICK(vec3 V, vec3 H, vec3 F0)
{
    float VdotH = max(dot(V, H), 0.0f);
    // classic
    //return F0 + (1.0f - F0) * pow(1.0f - VdotH, 5.0f);
    // unreal optimized (spherical gaussian approximation)
    return F0 + (1.0f - F0) * pow(2.0f, unreal_fresnel_power(V, H));
}

void main()
{
    Material m = buffer_addresses.materialBufferDeviceAddress.materials[inMaterialIndex];

    uint colorSamplerIndex = uint(m.texture_sampler_indices.x);
    uint colorImageIndex = uint(m.texture_image_indices.x);
    vec4 _col = texture(sampler2D(textures[colorImageIndex], samplers[colorSamplerIndex]), inUV);

    uint metalSamplerIndex = uint(m.texture_sampler_indices.y);
    uint metalImageIndex = uint(m.texture_image_indices.y);
    vec4 _metal_rough_sample = texture(sampler2D(textures[metalImageIndex], samplers[metalSamplerIndex]), inUV);

    _col *= m.color_factor;
    if (_col.w < m.alpha_cutoff.w) { discard; }

    vec3 albedo = inColor * _col.xyz;
    float metallic = _metal_rough_sample.b * m.metal_rough_factors.x;
    float roughness = _metal_rough_sample.g * m.metal_rough_factors.y;

    vec3 light_color = sceneData.sunlightColor.xyz * sceneData.sunlightColor.w;

    vec3 N = normalize(inNormal);
    vec3 V = normalize(sceneData.cameraPos.xyz - inPosition);
    vec3 L = normalize(sceneData.sunlightDirection.xyz); // for point lights, light.pos - inPosition
    vec3 H = normalize(V + L);

    // SPECULAR
    float NDF = D_GGX(N, H, roughness);
    float G = G_SCHLICKGGX_SMITH(N, V, L, roughness);
    vec3 F0 = mix(vec3(0.04), albedo, metallic);
    vec3 F = F_SCHLICK(V, H, F0);

    vec3 numerator = NDF * G * F;
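    // Cook-Torrance: specular = D*G*F / (4 (N.V)(N.L)); the denominator is
    // clamped below to avoid division by zero at grazing angles.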
    float denominator = 4.0f * max(dot(N, V), 0.0f) * max(dot(N, L), 0.0f);
    vec3 specular = numerator / max(denominator, 0.001f);

    vec3 kS = F;
    vec3 kD = vec3(1.0f) - kS;
    kD *= 1.0f - metallic;

    // DIFFUSE
    float nDotL = max(dot(N, L), 0.0f);
    vec3 diffuse = lambert(kD, albedo);

    // REFLECTIONS
    vec3 irradiance = DiffuseIrradiance(N);
    vec3 reflection_diffuse = irradiance * albedo;
    vec3 reflection_specular = SpecularReflection(V, N, roughness, F);
    vec3 ambient = kD * reflection_diffuse + reflection_specular;

    vec3 final_color = (diffuse + specular) * light_color * nDotL;

    // the ambient (IBL) term is tone mapped and gamma corrected before adding
    vec3 corrected_ambient = ambient / (ambient + vec3(1.0f)); // Reinhard
    corrected_ambient = pow(corrected_ambient, vec3(1.0f / 2.2f)); // gamma correction
    final_color += corrected_ambient;

    outFragColor = vec4(final_color, _col.w);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It's a lot of shader code, but the above is a full PBR workflow that incorporates Specular and Diffuse IBL. A great deal of work has been done on this subject, and we can leverage some of the lessons learned, such as the changes made by Brian Karis at Unreal and those from Disney's PBR implementation. The results are extremely impressive and can help take your renderer to another level.

![](emapping/smoothMetallicBallsField.png)