Specular Mapping Shader

Specularity helps to create shininess, such as for metal or wet surfaces. For more detail on the widely used Phong illumination model, see Wikipedia or search the web. We will create a shader that supports specular texture maps and also self-illumination maps. There are 4 textures, of which 2 or 3 are used: Diffuse, Diffuse with Specular in Alpha, Colored Specular and Colored Emissive.

For this tutorial it is best to keep the Shader reference open as well. In short, shaders allow you to define how textures/colors should be applied to rendered surfaces. With the use of hardware shaders (gpuprograms) you can additionally deform geometry and do more complex lighting effects. Luxinia uses "shaders" to define "how" surfaces are blended/rendered, while "materials" mostly pass the "what", i.e. which textures are used. This cuts down redundancy, as you can use the same shader for multiple materials, which is also good for speed.

Shaders in Luxinia allow multiple techniques to be defined. Each technique corresponds to a certain hardware capability. When a shader is loaded, the first technique that matches the current hardware is used and the others are ignored. Ideally you should also provide a default technique that serves as a fallback.
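
The "first match wins" rule can be sketched outside the engine. The following is a minimal Python illustration, not Luxinia code: the capability names and the `pick_technique` helper are hypothetical, and the technique labels are only borrowed from this tutorial.

```python
def pick_technique(techniques, hardware_caps):
    """Return the name of the first technique whose requirements are all met."""
    for name, required in techniques:
        if required <= hardware_caps:  # subset test: every requirement is available
            return name
    return None

# Ordered from most to least demanding. The capability strings are made up
# for this sketch; the fallback comes last and requires nothing.
TECHNIQUES = [
    ("VID_ARB_VF_TEX4", {"vertex_program", "fragment_program", "tex4"}),
    ("VID_ARB_TEXCOMB", {"texture_combiners"}),
    ("VID_DEFAULT", set()),
]

old_card = {"texture_combiners"}
chosen = pick_technique(TECHNIQUES, old_card)
```

Because the list is scanned top to bottom, placing the fallback anywhere but last would hide every technique listed after it.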

We will start our shader with such a fallback path, and later add other techniques "on top" so that they are checked first on load.

// latest luxinia shader version is 310, this has been changed
// with 0.98. 

// should always be specified last as a fallback, in case the
// hardware isn't so good
        // Diffuse
        VTEX "Texture:1";
        // materials will pass this texture with the id 1

        // VTEX means the texture is vertexcolored: 
        // color.tex * color.vertex
        // Vertexcolors will also contain diffuse & ambient lighting
        // Self-Illum
        TEX "Texture:3";
        blendmode VID_ADD;
        // add the self-illumination texture on top.
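
The arithmetic behind these two stages can be checked per color channel. This is a small Python sketch, assuming colors are RGB tuples in [0,1] and that the result is clamped to 1.0 as on a fixed-point framebuffer; the function names are hypothetical.

```python
def modulate(a, b):
    """Component-wise product, as in color.tex * color.vertex."""
    return tuple(x * y for x, y in zip(a, b))

def add_clamped(a, b):
    """Component-wise sum, clamped to 1.0 (assumed framebuffer behaviour)."""
    return tuple(min(1.0, x + y) for x, y in zip(a, b))

def default_technique(tex_diffuse, vertex_color, tex_illum):
    base = modulate(tex_diffuse, vertex_color)  # VTEX stage
    return add_clamped(base, tex_illum)         # VID_ADD stage

result = default_technique((0.5, 0.5, 0.5), (0.5, 1.0, 0.0), (0.25, 0.0, 0.5))
```

Note how the third channel is black after the VTEX stage (vertex color 0.0) and only becomes visible through the self-illumination texture.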

Texture Combiner Version

Above that technique we will add a new one that uses more multi-texture features. While the same definition could be used in the default technique, we know that it may need more rendering passes on weaker hardware, hence we use a dedicated technique.

This VID_ARB_TEXCOMB technique is mostly for shader model 1 cards and even older ones with enhanced multitexturing support, such as early GeForce cards. Many "on-board" Intel chips may have similar capability as well.

The technique won't allow fragment/vertex programs, so we build up the effect with multitexturing. It works similarly to layers in a paint program: the first defined texture stage is the bottom, and then you add layers on top. Luxinia will try to compile the effect to a minimum number of rendering passes. Nevertheless you shouldn't add too many stages (layers).

    // the first texture layer is the diffuse texture
        // Diffuse/Spec combined
        VTEX "Texture:0";    
        // Specular
        // this is a special CubeMap that luxinia provides
        // specularity: (MAX(0, cubedirection dot (0,0,1)))^8

        // This is a special modifier that will create
        // texcoordinates for this Texture stage, so that
        // view vector is reflected at vertex using its normal.

        // Further the sun position will also change the
        // texcoordinates, in a fashion that those 
        // who point towards the sun will be 0,0,1

        blendmode VID_AMODADD_PREV;    
        // the blendmode for this texturestage
        // it is a built-in mode that will do:
        // color.previous + alpha.previous * color.tex
        // Because color.tex contains perpixel specular using the 
        // cubemap, we will get shiny effect where alpha value of 
        // the previous texture was high. 
        // The downside is that we can only use greyscale 
        // specularity.
        // Self-Illum
        TEX "Texture:3";
        blendmode VID_ADD;

We are done. The effect will take one or two rendering passes (3 textures used), depending on hardware capability. The vast majority of cards, even older ones, will be able to do it in one pass.
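
The two formulas quoted in the comments above can be tried out numerically. The Python sketch below assumes the cube map stores the value for the normalized lookup direction and that the combiner result is clamped to 1.0; both function names are hypothetical.

```python
import math

def cubemap_specular(direction):
    """max(0, dir . (0,0,1))^8 for the normalized lookup direction."""
    length = math.sqrt(sum(c * c for c in direction))
    z = direction[2] / length
    return max(0.0, z) ** 8

def amodadd_prev(prev_rgb, prev_alpha, tex_rgb):
    """color.previous + alpha.previous * color.tex, clamped to 1.0."""
    return tuple(min(1.0, p + prev_alpha * t) for p, t in zip(prev_rgb, tex_rgb))

# Reflection pointing straight at the sun: full highlight from the cube map.
highlight = cubemap_specular((0.0, 0.0, 1.0))
# A texel with 50% specular mask in its alpha channel receives half of it.
shiny = amodadd_prev((0.25, 0.25, 0.25), 0.5, (highlight, highlight, highlight))
# A texel with alpha 0 stays unchanged.
matte = amodadd_prev((0.25, 0.25, 0.25), 0.0, (highlight, highlight, highlight))
```

Since only the single alpha value of the previous stage scales the highlight, the specular contribution cannot be tinted per channel, which is the greyscale limitation mentioned in the comments.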

Cg Version

For more flexibility and functionality, GPU programs can be used to program vertex and fragment (pixel) processing. Luxinia allows you to pass GPU programs either as ASM-like code, which utilizes ARB_vertex_program and ARB_fragment_program, or in the high-level shading language Cg.

Depending on which you use, drawing works differently. Cg does not allow you to access the OpenGL state, such as texture matrices, fog and lighting values, directly; instead you must specify more precisely what you need and set up parameters. The benefit is that it can be faster and offer more features. The "ARB_program" GpuPrograms allow you to mix fixed-function processing with GpuPrograms, i.e. you can have a fragment program bound without needing to write a vertex program. From 0.98 on Cg was optimized; however, Cg mode requires that both a vertex and a fragment program are passed per pass.

The Rendering Pipeline basically works like this:

  • Application makes a draw call: it sends vertex data, which has attributes like "color, position, texturecoordinate", and the vertex indices for each triangle.
  • Vertex Processing: Every vertex of a triangle is transformed by the vertex stage, which writes the screenspace position. Furthermore you can specify several attribute interpolants that can later be used in the fragment processing stage. For example you can compute a color value and do lighting per-vertex. You may also deform vertices for animation effects.
  • Primitive Assembly: Every triangle is clipped at the view bounds, so that only visible triangles get shaded.
  • Rasterization & Interpolation: The triangle is turned into fragments (which later become pixels). Every attribute that was output by the vertex stage is interpolated across the triangle, so the midpoint of a triangle has the mean values of all three vertices involved.
  • Fragment Processing: At the fragment level you use the incoming attributes to compute the color output that is written to the active framebuffer (which can be the backbuffer or a texture). In the simplest form you just output a constant color, or you use the per-vertex color. But you can also use attributes as texture coordinates and sample textures, which you can combine into a final color value.
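
The interpolation step in the list above can be illustrated with barycentric weights. This is a simplified Python sketch that ignores perspective correction (which real hardware applies); `interpolate` is a hypothetical helper, not an engine function.

```python
def interpolate(attr_a, attr_b, attr_c, weights):
    """Blend one attribute of three vertices with barycentric weights."""
    wa, wb, wc = weights
    return tuple(wa * a + wb * b + wc * c
                 for a, b, c in zip(attr_a, attr_b, attr_c))

# Three vertices with pure red, green and blue as their color attribute.
red, green, blue = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

# At the midpoint all weights are equal, so the fragment gets the mean color.
centroid = interpolate(red, green, blue, (1 / 3, 1 / 3, 1 / 3))
# Exactly at the first vertex the fragment gets that vertex's value.
corner = interpolate(red, green, blue, (1.0, 0.0, 0.0))
```

The same blending is applied to every output of the vertex stage, whether it is a color, a texture coordinate or a normal.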

It is important to know that you cannot access a "neighboring" pixel's color or similar data. Every vertex and every fragment is computed on its own and has no knowledge of the "scene" or the mesh. It only has the attributes from the application or from the preceding vertex processing, plus uniform parameters you provide from the outside. Those uniforms typically store control values and matrices. For further reading, search the web for introductory articles on hardware shading. You will see that HLSL and Cg code can often be exchanged.

So let's build the Cg file with a Cg vertex and fragment program. First we define the "data" structures we use for input/output of the vertex and fragment program. This is not required, but often easier to work with.

// The data we receive from the application
// use only what you really need/want
struct AppVertex
{
        // the : is the semantic, ie. the binding to the
        // opengl state
        // you can access POSITION, NORMAL, COLOR, TEXCOORD0-7
        float4 oPos :         POSITION;
        float3 oNormal:     NORMAL;
        float4 tex0 :         TEXCOORD0;

        // Every vertex attribute is in object-space and not
        // transformed.
        // Transforms to world/view/screen-space are done
        // in vertex-shader
};
// This structure defines what the vertex-shader outputs
// and becomes input to fragment-shader
struct VertexOut
{
        float4 hpos :         POSITION;    // must-have

        // other outputs, you can use COLOR and TEXCOORD0-7
        float2 tex0 :         TEXCOORD0;
        float3 oNormal :     TEXCOORD1;
        float4 oPos :        TEXCOORD2;
};

// The Fragment-Shader output
// it must write to COLOR/COLOR0
struct FragOut{
        // must write to COLOR/COLOR0
        // COLOR1-3 need Multiple Render Targets support
        float4 color0 : COLOR0;
};

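The Vertex-Shader must write to POSITION, which is the screenspace (homogeneous clip-space) position. After the hardware divides by w, visible points lie within a box that has the extents [-1,+1] on every axis (OpenGL convention), with Z = -1 at the near plane and Z = +1 at the far plane.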

VertexOut vertex_main(AppVertex IN,
// among several automatic parameters, this matrix
// allows you to transform object-space to screenspace
uniform float4x4 WorldViewProjMatrix)
{
        VertexOut OUT;

        // transform to screenspace
        OUT.hpos = mul(WorldViewProjMatrix, IN.oPos);

        // output other attributes, for fragment-shader
        OUT.tex0 = IN.tex0.xy;
        OUT.oNormal = IN.oNormal;
        OUT.oPos = IN.oPos;

        return  OUT;
}
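
To see what the mul(WorldViewProjMatrix, IN.oPos) line and the later divide by w do to a point, here is a Python sketch. The matrix is a hypothetical OpenGL-style projection (90 degree field of view, aspect 1, near plane at 1, far plane at 3) with identity world and view transforms; it is not a matrix taken from Luxinia.

```python
def mul(matrix, vec):
    """Matrix times column vector, like Cg's mul(M, v)."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)

def perspective_divide(clip):
    """The divide by w that the hardware performs after the vertex stage."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

# Hypothetical projection matrix for this sketch only.
PROJ = (
    (1.0, 0.0,  0.0,  0.0),
    (0.0, 1.0,  0.0,  0.0),
    (0.0, 0.0, -2.0, -3.0),
    (0.0, 0.0, -1.0,  0.0),
)

# A point on the near plane and one on the far plane, straight ahead.
near_point = perspective_divide(mul(PROJ, (0.0, 0.0, -1.0, 1.0)))
far_point = perspective_divide(mul(PROJ, (0.0, 0.0, -3.0, 1.0)))
```

In this example the near and far planes land on z = -1 and z = +1. The vertex program itself only writes the four-component result of mul; the divide happens afterwards in hardware.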

The main work is done in the fragment program. We first declare a function that handles basic Phong lighting. The benefit is that we could also apply this function at the vertex level, or use it in other shaders. As you can see, Cg can pass by value (the default) or, similar to C pointers, by reference using the "out"/"inout" keywords.

void phong_shading(
// our inputs
float3 normal, float3 toLight, float3 toCam, float specpower,
// and variables we want to write into, or read as well
out float diffuse, inout float4 specular)
{
        // diffuse lighting term
        // saturate does clamp to [0,1]
        diffuse = saturate(dot(normal,toLight)) ;

        // specular term is how much the lightsource
        // is reflected to the eye.
        // We invert the toCam vector to make it
        // "toPos", as reflect function assumes incident
        // vectors.
        float3 toCamReflected = reflect(-toCam,normal);

        float spec = saturate(dot(toLight,toCamReflected));
        // specpower comes from our control vector
        // specularctrl, whose .w is the highlight sharpness
        specular *= pow(spec,specpower);
}
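
For readers who want to check the numbers without a GPU, the same lighting terms can be reproduced in Python. This is a sketch that mirrors the Cg function above; `dot`, `saturate` and `reflect` are reimplemented here following the Cg standard library definitions (reflect(i, n) = i - 2 * dot(n, i) * n), and the out/inout parameters become return values.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def saturate(x):
    """Clamp to [0, 1]."""
    return max(0.0, min(1.0, x))

def reflect(incident, normal):
    """Reflect an incident vector at a surface with the given unit normal."""
    d = dot(normal, incident)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

def phong_shading(normal, to_light, to_cam, specpower, specular):
    """Return (diffuse, specular) for unit-length input vectors."""
    diffuse = saturate(dot(normal, to_light))
    to_cam_reflected = reflect(tuple(-c for c in to_cam), normal)
    spec = saturate(dot(to_light, to_cam_reflected))
    factor = spec ** specpower
    return diffuse, tuple(s * factor for s in specular)

up = (0.0, 0.0, 1.0)
white = (1.0, 1.0, 1.0, 1.0)
# Light and camera both straight above the surface: full diffuse and highlight.
head_on = phong_shading(up, up, up, 16, white)
# Light grazing from the side: no diffuse, no highlight.
grazing = phong_shading(up, (1.0, 0.0, 0.0), up, 16, white)
```

Raising specpower narrows the range of angles for which the highlight stays bright, which is why the tutorial stores it as a tweakable control value.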

At the fragment level we perform the lighting and of course take textures into account.

FragOut fragment_main(VertexOut IN,
// various parameters we control from the SHD file
uniform float4 lightambient,
uniform float4 lightdiffuse,
uniform float4 oLightpos,
uniform float4 oCampos,
uniform float4 specularctrl,

// the textures
// there is sampler1D,2D,3D,CUBE,RECT
// and TEXUNIT0-15
// textures are bound to the TEXUNIT slot with the same
// order as they were defined in the SHD Pass
// That sometimes may not be the same as the
// "Material" slot!!
uniform sampler2D diffusemap : TEXUNIT0,
uniform sampler2D specmap : TEXUNIT1,
uniform sampler2D illummap : TEXUNIT2)
{
        // sample the textures
        float4 texcolor = tex2D(diffusemap,IN.tex0);
        float4 texspec = tex2D(specmap,IN.tex0);
        float4 texillum = tex2D(illummap,IN.tex0);

        // create a smooth per-pixel normal
        float3 normal = normalize(IN.oNormal);

        // create toLight direction vector
        float3 toLight = normalize(oLightpos.xyz-IN.oPos.xyz);
        // create toCam direction vector
        float3 toCam = normalize(oCampos.xyz-IN.oPos.xyz);

        float diffuse;
        float4 specular = float4(specularctrl.xyz,1);
        // float4(..,1) creates a vector4 with w = 1;
        // as outcolor is also a float4 we need to make the
        // vector compatible

        // compute the lighting terms; specularctrl.w is the
        // highlight sharpness
        phong_shading(normal,toLight,toCam,specularctrl.w,
                diffuse,specular);

        FragOut OUT;
        // final color output
        // we mix the lighting terms with lightcolors and
        // texturecolors
        OUT.color0 = texcolor*(diffuse*lightdiffuse+lightambient) +
        texspec * specular + texillum;

        // in all cases we want to keep original alpha
        OUT.color0.w = texcolor.w;

        return OUT;
}
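
The final color expression can be verified with a few hand-picked values. The Python sketch below treats every color as an RGBA tuple and takes the lighting terms as given inputs; `shade` is a hypothetical name for the last lines of fragment_main.

```python
def shade(texcolor, texspec, texillum, diffuse, specular,
          lightdiffuse, lightambient):
    """texcolor*(diffuse*lightdiffuse+lightambient) + texspec*specular + texillum,
    with alpha taken from the diffuse texture."""
    lit = tuple(diffuse * ld + la for ld, la in zip(lightdiffuse, lightambient))
    color = [tc * l + ts * sp + ti
             for tc, l, ts, sp, ti
             in zip(texcolor, lit, texspec, specular, texillum)]
    color[3] = texcolor[3]  # keep the original alpha
    return tuple(color)

pixel = shade(
    texcolor=(0.5, 0.5, 0.5, 1.0),
    texspec=(1.0, 1.0, 1.0, 1.0),
    texillum=(0.0, 0.0, 0.0, 0.0),
    diffuse=0.5,
    specular=(0.25, 0.25, 0.25, 0.25),
    lightdiffuse=(1.0, 1.0, 1.0, 1.0),
    lightambient=(0.0, 0.0, 0.0, 0.0),
)
```

Because the specular map multiplies the highlight per channel, this path supports colored specularity, unlike the texture combiner technique.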

Back to our SHD shader: the technique (VID_ARB_VF) is valid for most shader model 2 cards and above. SM2+ cards are the majority among gamers these days (GeForce5, Radeon9500, Intel950 and up). Adding _TEX4 to the technique name means that 4 texture units are required, i.e. 4 textures can be active at the same time.

    // When using vertex/fragment programs, we must specify
    // render passes manually. That is not needed in the 
    // VID_DEFAULT technique.

        // the effect will have just one Pass, which 
        // simply overwrites everything
        blendmode VID_REPLACE;
        BASE 0 "simplelighting.cg" "vertex_main"
        // BASE means, for vertex programs, that it is applied to
        // regular geometry.
        // If you want to code versions of the program
        // that allow boneanimation, you can also pass
        // those programs additionally with SKIN.

        // The first number is number of light sources.
        // We can define up to 5 different programs
        // depending on the light count.
        // As we already know that we will always have 1 sunlight,
        // we can ignore that and use the first program for all 
        // cases.
        // Finally provide string to programfile, 
        // and the entryname function. The latter is not needed
        // for lowlevel ASM shaders.

        VCG;    // The type of program, a Cg Vertex Program
                // you can also use VPROG for ARB_vertex_program 
                // files, which contain ASM like code.
                // however, it's not allowed to mix PROG and CG 
                // programs.
        BASE 0 "simplelighting.cg" "fragment_main"
        // similar as above, just that we picked another entryname

        FCG;    // a Cg Fragment Program
                // similar to above, FPROG exists here as well.

        param "oCampos" 0 VID_CAMPOS (0,0,0,0);
        param "oLightpos" 0 VID_LIGHTPOS (0,0,0,0) 0;
        param "lightdiffuse" 0 VID_LIGHTCOLOR  (0,0,0,0) 0;
        param "lightambient" 0 VID_LIGHTAMBIENT  (0,0,0,0) 0;
            // some automatic gpuparameters which will set values
            // for the variables we've given names for.
            // the number (here 0) is not used for Cg, but important
            // for PROG GpuPrograms.

            // the keyword defines what type of variable

            // The vector given for automatic variables will often
            // be overwritten, but might also contain control values

            // The last parameter in this case defines which 
            // light's values are used. Light 0 will always be 
            // the Sun, 1-3 the effect lights

        param "specularctrl" 0 VID_VALUE (1,1,1,16);
            // A parameter inside the gpuprogram, which
            // we want to control
            // xyz will be intensity (color), w will be power 
            // (sharpness of highlight)
        // Diffuse
        TEX "Texture:1";     // material passes the texture
        // Specular
        TEX "Texture:2";     // material passes the texture
        texcoord -1;        
            // with -1 we disable passing dedicated texture 
            // coordinates

            // as our model only has one set of texture coordinates
            // and those are already passed with Diffuse texture
            // we can speed things up by disabling this here.
            // By default each Texture stage passes texcoord 0;
        // self-illumination
        TEX "Texture:3";     // material passes the texture
        texcoord -1;        

Scene Setup

The scene is similar to "Using Models". Only a timer function was added to rotate the model, because specular highlights would not be noticeable in a static scene.

The model originally used "t33small.tga" as texture, we want to replace this assignment with our shader material. Therefore we first need to create a material, which we then assign to the meshes of the l3dmodel instance.

Material creation can be done in two ways:

  • material.load("filename.mtl"), which is a predefined material file and offers the most functionality
  • material.load("MATERIAL_AUTO:..."), which creates a simple material from the string

Because our material/shader combo isn't very complex, we will use the string generation method.

local mtl = material.load(
-- Our material uses one shader "3tex_diffspecillum.shd"
-- and 4 textures. Depending on hardware capability the shader
-- will load the textures it uses. The texture slots are filled 
-- ascending 0,1,2 ... same goes with the shaders. 
-- You must always define at least one shader

-- now we want to assign the material to the l3dmodel
-- l3dmodel.matsurface requires us to pass a meshid as well
-- we simply assign the material to all meshes
for i=1,t33:meshcount() do
    -- create a meshid 
    local mid = t33:meshid(i-1)    -- meshid indices start at 0
    -- assign material to l3dmodel's mesh
-- always set Renderflags AFTER material assignments