9. Geometry Shaders

Geometry shaders use vertex shader output to generate primitives and output an arbitrary number of vertices.

In OpenGL ES 2.0, you can pass GL_POINTS, GL_LINES, GL_LINE_STRIP, or GL_LINE_LOOP to the glDrawElements() or glDrawArrays() function to render primitives. On the 3DS system, however, you must use a geometry shader to generate non-triangle primitives. Triangles are rendered with GL_TRIANGLES, GL_TRIANGLE_STRIP, and GL_TRIANGLE_FAN, as in OpenGL ES 2.0, but multisample rendering is not supported.

As described in 5. Shader Programs, the SDK provides geometry shaders for the Nintendo 3DS system. You can use the following geometry shaders.

  • Point shaders
  • Line shaders
  • Silhouette shaders
  • Catmull-Clark subdivision shaders
  • Loop subdivision shader
  • Particle system shader

These geometry shaders cannot be used alone. Always load a binary in which a vertex shader is also linked, and make sure that the vertex shader is used. When a geometry shader is used, one of the four vertex processors is used as the geometry processor. The 15th Boolean register (b15) is reserved for the geometry shader.

Figure 9-1. Processor Configurations When the Geometry Shader Is Used and Not Used

(Figure labels: Vertex Input Process, Vertex Cache, Vertex Processor, Geometry Processor.)

Vertex shader output is used as input data to the geometry shader. Each geometry shader requires specific vertex attributes in a specific input order and can only use some vertex attributes. The vertex shader must output these values correctly. Data is input to the geometry shader in order, starting with the smallest register number output by the vertex shader. Output vertex attributes are defined by #pragma output_map, but attributes mapped to the generic name are used only by the geometry shader and are not handled by any later process (such as fragment processing). Geometry shaders have reserved uniforms.

The point shader, for example, uses reserved uniforms to configure the viewport and other settings. Because all of these reserved uniforms initially have undefined values, they must be set by the application.

9.1. Point Shaders

The point shader generates two triangle primitives to draw a square (a point primitive) centered at the specified vertex coordinates, with sides as long as the specified point size. The point sprite shader is a point shader that outputs texture coordinates s, t (shown in the following figure), r (fixed at 0.0), and q (fixed at 1.0), for applying textures to point primitives. Neither shader supports grid adjustments or multisample rendering.

Figure 9-2. Drawing Point Primitives

(Figure labels: PointSize, VertexCoord (Input), VertexCoord (Output), Point Primitive; texture coordinates at the four corners: (s, t) = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), and (1.0, 1.0).)

9.1.1. Shader Files

The shader file to link with the vertex shader is determined by two factors: the number of vertex attributes that are output for fragment processing but are not required by the point shader, and the number of texture coordinates output by the point sprite shader.

Points: DMP_pointN.obj

Point sprites: DMP_pointSpriteN_T.obj

Where N is the number of vertex attributes that are not required by the point (or point sprite) shader, and T is the number of texture coordinates.

9.1.2. Reserved Uniform

There are reserved uniforms for configuring the viewport, and for enabling or disabling distance attenuation on the point size. These reserved uniforms must be set by the application because they initially have undefined values.

Viewport

Use the glUniform2fv() function to set the reserved uniform dmp_Point.viewport to the reciprocals of the viewport's width and height.

Distance Attenuation

Use the glUniform1i() function to set the reserved uniform for distance attenuation on the point size, dmp_Point.distanceAttenuation. A value of GL_TRUE enables distance attenuation, and GL_FALSE disables it. The point size is multiplied by the clip coordinate Wc when distance attenuation is disabled. This cancels out the division by Wc during the conversion to window coordinates and prevents the size of displayed point primitives from being affected by distance.

Table 9-1. Reserved Uniforms for Point Shaders

| Reserved Uniform | Type | Value to Set |
|---|---|---|
| dmp_Point.viewport | vec2 | The reciprocals of the viewport's width and height: (1 / viewport.width, 1 / viewport.height) |
| dmp_Point.distanceAttenuation | bool | Specifies whether the point size is affected by distance attenuation. Specify GL_TRUE to enable distance attenuation, and GL_FALSE otherwise. |
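
As a minimal sketch of the viewport value in Table 9-1, the following C helper computes the two floats for dmp_Point.viewport. The function name point_viewport_uniform is hypothetical and is not part of the SDK; only the expression itself comes from the table.

```c
/* Hypothetical helper (not an SDK function).
   Fills out[2] with the value described for dmp_Point.viewport:
   (1 / viewport.width, 1 / viewport.height). */
void point_viewport_uniform(float viewport_width, float viewport_height,
                            float out[2])
{
    out[0] = 1.0f / viewport_width;
    out[1] = 1.0f / viewport_height;
}
```

The resulting array is the kind of value that would be passed to glUniform2fv() for dmp_Point.viewport, as described above.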

9.1.3. Vertex Shader Settings

The point shader requires vertex coordinates and a point size to render point primitives. The vertex shader outputs the vertex coordinates and then the point size in order (starting with the smallest output register number). This is followed by the texture coordinates for the point sprite shader. If multiple texture coordinate pairs are used, they must be packed two at a time into the xy and zw components of a single output register.

Output vertex attributes must be set as follows: position for the vertex coordinates, generic for the point size, and texture0 through texture2 for the texture coordinates.

If the vertex shader outputs the vertex color in addition to the vertex coordinates and point size required by the point shader, link the DMP_point1.obj shader file and use the following #pragma output_map statements in the vertex shader.

Code 9-1. Sample Output Register Settings When the Point Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( generic , o1 )
#pragma output_map ( color , o2 ) 

If the vertex shader outputs the vertex color in addition to the vertex coordinates, point size, and two texture coordinate pairs required by the point sprite shader, link the DMP_pointSprite1_2.obj shader file and use the following #pragma output_map statements in the vertex shader.

Code 9-2. Sample Output Register Settings When the Point Sprite Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( generic , o1 )
#pragma output_map ( texture0 , o2.xy )
#pragma output_map ( texture1 , o2.zw )
#pragma output_map ( color , o3 ) 

The point sprite shader outputs texture coordinates for point sprites that replace the input texture coordinates. Note that the vertex shader must write dummy values to the output registers for the texture coordinates that are replaced by the point sprite shader.

9.1.4. Input Vertex Data

To generate primitives with the point shader, call the glDrawElements() or glDrawArrays() function and specify GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter.

9.2. Line Shaders

A line shader generates two triangle primitives to draw a line (a line primitive) that connects two points specified by vertex coordinates. You can use a reserved uniform to set the line width. The line shader does not support grid adjustments or multisample rendering.

Figure 9-3. Drawing Line Primitives

(Figure labels: VertexCoord (Input), VertexCoord (Output), Line Primitive.)

Coordinates are generated for the four vertices that make a rectangle (parallelogram) from the slope and width of the line segment connecting the specified vertex coordinates. The coordinates of the four vertices are generated from the input vertex coordinates in the Y direction (or the X direction for some line segment slopes).

9.2.1. Shader Files

The shader file to link with the vertex shader is determined by the number of vertex attributes that are output for fragment processing but are not required by the line shader.

Vertex coordinates can be specified for separate lines or line strips. Separate lines are drawn using two sets of vertex coordinates per line. The first two vertices are used in the same way for both a line strip and separate lines, but in a line strip the next line is drawn from the coordinates of the second and third vertices. In other words, starting with the third vertex, each vertex is used together with the previous one to draw a single, connected line.

Separate lines: DMP_separateLineN.obj

Line strips: DMP_stripLineN.obj

Where N is the number of vertex attributes not required by the line shader.
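
The difference between the two input modes can be sketched by counting how many line primitives a given number of input vertices produces. The function names below are hypothetical and are used only for illustration.

```c
/* Hypothetical helpers (not SDK functions).
   Separate lines consume two vertices per line. */
int separate_line_count(int vertex_count)
{
    return vertex_count / 2;
}

/* In a line strip, every vertex from the second one onward
   forms a line with the previous vertex. */
int line_strip_count(int vertex_count)
{
    return vertex_count < 2 ? 0 : vertex_count - 1;
}
```

For example, six input vertices yield three separate lines but five connected lines in a strip.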

9.2.2. Reserved Uniform

There is a reserved uniform for setting the line width. The reserved uniform for the line width must be set by the application because its initial value is undefined.

Line Width

Use the glUniform4fv() function to set the reserved uniform for the line width, dmp_Line.width, using values calculated from both the line width and the width and height of the viewport.

Table 9-2. Reserved Uniforms for the Line Shader

| Reserved Uniform | Type | Value to Set |
|---|---|---|
| dmp_Line.width | vec4 | The line width, calculated with the following expression: (viewport.width / line.width, viewport.height / line.width, viewport.width * viewport.height, 2 / line.width) |
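
The expression in Table 9-2 can be sketched in C as follows. The helper name line_width_uniform is hypothetical; the four components are copied from the table as written.

```c
/* Hypothetical helper (not an SDK function).
   Fills out[4] with the value described for dmp_Line.width. */
void line_width_uniform(float line_width,
                        float viewport_width, float viewport_height,
                        float out[4])
{
    out[0] = viewport_width / line_width;
    out[1] = viewport_height / line_width;
    out[2] = viewport_width * viewport_height;
    out[3] = 2.0f / line_width;
}
```

The resulting array is the kind of value that would be passed to glUniform4fv() for dmp_Line.width.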

9.2.3. Vertex Shader Settings

The line shader requires vertex coordinates to render line primitives. The vertex shader outputs the vertex coordinates, starting with the smallest output register number.

One output vertex attribute must be set: the position attribute of the vertex coordinates.

If the vertex shader outputs the vertex color in addition to the vertex coordinates required by the separate line shader, link the DMP_separateLine1.obj shader file and use the following #pragma output_map statements in the vertex shader.

Code 9-3. Sample Output Register Settings When a Line Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 ) 

The line strip shader works the same way as the separate line shader. Its shader file is DMP_stripLine1.obj.

9.2.4. Input Vertex Data

To generate primitives with the line shader, call the glDrawElements() or glDrawArrays() function and pass GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter.

9.3. Silhouette Shaders

Silhouette shaders generate and render silhouettes around object edges. You can use silhouette edges to render object contours and, when combined with the shadow feature, soft shadows.

To generate silhouette edges, the silhouette shader needs a primitive called a triangle with neighborhood, or TWN for short.

9.3.1. Triangle With Neighborhood

Consider one of the triangles that make up an object on which silhouette edges are rendered. This triangle is called a center triangle. A triangle with neighborhood (TWN) comprises a center triangle and the three adjacent triangles that each share an edge with it.

Figure 9-4. Sample Triangles With Neighborhood

One TWN comprises the center triangle with vertices 3, 1, and 4, in addition to the three triangles defined by vertices 0, 1, and 3; 2, 4, and 1; and 6, 3, and 4. When vertices 3, 4, and 6 form the center triangle, vertices 3, 1, and 4 represent a triangle in a separate TWN.

TWNs are used to detect the silhouette edges of center triangles. You can make an object from TWNs to render silhouette edges on it.

9.3.2. Shader Files

TWN vertices can be input to a silhouette shader in two ways: (1) as silhouette triangles, one TWN at a time (just like normal triangles), or (2) as continuous silhouette strips of adjoining TWNs.

Silhouette triangles: DMP_silhouetteTriangle.obj

Silhouette strips: DMP_silhouetteStrip.obj

9.3.3. Reserved Uniforms

Silhouette shaders have the following reserved uniforms. These reserved uniforms must be set by the application because they initially have undefined values.

Polygon Facing

Use the glUniform1i() function to configure how the silhouette shader determines whether a polygon is front-facing (dmp_Silhouette.frontFaceCCW). Specify GL_TRUE if GL_CCW has been passed to the glFrontFace() function for object vertex input, or GL_FALSE if GL_CW has been passed.

Silhouette Edge Width

Use the glUniform2fv() function to set the silhouette edge width (dmp_Silhouette.width) to values calculated from a coefficient that is multiplied by the normal vector's x component.

You can configure the effect of a vertex's w component (dmp_Silhouette.scaleByW). Multiplying this component by the silhouette edge width disables distance attenuation. Call the glUniform1i() function and specify a value of GL_TRUE to multiply by the w component, or specify GL_FALSE to fix the silhouette edge's w component at 1.0.

Silhouette Edge Colors

Use the glUniform4fv() function to set the color of the silhouette edge (dmp_Silhouette.color) with R, G, B, and A values.

Open Edges

An open edge is an edge of the center triangle that is not shared with any other triangles. You can configure open edges (dmp_Silhouette.acceptEmptyTriangles) to always be drawn (GL_TRUE) or not (GL_FALSE).

Open edges differ from silhouette edges in that they are drawn like line primitives, without using vertex normals. As a result, they may look different from silhouette edges at some angles. Some settings are specific to open edges.

Open Edge Width

Use the glUniform4fv() function to set the open edge width (dmp_Silhouette.openEdgeWidth) using values calculated from a specified width and the viewport's width and height (just like a line shader).

Open Edge Colors

Use the glUniform4fv() function to set the open edge color (dmp_Silhouette.openEdgeColor) using R, G, B, and A values.

Open Edge Bias Toward the Viewpoint

Use the glUniform1fv() function to set the bias toward the viewpoint (dmp_Silhouette.openEdgeDepthBias). A negative value indicates movement toward the viewpoint, and a positive value indicates movement away from it. Normal vectors are not used to generate open edges, so this bias value adjusts their appearance.

Multiplying an Open Edge's w Component

You can multiply the open edge's width and bias values by a vertex's w component. Use the glUniform1i() function to set dmp_Silhouette.openEdgeWidthScaleByW and dmp_Silhouette.openEdgeDepthBiasScaleByW to GL_TRUE or GL_FALSE for each setting.

Table 9-3. Reserved Uniforms for Silhouette Shaders

| Reserved Uniform | Type | Value to Set |
|---|---|---|
| dmp_Silhouette.width | vec2 | Specifies the silhouette edge width using the following expression: (xscale_f, xscale_f * viewport.width / viewport.height), where xscale_f is the factor to multiply by the normal vector's x component. |
| dmp_Silhouette.scaleByW | bool | Specifies whether to apply a vertex's w component to silhouette edges. Specify GL_TRUE to apply the vertex's w component, and GL_FALSE otherwise. |
| dmp_Silhouette.color | vec4 | Specifies the silhouette edge color. |
| dmp_Silhouette.frontFaceCCW | bool | Specifies how a polygon is determined to be front-facing or back-facing. Specify GL_TRUE to set CCW (counterclockwise is front-facing), and GL_FALSE to set CW (clockwise is front-facing). |
| dmp_Silhouette.acceptEmptyTriangles | bool | Specifies whether to render open edges. Specify GL_TRUE to render open edges, and GL_FALSE otherwise. |
| dmp_Silhouette.openEdgeColor | vec4 | Specifies the open edge color. |
| dmp_Silhouette.openEdgeWidth | vec4 | Specifies the open edge width using the following expression: (viewport.width / silhouette.width, viewport.height / silhouette.width, viewport.width / viewport.height, 2 / silhouette.width) |
| dmp_Silhouette.openEdgeDepthBias | float | Specifies the open edge bias toward the viewpoint. A negative value indicates movement toward the viewpoint, and a positive value indicates movement away from it. |
| dmp_Silhouette.openEdgeWidthScaleByW | bool | Specifies whether to multiply the open edge width by a vertex's w component. Specify GL_TRUE to multiply the open edge width by a vertex's w component, and GL_FALSE otherwise. |
| dmp_Silhouette.openEdgeDepthBiasScaleByW | bool | Specifies whether to multiply the open edge bias toward the viewpoint by a vertex's w component. Specify GL_TRUE to multiply the open edge bias by a vertex's w component, and GL_FALSE otherwise. |
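
As a rough sketch, the dmp_Silhouette.width expression in Table 9-3 could be computed in C as shown below. The helper name silhouette_width_uniform is hypothetical, and xscale_f is whatever factor the application chooses.

```c
/* Hypothetical helper (not an SDK function).
   Fills out[2] with the value described for dmp_Silhouette.width:
   (xscale_f, xscale_f * viewport.width / viewport.height). */
void silhouette_width_uniform(float xscale_f,
                              float viewport_width, float viewport_height,
                              float out[2])
{
    out[0] = xscale_f;
    out[1] = xscale_f * viewport_width / viewport_height;
}
```

The second component scales the factor by the viewport's aspect ratio; the resulting array is the kind of value that would be passed to glUniform2fv().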

9.3.4. Vertex Shader Settings

A silhouette shader requires three items to render silhouette edges: vertex coordinates, vertex colors, and normal vectors. The vertex shader outputs the vertex coordinates, vertex color, and then the normal vector in that order, starting with the smallest output register number.

The output vertex attributes must be set using the vertex coordinates in the position attribute, the vertex color in the color attribute, and the normal vector in the generic attribute.

To output silhouette triangles, link the DMP_silhouetteTriangle.obj shader file, and use the following #pragma output_map statements in the vertex shader.

Code 9-4. Sample Output Register Settings When a Silhouette Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 )
#pragma output_map ( generic , o2 ) 

The vertex shader applies the modelview transformation to the normal vectors input from the application. The output must be normalized using only the x and y components. In other words, given the normal vector (nx, ny, nz) in the viewpoint coordinate system, the vertex shader must output a normal vector whose x and y components are nx / sqrt(nx * nx + ny * ny) and ny / sqrt(nx * nx + ny * ny).

This is shown by the following shader assembly code. aNormal specifies the normal vector input from the application, and vNormal specifies the normal vector output to the silhouette shader.

Code 9-5. Normalizing Input Normal Vectors for the Silhouette Shader (in Shader Assembly)
mov     TEMP_NORM,      CONST_0
dp3     TEMP_NORM.x,    aNormal,        MATRIX_ModelView[0]
dp3     TEMP_NORM.y,    aNormal,        MATRIX_ModelView[1]
mul     TEMP,           TEMP_NORM,      TEMP_NORM
add     TEMP,           TEMP.x,         TEMP.y
rsq     TEMP,           TEMP.x
mul     vNormal,        TEMP_NORM,      TEMP 

To output silhouette strips, use the same procedure as you would for silhouette triangles, but link the DMP_silhouetteStrip.obj shader file instead.

9.3.5. Input Vertex Data

To render silhouette edges with the silhouette shader, call the glDrawElements() or glDrawArrays() function, and specify GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter. You must also disable culling by calling glDisable(GL_CULL_FACE) before rendering.

Given the characteristics of TWNs, we also recommend using vertex indices to input vertex data to the silhouette shader. The following description assumes that you are using vertex indices. If you are not, you must order the vertex data according to the TWN vertex input rules.

9.3.6. Silhouette Triangle Indices

Silhouette triangles are input to the shader, one TWN at a time, using six vertices per TWN.

Vertices are input in the following order.

  1. The first vertex of the center triangle.
  2. The second vertex of the center triangle.
  3. The remaining vertex of the adjacent triangle that shares the edge created by the first and second vertices of the center triangle.
  4. The third vertex of the center triangle.
  5. The remaining vertex of the adjacent triangle that shares the edge created by the first and third vertices of the center triangle.
  6. The remaining vertex of the adjacent triangle that shares the edge created by the second and third vertices of the center triangle.

The object in the following figure shows how to specify indices.

Figure 9-5. Specifying Indices With TWNs

A setting of CCW is assumed for determining polygon facing.

Having selected the center triangles (1,4,3) and (3,4,6) so that they are front-facing, specify the indices as follows.

Triangle (1,4,3): 1, 4, 2, 3, 0, 6.
Triangle (3,4,6): 3, 4, 1, 6, 5, 7.
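
The input order above can be sketched as a small C helper that assembles the six indices of one TWN. The function name and parameter names are hypothetical. The caller is assumed to know, for each edge of the center triangle, the remaining vertex of the adjacent triangle.

```c
/* Hypothetical helper (not an SDK function).
   center[3]: vertices of the center triangle in input order.
   adj01, adj02, adj12: remaining vertex of the triangle adjacent to the
   edge formed by the (first, second), (first, third), and
   (second, third) vertices of the center triangle, respectively. */
void twn_indices(const unsigned short center[3],
                 unsigned short adj01,
                 unsigned short adj02,
                 unsigned short adj12,
                 unsigned short out[6])
{
    out[0] = center[0];
    out[1] = center[1];
    out[2] = adj01;
    out[3] = center[2];
    out[4] = adj02;
    out[5] = adj12;
}
```

With center triangle (1, 4, 3) and adjacent vertices 2, 0, and 6, this reproduces the index list 1, 4, 2, 3, 0, 6 given above.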


Silhouette edges are not generated for a degenerate center triangle. The case in which an edge is shared by three or more triangles is not taken into account.

9.3.7. Silhouette Strip Indices

Silhouette strips are input to the shader as consecutive TWNs, using six vertices for the first TWN and two vertices for each subsequent TWN. You can expect this to be at least twice as efficient as using silhouette triangles to render the same model.

Silhouette strips can be used as input when the edge formed by the second and third vertices of a center triangle, together with one more vertex, forms the center triangle of the next TWN. In other words, the last adjacent triangle of a TWN is the center triangle of the next TWN.

Vertices are input in the following order.

  1. The first six vertices are the same as those used for silhouette triangles. The second and third vertices of the center triangle and the last specified vertex represent the first, second, and third vertices of the next center triangle.
  2. The remaining vertex of the adjacent triangle that shares the edge created by the first and third vertices of the center triangle.
  3. The remaining vertex of the adjacent triangle that shares the edge created by the second and third vertices of the center triangle.
  4. The second and third vertices of the center triangle and the vertex specified in step 3 form vertices 1 through 3 of a new center triangle. Repeat steps 2 and 3.

To stop entering silhouette strips or to enter the next silhouette strip, input the third vertex of the center triangle before the vertex specified in step 3.

Figure 9-6. Sample Indices for Silhouette Strips

Assuming (3,8,7) as the final center triangle in the figure, the silhouette strip indices would be specified in the following order: 1, 4, 2, 3, 0, 8, 6, 7, 5, 7, 9.

The second 7 (the tenth index) is the vertex used to indicate the end of the silhouette strip.

To continue entering silhouette strips after you have specified the end of one strip, you must first enter the six vertices that start a new TWN. If the first center triangle of the new silhouette strip faces in the opposite direction from the one specified using glFrontFace(), enter its first vertex twice. In other words, if the silhouette strip in the previous example was initially back-facing, its indices would be specified in the following order: 1, 1, 4, 2, 3, 0, 8, 6, 7, 5, 7, 9.

The end of a silhouette strip is specified as a delimiter when inputting multiple silhouette strips. However, if the end of the last silhouette strip is not specified, further input triangles connected to the silhouette strip may cause duplicate silhouette edges to be rendered. The end of each input silhouette strip must be specified when alpha blending is used on silhouettes.

Note that a degenerate center triangle is considered to specify the end of a strip.
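
The strip rules above can be sketched in C as follows. This is only an illustration under stated assumptions: the function name and its parameters are hypothetical, the caller supplies the first TWN and the later (step 2, step 3) vertex pairs, and at least one pair follows the first TWN.

```c
#include <stddef.h>

/* Hypothetical helper (not an SDK function).
   first[6]:  the six indices of the first TWN.
   pairs:     2 * pair_count indices, one (step 2, step 3) pair per
              subsequent TWN. Assumes pair_count >= 1.
   first_is_back_facing: nonzero to enter the first vertex twice.
   out:       must hold at least 6 + 2 * pair_count + 2 entries.
   Returns the number of indices written. */
size_t silhouette_strip_indices(const unsigned short first[6],
                                const unsigned short *pairs,
                                size_t pair_count,
                                int first_is_back_facing,
                                unsigned short *out)
{
    size_t n = 0;
    size_t i;
    /* Third vertex of the current center triangle. After the first TWN,
       the next center triangle is (first[1], first[3], first[5]). */
    unsigned short third = first[5];

    if (first_is_back_facing)
        out[n++] = first[0];
    for (i = 0; i < 6; ++i)
        out[n++] = first[i];

    for (i = 0; i < pair_count; ++i) {
        out[n++] = pairs[2 * i];          /* step 2 vertex */
        if (i + 1 == pair_count)
            out[n++] = third;             /* end-of-strip marker */
        out[n++] = pairs[2 * i + 1];      /* step 3 vertex */
        third = pairs[2 * i + 1];         /* third vertex of next center */
    }
    return n;
}
```

With the first TWN 1, 4, 2, 3, 0, 8 and the pairs (6, 7) and (5, 9), the helper produces 1, 4, 2, 3, 0, 8, 6, 7, 5, 7, 9, which matches the example for Figure 9-6.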

9.3.8. Open Edges

As explained earlier, an open edge is an edge of the center triangle that is not shared with any other triangle. Because no adjacent triangle exists for that edge, its indices are specified by using the remaining vertex of the center triangle. You can visualize an open edge as an adjacent triangle that has been folded precisely over the center triangle.

The indices for the object in Figure 9-5 would be specified as follows, if vertex 0 did not exist.

The indices for the silhouette triangle (1, 4, 3) would be specified in the following order: 1, 4, 2, 3, 4, 6.

The silhouette strip indices would be specified in the following order: 1, 4, 2, 3, 4, 6, 5, 6, 7.

Open edges must be configured differently than silhouette edges. For more information, see 9.3.3. Reserved Uniforms.

Figure 9-7. Sample Indices for Open Edges
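
The substitution rule for open edges can be sketched in C as shown below. The function name is hypothetical, and using a negative value to mark a missing adjacent triangle is an assumption made only for this illustration.

```c
/* Hypothetical helper (not an SDK function).
   center[3]: vertices of the center triangle in input order.
   adj01, adj02, adj12: remaining vertex of the triangle adjacent to the
   edge (first, second), (first, third), or (second, third); pass a
   negative value when that edge is an open edge. For an open edge, the
   remaining vertex of the center triangle is used instead. */
void twn_indices_open(const int center[3],
                      int adj01, int adj02, int adj12,
                      int out[6])
{
    out[0] = center[0];
    out[1] = center[1];
    out[2] = adj01 >= 0 ? adj01 : center[2];
    out[3] = center[2];
    out[4] = adj02 >= 0 ? adj02 : center[1];
    out[5] = adj12 >= 0 ? adj12 : center[0];
}
```

For center triangle (1, 4, 3) with no triangle adjacent to the edge between vertices 1 and 3, this yields 1, 4, 2, 3, 4, 6, as in the example above.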

9.3.9. Generating Silhouette Edges

A silhouette edge is rendered by generating a new rectangular polygon on the edge of a front-facing center triangle and a back-facing adjacent triangle in a TWN. Two vertices are added along the normal vectors (n1 and n2) of the two vertices (1 and 2) shared by the center and adjacent triangle to generate a rectangular polygon (from two triangular polygons).

Figure 9-8. Rectangular Polygon That Forms a Silhouette Edge


The coordinates (x', y', z', w') of the vertices to add are calculated by the following equations, where (x, y, z, w) are the coordinates of a vertex on the center triangle, and (nx, ny, nz) is its normal vector.

x' = x + x_factor * nx * w_scale
y' = y + y_factor * ny * w_scale
z' = z
w' = w

The reserved uniform for silhouette edge width sets the values applied to x_factor, y_factor, and w_scale.
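
The equations above translate directly into C. The following sketch uses a hypothetical Vec4 type and function name, and takes x_factor, y_factor, and w_scale as plain parameters because this section does not spell out exactly how they are derived from the reserved uniform.

```c
/* Hypothetical type and helper (not part of the SDK). */
typedef struct {
    float x, y, z, w;
} Vec4;

/* Returns the vertex added along the normal (nx, ny) of vertex v.
   Only x and y are offset; z and w are copied unchanged. */
Vec4 silhouette_offset_vertex(Vec4 v, float nx, float ny,
                              float x_factor, float y_factor,
                              float w_scale)
{
    Vec4 r = v;
    r.x = v.x + x_factor * nx * w_scale;
    r.y = v.y + y_factor * ny * w_scale;
    return r;
}
```

Applying this to both shared vertices of an edge gives the two extra vertices of the rectangular polygon.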

9.4. Catmull-Clark Subdivision Shaders

The Catmull-Clark subdivision shader uses quadrilateral polygons and their surrounding vertices to split groups of vertices into smooth polygons. Note: The word subdivision always indicates Catmull-Clark subdivision within this section.

9.4.1. Subdivision Patches

To subdivide polygons, give the shader a set of polygons comprising only quadrilaterals (a Catmull-Clark subdivision patch, or subdivision patch for short). A subdivision patch is made up of the target (center) quadrilateral and the group of quadrilaterals that share edges formed by that center quadrilateral's four vertices.

Figure 9-9. Sample Catmull-Clark Subdivision Patch

A subdivision patch can only be applied to a polygon model that is entirely made up of quadrilaterals.

Each vertex in the central quadrilateral usually forms four edges, like vertices 5, 6, and 10. A vertex that forms a different number of edges (three or five, for example), like vertex 9, is called an extraordinary point, and its edge count is called its valence. A subdivision patch can have only one extraordinary point, which must be in the central quad and have a valence from 3 through 12.

If a subdivision patch's central quad has an extraordinary point, you must start specifying indices from that point. Any other subdivision patches with central quads that have the same extraordinary point must be entered consecutively. Otherwise, holes may appear in the mesh.

9.4.2. Shader Files

The shader file to link with the vertex shader is determined by the number of vertex attributes output for fragment processing that are not required by the subdivision shader.

Subdivision: DMP_subdivisionN.obj

Where N is the number of vertex attributes (1 through 6) that are not required by the subdivision shader.

9.4.3. Reserved Uniforms

Subdivision shaders have the following reserved uniforms. These reserved uniforms must be set by the application because they initially have undefined values.

Subdivision Level

Use the glUniform1f() function to set the subdivision level (dmp_Subdivision.level), which controls how finely the shader subdivides polygons. Higher levels (larger numbers) indicate finer subdivision. At the smallest value of 0, a single vertex is added to the center of the subdivision patch, and the patch's original vertex coordinates are adjusted.

Using Quaternions

For vertex attributes other than the vertex coordinates, the subdivision shader gives the new output vertices interpolated values. Because quaternions must be subdivided in a particular way, the shader must be notified when it is given quaternions.

Use the glUniform1i() function to either enable (GL_TRUE) or disable (GL_FALSE) quaternions (dmp_Subdivision.fragmentLightingEnabled).

Table 9-4. Reserved Uniforms for the Catmull-Clark Subdivision Shader

| Reserved Uniform | Type | Value to Set |
|---|---|---|
| dmp_Subdivision.level | float | Specifies the subdivision level: 0 (the lowest subdivision level), 1, or 2 (the highest subdivision level). |
| dmp_Subdivision.fragmentLightingEnabled | bool | Specifies whether to use the quaternions required for fragment lighting. Specify GL_TRUE to use quaternions, and GL_FALSE otherwise. |

9.4.4. Vertex Shader Settings

The subdivision shader requires one vertex attribute: the vertex coordinates. The vertex shader outputs the vertex coordinates, starting with the smallest output register number.

Two output vertex attributes must be set: the position attribute of the vertex coordinates and one other attribute.

If the vertex shader outputs the vertex color in addition to the vertex coordinates required by the subdivision shader, link the DMP_subdivision1.obj shader file and use the following #pragma output_map statements in the vertex shader.

Code 9-6. Sample Output Register Settings When the Catmull-Clark Subdivision Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 ) 

When quaternions are required for fragment lighting, they must be output to the register with the smallest number following the register that outputs the vertex coordinates.

Code 9-7. Sample Output Register Settings When the Subdivision Shader Uses Quaternions (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( quaternion , o1 )
#pragma output_map ( color , o2 ) 

9.4.5. Input Vertex Data

To use Catmull-Clark subdivision, call the glDrawElements() function, and specify GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter. You cannot use the glDrawArrays() function. The vertex indices must also be supplied from a vertex buffer.

9.4.6. Subdivision Patch Indices

The number of vertices in a subdivision patch depends on the valence (from 3 through 12) of its extraordinary point. A subdivision patch does not have a fixed number of input vertices.

Enter the size of the subdivision patch first. Specify the size of the subdivision patch as the number of vertices that it includes. To find this number, double the extraordinary point's valence, and add 8. A subdivision patch's size must be in the range from 14 through 32. It is 16 when the patch does not have an extraordinary point. Behavior is undefined for input patch sizes larger than 32.
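
The size rule above can be sketched as a one-line C function. The function name is hypothetical, and the assertion simply encodes the valence range stated in this section.

```c
#include <assert.h>

/* Hypothetical helper (not an SDK function).
   Returns the subdivision patch size (number of vertices) for a
   central quad whose extraordinary point has the given valence.
   A valence of 4 corresponds to a patch without an extraordinary
   point. */
int subdivision_patch_size(int valence)
{
    assert(valence >= 3 && valence <= 12);
    return 2 * valence + 8;
}
```

A valence of 4 gives 16, and the limits of 3 and 12 give the stated range of 14 through 32.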

Specify a subdivision patch's indices in the following order.

  1. The four vertices in the central quad (following the order specified when glFrontFace was called, starting from the extraordinary point, if it exists).
  2. The vertices around the central quad (following the order specified when glFrontFace was called).

Assuming that polygons with a counterclockwise (CCW) winding are front-facing, the indices for the subdivision patch in Figure 9-9 would be specified in the following order.

Subdivision patch indices: 18, 9, 5, 6, 10, 8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12.

The first number, 18, is the size of the subdivision patch. It is followed by the indices of the central quad, starting with the extraordinary point (vertex 9), and then by the vertices around the central quad.

For the subdivision patch in the following figure, the resulting subdivision patch indices would be 18, 9, 10, 16, 15, 5, 6, 7, 11, 17, 21, 20, 19, 18, 14, 13, 12, 8, 4.

Figure 9-10. Sample Subdivision Patch Indices

9.5. Loop Subdivision Shader

The Loop subdivision shader uses triangular polygons and their surrounding vertices to split groups of vertices into smooth polygons. The word subdivision always indicates Loop subdivision within this section.

9.5.1. Subdivision Patches

To subdivide polygons, give the shader a Loop subdivision patch (referred to later simply as a subdivision patch). A subdivision patch comprises the target (center) triangle and the group of vertices that share edges with that triangle's three vertices.

Figure 9-11. Sample Loop Subdivision Patch

Each vertex in the center triangle forms a number of edges that is called its valence.

With vertices 0, 1, and 2 in Figure 9-11 set as the center triangle, the valence of vertex 0 is 6, the valence of vertex 1 is 7, and the valence of vertex 2 is 6. For a subdivision patch to be accepted, the valence of each vertex of the center triangle must be from 3 through 12, and the total of the valences of the three vertices of the center triangle must be 29 or less.

By adding virtual vertices, you can include a vertex with a valence of 2 (a vertex that only shares an edge with other vertices in the center triangle) in a subdivision patch.
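
These valence limits could be checked on the application side before a patch is submitted. The following C sketch uses a hypothetical function name and is not an SDK function.

```c
#include <stdbool.h>

/* Hypothetical helper (not an SDK function).
   Returns true when the valences of the three center-triangle
   vertices satisfy the limits stated above: each from 3 through 12,
   and a total of 29 or less. */
bool loop_patch_valences_ok(int v0, int v1, int v2)
{
    const int v[3] = { v0, v1, v2 };
    int total = 0;
    int i;

    for (i = 0; i < 3; ++i) {
        if (v[i] < 3 || v[i] > 12)
            return false;
        total += v[i];
    }
    return total <= 29;
}
```

The valences 6, 7, and 6 from Figure 9-11 pass this check, with a total of 19.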

9.5.2. Shader Files

The number of output registers configured to send non-valence vertex attributes to the subdivision shader determines which shader file to link with the vertex shader. There must be at least one output register because vertex coordinates are required output.

Loop subdivision: DMP_loopSubdivisionN.obj

Where N is the number of output registers (1 through 4) configured with vertex attributes, other than the valence.

9.5.3. Reserved Uniform

The Loop subdivision shader has the same reserved uniforms as the Catmull-Clark subdivision shader. For more information, see 9.4.3. Reserved Uniforms. These reserved uniforms must be set by the application because their initial values are undefined.

Although new vertices are not added when the subdivision level is 0, the subdivision patch's original vertex coordinates are adjusted.

9.5.4. Vertex Shader Settings

The subdivision shader requires two attributes: the vertex coordinates and the valence. The vertex shader outputs these attributes, starting at the smallest output register number. The vertex coordinates are first, followed by any other vertex attributes that exist, and finally, followed by the valence.

Two output vertex attributes must be set: the vertex coordinates using the position attribute, and the valence using the generic attribute.

If the vertex shader outputs the vertex color in addition to the vertex coordinates required by the subdivision shader, link the DMP_loopSubdivision2.obj shader file (because the vertex coordinates are also included in the number of output registers) and use the following #pragma output_map statements in the vertex shader.

Code 9-8. Sample Output Register Settings When the Loop Subdivision Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 )
#pragma output_map ( generic, o2 ) 

When quaternions are required for fragment lighting, they must be output to the register with the smallest number following the register that outputs the vertex coordinates. Because the number of non-valence output registers is limited to four or fewer, multiple attributes must be packed into a single register when there are five or more vertex attributes other than the valence. Quaternions, however, cannot be packed with other vertex attributes.

Code 9-9. Sample Output Register Settings When the Subdivision Shader Uses a Large Number of Vertex Attributes (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( quaternion , o1 )
#pragma output_map ( color , o2 )
#pragma output_map ( texture0 , o3.xy )
#pragma output_map ( texture1 , o3.zw )
#pragma output_map ( generic , o4 ) 

9.5.5. Input Vertex Data

To use Loop subdivision, call the glDrawElements() function and pass GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter. You cannot use the glDrawArrays() function. The vertex indices must also be supplied through a vertex buffer.

9.5.6. Subdivision Patch Indices

The number of vertices in a subdivision patch depends on the total valence of the center triangle. A subdivision patch does not have a fixed number of input vertices.

Enter the size of the subdivision patch first. Specify the size of the subdivision patch as three plus the total valence of the vertices that form the center triangle.

Specify a subdivision patch's indices in the following order.

  1. The indices of the three vertices (v0, v1, and v2) that form the center triangle (following the order specified when glFrontFace was called).
  2. All vertices that share an edge with v0 (in any order, although the same order must also be used by any other subdivision patch that includes the same vertex).
  3. All vertices that share an edge with v1 (in any order, although the same order must also be used by any other subdivision patch that includes the same vertex).
  4. All vertices that share an edge with v2 (in any order, although the same order must also be used by any other subdivision patch that includes the same vertex).
  5. A fixed value of 12 and the center triangle's three vertices (12, v0, v1, and v2).
  6. A vertex (e00) that forms a triangle with v0 and v2 and is not in the center triangle.
  7. A vertex (e10) that forms a triangle with v0 and v1 and is not in the center triangle.
  8. A vertex (e20) that forms a triangle with v1 and v2 and is not in the center triangle.
  9. The vertex that shares an edge with v0 and is next to e00, in counterclockwise order.
  10. The vertex that shares an edge with v1 and is next to e10, in counterclockwise order.
  11. The vertex that shares an edge with v2 and is next to e20, in counterclockwise order.
  12. The vertex that shares an edge with v0 and is next to e10, in clockwise order.
  13. The vertex that shares an edge with v1 and is next to e20, in clockwise order.
  14. The vertex that shares an edge with v2 and is next to e00, in clockwise order.

Assuming that polygons with a counterclockwise (CCW) winding are front-facing, the indices for the subdivision patch in Figure 9-11 would be specified in the following order.

Subdivision Patch Indices:

22, 0, 1, 2, 1, 2, 12, 3, 4, 5, 2, 0, 5, 6, 7, 8, 9, 0, 1, 9, 10, 11, 12,
12, 0, 1, 2, 12, 5, 9, 3, 6, 10, 4, 11, 8.

The number 22 at the start of the first line is the size of the subdivision patch. The number 12 at the start of the second line is a fixed value.

When another subdivision patch uses v0 in its center triangle, the vertices that share an edge with it must be specified in the same order: (1, 2, 12, 3, 4, 5). The same applies to v1 and v2.

Although some vertices will be specified more than once when specifying a subdivision patch, they are read from the cache after vertex processing, and do not actually impose a performance penalty.
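
The ordering rules above can be sketched as a small routine that assembles the index list for one patch. This is only an illustration: the LoopPatch structure, its field names, and the use of 16-bit indices are assumptions made for this example and are not part of the SDK.

```c
#include <stddef.h>

/* Hypothetical description of one Loop subdivision patch. */
typedef struct {
    unsigned short center[3];           /* v0, v1, v2 in front-face winding order */
    const unsigned short *neighbors[3]; /* vertices sharing an edge with v0, v1, v2 */
    size_t valence[3];                  /* length of each neighbor list */
    unsigned short e[3];                /* e00, e10, e20 (items 6 through 8) */
    unsigned short next_ccw[3];         /* items 9 through 11 */
    unsigned short next_cw[3];          /* items 12 through 14 */
} LoopPatch;

/* Writes the patch size followed by items 1 through 14 into out.
   Returns the number of indices written, or 0 if capacity is too small. */
size_t build_loop_patch_indices(const LoopPatch *p, unsigned short *out, size_t capacity)
{
    size_t total_valence = p->valence[0] + p->valence[1] + p->valence[2];
    size_t needed = 1 + 3 + total_valence + 4 + 9;
    size_t n = 0;

    if (capacity < needed)
        return 0;

    out[n++] = (unsigned short)(3 + total_valence);       /* patch size */
    for (int i = 0; i < 3; ++i) out[n++] = p->center[i];  /* item 1 */
    for (int i = 0; i < 3; ++i)                           /* items 2 through 4 */
        for (size_t j = 0; j < p->valence[i]; ++j)
            out[n++] = p->neighbors[i][j];
    out[n++] = 12;                                        /* item 5: fixed value */
    for (int i = 0; i < 3; ++i) out[n++] = p->center[i];  /* item 5: v0, v1, v2 */
    for (int i = 0; i < 3; ++i) out[n++] = p->e[i];
    for (int i = 0; i < 3; ++i) out[n++] = p->next_ccw[i];
    for (int i = 0; i < 3; ++i) out[n++] = p->next_cw[i];
    return n;
}
```

Filling the structure with the values for Figure 9-11 (neighbor lists of 6, 7, and 6 vertices) yields the 36 indices listed above, starting with the patch size of 22.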

Figure 9-12. Sample Subdivision Patch Indices

9.6. Particle System Shaders

Particle system shaders are used by particle systems that render a large number of point sprites (particles) along a Bézier curve.

Particles are rendered along a Bézier curve that is defined by four control points input to the shader. Each control point is randomly placed within its own bounding box, changing the Bézier curve.

A particle's color, size, angle of texture coordinate rotation, and other attributes are interpolated based upon its position on the Bézier curve.

9.6.1. Shader Files

Link the vertex shader to the correct particle system shader file, based on the features that you want to support.

Particle system shader: DMP_particleSystem_X_X_X_X.obj

Where each X represents a value of 0 or 1 that controls a particle system feature. These features are, in order, particle time clamping, texture coordinate rotation, the use of RGBA components or the alpha component alone, and output of texture coordinate 2. A feature is not necessarily disabled by a value of 0, nor enabled by a value of 1. Refer to the following table to determine which shader file to link.

Table 9-5. Shader Filenames and the Particle System Features They Support

| Filename | Time Clamping | Texture Coordinate Rotation | RGBA Colors | Texture Coordinate 2 |
|---|---|---|---|---|
| *_0_0_0_0.obj | Yes | Yes | (Alpha only) | No |
| *_0_0_0_1.obj | Yes | Yes | (Alpha only) | Yes |
| *_0_0_1_0.obj | Yes | Yes | Yes | No |
| *_0_0_1_1.obj | Yes | Yes | Yes | Yes |
| *_0_1_0_0.obj | Yes | None | (Alpha only) | No |
| *_0_1_0_1.obj | Yes | None | (Alpha only) | Yes |
| *_0_1_1_0.obj | Yes | None | Yes | No |
| *_0_1_1_1.obj | Yes | None | Yes | Yes |
| *_1_0_0_0.obj | None | Yes | (Alpha only) | No |
| *_1_0_0_1.obj | None | Yes | (Alpha only) | Yes |
| *_1_0_1_0.obj | None | Yes | Yes | No |
| *_1_0_1_1.obj | None | Yes | Yes | Yes |
| *_1_1_0_0.obj | None | None | (Alpha only) | No |
| *_1_1_0_1.obj | None | None | (Alpha only) | Yes |
| *_1_1_1_0.obj | None | None | Yes | No |
| *_1_1_1_1.obj | None | None | Yes | Yes |

Each asterisk (*) in the table stands for DMP_particleSystem.
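
The mapping in Table 9-5 can be expressed as a small helper that builds the filename from the features you want. This is a sketch only. The function name and parameters are hypothetical and are not part of the SDK.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: writes the particle system shader filename into buf.
   Per Table 9-5, time clamping and texture coordinate rotation are enabled
   by a digit of 0, while RGBA colors and texture coordinate 2 are enabled
   by a digit of 1. Returns 0 on success, or -1 if buf is too small. */
int particle_system_shader_file(bool clamp_time, bool rotate_texcoord,
                                bool rgba_color, bool output_texcoord2,
                                char *buf, size_t size)
{
    int n = snprintf(buf, size, "DMP_particleSystem_%d_%d_%d_%d.obj",
                     clamp_time ? 0 : 1,
                     rotate_texcoord ? 0 : 1,
                     rgba_color ? 1 : 0,
                     output_texcoord2 ? 1 : 0);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```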

9.6.2. Reserved Uniform

Particle system shaders have the following reserved uniforms. These reserved uniforms must be set by the application because they initially have undefined values.

Color

Use the glUniformMatrix4fv() function to set the particle color (dmp_PartSys.color). A 4x4 matrix represents the particle color of the first, second, third, and fourth control point, using an RGBA value in each of the corresponding rows. This setting is valid only for a shader file that uses RGBA colors (DMP_particleSystem_X_X_1_X.obj).

Performance is worse when RGBA colors are used than when alpha components alone are used. If you are not using color components, we recommend that you link to a shader file that uses alpha components alone (DMP_particleSystem_X_X_0_X.obj).

Aspect

Use the glUniformMatrix4fv() function to set the particle aspect (dmp_PartSys.aspect) with a 4x4 matrix. Each row corresponds to a control point (from 1 through 4) and configures the particle size, texture coordinate rotation, texture coordinate scaling, and alpha component.

Size

Set the particle size (the first aspect column) to a value of 1.0 or greater.

Use the glUniform2fv() function to set the minimum and maximum particle size (dmp_PartSys.pointSize). Use the glUniform2fv() function to set the reciprocal of the viewport's width and height (dmp_PartSys.viewport) because particle rendering does not account for the screen size. If distance attenuation is being applied to the particle size, call the glUniform3fv() function to set the distance attenuation factor (dmp_PartSys.distanceAttenuation), and specify the attenuation coefficients that calculate the attenuated size, as shown in the following equation.

derived_size is the size with distance attenuation applied, size is the original size, and d is the distance from the viewpoint.
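
One plausible form of this relationship, assuming the shader follows the conventional OpenGL point distance attenuation model, is sketched below. The names a, b, and c are introduced here for the three components of dmp_PartSys.distanceAttenuation and are assumptions. Check the result against the SDK before relying on it.

```latex
\[
\mathrm{derived\_size} = \mathrm{size} \times \sqrt{\frac{1}{a + b\,d + c\,d^{2}}}
\]
```

Under this assumption, coefficients of (1, 0, 0) leave the size unchanged at any distance.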

Texture Coordinates

You can set the rotation (the second aspect column) and scaling (the third aspect column) of texture coordinates at each control point.

The particle system outputs two texture coordinates: 0 and 2. Both texture coordinates support rotations, but only texture coordinate 2 supports scaling. The linked shader file enables or disables these settings.

  • Output texture coordinate 2 (DMP_particleSystem_X_X_X_1.obj), do not output (DMP_particleSystem_X_X_X_0.obj)
  • Rotate and scale the texture coordinate (DMP_particleSystem_X_0_X_X.obj), neither rotate nor scale (DMP_particleSystem_X_1_X_X.obj)

Rotations are specified in radians. Clockwise rotations are specified by positive values.

Texture coordinate 0 is output with the particle's lower-left, lower-right, upper-left, and upper-right corners at (0,0), (1,0), (0,1), and (1,1) respectively. Texture coordinate 2 is output with the particle's lower-left, lower-right, upper-left, and upper-right corners at (-1,-1), (1,-1), (-1,1), and (1,1) respectively.

Given a rotation of A and a scaling value of R, texture coordinate 0 is calculated as follows.

Lower-left = ( 0.5×(1.0 + (-cosA + sinA)), 0.5×(1.0 + (-cosA - sinA)) )
Lower-right = ( 0.5×(1.0 + (cosA + sinA)), 0.5×(1.0 + (-cosA + sinA)) )
Upper-left = ( 0.5×(1.0 + (-cosA - sinA)), 0.5×(1.0 + (cosA - sinA)) )
Upper-right = ( 0.5×(1.0 + (cosA - sinA)), 0.5×(1.0 + (cosA + sinA)) )

Texture coordinate 2 is calculated as follows.

Lower-left = ( R×(-cosA + sinA), R×(-cosA - sinA) )
Lower-right = ( R×(cosA + sinA), R×(-cosA + sinA) )
Upper-left = ( R×(-cosA - sinA), R×(cosA - sinA) )
Upper-right = ( R×(cosA - sinA), R×(cosA + sinA) )

Alpha Component

Set the particle's alpha component (the fourth aspect column) as a value between 0.0 and 1.0. This setting is used with the linked shader files that only use the alpha component (DMP_particleSystem_X_X_0_X.obj).
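
The four aspect columns described above (size, rotation, scaling, and alpha) can be gathered per control point and packed into the 16 floats passed to glUniformMatrix4fv(). The sketch below assumes that the data is laid out as four consecutive 4-component vectors, one per control point. That layout, the ParticleAspect type, and the function name are all assumptions. Confirm the expected memory order against the SDK.

```c
#include <stdbool.h>

/* Hypothetical per-control-point aspect values. */
typedef struct {
    float size;      /* 1.0 or greater */
    float rotation;  /* radians; positive values rotate clockwise */
    float scale;     /* texture coordinate 2 scaling */
    float alpha;     /* 0.0 through 1.0 */
} ParticleAspect;

/* Validates the documented ranges, then packs the four control points into
   out as four consecutive vectors (ASSUMED layout).
   Returns false, leaving out untouched, if any value is out of range. */
bool pack_particle_aspect(const ParticleAspect cp[4], float out[16])
{
    for (int i = 0; i < 4; ++i)
        if (cp[i].size < 1.0f || cp[i].alpha < 0.0f || cp[i].alpha > 1.0f)
            return false;

    for (int i = 0; i < 4; ++i) {
        out[4 * i + 0] = cp[i].size;
        out[4 * i + 1] = cp[i].rotation;
        out[4 * i + 2] = cp[i].scale;
        out[4 * i + 3] = cp[i].alpha;
    }
    return true;
}
```

The application would then pass the packed array as the value for dmp_PartSys.aspect.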

Emission Count

Use the glUniform1fv() function to set the maximum particle emission count (dmp_PartSys.countMax). Set this value to one less than the actual number of particles you want to emit. You must set a value of 0.0 or greater. However, because the shader program implementation limits the maximum number of emitted particles to 255, no more than 255 particles will be emitted, even when this reserved uniform is set to a value greater than 256.
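
Because the uniform holds one less than the particle count, a small conversion helper can prevent off-by-one mistakes. This sketch uses a hypothetical function name and treats the 255-particle limit stated above as an upper bound on the requested count.

```c
/* Limit on emitted particles stated in the text. */
#define MAX_EMITTED_PARTICLES 255

/* Hypothetical helper: converts a desired particle count into the value to
   set for dmp_PartSys.countMax (the count minus one). The uniform must be
   0.0 or greater, so it cannot express zero particles; counts below 1 are
   raised to 1, and counts above the limit are lowered to it. */
float particle_count_max(int particle_count)
{
    if (particle_count < 1)
        particle_count = 1;
    if (particle_count > MAX_EMITTED_PARTICLES)
        particle_count = MAX_EMITTED_PARTICLES;
    return (float)(particle_count - 1);
}
```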

Execution Time and Speed

A particle system has a concept of time. Use the glUniform1fv() function to set the particle system time (dmp_PartSys.time) to the current time. This current time is randomly converted into each particle's execution time, during which the particle travels from the first control point to the fourth control point. A particle is emitted at the first control point when its execution time is 0.0, and reaches the fourth control point when its execution time is 1.0.

If you link to a shader file that clamps the execution time (DMP_particleSystem_0_X_X_X.obj), particles with an execution time of 1.0 or greater cease to be rendered. Consequently, particles will cease to be emitted at some point, if the application simply lets time pass without resetting the execution time.

If you link to a shader file that does not clamp the execution time (DMP_particleSystem_1_X_X_X.obj), the execution time loops between 0.0 and 1.0. In other words, particles that reach the fourth control point are re-emitted from the first control point.
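
The difference between the two behaviors can be modeled on the CPU as follows. This is an illustrative model of the description above, not the shader's actual code. The function name is hypothetical, and the input is assumed to be a non-negative execution time.

```c
#include <stdbool.h>

/* Hypothetical model of a particle's execution time (t must be 0.0 or greater).
   With clamping, a particle whose time reaches 1.0 is no longer rendered.
   Without clamping, the time wraps back into the range from 0.0 to 1.0.
   Returns false when the particle is not rendered; otherwise stores the
   effective execution time in *out and returns true. */
bool particle_execution_time(float t, bool clamp_time, float *out)
{
    if (clamp_time) {
        if (t >= 1.0f)
            return false;
        *out = t;
        return true;
    }
    *out = t - (float)(long)t;  /* fractional part of a non-negative value */
    return true;
}
```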

Use the glUniform1fv() function to set the particle speed (dmp_PartSys.speed).

Random Values

The position of control points within their bounding boxes and the execution time of particles are determined using a function that generates pseudorandom numbers. The application can specify a random seed and coefficient to use in this random function.

The implementation of the random function is similar to the following algorithm for a pseudorandom number generator.

Use the glUniform4fv() function to set the random seed (dmp_PartSys.randSeed) with an array of values (the x, y, and z components of the Bézier curve and the particle execution time, in that order) corresponding to X0 in Equation 9-8.

For the random function's coefficients (dmp_PartSys.randomCore), use the glUniform4fv() function to specify the values a, b, m, and 1/m in Equation 9-8.
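
Given the coefficients a, b, m, and 1/m and the seed X0, Equation 9-8 is assumed here to have the form of a linear congruential generator, X(n+1) = (a × X(n) + b) mod m. The following sketch shows one step of such a generator, using 1/m to avoid a division. The RandomCore type and function name are hypothetical, and the shader's actual arithmetic may differ.

```c
/* Hypothetical mirror of the dmp_PartSys.randomCore components. */
typedef struct {
    double a;      /* multiplier */
    double b;      /* increment */
    double m;      /* modulus */
    double inv_m;  /* 1 / m */
} RandomCore;

/* One step of an ASSUMED linear congruential generator:
   X(n+1) = (a * X(n) + b) mod m, for non-negative values. */
double lcg_next(const RandomCore *core, double x)
{
    double y = core->a * x + core->b;
    double q = (double)(long long)(y * core->inv_m);  /* floor, since y >= 0 */
    return y - q * core->m;
}
```

Each of the four components of dmp_PartSys.randSeed would seed its own sequence in this model.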

Table 9-6. Reserved Uniforms for Particle System Shaders

| Reserved Uniform | Type | Value to Set |
|---|---|---|
| dmp_PartSys.color | mat4 | The color at each control point. (R, G, B, A) * 4vec. Each component has a value in the range from 0.0 through 1.0. |
| dmp_PartSys.aspect | mat4 | The aspect of each control point. (particle_size, rotation_angle, scale, alpha) * 4vec. particle_size is 1.0 or greater and alpha is in the range from 0.0 through 1.0. |
| dmp_PartSys.time | float | The current particle system time. |
| dmp_PartSys.speed | float | The speed of particle movement. 0.0 or more. |
| dmp_PartSys.countMax | float | A number that is one less than the maximum number of particles to emit. 0.0 or more. |
| dmp_PartSys.randSeed | vec4 | The random seed for each of the random functions. (Multiplicand with the Bézier curve's x component, multiplicand with the Bézier curve's y component, multiplicand with the Bézier curve's z component, multiplicand with the particle execution time). |
| dmp_PartSys.randomCore | vec4 | The random function's coefficients. (a, b, m, 1/m) |
| dmp_PartSys.distanceAttenuation | vec3 | The distance attenuation factor. |
| dmp_PartSys.viewport | vec2 | The viewport using the following equation. (1 / viewport.width, 1 / viewport.height) |
| dmp_PartSys.pointSize | vec2 | The minimum and maximum particle size. Each has a value of 0.0 or greater. |

9.6.3. Vertex Shader Settings

A particle system shader requires a 4×4 matrix that contains the following values converted into clip coordinates: a single control point's vertex coordinates and the radius for the x, y, and z components of its bounding box (centered on said vertex coordinates). Starting at the smallest output register number, the vertex shader outputs the vertex coordinates followed by the first, second, third, and fourth rows of the conversion matrix, in that order.

Two output vertex attributes must be set: the vertex coordinates using the position attribute and the converted matrix using the generic attribute.

Code 9-10. Sample Output Register Settings When a Particle System Shader Is Used (in Shader Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( generic , o1 )
#pragma output_map ( generic , o2 )
#pragma output_map ( generic , o3 )
#pragma output_map ( generic , o4 ) 

The following equation shows the conversion into clip coordinates. The radii for the bounding box's x, y, and z components are: Rx, Ry, and Rz respectively. The projection matrix is Mproj and the modelview matrix is Mmodelview.
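
Read together with Code 9-11, the conversion can be written as the product below. This is a reconstruction that assumes the matrix operations in Code 9-11 multiply their operands in the order written. The symbol M_box is a name introduced here for the matrix that is output to the particle system shader.

```latex
\[
M_{\mathrm{box}} = M_{\mathrm{proj}} \, M_{\mathrm{modelview}}
\begin{pmatrix}
R_x & 0 & 0 & 0 \\
0 & R_y & 0 & 0 \\
0 & 0 & R_z & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]
```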

This is shown by the following shader assembly code. aBoundingBox is a vector with the x, y, and z components of the radius input by the application, and vBoundingBox1 through vBoundingBox4 represent the matrix output to the particle system shader. The bounding box's radius is input as attributes in this sample code, but because the particle system shader requires data for only four vertices, you can use an implementation that sets all of this along with the vertex coordinates in uniforms.

Code 9-11. Bounding Box Radius and Clip Coordinate Conversion (in Shader Assembly)
mov     TEMP_BOX[0],    CONST_0
mov     TEMP_BOX[1],    CONST_0
mov     TEMP_BOX[2],    CONST_0
mov     TEMP_BOX[3],    CONST_0
mov     TEMP_BOX[0].x,  aBoundingBox.x
mov     TEMP_BOX[1].y,  aBoundingBox.y
mov     TEMP_BOX[2].z,  aBoundingBox.z
m4x4    TEMP_MAT,       MATRIX_Project, MATRIX_ModelView
m4x4    TEMP_MAT,       TEMP_MAT,       TEMP_BOX
mov     vBoundingBox1,  TEMP_MAT[0]
mov     vBoundingBox2,  TEMP_MAT[1]
mov     vBoundingBox3,  TEMP_MAT[2]
mov     vBoundingBox4,  TEMP_MAT[3] 

9.6.4. Input Vertex Data

Calls to the glDrawArrays() function are not supported. When using a particle system shader, call the glDrawElements() function, and specify GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter.

