Command caches allow you to reuse the 3D commands that have accumulated in a command list, while rendering 3D graphics. A cache reuses the 3D commands themselves, so caches allow you to reduce the cost of the function calls required to generate 3D commands, and thereby reduce CPU load.
Command caches may be used for the following.
- Saving Command Lists
- Reusing Command Lists
- Copying Command Lists
- Exporting and Importing Command Lists
- Adding 3D Commands
- Editing 3D Commands
This chapter introduces how to use command caches, and provides the information needed to edit 3D commands.
8.1. Saving Command Lists
Command caches cache both the contents of the 3D Command Buffer stored in a command list and the queued Command Requests. However, there are no such things as command cache objects. Command caches simply use start saving
and stop saving
declarations to get the information that is required in order to reuse a command list. Because a command cache simply reuses the data accumulated in its targeted command list without modifying that data, clearing or destroying that command list causes the command cache to stop functioning (unless the application has copied the data).
Use the nngxStartCmdlistSave
function to start saving a command list.
void nngxStartCmdlistSave(void);
Error | Cause |
---|---|
GL_ERROR_8034_DMP | The nngxStartCmdlistSave function was called while a command list was already being saved. You must stop saving before calling it again. |
GL_ERROR_8035_DMP | A command list was bound that has an object name of 0. |
Use the nngxStopCmdlistSave
function to stop saving a command list, and get the command cache information.
void nngxStopCmdlistSave(GLuint* bufferoffset, GLsizei* buffersize, GLuint* requestid, GLsizei* requestsize);
The bufferoffset parameter returns the offset at which saving of the 3D command buffer started, buffersize returns the save size (in bytes) of the 3D command buffer, requestid returns the ID at which saving of command requests started, and requestsize returns the number of saved command requests.
Error | Cause |
---|---|
GL_ERROR_8036_DMP | The nngxStopCmdlistSave function was called when a command list was not being saved. |
8.1.1. States
A state refers to all of the settings used by a particular 3D graphics feature. When you call a gl
function or another graphics function, the state corresponding to that function is updated, and the 3D commands generated by the state update are accumulated in the 3D command buffer. Consequently, state updates have an effect on whether the 3D commands required for rendering are saved, and on whether command caches will function properly.
Each state has at least one gl
function or uniform setting. It is also possible that a single gl
function will update multiple states. If you specify states in the application (one reason to do this is to ensure that the required 3D commands are saved), you must specify states as a bitwise OR of the state flags.
State Flag | Description |
---|---|
NN_GX_STATE_SHADERBINARY | Shader binary state. Updated when you use the Updating this state generates 3D commands to load shader assembly code. |
NN_GX_STATE_SHADERPROGRAM | Shader program state. Updated when you use the Updating this state generates various 3D commands, including commands that set the configuration of vertex attributes. 3D commands are only generated for registers of settings that are changed since the previous validation. |
NN_GX_STATE_SHADERMODE | Shader mode state. Updated when you use the Updating this state generates 3D commands to enable or disable use of geometry shaders. |
NN_GX_STATE_SHADERFLOAT | Shader floating-point definition state. Updated when you use the Updating this state generates 3D commands to set floating-point registers with the values defined by the shader assembly |
NN_GX_STATE_VSUNIFORM | Vertex shader uniform state. Updated when you use the Updating this state generates 3D commands to set values in the floating-point registers, Boolean registers, and integer registers that are defined as uniforms in the shader assembly code. |
NN_GX_STATE_FSUNIFORM | Reserved fragment shader uniform state. Updated when you use the Updating this state generates 3D commands to set values in the reserved fragment shader registers. |
NN_GX_STATE_LUT | Lookup table state. Updated when you use the Updating this state generates 3D commands to set the lookup tables. |
NN_GX_STATE_TEXTURE | Texture state (excluding procedural textures). Updated when any of the Updating this state generates 3D commands involved with texture units (except for procedural textures). |
NN_GX_STATE_FRAMEBUFFER | Frame buffer information state. Updated when any of the Updating this state generates 3D commands involved with the framebuffer format and buffer address. |
NN_GX_STATE_VERTEX | Vertex attribute data state. Updated when any of the Updating this state generates 3D commands involved with vertex attribute data. |
NN_GX_STATE_TRIOFFSET | Polygon offset state. Updated when you use the Updating this state generates 3D commands involved with polygon offsets. |
NN_GX_STATE_FBACCESS | Frame buffer access method state. Updated when you use the Updating this state generates 3D commands involved with framebuffer access methods (R/W, and so forth). |
NN_GX_STATE_SCISSOR | Scissor test state. Updated when you use the Updating this state generates 3D commands involved with the scissor test. |
NN_GX_STATE_OTHERS | State of all functions that generate 3D commands besides Updated when functions other than For details on the 3D commands generated when this state is updated, see Table 8-3. |
NN_GX_STATE_ALL | This flag specifies all of the above states. |
8.1.2. Generating Commands
Most of the 3D commands accumulated in 3D Command Buffers are generated by the glDrawElements
and glDrawArrays
functions. When these functions run, they check the state and generate the 3D commands involved with any state updates. This is known as validating.
States can be validated at any time by calling the nngxValidateState
function. Alternatively, you can specify state flags to the nngxUpdateState
function to mark those states as updated. By doing so, you can generate all 3D commands for those specified states, at a time of your choosing.
void nngxValidateState(GLbitfield statemask, GLboolean drawelements);
void nngxUpdateState(GLbitfield statemask);
Out of all the updated states, the nngxValidateState
function only validates the states specified by statemask, and then removes these states from updated status. Calling this function does not cause any rendering to take place, so specify in the drawelements parameter whether you will use the glDrawElements
or the glDrawArrays
function when actually rendering. Pass GL_TRUE
to generate 3D commands corresponding to the glDrawElements
function, and pass GL_FALSE
to generate 3D commands for the glDrawArrays
function. In short, you can combine the nngxUpdateState
and nngxValidateState
functions to generate all the 3D commands for any state.
In contrast, glDrawElements
and the other rendering functions cause all updated states to be validated and then removed from "updated" status.
Unlike other functions, the nngxValidateState
function can generate 3D commands at any time. Note, however, that there is a prescribed order in which the various states must be updated. The 3D commands for some states must be set before the 3D commands for other states. This prescribed order is described below. Operation may become unstable if 3D commands are generated and run in any order other than as described by the conditions below.
- NN_GX_STATE_FBACCESS and NN_GX_STATE_TRIOFFSET must be specified before, or at the same time as, NN_GX_STATE_FSUNIFORM.
- NN_GX_STATE_SHADERMODE must be specified before, or at the same time as, NN_GX_STATE_SHADERBINARY, NN_GX_STATE_SHADERPROGRAM, NN_GX_STATE_SHADERFLOAT, and NN_GX_STATE_VSUNIFORM.
- NN_GX_STATE_FRAMEBUFFER and NN_GX_STATE_OTHERS must be specified before, or at the same time as, NN_GX_STATE_FBACCESS.
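The ordering rules above can be checked on the host before any validation calls are issued. The following is a minimal sketch, not SDK code: the ST_* bit values are placeholders chosen for illustration (the real NN_GX_STATE_* constants come from the SDK headers), and validation_order_ok is a hypothetical helper name. Each array element stands for the statemask that would be passed to one nngxValidateState call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bits for illustration only; substitute the real
   NN_GX_STATE_* constants from the SDK headers. */
typedef unsigned int StateMask;
enum {
    ST_SHADERBINARY  = 1u << 0,
    ST_SHADERPROGRAM = 1u << 1,
    ST_SHADERMODE    = 1u << 2,
    ST_SHADERFLOAT   = 1u << 3,
    ST_VSUNIFORM     = 1u << 4,
    ST_FSUNIFORM     = 1u << 5,
    ST_FRAMEBUFFER   = 1u << 6,
    ST_TRIOFFSET     = 1u << 7,
    ST_FBACCESS      = 1u << 8,
    ST_OTHERS        = 1u << 9
};

/* "first" states must be validated before, or together with, "then" states. */
typedef struct {
    StateMask first;
    StateMask then;
} OrderRule;

static const OrderRule kRules[] = {
    { ST_FBACCESS | ST_TRIOFFSET, ST_FSUNIFORM },
    { ST_SHADERMODE,
      ST_SHADERBINARY | ST_SHADERPROGRAM | ST_SHADERFLOAT | ST_VSUNIFORM },
    { ST_FRAMEBUFFER | ST_OTHERS, ST_FBACCESS }
};

/* Returns true when no pass validates a "first" state after an earlier
   pass has already validated one of its dependent "then" states. */
bool validation_order_ok(const StateMask *passes, size_t count)
{
    assert(passes != NULL || count == 0);
    for (size_t r = 0; r < sizeof kRules / sizeof kRules[0]; ++r) {
        bool dependent_seen = false;
        for (size_t i = 0; i < count; ++i) {
            /* Test "first" before recording "then", so that validating
               both in the same pass is accepted. */
            if (dependent_seen && (passes[i] & kRules[r].first)) {
                return false;
            }
            if (passes[i] & kRules[r].then) {
                dependent_seen = true;
            }
        }
    }
    return true;
}
```

The check is deliberately conservative: it rejects any sequence in which a prerequisite state is validated in a later pass than a state that depends on it, even if the prerequisite was also validated earlier.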
Error | Cause |
---|---|
GL_ERROR_8066_DMP | Calling the nngxValidateState function resulted in a 3D Command Buffer overflow. |
GL_ERROR_806C_DMP | An error occurred during validation. |
GL_ERROR_80B2_DMP | The 3D Command Buffer is not set. |
GL_ERROR_80B3_DMP | Without setting a valid program object ( |
The following are causes of validation errors.
- No texture memory allocated for the enabled textures.
  To resolve this error, you must call the glTexImage2D, glCompressedTexImage2D, or glCopyTexImage2D function and allocate texture memory. For cube map textures, memory must be allocated for all six faces.
- A bound texture is in an invalid format.
  This could be caused, for example, when a texture of format GL_SHADOW_DMP is bound to texture unit 1 or 2, or when a texture of format GL_GAS_DMP is bound to a cube map texture.
- The settings for the faces of a cube map texture differ.
  The six faces of a cube map texture must all have the same width, height, format, and mipmap level.
- The most-significant seven bits of the addresses for all six faces of a cube map texture do not match.
  The most-significant seven bits of the addresses for all six faces of a cube map texture must be the same.
- A lookup table object is not properly bound, or a lookup table number is not properly specified.
  When lookup tables are configured for use in fragment lighting, procedural textures, fog, or gas, valid lookup table objects must be bound to the relevant lookup table numbers. The uniforms that specify the lookup table numbers must also be properly configured.
- Failed to allocate the memory required to store the lookup tables' internal format values.
If an error occurred during validation, each state is considered to be validated. To generate the correct command after an error, call the nngxUpdateState
function and update the state again.
Generally, 3D commands are generated by the glDraw
functions, but 3D commands for NN_GX_STATE_FRAMEBUFFER
state updates can also be generated by the glReadPixels
and glClear
functions. In addition to those, calling the following functions will also generate 3D commands.
Function | When Generated |
---|---|
glBlendColor | When a setting value has been changed. |
glBlendEquation | When a setting value has been changed. |
glBlendEquationSeparate | When a setting value has been changed. |
glBlendFunc | When a setting value has been changed. |
glBlendFuncSeparate | When a setting value has been changed. |
glClearEarlyDepthDMP | When a setting value has been changed. |
glColorMask | When a setting value has been changed. |
glCullFace | When a setting value has been changed. |
glDepthFunc | When a setting value has been changed. |
glDepthMask | When a setting value has been changed. |
glDisable | When the GL_COLOR_LOGIC_OP, GL_BLEND, GL_DEPTH_TEST, GL_EARLY_DEPTH_TEST_DMP, GL_STENCIL_TEST, or GL_CULL_FACE setting has been changed. No commands are generated if other setting values have changed. |
glEarlyDepthFuncDMP | When a setting value has been changed. |
glEnable | When the GL_COLOR_LOGIC_OP, GL_BLEND, GL_DEPTH_TEST, GL_EARLY_DEPTH_TEST_DMP, GL_STENCIL_TEST, or GL_CULL_FACE setting has been changed. No commands are generated if other setting values have changed. |
glFrontFace | When a setting value has been changed. |
glLogicOp | When a setting value has been changed. |
glRenderBlockModeDMP | When a setting value has been changed. |
glStencilFunc | When a setting value has been changed. |
glStencilMask | When a setting value has been changed. |
glStencilOp | When a setting value has been changed. |
glViewport | Always generated. |
Of the functions in this table, the glEnable
, glDisable
, glDepthFunc
, glEarlyDepthFuncDMP
, glColorMask
, glDepthMask
, and glStencilMask
functions all update the NN_GX_STATE_FBACCESS
state when they are called for certain features (GL_COLOR_LOGIC_OP
, GL_BLEND
, GL_DEPTH_TEST
, GL_STENCIL_TEST
, and GL_EARLY_DEPTH_TEST_DMP
), so you must validate the NN_GX_STATE_FBACCESS
state when you save a command list.
If you specified NN_GX_STATE_OTHERS
to the nngxUpdateState
function, 3D commands for all the functions in the table above are generated at state validation time. The same function call also marks the NN_GX_STATE_FBACCESS
state as updated.
8.1.3. Commands for Partial State Updates and Full State Updates
The gl
functions may update multiple states. Normally, a called function only generates 3D commands for updated states. This is known as a partial state update, and these commands are delta commands. A full state update, by contrast, is when the called function generates 3D commands for all associated states, regardless of whether they have been updated. Full state updates may lead to redundant 3D command generation, because every 3D command for each associated state is generated again even when that state has not changed.
Use the nngxSetCommandGenerationMode
function to configure several states to always trigger full state updates. You can also use the nngxUpdateState
function to specify a full state update of the specified states.
void nngxSetCommandGenerationMode(GLenum mode);
Specify NN_GX_CMDGEN_MODE_CONDITIONAL
in the mode parameter to generate delta commands. This is the default mode. Specify NN_GX_CMDGEN_MODE_UNCONDITIONAL
to generate all commands for a full state update.
Error | Cause |
---|---|
GL_ERROR_804D_DMP | An invalid value was specified in the mode parameter. |
Call the nngxGetCommandGenerationMode function to get the current mode.
void nngxGetCommandGenerationMode(GLenum* mode);
The current mode is returned in the mode parameter.
The following states are affected by unconditional command generation mode (which specifies full state updates).
- States involved with setting reserved fragment shader uniforms.
- States involved with setting vertex shader integer uniforms.
- States involved with setting lookup table data.
- States involved with the functions listed in Table 8-3.
In addition to the 3D commands for updated states, the states listed above have all commands generated for a full state update, regardless of whether they have been updated. However, even if all states involved with lookup tables have been updated, only those 3D commands for the enabled lookup tables are generated. The following lists the conditions under which a lookup table is enabled.
Lookup Table | Condition |
---|---|
Red component of reflection (RR) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use RR, and dmp_LightEnv.lutEnabledRefl is set to GL_TRUE . |
Green component of reflection (RG) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use RG, and dmp_LightEnv.lutEnabledRefl is set to GL_TRUE . |
Blue component of reflection (RB) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use RB, and dmp_LightEnv.lutEnabledRefl is set to GL_TRUE . |
Distribution factor 0 (D0) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use D0, and dmp_LightEnv.lutEnabledD0 is set to GL_TRUE . |
Distribution factor 1 (D1) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use D1, and dmp_LightEnv.lutEnabledD1 is set to GL_TRUE . |
Fresnel factor (FR) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use FR, and dmp_LightEnv.fresnelSelector is set to a value other than GL_LIGHT_ENV_NO_FRESNEL_DMP . |
Spotlight attenuation (SP) in fragment lighting | dmp_FragmentLighting.enabled is set to GL_TRUE , dmp_LightEnv.config is set to use SP, dmp_FragmentLightSource[i].enabled is set to GL_TRUE , and dmp_FragmentLightSource[i].spotEnabled is set to GL_TRUE . |
RGB-mapping F function in procedural textures | dmp_Texture[3].samplerType is set to GL_TEXTURE_PROCEDURAL_DMP . |
Alpha-mapping F function in procedural textures | dmp_Texture[3].samplerType is set to GL_TEXTURE_PROCEDURAL_DMP and dmp_Texture[3].ptAlphaSeparate is set to GL_TRUE . |
Noise modulation function in procedural textures | dmp_Texture[3].samplerType is set to GL_TEXTURE_PROCEDURAL_DMP , and dmp_Texture[3].ptNoiseEnable is set to GL_TRUE . |
Color lookup table in procedural textures | dmp_Texture[3].samplerType is set to GL_TEXTURE_PROCEDURAL_DMP . |
Fog coefficient | dmp_Fog.mode is set to a value other than GL_FALSE . |
Shading lookup table for gas | dmp_Fog.mode is set to GL_GAS_DMP . |
8.1.4. Command Cache Restrictions and Notes
Command cache use has the following restrictions and notes.
- When fragment lighting is enabled (reserved uniform dmp_FragmentLighting.enabled is set to GL_TRUE) and all light sources are disabled (reserved uniforms dmp_FragmentLightSource[i].enabled are all set to GL_FALSE), 3D commands for lighting are generated again when a rendering function is called, even after you have specified the state flag for reserved fragment shader uniforms (NN_GX_STATE_FSUNIFORM) to the nngxValidateState function and validated the state.
- 3D commands for the dmp_Gas.accMax reserved uniform are generated again when a rendering function is called, even after you have specified the state flag for reserved fragment shader uniforms (NN_GX_STATE_FSUNIFORM) to the nngxValidateState function and validated the state.
- When the reserved fragment uniform dmp_Gas.autoAcc is set to GL_TRUE, starting or stopping the saving of a command list while the value of the reserved uniform dmp_FragOperation.mode changes to GL_FRAGOP_MODE_GAS_ACC_DMP, or changes from GL_FRAGOP_MODE_GAS_ACC_DMP to some other value, sometimes results in the wrong commands for dmp_Gas.autoAcc in the saved command list.
- To run a 3D Command Buffer, its size must be a multiple of 16 bytes. Adjust the size by calling the nngxAdd3DCommand function and appending the filler value 0x00000000_00000000.
- If you call the glUseProgram function and pass 0 as an argument, no commands are generated during validation of states involved with programs or shaders.
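The 16-byte size requirement in the list above comes down to a small calculation. The sketch below assumes that the 3D Command Buffer always grows in 8-byte units, which matches the 64-bit filler value quoted above but is otherwise an assumption. The name filler_commands_needed is hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <stddef.h>

enum {
    CMD_BYTES = 8,   /* assumed size of one 3D command (64-bit filler value) */
    RUN_ALIGN = 16   /* required size multiple for a runnable buffer */
};

/* Returns how many 8-byte zero filler commands must be appended so that
   used_bytes becomes a multiple of 16. */
size_t filler_commands_needed(size_t used_bytes)
{
    /* The calculation only makes sense for whole 8-byte commands. */
    assert(used_bytes % CMD_BYTES == 0);
    size_t remainder = used_bytes % RUN_ALIGN;
    return remainder == 0 ? 0 : (RUN_ALIGN - remainder) / CMD_BYTES;
}
```

Under these assumptions the result is always 0 or 1, so at most one filler command would need to be passed to nngxAdd3DCommand before the buffer is run.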
8.1.5. Getting Updated States
Call the nngxGetUpdatedState function to get the state flags of states that have been updated.
void nngxGetUpdatedState (GLbitfield* statemask);
8.1.6. Disabling State Updates
Call the nngxInvalidateState
function to disable state updates.
void nngxInvalidateState (GLbitfield statemask);
For the statemask
parameter, specify the bitwise OR of the state flags of states for which to disable updates. Thereafter, no commands are generated for the states whose state flags are specified in statemask, even if these states are updated.
8.2. Reusing Command Lists
You can reuse 3D commands and Command Requests by taking the command cache information obtained from saving a command list and passing it to the nngxUseSavedCmdlist
or nngxUseSavedCmdlistNoCacheFlush
functions. The former flushes the cache for the added 3D commands, but the latter does not.
void nngxUseSavedCmdlist(GLuint cmdlist, GLuint bufferoffset, GLsizei buffersize, GLuint requestid, GLsizei requestsize, GLbitfield statemask, GLboolean copycmd);
void nngxUseSavedCmdlistNoCacheFlush(GLuint cmdlist, GLuint bufferoffset, GLsizei buffersize, GLuint requestid, GLsizei requestsize, GLbitfield statemask);
The cmdlist parameter specifies the command list that was saved and in which 3D commands are accumulated.
The bufferoffset parameter specifies the offset at which saving of the 3D command buffer started, the buffersize parameter specifies the 3D command buffer's save size in bytes, the requestid parameter specifies the ID at which saving of command requests started, and the requestsize parameter specifies the number of saved command requests. These must all be from the same command cache or from command caches that were saved at the same time. The function does not check to confirm that this command cache information is actually from the specified command list, or that all this information is from the same command cache. If you specify incorrect command cache information, resulting operations are undefined.
The statemask parameter specifies a bitwise OR of the state flags of states for which to do full state updates. When you call this function, an inconsistency arises between state settings before and after the function call. To resolve this inconsistency it is sometimes necessary to generate commands for full state updates of all states. But because generating all commands for all states can be excessive, full state updates are done only for those states specified in statemask. If NN_GX_STATE_OTHERS
is specified in statemask, commands for full state updates are generated for all states involved with the functions listed in Table 8-3.
The copycmd parameter specifies whether the command cache feature first copies the 3D commands from the saved command list before running them, or just runs them directly from the saved command list. If GL_TRUE
is specified, the 3D commands are copied and appended to the currently bound command list. Copying the 3D commands entails a high CPU load, so this method is most appropriate when reusing small 3D command buffers. If GL_FALSE
is specified, the commands are not copied. Not copying the 3D commands reduces the CPU load, so this method is most appropriate when reusing large 3D command buffers. This parameter only controls whether 3D commands are copied and added to the 3D command buffer. Command requests are always added. Functions that do not flush the cache operate the same as passing a value of GL_FALSE
for the copycmd parameter.
If you do not copy the 3D commands from the saved command list and the 3D commands are not already split, this function adds a "split" command immediately prior to the reused 3D commands and switches to a command list that has saved the 3D command execution address. Consequently, if there is no split command in the saved command list, it is not possible to return to the original command list. To reuse commands from a command list saved this way, you must always add a split command using the nngxSplitDrawCmdlist
function before you stop saving the command list.
If you copy the 3D commands from the saved command list under the following conditions, this function adds a split command immediately prior to the reused 3D commands.
Error | Cause |
---|---|
GL_ERROR_8037_DMP GL_ERROR_8092_DMP | Called when the bound command list's object name is 0. |
GL_ERROR_8038_DMP GL_ERROR_8093_DMP | A nonexistent command list was passed in cmdlist. |
GL_ERROR_803A_DMP GL_ERROR_8095_DMP | Running the function causes the currently bound command list's 3D command buffer to overflow or its stored command requests to exceed the maximum number. |
8.2.1. Command Request Information That Is Copied
When the nngxUseSavedCmdlist
function runs, Command Requests are always copied and appended to the currently bound command list, regardless of the value passed in copycmd. Command Requests contain information that varies depending on the type of Command Request, and this information is copied unchanged, even if states have been updated since the command list was saved. However, this does not apply to the first render command request that is copied. It is possible for this command's information to change.
DMA Transfer Command Requests
These requests contain the DMA transfer source address, destination address, and transfer size.
Render Command Requests
These requests contain the 3D command buffer's execution start address and execution size. When the address of the 3D command buffer for which saving has begun does not match the execution start address, the execution start address is replaced with the address at which saving started when the command requests were copied, and the execution size is changed to match.
Memory-Fill Command Requests
These requests contain the starting address, size, and clear color of the color buffer to fill, in addition to the starting address, size, clear depth value, and clear stencil value of the depth stencil buffer.
Post-Transfer Command Requests
These contain the address, resolution, and format of the color buffer that is the transfer source, along with the address, resolution, and format of the destination display buffer.
Copy Texture Command Requests
These contain the address and resolution of the color buffer that is the transfer source, along with the address and resolution of the destination texture.
8.3. Copying Command Lists
Call the nngxCopyCmdlist
function to copy the contents of one command list to a different command list. Note that this function copies all of the command list information and overwrites any accumulated 3D commands and command requests in the destination command list.
void nngxCopyCmdlist(GLuint scmdlist, GLuint dcmdlist);
The scmdlist
parameter specifies the source command list to copy from, and the dcmdlist
parameter specifies the destination where the command list is to be copied.
The operations of this function do not directly relate to command caches, but given that command cache information is created based on an offset, you can use command caches to reuse a copy created right after saving a command list. You can also clear the copy source command list after copying.
Error | Cause |
---|---|
GL_ERROR_8047_DMP | The currently bound command list was specified for dcmdlist. |
GL_ERROR_8048_DMP | A nonexistent command list was passed in cmdlist. |
GL_ERROR_8049_DMP | A nonexistent command list was passed in cmdlist. |
GL_ERROR_804A_DMP | The same command list was specified for both scmdlist and dcmdlist. |
GL_ERROR_804B_DMP | A running command list was specified for dcmdlist. |
GL_ERROR_804C_DMP | A command list was specified for scmdlist that has accumulated more 3D commands or Command Requests than can fit in the command list specified for dcmdlist. |
8.3.1. Copying Command Lists
The nngxCopyCmdlist function only supports copying a command list and using it to overwrite the destination list, but the nngxAddCmdlist function allows you to copy all the information in a command list and append it to the currently bound command list.
void nngxAddCmdlist(GLuint cmdlist);
The cmdlist
parameter specifies the source command list to copy.
All of the commands accumulated in the source command list are appended to the currently bound command list. The copied commands are added to the end of the currently bound command list, after any commands it has already accumulated.
If the currently bound command list's 3D command buffer has not just been split, and if the first command in the command requests that are being appended is not a render command request, the library first calls the nngxSplitDrawCmdlist
function to split the 3D Command Buffer, and then appends the copied commands.
If the currently bound command list's 3D Command Buffer has not just been split, and if the first command in the Command Requests being appended is a render command request, dummy commands are added to the destination 3D Command Buffer, as needed, to adjust its alignment before the copied commands are appended.
If the library needs to call nngxSplitDrawCmdlist or append additional dummy commands, the commands added by that processing are also counted when checking whether the result fits within the maximum size of the currently bound command list.
Error | Cause |
---|---|
GL_ERROR_8054_DMP | An invalid value was specified for cmdlist. |
GL_ERROR_8055_DMP | There is no currently bound command list. |
GL_ERROR_8056_DMP | The currently bound command list was specified in cmdlist. |
GL_ERROR_8057_DMP | The currently bound command list is running. |
GL_ERROR_8058_DMP | A command list was specified that has a 3D Command Buffer or Command Requests that are too big to fit in the currently bound command list. |
8.4. Exporting Command Lists
Call the nngxExportCmdlist
function to store the contents of a command list, obtained via a command cache, in memory as binary data. This operation is equivalent to exporting the command list.
GLsizei nngxExportCmdlist(GLuint cmdlist, GLuint bufferoffset, GLsizei buffersize, GLuint requestid, GLsizei requestsize, GLsizei datasize, GLvoid* data);
The cmdlist parameter specifies the command list to export.
The bufferoffset and buffersize parameters specify the byte offset and byte size of the 3D Command Buffer memory region to export. The requestid and requestsize parameters specify the Command Request ID at which to start exporting (these IDs start from 0 in the order of accumulation), and the number of Command Requests to export. The bufferoffset value must point somewhere within the memory region executed by the first render command request, in the Command Requests to export. Likewise, all of the split commands executed by the exported render command requests must also be exported. The inverse is true if no Command Requests are exported. In this case, split commands must not be included in the exported commands.
The data and datasize parameters specify the starting address and size of the export destination memory region. This function returns the size, in bytes, of the exported data, but if the data parameter specifies 0
or NULL
, no data is exported. Instead the function returns the memory size required for export. The export procedure is to first get the size of the memory needed, allocate a memory region, and then finally export the data.
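The size-then-allocate-then-export procedure described above can be sketched as follows. This is an illustration under stated assumptions, not SDK code. ExportFn is a stand-in that mirrors the nngxExportCmdlist signature with plain C types in place of the GL typedefs, export_to_new_buffer is a hypothetical helper, and fake_export exists only so the flow can be exercised off-device. The sketch also assumes that the size reported by the query call equals the size returned by the actual export.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the nngxExportCmdlist signature, using plain C types. */
typedef int (*ExportFn)(unsigned int cmdlist, unsigned int bufferoffset,
                        int buffersize, unsigned int requestid,
                        int requestsize, int datasize, void *data);

/* Queries the required size, allocates a region, then exports into it.
   Returns the allocated region (caller frees it) or NULL on failure. */
void *export_to_new_buffer(ExportFn export_fn, unsigned int cmdlist,
                           unsigned int bufferoffset, int buffersize,
                           unsigned int requestid, int requestsize,
                           int *out_size)
{
    assert(export_fn != NULL && out_size != NULL);
    *out_size = 0;

    /* Step 1: a NULL data pointer only asks for the required size. */
    int needed = export_fn(cmdlist, bufferoffset, buffersize,
                           requestid, requestsize, 0, NULL);
    if (needed <= 0) {
        return NULL;
    }

    /* Step 2: allocate the destination region. */
    void *data = malloc((size_t)needed);
    if (data == NULL) {
        return NULL;
    }

    /* Step 3: export for real (assumes the same size is returned). */
    int written = export_fn(cmdlist, bufferoffset, buffersize,
                            requestid, requestsize, needed, data);
    if (written != needed) {
        free(data);
        return NULL;
    }
    *out_size = written;
    return data;
}

/* Fake exporter for off-device testing only: pretends that an export is
   a 16-byte header followed by the command bytes. */
int fake_export(unsigned int cmdlist, unsigned int bufferoffset,
                int buffersize, unsigned int requestid,
                int requestsize, int datasize, void *data)
{
    (void)cmdlist; (void)bufferoffset; (void)requestid; (void)requestsize;
    int needed = 16 + buffersize;
    if (data == NULL) {
        return needed;      /* size query */
    }
    if (datasize < needed) {
        return 0;           /* region too small */
    }
    memset(data, 0, (size_t)needed);
    return needed;
}
```

On the target, the real function would take the place of fake_export, and failures would more likely be detected through the library's error codes than through the return value alone.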
This function's bufferoffset, buffersize, requestid, and requestsize parameters must specify values that are not mutually contradictory. To safely export data, we recommend using save information obtained from the nngxStopCmdlistSave function, or values obtained by several carefully-timed calls of the nngxGetCmdlistParameteri function during 3D command accumulation. The latter method works as follows. Call the nngxGetCmdlistParameteri function at the point in 3D command accumulation where you want the export to start. Pass NN_GX_CMDLIST_USED_BUFSIZE in pname to get the size of the accumulated 3D Command Buffer, and pass NN_GX_CMDLIST_USED_REQCOUNT to get the number of accumulated command requests. Specify these values in bufferoffset and requestid, respectively. Call the function again with the same pname values at the point where you want the export to end. Then subtract the values returned at the start of the export from the values returned at the end, and specify the results in buffersize and requestsize.
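The subtraction described above can be written out as a small helper. This is a minimal sketch: CmdlistSnapshot, ExportRange, and export_range_between are hypothetical names, and each snapshot is assumed to hold the two values read with NN_GX_CMDLIST_USED_BUFSIZE and NN_GX_CMDLIST_USED_REQCOUNT at one point in time.

```c
#include <assert.h>

/* Values read from the command list at one point during accumulation. */
typedef struct {
    int used_bufsize;   /* from NN_GX_CMDLIST_USED_BUFSIZE */
    int used_reqcount;  /* from NN_GX_CMDLIST_USED_REQCOUNT */
} CmdlistSnapshot;

/* The four range arguments expected by the export call. */
typedef struct {
    unsigned int bufferoffset;
    int          buffersize;
    unsigned int requestid;
    int          requestsize;
} ExportRange;

/* Derives the export range from snapshots taken at the start and end. */
ExportRange export_range_between(CmdlistSnapshot start, CmdlistSnapshot end)
{
    assert(start.used_bufsize >= 0 && start.used_reqcount >= 0);
    assert(end.used_bufsize >= start.used_bufsize);
    assert(end.used_reqcount >= start.used_reqcount);

    ExportRange range;
    range.bufferoffset = (unsigned int)start.used_bufsize;
    range.buffersize   = end.used_bufsize - start.used_bufsize;
    range.requestid    = (unsigned int)start.used_reqcount;
    range.requestsize  = end.used_reqcount - start.used_reqcount;
    return range;
}
```

For example, snapshots of (256 bytes, 2 requests) at the start and (1024 bytes, 5 requests) at the end give an offset of 256, a size of 768, a starting request ID of 2, and a request count of 3.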
Error | Cause |
---|---|
GL_ERROR_803B_DMP | An invalid value (0 or a nonexistent command list) was specified. |
GL_ERROR_803C_DMP | A value was specified for the datasize parameter that is smaller than the size of the data to export. |
GL_ERROR_803D_DMP | The region specified for requestid and requestsize does not have any commands accumulated. |
GL_ERROR_803E_DMP | The bufferoffset or buffersize values specified are not 8-byte aligned. |
GL_ERROR_803F_DMP | A command list was specified that includes a render command request added in a method other than the nngxUseSavedCmdlist function copying the command request. |
GL_ERROR_8040_DMP | Incorrect 3D Command Buffer values were specified for bufferoffset and buffersize for the exported render command request that executes the 3D command. |
8.4.1. Getting the Export Information
Use the nngxGetExportedCmdlistInfo
function to get the command list information (export information) that is included in exported binary data.
void nngxGetExportedCmdlistInfo(GLvoid* data, GLsizei* buffersize, GLsizei* requestsize, GLuint* bufferoffset);
The data parameter specifies the starting address of the exported binary data. Specifying invalid data causes a GL_ERROR_8046_DMP error. The export information is returned in three parameters: buffersize stores the size, in bytes, of the 3D Command Buffer, requestsize stores the number of Command Requests, and bufferoffset stores the byte offset from the start of the exported data, specified in data, to the start of the 3D Command Buffer.
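Because bufferoffset is measured from the start of the exported data, the 3D Command Buffer inside an exported blob can be located with simple pointer arithmetic. The sketch below is illustrative only: exported_command_buffer is a hypothetical helper, and the bounds check against the total data size is an added safety measure that the text above does not describe.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to the 3D Command Buffer inside an exported blob,
   or NULL if the reported offset and size do not fit within datasize. */
const unsigned char *exported_command_buffer(const void *data,
                                             size_t datasize,
                                             size_t bufferoffset,
                                             size_t buffersize)
{
    assert(data != NULL);
    if (bufferoffset > datasize || buffersize > datasize - bufferoffset) {
        return NULL;
    }
    return (const unsigned char *)data + bufferoffset;
}
```

The bufferoffset and buffersize arguments would be the values returned by nngxGetExportedCmdlistInfo for the same blob.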
8.5. Importing Command Lists
Call the nngxImportCmdlist
function to copy-append 3D commands from exported binary data to a command list. This operation is equivalent to importing an exported list.
void nngxImportCmdlist(GLuint cmdlist, GLvoid* data, GLsizei datasize);
The cmdlist parameter can specify either the currently bound command list or a command list that is not bound. If the specified command list has already accumulated 3D commands, the imported commands will be appended after the accumulated ones.
If the first command request in the data to import is not a render command request, you must bind the destination command list, and then add a split command to it using the nngxSplitDrawCmdlist
function.
The data and datasize parameters specify a pointer to the export data and the size, in bytes, of the export data.
Importing a command list may cause dummy commands to be generated as padding in the 3D Command Buffer of the destination command list.
Error | Cause |
---|---|
GL_ERROR_8041_DMP | An invalid value (0 or a nonexistent command list) was specified. |
GL_ERROR_8042_DMP | An invalid pointer was specified for the data parameter. |
GL_ERROR_8043_DMP | A size that is different from the export data size was specified for the datasize parameter. |
GL_ERROR_8044_DMP | The import would add more 3D commands or Command Requests than the destination command list can hold. |
GL_ERROR_8045_DMP | Data that does not have a render command request as its first Command Request is being imported into an unsplit command list. |
8.6. Adding 3D Commands
Use the nngxAdd3DCommand
or nngxAdd3DCommandNoCacheFlush
function either to specify a region and add the data in that specified region to the 3D Command Buffer of the currently bound command list, or to add render command requests that run the specified region. The former flushes the cache for the added 3D commands, but the latter does not.
void nngxAdd3DCommand(const GLvoid* bufferaddr, GLsizei buffersize, GLboolean copycmd);
void nngxAdd3DCommandNoCacheFlush(const GLvoid* bufferaddr, GLsizei buffersize);
The nngxAdd3DCommand function works differently depending on the value of the copycmd parameter.
When copycmd is set to GL_TRUE
, this function copy-appends the 3D commands stored in the region whose starting address and size, in bytes, are specified by bufferaddr and buffersize to the 3D Command Buffer of the currently bound command list. Operation is not guaranteed if the specified region contains split commands. The buffersize value must be a positive multiple of 4.
When copycmd is set to GL_FALSE
, this function adds a render command request that runs the 3D commands stored in the region whose starting address and size, in bytes, are specified by bufferaddr and buffersize to the Command Requests. If the 3D Command Buffer of the currently bound command list is unsplit, this function adds a split command, and then adds the Command Request. Operation is not guaranteed if the last 3D command in the specified region is not a split command.
The buffersize value must be a positive multiple of 16.
The operation of the nngxAdd3DCommandNoCacheFlush
function is the same as when designating the copycmd of nngxAdd3DCommand
as GL_FALSE
.
Error | Cause |
---|---|
GL_ERROR_804E_DMP, GL_ERROR_808C_DMP | The nngxAdd3DCommand function was called without a bound command list. |
GL_ERROR_804F_DMP, GL_ERROR_808D_DMP | An invalid value was specified for buffersize. |
GL_ERROR_8052_DMP | A value that is not a multiple of 16 was specified for bufferaddr, when copycmd is GL_FALSE. |
GL_ERROR_8050_DMP | The specified 3D Command Buffer size is insufficient for the currently bound command list, when copycmd is GL_TRUE. |
GL_ERROR_8051_DMP | The specified 3D Command Buffer size is insufficient for the currently bound command list, when copycmd is GL_TRUE. |
GL_ERROR_808E_DMP | A value that is not a multiple of 16 was specified for bufferaddr, when copycmd is GL_FALSE. |
GL_ERROR_808F_DMP | The specified Command Request is insufficient for the currently bound command list. |
8.6.1. Directly Generating 3D Commands
You can use the nngxAdd3DCommand
function to run 3D commands that have been directly generated by the application. You can run these 3D commands without calling any gl
functions. Note, however, that when you run a mix of directly generated 3D commands and 3D commands generated by calling the gl
functions (regularly generated commands), the directly generated commands do not update any of the library states.
If you change GPU settings with directly generated commands, there is a possibility that this will not be recognized as a state update. (Comparing the current states to the states after calling gl
functions that change the same settings might not show that any update occurred.) If your changes are not recognized, 3D commands that are normally generated are not generated, leading to unintended rendering results. Likewise, if you run directly generated commands while some states are still marked as updated and then perform state validation, unintended 3D commands may be generated and prevent your directly generated commands from being applied.
8.6.1.1. Transitioning From Regularly Generated Commands to Directly Generated Commands
The key to safely transitioning from running regularly generated commands to running directly generated commands is to ensure that when you do so, no states are currently marked as "updated."
When a state has been marked as updated, that updated status is removed only after a glDraw
function or the nngxValidateState
function is called and the state is validated (see 8.1.2. Generating Commands). If you run directly generated commands on the assumption that all settings previously configured by gl
functions are already applied to the GPU, we strongly recommend validating states first. This makes your assumption true and also ensures that your directly generated commands are run in the way you expect. It is easy to validate all states by passing NN_GX_STATE_ALL
in a call to the nngxValidateState
function. But to prevent the generation of excess 3D commands, you can instead call the nngxGetUpdatedState
function to get the state flags of the states marked as updated, and validate just those states.
8.6.1.2. Transitioning From Directly Generated Commands to Regularly Generated Commands
The key to safely transitioning from running directly generated commands to running regularly generated commands is to mark as updated all states that would have been updated by directly generated commands. Doing so ensures that the GPU settings and the library states match.
The most certain method is to mark all states as updated by passing NN_GX_STATE_ALL
in a call to the nngxUpdateState
function, and then validate all states. However, this method causes the generation of unneeded 3D commands.
If you thoroughly understand what states are updated by your directly generated commands and what states are dependent on those states, you can mark only those states as "updated," validate them, and keep the 3D commands generated to the minimum required.
If you pass NN_GX_CMDGEN_MODE_UNCONDITIONAL
in a call to the nngxSetCommandGenerationMode
function and do full state updates of only some states, you can avoid excessive command generation. However, when you do so, the only states that are okay to not mark as "updated" with the nngxUpdateState
function are the specific states described in Section 8.1.3. Commands for Partial State Updates and Full State Updates.
8.7. Editing 3D Commands
Command lists can be reused by means of command caches. However, simply reusing a saved command list as is does not take into account any intervening changes in the scene, such as updates to the camera position. This section describes the 3D Command Buffer specifications and the information written to GPU registers. It also introduces how to handle changes in the scene by editing 3D commands, such as the 3D commands involved in vertex shader and reserved fragment shader settings.
8.7.1. 3D Command Buffer Access
Part of the information saved in a command cache is the offset at which saving of the 3D Command Buffer started. Consequently, to access a 3D Command Buffer to edit commands, you must first get the starting address of the 3D Command Buffer by calling the nngxGetCmdlistParameteri
function and passing NN_GX_CMDLIST_TOP_BUFADDR
in the pname parameter.
3D commands and the values written to registers are little-endian, so you must pay careful attention to the correspondence between the byte layouts and numerical notations shown below and the byte order of this data in memory.
8.7.2. 3D Command Buffer Specifications
The 3D Command Buffer is a collection of 3D commands (PICA register write commands) for writing to the GPU registers. 3D commands are collected into continuous 64-bit segments, with 32 bits of header and 32 bits of data. The number of data items varies depending on the header content, but 3D commands are always 64-bit aligned. The upper 32 bits of the final 64-bit segment may be ignored, depending on the number of data items.
Data is stored in the individual 3D command bits as shown below.
Bits | Name | Description |
---|---|---|
[31:0] | DATA | 32 bits of data to write to registers. |
[47:32] | ADDR | PICA register address to which to write the data. |
[51:48] | BE | Byte enabled. The 32-bit data segment is broken into 4 single-byte data items. A byte is written if its corresponding BE bit is 1. |
[59:52] | SIZE | Data item count. The value stored is 1 less than the actual number of data items. Single access is used if this value is 0 . Burst access is used if this value is 1 or greater. |
[63:63] | SEQ | Access mode during burst access. If 0 , all writes are to a single register. If 1 , multiple registers are written sequentially. |
There are two access methods (single access and burst access), depending on the number of data items specified in SIZE
. There are also two possible access modes during burst access (write single registers or write sequential sets of registers), depending on the value specified in SEQ
.
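The bit layout in the table above can be sketched as a pair of helper functions that split a 64-bit segment into its fields and put it back together. This is a minimal illustration only; the type and function names (Pica3DCommand, DecodeCommand, EncodeCommand) are hypothetical and are not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the first 64-bit segment of a 3D command (hypothetical type). */
typedef struct {
    uint32_t data;  /* [31:0]  first data item */
    uint16_t addr;  /* [47:32] PICA register address */
    uint8_t  be;    /* [51:48] byte enable mask */
    uint8_t  size;  /* [59:52] data item count minus 1 */
    uint8_t  seq;   /* [63:63] burst mode: 0 = single register, 1 = sequential */
} Pica3DCommand;

/* Splits a 64-bit command segment into its fields. */
static Pica3DCommand DecodeCommand(uint64_t segment)
{
    Pica3DCommand c;
    c.data = (uint32_t)(segment & 0xFFFFFFFFu);
    c.addr = (uint16_t)((segment >> 32) & 0xFFFFu);
    c.be   = (uint8_t)((segment >> 48) & 0xFu);
    c.size = (uint8_t)((segment >> 52) & 0xFFu);
    c.seq  = (uint8_t)((segment >> 63) & 0x1u);
    return c;
}

/* Builds the 64-bit segment from its fields. */
static uint64_t EncodeCommand(Pica3DCommand c)
{
    assert(c.be <= 0xF && c.seq <= 1);
    return ((uint64_t)c.seq << 63) | ((uint64_t)c.size << 52) |
           ((uint64_t)c.be << 48) | ((uint64_t)c.addr << 32) | c.data;
}
```

Because the buffer is little-endian, the lower 32 bits (DATA) of each segment come first in memory, followed by the 32-bit header.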
8.7.3. Single Access
When SIZE
is 0
, there is only one data item and the data is written to a register using single access. With single access, one data item is written to one register only once.
The contents of DATA
are written to the register specified in ADDR
. Only those bytes for which the corresponding BE bit value is 1
are written. Data is not written to the register if the corresponding BE
value is 0
. The SEQ
value is ignored.
Example:
A 3D command of 0x000F0110_12345678
is interpreted as SIZE = 0
, BE = 0xF
, ADDR = 0x0110
, and DATA = 0x12345678
. SIZE = 0
means single access is used to write the data 0x12345678
to the register at address 0x0110
.
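The effect of the BE field on a single write can be modeled as follows. This is a sketch under one assumption that the text above does not state: BE bit 0 is taken to control the least-significant byte of the data, and BE bit 3 the most-significant byte. The function name ApplyByteEnable is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the register contents after writing `data` with byte-enable mask `be`.
   Assumption: BE bit n controls byte n, counting from the least-significant byte. */
static uint32_t ApplyByteEnable(uint32_t current, uint32_t data, uint8_t be)
{
    uint32_t mask = 0;
    assert(be <= 0xF);
    for (int byte = 0; byte < 4; ++byte) {
        if (be & (1u << byte)) {
            mask |= 0xFFu << (8 * byte);  /* this byte is written */
        }
    }
    return (current & ~mask) | (data & mask);
}
```

With BE = 0xF the whole 32-bit value is replaced, and with BE = 0 the register keeps its previous contents.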
8.7.4. Burst Access
When SIZE
is 1
or greater, there are two or more data items, and data is written to registers using burst access. With burst access, SIZE+1
data items (up to 256 items) are written to one or multiple registers.
DATA
stores the first 32 bits of data. The second and later data items are stored contiguously, with two items in each of the 64-bit segments that follow. The lower 32 bits store the first item in each pair, and the upper 32 bits store the second item. Because 3D commands are 64-bit aligned, the upper 32 bits of the final 64-bit segment are ignored when writing an odd number of data items.
The BE
value is applied to all data to be written. Thus, if BE
is 0
, no data is written to any register.
8.7.4.1. Writing Single Registers
When SEQ
is 0
, the access mode is single-register writing. In this mode multiple data items are written in succession to a single register.
Data is written in succession only to the register specified in ADDR
.
Example:
The 3D command:
0x004F0080_11111111
0x33333333_22222222
0x55555555_44444444
is interpreted as SIZE = 4
, BE = 0xF
, ADDR = 0x0080
, DATA = 0x11111111
, and SEQ = 0
. Because SIZE = 4
and SEQ = 0
, data is written using burst access in single-register mode, with all five data items (0x11111111
, 0x22222222
, 0x33333333
, 0x44444444
, 0x55555555
) being written one after another to the register at address 0x0080
. The next 3D command to run is stored in the 64-bit segment following 0x55555555_44444444
.
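The storage layout of a burst command can be sketched as a small builder that writes the 32-bit words in memory order (DATA first, then the header, then the remaining items, padded to a 64-bit boundary). The helper names CommandSegmentCount and BuildCommand are hypothetical, and BE is fixed to 0xF for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 64-bit segments occupied by a command whose SIZE field is `size`. */
static size_t CommandSegmentCount(uint8_t size)
{
    return 1u + ((size_t)size + 1u) / 2u;
}

/* Writes one command as 32-bit words in memory order and returns the word count.
   `out` must have room for count + 2 words. BE is always 0xF in this sketch. */
static size_t BuildCommand(uint32_t *out, uint16_t addr, int sequential,
                           const uint32_t *items, size_t count)
{
    size_t n = 0;
    assert(count >= 1 && count <= 256);
    out[n++] = items[0];                              /* DATA: first item */
    out[n++] = ((sequential ? 1u : 0u) << 31) |       /* SEQ */
               ((uint32_t)(count - 1) << 20) |        /* SIZE = count - 1 */
               (0xFu << 16) | addr;                   /* BE, ADDR */
    for (size_t i = 1; i < count; ++i) {
        out[n++] = items[i];                          /* remaining items, two per segment */
    }
    if (n % 2 != 0) {
        out[n++] = 0;                                 /* pad the final segment */
    }
    return n;
}
```

Passing the five data items of the example above with addr 0x0080 and sequential set to 0 produces the header word 0x004F0080 and six words in total (three 64-bit segments).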
8.7.4.2. Sequential Register Writing
When SEQ
is 1
, the access mode is sequential register writing. In this mode, each data item is written only once to multiple sequential registers.
One data item is written per register to the registers coming sequentially after the address specified in ADDR
(with the address incremented by one each time).
Example:
The 3D command:
0x805F0280_11111111
0x33333333_22222222
0x55555555_44444444
0x77777777_66666666
is interpreted as SIZE = 5
, BE = 0xF
, ADDR = 0x0280
, DATA = 0x11111111
, and SEQ = 1
. Because SIZE = 5
and SEQ = 1
, data is written using burst access in sequential register mode, with one data item written to each of the six registers starting from address 0x0280
. Thus 0x11111111
is written to the register at address 0x0280
, 0x22222222
to the register at 0x0281
, 0x33333333
to the register at 0x0282
, 0x44444444
to the register at 0x0283
, 0x55555555
to the register at 0x0284
, and 0x66666666
to the register at 0x0285
. Given the SIZE
value of 5
, 0x77777777
is padding and is ignored. The next 3D command to run is stored in the 64-bit segment following 0x77777777_66666666
.
8.7.5. 3D Command Execution Cost
Each 3D command that writes a value to a PICA register requires one clock cycle to write once to one register (except for certain registers), for either single or burst access.
The rasterization module outputs a busy signal for one cycle each time it processes a 3D command. Consequently, each 3D command sent to modules in the process flow, starting with the rasterization module (rasterization module, texture unit, fragment lighting, texture combiner, and per-fragment operation module), will take two cycles to process.
Commands to clear the texture cache (bit [16:16] of register 0x0080
) or the post-vertex cache (register 0x0231
) both require one clock cycle. However, although processing each command takes one cycle, 3D commands are commands to the texture unit, so only one command can be input per two cycles.
Commands to flush the framebuffer cache (bit [0:0] of register 0x0111
) require around 100 cycles, and commands to clear the early depth buffer (bit [0:0] of register 0x0063
) require around 1000 cycles.
Moreover, entering a 3D command into the triangle setup, rasterization, texture unit, fragment lighting, texture combiner, or per-fragment operation modules when there is still fragment data in the module (when the module is processing a fragment) flushes the pipeline once per module. Consequently, 3D commands entered immediately after a render command require a pipeline flush for each of these modules.
The following table shows the register ranges allocated for these modules.
Module | Register Range |
---|---|
Triangle setup | 0x0040 to 0x005F |
Rasterization module | 0x0060 to 0x006F |
Texture units | 0x0080 to 0x00BF |
Fragment lighting | 0x0140 to 0x01DF |
Texture combiners | 0x00C0 to 0x00FF |
Per-fragment operation module | 0x0100 to 0x013F |
This includes addresses that do not actually have a register.
8.8. PICA Register Information
This section describes addressing, how to set values, and value formats for various PICA registers. Based on this information, you can change the setting values for a feature by searching for the locations in the 3D Command Buffer where commands write to the relevant registers, and overwriting those locations.
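Searching a 3D Command Buffer for the location that writes to a given register could look like the following sketch. It walks the buffer one command at a time, using the header format described in 8.7.2, and returns the index of the 32-bit word that holds the data for the target register. The function name FindRegisterWrite is hypothetical, and the sketch assumes that the buffer is passed as an array of 32-bit words in memory order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the word index of the first data item written to `target`,
   or -1 if no command in the buffer writes to that register. */
static long FindRegisterWrite(const uint32_t *words, size_t wordCount, uint16_t target)
{
    size_t pos = 0;
    assert(wordCount % 2 == 0);  /* 3D commands are 64-bit aligned */
    while (pos + 1 < wordCount) {
        uint32_t header = words[pos + 1];
        uint16_t addr = (uint16_t)(header & 0xFFFFu);
        uint32_t size = (header >> 20) & 0xFFu;
        int seq = (int)(header >> 31);
        size_t segments = 1u + ((size_t)size + 1u) / 2u;
        if (pos + segments * 2u > wordCount) {
            break;  /* truncated command */
        }
        for (uint32_t i = 0; i <= size; ++i) {
            uint16_t reg = (uint16_t)(addr + (seq ? i : 0u));
            if (reg == target) {
                /* Item 0 precedes the header; later items follow it. */
                return (long)(i == 0 ? pos : pos + 1 + i);
            }
        }
        pos += segments * 2u;
    }
    return -1;
}
```

The sketch does not check the BE field, so a command whose byte enable mask is 0 is still reported as a match.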
When setting a memory address in a register, you must call the nn::gx::GetPhysicalAddr
function to convert the virtual memory address to a physical address.
The general format for register bit layouts is shown below.
8.8.1. Vertex Shader Setting Registers (0x004F - 0x0056 and Others)
This section describes the registers used for settings involved with vertex shaders, such as vertex shader starting addresses, vertex attributes, and the settings of floating-point constant registers.
8.8.1.1. Floating-Point Constant Registers (0x02C0, 0x02C1 - 0x02C8)
Vertex shaders have 96 floating-point constant registers (expressed in shader assembly code as c0
through c95
), each of which is comprised of the four components xyzw
. These can be set either by using the shader assembly def
instruction to define a constant, or by using a uniform to define a constant. When using shader assembly code, the value is set in the GPU's internal format as a 24-bit floating-point value (the lowest 16 bits are the significand, followed by 7 bits for the exponent and 1 bit for the sign). When using a uniform, the value is set as a 32-bit floating-point value (expressed as an IEEE 754 format single-precision floating point number), which is then automatically converted in the GPU to 24 bits.
Index Specifier (0x02C0)
Register 0x02C0
specifies which data input mode is used to write data to which floating-point constant register.
INDEX
specifies the index of a floating-point constant register. Setting this to 0x00
specifies register c0, 0x0A
specifies register c10, and 0x5F
specifies register c95. At the same time, setting MODE
to 1
sets the data input mode to 32-bit floating-point, and setting MODE
to 0
sets the data input mode to 24-bit floating-point.
The data to write to the four floating-point constant register components (x
, y
, z
, and w
) is written to one of the registers 0x02C1
through 0x02C8
. The results are identical no matter which of these registers is written to. For an extreme example, writing all the data to one register produces the same result as writing the data to registers in reverse order starting from 0x02C8
.
To set floating-point constant register values, first write the index and mode to 0x02C0
, and then write the data to registers 0x02C1
through 0x02C8
.
32-Bit Floating-Point Input Mode
In 32-bit floating-point input mode, four 32-bit data items are written to registers in the 0x02C1
to 0x02C8
range to set the value of one floating-point constant register. These four components are written in the order w
, z
, y
, and then x
.
After these four, 32-bit data items are written, the index is automatically incremented by 1, so that the next floating-point constant register to be set will be the one whose index is next after the specified index. In other words, if the index in register 0x02C0
is set to 0x0A
, the first four data items written to the 0x02C1
to 0x02C8
registers are then written to register c10
, and the next four data items are written to c11
.
Example:
Writing 0x80000023 to register 0x02C0 is interpreted as MODE = 1, INDEX = 35, preparing the system to write to floating-point constant register c35 in 32-bit floating-point input mode. Then, if you write 0x40800000 to register 0x02C1, 0x40400000 to 0x02C2, 0x40000000 to 0x02C3, and 0x3F800000 to 0x02C4, and repeat this combination again (a total of 8 writes), registers c35.xyzw and c36.xyzw are both set to { 1.0f, 2.0f, 3.0f, 4.0f }.
These settings could be expressed with 3D commands like the following.
0x000F02C0_80000023
0x803F02C1_40800000 0x40000000_40400000 0x00000000_3F800000
0x803F02C1_40800000 0x40000000_40400000 0x00000000_3F800000
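As an illustration, the commands for setting one floating-point constant register in 32-bit input mode could be generated as in the following sketch. The function name BuildFloatConstant32 is hypothetical, and the code assumes that the C float type is an IEEE 754 single-precision value. The words are written in memory order, so each data word comes before its header word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emits the 8 words (four 64-bit segments) that set constant register
   c<index> to xyzw in 32-bit floating-point input mode. Returns the word count. */
static size_t BuildFloatConstant32(uint32_t *out, unsigned index, const float xyzw[4])
{
    uint32_t bits[4];
    assert(index <= 95);
    memcpy(bits, xyzw, sizeof bits);  /* reinterpret the floats as raw bits */
    out[0] = 0x80000000u | index;     /* MODE = 1, INDEX = index */
    out[1] = 0x000F02C0u;             /* single access to register 0x02C0 */
    out[2] = bits[3];                 /* w is written first */
    out[3] = 0x803F02C1u;             /* sequential burst of 4 items from 0x02C1 */
    out[4] = bits[2];                 /* z */
    out[5] = bits[1];                 /* y */
    out[6] = bits[0];                 /* x */
    out[7] = 0;                       /* padding in the final segment */
    return 8;
}
```

Calling the function with index 35 and the values { 1.0f, 2.0f, 3.0f, 4.0f } reproduces the first two commands of the example above.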
24-Bit Floating-Point Input Mode
In 24-bit floating-point input mode, four 24-bit data items are packed into three 32-bit data segments, and then written to registers in the 0x02C1
to 0x02C8
range to set the value of one floating-point constant register. The data is packed into the 32-bit segments in the component order w
, z
, y
, and then x
. The figure below shows how four, 24-bit data items are packed into three, 32-bit data items. For more information about converting 32-bit floating-point values to 24-bit floating-point values, see 8.9.1. Conversion to a 24-Bit Floating-Point Number.
After these three 32-bit data items are written, the index is automatically incremented by 1, so that the next floating-point constant register to be set will be the one whose index is next after the specified index. In other words, if the index in register 0x02C0
is set to 0x0A
, the first three data items written to the 0x02C1
to 0x02C8
registers are then written to register c10
, and the next three data items are written to c11
.
Example:
Writing 0x00000023 to register 0x02C0 is interpreted as MODE = 0, INDEX = 35, preparing the system to write to floating-point constant register c35 in 24-bit floating-point input mode. If you then write 0x41000040 to register 0x02C1, 0x80004000 to 0x02C2, and 0x003F0000 to 0x02C3, and repeat this combination again (a total of 6 writes), registers c35.xyzw and c36.xyzw are both set to { 1.0f, 2.0f, 3.0f, 4.0f }.
These settings could be expressed with 3D commands like the following.
0x000F02C0_00000023
0x802F02C1_41000040 0x003F0000_80004000
0x802F02C1_41000040 0x003F0000_80004000
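The packed values in this example can be reproduced with the following sketch. It is not the conversion procedure of section 8.9.1: the exponent bias of 63 and the packing order are inferred from the example values above, the significand is truncated rather than rounded, and values too small to represent are flushed to zero. The function names FloatTo24 and PackFloat24x4 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Converts an IEEE 754 single to the 24-bit format
   (1 sign bit, 7 exponent bits, 16 significand bits). */
static uint32_t FloatTo24(float value)
{
    uint32_t bits, sign, mant;
    int32_t exp;
    memcpy(&bits, &value, sizeof bits);
    sign = bits >> 31;
    exp  = (int32_t)((bits >> 23) & 0xFFu) - 127 + 63;  /* rebias (assumed bias 63) */
    mant = (bits & 0x7FFFFFu) >> 7;                     /* keep the top 16 bits */
    if (exp <= 0) {
        return sign << 23;                              /* zero or too small */
    }
    assert(exp <= 0x7F);  /* larger values are not handled by this sketch */
    return (sign << 23) | ((uint32_t)exp << 16) | mant;
}

/* Packs four 24-bit values into three 32-bit words in the order w, z, y, x. */
static void PackFloat24x4(uint32_t out[3], const float xyzw[4])
{
    uint32_t x = FloatTo24(xyzw[0]);
    uint32_t y = FloatTo24(xyzw[1]);
    uint32_t z = FloatTo24(xyzw[2]);
    uint32_t w = FloatTo24(xyzw[3]);
    out[0] = (w << 8) | (z >> 16);
    out[1] = ((z & 0xFFFFu) << 16) | (y >> 8);
    out[2] = ((y & 0xFFu) << 24) | x;
}
```

For { 1.0f, 2.0f, 3.0f, 4.0f } the three packed words are 0x41000040, 0x80004000, and 0x003F0000, matching the 3D commands shown above.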
8.8.1.2. Boolean Register (0x02B0)
Vertex shaders have 16 Boolean registers (expressed in shader assembly code as registers b0
through b15
). These can be set either by using the shader assembly def
instruction to define a constant, or by using a uniform to define a constant.
The bits [15:0] in register 0x02B0
correspond one-to-one with the vertex shader Boolean registers. Bit [0] corresponds to register b0
, bit [1] with b1
, and so on through bit [15] and register b15
. Writing a value of 1
represents TRUE
, and a value of 0
represents FALSE
.
Note that the last Boolean register (b15
) is reserved by the geometry shader when the geometry shader is used.
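Editing a single Boolean register in a value destined for register 0x02B0 amounts to setting or clearing one bit, as in this minimal sketch. The function name SetBoolRegister is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns `value` with the bit for Boolean register b<index> set (TRUE)
   or cleared (FALSE). Only bits [15:0] are meaningful. */
static uint32_t SetBoolRegister(uint32_t value, unsigned index, int enabled)
{
    assert(index <= 15);
    return enabled ? (value | (1u << index)) : (value & ~(1u << index));
}
```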
8.8.1.3. Integer Registers (0x02B1 - 0x02B4)
Vertex shaders have four integer registers (expressed in shader assembly code as i0
through i3
), each of which is comprised of the four components xyzw
. These can be set either by using the shader assembly defi
instruction to define a constant, or by using a uniform to define a constant.
Register 0x02B1 corresponds to i0, 0x02B2 to i1, 0x02B3 to i2, and 0x02B4 to i3. Each register stores the three components x
, y
, and z
in 8 bits each, starting from the lowest bit of the register. Setting negative numbers to y
and z
is expressed with two’s complement.
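The packing of the three components can be sketched as follows. The function name PackIntRegister is hypothetical, and the value ranges checked by the assertions (0 to 255 for x, -128 to 127 for y and z) are assumptions based on the 8-bit width of each component.

```c
#include <assert.h>
#include <stdint.h>

/* Packs x, y, and z into 8 bits each, starting from the lowest bit.
   Negative y and z values are stored in two's complement. */
static uint32_t PackIntRegister(int x, int y, int z)
{
    assert(x >= 0 && x <= 255);
    assert(y >= -128 && y <= 127);
    assert(z >= -128 && z <= 127);
    return ((uint32_t)x & 0xFFu) |
           (((uint32_t)y & 0xFFu) << 8) |
           (((uint32_t)z & 0xFFu) << 16);
}
```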
8.8.1.4. Program Code Setting Registers (0x02BF, 0x02CB – 0x02D3, 0x02D5 – 0x02DD)
There are multiple registers used to load the program code executed by vertex shaders. Specifically, there are registers that specify the addresses to which to load the programs and registers for writing program data.
In the ADDR
portion of register 0x02CB
, set the address to which to load the vertex shader program code. In registers 0x02CC
through 0x02D3
, write the data for the program code to load.
After you set the program code loading address in register 0x02CB
, write the data to any of the registers 0x02CC
through 0x02D3
. Each instruction in a vertex shader program is 32 bits, so writing one data item corresponds to writing one instruction, and the loading address is incremented by one after every write. The results are the same no matter which of these registers is written to.
After you have updated the program code, you must write 1
to any of the bits in register 0x02BF
to notify the GPU that program updating is complete.
In addition to loading the program code as described above, you must also load the swizzle pattern data. The following figure shows the registers that set swizzle patterns.
In the ADDR
portion of register 0x02D5
, set the address to which to load the swizzle pattern. In registers 0x02D6
through 0x02DD
, write the swizzle pattern data to load.
After you set the swizzle pattern loading address in register 0x02D5
, write the data to any of the registers 0x02D6
through 0x02DD
. The loading address is incremented by one after every write. The results are the same no matter which of these registers is written to.
8.8.1.5. Starting Address Setting Register (0x02BA)
The address of the main
label defined in the shader assembly code is the vertex shader starting address.
Set the vertex shader starting address in bits [15:0] of register 0x02BA
.
8.8.1.6. Attribute Input Count Setting Registers (0x0242, 0x02B9)
There are multiple registers used to set the number of vertex attributes to input to a vertex shader. The same value is set in each of these registers.
Set the value of count
to the number of vertex attributes to input minus 1
. Up to 12 vertex attributes can be input when the vertex buffer is used (when vertex data is loaded using a load array). When the vertex buffer is not used (when vertex data is loaded via the command buffer), up to 16 vertex attributes can be input.
8.8.1.7. Input Register Mapping Setting Registers (0x02BB, 0x02BC)
The following shows the registers used to set the mapping between the vertex attribute data to input to the vertex shader and the input registers.
In attrib_0
through attrib_15
, set index numbers designating which input registers store the vertex attribute data to input to the vertex shader. (Register v0
is indicated by a value of 0x0
, v1
by 0x1
and so on, with v15
indicated by 0xF
.) The order of the vertex attributes to input to the vertex shader does not correspond to the order specified by the index
parameter of the glBindAttribLocation
function. It corresponds instead with the internal vertex attribute numbers described in 8.8.1.9 Vertex Attribute Array Setting Registers (0x0200 – 0x0227).
8.8.1.8. Fixed Vertex Attribute Value Setting Registers (0x0232 – 0x0235)
The fixed vertex attribute values set by functions such as glVertexAttrib4f
are converted to 24-bit floating point numbers and transferred to the GPU. The fixed vertex attribute values are transferred to the GPU using the settings of the following registers.
First, the internal vertex attribute number of these fixed vertex attribute values is written to bits [3:0] of register 0x0232
. The fixed vertex attribute values are then converted to 24-bit floating point numbers and written, in order, to the three, 32-bit data segments in registers 0x0233
, 0x0234
, and 0x0235
. This 24-bit floating-point data written to the three, 32-bit segments is created in the same way as the data presented in the section titled 24-Bit Floating-Point Input Mode.
When you switch a vertex array from enabled to disabled or vice versa by using the registers described in 8.8.1.9. Vertex Attribute Array Setting Registers (0x0200 – 0x0227), it invalidates any fixed vertex attribute values that had previously been set. You must reset these fixed vertex attribute values. Also note that GPU specifications do not allow you to use all the vertex attributes as fixed vertex attributes without also using one or more vertex arrays. Always use at least one vertex array when you use fixed vertex attributes.
8.8.1.9. Vertex Attribute Array Setting Registers (0x0200 – 0x0227)
There are multiple registers used to set vertex attribute arrays when using vertex buffers. The commands for setting these registers are generated on validation of the state flag NN_GX_STATE_VERTEX.
The settings to these registers include the base address, the internal vertex attribute type, the fixed vertex attribute mask, the total number of vertex attributes, each load array's byte offset, information about the load array elements, the number of load array elements, the load array size in bytes, and the index array's offset.
Base Address
A value consisting of the physical address divided by 16 is set as the base address in ARRAY_BASE_ADDR
in register 0x0200
. All the vertex array addresses and the address of the vertex index array are set as the base address plus an offset. If the range of addresses for the vertex arrays and index array has been fixed in advance, the base address does not need to be reset for each combination of vertex arrays.
Internal Vertex Attribute Type
Internal vertex attributes are vertex attribute numbers determined internally by the GPU so that it can load vertex arrays. These numbers are different from the vertex attribute numbers specified by the index
parameter to the glEnableVertexAttribArray
function (which is hereafter called the GL vertex attribute numbers), but there is a one-to-one correspondence between internal vertex attributes and GL vertex attribute numbers.
Vertex arrays enabled by the glEnableVertexAttribArray
function are assigned internal vertex attributes as values starting at 0 and increasing sequentially with no numbering gaps, but GL vertex attribute number 0 does not necessarily correlate to internal vertex attribute 0. For instance, when the vertex arrays with GL vertex attribute numbers 0 and 3 have been enabled, they may be assigned internal vertex attributes 0 and 1, or they may be assigned the other way around as 1 and 0. The method by which internal vertex attribute values are assigned depends on the driver implementation, and is subject to change.
In the current implementation, the vertex array addresses are sorted in ascending order, and assigned internal vertex attributes, starting from 0 for the array with the leading GL vertex attribute.
The vertex attribute data is input to the vertex shader in the internal vertex attribute order.
The ARRAY_TYPEn
(n = 0 to 11) portion of registers 0x0201
– 0x0202
sets the type of the array having the (n + 1)th internal vertex attribute. The value set to the register as the internal vertex attribute type is determined by the combination of the size
and type
parameters specified in the glVertexAttribPointer
function. The following table shows how these combinations correspond with the values set to the register.
size | type | Value |
---|---|---|
1 | GL_BYTE | 0x0 |
1 | GL_UNSIGNED_BYTE | 0x1 |
1 | GL_SHORT | 0x2 |
1 | GL_FLOAT | 0x3 |
2 | GL_BYTE | 0x4 |
2 | GL_UNSIGNED_BYTE | 0x5 |
2 | GL_SHORT | 0x6 |
2 | GL_FLOAT | 0x7 |
3 | GL_BYTE | 0x8 |
3 | GL_UNSIGNED_BYTE | 0x9 |
3 | GL_SHORT | 0xA |
3 | GL_FLOAT | 0xB |
4 | GL_BYTE | 0xC |
4 | GL_UNSIGNED_BYTE | 0xD |
4 | GL_SHORT | 0xE |
4 | GL_FLOAT | 0xF |
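The table follows a regular pattern: the lower two bits of the value identify the type, and the upper two bits hold the size minus 1. The following sketch computes the value on that basis. The function name ArrayTypeValue is hypothetical, and the GL constants are defined locally with their standard OpenGL values only so that the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

#ifndef GL_BYTE
#define GL_BYTE          0x1400
#define GL_UNSIGNED_BYTE 0x1401
#define GL_SHORT         0x1402
#define GL_FLOAT         0x1406
#endif

/* Returns the 4-bit ARRAY_TYPEn value for a size/type combination. */
static uint32_t ArrayTypeValue(int size, unsigned type)
{
    uint32_t code;
    assert(size >= 1 && size <= 4);
    switch (type) {
    case GL_BYTE:          code = 0; break;
    case GL_UNSIGNED_BYTE: code = 1; break;
    case GL_SHORT:         code = 2; break;
    case GL_FLOAT:         code = 3; break;
    default:
        assert(!"type is not listed in the table");
        return 0;
    }
    return ((uint32_t)(size - 1) << 2) | code;
}
```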
Fixed Vertex Attribute Mask
The number of enabled vertex attributes is defined by a #pragma bind_symbol
statement in the vertex shader assembly code, but if those enabled vertex attributes have disabled vertex arrays (vertex attributes on which the glDisableVertexAttribArray
function has been called or on which the glEnableVertexAttribArray
function has not been called), fixed vertex attributes are used for them.
Much like vertex arrays, fixed vertex attributes are assigned internal vertex attributes. Fixed vertex attributes are assigned internal vertex attributes, starting from the last number assigned to a vertex array and increasing sequentially, with no numbering gaps.
A mask of the assigned internal vertex attributes is set by the CONST_ATTRIB_MASK
portion of register 0x0202
. This mask corresponds to internal vertex attributes 0 through 11, starting from its least-significant bit and ascending in order. Its bits are set to 1 for those internal vertex attributes that are assigned to fixed vertex attributes.
When you enable or disable a vertex array using these register settings, it invalidates any fixed vertex attribute values that had previously been set. You must reset these fixed vertex attribute values. Also note that GPU specifications do not allow you to use all the vertex attributes as fixed vertex attributes without also using one or more vertex arrays. Always use at least one vertex array when you use fixed vertex attributes.
Number of Vertex Attributes
Set the ARRAY_NUM
portion of register 0x0202
to the total number of fixed vertex attributes, plus vertex attributes that use vertex arrays, minus 1.
Load Arrays
In order to load vertex attribute data, the GPU manages vertex attribute arrays in units of data arrays. These data arrays are called load arrays, and the GPU loads data from 12 load arrays.
These 12 load arrays each have 12 elements. Each load array element contains either vertex array data that makes up the load array, or 4-byte units of padding. In general, if you have defined your vertex data as an array of structures holding multiple vertex attributes (this is called an interleaved array), that single interleaved array corresponds to a single load array. Conversely, if you define your vertex data as an array holding one vertex attribute (an independent array), that single vertex attribute corresponds to a single load array.
Load arrays must be used in ascending order from the first load array (load array 0). For example, you cannot use a combination like load array 1 and load array 4 that does not start from 0 and is not consecutive.
The actual address for a vertex attribute array is the value derived by adding the address allocated by the glBufferData
function to the offset specified by the glVertexAttribPointer
function's ptr
parameter. When setting the registers, this actual address is set as the result of this formula: [base address × 16 + the byte offset to the load array].
The ARRAYn_OFFSET
(n = 0 to 11) portions of registers (0x0203 + n × 3
) set the byte offsets to each (n + 1)th load array. GPU performance improves when there are fewer load arrays used, so the driver is configured to load all the data in as few load arrays as possible.
Set the ARRAYn_ELEMi
(n = 0 to 11, i = 0 to 11) portions of registers (0x0204 + n × 3
) through (0x0205 + n × 3
) to the (i + 1)th element of the (n + 1)th load array, starting from the first element of the first array. Set each element to either the internal vertex attribute used or padding. The values set to the register correspond to the elements as follows.
Value | Element |
---|---|
0x0 | Internal vertex attribute 0 |
0x1 | Internal vertex attribute 1 |
0x2 | Internal vertex attribute 2 |
0x3 | Internal vertex attribute 3 |
0x4 | Internal vertex attribute 4 |
0x5 | Internal vertex attribute 5 |
0x6 | Internal vertex attribute 6 |
0x7 | Internal vertex attribute 7 |
0x8 | Internal vertex attribute 8 |
0x9 | Internal vertex attribute 9 |
0xA | Internal vertex attribute 10 |
0xB | Internal vertex attribute 11 |
0xC | 4-byte padding |
0xD | 8-byte padding |
0xE | 12-byte padding |
0xF | 16-byte padding |
For example, when 0x0
is set in ARRAY0_ELEM0
in register 0x0204
, the first element of the first load array is set to internal vertex attribute 0. The data type of the data placed at the start of the load array's data structure is the data type of internal vertex attribute 0, which is set in ARRAY_TYPE0
in register 0x0201
.
Set the ARRAYn_STRIDE portion of registers (0x0205 + n × 3) to the number of bytes per vertex in the (n + 1)th load array. Load arrays having elements of multiple different types sometimes automatically include padding. ARRAYn_STRIDE must be set to the number of bytes per vertex including this padding. If the value set there contradicts the total size of the elements in the load array, operation is undefined.
Set the ARRAYn_ATTRIB_NUM (n = 0 to 11) portion of registers (0x0205 + n × 3) to the number of attributes in the (n + 1)th load array. One load array can sometimes include multiple vertex attribute arrays, for example when multiple vertex attribute arrays are laid out as interleaved arrays. The value set as the number of attributes in the (n + 1)th load array is not necessarily the same as the number of vertex attribute arrays included in that load array. If the load array's number of attributes is set to 0, that load array is not used.
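The per-load-array registers described above can be assembled as in the following sketch. The field positions are an assumption inferred from the example values in this section (elements 0 to 7 as 4-bit fields in the lower register; elements 8 to 11 in bits [15:0], the stride in bits [23:16], and the attribute count in bits [31:28] of the upper register). The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values for the two element registers of one load array. */
typedef struct {
    uint32_t elems_lo; /* register 0x0204 + n * 3 */
    uint32_t elems_hi; /* register 0x0205 + n * 3 */
} load_array_regs_t;

/* Address of the ARRAYn_OFFSET register for load array n (0 to 11). */
static uint32_t load_array_offset_reg(unsigned n)
{
    assert(n < 12);
    return 0x0203u + n * 3u;
}

/* Packs element codes (0x0-0xB: internal vertex attribute,
 * 0xC-0xF: padding), the stride in bytes, and the element count.
 * The bit positions are assumptions; see the lead-in text. */
static load_array_regs_t pack_load_array(const uint8_t *elems,
                                         unsigned count, unsigned stride)
{
    load_array_regs_t r = {0u, 0u};

    assert(count <= 12 && stride <= 0xFFu);

    for (unsigned i = 0; i < count; ++i) {
        uint32_t code = elems[i] & 0xFu;
        if (i < 8) {
            r.elems_lo |= code << (4u * i);
        } else {
            r.elems_hi |= code << (4u * (i - 8u));
        }
    }
    r.elems_hi |= ((uint32_t)stride << 16) | ((uint32_t)count << 28);
    return r;
}
```

Passing the element codes {0, 1, 2} with a stride of 36 bytes produces 0x00000210 and 0x30240000, which matches the interleaved array example in this section.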
Set the INDEX_ARRAY_OFFSET
portion of register 0x0227
to the byte offset to the index array. Set the INDEX_ARRAY_TYPE
portion of register 0x0227
to the vertex index type. Set this type value to 1 when the type
parameter to the glDrawElements
function is GL_UNSIGNED_SHORT
, and 0 when type is GL_UNSIGNED_BYTE
. Always set to 1
when using the glDrawArrays
function.
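The selection of the INDEX_ARRAY_TYPE value can be sketched as follows. The function name and the `uses_draw_elements` flag are hypothetical; the two GL constants are the standard OpenGL ES enumeration values and are defined locally only to keep the sketch self-contained.

```c
/* Standard OpenGL ES enumeration values, defined here only so that
 * the sketch compiles without the GL headers. */
#ifndef GL_UNSIGNED_BYTE
#define GL_UNSIGNED_BYTE  0x1401
#endif
#ifndef GL_UNSIGNED_SHORT
#define GL_UNSIGNED_SHORT 0x1403
#endif

/* Returns the INDEX_ARRAY_TYPE value for register 0x0227,
 * or -1 if the index type is not one of the two described above. */
static int index_array_type(int uses_draw_elements, unsigned gl_type)
{
    if (!uses_draw_elements) {
        return 1; /* glDrawArrays: always 1 */
    }
    switch (gl_type) {
    case GL_UNSIGNED_SHORT: return 1;
    case GL_UNSIGNED_BYTE:  return 0;
    default:                return -1;
    }
}
```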
Example 1: Interleaved Array
    struct vertex_t {
        float position[3];
        float color[4];
        float texcoord[2];
    } vertex[NUM_VERTEX];
When vertex data consists of the structure above, the vertex array settings are configured as follows.
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(struct vertex_t), (GLvoid*)0);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, sizeof(struct vertex_t), (GLvoid*)12);
    glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, sizeof(struct vertex_t), (GLvoid*)28);
In this case, the three GL vertex attributes 0, 1, and 2 comprise the elements of one load array. If the only used vertex attributes are the three above and one fixed vertex attribute, the above vertex attributes 0, 1, and 2 correspond to internal vertex attributes 0, 1, and 2, and the fixed vertex attribute corresponds to internal vertex attribute 3. Thus the related register settings are as shown below.
0x0201: 0x000007FB // Internal vertex attribute types are 0:FLOAT_VEC3, 1:FLOAT_VEC4, 2:FLOAT_VEC2.
0x0202: 0x30080000 // Four vertex attributes in total, with internal vertex attribute 3 as a fixed vertex attribute.
0x0203: 0x00000000 // Only one load array is used, so the base address is set to the actual address.
0x0204: 0x00000210 // The elements of load array 0 are the internal vertex attributes 0, 1, and 2.
0x0205: 0x30240000 // There are float × 9 = 36 bytes per vertex in load array 0, and the number of elements is 3.
0x0206 – 0x0226: 0x00000000 // Other load arrays not used.
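The byte offsets and stride used in Example 1 follow directly from the structure layout. The sketch below derives them with `offsetof` and `sizeof`; it assumes a 4-byte `float` and no compiler-inserted padding, which holds on common platforms but is not guaranteed by the C standard. The constant names are hypothetical.

```c
#include <stddef.h>

struct vertex_t {
    float position[3];
    float color[4];
    float texcoord[2];
};

/* Byte offset of each attribute inside one vertex. These are the values
 * passed as the last argument of glVertexAttribPointer in Example 1. */
static const size_t k_position_offset = offsetof(struct vertex_t, position);
static const size_t k_color_offset    = offsetof(struct vertex_t, color);
static const size_t k_texcoord_offset = offsetof(struct vertex_t, texcoord);

/* Bytes per vertex; this is the value set in the stride field of
 * load array 0. */
static const size_t k_vertex_stride = sizeof(struct vertex_t);
```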
Example 2: Independent Array
    #define NUM_VERTEX (3)
    struct attribute0_t { float position[3]; } attribute0[NUM_VERTEX];
    struct attribute1_t { float color[4]; } attribute1[NUM_VERTEX];
    struct attribute2_t { float tex[2]; } attribute2[NUM_VERTEX];
When vertex data consists of the structure above, the vertex array settings are configured as follows. The vertex buffer is a single shared object and assumes that the data is laid out in order.
    glBindBuffer(GL_ARRAY_BUFFER, 1);
    glBufferData(GL_ARRAY_BUFFER, sizeof(attribute0) + sizeof(attribute1) + sizeof(attribute2), 0, GL_STATIC_DRAW);
    glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(attribute0), attribute0);
    glBufferSubData(GL_ARRAY_BUFFER, sizeof(attribute0), sizeof(attribute1), attribute1);
    glBufferSubData(GL_ARRAY_BUFFER, sizeof(attribute0) + sizeof(attribute1), sizeof(attribute2), attribute2);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, (GLvoid*)(sizeof(attribute0)));
    glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, 0, (GLvoid*)(sizeof(attribute0) + sizeof(attribute1)));
In this case, the three GL vertex attributes 0, 1, and 2 each comprise elements of separate load arrays, corresponding to internal vertex attributes 0, 1, and 2, respectively. Thus the related register settings are as given below.
0x0201: 0x000007FB // Internal vertex attribute types are 0:FLOAT_VEC3, 1:FLOAT_VEC4, 2:FLOAT_VEC2.
0x0202: 0x20000000 // Three vertex attributes in total, with no fixed vertex attributes.
0x0203: 0x00000000 // Load array 0 is located at the start.
0x0204: 0x00000000 // The element of load array 0 is one element of internal vertex attribute 0.
0x0205: 0x100C0000 // There are float × 3 = 12 bytes per vertex in load array 0, and the number of elements is 1.
0x0206: 0x00000024 // The offset to load array 1 is sizeof(attribute0).
0x0207: 0x00000001 // The element of load array 1 is one element of internal vertex attribute 1.
0x0208: 0x10100000 // There are float × 4 = 16 bytes per vertex in load array 1, and the number of elements is 1.
0x0209: 0x00000054 // The offset to load array 2 is sizeof(attribute0) + sizeof(attribute1).
0x020A: 0x00000002 // The element of load array 2 is one element of internal vertex attribute 2.
0x020B: 0x10080000 // There are float × 2 = 8 bytes per vertex in load array 2, and the number of elements is 1.
0x020C – 0x0226: 0x00000000 // Other load arrays not used.
Load Array Padding Elements and Automatic Padding
If the ARRAYn_ELEMi
(n = 0 to 11, i = 0 to 11) portion of registers (0x0204 + n × 3
) through (0x0205 + n × 3
) is set to any values 0xC
through 0xF
, that element is padding. Padding is used when the load arrays include regions not used as vertex attributes.
For example, we can create vertex data that uses structures like the following.
    struct vertex_t {
        float position[3];
        float color[4];
        float texcoord[2];
    } vertex[NUM_VERTEX];
Assume that texcoord
is not used above as a vertex attribute. The byte size per vertex is float × 9
, but the last float × 2
bytes is not used. Therefore, internal vertex attributes are specified in the first and second elements of the load array corresponding to this vertex data, but the third element specifies 0xD
(8 bytes of padding).
When one load array has elements containing vertex attributes of multiple different data types (GL_FLOAT
, GL_SHORT
, GL_BYTE
, or GL_UNSIGNED_BYTE
), up to 4 bytes of padding is sometimes automatically inserted, even when no padding is specified by the load array elements. Elements in a load array have several possible sizes. Each individual element comprising a load array might be either a 4-byte type (an internal vertex attribute of type GL_FLOAT
, or padding), a 2-byte type (an internal vertex attribute of type GL_SHORT
), or a 1-byte type (an internal vertex attribute of type GL_BYTE
or GL_UNSIGNED_BYTE
). Padding is automatically inserted for each element of the load array, so that elements align to their own particular sizes. Padding is also automatically inserted at the end of each instance of vertex data, so that the vertex data is aligned to the size of the biggest data type contained by any element in the array.
For example, we can create vertex data that uses structures like the following.
    struct vertex_t {
        GLfloat position[3];
        GLubyte color[3];
        GLfloat texcoord[2];
        GLubyte param;
    } vertex[NUM_VERTEX];
Assume that the load array has as its elements the four vertex attributes of position
, color
, texcoord
, and param
. As shown above, color
is 3 bytes, but the data coming immediately after it in the array (texcoord
, of type GLfloat
) is aligned to 4 bytes. This means that one byte of padding is automatically inserted right after color
.
Because the element with the largest size in the load array is of type GLfloat
, padding is also automatically inserted after each instance of vertex data to ensure 4-byte alignment of the vertex data. This means that three bytes of padding are automatically inserted right after param
.
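The automatic padding rules described above can be sketched as a small stride calculation: each element is aligned to its own component size, and the end of each vertex is aligned to the largest component size in the load array. This is only an illustration of the stated rules, not the driver's implementation, and the type and function names are hypothetical.

```c
#include <stddef.h>

/* One load array element: component size in bytes (1, 2, or 4)
 * and the number of components. */
typedef struct {
    size_t comp_size;
    size_t comp_count;
} elem_desc_t;

/* Rounds v up to the next multiple of a. */
static size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) / a * a;
}

/* Returns the number of bytes per vertex including automatic padding.
 * If offsets is not NULL, it receives the byte offset of each element. */
static size_t padded_stride(const elem_desc_t *elems, size_t n,
                            size_t *offsets)
{
    size_t pos = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; ++i) {
        /* Each element is aligned to its own component size. */
        pos = align_up(pos, elems[i].comp_size);
        if (offsets != NULL) {
            offsets[i] = pos;
        }
        pos += elems[i].comp_size * elems[i].comp_count;
        if (elems[i].comp_size > max_align) {
            max_align = elems[i].comp_size;
        }
    }
    /* The end of the vertex is aligned to the largest component size. */
    return align_up(pos, max_align);
}
```

For the structure above (position, color, texcoord, param), this gives element offsets of 0, 12, 16, and 24 bytes and a stride of 28 bytes, which corresponds to one byte of padding after color and three bytes after param.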
Load Array Settings and Performance
The performance of loading vertex data depends on the size and number of the load arrays you use, and on other factors such as the type of elements that they contain.
The GPU accesses memory in units of one load array at a time, but there is no cache and it uses the same resources to load multiple load arrays from different addresses as it does to load multiple load arrays from the same address.
When you want to load the same vertex array into multiple vertex shader input registers, one approach is to load multiple load arrays from the same address. However, an alternative approach is to make several copies of that vertex array to create an interleaved array, and load the interleaved array. This second approach entails a larger data size than the first approach, but it may have better runtime performance.
Even when loading the same size of vertex data in either case, loading it via a single load array has better performance than loading it via multiple load arrays. This difference in performance is less pronounced when the vertex indices are optimized to be consecutive, and is also affected by other modules that access device memory. For example, even without accounting for access conflicts in device memory, it takes approximately 1.3 to 2 times as long for two load arrays to each load three GLfloat vertex data values as for a single load array to load all six values. However, both these cases will have the same performance if the vertex arrays are placed in VRAM.
Load Array Limit
A maximum of 11 load arrays can be used when rendering with the glDrawElements function (rendering started by writing to register 0x022F). When using 12 vertex attributes to render with the glDrawElements function, either at least one vertex attribute must be a fixed vertex attribute, or at least two vertex attributes must be interleaved in a single load array, so that no more than 11 load arrays are used. If 12 load arrays are used to start rendering with the glDrawElements function, the GPU may freeze. The GPU can also freeze in other cases where the load array settings are inappropriate.
If the GPU freezes due to an illegal load array setting, the NN_GX_CMDLIST_HW_STATE value acquired by the nngxGetCmdlistParameteri function has only its eighth bit set to 1.
8.8.1.10. Output Register Use Count Setting Registers (0x004F, 0x024A, 0x0251, 0x025E)
There are multiple registers used to set the number of vertex attributes output from a vertex shader. Set the same value in most of these registers.
Set the unchanged raw number of output registers to use in count1
(register 0x004F
), and set the number of used output registers minus 1 in count2
(registers 0x024A
, 0x0251
, and 0x025E
). Note that count1
has a different value and bit width from the others.
The number of output registers is defined by a #pragma output_map
statement in the vertex shader assembly code. Because this is the number of output registers actually used, this number is 1
when multiple vertex attributes are packed into one output register.
8.8.1.11. Output Register Mask Setting Register (0x02BD)
The register that sets which output registers are written by vertex shaders contains a mask of 16 bits, corresponding to the 16 output registers.
The bits [15:0
] in register 0x02BD
correspond one-to-one with the output registers. Bit [0] is register o0
, bit [1] is register o1
, and so on, up to bit [15] for register o15
. In this mask, set the bits corresponding to the output registers defined by the #pragma output_map
statement to 1
. Set the bits corresponding to output registers not so defined to 0
.
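The mask in register 0x02BD and the two count values from the preceding section can be derived together from the list of output registers named in the #pragma output_map statements. The following sketch assumes the caller supplies those register numbers; the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t mask;   /* register 0x02BD, bits [15:0] */
    uint32_t count1; /* register 0x004F: number of used output registers */
    uint32_t count2; /* registers 0x024A, 0x0251, 0x025E: number - 1 */
} output_reg_settings_t;

/* regs: output register numbers (0 to 15) taken from the
 * #pragma output_map statements. A register that appears several times
 * (attributes packed into one register) is counted once. */
static output_reg_settings_t make_output_settings(const unsigned *regs,
                                                  unsigned n)
{
    output_reg_settings_t s = {0u, 0u, 0u};

    for (unsigned i = 0; i < n; ++i) {
        if (regs[i] < 16u) {
            s.mask |= 1u << regs[i];
        }
    }
    /* Count the set bits to get the number of distinct registers. */
    for (uint32_t m = s.mask; m != 0u; m &= m - 1u) {
        ++s.count1;
    }
    s.count2 = (s.count1 > 0u) ? s.count1 - 1u : 0u;
    return s;
}
```

For a shader that maps o0, o1, o2 (twice), and o3, this yields a mask of 0x000F, a count1 value of 4, and a count2 value of 3.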
8.8.1.12. Output Register Attribute Setting Registers (0x0050 – 0x0056, 0x0064)
There are seven output registers available to output vertex attributes from vertex shaders. There are multiple registers used to set the attributes output in each component of these output registers. Set the attributes in order of output register, starting from the lowest-numbered used output register.
The names in the bit layout correspond to the following output attribute settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
attrib_x attrib_y attrib_z attrib_w | 5 | These are the vertex attributes set to the 0x00: Vertex coordinate x component. |
texcoord | 1 | Sets whether to include texture coordinates in the vertex attributes output from vertex shaders. |
The following example describes the register settings when the vertex shaders are defined as follows.
    #pragma output_map(position, o0)
    #pragma output_map(color, o1)
    #pragma output_map(texture0, o2.xy)
    #pragma output_map(texture0w, o2.z)
    #pragma output_map(texture1, o3.xy)
Registers are set as follows.
0x0050: 0x03020100
0x0051: 0x0B0A0908
0x0052: 0x1F100D0C // The w component is disabled.
0x0053: 0x1F1F0F0E // The zw components are disabled.
0x0054: 0x1F1F1F1F // The 5th attribute is disabled.
0x0055: 0x1F1F1F1F // The 6th attribute is disabled.
0x0056: 0x1F1F1F1F // The 7th attribute is disabled.
0x0064: 0x00000001 // Output texture coordinates.
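The register values in this example can be reproduced by packing the four 5-bit attribute codes into one word. The sketch below assumes that each code occupies the low 5 bits of its own byte (x in the lowest byte, w in the highest), which is inferred from the example values rather than taken from a bit-layout figure. The function and macro names are hypothetical.

```c
#include <stdint.h>

/* Code that appears in the example for a disabled component. */
#define OUT_ATTR_DISABLED 0x1Fu

/* Packs the attribute codes for the x, y, z, and w components of one
 * output register. Assumed layout: one 5-bit code per byte. */
static uint32_t pack_output_attr(unsigned x, unsigned y,
                                 unsigned z, unsigned w)
{
    return  ((uint32_t)x & 0x1Fu)
         | (((uint32_t)y & 0x1Fu) << 8)
         | (((uint32_t)z & 0x1Fu) << 16)
         | (((uint32_t)w & 0x1Fu) << 24);
}
```

For instance, `pack_output_attr(0x0C, 0x0D, 0x10, OUT_ATTR_DISABLED)` gives 0x1F100D0C, the value shown for register 0x0052.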
8.8.1.13. Output Attribute Clock Control Register (0x006F)
Depending on what kinds of vertex attributes are output from vertex shaders, sometimes you must change the setting of the clock control register.
The names in the bit layout correspond to the following output attribute clock control settings. The following table also gives the possible settings of each bit.
Name | Bits | Description |
---|---|---|
vectorZ | 1 | Set to 1 to output the z component of vertex coordinates and 0 to not output it. |
vertexColor | 1 | Set to 1 to output the vertex color and 0 to not output it. |
texture0 | 1 | Set to 1 to output texture coordinate 0 and 0 to not output it. |
texture1 | 1 | Set to 1 to output texture coordinate 1 and 0 to not output it. |
texture2 | 1 | Set to 1 to output texture coordinate 2 and 0 to not output it. |
texture0w | 1 | Set to 1 to output the w coordinate of texture coordinate 0 and 0 to not output it. |
viewVector | 1 | Set to 1 to output view vectors and quaternions and 0 to not output them. |
These bits control power supply to the modules involved with the corresponding vertex attributes. Set the bits corresponding to unused vertex attributes to 0
to reduce power consumption.
8.8.2. Texture Address Setting Registers (0x0085 – 0x008A, 0x0095, 0x009D)
There are multiple registers used to set texture data addresses. For texture unit 0 there are 2D texture and cube map texture settings, and for texture units 1 and 2 there are 2D texture settings.
This section describes just the settings for addresses of texture data bound to the various targets. You can change the layout of texture data using this information. To change the resolution, filter mode, number of mipmap levels, or other characteristics of textures, see 8.8.6. Texture Setting Registers (0x0080, 0x0083, 0x008B, 0x00A8 – 0x00B7).
Texture addresses are all 8-byte addresses. (An 8-byte address is the value resulting when the physical address is divided by 8.) The highest 6 bits of the 28-bit combined texture address of all six faces of a cube map are shared with bits [27:22] of register 0x0085
.
Texture addresses must be 128-byte aligned. If a texture address is not correctly aligned, problems may occur, such as the GPU hanging or rendered results being corrupted.
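The conversion from a physical address to the 8-byte address set in these registers, together with the alignment check, can be sketched as follows. The function name is hypothetical, and treating an address that does not fit in 28 bits as an error is an assumption made for this sketch.

```c
#include <stdint.h>

/* Converts a physical address to the 8-byte address used by the
 * texture address registers.
 * Returns 0 on success, or -1 if the address is not 128-byte aligned
 * or does not fit in the 28-bit field. */
static int texture_addr_field(uint32_t phys_addr, uint32_t *out)
{
    if (phys_addr % 128u != 0u) {
        return -1; /* misaligned: may hang the GPU or corrupt rendering */
    }
    uint32_t addr8 = phys_addr / 8u;
    if (addr8 > 0x0FFFFFFFu) {
        return -1; /* does not fit in 28 bits */
    }
    *out = addr8;
    return 0;
}
```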
8.8.3. Render Buffer Setting Registers (0x006E, 0x0116, 0x0117, 0x011C – 0x011E)
Among the render buffer-related settings, there are multiple registers for setting the color and depth buffers. The commands for setting these registers are generated on validation of the state flag NN_GX_STATE_FRAMEBUFFER
.
The names in the bit layout correspond to the following render buffer settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
COLORBUFFER_FORMAT | 3 | Sets the color buffer format. 0x0 : |
COLORBUFFER_PIXEL | 2 | Sets the pixel size of the color buffer format. |
COLORBUFFER_WIDTH | 11 | Sets the color buffer width in pixels. |
COLORBUFFER_HEIGHT | 10 | Sets the color buffer height in pixels, minus 1. |
COLORBUFFER_ADDR | 28 | Sets the color buffer address as an 8-byte address (a physical address divided by 8). |
DEPTHBUFFER_FORMAT | 2 | Sets the depth buffer format. 0x0 : |
DEPTHBUFFER_WIDTH | 11 | Sets the depth buffer width in pixels. |
DEPTHBUFFER_HEIGHT | 10 | Sets the depth buffer height in pixels, minus 1. |
DEPTHBUFFER_ADDR | 28 | Sets the depth buffer address as an 8-byte address (a physical address divided by 8). |
Always specify 0xF
in byte-enable for the setting command for register 0x011E
(COLORBUFFER_WIDTH
and COLORBUFFER_HEIGHT
).
8.8.4. Texture Combiner Setting Registers (0x00C0 – 0x00C4 and Others)
The register settings that set the dmp_TexEnv[i]
texture combiner reserved uniforms are split among multiple registers. These registers start at a different address (comb_top
) for each differently-numbered combiner.
Combiner Number | Starting Address (comb_top) |
---|---|
0 | 0x00C0 |
1 | 0x00C8 |
2 | 0x00D0 |
3 | 0x00D8 |
4 | 0x00F0 |
5 | 0x00F8 |
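Because the starting addresses are not evenly spaced (there is a gap between combiners 3 and 4), a lookup table is simpler than a formula. A minimal sketch, with hypothetical names:

```c
#include <stdint.h>

/* Starting register address (comb_top) for combiners 0 to 5,
 * copied from the table above. */
static const uint16_t k_comb_top[6] = {
    0x00C0, 0x00C8, 0x00D0, 0x00D8, 0x00F0, 0x00F8
};

/* Returns comb_top for the given combiner number, or -1 if the
 * number is out of range. */
static int comb_top(unsigned combiner)
{
    return (combiner < 6u) ? (int)k_comb_top[combiner] : -1;
}
```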
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
srcRgb0 srcRgb1 srcRgb2 | 4 | From the top, the first through third elements of the values set in 0x0 : |
srcAlpha0 srcAlpha1 srcAlpha2 | 4 | From the top, the first through third elements of the values set in The settings are the same as for srcRgb0 through srcRgb2. |
operandRgb0 operandRgb1 operandRgb2 | 4 | From the top, the first through third elements of the values set in 0x0 : |
operandAlpha0 operandAlpha1 operandAlpha2 | 3 | From the top, the first through third elements of the values set in 0x0 : |
combineRgb | 4 | The values set in 0x0 : |
combineAlpha | 4 | The values set in These are the same as the |
constRgba0 constRgba1 constRgba2 constRgba3 | 8 | From the top, the first through fourth elements of the values set in Set using unsigned 8-bit integers, mapping the values For more information about how these values are converted, see 8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer. |
scaleRgb scaleAlpha | 2 | The values set in 0x0 : 1.0 |
The dmp_TexEnv[i].srcRgb and dmp_TexEnv[i].srcAlpha settings have the following restrictions:
- When i is 0, GL_PREVIOUS and GL_PREVIOUS_BUFFER_DMP cannot be configured for the three elements of dmp_TexEnv[i].srcRgb and dmp_TexEnv[i].srcAlpha.
- When i is not 0, at least one of the three elements in each of dmp_TexEnv[i].srcRgb and dmp_TexEnv[i].srcAlpha must be set to one of GL_CONSTANT, GL_PREVIOUS, or GL_PREVIOUS_BUFFER_DMP.
8.8.4.1. Combiner Buffer Setting Registers (0x00E0, 0x00FD)
The reserved uniforms with names beginning with dmp_TexEnv[i].
are also used for combiner buffer settings. The registers used to set values in combiner buffer reserved uniforms have the following layout. Note that the other bits in register 0x00E0
are used for other settings (such as gas settings).
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
bufferColor0 bufferColor1 bufferColor2 bufferColor3 | 8 | From the top, the first through fourth elements of the values set in Set using unsigned 8-bit integers, mapping the values For more information about how these values are converted, see 8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer. |
bufferInput0 bufferInput1 | 1 * 4 | From the top, the first through second elements of the values set in |
8.8.5. Fragment Lighting Setting Registers (0x008F and Others)
This section describes registers involved with reserved uniforms used in fragment lighting. These are reserved uniforms that include dmp_FragmentLighting
, dmp_FragmentMaterial
, dmp_FragmentLightSource[i]
, or dmp_LightEnv
in their names.
8.8.5.1. Lighting Enable/Disable Control Registers (0x008F, 0x01C2, 0x01C6, 0x01D9)
The registers corresponding to the reserved uniforms that enable and disable lighting have the following layout.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
enabled0 | 1 |
The value set in
|
enabled1 | 1 |
The value set in
Note: Pay close attention to these values. |
src_num | 3 |
Set to the [number of enabled light sources – 1]. Set to The number of enabled light sources is the number of light sources for which |
id1 through id8 | 3 |
Enabled light source IDs are assigned starting from If light sources 0, 1, 3, and 5 are enabled, this register is set to When the same light source is specified multiple times, the lighting result for that light source will be applied multiple times. When light source 0 is specified multiple times, the primary color’s global ambient setting will also be applied multiple times in addition to the lighting result. |
8.8.5.2. Global Ambient Setting Register (0x01C0)
The global ambient settings are configured by setting the values for the RGB components in individual bits of the 0x01C0
register. The values to set are calculated using the equation below, clamped to the range [0.0,1.0
], and then mapped to the unsigned 8-bit integers 0 through 255. For more information about how these values are converted, see 8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer. If light source 0 is not enabled, this register's setting is ignored and a value of 0
is used for the primary color's global ambient term.
dmp_FragmentMaterial.emission + dmp_FragmentMaterial.ambient × dmp_FragmentLighting.ambient
Example:
When the emissive light and ambient light for a material are (0.8, 0.8, 0.8, 0.6
) and (0.2, 0.2, 0.2, 0.4
), respectively, and the global ambient light is (1.0, 1.0, 1.0, 1.0
), the value to set for the components is obtained by converting the value calculated below. The final converted value is 143 (0x8F
).
(0.8 × 0.6) + (0.2 × 0.4) × (1.0 × 1.0) = 0.56
The bit layout for the setting registers is as follows. Although 10 bits are provided for each component, the calculated results are stored in the lower 8 bits and 0
is stored in the upper 2 bits. (The figure shows in which 8 bits the values are actually set.) Behavior is undefined when the upper 2 bits are not set to 0
.
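The calculation of one global ambient component can be sketched as below. The exact conversion is defined in section 8.9.16, which is not reproduced here, so this sketch assumes round-to-nearest of the clamped value multiplied by 255; that assumption is consistent with the example above (0.56 becomes 143) but may differ from the actual rule for other inputs. The function names are hypothetical.

```c
#include <stdint.h>

/* Clamps v to the range [0.0, 1.0]. */
static float clamp01(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Assumed conversion: round-to-nearest of v * 255 (see section 8.9.16
 * for the authoritative rule). */
static uint8_t float_to_u8(float v)
{
    return (uint8_t)(clamp01(v) * 255.0f + 0.5f);
}

/* Value for one component of register 0x01C0. The result occupies the
 * lower 8 bits of the 10-bit field; the upper 2 bits stay 0. */
static uint8_t global_ambient_component(float emission, float ambient,
                                        float global_ambient)
{
    return float_to_u8(emission + ambient * global_ambient);
}
```

Using the terms from the example, `global_ambient_component(0.8f * 0.6f, 0.2f * 0.4f, 1.0f * 1.0f)` returns 143 (0x8F).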
If lighting is enabled (dmp_FragmentLighting.enabled
is GL_TRUE
) and all light sources are disabled (dmp_FragmentLightSource[i].enabled
are GL_FALSE
), only the global ambient color is applied to the primary color.
Register 0x01C2
sets the number of enabled light sources, and the value of its bits [2:0] is the [number of light sources - 1]. Consequently, enabling lighting also enables one light source, even if that is not what you intended. In such cases, the driver generates commands that set black (0.0, 0.0, 0.0, 0.0
) in all color components of light source 0 (in other words, registers 0x0140
through 0x0143
are set to 0). The driver also ensures that light source 0 is the enabled light source by generating commands that set light source 0 to be the first light source enabled (commands that set bits [2:0] of register 0x01D9
equal to 0x0
). In addition, it generates commands that set dmp_LightEnv.config
equal to GL_LIGHT_ENV_LAYER_CONFIG0_DMP
(commands that set bits [7:4] of register 0x01C3
equal to 0x0
).
8.8.5.3. Light Source Setting Registers (0x01C4, 0x0140 – 0x014F and Others)
All individual light settings are based on light source numbers. There are multiple registers for these settings. Their starting address light_top
is based on this formula: [light_top
= 0x0140
+ light source number × 0x10
]. For example, to set the colors of light sources 0 and 3 in dmp_FragmentLightSource[0].specular0
and dmp_FragmentLightSource[3].specular0
, the corresponding registers are 0x140
and 0x170
respectively.
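The light_top formula can be written as a one-line helper. The function name is hypothetical, and the limit of eight light sources (numbers 0 to 7) is assumed from the eight light source ID fields described earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Returns light_top, the first register address for the given
 * light source number (assumed range: 0 to 7). */
static uint32_t light_top(unsigned light)
{
    assert(light < 8u);
    return 0x0140u + light * 0x10u;
}
```

`light_top(0)` is 0x0140 and `light_top(3)` is 0x0170, matching the example above.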
Light Source Color Setting Registers (0x0140 – 0x0143 and Others)
Light source colors are configured by setting the RGB component values for specular light 0 (LightSpecular0
), specular light 1 (LightSpecular1
), diffuse light (LightDiffuse
), and ambient light (LightAmbient
) in the individual bits of the registers starting from the light_top addresses. The values to set are calculated using the equation below, clamped to the range [0.0,1.0
], and then mapped to the unsigned 8-bit integers 0 through 255. For more information about how these values are converted, see 8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer.
LightSpecular0 = dmp_FragmentMaterial.specular0 × dmp_FragmentLightSource[i].specular0
When dmp_LightEnv.lutEnabledRefl
is GL_FALSE
:
LightSpecular1 = dmp_FragmentMaterial.specular1 × dmp_FragmentLightSource[i].specular1
When dmp_LightEnv.lutEnabledRefl
is GL_TRUE
:
LightSpecular1 = dmp_FragmentLightSource[i].specular1
LightDiffuse = dmp_FragmentMaterial.diffuse × dmp_FragmentLightSource[i].diffuse
LightAmbient = dmp_FragmentMaterial.ambient × dmp_FragmentLightSource[i].ambient
The bit layout for the setting registers is as follows.
Light Source Position Setting Registers (0x0144, 0x0145, 0x0149 and Others)
The dmp_FragmentLightSource[i].position
reserved uniform sets light source position coordinates. The xyz
components of these coordinates are converted to 16-bit floating-point values before being set to registers. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. For the w
component, set whether the light source is point or directional in bit [0:0
] of the (light_top + 9
) register. Note that the bit is set to 1
if the w
component value is 0.0
, and set to 0
otherwise. Also note that the other bits in this register are used for separate settings.
The bit layout for the setting registers is as follows.
Spotlight Direction Setting Registers (0x0146, 0x0147 and Others)
The dmp_FragmentLightSource[i].spotDirection
reserved uniform sets the coordinates of the spotlight direction. Before these xyz
coordinates are set to registers, their signs are flipped, and they are converted to signed 13-bit fixed-point numbers with 11 fractional bits (using two's complement to represent negative values). For more information about how these values are converted, see 8.9.9. Conversion to a 13-Bit Signed Fixed-Point Number With 11 Fractional Bits.
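A sketch of the spotlight direction conversion follows: the sign is flipped, the value is scaled by 2^11, and the result is stored as a 13-bit two's complement number. The authoritative rounding rule is given in section 8.9.9; truncation toward zero and clamping to the representable range are assumptions made here, and the function name is hypothetical.

```c
#include <stdint.h>

/* Converts one spotDirection component to a signed 13-bit fixed-point
 * number with 11 fractional bits, after flipping its sign.
 * Assumptions: truncation toward zero, out-of-range values clamped. */
static uint16_t spot_dir_component(float v)
{
    float flipped = -v;

    /* Clamp before converting so the cast to an integer is safe. */
    if (flipped > 1.99f)  flipped = 1.99f;
    if (flipped < -2.0f)  flipped = -2.0f;

    int32_t fixed = (int32_t)(flipped * 2048.0f); /* 11 fractional bits */

    /* Keep the low 13 bits (two's complement for negative values). */
    return (uint16_t)((uint32_t)fixed & 0x1FFFu);
}
```

For example, a component of -1.0 becomes 0x0800 and a component of 1.0 becomes 0x1800.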
The bit layout for the setting registers is as follows.
Distance Attenuation Setting Registers (0x014A, 0x014B and Others)
The distance attenuation bias and scale values set by the distance attenuation reserved uniforms dmp_FragmentLightSource[i].distanceAttenuationBias
and dmp_FragmentLightSource[i].distanceAttenuationScale
are set to registers after being converted to 20-bit floating point numbers. For more information about how these values are converted, see 8.9.4. Conversion to a 20-Bit Floating-Point Number.
The bit layout for the setting registers is as follows.
Other Setting Registers (0x01C4, 0x0149 and Others)
The registers that correspond to other settings for each light source have the following layout.
Reserved Uniform | Register | Description |
---|---|---|
dmp_FragmentLightSource[i].shadowed |
( |
Note: Pay close attention to these values. |
× dmp_FragmentLightSource[i].specular0 |
( |
Note: Pay close attention to these values. |
× dmp_FragmentLightSource[i].specular0 |
( |
Note: Pay close attention to these values. |
dmp_FragmentLightSource[i].twoSideDiffuse |
|
|
dmp_FragmentLightSource[i].geomFactor0 |
( |
|
dmp_FragmentLightSource[i].geomFactor1 |
( |
|
8.8.5.4. Lookup Table Setting Registers (0x01C5, 0x01C8 – 0x01CF)
Fragment lighting lookup tables specified by the dmp_FragmentMaterial.sampler{RR, RG, RB, D0, D1, FR}
and dmp_FragmentLightSource[i].sampler{SP, DA}
reserved uniforms are set using 256 data items and the same number of delta values.
The bit layout for the setting registers is as follows.
In register 0x01C5
, set Ref_Table
to the lookup table you are targeting and Ref_Index
to the index at which to start setting the table. An index of 0
indicates the first data item, and 255
indicates the last data item. The values set in Ref_Table
correspond to the following lookup tables.
Ref_Table | Targeted Lookup Table |
---|---|
0x0 | Distribution factor 0 (D0) |
0x1 | Distribution factor 1 (D1) |
0x3 | Fresnel factors (FR) |
0x4 | Blue component of reflection (RB) |
0x5 | Green component of reflection (RG) |
0x6 | Red component of reflection (RR) |
0x8 + i | Spotlight (SP), where i is the light source number |
0x10 + i | Light's distance attenuation (DA), where i is the light source number |
To set registers 0x01C8
through 0x01CF
, write the values derived from combining the ith data item and (i + 256)th delta value in the lookup table loaded by the glTexImage1D
function. Set Ref_Value
to the data item converted to an unsigned 12-bit fixed-point number with 12 fractional bits, and set Ref_Difference
to the delta value converted to a signed 12-bit fixed-point number with 11 fractional bits. (The fractional portion is an absolute value, so negative numbers are not expressed using two's complement.) For more information about how these values are converted, see 8.9.13. Conversion to a 12-Bit Unsigned Fixed-Point Number With 12 Fractional Bits and 8.9.6. Conversion to a 12-Bit Signed Fixed-Point Number With 11 Fractional Bits.
After setting the targeted lookup table and index in register 0x01C5
, write the sets of converted values to any register in the range 0x01C8
– 0x01CF
. The results are the same no matter which of these registers is written to, and the index is incremented by one for each data item that is written.
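The two conversions used for each lookup table entry can be sketched as follows. The authoritative rules are in sections 8.9.13 and 8.9.6; this sketch assumes truncation, clamping to the largest representable value, and a sign bit in bit 11 of the 12-bit delta value. How Ref_Value and Ref_Difference are positioned within the register is not shown because it depends on the bit layout figure. The function names are hypothetical.

```c
#include <stdint.h>

/* Data item -> unsigned 12-bit fixed point with 12 fractional bits.
 * Assumptions: truncation; inputs of 1.0 or more map to 0xFFF. */
static uint16_t lut_value_bits(float v)
{
    if (v <= 0.0f) return 0u;
    if (v >= 1.0f) return 0x0FFFu;
    return (uint16_t)(v * 4096.0f);
}

/* Delta value -> signed 12-bit fixed point with 11 fractional bits,
 * stored as sign and absolute value (not two's complement).
 * Assumptions: sign in bit 11; truncation; magnitude clamped to 0x7FF;
 * a magnitude of 0 is always stored with a clear sign bit. */
static uint16_t lut_delta_bits(float d)
{
    float mag = (d < 0.0f) ? -d : d;
    uint16_t q = (mag >= 1.0f) ? 0x07FFu : (uint16_t)(mag * 2048.0f);
    uint16_t sign = (d < 0.0f && q != 0u) ? 0x0800u : 0u;
    return (uint16_t)(sign | q);
}
```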
8.8.5.5. Lookup Table Argument Range Setting Register (0x01D0)
The register for setting the lookup table argument ranges has the following layout.
Reserved Uniform | Register | Description |
---|---|---|
dmp_LightEnv.absLutInputD0 | 0x01D0, bit [1:1] | Note: Pay close attention to these values. |
dmp_LightEnv.absLutInputD1 | 0x01D0, bit [5:5] | Same as above. |
dmp_LightEnv.absLutInputSP | 0x01D0, bit [9:9] | Same as above. |
dmp_LightEnv.absLutInputFR | 0x01D0, bit [13:13] | Same as above. |
dmp_LightEnv.absLutInputRB | 0x01D0, bit [17:17] | Same as above. |
dmp_LightEnv.absLutInputRG | 0x01D0, bit [21:21] | Same as above. |
dmp_LightEnv.absLutInputRR | 0x01D0, bit [25:25] | Same as above. |
8.8.5.6. Lookup Table Input Values Setting Register (0x01D1)
The register for setting the lookup table input values has the following layout.
Reserved Uniform | Register | Description |
---|---|---|
dmp_LightEnv.lutInputD0 |
0x01D1, bits [2:0] |
0x0 : |
dmp_LightEnv.lutInputD1 |
0x01D1, bits [6:4] | Same as dmp_LightEnv.lutInputD0. |
dmp_LightEnv.lutInputSP |
0x01D1, bits [10:8] | Same as dmp_LightEnv.lutInputD0. |
dmp_LightEnv.lutInputFR |
0x01D1, bits [14:12] |
0x0: |
dmp_LightEnv.lutInputRB |
0x01D1, bits [18:16] | Same as dmp_LightEnv.lutInputFR. |
dmp_LightEnv.lutInputRG |
0x01D1, bits [22:20] | Same as dmp_LightEnv.lutInputFR. |
dmp_LightEnv.lutInputRR |
0x01D1, bits [26:24] | Same as dmp_LightEnv.lutInputFR. |
8.8.5.7. Lookup Table Output Value Scale Setting Register (0x01D2)
The register for setting the scale values to apply to the lookup table output values has the following layout.
Reserved Uniform | Register | Description |
---|---|---|
dmp_LightEnv.lutScaleD0 |
0x01D2, bits [2:0] |
0x0 : 1.0 |
dmp_LightEnv.lutScaleD1 |
0x01D2, bits [6:4] | Same as above. |
dmp_LightEnv.lutScaleSP |
0x01D2, bits [10:8] | Same as above. |
dmp_LightEnv.lutScaleFR |
0x01D2, bits [14:12] | Same as above. |
dmp_LightEnv.lutScaleRB |
0x01D2, bits [18:16] | Same as above. |
dmp_LightEnv.lutScaleRG |
0x01D2, bits [22:20] | Same as above. |
dmp_LightEnv.lutScaleRR |
0x01D2, bits [26:24] | Same as above. |
8.8.5.8. Shadow Attenuation Setting Register (0x01C3)
The register for setting the shadow attenuation has the following layout. Note that the other bits in this register are used for separate settings.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
shadowSelector | 2 |
The value set in
|
shadowPrimary | 1 |
The value set in
|
shadowSecondary | 1 |
The value set in
|
invertShadow | 1 |
The value set in
|
shadowAlpha | 1 |
The value set in
|
shadowAttn | 1 |
Set to |
8.8.5.9. Other Setting Registers (0x01C3, 0x01C4)
The registers for other fragment lighting settings have the following layout. Note that the other bits in these registers are used for separate settings.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
config | 4 |
The value set in
|
fresnelSelector | 2 |
The value set in 0x0 : |
enabledFresnelSelector | 1 |
Changes depending on the value set in
|
bumpSelector | 2 |
The value set in
|
bumpMode | 2 |
The value set in
|
bumpRenorm | 1 |
The value set in
|
clampHighlights | 1 |
The value set in
|
lutEnabledD0 | 1 |
The value set in
Note: Pay close attention to these values. |
lutEnabledD1 | 1 |
The value set in
Note: Pay close attention to these values. |
lutEnabledRefl | 3 |
The value set in
Note: Pay close attention to these values. |
The value of config
(bits [7:4
] of register 0x01C3
) changes the number of cycles used for pixel operations. Because the config
setting has an effect even when lighting is disabled, when lighting is turned off, be sure to change it to a setting that causes pixel operations to use one cycle. When lighting is disabled, the driver sets bits [7:4
] of register 0x01C3
equal to 0x0
. In addition, when config is set to 8 (GL_LIGHT_ENVLAYER_CONFIG7_DMP
), distance attenuation can no longer be used. If the distance attenuation setting remains enabled, an invalid value is applied to distance attenuation, so set bits [31:24
] of register 0x01C4
to 0xFF
to disable distance attenuation.
8.8.6. Texture Setting Registers (0x0080, 0x0083, 0x008B, 0x00A8 – 0x00B7)
This section describes registers involved with reserved uniforms that make texture settings (uniforms whose names include dmp_Texture[i]
). The commands for setting these registers are generated on validation of the state flag NN_GX_STATE_TEXTURE
. Also see 8.8.2. Texture Address Setting Registers (0x0085 – 0x008A, 0x0095, 0x009D).
8.8.6.1. Shadow Texture Setting Register (0x008B)
The register for setting shadow textures has the following layout.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
perspectiveShadow | 1 |
The value set in
Note: Pay close attention to these values. |
shadowZBias | 23 |
The value set in This uses the upper 23 bits of the result of converting to a 24-bit fixed-point number. For more information about how these values are converted, see 8.9.14. Conversion to a 24-Bit Signed Fixed-Point Number With 24 Fractional Bits. |
8.8.6.2. Texture Sampler Type Setting Registers (0x0080, 0x0083)
The registers for setting the texture sampler types have the following layout. Note that the other bits in these registers are used for separate settings.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
samplerType0 | 1 + 3 |
The values set to the corresponding bits in registers When the uniform is When the uniform has a setting other than 0x0 : |
samplerType1 | 1 |
The value set in
|
samplerType2 | 1 |
The value set in
|
samplerType3 | 1 |
The value set in
|
The dmp_Texture[0].samplerType
, dmp_Texture[1].samplerType
, and dmp_Texture[2].samplerType
settings are generated not via the NN_GX_STATE_FSUNIFORM
state flag, but rather when the glDrawElements
or glDrawArrays
function is called.
Write 1
to register 0x0080
bit [16:16
] to clear all texture caches (both Level 1 and Level 2). When doing so, write 0
to all of the bits [23:17
] of register 0x0080
. Texture caches must be cleared when the texture unit settings have changed. To clear the L1 texture cache built into each individual texture unit, the texture units must be enabled beforehand. To enable a texture unit, set its sampler type to something other than GL_FALSE
.
The texture unit enable command must be issued separately and in advance of the texture cache clear command. The texture cache clear operation uses the same register that holds the texture unit enable/disable settings. If the same single command performs both a bit write that enables a texture unit and a bit write that clears texture cache, the texture cache clear is not performed properly. On the other hand, the same is not true when disabling texture units. If the same single command performs both a bit write to disable a texture unit and a bit write to clear texture cache, the texture cache clear is performed properly.
When you use a command to set other bits in register 0x0080
and you have no need to clear texture cache, set the byte enable to 0xB
to ensure that you do not access bits [23:16
] of the register.
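The byte enable value used above can be derived from the bit range that a command must or must not touch. The following is a minimal sketch, assuming that byte-enable bit k guards register bits [8k+7:8k] (which is consistent with 0xB skipping bits [23:16]); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 4-bit byte enable that covers register bits [hi:lo].
 * ASSUMPTION: byte-enable bit k guards register bits [8k+7:8k]. */
static uint32_t byte_enable_for_bits(unsigned hi, unsigned lo)
{
    uint32_t mask = 0;
    unsigned byte;
    assert(lo <= hi && hi < 32);
    for (byte = lo / 8; byte <= hi / 8; ++byte) {
        mask |= 1u << byte;
    }
    return mask;
}

/* Returns the byte enable that writes every byte except those
 * overlapping register bits [hi:lo]. */
static uint32_t byte_enable_excluding_bits(unsigned hi, unsigned lo)
{
    return 0xFu & ~byte_enable_for_bits(hi, lo);
}
```

Under this assumption, byte_enable_excluding_bits(23, 16) evaluates to 0xB, matching the value given for register 0x0080.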
There are two situations where texture cache must be cleared: when changes have been made to the texture address setting registers (0x0085
, 0x0086
, 0x0087
, 0x0088
, 0x0089
, 0x008A
, 0x0095
, 0x009D
), and when texture data has been reloaded. Even if no changes have been made to the texture addresses or data itself and only the format has changed, you must clear the caches.
8.8.6.3. Texture Coordinate Selection Setting Register (0x0080)
The register for setting the selection of what texture coordinates to input to the texture units has the following layout. Note that the other bits in this register are used for separate settings.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
texcoord2 | 1 |
The value set in
|
texcoord3 | 2 |
The value set in
|
8.8.6.4. Procedural Texture Setting Registers (0x00A8 – 0x00AD)
The registers for setting the reserved uniforms involved with procedural texture settings have the following bit layout.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
ptRgbMap ptAlphaMap |
4 |
From the top, the values set in
|
ptAlphaSeparate | 1 |
The value set in
|
ptClampU ptClampV |
3 |
From the top, the values set in
|
ptShiftU ptShiftV |
2 |
From the top, the values set in
|
ptMinFilter | 3 |
The value set in
|
ptTexOffset ptTexWidth |
8 | From the top, the values set in dmp_Texture[3].ptTexOffset and dmp_Texture[3].ptTexWidth . These are identical to the uniforms' setting values. |
ptTexBias_high ptTexBias_low |
8 | This setting is the result of taking the value set in dmp_Texture[3].ptTexBias , converting it to a 16-bit floating-point number, and dividing it into two segments (upper and lower) of 8 bits each. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. |
ptNoiseEnable | 1 |
The value set in
|
ptNoiseU (F-parameter) ptNoiseU (P-parameter) ptNoiseU (A-parameter) |
16 |
From the top, the first through third elements of the values set in For the first and second elements, the setting is the result of taking the uniform value and converting it to a 16-bit floating-point number. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. For the third element, the setting is the result of taking the uniform value and converting it to a signed 16-bit fixed-point number with 12 fractional bits (using two's complement to represent negative values). For more information about how these values are converted, see 8.9.10. Conversion to a 16-Bit Signed Fixed-Point Number With 12 Fractional Bits. |
ptNoiseV (F-parameter) ptNoiseV (P-parameter) ptNoiseV (A-parameter) |
16 |
From the top, the first through third elements of the values set in For the first and second elements, the setting is the result of taking the uniform value and converting it to a 16-bit floating-point number. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. For the third element, the setting is the result of taking the uniform value and converting it to a signed 16-bit fixed-point number with 12 fractional bits (using two's complement to represent negative values). For more information about how these values are converted, see 8.9.10. Conversion to a 16-Bit Signed Fixed-Point Number With 12 Fractional Bits. |
8.8.6.5. Procedural Texture Lookup Table Setting Registers (0x00AF, 0x00B0 – 0x00B7)
Procedural texture lookup tables specified by the dmp_Texture[3].ptSampler{RgbMap, AlphaMap, NoiseMap, R, G, B, A}
reserved uniforms have two possible sizes. Lookup tables for RgbMap
, AlphaMap
, and NoiseMap
are set using 128 data items and 128 delta values, while lookup tables for R, G, B, and A are set using 256 data items and 256 delta values.
The same register is used to set the tables regardless of the number of data items, and its bit layout is as follows.
In register 0x00AF
, set Proc_Table to the lookup table you are targeting and Proc_Index to the index at which to start setting the table. Index 0
indicates the first data item. The values set in Proc_Table correspond to the following lookup tables. Even though four bits have been prepared for this 3-bit setting, you must set bit [11:11
] equal to 0
to configure a lookup table properly.
Proc_Table | Targeted Lookup Table |
---|---|
0x0 | Noise modulation table (NoiseMap ) |
0x2 | RGB mapping F function (RgbMap ) |
0x3 | Alpha mapping F function (AlphaMap ) |
0x4 | Color lookup table color values (R , G , B , A ) |
0x5 | Color lookup table delta values (R , G , B , A ) |
The same registers are used for setting values in all the lookup tables, but the bit layout of the registers differs between the color lookup tables and the other lookup tables.
After setting the targeted lookup table and index in register 0x00AF
, write the data to any register in the range 0x00B0
– 0x00B7
. The results are the same no matter which of these registers is written to, and the index is incremented by one for each data item that is written. This holds true for all the lookup tables. Also note that register 0x0080
bit [10:10
] (samplerType3
) must be set to 0x1
(procedural textures); otherwise, writes to the registers 0x00B0
– 0x00B7
are ignored.
Noise Modulation Table, RGB Mapping F Function, and Alpha Mapping F Function
To set registers 0x00B0
through 0x00B7
, write the values derived from combining the ith data item and (i + 128)th delta value in the lookup table loaded by the glTexImage1D
function. Set Proc_Value to the data item converted to an unsigned 12-bit fixed-point number with 12 fractional bits, and set Proc_Difference to the delta value converted to a signed 12-bit fixed-point number with 11 fractional bits (with negative numbers expressed using two's complement). For more information about how these values are converted, see 8.9.13. Conversion to a 12-Bit Unsigned Fixed-Point Number With 12 Fractional Bits and 8.9.7. Conversion to a 12-Bit Signed Fixed-Point Number With 11 Fractional Bits. Because there are 128 data items, the specifiable range for Proc_Index is 0
through 127
.
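The pairing of the ith data item with the (i + 128)th delta value can be sketched as follows. This is an illustration only: the type and function names are hypothetical, the clamping and truncation are assumptions standing in for the rules in sections 8.9.13 and 8.9.7, and the two fields are returned separately because their bit positions in the register are given by the bit layout, not by this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PROC_LUT_ITEMS 128  /* 128 data items followed by 128 delta values */

typedef struct {
    uint32_t value;       /* Proc_Value field */
    uint32_t difference;  /* Proc_Difference field */
} ProcLutFields;

/* Unsigned 12-bit fixed point with 12 fractional bits.
 * ASSUMPTION: clamp to the representable range and truncate. */
static uint32_t to_ufix12_12(float x)
{
    if (x <= 0.0f) return 0;
    if (x >= 4095.0f / 4096.0f) return 0xFFFu;
    return (uint32_t)(x * 4096.0f);
}

/* Signed 12-bit fixed point with 11 fractional bits, two's complement.
 * ASSUMPTION: clamp to [-1, 2047/2048] and truncate toward zero. */
static uint32_t to_sfix12_11(float x)
{
    int32_t q;
    if (x <= -1.0f) q = -2048;
    else if (x >= 2047.0f / 2048.0f) q = 2047;
    else q = (int32_t)(x * 2048.0f);
    return (uint32_t)q & 0xFFFu;
}

/* Builds the fields for index i from a 256-element table
 * laid out as [data 0..127][delta 0..127]. */
static ProcLutFields proc_lut_fields(const float *table, unsigned i)
{
    ProcLutFields f;
    assert(i < PROC_LUT_ITEMS);
    f.value = to_ufix12_12(table[i]);
    f.difference = to_sfix12_11(table[i + PROC_LUT_ITEMS]);
    return f;
}
```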
Color Lookup Tables
The values to write to the registers have a different format depending on whether you are setting color values or delta values.
For color values, you must write a packed value derived from the ith elements of each of the RGBA lookup tables loaded by the glTexImage1D
function. The value to write is obtained as follows: Take each component and convert it by mapping the range from 0.0
to 1.0
to the unsigned 8-bit integers 0
through 255
. Set the R component in Proc_R, the G component in Proc_G, the B component in Proc_B, and the A component in Proc_A. (For more information about how these values are converted, see 8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer.)
For delta values, you must write a packed value derived from the (i+256)th elements of each of the RGBA lookup tables loaded by the glTexImage1D
function. The value to write is obtained by doing the following: take each component and convert it to a signed 8-bit fixed-point number with 7 fractional bits (using two's complement to represent negative values). Set the R component in Proc_R, the G component in Proc_G, the B component in Proc_B, and the A component in Proc_A. (For more information about how these values are converted, see 8.9.5. Conversion to an 8-Bit Signed Fixed-Point Number With 7 Fractional Bits.) Because there are 256 data items, the specifiable range for Proc_Index is 0
through 255
.
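The two color lookup table formats can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the rounding and clamping are assumptions standing in for sections 8.9.16 and 8.9.5, and the byte order of Proc_R through Proc_A is an assumption that must be checked against the bit layout.

```c
#include <assert.h>
#include <stdint.h>

#define COLOR_LUT_ITEMS 256  /* 256 color values followed by 256 deltas */

/* Maps 0.0 through 1.0 onto 0 through 255. ASSUMPTION: round to nearest. */
static uint32_t unorm_to_u8(float x)
{
    if (x <= 0.0f) return 0;
    if (x >= 1.0f) return 255;
    return (uint32_t)(x * 255.0f + 0.5f);
}

/* Signed 8-bit fixed point with 7 fractional bits, two's complement.
 * ASSUMPTION: clamp to [-1, 127/128] and truncate toward zero. */
static uint32_t to_sfix8_7(float x)
{
    int32_t q;
    if (x <= -1.0f) q = -128;
    else if (x >= 127.0f / 128.0f) q = 127;
    else q = (int32_t)(x * 128.0f);
    return (uint32_t)q & 0xFFu;
}

/* ASSUMPTION: Proc_R is the lowest byte and Proc_A the highest. */
static uint32_t pack_rgba(uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
    return r | (g << 8) | (b << 16) | (a << 24);
}

/* Word for color value i: element i of each 512-element table. */
static uint32_t color_value_word(const float *r, const float *g,
                                 const float *b, const float *a, unsigned i)
{
    assert(i < COLOR_LUT_ITEMS);
    return pack_rgba(unorm_to_u8(r[i]), unorm_to_u8(g[i]),
                     unorm_to_u8(b[i]), unorm_to_u8(a[i]));
}

/* Word for delta value i: element (i + 256) of each table. */
static uint32_t color_delta_word(const float *r, const float *g,
                                 const float *b, const float *a, unsigned i)
{
    unsigned k = i + COLOR_LUT_ITEMS;
    assert(i < COLOR_LUT_ITEMS);
    return pack_rgba(to_sfix8_7(r[k]), to_sfix8_7(g[k]),
                     to_sfix8_7(b[k]), to_sfix8_7(a[k]));
}
```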
8.8.6.6. Texture Resolution Setting Registers (0x0082, 0x0092, 0x009A)
The registers for setting the respective resolutions of the textures loaded in texture units 0 through 2 have the following layout.
TEXTUREn_WIDTH
and TEXTUREn_HEIGHT
set the width and height of the texture loaded in texture unit n (where n is 0
through 2
). The settings for texture unit 0
correspond to register 0x0082
, texture unit 1
corresponds to register 0x0092
, and texture unit 2
corresponds to register 0x009A
.
8.8.6.7. Texture Format Setting Registers (0x0083, 0x008E, 0x0093, 0x0096, 0x009B, 0x009E)
The registers for setting the respective formats of the textures loaded in texture units 0
through 2
have the following layout.
The names in the bit layout correspond to the following texture format settings. The following table also gives the number of bits and possible values of each name. Settings for texture unit 0
correspond to registers 0x0083
and 0x008E
. Settings for texture unit 1
correspond to registers 0x0093
and 0x0096
. And settings for texture unit 2
correspond to registers 0x009B
and 0x009E
. Names that are not in the table correspond to bits used by other settings.
Name | Bits | Description |
---|---|---|
TEXTUREn_FORMAT_ETC1 | 2 |
This flag indicates whether
|
TEXTURE0_SHADOW_FLAG | 1 |
This flag indicates whether the format of the texture loaded in texture unit
|
TEXTUREn_FORMAT | 4 |
The settings of the
Note: The native formats use the same setting values as the corresponding non-native formats above. Note: You cannot use |
8.8.6.8. Texture Parameter Setting Registers (0x0081, 0x0083, 0x0084, and Others)
The registers for setting the texture parameters (such as wrapping mode and filters) for texture units 0
through 2
have the following layout.
The names in the bit layout correspond to the following settings. The following table also gives the number of bits and possible values of each name. Settings for texture unit 0
correspond to registers 0x0081
, 0x0083
, and 0x0084
; settings for texture unit 1
correspond to registers 0x0091
, 0x0093
, and 0x0094
; and settings for texture unit 2
correspond to registers 0x0099
, 0x009B
, and 0x009C
. Names that are not in the table correspond to bits used by other settings.
Name | Bits | Description |
---|---|---|
TEXTUREn_BORDER_RED TEXTUREn_BORDER_GREEN TEXTUREn_BORDER_BLUE TEXTUREn_BORDER_ALPHA |
8 |
These are the values of each component of the texture border color ( To obtain the settings, take each component and convert it by mapping the range [ |
TEXTUREn_MAG_FILTER | 1 |
Sets the texture magnification filter (
|
TEXTUREn_MIN_FILTER1 TEXTUREn_MIN_FILTER2 |
1 |
Sets the texture minification filter ( The possible combinations of the 2 bits ( ( |
TEXTUREn_WRAP_S TEXTUREn_WRAP_T |
3 |
Sets the wrapping modes (
|
TEXTUREn_MIN_LOD | 4 |
Sets the minimum LOD level ( Set to |
TEXTUREn_MAX_LOD | 4 |
Sets the maximum LOD level for texture unit n (where n is Set to |
TEXTUREn_LOD_BIAS | 13 |
Sets the LOD bias value ( The setting is a value converted into a signed 13-bit fixed-point number with 8 fractional bits (with negative numbers represented in two's complement). For more information about how these values are converted, see 8.9.8. Conversion to a 13-Bit Signed Fixed-Point Number With 8 Fractional Bits. |
8.8.6.9. Settings for Shadow Textures and Gas Textures
When using shadow textures (GL_SHADOW_DMP
or GL_SHADOW_NATIVE_DMP
) and 2D textures, set GL_TEXTURE_WRAP_S
and GL_TEXTURE_WRAP_T
to GL_CLAMP_TO_BORDER
. When using shadow textures and cube map textures, set GL_TEXTURE_WRAP_S
and GL_TEXTURE_WRAP_T
to GL_CLAMP_TO_EDGE
. When using either 2D textures or cube map textures, set GL_TEXTURE_MAG_FILTER
and GL_TEXTURE_MIN_FILTER
to GL_LINEAR
and set register 0x0083
bit [20:20
] (TEXTURE0_SHADOW_FLAG
) to 1
. Set this bit to 0
when using any format other than a shadow texture. Note that mipmaps cannot be applied to shadow textures.
When using gas textures (GL_GAS_DMP
or GL_GAS_NATIVE_DMP
), set GL_TEXTURE_WRAP_S
and GL_TEXTURE_WRAP_T
to GL_CLAMP_TO_EDGE
and set GL_TEXTURE_MAG_FILTER
and GL_TEXTURE_MIN_FILTER
to GL_NEAREST
. Note that mipmaps cannot be applied to gas textures.
8.8.7. Gas Setting Registers (0x00E0, 0x00E4, 0x00E5, 0x0120 – 0x0124, 0x0126)
This section describes registers involved with reserved uniforms that make gas settings (uniforms whose names include dmp_Gas
).
8.8.7.1. Gas Control Setting Registers (0x00E0, 0x00E4, 0x00E5, 0x0120 – 0x0122, 0x0126)
The registers for gas control settings have the following layout. Note that the other bits in register 0x00E0
are used for other settings, including fog settings.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the possible settings of each bit.
Name | Bits | Description |
---|---|---|
lightXY (lightMin) lightXY (lightMax) lightXY (lightAtt) |
8 |
From the top, the first through third elements of the values set in To obtain the settings, take each individual uniform element value and convert it by mapping the range [ |
lightZ (scattMin) lightZ (scattMax) lightZ (scattAtt) lightZ (LZ) |
8 |
From the top, the first through fourth elements of the values set in To obtain the settings, take each individual uniform element value and convert it by mapping the range [ |
deltaZ | 24 |
The value set in The setting is the result of converting the uniform value to an unsigned 24-bit fixed-point number with 8 fractional bits. For more information about how these values are converted, see 8.9.15. Conversion to a 24-Bit Unsigned Fixed-Point Number With 8 Fractional Bits. |
accMax | 16 |
The value set in The setting is the result of converting the uniform value to a 16-bit floating-point number. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. |
autoAcc (Initialize) | 16 |
The following procedures become necessary when Zero-clear before the density-rendering pass, and call the For more information, see the description below this table. |
attenuation | 16 |
The value set in The setting is the result of converting the uniform value to a 16-bit floating-point number. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. |
colorLutInput | 1 |
The value set in
|
shadingDensitySrc | 1 |
The value set in
|
Automatically Calculating the Inverse of the Largest Density Value
If you set dmp_Gas.autoAcc
to GL_TRUE
, rendering is carried out in the gas density-rendering pass, and then the inverse of the maximum value of the D1 density values is automatically calculated and set in accMax.
Zero-clear the maximum D1 value before the density-rendering pass begins. Clear it by writing 0
to autoAcc
(Initialize)
. (To clear using a value other than 0
, be sure to convert that value to a 16-bit floating-point number before writing it. For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number.) After the density-rendering pass is finished, be sure to call the nngxSetGasAutoAccumulationUpdate
function, so that the calculated D1 maximum value is applied to accMax.
void nngxSetGasAutoAccumulationUpdate (GLint id);
This function uses the interrupt handler that runs after completion of the id
th accumulated command request in the bound command list object to set accMax (the setting made by the dmp_Gas.accMax
reserved uniform) to the inverse of the maximum D1 value calculated during the gas density-rendering pass.
The command request specified by id
must be a render command request. If you make sure to call this function on a command request that includes a density-rendering pass command, accMax will be updated correctly. Use the nngxSplitDrawCmdlist
function to separate density-rendering pass commands from shading pass commands. If both kinds of commands are included in the same command request, accMax will not be updated before the shading pass. Also note that after you use this function to update accMax, you must not write any other value to accMax until the shading pass finishes.
Error GL_ERROR_806D_DMP
occurs when a bound command list has the object name of 0
. Error GL_ERROR_806E_DMP
occurs when id
is zero or less, when it is larger than the number of accumulated command requests, or when the command request specified by id
is not a render command request.
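The error conditions above can be modeled as a small precondition check. The following is not SDK code: the enum, the function name, the is_render array, and the order in which the conditions are tested are all hypothetical, and the sketch only mirrors the documented rules so that an application can validate id before calling the function.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    GAS_UPDATE_OK,
    GAS_UPDATE_NO_CMDLIST,  /* condition reported as GL_ERROR_806D_DMP */
    GAS_UPDATE_BAD_ID       /* condition reported as GL_ERROR_806E_DMP */
} GasUpdateCheck;

/* bound_cmdlist: object name of the bound command list (0 = none).
 * is_render[k]: nonzero if accumulated command request k + 1 is a
 * render command request. ASSUMPTION: ids are 1-based, because an id
 * of zero or less is rejected. */
static GasUpdateCheck check_gas_update(unsigned bound_cmdlist, int id,
                                       const int *is_render,
                                       int request_count)
{
    assert(is_render != NULL || request_count == 0);
    if (bound_cmdlist == 0) {
        return GAS_UPDATE_NO_CMDLIST;
    }
    if (id <= 0 || id > request_count) {
        return GAS_UPDATE_BAD_ID;
    }
    if (!is_render[id - 1]) {
        return GAS_UPDATE_BAD_ID;
    }
    return GAS_UPDATE_OK;
}
```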
8.8.7.2. Shading Lookup Table Setting Registers (0x0123, 0x0124)
Shading lookup tables specified by the dmp_Gas.sampler{TR, TG, TB}
reserved uniforms are set using eight data items and eight delta values.
The bit layout for the setting registers is as follows.
In register 0x0123
, set Shading_Index to the index at which to start setting the table. Index 0
indicates the first data item. Because the table holds 16 entries in total (eight data items and eight delta values), the specifiable range for Shading_Index is 0
to 15
.
After setting the index in register 0x0123
, write the set of converted values to register 0x0124
. The index is incremented by one for each data item that is written. However, you must note that the format of the values written to the register differs between the first 8 and latter 8 items in the shading lookup table, and that the index set in the register differs from the index in the shading lookup table. Also note that register 0x00E0
bits [2:0
] (mode) must be set to 0x7
(indicating that the fog unit is set to gas mode). Otherwise, write operations to register 0x0124
are ignored.
For the first eight data items (i
< 8
), you must write a packed value derived from the (i
+8
)th elements of each of the lookup tables loaded by the glTexImage1D
function. The value to write is obtained by doing the following: Take each component and convert it to a signed 8-bit integer. Set the R component in Shading_R
, the G component in Shading_G
, and the B component in Shading_B
. (For more information about how these values are converted, see 8.9.18. Conversion From a Floating-Point Number (Between -1 and 1) to an 8-Bit Signed Integer.)
For the latter eight data items (i
>= 8
), you must write a packed value derived from the (i
-8
)th elements of each of the lookup tables loaded by the glTexImage1D
function. The value to write is obtained by doing the following: Take each component, multiply it by 255, and then convert it to an unsigned 8-bit fixed-point number with 0 fractional bits. Set the R component in Shading_R
, the G component in Shading_G
, and the B component in Shading_B
. (For more information about how these values are converted, see 8.9.11. Conversion to an 8-Bit Unsigned Fixed-Point Number With 0 Fractional Bits.)
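The difference between the index set in the register and the index into the shading lookup table can be sketched as follows. The names are hypothetical. Reading the first eight register entries as the delta values is an inference from the signed format and from the table layout (eight data items followed by eight delta values). Only the unsigned conversion is shown, with clamping as an assumption, because the signed conversion rule is defined in section 8.9.18.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned source_index;  /* element index in the glTexImage1D table */
    int      is_delta;      /* 1: signed 8-bit format, 0: unsigned format */
} ShadingSlot;

/* Maps a Shading_Index (0 through 15) to the table element it takes. */
static ShadingSlot shading_slot(unsigned shading_index)
{
    ShadingSlot slot;
    assert(shading_index < 16);
    if (shading_index < 8) {
        slot.source_index = shading_index + 8;  /* (i + 8)th element */
        slot.is_delta = 1;
    } else {
        slot.source_index = shading_index - 8;  /* (i - 8)th element */
        slot.is_delta = 0;
    }
    return slot;
}

/* Latter eight entries: multiply by 255 and keep the integer part.
 * ASSUMPTION: inputs outside [0, 1] are clamped. */
static uint32_t shading_value_u8(float x)
{
    if (x <= 0.0f) return 0;
    if (x >= 1.0f) return 255;
    return (uint32_t)(x * 255.0f);
}
```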
Dummy commands must sometimes be inserted before commands that set the shading lookup table. In particular, immediately following any commands that set registers 0x0000
– 0x0035
, registers 0x0100
– 0x013F
, or registers at addresses not documented in this manual, you must insert 45 dummy commands before setting the gas shading lookup table. Any command that writes to some register other than those in the above ranges can serve as a dummy command. Also note that after a command that sets the shading lookup table, you must insert one dummy command that sets register 0x0100
and has a byte enable of 0
.
8.8.8. Fog Setting Registers (0x00E0, 0x00E1, 0x00E6, 0x00E8 - 0x00EF)
This section describes registers involved with reserved uniforms that make fog settings (uniforms whose names include dmp_Fog
).
8.8.8.1. Fog Control Setting Registers (0x00E0, 0x00E1)
The registers for fog control settings have the following layout. Note that the other bits in register 0x00E0
are used for other settings (such as gas settings).
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
mode | 3 |
The value set in
|
zFlip | 1 |
The value set in
|
color (Red) color (Green) color (Blue) |
8 |
From the top, the first through third elements of the values set in To obtain the settings, take each individual uniform element value and convert it by mapping the range [ |
8.8.8.2. Fog Lookup Table Setting Registers (0x00E6, 0x00E8 – 0x00EF)
The fog coefficient lookup table specified by the dmp_Fog.sampler
reserved uniform is set using 128 data items and 128 delta values. The bit layout for the setting registers is as follows.
In register 0x00E6
, set Fog_Index to the index at which to start setting the table. An index of 0
indicates the first data item, and 127
indicates the last data item.
To set registers 0x00E8
through 0x00EF
, write the values derived from combining the ith data item and (i + 128)th delta value in the lookup table loaded by the glTexImage1D
function. Set Fog_Value to the data item converted to an unsigned 11-bit fixed-point number with 11 fractional bits, and set Fog_Difference to the delta value converted to a signed 13-bit fixed-point number with 11 fractional bits (with negative numbers expressed using two's complement). For more information about how these values are converted, see 8.9.12. Conversion to an 11-Bit Unsigned Fixed-Point Number With 11 Fractional Bits and 8.9.9. Conversion to a 13-Bit Signed Fixed-Point Number With 11 Fractional Bits.
After setting the index in register 0x00E6
, write the sets of converted values to any register in the range 0x00E8
through 0x00EF
. The results are the same no matter which of these registers is written to, and the index is incremented by one for each data item that is written.
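Converting a whole fog coefficient table into its register fields can be sketched as follows. This is an illustration under assumptions: the names are hypothetical, the clamping and truncation stand in for the rules in sections 8.9.12 and 8.9.9, and the fields are kept separate because their bit positions come from the bit layout.

```c
#include <assert.h>
#include <stdint.h>

#define FOG_LUT_ITEMS 128  /* 128 data items followed by 128 delta values */

typedef struct {
    uint32_t value;       /* Fog_Value field */
    uint32_t difference;  /* Fog_Difference field */
} FogLutFields;

/* Unsigned 11-bit fixed point with 11 fractional bits.
 * ASSUMPTION: clamp to the representable range and truncate. */
static uint32_t to_ufix11_11(float x)
{
    if (x <= 0.0f) return 0;
    if (x >= 2047.0f / 2048.0f) return 0x7FFu;
    return (uint32_t)(x * 2048.0f);
}

/* Signed 13-bit fixed point with 11 fractional bits, two's complement.
 * ASSUMPTION: clamp to [-2, 4095/2048] and truncate toward zero. */
static uint32_t to_sfix13_11(float x)
{
    int32_t q;
    if (x <= -2.0f) q = -4096;
    else if (x >= 4095.0f / 2048.0f) q = 4095;
    else q = (int32_t)(x * 2048.0f);
    return (uint32_t)q & 0x1FFFu;
}

/* table: 256 floats laid out as [data 0..127][delta 0..127].
 * out:   128 entries, in the order they would be written starting
 *        from Fog_Index 0. */
static void fog_lut_fields(const float *table, FogLutFields *out)
{
    unsigned i;
    assert(table && out);
    for (i = 0; i < FOG_LUT_ITEMS; ++i) {
        out[i].value = to_ufix11_11(table[i]);
        out[i].difference = to_sfix13_11(table[i + FOG_LUT_ITEMS]);
    }
}
```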
8.8.9. Per-Fragment Operations Setting Registers (0x0100 and Others)
This section describes registers involved with reserved uniforms that make settings for per-fragment operations (uniforms whose names include dmp_FragOperation
).
8.8.9.1. Fragment Operations Mode Setting Register (0x0100)
The register for setting the fragment operations mode has the following layout. Note that the other bits in this register are used for logical operations and blending settings. When you change the fragment operations mode, you must also change the register settings described in 8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115).
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
FragmentOperation | 2 |
The value set in
|
8.8.9.2. Shadow Attenuation Factor Setting Register (0x0130)
The register for setting the shadow attenuation factor has the following layout.
Reserved Uniform | Register | Description |
---|---|---|
dmp_FragOperation.penumbraScale dmp_FragOperation.penumbraBias |
0x0130, bits [31:16] |
Before the For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. |
0x0130, bits [15:0] |
The register setting is the sum of the For more information about how these values are converted, see 8.9.2. Conversion to a 16-Bit Floating-Point Number. |
8.8.9.3. w Buffer Setting Registers (0x004D, 0x004E, 0x006D)
The registers for w
buffer settings have the following layout.
Reserved Uniform | Register | Description |
---|---|---|
dmp_FragOperation.wScale | 0x006D, bit [0:0] | Set to 1 when the uniform value is 0, and to 0 when the uniform value is anything else. |
dmp_FragOperation.wScale | 0x004D, bits [23:0] | Sets the scale value to apply to the clipping coordinate z value. The setting is influenced by both the uniform value and the glDepthRangef function's setting value. A description of how to configure this setting is given below. |
dmp_FragOperation.wScale | 0x004E, bits [23:0] | Sets the bias value to apply to the clipping coordinate z value. The setting is influenced by both the uniform value and the settings of the glDepthRangef and glPolygonOffset functions. A description of how to configure this setting is given below. |
For register 0x004D
bits [23:0
], if the uniform value is not 0
, the register setting is the result of taking the uniform value and flipping its sign. If the uniform value is 0
, obtain the register setting by taking the zNear
and zFar
parameters specified in the glDepthRangef
function and calculating (zNear
– zFar
). Convert the value to a 24-bit floating-point number before setting it in the register. For more information about how these values are converted, see 8.9.1. Conversion to a 24-Bit Floating-Point Number.
For register 0x004E bits [23:0], if the uniform value is not 0, the register setting is 0. If the uniform value is 0, the register setting is the zNear parameter specified in the glDepthRangef function. If GL_POLYGON_OFFSET_FILL has been enabled by the glEnable function, the register setting is calculated by adding an offset derived from the units parameter specified in the glPolygonOffset function. When the depth buffer format is 16 bits, the offset is the result of the calculation (units ÷ 65535). When the depth buffer format is 24 bits, the offset is the result of the calculation (units ÷ 16777215). Convert the value to a 24-bit floating-point number before setting it in the register. For more information about how these values are converted, see 8.9.1. Conversion to a 24-Bit Floating-Point Number.
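The calculations for registers 0x006D, 0x004D, and 0x004E described above can be sketched as follows. This is only an illustration under stated assumptions, not the driver's implementation: the WBufferSettings type and the compute_w_buffer_settings function are hypothetical names, the polygon offset is assumed to be added whether or not the uniform value is 0, and the final conversion to a 24-bit floating-point number (see 8.9.1) is left out.

```c
#include <stdbool.h>

/* Hypothetical container for the values described above (not an SDK type). */
typedef struct {
    unsigned int reg_006d_bit0; /* 1 when dmp_FragOperation.wScale is 0 */
    float scale;                /* value for 0x004D, before 24-bit float conversion */
    float bias;                 /* value for 0x004E, before 24-bit float conversion */
} WBufferSettings;

/* w_scale: value of dmp_FragOperation.wScale
 * z_near, z_far: glDepthRangef parameters
 * polygon_offset_fill: whether GL_POLYGON_OFFSET_FILL is enabled
 * units: glPolygonOffset units parameter
 * depth_bits: 16 or 24, the depth buffer format */
static WBufferSettings compute_w_buffer_settings(float w_scale,
                                                 float z_near, float z_far,
                                                 bool polygon_offset_fill,
                                                 float units, int depth_bits)
{
    WBufferSettings s;

    if (w_scale != 0.0f) {
        s.reg_006d_bit0 = 0u;
        s.scale = -w_scale;       /* uniform value with its sign flipped */
        s.bias = 0.0f;
    } else {
        s.reg_006d_bit0 = 1u;
        s.scale = z_near - z_far; /* (zNear - zFar) */
        s.bias = z_near;
    }

    if (polygon_offset_fill) {
        /* Assumption: the offset is added in both of the cases above. */
        float divisor = (depth_bits == 16) ? 65535.0f : 16777215.0f;
        s.bias += units / divisor;
    }
    return s;
}
```

The scale and bias fields still need to be converted to 24-bit floating-point numbers before they are written to bits [23:0] of the registers.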
8.8.9.4. Clipping Setting Registers (0x0047 – 0x004B)
The registers for clipping settings have the following layout.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
enableClippingPlane | 1 |
The value set in
|
clippingPlane1 through clippingPlane4 |
24 |
The first through fourth elements of the values set in For more information about how these values are converted, see 8.9.1. Conversion to a 24-Bit Floating-Point Number. |
8.8.9.5. Alpha Test Setting Register (0x0104)
The register for the alpha test settings has the following layout.
The names in the bit layout correspond to the following reserved uniforms. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
enableAlphaTest | 1 |
The value set in
|
alphaTestFunc | 3 |
The value set in
|
alphaRefValue | 8 |
To obtain the register setting, take the For more information about how these values are converted, see 8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer. |
8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115)
The registers for framebuffer access control settings have the following layout. These registers must sometimes be changed when certain functions are called, and when certain reserved uniform values are changed.
Register | Description |
---|---|
colorRead 0x0112, bits [3:0] |
Set
Color buffer reads are needed when any of the following conditions are met.
|
colorWrite 0x0113, bits [3:0] |
Set
Color buffer writes are needed when any of the following conditions are met.
|
depthRead 0x0114, bits [1:0] |
Set bit [
Depth buffer reads are needed when any of the following conditions are met.
Stencil buffer reads are needed when any of the following conditions are met.
|
depthWrite 0x0115, bits [1:0] |
Set bit [
Depth buffer writes are needed when the following conditions are met.
Stencil buffer writes are needed when the following conditions are met.
|
Not all possible combinations of reads and writes of the color buffer, depth buffer, and stencil buffer are supported by hardware. If an unsupported combination is set, operations are undefined. See the following table for the supported combinations.
colorRead | colorWrite | depthRead | depthWrite | Hardware Support |
---|---|---|---|---|
0 | 0 | 0 | 0 | × |
Non-zero | 0 | 0 | 0 | × |
0 | Non-zero | 0 | 0 | ○ |
Non-zero | Non-zero | 0 | 0 | ○ |
0 | 0 | Non-zero | 0 | × |
Non-zero | 0 | Non-zero | 0 | × |
0 | Non-zero | Non-zero | 0 | ○ |
Non-zero | Non-zero | Non-zero | 0 | ○ |
0 | 0 | 0 | Non-zero | × |
Non-zero | 0 | 0 | Non-zero | × |
0 | Non-zero | 0 | Non-zero | × |
Non-zero | Non-zero | 0 | Non-zero | × |
0 | 0 | Non-zero | Non-zero | ○ |
Non-zero | 0 | Non-zero | Non-zero | × |
0 | Non-zero | Non-zero | Non-zero | ○ |
Non-zero | Non-zero | Non-zero | Non-zero | ○ |
When the registers above are set to 0, memory access to the corresponding buffers is restricted and the result is a boost in performance. For this reason it is preferable that these registers are set to 0 if possible. However, you can only set combinations that are supported by the hardware. If write access is disabled, buffer writes will not take place, even if the fragment operations settings require buffer writes. Similarly, if read access is disabled but the fragment operations settings require buffer reads, undefined values will be read from the buffer.
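As an illustration, the table of supported combinations can be encoded as a lookup and checked before the four registers are written. The function name fb_access_combination_supported is hypothetical and is not part of the library. Only the zero or non-zero state of each register matters here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the combination of colorRead, colorWrite, depthRead, and
 * depthWrite values is marked as supported in the table above. */
static bool fb_access_combination_supported(uint32_t color_read,
                                            uint32_t color_write,
                                            uint32_t depth_read,
                                            uint32_t depth_write)
{
    /* Same row order as the table: colorRead changes fastest,
     * then colorWrite, depthRead, and depthWrite. */
    static const bool supported[16] = {
        false, false, true,  true,
        false, false, true,  true,
        false, false, false, false,
        true,  false, true,  true
    };
    unsigned int index = (color_read  != 0 ? 1u : 0u)
                       | (color_write != 0 ? 2u : 0u)
                       | (depth_read  != 0 ? 4u : 0u)
                       | (depth_write != 0 ? 8u : 0u);
    return supported[index];
}
```

For example, a depth-only pass with colorRead and colorWrite both 0 is supported only when depthRead and depthWrite are both non-zero.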
You can disable read access of the color buffer only if the following conditions are met.
- The reserved uniform dmp_FragOperation.mode is set to GL_FRAGOP_MODE_GL_DMP.
- Bits [11:8] of the color mask setting register (0x0107) are set to 0x0 or 0xF.
In addition to all the above conditions, at least one of the conditions below must be met.
- Blending is enabled (bit [8:8] of register 0x0100 is 1) and the settings for srcRGB and srcAlpha (bits [19:16] and [27:24] of register 0x0101) do not require destination color (that is, are not 0x4, 0x5, 0x8, 0x9, or 0xE). In addition, dstRGB and dstAlpha (bits [23:20] and [31:28] of register 0x0101) are set to 0x0 (GL_ZERO).
- Logical operations are enabled (bit [8:8] of register 0x0100 is 0) and the opcode setting for register 0x0102 does not require destination color (that is, it is 0x0, 0x3, 0x4, or 0x5).
You can disable write access of the color buffer only if all the following conditions are met.
- The reserved uniform dmp_FragOperation.mode is set to GL_FRAGOP_MODE_GL_DMP.
- Bits [11:8] of the color mask setting register (0x0107) are set to 0x0.
You can disable read access of the depth buffer only if the following conditions are met.
- The reserved uniform dmp_FragOperation.mode is set to GL_FRAGOP_MODE_GL_DMP.
- Either the depth test is disabled, or it is enabled but the comparison method does not require the depth buffer value.
Either all the above conditions must be met, or the conditions below must be met.
- The reserved uniform dmp_FragOperation.mode is set to GL_FRAGOP_MODE_SHADOW_DMP.
You can disable write access of the depth buffer only if at least one of the following conditions is met.
- The depth test is disabled (bit [0:0] of register 0x0107 is set to 0x0).
- The depth buffer masking setting is GL_FALSE (bit [12:12] of register 0x0107 is set to 0x0).
- The reserved uniform dmp_FragOperation.mode is set to something other than GL_FRAGOP_MODE_GL_DMP.
You can disable read access of the stencil buffer only if the following conditions are met.
- The reserved uniform dmp_FragOperation.mode is set to GL_FRAGOP_MODE_SHADOW_DMP.
Either this condition must be met, or, if the reserved uniform dmp_FragOperation.mode is set to GL_FRAGOP_MODE_GL_DMP, at least one of the following conditions must be met.
- The stencil test is disabled (bit [0:0] of register 0x0105 is set to 0x0).
- The stencil test is enabled (bit [0:0] of register 0x0105 is set to 0x1), but the comparison method does not require the stencil buffer value.
- The stencil test masking value (bits [31:24] of register 0x0105) is set to 0x00.
You can disable write access of the stencil buffer only if at least one of the following conditions is met.
- The stencil test is disabled (bit [0:0] of register 0x0105 is set to 0x0).
- The stencil test mask setting (bits [15:8] of register 0x0105) is set to 0x00.
- The stencil test fail, zfail, and zpass parameters are all set to GL_KEEP (bits [2:0], [6:4], and [10:8] of register 0x0106 are all set to 0x0).
- The reserved uniform dmp_FragOperation.mode is set to something other than GL_FRAGOP_MODE_GL_DMP.
8.8.10. Viewport Setting Registers (0x0041 – 0x0044, 0x0068)
The registers that correspond to the viewport settings configured by the glViewport function have the following layout. When you change these registers, you must sometimes also change the registers described in 8.8.16. Scissor Test Setting Registers (0x0065 – 0x0067).
The names in the bit layout correspond to the following viewport settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
VIEWPORT_WIDTH | 24 |
The result of dividing the viewport width by 2 and then converting the quotient into a 24-bit floating-point number. For more information about how these values are converted, see 8.9.1. Conversion to a 24-Bit Floating-Point Number. |
VIEWPORT_WIDTH_INV | 32 |
The result of dividing 2 by the viewport width, converting the quotient into a 31-bit floating-point number, and, finally, shifting the value left by 1 bit. For more information about how these values are converted, see 8.9.3. Conversion to a 31-Bit Floating-Point Number. |
VIEWPORT_HEIGHT | 24 |
The result of dividing the viewport height by 2 and then converting the quotient into a 24-bit floating-point number. For more information about how these values are converted, see 8.9.1. Conversion to a 24-Bit Floating-Point Number. |
VIEWPORT_HEIGHT_INV | 32 |
The result of dividing 2 by the viewport height, converting the quotient into a 31-bit floating-point number, and, finally, shifting the value left by 1 bit. For more information about how these values are converted, see 8.9.3. Conversion to a 31-Bit Floating-Point Number. |
VIEWPORT_X | 10 | Sets the x-coordinate of the viewport's origin. |
VIEWPORT_Y | 10 | Sets the y-coordinate of the viewport's origin. |
8.8.11. Depth Test Setting Registers (0x0107, 0x0126)
The registers that correspond to depth test settings have the following layout. When you change these registers, you must sometimes also change the registers described in 8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115).
The names in the bit layout correspond to the following depth test settings. The following table also gives the number of bits and possible values of each name. Names that are not in the table correspond to bits used by other settings.
Name | Bits | Description |
---|---|---|
enable | 1 |
Sets whether to enable or disable depth tests. This setting is changed by passing
|
func | 3 |
Sets the depth test comparison method. This setting is changed by the
|
flag | 1 |
Depth buffer masking setting. This setting is changed by the
|
func2 | 2 |
Sets the depth test comparison method. This setting is changed by the
|
8.8.12. Logical Operations and Blending Setting Registers (0x0100 – 0x0103)
The following registers correspond to logical operations and blending settings. When you change these registers, you must sometimes also change the registers described in 8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115).
The names in the bit layout correspond to the following logical operations and blending settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
enable | 1 |
Sets whether logical operations or blending is enabled (they are mutually exclusive). This setting is changed by passing either
|
srcRGB dstRGB srcAlpha dstAlpha |
4 |
Sets the source and destination weighting coefficients. These settings are changed by the When blending is disabled, set When blending is enabled, you can set the following values.
|
modeRGB modeAlpha |
3 |
Sets the blending equation. This setting is changed by the When blending is disabled, set to When blending is enabled, you can set the following values.
|
red green blue alpha |
8 |
From the top, the red, green, blue, and alpha components of the constant color set by the To obtain the settings, take each individual component value and convert it by mapping the range [ |
opcode | 4 |
Sets the logical operation. This setting is changed by the
|
When logical operations are enabled, the settings in register 0x0101 (srcXXX, dstXXX, and modeXXX) are ignored.
8.8.13. Early Depth Test Setting Registers (0x0061 – 0x0063, 0x006A, 0x0118)
The following registers correspond to early depth test settings. When you change these registers, you must sometimes also change the registers described in 8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115) and 8.8.11. Depth Test Setting Registers (0x0107, 0x0126).
The names in the bit layout correspond to the following early depth test settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
EnableEarlyDepthTest0 EnableEarlyDepthTest1 |
1 1 |
Sets whether to enable or disable early depth tests. This setting is changed by passing
|
EarlyDepthFunc | 2 |
Sets the comparison method. This setting is specified by the
|
ClearEarlyDepthBit | 1 | This is set to a value of 0x1 when the early depth buffer is cleared by passing an argument containing GL_EARLY_DEPTH_BUFFER_BIT_DMP to the glClear function. |
ClearEarlyDepthValue | 24 | Sets the clear value for the early depth buffer. This setting is specified by the depth parameter of the glClearEarlyDepthDMP function. Set the register to a value identical to the value passed as an argument. |
8.8.14. Stencil Test Setting Registers (0x0105, 0x0106)
The following registers correspond to stencil test settings. When you change these registers, you must sometimes also change the registers described in 8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115).
The names in the bit layout correspond to the following stencil test settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
enable | 1 |
Sets whether to enable or disable stencil tests. This setting is changed by passing
|
fb_mask | 8 | Sets the stencil buffer's masking value. This setting is specified by the mask parameter of the glStencilMask function. Set the register to the lower 8 bits of the value passed as an argument. |
func | 3 |
Sets the comparison method. This setting is specified by the
|
ref | 8 | Sets the stencil test reference value. This setting is specified by the ref parameter of the glStencilFunc function. Set the register to a value identical to the value passed as an argument. |
mask | 8 | Sets the stencil test's masking value. This setting is specified by the mask parameter of the glStencilFunc function. Set the register to a value identical to the value passed as an argument. |
fail | 3 |
Sets how the stencil buffer content is changed when a fragment is eliminated by the stencil test. This setting is changed by the
|
zfail | 3 | Sets how the stencil buffer content is changed when a fragment is eliminated by the depth test. This setting is changed by the zfail parameter of the glStencilOp function. The possible setting values are the same as for fail. |
zpass | 3 | Sets how the stencil buffer content is changed when a fragment passes the depth test. This setting is changed by the zpass parameter of the glStencilOp function. The possible setting values are the same as for fail. |
8.8.15. Culling Setting Register (0x0040)
The following register corresponds to culling settings.
The names in the bit layout correspond to the following culling settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
Culling | 2 |
Set to The value set when culling has been enabled by calling Set to Set to |
8.8.16. Scissor Test Setting Registers (0x0065 – 0x0067)
The registers corresponding to scissor test settings have the following layout.
The names in the bit layout correspond to the following scissor test settings. The following table also gives the number of bits and possible values of each name.
Name | Bits | Description |
---|---|---|
enable | 2 |
Sets whether to enable or disable scissor tests. This setting is changed by passing
|
x | 10 |
Sets the x-coordinate of the origin of the scissor box. This setting is specified by the Set to Set identical to the |
y | 10 |
Sets the y-coordinate of the origin of the scissor box. This setting is specified by the Set to Set identical to the |
width | 10 |
Sets the scissor box width. This setting is specified by the Set to one less than the current color buffer width when the scissor test is disabled. Set to the result of ( |
height | 10 |
Sets the scissor box height. This setting is specified by the Set to one less than the current color buffer height when the scissor test is disabled. Set to the result of ( |
8.8.17. Color Mask Setting Register (0x0107)
The following register corresponds to color mask settings made by the glColorMask function. When you change this register, you must sometimes also change the registers described in 8.8.9.6. Framebuffer Access Control Setting Registers (0x0112 – 0x0115).
The names in the bit layout correspond to the following color mask settings. The following table also gives the number of bits and possible values of each name. Names that are not in the table correspond to bits used by the registers described in 8.8.11. Depth Test Setting Registers (0x0107, 0x0126).
Name | Bits | Description |
---|---|---|
red | 1 |
Sets the color mask's red component. This setting is specified by the
|
green | 1 |
Sets the color mask's green component. This setting is specified by the
|
blue | 1 |
Sets the color mask's blue component. This setting is specified by the
|
alpha | 1 |
Sets the color mask's alpha component. This setting is specified by the
|
8.8.18. Block Format Setting Register (0x011B)
The following register corresponds to the block format setting.
Setter Function | Register | Description |
---|---|---|
glRenderBlockModeDMP |
0x011B, bits [0:0] | 0x0 : GL_RENDER_BLOCK8_MODE_DMP 0x1 : GL_RENDER_BLOCK32_MODE_DMP |
8.8.19. Registers Involved With the Rendering API (0x0227 – 0x022A and Others)
The glDrawElements and glDrawArrays functions validate states when they are called. This generates commands that set the various registers relating to the various states. But in addition to generating commands through validation, these functions set registers required for rendering itself.
8.8.19.1. Settings When Vertex Buffers Are Used
The following registers are set when vertex buffers are used with rendering. All these commands must be set before the render start command, unless indicated otherwise.
The names in the bit layout correspond to the following settings.
Name | Bits | Description |
---|---|---|
DRAW_MODE DRAW_TRIANGLES1 DRAW_TRIANGLES2 |
2 1 1 |
Sets the rendering mode. This setting is specified by the 0x0: |
CALL_DRAWARRAYS | 1 |
Always set to While this is set to |
INDEX_ARRAY_OFFSET | 28 |
Sets the address offset to the vertex index array. This is the offset from the base address shared by all vertex arrays. The base address is set in bits [ When using the
The setting value for This does not need to be re-set unless settings change. |
INDEX_ARRAY_TYPE | 1 |
Sets the vertex index type. Set to Always set to |
VERTEX_NUM | 32 | Sets the number of vertices to render. This does not need to be re-set unless settings change. Operation is undefined when set to 0 , so do not set this to 0 . |
FIRST_OFFSET | 32 | When rendering with the glDrawArrays function, this is set to the value of the first parameter. This does not need to be re-set unless settings change. |
RESET_VTX_INFO | 1 |
When a value of When passing When rendering with the When rendering with either function, resetting is required per each render start command if either |
KICK_COMMAND1 KICK_COMMAND2 |
Special |
When starting to render, write a
|
CLEAR_POST_VTX | Special | Immediately after starting to render, write a 1 to any bit. This must be set each time rendering is performed. |
clearFrameBufferCacheData Register 0x0111 , bit [0:0] |
1 |
Write a For details about these registers, see 8.8.21. Framebuffer Cache Clear Setting Registers (0x0110, 0x0111). |
SamplerType0 SamplerType1 SamplerType2 Register 0x0080, bits [2:0] |
1 1 1 |
Immediately before the command to start rendering, set to 1 for just those texture units that must be enabled, and then set to 0 immediately after the command to start rendering. It does not cause any problems in operation to leave this set to 1 for the texture units that must be enabled. However, setting this to 0 can help reduce power consumption, so it is set to 0 when not rendering. For details about these registers, see 8.8.6.2. Texture Sampler Type Setting Registers (0x0080, 0x0083). |
DRAW_START | 1 |
Set to The It is acceptable to leave this bit set to |
DUMMY_CMD | 8 |
Two commands that each write a value of Commands that write to this register at that time are dummy commands, so the setting values themselves have no meaning. |
Register 0x02BA, bits [31:16] | 16 |
Running a command that writes Failure to do so does not cause any problems, but if you do write to these bits use a byte enable setting of |
Register 0x028A, bits [31:16] | 16 |
Running a command that writes Failure to do so does not cause any problems, but if you do write to these bits use a byte enable setting of When you are not using geometry shaders (that is, when bit [ |
Notes About Command Dependencies
Bits [31:16] of register 0x02BA must be set after DRAW_START is set.
While CALL_DRAWARRAYS is set to 1, registers other than 0x0200 through 0x0254 and 0x0280 through 0x02DF might not be set properly. Set these registers only when CALL_DRAWARRAYS is 0. This does not apply to dummy commands that set bits [31:24] of register 0x025E, however.
There are several other commands that must set registers immediately after a render start command, but there are no dependencies affecting the order of those commands.
Note on the INDEX_ARRAY_OFFSET Setting
In the description of INDEX_ARRAY_OFFSET in the table (bits [27:0] of register 0x0227), it states that either 0 or 0x20 must be set depending on conditions. More precisely, values other than 0 and 0x20 also work correctly, provided that the setting does not satisfy either of the following conditions.
- If VERTEX_NUM is greater than 0x10: (((VERTEX_NUM – 0x10) × 2 + (ARRAY_BASE_ADDR << 4) + INDEX_ARRAY_OFFSET) & 0xFFF) >= 0xFE0
- If VERTEX_NUM is less than or equal to 0x10: (((ARRAY_BASE_ADDR << 4) + INDEX_ARRAY_OFFSET) & 0xFFF) >= 0xFE0
The setting value for VERTEX_NUM is bits [31:0] of register 0x0228, and for ARRAY_BASE_ADDR, bits [28:1] of register 0x0200.
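The two conditions above can be evaluated in code before choosing an INDEX_ARRAY_OFFSET value. The following is a minimal sketch, assuming 32-bit unsigned arithmetic; the function name index_array_offset_is_restricted is hypothetical. It returns true when the offset falls under one of the conditions that must be avoided.

```c
#include <stdbool.h>
#include <stdint.h>

/* vertex_num:         VERTEX_NUM, bits [31:0] of register 0x0228
 * array_base_addr:    ARRAY_BASE_ADDR, bits [28:1] of register 0x0200
 * index_array_offset: INDEX_ARRAY_OFFSET, bits [27:0] of register 0x0227 */
static bool index_array_offset_is_restricted(uint32_t vertex_num,
                                             uint32_t array_base_addr,
                                             uint32_t index_array_offset)
{
    uint32_t addr = (array_base_addr << 4) + index_array_offset;

    if (vertex_num > 0x10u) {
        addr += (vertex_num - 0x10u) * 2u;
    }
    /* Only the lower 12 bits take part in the comparison. */
    return (addr & 0xFFFu) >= 0xFE0u;
}
```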
Limitations Specific to Load Arrays
When rendering is started by a write to register 0x022F, the number of load arrays must be under 12. For more information, see Load Array Limitations.
8.8.19.2. Settings When Vertex Buffers Are Not Used
This section describes register settings when rendering without using vertex buffers by explaining how they differ from the register settings when vertex buffers are used. The bit layout is the same as in Figure 8-56. The names in the bit layout correspond to the following settings. Commands that set vertex attribute data are handled in the same way as render start commands. All these commands must be set before the render start command, unless indicated otherwise.
Name | Bits | Description |
---|---|---|
DRAW_MODE DRAW_TRIANGLES1 DRAW_TRIANGLES2 |
2 1 1 |
Sets the rendering mode. This setting is specified by the
Both |
CALL_DRAWARRAYS | 1 |
This setting is the same regardless of which rendering function is called. Because the initial setting is While this is set to |
INDEX_ARRAY_OFFSET | 28 | Unused. |
INDEX_ARRAY_TYPE | 1 | Unused. |
VERTEX_NUM | 32 | Unused. The amount of input vertex attribute data determines the number of vertices to process. |
FIRST_OFFSET | 32 | Unused. |
RESET_VTX_INFO | 1 | No different from when vertex buffers are used. |
KICK_COMMAND1 KICK_COMMAND2 |
Special | |
CLEAR_POST_VTX | Special |
When rendering with either the In other rendering modes ( |
clearFrameBufferCacheData register 0x0111 [0:0] | 1 | No different from when vertex buffers are used. |
SamplerType0 SamplerType1 SamplerType2 Register 0x0080, bits [2:0] |
1 1 1 |
No different from when vertex buffers are used. |
DRAW_START | 1 | No different from when vertex buffers are used. |
Register 0x02BA, bits [31:16] | 16 | No different from when vertex buffers are used. |
Register 0x028A, bits [31:16] | 16 | No different from when vertex buffers are used. |
The registers described in 8.8.1.8. Fixed Vertex Attribute Value Setting Registers (0x0232 – 0x0235) are used to input vertex attribute data.
First, 0xF is written to bits [3:0] of register 0x0232. Next, the vertex attribute data is converted into 24-bit floating-point numbers, and the resulting three 32-bit segments of data are written to registers 0x0233, 0x0234, and 0x0235, in that order. This 24-bit floating-point data written to the three 32-bit segments is created in the same way as the data presented in the section titled 24-Bit Floating-Point Input Mode.
When vertex buffers are not used, none of the register settings described in 8.8.1.9. Vertex Attribute Array Setting Registers (0x0200 – 0x0227) are necessary. Dependencies that affect the order in which commands must be set are the same as when vertex buffers are used.
8.8.20. Geometry Shader Setting Registers (0x0280 – 0x02AF)
There are multiple vertex shader processors built into the GPU for vertex operations. One of these processors serves as the geometry shader processor when geometry shaders are in use. This processor is called the shared processor. Because the shared processor operates as a vertex shader processor when geometry shaders are not in use, the resources for floating-point registers, Boolean registers, and other registers are set using vertex shader values. These values set as vertex shader settings must be changed to geometry shader settings when you switch from not using geometry shaders to using them. Similarly, the values set as geometry shader settings must be changed to vertex shader settings when you switch from using geometry shaders to not using them.
Registers 0x02B0 through 0x02DF are used to set the vertex shader processors. When you set any of the registers in this range, those settings are applied to all of the vertex shader processors. Generally these settings are also applied to the shared processor, except when bit [0:0] of register 0x0244 has a value of 1. In that case, vertex shader processor settings are not applied to the shared processor. Conversely, if the same bit has a value of 0 and bits [1:0] of register 0x0229 are 0x0, the settings are applied to the shared processor.
To set the shared processor only, make settings just as you would in registers 0x02B0 through 0x02DF, but first subtract 0x30 from the register addresses. In other words, use registers 0x0280 through 0x02AF to set only the shared processor.
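The address relationship can be expressed as a small helper. This is a sketch only; shared_processor_register is a hypothetical name, not a library function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Maps a vertex shader processor register (0x02B0 through 0x02DF) to the
 * corresponding shared processor register (0x0280 through 0x02AF).
 * Returns false, leaving *shared_reg untouched, for any other address. */
static bool shared_processor_register(uint16_t vs_reg, uint16_t *shared_reg)
{
    if (vs_reg < 0x02B0u || vs_reg > 0x02DFu) {
        return false;
    }
    *shared_reg = (uint16_t)(vs_reg - 0x30u);
    return true;
}
```

For example, the Boolean register 0x02B0 maps to 0x0280, and the starting address setting register 0x02BA maps to 0x028A.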
Registers 0x0280 through 0x02AF have geometry shader–specific settings when geometry shaders are in use, but they must be set to the same settings as registers 0x02B0 through 0x02DF when geometry shaders are not in use. It is possible to meet this requirement by setting bit [0:0] of register 0x0244 equal to 0 and bits [1:0] of register 0x0229 equal to 0x0 (ensuring that vertex shader processor settings are also applied to the shared processor), and then resetting registers 0x02B0 through 0x02DF.
To use geometry shaders, you not only need to set these registers involved with the shared processor, but you must also set registers involved with other aspects, such as input and output.
8.8.20.1. Floating-Point Constant Registers (0x0290, 0x0291 – 0x0298)
The values to set in these registers are the same as those in 8.8.1.1. Floating-Point Constant Registers (0x02C0, 0x02C1 – 0x02C8). Only the register addresses are different.
Register 0x0290 corresponds to register 0x02C0 and registers 0x0291 through 0x0298 correspond to registers 0x02C1 through 0x02C8.
8.8.20.2. Boolean Register (0x0280)
The values to set in this register are the same as those in 8.8.1.2. Boolean Register (0x02B0). Only the register address is different.
Register 0x0280 corresponds to register 0x02B0.
8.8.20.3. Integer Registers (0x0281 – 0x0284)
The values to set in these registers are the same as those in 8.8.1.3. Integer Registers (0x02B1 – 0x02B4). Only the register addresses are different.
Registers 0x0281 through 0x0284 correspond to registers 0x02B1 through 0x02B4.
8.8.20.4. Program Code Setting Registers (0x028F, 0x029B – 0x02A3, 0x02A5 – 0x02AD)
The values to set in these registers are the same as those in 8.8.1.4. Program Code Setting Registers (0x02BF, 0x02CB – 0x02D3, 0x02D5 – 0x02DD). Only the register addresses are different.
Register 0x028F corresponds to register 0x02BF, registers 0x029B through 0x02A3 correspond to registers 0x02CB through 0x02D3, and registers 0x02A5 through 0x02AD correspond to registers 0x02D5 through 0x02DD.
8.8.20.5. Starting Address Setting Register (0x028A)
The values to set in this register are the same as those in 8.8.1.5. Starting Address Setting Register (0x02BA). Only the register address is different.
Register 0x028A corresponds to register 0x02BA.
8.8.20.6. Vertex Attribute Input Count Setting Register (0x0289)
The register that sets the number of vertex attributes to input to the geometry shader has the following layout.
Set the value of count to the number of vertex attributes to input minus 1. The same number of vertex attributes that are output from the vertex shader (including generic attributes) are input to the geometry shader.
The number of vertex attributes set here is not the number of #pragma output_map definitions in vertex shader assembly; it is the number of output registers defined by #pragma output_map. In other words, even if multiple #pragma output_map definitions are used for the individual components of a single output register, they still count as a single register.
8.8.20.7. Input Register Mapping Setting Registers (0x028B, 0x028C)
The values set in these registers are the same as those in 8.8.1.7. Input Register Mapping Setting Registers (0x02BB, 0x02BC). Only the register addresses are different. However, when a reserved geometry shader such as the line shader is used, set register 0x028B equal to 0x76543210 and set register 0x028C equal to 0xFEDCBA98.
Registers 0x028B and 0x028C correspond to registers 0x02BB and 0x02BC.
8.8.20.8. Output Register Use Count Setting Registers (0x004F, 0x025E)
The number of output registers used by geometry shaders is configured by some of the same registers described in 8.8.1.10. Output Register Use Count Setting Registers (0x004F, 0x024A, 0x0251, 0x025E).
Set count1 (register 0x004F) to the number of output registers to use, unchanged, and set count2 (register 0x025E only) to the number of output registers to use minus 1. Note that these register settings have different values and bit widths.
The number of output registers is the number of output registers defined by #pragma output_map for the geometry shader. In other words, even if multiple #pragma output_map definitions are used for the individual components of a single output register, they still count as a single register.
8.8.20.9. Output Register Mask Setting Register (0x028D)
The values to set in this register are the same as those in 8.8.1.11. Output Register Mask Setting Register (0x02BD). Only the register address is different.
Register 0x028D corresponds to register 0x02BD.
8.8.20.10. Output Register Attribute Setting Registers (0x0050 – 0x0056, 0x0064)
The values to set in these registers are the same as described in 8.8.1.12. Output Register Attribute Setting Registers (0x0050 – 0x0056, 0x0064), except for the fact that they modify the vertex attributes output by geometry shaders instead of those output by vertex shaders.
Geometry shader output attributes are determined by the #pragma output_map settings defined in the geometry shader assembly code. This information is generated in the map files output by the shader assembly code linker. Several geometry shaders define generic attributes as output attributes. Output attributes defined as generic attributes take the #pragma output_map settings that are defined only by the linked vertex shader. When this is done, the generic attributes defined by the vertex shader are excluded.
For more information about map files, see the Vertex Shader Reference.
8.8.20.11. Output Attribute Clock Control Register (0x006F)
The values to set in this register are the same as those in 8.8.1.13. Output Attribute Clock Control Register (0x006F), except for the fact that they modify geometry shaders instead of vertex shaders.
8.8.20.12. Other Setting Registers (0x0229, 0x0252, 0x0254, 0x0289)
The following table shows other registers to set when geometry shaders are used.
Register | Description |
---|---|
0x0229, bit [1:0] |
Set to Dummy commands are required before and after any command that sets this register. Use dummy commands with a byte enable setting of |
0x0229, bits [31:31] | Set to 0x1 when subdivision shaders (Loop or Catmull-Clark) are used, and set to 0x0 when other geometry shaders are used or when geometry shaders are not used. |
0x0252, bits [31:0] | Set to 0x00000001 when subdivision shaders (Loop or Catmull-Clark) are used, 0x01004302 when particle systems are used, and 0x00000000 when other geometry shaders are used or when geometry shaders are not used. |
0x0289, bits [31:24] | Set to 0x08 when geometry shaders are used. Set to 0xA0 when geometry shaders are not used. |
0x0289, bits [15:8] | Set to 0x01 when subdivision shaders (Loop or Catmull-Clark) or particle systems are used, and 0x00 when other geometry shaders are used or when geometry shaders are not used. |
0x0254, bits [4:0] | Set to 0x3 when Catmull-Clark subdivision shaders are used and 0x2 when Loop subdivision shaders are used. In all other cases these bits are unused. |
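The table above can be read as a lookup keyed on the kind of geometry shader in use. The following C sketch is one possible way to express that lookup. The names `GsKind`, `GsOtherRegs`, and `gs_other_regs` are hypothetical and introduced only for illustration, `GS_OTHER` is assumed to stand for the "other geometry shaders" case, and bits [1:0] of register 0x0229 are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of the geometry shader in use. */
typedef enum {
    GS_NONE,          /* geometry shaders are not used */
    GS_OTHER,         /* geometry shaders other than the kinds below */
    GS_PARTICLE,      /* particle system shaders */
    GS_LOOP,          /* Loop subdivision shaders */
    GS_CATMULL_CLARK  /* Catmull-Clark subdivision shaders */
} GsKind;

/* Field values taken from the table; each member holds one bit field. */
typedef struct {
    uint32_t reg0229_bit31;
    uint32_t reg0252;
    uint32_t reg0289_bits31_24;
    uint32_t reg0289_bits15_8;
    uint32_t reg0254_bits4_0;   /* unused unless a subdivision shader is used */
} GsOtherRegs;

GsOtherRegs gs_other_regs(GsKind kind)
{
    assert((int)kind >= (int)GS_NONE && (int)kind <= (int)GS_CATMULL_CLARK);

    int subdiv = (kind == GS_LOOP || kind == GS_CATMULL_CLARK);
    GsOtherRegs r;

    r.reg0229_bit31     = subdiv ? 0x1u : 0x0u;
    r.reg0252           = subdiv ? 0x00000001u
                        : (kind == GS_PARTICLE) ? 0x01004302u
                        : 0x00000000u;
    r.reg0289_bits31_24 = (kind == GS_NONE) ? 0xA0u : 0x08u;
    r.reg0289_bits15_8  = (subdiv || kind == GS_PARTICLE) ? 0x01u : 0x00u;
    r.reg0254_bits4_0   = (kind == GS_CATMULL_CLARK) ? 0x3u
                        : (kind == GS_LOOP) ? 0x2u
                        : 0x0u;
    return r;
}
```

Returning the individual bit fields rather than packed register words keeps the sketch independent of the other bits that share these registers.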
8.8.21. Framebuffer Cache Clear Setting Registers (0x0110, 0x0111)
The registers that handle settings for clearing the framebuffer cache have the following layout.
When a value of 1
is written to clearFrameBufferCacheData
, cache data is flushed for both the color and depth buffers. When a value of 1
is written to clearFrameBufferCacheTag
, cache tags are cleared for both the color and depth buffers. You must use the command that clears the cache tags immediately after you use the command that flushes the cache data. In other words, the cache must first be flushed, and then the cache tags must be cleared.
Commands are generated to set these registers when the glFlush
, glFinish
, or glClear
function is called, when the state flag NN_GX_STATE_FBACCESS
is validated, and when the state flag NN_GX_STATE_FRAMEBUFFER
is validated due to changes in the color or depth buffer addresses. Also, when the 3D Command Buffer is split by the nngxSplitDrawCmdlist
function or other means, these commands are inserted immediately before the command that issues interrupts.
A standalone command to flush cache data is generated immediately after the render start command caused by the glDrawArrays
or glDrawElements
function.
Both of these registers must be set when any of the following have happened: all rendering is complete (and the rendering results have not been accessed yet), the color or depth buffer has been cleared, either buffer's settings (size, address, or format) have changed, or the access pattern has changed.
Although you must normally generate a command to set clearFrameBufferCacheData
immediately after each render command, this is not necessary when the following two conditions are met.
- After you have generated a command to set clearFrameBufferCacheData immediately following a render command, you do not generate any commands that set registers 0x0100 through 0x0130 until the next render command.
- Before you set data in registers 0x0080 through 0x00B7 following a render command, you generate three dummy commands (whose data and byte enable are both 0) to register 0x0080.
To set data in registers 0x0100
through 0x0130
after a render command, you must first generate a single command that sets clearFrameBufferCacheData
. As long as you have generated at least one such command, you can safely generate any number of commands that set data in registers 0x0100
through 0x0130
before the next render command.
In the same way, as long as you have generated three dummy commands (whose data and byte enable are both 0
) to register 0x0080
after a render command, you can safely generate any number of commands that set registers 0x0080
through 0x00B7
before the next render command.
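The two rules above amount to a small piece of per-render bookkeeping. The following C sketch shows one way an application-side command generator might track them. It only decides which prerequisite is needed and does not encode any 3D commands. The names `PostRenderState`, `Prereq`, `post_render_reset`, and `post_render_prereq` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for the interval between two render commands. */
typedef struct {
    bool cache_data_flushed;  /* a clearFrameBufferCacheData command exists */
    bool dummies_written;     /* three dummy commands to 0x0080 exist */
} PostRenderState;

typedef enum {
    PREREQ_NONE,              /* the register can be written right away */
    PREREQ_FLUSH_CACHE_DATA,  /* first generate one clearFrameBufferCacheData command */
    PREREQ_THREE_DUMMY_0080   /* first generate three dummy commands to 0x0080 */
} Prereq;

/* Call whenever a render command is generated: both rules apply again. */
void post_render_reset(PostRenderState *s)
{
    assert(s != NULL);
    s->cache_data_flushed = false;
    s->dummies_written = false;
}

/* Returns what must be generated before a write to 'reg' and records it
   as done, so later writes in the same interval need nothing extra. */
Prereq post_render_prereq(PostRenderState *s, uint32_t reg)
{
    assert(s != NULL);
    if (reg >= 0x0100u && reg <= 0x0130u && !s->cache_data_flushed) {
        s->cache_data_flushed = true;
        return PREREQ_FLUSH_CACHE_DATA;
    }
    if (reg >= 0x0080u && reg <= 0x00B7u && !s->dummies_written) {
        s->dummies_written = true;
        return PREREQ_THREE_DUMMY_0080;
    }
    return PREREQ_NONE;
}
```

A caller would reset the state after each render command and query it before every register write. Only the first write into each range after a render command reports a prerequisite.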
8.8.22. Split Command Setting Register (0x0010)
Writing a value of 0x12345678
to register 0x0010
causes a GPU interrupt to occur. Set this command when splitting the 3D Command Buffer.
8.8.23. Command Buffer Execution Registers (0x0238 – 0x023D)
Command buffers are usually executed based on the information in queued render command requests. By using the command buffer execution registers, you can use 3D commands accumulated in one command buffer to start executing a different command buffer, without needing to issue a GPU interrupt via a split command.
Two channels have been prepared for the command buffer execution interface. Channel 0 is used for normal command buffer execution. You cannot execute both channels simultaneously.
To execute a command buffer, you need to set three registers: one for the address of the command buffer to execute, one for the size of said command buffer, and one that is the execution register (also known as the kick register).
In the register that sets the address of the command buffer you want to execute (BUFFER_ADDR_CHANNEL0
or BUFFER_ADDR_CHANNEL1
), specify a value that is the result of dividing the command buffer’s physical address by 8 (this value is the 8-byte address). You must specify an even number.
In the register that sets the size of the command buffer you want to execute (BUFFER_SIZE_CHANNEL0 or BUFFER_SIZE_CHANNEL1), specify a value that is the result of dividing the command buffer’s size in bytes by 8. You must specify an even number.
When a value is written to the kick register (KICK_CHANNEL0 or KICK_CHANNEL1), the command buffer is executed using the address and size settings for that channel. As long as the byte enable is not 0, the command buffer is executed regardless of what data is written to the kick register. Do not place a command that writes to the kick register anywhere in the command buffer except at the end. The command buffer address and size must both be 16-byte aligned (the values divided by 8 must not be odd), so the last address after storing a kick command (a command that writes to KICK_CHANNEL0 or KICK_CHANNEL1) must be 16-byte aligned. The address and size settings do not affect the currently executing command buffer until a kick command is executed.
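The divide-by-8 rule and the even-number rule can be checked together in one helper. The following C sketch assumes the caller already has a physical address or a size in bytes. The function name `cmdbuf_to_reg_value` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Converts a physical byte address or a byte size into the value written to
   an address or size register. Fails if the input is not 16-byte aligned,
   which is the same as requiring the divided value to be even. */
bool cmdbuf_to_reg_value(uint32_t bytes, uint32_t *reg_value)
{
    assert(reg_value != NULL);
    if ((bytes & 0xFu) != 0u)
        return false;           /* bytes / 8 would be odd or fractional */
    *reg_value = bytes >> 3;    /* 8-byte units */
    return true;
}
```

For example, a 0x40-byte command buffer yields a size value of 8, while a 0x18-byte buffer is rejected because 0x18 / 8 = 3 is odd.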
Notes
Pay attention to the following conditions when you use settings in the command buffer execution registers.
- Commands that write to the kick register (kick commands) must be placed at the end of the command buffer. Specify the command buffer size so that the kick command is last.
- You cannot write to the kick register in the middle of a burst access. If you write to the kick register at the end of the burst access, however, a kick command will be executed.
- Although the address and size settings for the two channels are maintained after a kick command is executed, the settings for channel 0 are overwritten when nngx functions or other functions execute 3D commands.
- The command buffer’s processing might not run correctly if you have not flushed the command buffer region you want to execute.
- Channel 0 and channel 1 cannot be executed simultaneously.
- Command execution is interrupted for the time between executing a kick command and executing the specified command in the command buffer. Consequently, frequently executing kick commands that specify a small command buffer can increase the GPU processing load by a noticeable degree.
8.8.23.1. Consecutive Execution of Command Buffers
As shown in the following figure, you can execute N command buffers in a row by storing commands that write to the command buffer execution registers at the end of your command buffers.
A split command is normally stored at the end of the command buffer, as shown in command buffer N above. By replacing this split command with three commands (one to set the size of the command buffer to execute next, one to set the address of said buffer, and one kick command), you can execute multiple command buffers without ever issuing a GPU interrupt, thus reducing the CPU load. However, a split command must be the last command that is executed in the last command buffer.
To execute the first command buffer, call the nngxAdd3DCommand
function with the command buffer’s address and size specified as the bufferaddr
and buffersize
parameters, respectively, and GL_FALSE
specified as the copycmd
parameter.
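The tail of each chained command buffer can be sketched as a list of register writes. The following C sketch produces the three writes that replace the split command, plus the split command for the last buffer. It does not show how the writes are encoded as 3D commands. The register addresses are assumptions inferred from the 0x0238 – 0x023D range of this section and from the order in which the registers are listed, and `RegWrite`, `chain_tail`, and `split_tail` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One register write, independent of its 3D command encoding. */
typedef struct {
    uint32_t reg;
    uint32_t value;
} RegWrite;

/* Assumed addresses (inferred, not quoted); channel 1 is assumed to be +1. */
enum {
    REG_SPLIT                = 0x0010,
    REG_BUFFER_SIZE_CHANNEL0 = 0x0238,
    REG_BUFFER_ADDR_CHANNEL0 = 0x023A,
    REG_KICK_CHANNEL0        = 0x023C
};

/* Tail of every buffer except the last: size, address, then the kick.
   next_addr is a physical byte address and next_size a byte count. */
void chain_tail(unsigned channel, uint32_t next_addr, uint32_t next_size,
                RegWrite out[3])
{
    assert(out != NULL && channel <= 1u);
    assert((next_addr & 0xFu) == 0u && (next_size & 0xFu) == 0u);

    out[0].reg   = REG_BUFFER_SIZE_CHANNEL0 + channel;
    out[0].value = next_size >> 3;
    out[1].reg   = REG_BUFFER_ADDR_CHANNEL0 + channel;
    out[1].value = next_addr >> 3;
    out[2].reg   = REG_KICK_CHANNEL0 + channel;
    out[2].value = 1u;              /* any value triggers the kick */
}

/* Tail of the last buffer: the split command that raises the GPU interrupt. */
RegWrite split_tail(void)
{
    RegWrite w = { REG_SPLIT, 0x12345678u };
    return w;
}
```

The kick write is kept last so that it ends up as the final command in the buffer. The caller is still responsible for making sure the buffer ends on a 16-byte boundary after that command.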
8.8.23.2. Repeated Execution of the Same Command Buffer
The example in the previous section requires you to provide a separate command buffer, even if you just want to repeatedly execute the same commands. By using the two available channels, however, you can repeatedly execute the same command buffer.
For example, let’s say that you write to the combination of registers listed in the table below. These registers are all consecutive, so you can formulate a single write command that writes to all of them at the same time: a jump command. But if you use a jump command, the command buffer that you jump to must end with a kick command for channel 1.
Register | Value to Set |
---|---|
BUFFER_SIZE_CHANNEL0 | Size of the command buffer to jump to. |
BUFFER_SIZE_CHANNEL1 | Size of the command buffer to return to (until the next jump command or split command). |
BUFFER_ADDR_CHANNEL0 | Address of the command buffer to jump to. |
BUFFER_ADDR_CHANNEL1 | Address of the command buffer to return to (this address must be located immediately after the jump command described by this table). |
KICK_CHANNEL0 | Any value (executes a kick command). |
This method allows you to perform operations such as the following.
- Repeatedly execute a specific rendering process.
- Edit information stored at the destination of the jump, and thereby branch the rendering process.
- Modularize the rendering process, and construct a scene with jump commands alone.
In Figure 8-61 below, command buffer 0
configures channel 1 as the command buffer to return to (meaning that channel 1 holds the address and size of the command buffer to execute next), and configures the address and size of command buffer A
on channel 0. After these settings have been configured, a channel-0 kick command executes command buffer A
, and then a channel-1 kick command at the end of command buffer A
returns execution to command buffer 1. Note that you must firmly decide ahead of time which channel you will use when returning from the jump because you cannot change the content of the command buffer that you jump to while you are executing it.
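As a rough illustration of the jump command described in this section, the following C sketch computes the five values written to the consecutive registers, in the order given by the table. It assumes that all addresses are physical byte addresses and all sizes are byte counts. The name `make_jump_values` is hypothetical, and the encoding of the values as a single write command is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { JUMP_REG_COUNT = 5 };

/* Fills 'out' in register order: BUFFER_SIZE_CHANNEL0, BUFFER_SIZE_CHANNEL1,
   BUFFER_ADDR_CHANNEL0, BUFFER_ADDR_CHANNEL1, KICK_CHANNEL0.
   return_addr must point immediately after the jump command itself. */
void make_jump_values(uint32_t target_addr, uint32_t target_size,
                      uint32_t return_addr, uint32_t return_size,
                      uint32_t out[JUMP_REG_COUNT])
{
    assert(out != NULL);
    /* Every address and size must be 16-byte aligned. */
    assert(((target_addr | target_size | return_addr | return_size) & 0xFu) == 0u);

    out[0] = target_size >> 3;  /* size of the buffer to jump to */
    out[1] = return_size >> 3;  /* size of the buffer to return to */
    out[2] = target_addr >> 3;  /* address of the buffer to jump to */
    out[3] = return_addr >> 3;  /* address of the buffer to return to */
    out[4] = 1u;                /* any value: executes the channel-0 kick */
}
```

The buffer at `target_addr` must itself end with a channel-1 kick command for execution to come back to `return_addr`.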
8.8.24. Settings for Undocumented Bits
Some of the registers described so far have undocumented bits. You must use a byte enable setting of 0
to avoid accessing some of these undocumented bits, and you must set others to fixed values. This section provides information about access to those bits. Although bits that are completely undocumented (mentioned neither in the preceding sections nor this section) can, in theory, be set to any value without affecting the GPU, we recommend that you use a byte enable setting of 0
whenever possible. Do not set any registers that are not documented.
Applications must not issue commands to initialize registers in which the following table instructs you to set a fixed value because nngxInitialize
initializes the registers with these values. If any byte in a single register contains both bits that must be set to fixed values and bits that can be changed, both types of values must be written at the same time.
Register | Description |
---|---|
0x0061, bits [31:8] | Set a byte enable of 0 to ensure no access. |
0x0062, bits [31:8] | Set a byte enable of 0 to ensure no access. |
0x006A, bits [31:24] | Set a byte enable of 0 to ensure no access. |
0x006E, bits [24:24] | Set to 1 . |
0x0080, bits [3:3] and bits [31:24] | Set to 0 . |
0x0080, bits [12:12] | Set to 1 . |
0x0080, bits [23:17] | Set to 0 when you write to bit [16:16] of this register to clear the texture cache. Otherwise, set a byte enable of 0 to ensure no access. |
0x0083, bits [17:16] | Set to 0 . |
0x0093, bits [17:16] | Set to 0 . |
0x009B, bits [17:16] | Set to 0 . |
0x00AC, bits [10:3] | Set to 0x60 . |
0x00AD, bits [31:8] | Set to 0xE0C080 . |
0x00E0, bits [25:24] | Set to 0 . |
0x0100, bits [25:16] | Set to 0x0E4 . |
0x0110, bits [31:1] | Set to 0 . |
0x0111, bits [31:1] | Set to 0 . |
0x011E, bits [24:24] | Set to 1 . |
0x01C0, bits [9:8] | Set to 0 . |
0x01C0, bits [19:18] | Set to 0 . |
0x01C0, bits [29:28] | Set to 0 . |
0x01C3, bits [11:8] | Set to 0x4 . |
0x01C3, bits [31:31] | Set to 1 . |
0x01C4, bits [18:18] | Set to 1 . |
0x0229, bits [9:9] | Set to 0 . |
0x0229, bits [23:16] | Set a byte enable of 0 to ensure no access. |
0x0244, bits [31:8] | Set a byte enable of 0 to ensure no access. |
0x0245, bits [7:1] | Set to 0 . |
0x0245, bits [31:8] | Set a byte enable of 0 to ensure no access. |
0x0253, bits [31:16] | Set a byte enable of 0 to ensure no access. |
0x025E, bits [16:16] | Set a byte enable of 0 to ensure no access. |
0x025F, bits [31:1] | Set to 0 . |
0x0280, bits [31:16] | Set to 0x7FFF . |
0x0289, bits [23:16] | Set a byte enable of 0 to ensure no access. |
0x028A, bits [31:16] | Set to 0x7FFF . |
0x028D, bits [31:16] | Set to 0x0000 . |
0x02B0, bits [31:16] | Set to 0x7FFF . |
0x02B9, bits [15:8] | Set to 0x00 . |
0x02B9, bits [23:16] | Set a byte enable of 0 to ensure no access. |
0x02B9, bits [31:24] | Set to 0xA0 . |
0x02BA, bits [31:16] | Set to 0x7FFF . |
0x02BD, bits [31:16] | Set to 0x0000 . |
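When a byte holds both fixed bits and changeable bits, the register value has to be assembled so that both are written together. The following C sketch shows one way to merge a fixed field from the table into a value. The helper name `with_fixed_bits` is hypothetical, and the sketch does not choose the byte enable for the command.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 'value' with bits [hi:lo] replaced by 'fixed'. */
uint32_t with_fixed_bits(uint32_t value, unsigned hi, unsigned lo, uint32_t fixed)
{
    assert(hi < 32u && lo <= hi);

    uint32_t width = hi - lo + 1u;
    uint32_t mask  = (width == 32u) ? 0xFFFFFFFFu
                                    : (((1u << width) - 1u) << lo);
    return (value & ~mask) | ((fixed << lo) & mask);
}
```

For example, `with_fixed_bits(v, 25, 16, 0x0E4)` applies the fixed value listed for register 0x0100 while preserving whatever the application placed in the remaining bits of `v`.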
8.8.25. Register Settings When Geometry Shaders Are Used
This section explains which values to set in the registers described by 8.8.20. Geometry Shader Setting Registers (0x0280 – 0x02AF) when you use the geometry shaders provided by the SDK.
8.8.25.1. Point Shaders
The following table shows the register values to set when using point shaders.
Register | Description |
---|---|
0x004F, bits [2:0] | Set to the number of output registers defined by the linked vertex shader. This count does not include generic attributes. |
0x0050 – 0x0056 |
Set these, starting from Set registers For example, because a point sprite's vertex coordinates are assumed to be followed by texture coordinates, for attributes defined by If any attributes are unused, that portion is packed bytewise with |
0x0064 | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x006F | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x0229, bits [31:31] | Set to 0 . |
0x0242, bits [3:0] | Set to one less than the number of vertex attributes to input to the linked vertex shader. |
0x024A, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count includes generic attributes. |
0x0251, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count includes generic attributes. |
0x0252 | Set to 0x00000000 . |
0x0254, bits [4:0] | No setting required. |
0x025E, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count does not include generic attributes. |
0x0280, bits [15:0] | Set to 0x0000 . |
0x0281, bits [23:0] | No setting required. |
0x0282, bits [23:0] | No setting required. |
0x0283, bits [23:0] | No setting required. |
0x0284, bits [23:0] | No setting required. |
0x0289, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count includes generic attributes. |
0x0289, bits [15:8] | Set to 0x00 . |
0x0289, bits [31:24] | Set to 0x08 . |
0x028D, bits [15:0] | Set to ((1 << N) - 1), where N is the number of output registers defined by the linked vertex shader. This count does not include generic attributes. |
0x0290 – 0x0293 |
Write the following combinations to registers {
|
When you use point shaders, you must also set the reserved uniforms that are assigned to the specific registers shown below.
Reserved Uniform | Allocated Register |
---|---|
dmp_Point.viewport | c67.xy |
dmp_Point.distanceAttenuation | b0 |
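Several rows of the point shader table are derived from just two counts: the number of vertex shader output registers without generic attributes and the number including them. The following C sketch computes those derived fields. The names `PointShaderCounts` and `point_shader_counts` are hypothetical, and only the count-based rows of the table are covered.

```c
#include <assert.h>
#include <stdint.h>

/* Count-derived bit fields from the point shader table. */
typedef struct {
    uint32_t reg004F_bits2_0;   /* N                      */
    uint32_t reg024A_bits3_0;   /* M - 1                  */
    uint32_t reg0251_bits3_0;   /* M - 1                  */
    uint32_t reg025E_bits3_0;   /* N - 1                  */
    uint32_t reg0289_bits3_0;   /* M - 1                  */
    uint32_t reg028D_bits15_0;  /* (1 << N) - 1           */
} PointShaderCounts;

/* n_without_generic (N): output registers excluding generic attributes.
   n_with_generic    (M): output registers including generic attributes. */
PointShaderCounts point_shader_counts(unsigned n_without_generic,
                                      unsigned n_with_generic)
{
    assert(n_without_generic >= 1u && n_without_generic <= 7u);
    assert(n_with_generic >= n_without_generic && n_with_generic <= 16u);

    PointShaderCounts c;
    c.reg004F_bits2_0  = n_without_generic;
    c.reg024A_bits3_0  = n_with_generic - 1u;
    c.reg0251_bits3_0  = n_with_generic - 1u;
    c.reg025E_bits3_0  = n_without_generic - 1u;
    c.reg0289_bits3_0  = n_with_generic - 1u;
    c.reg028D_bits15_0 = (1u << n_without_generic) - 1u;
    return c;
}
```

With three output registers plus one generic attribute (N = 3, M = 4), the mask for register 0x028D is 0x0007 and the fields that include generic attributes are all set to 3.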
8.8.25.2. Line Shaders
The following table shows the register values to set when line shaders are used.
Register | Description |
---|---|
0x004F, bits [2:0] | Set to the number of output registers defined by the linked vertex shader. |
0x0050 – 0x0056 |
Set these, starting from Set registers If any attributes are unused, that portion is packed bytewise with |
0x0064 | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x006F | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x0229, bits [31:31] | Set to 0 . |
0x0242, bits [3:0] | Set to one less than the number of vertex attributes to input to the linked vertex shader. |
0x024A, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0251, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0252 | Set to 0x00000000 . |
0x0254, bits [4:0] | No setting required. |
0x025E, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0280, bits [15:0] | Set to 0x0000. Bit [15:15] must be set each time rendering is performed. |
0x0281, bits [23:0] | No setting required. |
0x0282, bits [23:0] | No setting required. |
0x0283, bits [23:0] | No setting required. |
0x0284, bits [23:0] | No setting required. |
0x0289, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0289, bits [15:8] | Set to 0x00 . |
0x0289, bits [31:24] | Set to 0x08 . |
0x028D, bits [15:0] | Set to ((1 << N) - 1), where N is the number of output registers defined by the linked vertex shader. |
0x0290 – 0x0293 |
Write the following combinations to registers {
|
When you use line shaders, you must also set the reserved uniforms that are assigned to the specific registers shown below.
Reserved Uniform | Allocated Register |
---|---|
dmp_Line.width | c67.xyzw |
8.8.25.3. Silhouette Shaders
The following table shows the register values to set when using silhouette shaders.
Register | Description |
---|---|
0x004F, bits [2:0] | Set to 0x2 . |
0x0050 – 0x0056 | Set register 0x0050 to 0x03020100 . Set register 0x0051 to 0x0B0A0908 . Set registers 0x0052 through 0x0056 to 0x1F1F1F1F . |
0x0064 | Set to 0x00000000 . |
0x006F | Set to 0x00000003 . |
0x0229, bits [31:31] | Set to 0 . |
0x0242, bits [3:0] | Set to one less than the number of vertex attributes to input to the linked vertex shader. |
0x024A, bits [3:0] | Set to 0x2 . |
0x0251, bits [3:0] | Set to 0x2 . |
0x0252 | Set to 0x00000000 . |
0x0254, bits [4:0] | No setting required. |
0x025E, bits [3:0] | Set to 0x1 . |
0x0280, bits [15:0] | Set to 0x0000. Bit [15:15] must be set each time rendering is performed. |
0x0281, bits [23:0] | No setting required. |
0x0282, bits [23:0] | No setting required. |
0x0283, bits [23:0] | No setting required. |
0x0284, bits [23:0] | No setting required. |
0x0289, bits [3:0] | Set to 0x2 because the vertex shader outputs three vertex attributes: vertex coordinates, colors, and normals. |
0x0289, bits [15:8] | Set to 0x00 . |
0x0289, bits [31:24] | Set to 0x08 . |
0x028D, bits [15:0] | Set to 0x0003 . |
0x0290 – 0x0293 |
Write the following combinations to registers { { |
When you use silhouette shaders, you must also set the reserved uniforms that are assigned to the specific registers shown below.
Reserved Uniform | Allocated Register |
---|---|
dmp_Silhouette.width | c71.xy |
dmp_Silhouette.openEdgeDepthBias | c71.z |
dmp_Silhouette.color | c72.xyzw |
dmp_Silhouette.openEdgeColor | c73.xyzw |
dmp_Silhouette.openEdgeWidth | c74.xyzw |
dmp_Silhouette.acceptEmptyTriangles | b0 |
dmp_Silhouette.scaleByW | b1 |
dmp_Silhouette.frontFaceCCW | b2 |
dmp_Silhouette.openEdgeWidthScaleByW | b3 |
dmp_Silhouette.openEdgeDepthBiasScaleByW | b4 |
8.8.25.4. Catmull-Clark Subdivision Shaders
The following table shows the register values to set when using Catmull-Clark subdivision shaders.
Register | Description |
---|---|
0x004F, bits [2:0] | Set to the number of output registers defined by the linked vertex shader. |
0x0050 – 0x0056 |
Set these, starting from Set registers If any attributes are unused, that portion is packed bytewise with |
0x0064 | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x006F | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x0229, bits [31:31] | Set to 1 . |
0x0242, bits [3:0] | Set to one less than the number of vertex attributes to input to the linked vertex shader. |
0x024A, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0251, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0252 | Set to 0x00000001 . |
0x0254, bits [4:0] | Set to 0x03 . |
0x025E, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0280, bits [15:0] | Set to 0x0000. Bit [15:15] must be set each time rendering is performed. |
0x0281, bits [23:0] | No setting required. |
0x0282, bits [23:0] |
The value to set varies with the linked geometry shader's program object file. Set |
0x0283, bits [23:0] | No setting required. |
0x0284, bits [23:0] | No setting required. |
0x0289, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. |
0x0289, bits [15:8] | Set to 0x01. |
0x0289, bits [31:24] | Set to 0x08 . |
0x028D, bits [15:0] | Set to ((1 << N) - 1), where N is the number of output registers defined by the linked vertex shader. |
0x0290 – 0x0293 |
Write the following combinations to registers {
Set Set Set Set Set Set |
When you use Catmull-Clark subdivision shaders, you must also set the reserved uniforms that are assigned to the specific registers shown below.
Reserved Uniform | Allocated Register |
---|---|
dmp_Subdivision.level | c74.x |
dmp_Subdivision.fragmentLightingEnabled | b2 |
8.8.25.5. Loop Subdivision Shaders
The following table shows the register values to set when Loop subdivision shaders are used.
Register | Description |
---|---|
0x004F, bits [2:0] | Set to the number of output registers defined by the linked vertex shader. This count does not include generic attributes. |
0x0050 – 0x0056 |
Set these, starting from Set registers If any attributes are unused, that portion is packed bytewise with |
0x0064 | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x006F | Set in accordance with the output register attributes defined by the linked vertex shader. |
0x0229, bits [31:31] | Set to 1 . |
0x0242, bits [3:0] | Set to one less than the number of vertex attributes to input to the linked vertex shader. |
0x024A, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count includes generic attributes. |
0x0251, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count includes generic attributes. |
0x0252 | Set to 0x00000001 . |
0x0254, bits [4:0] | Set to 0x02 . |
0x025E, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count does not include generic attributes. |
0x0280, bits [15:0] | Set to 0x0000. Bit [15:15] must be set each time rendering is performed. |
0x0281, bits [23:0] | No setting required. |
0x0282, bits [23:0] | No setting required. |
0x0283, bits [23:0] | No setting required. |
0x0284, bits [23:0] | No setting required. |
0x0289, bits [3:0] | Set to one less than the number of output registers defined by the linked vertex shader. This count includes generic attributes. |
0x0289, bits [15:8] | Set to 0x01. |
0x0289, bits [31:24] | Set to 0x08 . |
0x028D, bits [15:0] | Set to ((1 << N) - 1), where N is the number of output registers defined by the linked vertex shader. This count does not include generic attributes. |
0x0290 – 0x0293 |
Write the following combinations to registers {
|
When you use Loop subdivision shaders, you must also set the reserved uniforms that are assigned to the specific registers shown below.
Reserved Uniform | Allocated Register |
---|---|
dmp_Subdivision.level | c86.x |
dmp_Subdivision.fragmentLightingEnabled | b0 |
8.8.25.6. Particle System Shaders
The following table shows the register values to set when using particle system shaders.
Register | Description |
---|---|
0x004F, bits [2:0] | Set to 0x3 . |
0x0050 – 0x0056 |
Set register Set registers |
0x0064 | Set to 0x00000000 . |
0x006F | Set to 0x00000503 when texture coordinate 2 is used or 0x00000103 otherwise. |
0x0229, bits [31:31] | Set to 0 . |
0x0242, bits [3:0] | Set to one less than the number of vertex attributes to input to the linked vertex shader. |
0x024A, bits [3:0] | Set to 0x4 . |
0x0251, bits [3:0] | Set to 0x4 . |
0x0252 | Set to 0x01004302 . |
0x0254, bits [4:0] | No setting required. |
0x025E, bits [3:0] | Set to 0x2 . |
0x0280, bits [15:0] | Set to 0x0000 . |
0x0281, bits [23:0] | No setting required. |
0x0282, bits [23:0] | No setting required. |
0x0283, bits [23:0] | No setting required. |
0x0284, bits [23:0] | Set to 0x0100FE . |
0x0289, bits [3:0] | Set to 0x4 because the vertex shader outputs a total of five vertex attributes: the vertex coordinates and the four bounding-box sizes for the control points. |
0x0289, bits [15:8] | Set to 0x01. |
0x0289, bits [31:24] | Set to 0x08 . |
0x028D, bits [15:0] | Set to 0x0007 . |
0x0290 – 0x0293 |
Write the following combinations to registers {
|
When you use particle system shaders, you must also set the reserved uniforms that are assigned to the specific registers shown below.
Reserved Uniform | Allocated Register |
---|---|
dmp_PartSys.color | c26.xyzw – c29.xyzw |
dmp_PartSys.viewport | c30.xy |
dmp_PartSys.pointSize | c31.xy |
dmp_PartSys.time | c31.z |
dmp_PartSys.speed | c31.w |
dmp_PartSys.distanceAttenuation | c32.xyz |
dmp_PartSys.countMax | c32.w |
dmp_PartSys.randSeed | c33.xyzw |
dmp_PartSys.aspect | c34.xyzw – c37.xyzw |
dmp_PartSys.randomCore | c38.xyzw |
8.9. Format Conversions for Values Set to Registers
Some of the values set to registers undergo internal format conversion from the values set by the application. Most of these involve conversion from 32-bit floating-point numbers.
8.9.1. Conversion to a 24-Bit Floating-Point Number
The following code converts a 32-bit floating-point number into a 24-bit floating-point number (with a 1-bit sign, 7-bit exponent, and 16-bit significand). When you pass a 32-bit floating-point number into _inarg
, a 24-bit floating-point number is stored as an unsigned int
value in _outarg
.
#define UTL_F2F_16M7E(_inarg, _outarg) \
{ \
    unsigned uval_, m_; \
    int e_; \
    float f_; \
    static const int bias_ = 128 - (1 << (7 - 1)); \
    f_ = (_inarg); \
    uval_ = *(unsigned*)&f_; \
    e_ = (uval_ & 0x7fffffff) ? (((uval_ >> 23) & 0xff) - bias_) : 0; \
    m_ = (uval_ & 0x7fffff) >> (23 - 16); \
    if (e_ >= 0) \
        _outarg = m_ | (e_ << 16) | ((uval_ >> 31) << (16 + 7)); \
    else \
        _outarg = ((uval_ >> 31) << (16 + 7)); \
}
8.9.2. Conversion to a 16-Bit Floating-Point Number
The following code converts a 32-bit floating-point number into a 16-bit floating-point number (with a 1-bit sign, 5-bit exponent, and 10-bit significand). When you pass a 32-bit floating-point number into _inarg
, a 16-bit floating-point number is stored as an unsigned int
value in _outarg
.
#define UTL_F2F_10M5E(_inarg, _outarg) \
{ \
    unsigned uval_, m_; \
    int e_; \
    float f_; \
    static const int bias_ = 128 - (1 << (5 - 1)); \
    f_ = (_inarg); \
    uval_ = *(unsigned*)&f_; \
    e_ = (uval_ & 0x7fffffff) ? (((uval_ >> 23) & 0xff) - bias_) : 0; \
    m_ = (uval_ & 0x7fffff) >> (23 - 10); \
    if (e_ >= 0) \
        _outarg = m_ | (e_ << 10) | ((uval_ >> 31) << (10 + 5)); \
    else \
        _outarg = ((uval_ >> 31) << (10 + 5)); \
}
8.9.3. Conversion to a 31-Bit Floating-Point Number
The following code converts a 32-bit floating-point number into a 31-bit floating-point number (with a 1-bit sign, 7-bit exponent, and 23-bit significand). When you pass a 32-bit floating-point number into _inarg
, a 31-bit floating-point number is stored as an unsigned int
value in _outarg
.
#define UTL_F2F_23M7E(_inarg, _outarg) \
{ \
    unsigned uval_, m_; \
    int e_; \
    float f_; \
    static const int bias_ = 128 - (1 << (7 - 1)); \
    f_ = (_inarg); \
    uval_ = *(unsigned*)&f_; \
    e_ = (uval_ & 0x7fffffff) ? (((uval_ >> 23) & 0xff) - bias_) : 0; \
    m_ = (uval_ & 0x7fffff) >> (23 - 23); \
    if (e_ >= 0) \
        _outarg = m_ | (e_ << 23) | ((uval_ >> 31) << (23 + 7)); \
    else \
        _outarg = ((uval_ >> 31) << (23 + 7)); \
}
8.9.4. Conversion to a 20-Bit Floating-Point Number
The following code converts a 32-bit floating-point number into a 20-bit floating-point number (with a 1-bit sign, 7-bit exponent, and 12-bit significand). When you pass a 32-bit floating-point number into _inarg
, a 20-bit floating-point number is stored as an unsigned int
value in _outarg
.
#define UTL_F2F_12M_7E(_inarg, _outarg) \
{ \
    unsigned uval_, m_; \
    int e_; \
    float f_; \
    static const int bias_ = 128 - (1 << (7 - 1)); \
    f_ = (_inarg); \
    uval_ = *(unsigned*)&f_; \
    e_ = (uval_ & 0x7fffffff) ? (((uval_ >> 23) & 0xff) - bias_) : 0; \
    m_ = (uval_ & 0x7fffff) >> (23 - 12); \
    if (e_ >= 0) \
        _outarg = m_ | (e_ << 12) | ((uval_ >> 31) << (12 + 7)); \
    else \
        _outarg = ((uval_ >> 31) << (12 + 7)); \
}
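The four floating-point macros in sections 8.9.1 through 8.9.4 differ only in the significand and exponent widths. The following C sketch is a function-style restatement of the same steps with those widths as parameters. It is not part of the SDK: the name `f32_to_packed_float` is hypothetical, and `memcpy` is used in place of the pointer cast to read the bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Packs a 32-bit float into 1 sign bit, exp_bits exponent bits, and
   mant_bits significand bits, following the same steps as the macros. */
uint32_t f32_to_packed_float(float f, unsigned mant_bits, unsigned exp_bits)
{
    assert(mant_bits >= 1u && mant_bits <= 23u);
    assert(exp_bits >= 2u && exp_bits <= 8u);

    uint32_t u;
    memcpy(&u, &f, sizeof u);                  /* raw IEEE 754 bit pattern */

    /* Difference between the source bias (127) and the target bias. */
    const int bias = 128 - (1 << (exp_bits - 1u));

    /* Zero keeps exponent 0; otherwise rebias the 8-bit exponent. */
    int e = (u & 0x7fffffffu) ? (int)((u >> 23) & 0xffu) - bias : 0;

    /* Keep the top mant_bits of the 23-bit significand (truncation). */
    uint32_t m    = (u & 0x7fffffu) >> (23u - mant_bits);
    uint32_t sign = (u >> 31) << (mant_bits + exp_bits);

    /* Exponents below the target range collapse to a signed zero. */
    return (e >= 0) ? (m | ((uint32_t)e << mant_bits) | sign) : sign;
}
```

With this sketch, `f32_to_packed_float(1.0f, 16, 7)` gives 0x3F0000 and `f32_to_packed_float(1.0f, 10, 5)` gives 0x3C00. Like the macros, it truncates rather than rounds and does not clamp exponents that are too large for the target format.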
8.9.5. Conversion to an 8-Bit Signed Fixed-Point Number With 7 Fractional Bits
The following code converts a 32-bit floating-point number into an 8-bit signed fixed-point number with 7 fractional bits. Negative values are represented in two's complement. If you pass a 32-bit floating-point number to _inarg, an 8-bit fixed-point number is stored in _outarg.
#define UTL_F2FX_8W_1I_T(_inarg, _outarg) \
{ \
    float f_; \
    unsigned v_; \
    f_ = (_inarg); \
    v_ = *(unsigned*)&f_; \
    if (f_ == 0.f || (v_ & 0x7f800000) == 0x7f800000) \
        _outarg = 0; \
    else \
    { \
        f_ += 0.5f * (1 << 1); \
        f_ *= 1 << (8 - 1); \
        if (f_ < 0) \
            f_ = 0; \
        else if (f_ >= (1 << 8)) \
            f_ = (1 << 8) - 1; \
        if (f_ >= (1 << (8 - 1))) \
            _outarg = (unsigned)(f_ - (1 << (8 - 1))); \
        else \
            _outarg = (unsigned)(f_ + (1 << (8 - 1))); \
    } \
}
8.9.6. Conversion to a 12-Bit Signed Fixed-Point Number With 11 Fractional Bits
The following code converts a 32-bit floating-point number into a 12-bit signed fixed-point number with 11 fractional bits. Because the fractional bits are absolute values, negative values are not represented in two's complement. If you pass a 32-bit floating-point number to _inarg
, a 12-bit fixed-point number is stored in _outarg
.
#define UTL_F2FX_12W_1I_F(_inarg, _outarg) \
{ \
    float f_; \
    unsigned v_; \
    f_ = (_inarg); \
    v_ = *(unsigned*)&f_; \
    if (f_ == 0.f || (v_ & 0x7f800000) == 0x7f800000) \
        _outarg = 0; \
    else \
    { \
        f_ *= (1 << (12 - 1)); \
        if (f_ < 0) \
        { \
            _outarg = 1 << (12 - 1); \
            f_ = -f_; \
        } \
        else \
            _outarg = 0; \
        if (f_ >= (1 << (12 - 1))) f_ = (1 << (12 - 1)) - 1; \
        _outarg |= (unsigned)(f_); \
    } \
}
8.9.7. Conversion to a 12-Bit Signed Fixed-Point Number With 11 Fractional Bits
The following code converts a 32-bit floating-point number into a 12-bit signed fixed-point number with 11 fractional bits. Negative values are represented in two's complement. If you pass a 32-bit floating-point number to _inarg
, a 12-bit fixed-point number is stored in _outarg
.
#define UTL_F2FX_12W_1I_T(_inarg, _outarg) \
{ \
    float f_; \
    unsigned v_; \
    f_ = (_inarg); \
    v_ = *(unsigned*)&f_; \
    if (f_ == 0.f || (v_ & 0x7f800000) == 0x7f800000) \
        _outarg = 0; \
    else \
    { \
        f_ += 0.5f * (1 << 1); \
        f_ *= 1 << (12 - 1); \
        if (f_ < 0) \
            f_ = 0; \
        else if (f_ >= (1 << 12)) \
            f_ = (1 << 12) - 1; \
        if (f_ >= (1 << (12 - 1))) \
            _outarg = (unsigned)(f_ - (1 << (12 - 1))); \
        else \
            _outarg = (unsigned)(f_ + (1 << (12 - 1))); \
    } \
}
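Sections 8.9.6 and 8.9.7 share a title but differ in how negative values are encoded. The following C sketch restates both 12-bit conversions as functions so that the difference can be compared directly. The function names are hypothetical, and `memcpy` replaces the pointer cast used by the macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if x is infinity or NaN (all exponent bits set). */
static int is_inf_or_nan(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (bits & 0x7f800000u) == 0x7f800000u;
}

/* 8.9.6 style: bit 11 is the sign, bits [10:0] hold the absolute value. */
uint32_t f32_to_fx12_sign_magnitude(float x)
{
    if (x == 0.0f || is_inf_or_nan(x))
        return 0u;
    float f = x * 2048.0f;              /* scale by 2^11 */
    uint32_t out = 0u;
    if (f < 0.0f) {
        out = 0x800u;                   /* sign bit */
        f = -f;
    }
    if (f >= 2048.0f)
        f = 2047.0f;                    /* clamp the magnitude */
    return out | (uint32_t)f;
}

/* 8.9.7 style: negative values are stored in two's complement. */
uint32_t f32_to_fx12_twos_complement(float x)
{
    if (x == 0.0f || is_inf_or_nan(x))
        return 0u;
    float f = (x + 1.0f) * 2048.0f;     /* shift [-1, 1) to [0, 4096) */
    if (f < 0.0f)
        f = 0.0f;
    else if (f >= 4096.0f)
        f = 4095.0f;
    /* Undo the shift modulo 2^12 so that negatives wrap around. */
    return (f >= 2048.0f) ? (uint32_t)(f - 2048.0f)
                          : (uint32_t)(f + 2048.0f);
}
```

Both functions map 0.25 to 0x200, but -0.25 becomes 0xA00 in the sign-and-magnitude form and 0xE00 in the two's complement form.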
8.9.8. Conversion to a 13-Bit Signed Fixed-Point Number With 8 Fractional Bits
The following code converts a 32-bit floating-point number into a 13-bit signed fixed-point number with 8 fractional bits. Negative values are represented in two's complement. If you pass a 32-bit floating-point number to _inarg, a 13-bit fixed-point number is stored in _outarg.
#define UTL_F2FX_13W_5I_T(_inarg, _outarg) \
{ \
    float f_; \
    unsigned v_; \
    f_ = (_inarg); \
    v_ = *(unsigned*)&f_; \
    if (f_ == 0.f || (v_ & 0x7f800000) == 0x7f800000) \
        _outarg = 0; \
    else \
    { \
        f_ += 0.5f * (1 << 5); \
        f_ *= 1 << (13 - 5); \
        if (f_ < 0) \
            f_ = 0; \
        else if (f_ >= (1 << 13)) \
            f_ = (1 << 13) - 1; \
        if (f_ >= (1 << (13 - 1))) \
            _outarg = (unsigned)(f_ - (1 << (13 - 1))); \
        else \
            _outarg = (unsigned)(f_ + (1 << (13 - 1))); \
    } \
}
8.9.9. Conversion to a 13-Bit Signed Fixed-Point Number With 11 Fractional Bits
The following code converts a 32-bit floating-point number into a 13-bit signed fixed-point number with 11 fractional bits. Negative values are represented in two's complement. If you pass a 32-bit floating-point number to _inarg, a 13-bit fixed-point number is stored in _outarg.
#define UTL_F2FX_13W_2I_T(_inarg, _outarg) \
{ \
    float f_; \
    unsigned v_; \
    f_ = (_inarg); \
    v_ = *(unsigned*)&f_; \
    if (f_ == 0.f || (v_ & 0x7f800000) == 0x7f800000) \
        _outarg = 0; \
    else \
    { \
        f_ += 0.5f * (1 << 2); \
        f_ *= 1 << (13 - 2); \
        if (f_ < 0) \
            f_ = 0; \
        else if (f_ >= (1 << 13)) \
            f_ = (1 << 13) - 1; \
        if (f_ >= (1 << (13 - 1))) \
            _outarg = (unsigned)(f_ - (1 << (13 - 1))); \
        else \
            _outarg = (unsigned)(f_ + (1 << (13 - 1))); \
    } \
}
8.9.10. Conversion to a 16-Bit Signed Fixed-Point Number With 12 Fractional Bits
The following code converts a 32-bit floating-point number into a 16-bit signed fixed-point number with 12 fractional bits. Negative values are represented in two's complement. If you pass a 32-bit floating-point number to _inarg, a 16-bit fixed-point number is stored in _outarg.
#define UTL_F2FX_16W_4I_T(_inarg, _outarg) \
{ \
    float f_; \
    unsigned v_; \
    f_ = (_inarg); \
    v_ = *(unsigned*)&f_; \
    if (f_ == 0.f || (v_ & 0x7f800000) == 0x7f800000) \
        _outarg = 0; \
    else \
    { \
        f_ += 0.5f * (1 << 4); \
        f_ *= 1 << (16 - 4); \
        if (f_ < 0) \
            f_ = 0; \
        else if (f_ >= (1 << 16)) \
            f_ = (1 << 16) - 1; \
        if (f_ >= (1 << (16 - 1))) \
            _outarg = (unsigned)(f_ - (1 << (16 - 1))); \
        else \
            _outarg = (unsigned)(f_ + (1 << (16 - 1))); \
    } \
}
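The two's complement macros (the names ending in _T) differ only in the total bit width and the number of integer bits. The following is a minimal sketch of that shared pattern as one function. The function names are hypothetical and not part of the SDK, and the code assumes that float is a 32-bit IEEE 754 type. The input is first biased so that the smallest representable value maps to 0, then scaled, clamped, and finally shifted by half the range to produce the two's complement bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the SDK): generic form of the
   UTL_F2FX_<width>W_<int_bits>I_T macros. */
static unsigned f2fx_twos(float f, int width, int int_bits)
{
    uint32_t bits;
    const float half_range = (float)(1u << (width - 1));   /* 2^(width-1) */
    const float full_range = (float)(1u << width);         /* 2^width */

    memcpy(&bits, &f, sizeof bits);            /* read the float's bit pattern */
    if (f == 0.0f || (bits & 0x7f800000u) == 0x7f800000u)
        return 0;                              /* zero, infinity, and NaN give 0 */

    f += 0.5f * (float)(1 << int_bits);        /* bias: the minimum value maps to 0 */
    f *= (float)(1 << (width - int_bits));     /* scale by 2^(fractional bits) */
    if (f < 0.0f)
        f = 0.0f;                              /* clamp below the range */
    else if (f >= full_range)
        f = full_range - 1.0f;                 /* clamp above the range */

    if (f >= half_range)
        return (unsigned)(f - half_range);     /* inputs >= 0: sign bit clear */
    return (unsigned)(f + half_range);         /* inputs < 0: sign bit set */
}

/* Worked values that can be verified by hand from the steps above. */
static void f2fx_twos_examples(void)
{
    assert(f2fx_twos(0.5f, 8, 1)   == 0x40u);    /* like UTL_F2FX_8W_1I_T */
    assert(f2fx_twos(-0.5f, 8, 1)  == 0xC0u);    /* 256 - 64 */
    assert(f2fx_twos(-1.0f, 8, 1)  == 0x80u);    /* most negative value */
    assert(f2fx_twos(1.0f, 8, 1)   == 0x7Fu);    /* clamped */
    assert(f2fx_twos(1.5f, 13, 5)  == 0x180u);   /* like UTL_F2FX_13W_5I_T */
    assert(f2fx_twos(-0.5f, 13, 2) == 0x1C00u);  /* like UTL_F2FX_13W_2I_T */
    assert(f2fx_twos(1.0f, 16, 4)  == 0x1000u);  /* like UTL_F2FX_16W_4I_T */
    assert(f2fx_twos(-1.0f, 16, 4) == 0xF000u);
}
```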
8.9.11. Conversion to an 8-Bit Unsigned Fixed-Point Number With 0 Fractional Bits
The following code converts a 32-bit floating-point number into an 8-bit unsigned fixed-point number with 0 fractional bits. If you pass a 32-bit floating-point number to _inarg, an 8-bit fixed-point number is stored in _outarg.
#define UTL_F2UFX_8W_8I(_inarg, _outarg) \
{ \
    float f_ = (_inarg); \
    unsigned val_; \
    unsigned v_ = *(unsigned*)&f_; \
    if (f_ <= 0 || (v_ & 0x7f800000) == 0x7f800000) \
        val_ = 0; \
    else \
    { \
        f_ *= 1 << (8 - 8); \
        if (f_ >= (1 << 8)) \
            val_ = (1 << 8) - 1; \
        else \
            val_ = (unsigned)(f_); \
    } \
    (_outarg) = val_; \
}
8.9.12. Conversion to an 11-Bit Unsigned Fixed-Point Number With 11 Fractional Bits
The following code converts a 32-bit floating-point number into an 11-bit unsigned fixed-point number with 11 fractional bits. If you pass a 32-bit floating-point number to _inarg, an 11-bit fixed-point number is stored in _outarg.
#define UTL_F2UFX_11W_0I(_inarg, _outarg) \
{ \
    float f_ = (_inarg); \
    unsigned val_; \
    unsigned v_ = *(unsigned*)&f_; \
    if (f_ <= 0 || (v_ & 0x7f800000) == 0x7f800000) \
        val_ = 0; \
    else \
    { \
        f_ *= 1 << (11 - 0); \
        if (f_ >= (1 << 11)) \
            val_ = (1 << 11) - 1; \
        else \
            val_ = (unsigned)(f_); \
    } \
    (_outarg) = val_; \
}
8.9.13. Conversion to a 12-Bit Unsigned Fixed-Point Number With 12 Fractional Bits
The following code converts a 32-bit floating-point number into a 12-bit unsigned fixed-point number with 12 fractional bits. If you pass a 32-bit floating-point number to _inarg, a 12-bit fixed-point number is stored in _outarg.
#define UTL_F2UFX_12W_0I(_inarg, _outarg) \
{ \
    float f_ = (_inarg); \
    unsigned val_; \
    unsigned v_ = *(unsigned*)&f_; \
    if (f_ <= 0 || (v_ & 0x7f800000) == 0x7f800000) \
        val_ = 0; \
    else \
    { \
        f_ *= 1 << (12 - 0); \
        if (f_ >= (1 << 12)) \
            val_ = (1 << 12) - 1; \
        else \
            val_ = (unsigned)(f_); \
    } \
    (_outarg) = val_; \
}
8.9.14. Conversion to a 24-Bit Unsigned Fixed-Point Number With 24 Fractional Bits
The following code converts a 32-bit floating-point number into a 24-bit unsigned fixed-point number with 24 fractional bits. If you pass a 32-bit floating-point number to _inarg, a 24-bit fixed-point number is stored in _outarg.
#define UTL_F2UFX_24W_0I(_inarg, _outarg) \
{ \
    float f_ = (_inarg); \
    unsigned val_; \
    unsigned v_ = *(unsigned*)&f_; \
    if (f_ <= 0 || (v_ & 0x7f800000) == 0x7f800000) \
        val_ = 0; \
    else \
    { \
        f_ *= 1 << (24 - 0); \
        if (f_ >= (1 << 24)) \
            val_ = (1 << 24) - 1; \
        else \
            val_ = (unsigned)(f_); \
    } \
    (_outarg) = val_; \
}
8.9.15. Conversion to a 24-Bit Unsigned Fixed-Point Number With 8 Fractional Bits
The following code converts a 32-bit floating-point number into a 24-bit unsigned fixed-point number with 8 fractional bits. If you pass a 32-bit floating-point number to _inarg, a 24-bit fixed-point number is stored in _outarg.
#define UTL_F2UFX_24W_16I(_inarg, _outarg) \
{ \
    float f_ = (_inarg); \
    unsigned val_; \
    unsigned v_ = *(unsigned*)&f_; \
    if (f_ <= 0 || (v_ & 0x7f800000) == 0x7f800000) \
        val_ = 0; \
    else \
    { \
        f_ *= 1 << (24 - 16); \
        if (f_ >= (1 << 24)) \
            val_ = (1 << 24) - 1; \
        else \
            val_ = (unsigned)(f_); \
    } \
    (_outarg) = val_; \
}
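The unsigned macros (the names beginning with UTL_F2UFX_) also share one pattern: values of zero or less give 0, the input is scaled by 2 raised to the number of fractional bits, and the result is clamped to the largest value that fits in the bit width. The following is a minimal sketch of that pattern. The function names are hypothetical and not part of the SDK, and the code assumes that float is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the SDK): generic form of the
   UTL_F2UFX_<width>W_<int_bits>I macros. */
static unsigned f2ufx(float f, int width, int int_bits)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);            /* read the float's bit pattern */
    if (f <= 0.0f || (bits & 0x7f800000u) == 0x7f800000u)
        return 0;                              /* non-positive, infinity, NaN give 0 */

    f *= (float)(1u << (width - int_bits));    /* scale by 2^(fractional bits) */
    if (f >= (float)(1u << width))
        return (1u << width) - 1u;             /* clamp to the largest value */
    return (unsigned)f;                        /* the fraction is discarded */
}

/* Worked values that can be verified by hand from the steps above. */
static void f2ufx_examples(void)
{
    assert(f2ufx(200.7f, 8, 8)  == 200u);       /* like UTL_F2UFX_8W_8I */
    assert(f2ufx(300.0f, 8, 8)  == 255u);       /* clamped */
    assert(f2ufx(0.5f, 11, 0)   == 0x400u);     /* like UTL_F2UFX_11W_0I */
    assert(f2ufx(1.0f, 12, 0)   == 0xFFFu);     /* clamped */
    assert(f2ufx(0.5f, 24, 0)   == 0x800000u);  /* like UTL_F2UFX_24W_0I */
    assert(f2ufx(1.0f, 24, 16)  == 0x100u);     /* like UTL_F2UFX_24W_16I */
    assert(f2ufx(-3.0f, 24, 16) == 0u);         /* negative input */
}
```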
8.9.16. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer
The following code converts a 32-bit floating-point number in the range from 0 through 1 into an 8-bit unsigned integer, rounding to the nearest integer. If you pass a 32-bit floating-point number into f, an 8-bit unsigned integer is returned.
((unsigned)(0.5f + (f) * (float)((1 << 8) - 1)))
8.9.17. Conversion From a Floating-Point Number (Between 0 and 1) to an 8-Bit Unsigned Integer
The following code converts a 32-bit floating-point number in the range from 0 through 1 into an 8-bit unsigned integer, discarding the fractional part instead of rounding. If you pass a 32-bit floating-point number into f, an 8-bit unsigned integer is returned.
((unsigned)((f) * (float)((1 << 8) - 1)))
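The expressions in 8.9.16 and 8.9.17 differ only in the 0.5f term, which changes how the scaled value is turned into an integer. The following is a minimal sketch that wraps both expressions in functions so that the results can be compared. The function names are hypothetical and not part of the SDK.

```c
#include <assert.h>

/* Hypothetical wrappers (not part of the SDK) around the two expressions. */

/* With the 0.5f term: rounds to the nearest integer. */
static unsigned unorm8_round(float f)
{
    return (unsigned)(0.5f + f * (float)((1 << 8) - 1));
}

/* Without the 0.5f term: the fractional part is discarded. */
static unsigned unorm8_truncate(float f)
{
    return (unsigned)(f * (float)((1 << 8) - 1));
}

/* Worked values: 0.5 * 255 = 127.5 and 0.25 * 255 = 63.75. */
static void unorm8_examples(void)
{
    assert(unorm8_round(0.0f)     == 0u);
    assert(unorm8_truncate(0.0f)  == 0u);
    assert(unorm8_round(0.25f)    == 64u);
    assert(unorm8_truncate(0.25f) == 63u);
    assert(unorm8_round(0.5f)     == 128u);
    assert(unorm8_truncate(0.5f)  == 127u);
    assert(unorm8_round(1.0f)     == 255u);
    assert(unorm8_truncate(1.0f)  == 255u);
}
```

Both forms agree at the ends of the range and can differ by 1 in between.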
8.9.18. Conversion From a Floating-Point Number (From -1 Through 1) to an 8-Bit Signed Integer
The following code converts a 32-bit floating-point number in the range from –1 through 1 into an 8-bit signed integer. The lower 7 bits hold the absolute value and bit 7 holds the sign. If you pass a 32-bit floating-point number into f, an 8-bit signed integer is returned.
(((unsigned int)(fabs(127.f * (f))) & 0x7f) | ((f) < 0 ? 0x80 : 0))
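The result of this expression is a sign-and-magnitude byte, not a two's complement value. The following is a minimal sketch that wraps the same calculation in a function and checks a few values. The function names are hypothetical and not part of the SDK, and the absolute value is taken with a comparison instead of fabs so that the sketch needs no math library.

```c
#include <assert.h>

/* Hypothetical wrapper (not part of the SDK) around the expression above. */
static unsigned snorm8_sign_magnitude(float f)
{
    float scaled = 127.0f * f;
    unsigned magnitude;

    if (scaled < 0.0f)
        scaled = -scaled;                      /* same result as fabs() */
    magnitude = (unsigned)scaled & 0x7fu;      /* lower 7 bits: magnitude */
    return magnitude | (f < 0.0f ? 0x80u : 0u);  /* bit 7: sign */
}

/* Worked values: 127 * 0.5 = 63.5, which is truncated to 63. */
static void snorm8_examples(void)
{
    assert(snorm8_sign_magnitude(0.0f)  == 0x00u);
    assert(snorm8_sign_magnitude(0.5f)  == 0x3Fu);
    assert(snorm8_sign_magnitude(-0.5f) == 0xBFu);
    assert(snorm8_sign_magnitude(1.0f)  == 0x7Fu);
    assert(snorm8_sign_magnitude(-1.0f) == 0xFFu);
}
```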
8.9.19. Conversion From a 16-Bit Floating-Point Number to a 32-Bit Floating-Point Number
The following code converts a 16-bit floating-point number (with one sign bit, a 5-bit exponent, and a 10-bit significand) into a 32-bit floating-point number. If you pass a 16-bit floating-point number stored as an unsigned int to _inarg, a 32-bit floating-point number is stored in the float variable specified by _outarg.
#define UTL_U2F_10M5E(_inarg, _outarg) \
{ \
    int e_; \
    unsigned m_; \
    unsigned u_ = (_inarg); \
    const int width_ = 10 + 5 + 1; \
    const int bias_ = 128 - (1 << (5 - 1)); \
    e_ = (u_ >> 10) & ((1 << 5) - 1); \
    m_ = u_ & ((1 << 10) - 1); \
    if (u_ & ((1 << (width_ - 1)) - 1)) \
        u_ = ((u_ >> (5 + 10)) << 31) | (m_ << (23 - 10)) | ((e_ + bias_) << 23); \
    else \
        u_ = ((u_ >> (5 + 10)) << 31); \
    (_outarg) = *(float*)&u_; \
}
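The macro rebuilds the 32-bit bit pattern field by field: the sign moves to bit 31, the exponent is rebiased by 112 (the difference between the 32-bit bias of 127 and the 16-bit bias of 15), and the significand is shifted left by 13 bits. The following is a minimal sketch of the same steps as a function. The function names are hypothetical and not part of the SDK, and the code assumes that float is a 32-bit IEEE 754 type. As far as can be read from the macro, only an all-zero magnitude is special-cased, and the sketch keeps that behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the SDK): same steps as UTL_U2F_10M5E. */
static float half_bits_to_float(unsigned h)
{
    const uint32_t sign     = (h >> 15) & 0x1u;    /* 1 sign bit */
    const uint32_t exponent = (h >> 10) & 0x1fu;   /* 5-bit exponent */
    const uint32_t mantissa = h & 0x3ffu;          /* 10-bit significand */
    const uint32_t rebias   = 127u - 15u;          /* 112, as in the macro */
    uint32_t bits = sign << 31;
    float out;

    if (h & 0x7fffu)                               /* nonzero magnitude */
        bits |= ((exponent + rebias) << 23) | (mantissa << 13);

    memcpy(&out, &bits, sizeof out);               /* reinterpret as float */
    return out;
}

/* Worked values that can be verified by hand from the bit fields. */
static void half_bits_to_float_examples(void)
{
    assert(half_bits_to_float(0x0000u) == 0.0f);
    assert(half_bits_to_float(0x3800u) == 0.5f);      /* exponent 14 */
    assert(half_bits_to_float(0x3C00u) == 1.0f);      /* exponent 15 */
    assert(half_bits_to_float(0xC000u) == -2.0f);     /* sign set, exponent 16 */
    assert(half_bits_to_float(0x7BFFu) == 65504.0f);  /* exponent 30, all significand bits set */
}
```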
8.10. Register Map
This section provides register maps for the registers described in 8.8. PICA Register Information. The maps are ordered by register address. For each register, they give the related section of this document, the related functions or reserved uniforms, and the NN_GX_STATE_ states in which the register may be updated.
8.10.1. Register Map for Registers 0x0010 – 0x00FF
Address | Section | Function / Reserved Uniform | NN_GX_STATE_ |
---|---|---|---|
0x0010 | 8.8.22 | nngxSplitDrawCmdlist, nngxTransferRenderImage | - |
0x0040 | 8.8.15 | glCullFace, glDisable(GL_CULL_FACE), glEnable(GL_CULL_FACE), glFrontFace | OTHERS |
0x0041 through 0x0044 | 8.8.10 | glViewport | OTHERS |
0x0047 | 8.8.9.4 | dmp_FragOperation.enableClippingPlane | FSUNIFORM |
0x0048 through 0x004B | | dmp_FragOperation.clippingPlane | FSUNIFORM |
0x004D | 8.8.9.3 | dmp_FragOperation.wScale, glDepthRangef, glDisable(GL_POLYGON_OFFSET_FILL), glEnable(GL_POLYGON_OFFSET_FILL), glPolygonOffset | FSUNIFORM, TRIOFFSET |
0x004E | | | |
0x004F | | Sets the number of used output registers. | SHADERPROGRAM |
0x0050 through 0x0056 | | Output register attributes settings. | SHADERPROGRAM |
0x0061 | 8.8.13 | glEarlyDepthFuncDMP | OTHERS |
0x0062 | | glDisable(GL_EARLY_DEPTH_TEST_DMP), glEnable(GL_EARLY_DEPTH_TEST_DMP) | OTHERS |
0x0063 | | glClear(GL_EARLY_DEPTH_BUFFER_BIT_DMP) | - |
0x0064 | 8.8.20.10 | Output register attributes settings. | SHADERPROGRAM |
0x0065 through 0x0067 | 8.8.16 | glDisable(GL_SCISSOR_TEST), glEnable(GL_SCISSOR_TEST), glScissor | SCISSOR |
0x0068 | 8.8.10 | glViewport | OTHERS |
0x006A | 8.8.13 | glClearEarlyDepthDMP | OTHERS |
0x006D | 8.8.9.3 | dmp_FragOperation.wScale | FSUNIFORM |
0x006E | 8.8.3 | glRenderbufferStorage, glTexture2DImage2D | FRAMEBUFFER |
0x006F | | Output attributes clock control. | SHADERPROGRAM |
0x0080 | 8.8.6.2 | dmp_Texture[i].samplerType (i=0,1,2) | TEXTURE |
0x0080 | 8.8.6.3 | dmp_Texture[2].texcoord, dmp_Texture[3].texcoord, dmp_Texture[3].samplerType | FSUNIFORM |
0x0080 | 8.8.19 | glDrawArrays, glDrawElements | - |
0x0081 | 8.8.6.8 | glTexParameter | TEXTURE |
0x0082 | 8.8.6.6 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0083 | 8.8.6.2 | dmp_Texture[0].samplerType | TEXTURE |
0x0083 | 8.8.6.7 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0083 | 8.8.6.8 | glTexParameter | TEXTURE |
0x0084 | 8.8.6.8 | glTexParameter, glCopyTexImage2D, glCompressedTexImage2D, glTexImage2D | TEXTURE |
0x0085 through 0x008A | 8.8.2 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x008B | 8.8.6.1 | dmp_Texture[0].perspectiveShadow, dmp_Texture[0].shadowZBias | FSUNIFORM |
0x008E | 8.8.6.7 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x008F | 8.8.5.1 | dmp_FragmentLighting.enabled | FSUNIFORM |
0x0091 | 8.8.6.8 | glTexParameter | TEXTURE |
0x0092 | 8.8.6.6 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0093 | 8.8.6.7 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0093 | 8.8.6.8 | glTexParameter | TEXTURE |
0x0094 | 8.8.6.8 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0095 | 8.8.2 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0096 | 8.8.6.7 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x0099 | 8.8.6.8 | glTexParameter | TEXTURE |
0x009A | 8.8.6.6 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x009B | 8.8.6.7 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x009B | 8.8.6.8 | glTexParameter | TEXTURE |
0x009C | 8.8.6.8 | glTexParameter, glCopyTexImage2D, glCompressedTexImage2D, glTexImage2D | TEXTURE |
0x009D | 8.8.2 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x009E | 8.8.6.7 | glTexImage2D, glCompressedTexImage2D, glCopyTexImage2D | TEXTURE |
0x00A8 | 8.8.6.4 | dmp_Texture[3].ptClampU, dmp_Texture[3].ptClampV, dmp_Texture[3].ptRgbMap, dmp_Texture[3].ptAlphaMap, dmp_Texture[3].ptAlphaSeparate, dmp_Texture[3].ptNoiseEnable, dmp_Texture[3].ptShiftU, dmp_Texture[3].ptShiftV, dmp_Texture[3].ptTexBias | FSUNIFORM |
0x00A9 through 0x00AB | | dmp_Texture[3].ptNoiseU, dmp_Texture[3].ptNoiseV | FSUNIFORM |
0x00AC | | dmp_Texture[3].ptMinFilter, dmp_Texture[3].ptTexWidth, dmp_Texture[3].ptTexBias | FSUNIFORM |
0x00AD | | dmp_Texture[3].ptTexOffset | FSUNIFORM |
0x00AF | 8.8.6.5 | dmp_Texture[3].ptSampler{RgbMap,AlphaMap,NoiseMap,R,G,B,A} | LUT |
0x00B0 through 0x00B7 | | Sets data to lookup tables. | LUT |
0x00C0 | 8.8.4 | dmp_TexEnv[0].srcRgb, dmp_TexEnv[0].srcAlpha | FSUNIFORM |
0x00C1 | | dmp_TexEnv[0].operandRgb, dmp_TexEnv[0].operandAlpha | FSUNIFORM |
0x00C2 | | dmp_TexEnv[0].combineRgb, dmp_TexEnv[0].combineAlpha | FSUNIFORM |
0x00C3 | | dmp_TexEnv[0].constRgba | FSUNIFORM |
0x00C4 | | dmp_TexEnv[0].scaleRgb, dmp_TexEnv[0].scaleAlpha | FSUNIFORM |
0x00C8 | 8.8.4 | dmp_TexEnv[1].srcRgb, dmp_TexEnv[1].srcAlpha | FSUNIFORM |
0x00C9 | | dmp_TexEnv[1].operandRgb, dmp_TexEnv[1].operandAlpha | FSUNIFORM |
0x00CA | | dmp_TexEnv[1].combineRgb, dmp_TexEnv[1].combineAlpha | FSUNIFORM |
0x00CB | | dmp_TexEnv[1].constRgba | FSUNIFORM |
0x00CC | | dmp_TexEnv[1].scaleRgb, dmp_TexEnv[1].scaleAlpha | FSUNIFORM |
0x00D0 | 8.8.4 | dmp_TexEnv[2].srcRgb, dmp_TexEnv[2].srcAlpha | FSUNIFORM |
0x00D1 | | dmp_TexEnv[2].operandRgb, dmp_TexEnv[2].operandAlpha | FSUNIFORM |
0x00D2 | | dmp_TexEnv[2].combineRgb, dmp_TexEnv[2].combineAlpha | FSUNIFORM |
0x00D3 | | dmp_TexEnv[2].constRgba | FSUNIFORM |
0x00D4 | | dmp_TexEnv[2].scaleRgb, dmp_TexEnv[2].scaleAlpha | FSUNIFORM |
0x00D8 | 8.8.4 | dmp_TexEnv[3].srcRgb, dmp_TexEnv[3].srcAlpha | FSUNIFORM |
0x00D9 | | dmp_TexEnv[3].operandRgb, dmp_TexEnv[3].operandAlpha | FSUNIFORM |
0x00DA | | dmp_TexEnv[3].combineRgb, dmp_TexEnv[3].combineAlpha | FSUNIFORM |
0x00DB | | dmp_TexEnv[3].constRgba | FSUNIFORM |
0x00DC | | dmp_TexEnv[3].scaleRgb, dmp_TexEnv[3].scaleAlpha | FSUNIFORM |
0x00E0 | 8.8.4.1 | dmp_TexEnv[i].bufferInput (i=1,2,3,4) | FSUNIFORM |
0x00E0 | 8.8.7.1 | dmp_Gas.shadingDensitySrc | FSUNIFORM |
0x00E0 | 8.8.8.1 | dmp_Fog.mode, dmp_Fog.zFlip | FSUNIFORM |
0x00E1 | 8.8.8.1 | dmp_Fog.color | FSUNIFORM |
0x00E4 | 8.8.7.1 | dmp_Gas.attenuation | FSUNIFORM |
0x00E5 | | dmp_Gas.accMax | FSUNIFORM |
0x00E6 | 8.8.8.2 | dmp_Fog.sampler | LUT |
0x00E8 through 0x00EF | | Sets data to lookup tables. | LUT |
0x00F0 | 8.8.4 | dmp_TexEnv[4].srcRgb, dmp_TexEnv[4].srcAlpha | FSUNIFORM |
0x00F1 | | dmp_TexEnv[4].operandRgb, dmp_TexEnv[4].operandAlpha | FSUNIFORM |
0x00F2 | | dmp_TexEnv[4].combineRgb, dmp_TexEnv[4].combineAlpha | FSUNIFORM |
0x00F3 | | dmp_TexEnv[4].constRgba | FSUNIFORM |
0x00F4 | | dmp_TexEnv[4].scaleRgb, dmp_TexEnv[4].scaleAlpha | FSUNIFORM |
0x00F8 | 8.8.4 | dmp_TexEnv[5].srcRgb, dmp_TexEnv[5].srcAlpha | FSUNIFORM |
0x00F9 | | dmp_TexEnv[5].operandRgb, dmp_TexEnv[5].operandAlpha | FSUNIFORM |
0x00FA | | dmp_TexEnv[5].combineRgb, dmp_TexEnv[5].combineAlpha | FSUNIFORM |
0x00FB | | dmp_TexEnv[5].constRgba | FSUNIFORM |
0x00FC | | dmp_TexEnv[5].scaleRgb, dmp_TexEnv[5].scaleAlpha | FSUNIFORM |
0x00FD | 8.8.4.1 | dmp_TexEnv[0].bufferColor | FSUNIFORM |
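In this map, the settings for each dmp_TexEnv[n] stage occupy five consecutive registers (source, operand, combine, constant color, and scale, in that order) starting at a per-stage base address. The following is a minimal sketch that computes a register address from that layout. The enumeration and function names are hypothetical and not part of the SDK, and the base addresses are simply the ones listed in the table above.

```c
#include <assert.h>

/* Hypothetical names (not part of the SDK): offset of each setting
   from the base address of its dmp_TexEnv[n] stage. */
enum texenv_field {
    TEXENV_SRC        = 0,   /* srcRgb, srcAlpha */
    TEXENV_OPERAND    = 1,   /* operandRgb, operandAlpha */
    TEXENV_COMBINE    = 2,   /* combineRgb, combineAlpha */
    TEXENV_CONST_RGBA = 3,   /* constRgba */
    TEXENV_SCALE      = 4    /* scaleRgb, scaleAlpha */
};

/* Returns the register address for one setting of dmp_TexEnv[stage]. */
static unsigned texenv_register(int stage, enum texenv_field field)
{
    /* Base addresses of stages 0 through 5, as listed in the register map. */
    static const unsigned base[6] = {
        0x00C0, 0x00C8, 0x00D0, 0x00D8, 0x00F0, 0x00F8
    };

    assert(stage >= 0 && stage < 6);
    return base[stage] + (unsigned)field;
}
```

Note that stages 4 and 5 do not continue directly after stage 3, so the base address is taken from a table and not calculated.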
8.10.2. Register Map for Registers 0x0100 – 0x01FF
Address | Section | Function / Reserved Uniform | NN_GX_STATE_ |
---|---|---|---|
0x0100 | 8.8.9.1 | dmp_FragOperation.mode | FSUNIFORM |
0x0100 | 8.8.12 | glDisable(GL_BLEND), glEnable(GL_BLEND), glDisable(GL_COLOR_LOGIC_OP), glEnable(GL_COLOR_LOGIC_OP) | OTHERS |
0x0101 | 8.8.12 | glBlendEquation, glBlendEquationSeparate, glBlendFunc, glBlendFuncSeparate | OTHERS |
0x0102 | | glLogicOp | OTHERS |
0x0103 | | glBlendColor | OTHERS |
0x0104 | 8.8.9.5 | dmp_FragOperation.enableAlphaTest, dmp_FragOperation.alphaTestFunc, dmp_FragOperation.alphaRefValue | FSUNIFORM |
0x0105 | 8.8.14 | glDisable(GL_STENCIL_TEST), glEnable(GL_STENCIL_TEST), glStencilFunc, glStencilMask | OTHERS |
0x0106 | | glStencilOp | OTHERS |
0x0107 | 8.8.11 | glDisable(GL_DEPTH_TEST), glEnable(GL_DEPTH_TEST), glDepthFunc, glDepthMask | OTHERS |
0x0107 | 8.8.17 | glColorMask | OTHERS |
0x0110 | 8.8.21 | glFinish, glFlush, nngxSplitDrawCmdlist, nngxTransferRenderImage | FRAMEBUFFER, FBACCESS |
0x0111 | | glFinish, glFlush, glDrawArrays, glDrawElements, nngxSplitDrawCmdlist, nngxTransferRenderImage | FRAMEBUFFER, FBACCESS |
0x0112 through 0x0115 | 8.8.9.6 | dmp_FragOperation.mode, glDisable(GL_BLEND), glEnable(GL_BLEND), glDisable(GL_COLOR_LOGIC_OP), glEnable(GL_COLOR_LOGIC_OP), glColorMask, glDisable(GL_DEPTH_TEST), glEnable(GL_DEPTH_TEST), glDepthMask, glDisable(GL_STENCIL_TEST), glEnable(GL_STENCIL_TEST), glStencilMask | FBACCESS |
0x0116 | 8.8.3 | glRenderbufferStorage, glTexture2DImage2D | FRAMEBUFFER |
0x0117 | | | |
0x0118 | 8.8.13 | glDisable(GL_EARLY_DEPTH_TEST_DMP), glEnable(GL_EARLY_DEPTH_TEST_DMP) | OTHERS |
0x011B | 8.8.18 | glRenderBlockModeDMP | OTHERS |
0x011C through 0x011E | 8.8.3 | glRenderbufferStorage, glTexImage2D, glTexture2DImage2D | FRAMEBUFFER |
0x0120 | 8.8.7.1 | dmp_Gas.lightXY | FSUNIFORM |
0x0121 | | dmp_Gas.lightZ | FSUNIFORM |
0x0122 | | | |
0x0123 | 8.8.7.2 | dmp_Gas.sampler{TR,TG,TB} | LUT |
0x0124 | | Sets data to lookup tables. | LUT |
0x0125 | 8.8.7.1 | dmp_Gas.autoAcc | - |
0x0126 | 8.8.7.1 | dmp_Gas.deltaZ | FSUNIFORM |
0x0126 | 8.8.11 | glDepthFunc | OTHERS |
0x0130 | 8.8.9.2 | dmp_FragOperation.penumbraScale, dmp_FragOperation.penumbraBias | FSUNIFORM |
0x0140 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[0].specular0 | FSUNIFORM |
0x0141 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[0].specular1 | FSUNIFORM |
0x0142 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[0].diffuse | FSUNIFORM |
0x0143 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[0].ambient | FSUNIFORM |
0x0144 | | dmp_FragmentLightSource[0].position | FSUNIFORM |
0x0145 | | | |
0x0146 | | dmp_FragmentLightSource[0].spotDirection | FSUNIFORM |
0x0147 | | | |
0x0149 | | dmp_FragmentLightSource[0].position, dmp_FragmentLightSource[0].twoSideDiffuse, dmp_FragmentLightSource[0].geomFactor0, dmp_FragmentLightSource[0].geomFactor1 | FSUNIFORM |
0x014A | | dmp_FragmentLightSource[0].distanceAttenuationBias | FSUNIFORM |
0x014B | | dmp_FragmentLightSource[0].distanceAttenuationScale | FSUNIFORM |
0x0150 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[1].specular0 | FSUNIFORM |
0x0151 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[1].specular1 | FSUNIFORM |
0x0152 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[1].diffuse | FSUNIFORM |
0x0153 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[1].ambient | FSUNIFORM |
0x0154 | | dmp_FragmentLightSource[1].position | FSUNIFORM |
0x0155 | | | |
0x0156 | | dmp_FragmentLightSource[1].spotDirection | FSUNIFORM |
0x0157 | | | |
0x0159 | | dmp_FragmentLightSource[1].position, dmp_FragmentLightSource[1].twoSideDiffuse, dmp_FragmentLightSource[1].geomFactor0, dmp_FragmentLightSource[1].geomFactor1 | FSUNIFORM |
0x015A | | dmp_FragmentLightSource[1].distanceAttenuationBias | FSUNIFORM |
0x015B | | dmp_FragmentLightSource[1].distanceAttenuationScale | FSUNIFORM |
0x0160 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[2].specular0 | FSUNIFORM |
0x0161 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[2].specular1 | FSUNIFORM |
0x0162 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[2].diffuse | FSUNIFORM |
0x0163 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[2].ambient | FSUNIFORM |
0x0164 | | dmp_FragmentLightSource[2].position | FSUNIFORM |
0x0165 | | | |
0x0166 | | dmp_FragmentLightSource[2].spotDirection | FSUNIFORM |
0x0167 | | | |
0x0169 | | dmp_FragmentLightSource[2].position, dmp_FragmentLightSource[2].twoSideDiffuse, dmp_FragmentLightSource[2].geomFactor0, dmp_FragmentLightSource[2].geomFactor1 | FSUNIFORM |
0x016A | | dmp_FragmentLightSource[2].distanceAttenuationBias | FSUNIFORM |
0x016B | | dmp_FragmentLightSource[2].distanceAttenuationScale | FSUNIFORM |
0x0170 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[3].specular0 | FSUNIFORM |
0x0171 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[3].specular1 | FSUNIFORM |
0x0172 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[3].diffuse | FSUNIFORM |
0x0173 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[3].ambient | FSUNIFORM |
0x0174 | | dmp_FragmentLightSource[3].position | FSUNIFORM |
0x0175 | | | |
0x0176 | | dmp_FragmentLightSource[3].spotDirection | FSUNIFORM |
0x0177 | | | |
0x0179 | | dmp_FragmentLightSource[3].position, dmp_FragmentLightSource[3].twoSideDiffuse, dmp_FragmentLightSource[3].geomFactor0, dmp_FragmentLightSource[3].geomFactor1 | FSUNIFORM |
0x017A | | dmp_FragmentLightSource[3].distanceAttenuationBias | FSUNIFORM |
0x017B | | dmp_FragmentLightSource[3].distanceAttenuationScale | FSUNIFORM |
0x0180 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[4].specular0 | FSUNIFORM |
0x0181 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[4].specular1 | FSUNIFORM |
0x0182 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[4].diffuse | FSUNIFORM |
0x0183 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[4].ambient | FSUNIFORM |
0x0184 | | dmp_FragmentLightSource[4].position | FSUNIFORM |
0x0185 | | | |
0x0186 | | dmp_FragmentLightSource[4].spotDirection | FSUNIFORM |
0x0187 | | | |
0x0189 | | dmp_FragmentLightSource[4].position, dmp_FragmentLightSource[4].twoSideDiffuse, dmp_FragmentLightSource[4].geomFactor0, dmp_FragmentLightSource[4].geomFactor1 | FSUNIFORM |
0x018A | | dmp_FragmentLightSource[4].distanceAttenuationBias | FSUNIFORM |
0x018B | | dmp_FragmentLightSource[4].distanceAttenuationScale | FSUNIFORM |
0x0190 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[5].specular0 | FSUNIFORM |
0x0191 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[5].specular1 | FSUNIFORM |
0x0192 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[5].diffuse | FSUNIFORM |
0x0193 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[5].ambient | FSUNIFORM |
0x0194 | | dmp_FragmentLightSource[5].position | FSUNIFORM |
0x0195 | | | |
0x0196 | | dmp_FragmentLightSource[5].spotDirection | FSUNIFORM |
0x0197 | | | |
0x0199 | | dmp_FragmentLightSource[5].position, dmp_FragmentLightSource[5].twoSideDiffuse, dmp_FragmentLightSource[5].geomFactor0, dmp_FragmentLightSource[5].geomFactor1 | FSUNIFORM |
0x019A | | dmp_FragmentLightSource[5].distanceAttenuationBias | FSUNIFORM |
0x019B | | dmp_FragmentLightSource[5].distanceAttenuationScale | FSUNIFORM |
0x01A0 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[6].specular0 | FSUNIFORM |
0x01A1 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[6].specular1 | FSUNIFORM |
0x01A2 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[6].diffuse | FSUNIFORM |
0x01A3 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[6].ambient | FSUNIFORM |
0x01A4 | | dmp_FragmentLightSource[6].position | FSUNIFORM |
0x01A5 | | | |
0x01A6 | | dmp_FragmentLightSource[6].spotDirection | FSUNIFORM |
0x01A7 | | | |
0x01A9 | | dmp_FragmentLightSource[6].position, dmp_FragmentLightSource[6].twoSideDiffuse, dmp_FragmentLightSource[6].geomFactor0, dmp_FragmentLightSource[6].geomFactor1 | FSUNIFORM |
0x01AA | | dmp_FragmentLightSource[6].distanceAttenuationBias | FSUNIFORM |
0x01AB | | dmp_FragmentLightSource[6].distanceAttenuationScale | FSUNIFORM |
0x01B0 | 8.8.5.3 | dmp_FragmentMaterial.specular0, dmp_FragmentLightSource[7].specular0 | FSUNIFORM |
0x01B1 | | dmp_LightEnv.lutEnabledRefl, dmp_FragmentMaterial.specular1, dmp_FragmentLightSource[7].specular1 | FSUNIFORM |
0x01B2 | | dmp_FragmentMaterial.diffuse, dmp_FragmentLightSource[7].diffuse | FSUNIFORM |
0x01B3 | | dmp_FragmentMaterial.ambient, dmp_FragmentLightSource[7].ambient | FSUNIFORM |
0x01B4 | | dmp_FragmentLightSource[7].position | FSUNIFORM |
0x01B5 | | | |
0x01B6 | | dmp_FragmentLightSource[7].spotDirection | FSUNIFORM |
0x01B7 | | | |
0x01B9 | | dmp_FragmentLightSource[7].position, dmp_FragmentLightSource[7].twoSideDiffuse, dmp_FragmentLightSource[7].geomFactor0, dmp_FragmentLightSource[7].geomFactor1 | FSUNIFORM |
0x01BA | | dmp_FragmentLightSource[7].distanceAttenuationBias | FSUNIFORM |
0x01BB | | dmp_FragmentLightSource[7].distanceAttenuationScale | FSUNIFORM |
0x01C0 | 8.8.5.2 | dmp_FragmentLighting.ambient, dmp_FragmentMaterial.ambient, dmp_FragmentMaterial.emission | FSUNIFORM |
0x01C2 | 8.8.5.1 | dmp_FragmentLightSource[i].enabled (i = 0 through 7) | FSUNIFORM |
0x01C3 | 8.8.5.8 | dmp_LightEnv.invertShadow, dmp_LightEnv.shadowAlpha, dmp_LightEnv.shadowPrimary, dmp_LightEnv.shadowSecondary, dmp_LightEnv.shadowSelector | FSUNIFORM |
0x01C3 | 8.8.5.9 | dmp_LightEnv.bumpMode, dmp_LightEnv.bumpRenorm, dmp_LightEnv.bumpSelector, dmp_LightEnv.clampHighlights, dmp_LightEnv.config, dmp_LightEnv.fresnelSelector | FSUNIFORM |
0x01C4 | 8.8.5.3 | dmp_FragmentLightSource[i].distanceAttenuationEnabled (i = 0 through 7), dmp_FragmentLightSource[i].shadowed, dmp_FragmentLightSource[i].spotEnabled | FSUNIFORM |
0x01C4 | 8.8.5.9 | dmp_LightEnv.fresnelSelector, dmp_LightEnv.lutEnabledD0, dmp_LightEnv.lutEnabledD1, dmp_LightEnv.lutEnabledRefl | FSUNIFORM |
0x01C5 | 8.8.5.4 | | LUT |
0x01C6 | 8.8.5.1 | dmp_FragmentLighting.enabled | FSUNIFORM |
0x01C8 through 0x01CF | 8.8.5.4 | Sets data to lookup tables. | LUT |
0x01D0 | 8.8.5.5 | dmp_LightEnv.absLutInputD0, dmp_LightEnv.absLutInputD1, dmp_LightEnv.absLutInputSP, dmp_LightEnv.absLutInputFR, dmp_LightEnv.absLutInputRB, dmp_LightEnv.absLutInputRG, dmp_LightEnv.absLutInputRR | FSUNIFORM |
0x01D1 | 8.8.5.6 | dmp_LightEnv.lutInputD0, dmp_LightEnv.lutInputD1, dmp_LightEnv.lutInputSP, dmp_LightEnv.lutInputFR, dmp_LightEnv.lutInputRB, dmp_LightEnv.lutInputRG, dmp_LightEnv.lutInputRR | FSUNIFORM |
0x01D2 | 8.8.5.7 | dmp_LightEnv.lutScaleD0, dmp_LightEnv.lutScaleD1, dmp_LightEnv.lutScaleSP, dmp_LightEnv.lutScaleFR, dmp_LightEnv.lutScaleRB, dmp_LightEnv.lutScaleRG, dmp_LightEnv.lutScaleRR | FSUNIFORM |
0x01D9 | 8.8.5.1 | dmp_FragmentLightSource[i].enabled (i = 0 through 7) | FSUNIFORM |
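In this map, the registers for dmp_FragmentLightSource[i] repeat every 0x10 addresses, starting at 0x0140 for light 0 and ending with the block at 0x01B0 for light 7. The following is a minimal sketch that computes a register address from that layout. The enumeration and function names are hypothetical and not part of the SDK, and only the offsets that the table names explicitly are included.

```c
#include <assert.h>

/* Hypothetical names (not part of the SDK): offset of each setting
   from the base address of its light source block. */
enum light_field {
    LIGHT_SPECULAR0        = 0x0,
    LIGHT_SPECULAR1        = 0x1,
    LIGHT_DIFFUSE          = 0x2,
    LIGHT_AMBIENT          = 0x3,
    LIGHT_POSITION         = 0x4,
    LIGHT_SPOT_DIRECTION   = 0x6,
    LIGHT_MISC             = 0x9,  /* position, twoSideDiffuse, geomFactor0/1 */
    LIGHT_DIST_ATTEN_BIAS  = 0xA,
    LIGHT_DIST_ATTEN_SCALE = 0xB
};

/* Returns the register address for one setting of light source `light`. */
static unsigned light_register(int light, enum light_field field)
{
    assert(light >= 0 && light < 8);           /* eight light sources */
    return 0x0140u + 0x10u * (unsigned)light + (unsigned)field;
}
```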
8.10.3. Register Map for Registers 0x0200 – 0x02FF
Address | Section | Function / Reserved Uniform | NN_GX_STATE_ |
---|---|---|---|
0x0200 | 8.8.1.9 | glBufferData | VERTEX |
0x0201 | | glVertexAttribPointer | VERTEX |
0x0202 | | glEnableVertexAttribArray, glDisableVertexAttribArray, glVertexAttribPointer | VERTEX |
0x0203 through 0x0226 | | glBufferData, glVertexAttribPointer | VERTEX |
0x0227 | 8.8.1.9 | glBufferData | - |
0x0227 | 8.8.19 | glDrawElements | - |
0x0228 | 8.8.19 | glDrawElements, glDrawArrays | - |
0x0229 | | glDrawElements | SHADERMODE |
0x022A | | glDrawArrays | - |
0x022E | | glDrawArrays | - |
0x022F | | glDrawElements | - |
0x0231 | | glDrawElements, glDrawArrays | - |
0x0232 | 8.8.1.8 | glVertexAttribPointer, glVertexAttrib{1234}f, glVertexAttrib{1234}fv | VERTEX |
0x0233 through 0x0235 | | Vertex attribute data settings. | VERTEX |
0x0238 through 0x023D | 8.8.23 | Command buffer execution. nngxAddJumpCommand, nngxAddSubroutineCommand | - |
0x0242 | 8.8.1.6 | Sets the number of vertex attributes to input. | SHADERPROGRAM |
0x0244 | 8.8.20 | Settings for the shared processor. | SHADERMODE |
0x0245 | 8.8.19 | glDrawElements, glDrawArrays | - |
0x024A | 8.8.1.10 | Sets the number of used output registers. | SHADERPROGRAM |
0x0251 | 8.8.1.10 | Sets the number of used output registers. | SHADERPROGRAM |
0x0252 | 8.8.20.12 | Sets the use of subdivision shaders. | SHADERPROGRAM |
0x0253 | 8.8.19 | glDrawElements, glDrawArrays | - |
0x0254 | 8.8.20.12 | Sets the use of subdivision shaders. | SHADERPROGRAM |
0x025E | 8.8.1.10 | Sets the number of used output registers. | SHADERPROGRAM |
0x025E | 8.8.20.8 | Sets the number of used output registers (geometry shaders). | SHADERPROGRAM |
0x025E | 8.8.19 | glDrawElements, glDrawArrays | - |
0x025F | 8.8.19 | glDrawElements, glDrawArrays | - |
0x0280 | 8.8.20.2 | Boolean register (geometry shaders). | VSUNIFORM, SHADERMODE |
0x0281 through 0x0284 | 8.8.20.3 | Integer registers (geometry shaders). | VSUNIFORM, SHADERMODE |
0x0289 | 8.8.20.6 | Setting register for number of vertex inputs (geometry shaders). | SHADERPROGRAM |
0x0289 | 8.8.20.12 | Sets use of geometry shaders. | SHADERMODE |
0x028A | 8.8.20.5 | Starting address setting register (geometry shaders). | SHADERPROGRAM, SHADERMODE |
0x028B | 8.8.20.7 | Input register mapping setting registers (geometry shaders). | VERTEX |
0x028C | | | |
0x028D | 8.8.20.9 | Output register mask setting register (geometry shaders). | SHADERPROGRAM, SHADERMODE |
0x028F | 8.8.20.4 | Program code setting register (geometry shaders). | SHADERBINARY |
0x0290 | 8.8.20.1 | Floating-point constant register (geometry shaders). | SHADERFLOAT, VSUNIFORM |
0x0291 through 0x0298 | | Floating-point constant loading (geometry shaders). | |
0x029B | 8.8.20.4 | Program code loading address (geometry shaders). | SHADERBINARY |
0x029C through 0x02A3 | | Program code loading (geometry shaders). | |
0x02A5 | 8.8.20.4 | Swizzle pattern loading address (geometry shaders). | SHADERBINARY |
0x02A6 through 0x02AD | | Swizzle pattern loading (geometry shaders). | |
0x02B0 | 8.8.1.2 | Boolean register. | VSUNIFORM, SHADERMODE |
0x02B1 through 0x02B4 | 8.8.1.3 | Integer registers. | VSUNIFORM, SHADERMODE |
0x02B9 | 8.8.1.6 | Setting register for number of vertex inputs. | SHADERPROGRAM, SHADERMODE |
0x02BA | 8.8.1.5 | Starting address setting register. | SHADERPROGRAM, SHADERMODE |
0x02BB | 8.8.1.7 | Input register mapping setting registers. | VERTEX |
0x02BC | | | |
0x02BD | 8.8.1.11 | Output register mask setting register. | SHADERPROGRAM, SHADERMODE |
0x02BF | 8.8.1.4 | Program code setting register. | SHADERBINARY |
0x02C0 | 8.8.1.1 | Floating-point constant register. | SHADERFLOAT, VSUNIFORM |
0x02C1 through 0x02C8 | | Floating-point constant loading. | |
0x02CB | 8.8.1.4 | Program code loading address. | SHADERBINARY |
0x02CC through 0x02D3 | | Program code loading. | |
0x02D5 | 8.8.1.4 | Swizzle pattern loading address. | SHADERBINARY |
0x02D6 through 0x02DD | | Swizzle pattern loading. | |
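In this map, the geometry shader registers from 0x0280 through 0x02AD and the corresponding vertex shader registers from 0x02B0 through 0x02DD are laid out in the same order, 0x30 addresses apart. The following is a minimal sketch that expresses this relationship. The function name is hypothetical and not part of the SDK, and the offset is an observation drawn from the addresses in the table above.

```c
#include <assert.h>

/* Hypothetical helper (not part of the SDK): returns the vertex shader
   register that corresponds to a geometry shader register in the map. */
static unsigned vertex_shader_register(unsigned geometry_register)
{
    /* The geometry shader block spans 0x0280 through 0x02AD in the map. */
    assert(geometry_register >= 0x0280u && geometry_register <= 0x02ADu);
    return geometry_register + 0x30u;
}
```

For example, the Boolean register at 0x0280 corresponds to 0x02B0, and the floating-point constant register at 0x0290 corresponds to 0x02C0.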