Command lists are new to the Nintendo 3DS. Calls to gl() and nngx() functions made during 3D graphics processing can be recorded as commands and then executed together as a batch. Command list processing uses command list objects. On the Nintendo 3DS, the command list is the unit in which 3D graphics rendering is executed.
Command lists include commands that write to registers and are executed directly by the GPU (3D commands), and command requests that communicate instructions from the CPU to the GPU. 3D commands accumulate in the 3D command buffer as gl() and nngx() functions carry out rendering work and other tasks. Command requests are queued by the specific gl() and nngx() functions that issue them. For information about the types of command requests, see 4.2. Command Request Types.
When a queued command request for 3D command execution is processed, the GPU loads the 3D commands from the 3D command buffer and executes them. Multiple 3D commands are handled and run together as a single command set. Each command set ends with a split command, so the GPU knows where to stop loading from the 3D command buffer.
4.1. How to Use
The 3DS GPU renders 3D graphics by running commands in units of command lists. Applications create command list objects, into which gl() functions and other functions accumulate 3D commands; these commands are then executed as one batch when the GPU runs the command list.
4.1.1. Creating Objects
First, use the nngxGenCmdlists() function to create the command list objects.
void nngxGenCmdlists(GLsizei n, GLuint* cmdlists);
This function creates n command list objects and stores their object names in cmdlists.
Command lists have their own namespace. The command list with an object name of 0 is reserved by the system.
Error Code |
Cause |
---|---|
|
A negative value was specified for |
|
The internal buffer failed to be allocated. |
4.1.2. Binding Command Lists
Next, use the nngxBindCmdlist() function to bind a generated command list object to the GPU. 3D commands are accumulated in the bound command list's 3D command buffer.
void nngxBindCmdlist(GLuint cmdlist);
If cmdlist is set to an unused object name, that object is created.
Error Code |
Cause |
---|---|
|
The internal buffer failed to be allocated. |
|
Called while command caches and command lists were in a state of being saved. (For more information, see the CTR Programming Manual: Advanced Graphics.) |
4.1.3. Allocating Memory Regions
Use the nngxCmdlistStorage() function to allocate a memory region for the bound command list.
void nngxCmdlistStorage(GLsizei bufsize, GLsizei requestcount);
Set bufsize to the size of the 3D command buffer and requestcount to the number of command requests that can be queued.
You must call the nngxBindCmdlist() and nngxCmdlistStorage() functions on each command list object that you create. Calls to these functions are ignored when a command list with an object name of 0 is bound. If you call the nngxCmdlistStorage() function again on an object that already has an allocated region, that region is freed and reallocated.
A GL_ERROR_COMMANDBUFFER_FULL_DMP error is generated by the relevant function if 3D commands accumulate past the allocated 3D command buffer size or if the 3D command buffer has not been allocated. A GL_ERROR_COMMANDREQUEST_FULL_DMP error is generated by the relevant function if command requests are queued past the maximum queue size or if the queue has not been allocated.
Error Code |
Cause |
---|---|
|
Failed to allocate a memory region. |
|
Called on a command list that is being executed. |
|
A negative value was specified as an argument. |
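The per-object setup described in 4.1.1 through 4.1.3 can be sketched as follows. This is a minimal illustration rather than SDK code: the typedefs and the three nngx function bodies are stand-in stubs (an assumption, so that the sketch compiles off-device) that only record their arguments, and setup_cmdlists() is a hypothetical helper name. On hardware, the real declarations come from the SDK headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types and stubs (assumptions) so this sketch compiles off-device.
 * The real functions live in the SDK; these bodies only record calls. */
typedef int GLsizei;
typedef unsigned int GLuint;

enum { MAX_LISTS = 8 };
static GLuint  g_next_name = 1;            /* object name 0 is reserved by the system */
static GLuint  g_bound = 0;                /* currently bound command list */
static GLsizei g_bufsize[MAX_LISTS + 1];   /* recorded 3D command buffer sizes */
static GLsizei g_reqcount[MAX_LISTS + 1];  /* recorded command request capacities */

void nngxGenCmdlists(GLsizei n, GLuint *cmdlists)
{
    for (GLsizei i = 0; i < n; i++) cmdlists[i] = g_next_name++;
}

void nngxBindCmdlist(GLuint cmdlist) { g_bound = cmdlist; }

void nngxCmdlistStorage(GLsizei bufsize, GLsizei requestcount)
{
    if (g_bound == 0 || g_bound > MAX_LISTS) return;  /* ignored for name 0 */
    g_bufsize[g_bound] = bufsize;
    g_reqcount[g_bound] = requestcount;
}

/* Hypothetical helper: generate `count` command lists, then bind each one
 * and allocate its storage, as every created object requires. */
void setup_cmdlists(GLuint *names, GLsizei count, GLsizei bufsize, GLsizei requestcount)
{
    assert(names != NULL && count >= 0);
    nngxGenCmdlists(count, names);
    for (GLsizei i = 0; i < count; i++) {
        nngxBindCmdlist(names[i]);
        nngxCmdlistStorage(bufsize, requestcount);
    }
}
```

After the loop, the last generated command list remains bound, so an application would typically rebind whichever list it wants to accumulate commands into.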
4.1.4. Running Command List Objects
Call the nngxRunCmdlist() function to start executing command requests that have been queued in the bound command list.
void nngxRunCmdlist(void);
void nngxRunCmdlistByID(GLuint cmdlist);
Execution is ignored if the bound command list has an object name of 0. Likewise, attempts to bind a different command list and run this function are ignored while command requests are executing.
After command requests have started executing, you can accumulate more commands in that same list or you can bind another command list and accumulate commands there. However, commands must be executed in the same order in which they were accumulated.
The nngxRunCmdlistByID() function runs the command list specified by cmdlist rather than the currently bound command list. Apart from running the specified command list, it works the same as the nngxRunCmdlist() function.
Error Code |
Cause |
---|---|
|
Called on a command list for which a memory region has not
been allocated ( |
|
Called on a command list for which a memory region has not
been allocated ( |
4.1.4.1. Getting the State of an Executing Command List
Determine whether a command list is running using the nngxGetIsRunning() function.
GLboolean nngxGetIsRunning(void);
This function returns GL_TRUE if a command list is currently running, regardless of whether that command list is currently bound.
You can also pass NN_GX_CMDLIST_IS_RUNNING for pname to the nngxGetCmdlistParameteri() function. That method determines whether the currently bound command list is running. For information about the nngxGetCmdlistParameteri() function, see 4.1.10. Getting Command List Parameters.
4.1.5. Destroying Command List Objects
You can call the nngxDeleteCmdlists() function to destroy command list objects that are no longer necessary.
void nngxDeleteCmdlists(GLsizei n, const GLuint* cmdlists);
This destroys the command list objects specified by the n object names in cmdlists. A GL_ERROR_8003_DMP error occurs if any of the specified command list objects is currently being executed, but all other specified command list objects are still destroyed.
Error Code |
Cause |
---|---|
|
A negative value was specified for |
|
A command list included in |
4.1.6. Stopping a Command List
Call either of the following functions to stop a command list that is being executed.
void nngxStopCmdlist(void);
void nngxReserveStopCmdlist(GLint id);
When the nngxStopCmdlist() function is called, it waits for any executing command request to complete and then stops the command list. You cannot stop a command request after it has started executing (or is waiting to commence execution).
The nngxReserveStopCmdlist() function stops the command list immediately after the id-th accumulated command request finishes executing.
Call the nngxRunCmdlist() function to resume a stopped command list. Note, however, that this call is ignored if it is made after the instruction to stop the command list but before the executing command requests finish.
Error Code |
Cause |
---|---|
|
Called on a command list that is being executed. |
|
Zero, a negative number, or a value that exceeds the maximum number of command requests was specified for |
4.1.7. Splitting the 3D Command Buffer
Use the nngxSplitDrawCmdlist() function to add a buffer loading complete command to the 3D command buffer and queue a render command request. If a command list accumulates 3D commands while it is executing, its 3D commands are run up to the point at which they are split by this function.
void nngxSplitDrawCmdlist(void);
Render command requests are not queued until the buffer loading complete command is added. Other functions besides this one also queue render command requests: because functions such as glClear and glTexImage2D must stop 3D command execution, they each add a buffer loading complete command and then queue a render command request.
Error Code |
Cause |
---|---|
|
Called when the bound command list’s object name is |
|
The command request has already reached the maximum number of accumulated command requests allowable. |
|
The 3D command buffer is full because of commands added by this function. |
These errors may also be generated by other functions that call this one internally.
4.1.7.1. Flushing the Accumulated 3D Command Buffer
When the nngxSplitDrawCmdlist() function is called, the buffer loading complete command is added and a render command request is queued even if no 3D commands have accumulated in the 3D command buffer. In other words, this function can potentially add unneeded commands. We recommend calling the nngxFlush3DCommand() or nngxFlush3DCommandNoCacheFlush() function instead; each adds a command to split the 3D command buffer only when 3D commands have accumulated. If cache flushes would otherwise occur multiple times, you can instead reflect the cache content all at once with a later call to the nngxUpdateBufferLight() function. This can reduce CPU overhead.
void nngxFlush3DCommand(void);
void nngxFlush3DCommandNoCacheFlush(void);
These functions do not add a buffer loading complete command or a render command request if no 3D commands have accumulated in the 3D command buffer of the bound command list since the buffer was last split. If 3D commands accumulate while the command list is executing, the 3D commands execute up to the point at which the buffer was split by these functions.
Error Code |
Cause |
---|---|
|
Called when the bound command list’s object name is 0. |
|
The command request has already reached the maximum number of accumulated command requests allowable. |
|
The 3D command buffer is full because of commands added by this function. |
4.1.7.2. Partially Flushing the Accumulated 3D Command Buffer
The nngxFlush3DCommandPartially() function is provided for executing 3D commands of a specified size. This function extends the features of the nngxFlush3DCommand() function and can be called to correctly execute 3D commands that include the command buffer execution register kick command added by functions such as nngxAdd3DCommand. For more information, see 8.8.23. Command Buffer Execution Registers (0x0238 – 0x023D) in the 3DS Programming Manual: Advanced Graphics.
void nngxFlush3DCommandPartially(GLsizei buffersize);
Specify the size, in bytes, of the command buffer to be executed in buffersize. The number must be a multiple of 16.
The value of buffersize must exactly cover the range from the address following the previous command flush through the first kick command (including the kick command itself). If the wrong value is specified, commands are executed in an unintended order and the operation may not complete properly.
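The size rule above can be sketched as a small address calculation. This is an illustrative helper, not part of the nngx API: partial_flush_size() and its parameter names are hypothetical, and the byte length of the kick command is passed in by the caller because this section does not state it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an nngx function): returns the byte count from
 * the first command after the previous flush through the end of the first
 * kick command, or -1 if that span is not a valid buffersize. */
long partial_flush_size(uintptr_t after_prev_flush,
                        uintptr_t kick_cmd_start,
                        size_t kick_cmd_bytes)
{
    assert(kick_cmd_bytes > 0);                 /* the kick command is included */
    if (kick_cmd_start < after_prev_flush) return -1;

    size_t span = (size_t)(kick_cmd_start - after_prev_flush) + kick_cmd_bytes;
    if (span % 16 != 0) return -1;              /* must be a multiple of 16 */
    return (long)span;
}
```

A result of -1 would indicate that the accumulated commands need padding or that the addresses were tracked incorrectly, either of which should be resolved before the value is passed as buffersize.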
The application itself must correctly flush the cache for the 3D commands accumulated from the previous command flush up to the point at which this function is called. The cache as a whole must be flushed after calling the function, because the function itself generates commands that raise an interrupt. In addition, to avoid commands being executed before the cache is flushed, this function cannot be called on a command list that is currently executing. Functions such as glClear, nngxTransferRenderImage, and glCopyTexImage2D flush in the same way as the nngxFlush3DCommand() function. Always perform a flush using this function before calling those functions.
When a kick command is added using the nngxAddJumpCommand() or nngxAddSubroutineCommand() function, the driver adjusts the execution size so that execution runs up to the first kick command. It is unnecessary to call the nngxFlush3DCommandPartially() function when using those functions.
Note that when a partial flush is performed for a command buffer with a kick command added with the nngxAddSubroutineCommand() function, the execution size specified in buffersize is used instead of the execution size calculated by the driver.
Error Code |
Cause |
---|---|
|
Called when the bound command list’s object name is 0. |
|
The command request has already reached the maximum number of accumulated command requests allowable. |
|
The 3D command buffer is full because of commands added by this function. |
|
A value less than |
|
Called on a command list that is being executed. |
4.1.8. Clearing the Command List
The following function clears the command list and sets both the 3D command buffer and command request queue to the unused state (the state immediately after their memory regions are allocated).
void nngxClearCmdlist(void);
Error Code |
Cause |
---|---|
|
Called on a command list that is being executed. |
4.1.8.1. Clearing the Command List and Filling Its 3D Command Buffer
The following function clears the command list and initializes the 3D command buffer with the specified data. Both the 3D command buffer and command request queue enter the unused state.
void nngxClearFillCmdlist(GLuint data);
Error Code |
Cause |
---|---|
|
Called on a command list that is being executed. |
4.1.9. Setting Command List Parameters
You can call the nngxSetCmdlistParameteri() function to set command list parameters.
void nngxSetCmdlistParameteri(GLenum pname, GLint param);
pname |
Setting |
---|---|
|
This parameter is set for individual command list objects and can have one of the following values.
If this parameter is If this parameter is This setting takes effect depending on whether it is For more information about how additive blend results are updated for rendered gas density information, see the Gas Control Setting Registers section in the 3DS Programming Manual: Advanced Graphics. |
Error Code |
Cause |
---|---|
|
Called on a command list that is being executed. |
|
An invalid value was specified for |
4.1.10. Getting Command List Parameters
You can call the nngxGetCmdlistParameteri() function to get command list parameters.
void nngxGetCmdlistParameteri(GLenum pname, GLint* param);
pname |
Parameter Obtained |
---|---|
|
The command list execution state.
|
|
The number of bytes accumulated in the 3D command buffer. |
|
The number of accumulated command requests. |
|
The maximum 3D command buffer size. This value is specified as |
|
The maximum number of command requests. This value was specified in the |
|
The starting address of the 3D command buffer. |
|
The object name of the command list that is currently bound. |
|
The number of executed bytes in the 3D command buffer. |
|
The number of executed command requests. |
|
The starting address of the data region used for the command request queue. |
|
The type of command request that is currently executing or will be executed next. The value returned in Command request types are defined by the following macros.
|
|
The command buffer's address and byte size. The command buffer's address is stored in the first element of If the bound command list is currently stopped, parameter information is returned for the command request that will be executed next. If the bound command list is executing, parameter information is returned for the command request that is currently executing. Nothing is returned if all command requests have finished executing. The function only returns information when a render command request is the command request that is currently executing or that will be executed next. Nothing is returned for any other type of command. |
|
32 bits of data indicating the hardware state. Each of the following bits is set to 1 to indicate that the command list is in the specified state. bit 20: A post transfer command request is executing. |
|
Buffer address of the next 3D command stored in the currently bound command list. |
Error Code |
Cause |
---|---|
|
An invalid value was specified for |
|
|
4.1.11. Command Completion Interrupts
You can cause an interrupt to occur, and an interrupt handler to be called, when command requests in a command list finish. Register an interrupt handler with the nngxSetCmdlistCallback() function.
void nngxSetCmdlistCallback(void (*func)(GLint));
An interrupt handler is valid only for the bound command list. If this function is called with func set to 0 (NULL), the handler is unregistered.
The interrupt handler is called from a different thread than the main thread, so mutual exclusion is needed when referencing any data shared with the main thread. However, mutual exclusion is not needed for data shared with any callback functions for the same graphics processing registered using the nngxSetVSyncCallback() function.
Error Code |
Cause |
---|---|
|
Called on a command list that is being executed. |
Use the nngxEnableCmdlistCallback() function to specify a command request that triggers an interrupt when it finishes. Use the nngxDisableCmdlistCallback() function to disable that interrupt.
void nngxEnableCmdlistCallback(GLint id);
void nngxDisableCmdlistCallback(GLint id);
An interrupt occurs upon completion of the id-th accumulated command request. You can call this function on a single command list several times with different id values to cause multiple interrupts to occur. Note that id indicates a command request in the order in which it was accumulated, not the order in which it was executed. You can call nngxGetCmdlistParameteri() with pname set to NN_GX_CMDLIST_USED_REQCOUNT to get a value to specify for id. If id is -1, an interrupt occurs when all command requests accumulated in the command list have finished.
The command list is still executing when an interrupt handler is called, except for the interrupt raised by the last command request accumulated in the command list. Consequently, the interrupt handler itself cannot call any function that cannot be called while a command list is executing.
Even without registering an interrupt handler, you can determine when command requests have finished executing by calling nngxGetCmdlistParameteri() with pname set to NN_GX_CMDLIST_IS_RUNNING and waiting until you get a value of GL_FALSE.
Error Code |
Cause |
---|---|
|
Zero, a negative number other than |
4.1.12. Waiting for Command Execution to Complete
You can call nngxWaitCmdlistDone() to wait for all of the command requests accumulated in the command list to complete.
void nngxWaitCmdlistDone(void);
Render command requests are executed only up to the point at which they are split. To execute all of the accumulated render command requests, call nngxSplitDrawCmdlist() before this function.
This function does not return until command execution is complete. However, you can use the nngxSetTimeout() function to set a timeout period.
void nngxSetTimeout(GLint64EXT time, void (*callback)(void));
Set time to the number of ticks to wait before the nngxWaitCmdlistDone() function times out. Timeouts do not occur when a value of 0 is specified.
Set callback to the callback function to invoke when a timeout occurs. If this is NULL, no callback function is invoked when the timeout occurs.
No timeouts occur by default because the initial values for time and callback are 0 and NULL, respectively.
4.1.13. Adding a DMA Transfer Command Request
When the nngxAddVramDmaCommand() or nngxAddVramDmaCommandNoCacheFlush() function is called, a command request that runs a DMA transfer to VRAM is accumulated in the command list. The former flushes the cache for the transfer source, but the latter does not. These functions support DMA transfers only from main memory to VRAM.
void nngxAddVramDmaCommand(const GLvoid* srcaddr, GLvoid* dstaddr, GLsizei size);
void nngxAddVramDmaCommandNoCacheFlush(const GLvoid* srcaddr, GLvoid* dstaddr, GLsizei size);
An amount of data specified by size is transferred from the address specified by srcaddr to the address specified by dstaddr.
When calling the nngxAddVramDmaCommand() function, a GL_ERROR_8062_DMP error indicates that the function was called when no valid command list was bound, while a GL_ERROR_8064_DMP error indicates that size is negative.
When calling the nngxAddVramDmaCommandNoCacheFlush() function, a GL_ERROR_8090_DMP error indicates that the function was called when no valid command list was bound, and a GL_ERROR_8091_DMP error indicates that size is negative.
4.1.14. Adding an Anti-Aliasing Filter Transfer Command Request
When the nngxFilterBlockImage() function is called, a command request that transfers an image with an anti-aliasing filter applied is accumulated in the command list. (This is one kind of post-filter command request.) The image is transferred in block format, unconverted. The only supported anti-aliasing specification is 2×2.
void nngxFilterBlockImage(const GLvoid* srcaddr, GLvoid* dstaddr, GLsizei width, GLsizei height, GLenum format);
An image with the width, height, and format specified by width, height, and format, respectively, is transferred from the address specified by srcaddr to the address specified by dstaddr.
The width and height arguments are restricted as follows by the value specified for format.
| format | width | height |
|---|---|---|
| GL_RGBA8_OES | A multiple of 64, greater than or equal to 64. | A multiple of 16, greater than or equal to 64. |
| GL_RGBA4 | A multiple of 128, greater than or equal to 128. | A multiple of 16, greater than or equal to 128. |
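The restrictions in this table can be checked on the CPU before the command request is added. The following is a minimal sketch under stated assumptions: filter_format and filter_block_dims_ok() are hypothetical names, and the enum stands in for the GL_RGBA8_OES and GL_RGBA4 constants so that the example does not depend on SDK headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GL_RGBA8_OES and GL_RGBA4. */
typedef enum { FILTER_FMT_RGBA8, FILTER_FMT_RGBA4 } filter_format;

/* Returns true if width and height satisfy the table for the given format. */
bool filter_block_dims_ok(filter_format fmt, int width, int height)
{
    assert(fmt == FILTER_FMT_RGBA8 || fmt == FILTER_FMT_RGBA4);

    /* Width step: 64 for the 32-bit format, 128 for the 16-bit format. */
    const int width_step = (fmt == FILTER_FMT_RGBA8) ? 64 : 128;
    /* In the table, the minimum for both dimensions equals the width step. */
    const int minimum = width_step;

    return width >= minimum && width % width_step == 0 &&
           height >= minimum && height % 16 == 0;
}
```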
If the transfer source and destination memory regions overlap, the function works properly when the srcaddr and dstaddr values are the same, or when the srcaddr value is greater than the dstaddr value. The transfer results could be corrupted if the srcaddr value is smaller than the dstaddr value.
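The overlap rule can be expressed as a small classification function. This is a sketch only: overlap_kind and classify_filter_overlap() are hypothetical names, and the byte sizes of the two regions are supplied by the caller because they depend on the image dimensions and format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OVERLAP_NONE, OVERLAP_SAFE, OVERLAP_UNSAFE } overlap_kind;

/* Classifies a source/destination pair according to the rule above:
 * overlapping regions are safe only when srcaddr >= dstaddr. */
overlap_kind classify_filter_overlap(uintptr_t srcaddr, size_t src_bytes,
                                     uintptr_t dstaddr, size_t dst_bytes)
{
    assert(src_bytes > 0 && dst_bytes > 0);

    /* Two half-open ranges overlap when each starts before the other ends. */
    bool overlaps = srcaddr < dstaddr + dst_bytes &&
                    dstaddr < srcaddr + src_bytes;
    if (!overlaps) return OVERLAP_NONE;

    return (srcaddr >= dstaddr) ? OVERLAP_SAFE : OVERLAP_UNSAFE;
}
```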
When the value for srcaddr specifies an address in device memory, the transfer results could be incorrect if the destination memory cache has not been flushed.
Error Code |
Cause |
---|---|
|
Called when a command list with an object name of |
|
The address specified for |
|
A |
|
A |
4.1.15. Adding an Image Transfer Command Request
When the nngxTransferLinearImage() function is called, a command request that transfers an image to a render buffer or texture is accumulated in the command list. (This is one kind of copy-texture command request.) If the current 3D command buffer has accumulated unsplit commands, a split command is added, and then the transfer command request is added.
Although images are converted from linear format to block format while they are transferred, this conversion only affects addressing. If this function is called on a render buffer, the block mode setting automatically determines whether a conversion to 8 block addressing or 32 block addressing is applied during the transfer. If this function is called on a texture, a conversion to 8 block addressing is applied. In either case, you must flip an image in the V direction and convert its byte order before you transfer it.
For information about block mode, see Block Mode Settings in the 3DS Programming Manual: Advanced Graphics.
void nngxTransferLinearImage(const GLvoid* srcaddr, GLuint dstid, GLenum target);
For srcaddr, specify the starting address of the image to transfer. The image must have the same format, width, and height as the render buffer or texture to which it is transferred. However, the source pixel format must be 32-bit when the target pixel format is 24-bit because the hardware does not support transfers between 24-bit pixel formats. In this case, for each 4 bytes that are transferred, the first byte (the internal format's alpha component) is truncated.
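The 32-bit to 24-bit truncation described above can be modeled on the CPU as follows. This sketch illustrates only the byte selection, in linear order; it does not model the block addressing conversion performed by the hardware, and model_truncate_32_to_24() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPU-side model: for each 4-byte source group, drop the first byte
 * (the alpha component) and keep the remaining three bytes. */
void model_truncate_32_to_24(const uint8_t *src32, uint8_t *dst24, size_t pixel_count)
{
    assert(src32 != NULL && dst24 != NULL);
    for (size_t p = 0; p < pixel_count; p++) {
        dst24[3 * p + 0] = src32[4 * p + 1];
        dst24[3 * p + 1] = src32[4 * p + 2];
        dst24[3 * p + 2] = src32[4 * p + 3];
    }
}
```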
The image is transferred to the render buffer or texture that has the object ID specified by dstid and the object type specified by target.
When target is: |
Set dstid to: |
---|---|
|
The object ID of a render buffer. If a value of |
|
The object ID of a 2D texture. |
GL_TEXTURE_CUBE_MAP_POSITIVE_X{,Y,Z} |
The object ID of a cube map texture. |
The width and height of the target render buffer must be multiples of 8 in block 8 mode, or multiples of 32 in block 32 mode. Both the width and height must be at least 128.
Error Code |
Cause |
---|---|
|
Called when the bound command list’s object name is 0. |
|
The maximum number of command requests has already accumulated. |
|
The 3D command buffer is full because of commands added by this function. |
|
The render buffer or texture specified for |
|
There is a violation of the width and height restrictions for the target render buffer. |
|
An invalid value was specified for |
|
The target render buffer or texture does not use 32-bit, 24-bit, or 16-bit pixel sizes. |
4.1.16. Adding a Block-to-Linear Image Conversion and Transfer Command Request
A command request that converts a block image to a linear image and transfers the result can be added to the command list by calling the nngxAddB2LTransferCommand() function. (This is one kind of post-filter command request.) Although the nngxTransferRenderImage() function provides the same functionality, the nngxAddB2LTransferCommand() function is more versatile. The two also differ in that nngxAddB2LTransferCommand() adds only a transfer command request and does not add a split command.
void nngxAddB2LTransferCommand( const GLvoid* srcaddr, GLsizei srcwidth, GLsizei srcheight, GLenum srcformat, GLvoid* dstaddr, GLsizei dstwidth, GLsizei dstheight, GLenum dstformat, GLenum aamode, GLboolean yflip, GLsizei blocksize);
The srcaddr parameter specifies the transfer source (block image) address. The dstaddr parameter specifies the transfer destination (linear image) address. Both srcaddr and dstaddr must be 16-byte aligned.
The srcwidth, srcheight, dstwidth, and dstheight parameters specify the width and height of the transfer source image and of the transfer destination image, in pixels. The width and height of both images must be multiples of the block size (8 or 32). In addition, if the pixel size of the destination image is 24 bits and the block size is 8, the widths of the source and destination images must be multiples of 16. If 0 is specified for srcwidth, srcheight, dstwidth, or dstheight, the command is not issued. The width and height of the destination image, in pixels, must be less than or equal to those of the source image.
The height and width of the source and destination images, as measured in pixels, must be at least as big as the minimum allowed. The minimum height and width for source images is 128. The minimum height and width for destination images depends on the anti-alias setting. If anti-aliasing is disabled, the minimum for both height and width is 128. If 2x1 anti-aliasing is enabled, the height minimum is 128 and the width minimum is 64. If 2x2 anti-aliasing is enabled, the minimum for both height and width is 64.
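The dimension rules from the two preceding paragraphs can be collected into one check. This is a sketch under stated assumptions: aa_mode and b2l_dims_ok() are hypothetical names, the enum stands in for the aamode values, and the zero-dimension case (where the command is simply not issued) is not modeled separately and is reported as invalid here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three aamode values. */
typedef enum { AA_NONE, AA_2X1, AA_2X2 } aa_mode;

/* Returns true if the source and destination dimensions satisfy the
 * block-size, 24-bit width, ordering, and minimum-size rules. */
bool b2l_dims_ok(aa_mode aa, int blocksize, int dst_bits,
                 int srcw, int srch, int dstw, int dsth)
{
    assert(aa == AA_NONE || aa == AA_2X1 || aa == AA_2X2);
    if (blocksize != 8 && blocksize != 32) return false;

    /* Every dimension must be a multiple of the block size. */
    if (srcw % blocksize || srch % blocksize ||
        dstw % blocksize || dsth % blocksize) return false;

    /* 24-bit destination with block size 8: widths must be multiples of 16. */
    if (dst_bits == 24 && blocksize == 8 && (srcw % 16 || dstw % 16)) return false;

    /* The destination may not be larger than the source. */
    if (dstw > srcw || dsth > srch) return false;

    /* Source minimum is 128 in both directions. */
    if (srcw < 128 || srch < 128) return false;

    /* Destination minimum depends on the anti-aliasing mode. */
    const int min_dstw = (aa == AA_NONE) ? 128 : 64;
    const int min_dsth = (aa == AA_2X2) ? 64 : 128;
    return dstw >= min_dstw && dsth >= min_dsth;
}
```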
The srcformat and dstformat parameters specify the pixel format of the source and destination image. The five types of pixel formats that can be specified are listed in the following table.
| Definition | Bits | Description of Format |
|---|---|---|
| | 16 | The R, G, B, and alpha components are 4 bits each. |
| | 16 | The R, G, and B components are 5 bits each, and the alpha component is 1 bit. |
| | 16 | 5-bit RB components and 6-bit G component. No alpha component. |
| | 24 | 8-bit RGB components. No alpha component. |
| | 32 | The R, G, B, and alpha components are 8 bits each. |
Conversion to a pixel format with a higher pixel depth is not supported. For example, you cannot convert from a 24-bit format to a 32-bit format, or from a 16-bit format to the 24-bit or 32-bit format.
aamode specifies the anti-alias filter mode. The three modes that can be specified are listed in the following table. The widths and heights indicate the minimum dimensions of the source image relative to the destination image.
| Definition | Anti-Aliasing | Width | Height |
|---|---|---|---|
| | No anti-aliasing. | Equal | Equal |
| | Transferred using 2x1 anti-aliasing. | 2 times | Equal |
| | Transferred using 2x2 anti-aliasing. | 2 times | 2 times |
yflip specifies whether vertical flipping is enabled during image transfer. Flipping is performed if GL_TRUE (or a value other than 0) is specified. Flipping is not performed if GL_FALSE (or 0) is specified.
For blocksize, specify the block size used for the transfer source image (8 or 32).
Error Code |
Cause |
---|---|
|
A command list with object name 0 was bound, or there is no space in the command request queue. |
|
Either |
|
A value other than 8 or 32 is specified in |
|
An invalid value is specified in |
|
An invalid value is specified in either |
|
The pixel size of |
|
An invalid value is specified for |
GL_ERROR_8083_DMP |
The specified width or height of the destination image is greater than the width or height in pixels of the source image. |
|
The specified height or width of the source image was smaller than the minimum. |
|
The specified height or width of the destination image was smaller than the minimum. |
4.1.17. Adding a Linear-to-Block Image Conversion and Transfer Command Request
A command request that converts a linear image to a block image and transfers the result can be added to the command list by calling the nngxAddL2BTransferCommand() function. (This is one kind of post-filter command request.) The nngxTransferLinearImage() function provides the same functionality, but the nngxAddL2BTransferCommand() function is more versatile. The two also differ in that nngxAddL2BTransferCommand() adds only a transfer command request and does not add a split command.
void nngxAddL2BTransferCommand( const GLvoid* srcaddr, GLvoid* dstaddr, GLsizei width, GLsizei height, GLenum format, GLsizei blocksize);
srcaddr specifies the transfer source (linear image) address. dstaddr specifies the transfer destination (block image) address. Both srcaddr and dstaddr must be 16-byte aligned.
width and height specify the width and height, in pixels, of the transfer source and transfer destination images. The two images must have the same width and height, and each dimension must be 128 or greater and a multiple of the block size (8 or 32). In addition, if the bit depth of the source image is 24 bits, the image width must be a multiple of 32, even if the block size is 8. The command is not added if 0 is specified for either width or height.
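The width and height rules for this function can be sketched as a three-way check that distinguishes the silent zero case from genuinely invalid values. The names l2b_check and l2b_check_dims() are hypothetical, and the order in which the conditions are tested is a choice made for this example, not documented driver behavior.

```c
#include <assert.h>

typedef enum { L2B_NOT_ADDED, L2B_OK, L2B_INVALID } l2b_check;

/* Classifies a width/height pair for a linear-to-block transfer.
 * bits_per_pixel is the bit depth of the pixel format (16, 24, or 32). */
l2b_check l2b_check_dims(int width, int height, int bits_per_pixel, int blocksize)
{
    assert(bits_per_pixel == 16 || bits_per_pixel == 24 || bits_per_pixel == 32);
    if (blocksize != 8 && blocksize != 32) return L2B_INVALID;

    /* A zero dimension means the command is not added at all. */
    if (width == 0 || height == 0) return L2B_NOT_ADDED;

    /* Each dimension must be at least 128 and a multiple of the block size. */
    if (width < 128 || height < 128) return L2B_INVALID;
    if (width % blocksize != 0 || height % blocksize != 0) return L2B_INVALID;

    /* 24-bit images need a width that is a multiple of 32, even for block size 8. */
    if (bits_per_pixel == 24 && width % 32 != 0) return L2B_INVALID;

    return L2B_OK;
}
```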
format specifies the pixel format of the image being transferred. The pixel formats that can be specified are the same as those for the nngxAddB2LTransferCommand() function (Table 4-22). The source and destination images must have the same pixel format. Note, however, that if the format is 24-bit, the source image must be in 32-bit format because the hardware does not support 24-bit to 24-bit transfers. In this case, the last byte of every 4 bytes of source data is discarded.
The blocksize parameter specifies the block size of the source image as either 8 or 32.
Error Code |
Cause |
---|---|
|
A command list with object name 0 was bound or there is no space in the command request queue. |
|
Either |
|
A value other than 8 or 32 is specified in |
|
An invalid value is specified in either |
|
An invalid value is specified in |
4.1.18. Adding a Block Image Transfer Command Request
A command request for transferring a block image is added to the command list by calling the nngxAddBlockImageCopyCommand() function. The added command request allows you to copy graphics between textures and render buffers that contain rendered images. Because the transfer is specified as a combination of transfer size and skip size, you can clip part of the source image region or paste into part of the destination image region. The main purpose of this function is to transfer block format images, but because it performs no format conversion, it can also be used to transfer other types of data.
void nngxAddBlockImageCopyCommand( const GLvoid* srcaddr, GLsizei srcunit, GLsizei srcinterval, GLvoid* dstaddr, GLsizei dstunit, GLsizei dstinterval, GLsizei totalsize);
Use the srcaddr parameter to specify the transfer source start address. dstaddr specifies the transfer destination start address. Both srcaddr and dstaddr must be 16-byte aligned.
totalsize specifies the total amount of data to be transferred, in bytes. totalsize must be a multiple of 16.
srcunit and srcinterval specify the read unit size and the skip size of the transfer source, respectively. If srcinterval is 0, memory is read continuously from the start address until totalsize is reached. If srcinterval is any value other than 0, srcunit bytes of data are read and then srcinterval bytes of the source address range are skipped, repeatedly, until the amount of data transferred reaches totalsize. This operation allows part of the source image to be clipped.
dstunit
specifies the write unit size of the transfer destination, and dstinterval
specifies the skip size, in bytes. dstunit
bytes of data are written and dstinterval
bytes in the address being written are skipped, repeating alternately. Transfer ends when the amount of data transferred reaches totalsize
. If dstinterval
is 0
, memory is written continuously from the start address until totalsize
is reached. If dstinterval
is any value other than 0
, writing and skipping are repeated, allowing the image to be inserted into a portion of the memory region for the transfer destination image.
The srcunit
, srcinterval
, dstunit
, and dstinterval
parameters must be multiples of 16. Negative values and values greater than or equal to 0x100000
cannot be specified.
When transferring rendering results, such as block images, note that the start address of the transfer image (at both the source and destination) is normally the upper-left corner of the image (or the lower-left corner in OpenGL ES), and that data is arranged in block units of 8×8 pixels when using a format with a block size of 8. For more information about the block format, see 7.10. Native PICA Format.
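The unit and interval behavior described above can be sketched as a CPU-side model. This is only an illustration of the documented read/skip and write/skip pattern, not the GPU implementation. The names stride_arg_ok and model_block_image_copy are hypothetical and are not SDK functions, and the model assumes that a unit size is nonzero whenever its interval is nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: checks the documented constraints on a unit or
   interval value (multiple of 16, not negative, less than 0x100000). */
static int stride_arg_ok(long value)
{
    return value >= 0 && value < 0x100000 && (value % 16) == 0;
}

/* Hypothetical CPU-side model of the transfer pattern (not an SDK function).
   Reads srcunit bytes, skips srcinterval bytes, and repeats; writes dstunit
   bytes, skips dstinterval bytes, and repeats; stops after totalsize bytes.
   An interval of 0 means the corresponding side is accessed continuously. */
static void model_block_image_copy(const uint8_t *src, size_t srcunit, size_t srcinterval,
                                   uint8_t *dst, size_t dstunit, size_t dstinterval,
                                   size_t totalsize)
{
    size_t s = 0, d = 0;             /* current read and write offsets */
    size_t src_run = 0, dst_run = 0; /* bytes handled in the current unit */

    /* Assumption of this model: a nonzero interval needs a nonzero unit. */
    assert(srcinterval == 0 || srcunit > 0);
    assert(dstinterval == 0 || dstunit > 0);

    for (size_t n = 0; n < totalsize; n++) {
        dst[d++] = src[s++];
        if (srcinterval != 0 && ++src_run == srcunit) {
            s += srcinterval;        /* skip on the source side */
            src_run = 0;
        }
        if (dstinterval != 0 && ++dst_run == dstunit) {
            d += dstinterval;        /* skip on the destination side */
            dst_run = 0;
        }
    }
}
```

For example, to clip a 32-byte-wide column out of rows that are 64 bytes long, the model is called with srcunit set to 32, srcinterval set to 32, and dstinterval set to 0, so that the clipped rows are packed continuously at the destination.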
Error Code |
Cause |
---|---|
|
A command list with object name 0 was bound or there is no space in the command request queue. |
|
Either |
|
|
|
An invalid value was specified in |
4.1.19. Adding a Memory Fill Command Request
A command request for filling the specified region of memory with the specified data can be added to the command list by calling the nngxAddMemoryFillCommand()
function. The command request added by this function can be used for purposes such as clearing the color buffer or depth buffer (stencil buffer). The glClear()
function provides the same functionality, but this function is more versatile. Two memory regions of different sizes can be cleared simultaneously by making settings for two channels with independently specifiable parameters.
void nngxAddMemoryFillCommand( GLvoid* startaddr0, GLsizei size0, GLuint data0, GLsizei width0, GLvoid* startaddr1, GLsizei size1, GLuint data1, GLsizei width1);
startaddr0
, size0
, data0
, and width0
represent settings for Channel 0. startaddr1
, size1
, data1
, and width1
represent settings for Channel 1. Memory is filled simultaneously for both Channel 0 and Channel 1. If the memory regions specified for Channel 0 and Channel 1 overlap, the fill data that is ultimately applied to the overlapping part is undefined.
startaddr0
and startaddr1
specify the start addresses of the memory regions. Addresses must be 16-byte aligned. If 0 is specified for an address, that channel is not used. If 0
is specified for startaddr0
, no error checking is performed for size0
, data0
, or width0
. If 0
is specified for startaddr1
, no error checking is performed for size1
, data1
, or width1
.
size0
and size1
specify the sizes of the memory regions, in bytes. Sizes must be multiples of 16.
data0
and data1
specify the fill pattern data. The specified fill pattern is repeatedly inserted into the memory region until it is full.
width0
and width1
specify the bit width of the fill pattern. The values 16, 24, or 32 can be specified for the bit width. If 16 is specified, the memory region is filled in 16-bit units using bits [15:0] of the data. If 24 is specified, the memory region is filled in 24-bit units using bits [23:0] of the data. If 32 is specified, the memory region is filled in 32-bit units using bits [31:0] of the data.
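As a minimal sketch of the parameter rules above, the following helpers check one channel's settings and detect overlapping regions before the values would be passed to nngxAddMemoryFillCommand(). The FillChannel type and both function names are hypothetical and not part of the SDK, and the checks cover only the constraints stated in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one fill channel (not an SDK type). */
typedef struct {
    uintptr_t start; /* start address; 0 means the channel is not used */
    size_t    size;  /* region size in bytes */
    uint32_t  data;  /* fill pattern */
    int       width; /* bit width of the fill pattern */
} FillChannel;

/* Returns 1 if the channel settings satisfy the documented constraints. */
static int fill_channel_ok(const FillChannel *ch)
{
    assert(ch != NULL);
    if (ch->start == 0) {
        return 1; /* unused channel: size, data, and width are not checked */
    }
    if (ch->start % 16 != 0 || ch->size % 16 != 0) {
        return 0; /* address must be 16-byte aligned, size a multiple of 16 */
    }
    return ch->width == 16 || ch->width == 24 || ch->width == 32;
}

/* Returns 1 if both channels are used and their regions overlap, in which
   case the fill data applied to the overlapping part is undefined. */
static int fill_channels_overlap(const FillChannel *a, const FillChannel *b)
{
    assert(a != NULL && b != NULL);
    if (a->start == 0 || b->start == 0 || a->size == 0 || b->size == 0) {
        return 0;
    }
    return a->start < b->start + b->size && b->start < a->start + a->size;
}
```

An application could run both checks in debug builds and queue the command request only when each channel passes and the regions do not overlap.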
The following table provides fill pattern specifications (bit width and various parameter values) according to the render buffer format being used.
| Render Buffer Format | Bit Width | R / D | G / S | B | A |
|---|---|---|---|---|---|
|  | 32 | [31:24] | [23:16] | [15:8] | [7:0] |
|  | 24 | [23:16] | [15:8] | [7:0] | - |
|  | 16 | [15:12] | [11:8] | [7:4] | [3:0] |
|  | 16 | [15:11] | [10:6] | [5:1] | [0:0] |
|  | 16 | [15:11] | [10:5] | [4:0] | - |
|  | 32 | [23:0] | [31:24] | - | - |
|  | 24 | [23:0] | - | - | - |
|  | 16 | [15:0] | - | - | - |
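The bit layouts in the table above can be expressed as small packing helpers that build the data0 or data1 value. This is a sketch only: the function names are hypothetical and describe the bit layout rather than any SDK format name, and each helper assumes that its arguments already fit in the bit ranges listed in the table.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit width: R [31:24], G [23:16], B [15:8], A [7:0] */
static uint32_t fill_rgba8888(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) | ((uint32_t)b << 8) | a;
}

/* 24-bit width: R [23:16], G [15:8], B [7:0] */
static uint32_t fill_rgb888(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}

/* 16-bit width: R [15:12], G [11:8], B [7:4], A [3:0] */
static uint32_t fill_rgba4444(uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
    assert(r <= 0xF && g <= 0xF && b <= 0xF && a <= 0xF);
    return (r << 12) | (g << 8) | (b << 4) | a;
}

/* 16-bit width: R [15:11], G [10:6], B [5:1], A [0:0] */
static uint32_t fill_rgba5551(uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
    assert(r <= 0x1F && g <= 0x1F && b <= 0x1F && a <= 0x1);
    return (r << 11) | (g << 6) | (b << 1) | a;
}

/* 16-bit width: R [15:11], G [10:5], B [4:0] */
static uint32_t fill_rgb565(uint32_t r, uint32_t g, uint32_t b)
{
    assert(r <= 0x1F && g <= 0x3F && b <= 0x1F);
    return (r << 11) | (g << 5) | b;
}

/* 32-bit width: depth [23:0], stencil [31:24] */
static uint32_t fill_depth24_stencil8(uint32_t depth, uint8_t stencil)
{
    assert(depth <= 0xFFFFFF);
    return ((uint32_t)stencil << 24) | depth;
}
```

For the 24-bit and 16-bit depth-only rows, the depth value is used directly as the fill data, so no packing helper is needed.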
Error Code |
Cause |
---|---|
|
A command list with object name |
|
|
|
|
|
An invalid value is specified in |
4.1.20. Moving the 3D Command Buffer Pointer
Call the nngxMoveCommandbufferPointer()
function to move the pointer in the 3D command buffer of the currently bound command list. (This pointer is the position in the 3D command buffer from which the 3D commands start running.)
void nngxMoveCommandbufferPointer(GLint offset);
Specify the amount by which to move the pointer (in bytes) as the offset
parameter.
A GL_ERROR_8061_DMP
error occurs when no command list is bound, or this operation would move the pointer outside of the 3D command buffer region.
4.1.21. Adding Jump Commands
Call the nngxAddJumpCommand()
function to add a jump command to the currently bound command list. The jump command executes the 3D commands in the specified memory region. Use a jump command to move execution to a different command list without causing any interrupts.
This function uses the command buffer execution PICA registers. It uses only channel 0, so two registers (0x0238
and 0x023A
) are both written when this function is run. For more information, see 8.8.23. Command Buffer Execution Registers (0x0238 – 0x023D) and 8.8.23.1. Consecutive Execution of Command Buffers in 3DS Programming Manual: Advanced Graphics.
void nngxAddJumpCommand(const GLvoid* bufferaddr, GLsizei buffersize);
In bufferaddr
and buffersize
, specify the address and size of the command buffer to move execution to. Both bufferaddr
and buffersize
must be multiples of 16.
The content of the destination command buffer (the command list specified by bufferaddr
and buffersize
) is not copied to the command buffer of the currently bound command list. A jump command changes the execution address of a command buffer and directly executes the destination command buffer. Consequently, the application must ensure that the jump destination memory cache has been flushed.
The last command executed at the jump destination must be a split command (a command to write to the split command setting register, added by the nngxSplitDrawCmdlist()
function). Alternatively, this command could be another jump command. When using multiple jump commands, the last command in the last command buffer in the chain must be a split command.
This function adds a command request for a 3D execution command. A GL_ERROR_809A_DMP
error occurs when this function is called immediately after the command buffer has been flushed (for example, by a call to the nngxFlush3DCommand()
function) because doing so is meaningless. To add a 3D command to the command buffer immediately after a flush, call the nngxAdd3DCommand()
function.
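As a rough illustration of the requirement that bufferaddr and buffersize be multiples of 16, the following sketch checks a candidate jump destination before it would be passed to nngxAddJumpCommand(). The helper names round_up_to_16 and jump_target_ok are hypothetical and not part of the SDK. How a command buffer is padded out to the rounded size is outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds a byte count up to the next multiple of 16. */
static size_t round_up_to_16(size_t n)
{
    assert(n <= SIZE_MAX - 15u); /* avoid wrap-around when rounding */
    return (n + 15u) & ~(size_t)15u;
}

/* Hypothetical helper: returns 1 if the address and size of a jump
   destination are both multiples of 16, as required above. */
static int jump_target_ok(uintptr_t bufferaddr, size_t buffersize)
{
    return (bufferaddr % 16 == 0) && (buffersize % 16 == 0);
}
```

A debug build could assert jump_target_ok() on every jump destination so that a misaligned buffer is caught before the command is added.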
Error Code |
Cause |
---|---|
|
The bound command list’s object name is 0. |
|
|
|
|
|
|
|
This function was called immediately after the command buffer was flushed. |
|
The command request added by this function makes the queue overflow. |
|
The command added by this function makes the command buffer overflow. |
4.1.22. Adding Subroutine Commands
Call the nngxAddSubroutineCommand()
function to add two commands to the currently bound command list: a jump command that executes the 3D commands in the specified memory region, and a command that sets the address for returning to the command buffer that was jumped from. Use a subroutine command to execute another command list without causing any interrupts, as if it were a subroutine.
This function uses the command buffer execution PICA registers. It uses all channels, so four registers (0x0238
through 0x023B
) are written when this function is run. For more information, see 8.8.23. Command Buffer Execution Registers (0x0238 – 0x023D) and 8.8.23.1. Consecutive Execution of Command Buffers in 3DS Programming Manual: Advanced Graphics.
void nngxAddSubroutineCommand(const GLvoid* bufferaddr, GLsizei buffersize);
In bufferaddr
and buffersize
, specify the address and size of the command buffer to move execution to. Both bufferaddr
and buffersize
must be multiples of 16.
The content of the destination command buffer (the command list specified by bufferaddr
and buffersize
) is not copied to the command buffer of the currently bound command list. A jump command changes the execution address of a command buffer and directly executes the destination command buffer. Consequently, the application must ensure that the jump destination memory cache has been flushed.
The jump command is executed on channel 0, and the command to return to the command buffer jumped from is executed on channel 1. Consequently, the last command executed at the jump destination must be a kick command for channel 1 (a command to write to the command buffer execution register 0x023D
). Alternatively, this command could be a jump command to another command buffer, but the jump must use channel 0, and the last command in the last command buffer in the chain must be a kick command for channel 1. In addition, you must not write to the channel 1 size and address setting registers (0x0239
and 0x023B
). This function adds a jump command (channel 0) and an address setting (channel 1). The application must place the channel 1 kick command and the jump commands within the subroutine.
This function does not add a command request for a 3D execution command. After calling this function, continue accumulating commands, and then execute them after flushing the command buffer, such as by using the nngxFlush3DCommand()
function. Values written to the channel 1 size setting register (0x0239
) added by this function are undefined until the command buffer is flushed. Operation is similarly undefined if you reuse the copied content of this register until the command buffer is flushed.
Error Code |
Cause |
---|---|
|
The bound command list’s object name is 0. |
|
|
|
|
|
|
|
The command added by this function makes the command buffer overflow. |
4.2. Command Request Types
The following command requests are queued in a command list.
DMA Transfer Command Requests
These command requests use DMA transfers to send texture images and vertex buffers from main memory into VRAM.
These command requests are queued by glTexImage2D()
and other functions that allocate texture regions, and by glBufferData()
and other functions that allocate vertex buffer regions.
Render Command Requests
These command requests execute a single command set of 3D commands accumulated in the 3D command buffer.
When glClear()
, glTexImage2D()
, and other functions are called, they write a buffer loading complete 3D command and then queue the accumulated 3D command buffer as a single render command request.
The nngxSplitDrawCmdlist()
function allows you to queue render command requests at any time.
Memory-Fill Command Requests
These command requests use the GPU memory-fill feature to clear a region allocated in VRAM using a specified data pattern.
These command requests specify a render buffer and are queued when the glClear()
function is called. The glClear()
function also requires 3D commands to be executed in addition to the memory-fill command request. In other words, when the glClear()
function is called, it first writes 3D commands for the glClear()
function and a buffer loading complete 3D command, and then it queues a render command request and a memory-fill command request.
Post-Transfer Command Requests
These command requests use the GPU post-filter feature to convert images rendered in PICA block format into a linear format that can be read by the LCDs.
These command requests are queued when the nngxTransferRenderImage()
function is called. If the nngxSplitDrawCmdlist()
function has not been called in advance to stop reading from the 3D command buffer, these command requests are queued after a buffer loading complete command is written and a render command request is queued.
Copy Texture Command Requests
These command requests copy GPU rendering results into memory as texture images.
These command requests are queued when glCopyTexImage2D()
or glCopyTexSubImage2D()
is called.
If the nngxSplitDrawCmdlist()
function has not been called in advance to stop reading from the 3D command buffer, these command requests are queued after a buffer loading complete command is written and a render command request is queued.
4.3. Methods for Optimizing 3D Command Buffer Performance
Methods for optimizing performance during 3D command buffer execution are described below.
4.3.1. Changes in Load Speed due to Address and Size
The address and size of a 3D command buffer can have an effect on load speed at run time.
There are two ways to execute a command buffer: executing the 3D execution commands queued in a command request, and executing through the command buffer execution register.
When executing 3D execution commands, load speed is affected by the size of the region that starts at the address immediately after one split command (added by nngxFlush3DCommand()
or nngxSplitDrawCmdlist()
) and ends at the next split command. You can get the address of 3D commands being accumulated in the 3D command buffer by calling the nngxGetCmdlistParameteri()
function and passing NN_GX_CMDLIST_CURRENT_BUFADDR for pname
.
When executing through the command buffer execution register, load speed is affected by the address and size of each command buffer that is executed in one of the following ways: as a jump destination added by nngxAddJumpCommand()
, as a subroutine added by nngxAddSubroutineCommand()
, or as the command buffer that execution returns to when a subroutine returns to the calling location.
If the 3D command buffer address is 128-byte aligned and the size is a multiple of 256 bytes (256, 512, 768, and so on), load speed may be faster.
Even if the 3D command buffer address is not 128-byte aligned, load speed may be faster if the size measured from the preceding 128-byte aligned address to the end of the 3D command buffer is a multiple of 256 bytes. For example, if the 3D command buffer address and size are 0x20000010
and 0x1F0
respectively, the preceding 128-byte aligned address is only 0x10
earlier, at 0x20000000
. The distance from there to the end is 0x1F0 + 0x10
, which is 0x200
(and a multiple of 256).
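The rule above can be written as a small check. This is a sketch of the arithmetic only: the function names are hypothetical, and a favorable result does not guarantee faster loading, for the reasons given in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: distance in bytes from the preceding 128-byte aligned
   address to the end of the command buffer. When addr is already 128-byte
   aligned, this is simply the buffer size. */
static size_t span_from_aligned_start(uintptr_t addr, size_t size)
{
    uintptr_t aligned_start = addr & ~(uintptr_t)0x7F;
    size_t lead = (size_t)(addr - aligned_start);
    assert(size <= SIZE_MAX - lead); /* avoid wrap-around in the sum */
    return lead + size;
}

/* Returns 1 if the address and size match the pattern described above:
   the span from the preceding 128-byte aligned address to the end of the
   buffer is a nonzero multiple of 256 bytes. */
static int cmdbuf_layout_is_favorable(uintptr_t addr, size_t size)
{
    return size != 0 && span_from_aligned_start(addr, size) % 256 == 0;
}
```

With the values from the example, span_from_aligned_start(0x20000010, 0x1F0) evaluates to 0x200, so the layout is reported as favorable.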
The address and size of 3D command buffers have these characteristics due to implementation details of the GPU, but they may not have significant effect in some cases due to factors such as: where the buffer is stored, details of the 3D commands, or memory access conflicts with other modules.
4.3.2. Using Subroutine Execution
It may be possible to improve performance by using 3D command buffer subroutine execution.
4.3.2.1. Overview
3D command buffer subroutine execution uses the command buffer execution register. Ordinarily, 3D commands are stored in consecutive 3D command buffers and executed in sequence. With this method, a command buffer stored in a different location is executed by using the command buffer address jump feature. The method is called command buffer subroutine execution because control flows as in a subroutine call: an address jump specifies the address of a 3D command buffer, the 3D command buffer at that location is executed, and execution then returns to the calling location.
For more information about using command buffer subroutine execution, see 4.1.22. Adding Subroutine Commands and also refer to Command Buffer Execution Registers in the 3DS Programming Manual: Advanced Graphics.
4.3.2.2. Effect on Behavior
Command buffer subroutine execution has the following advantages.
- Only a jump command to the subroutine command buffer needs to be stored, eliminating the CPU processing needed to copy the 3D commands. This technique is effective for large sets of commands that are issued frequently, such as those that load reference table data or shader programs.
- The subroutine command buffer is not copied to the current 3D command buffer, but is referenced directly by the GPU, allowing the total size of the command buffer to be reduced.
- If the subroutine command buffer is stored in VRAM, GPU access to the command buffer is faster than if it is in main memory (device memory). If memory access to the command buffer is a performance bottleneck, this technique could improve overall system processing speed.
On the other hand, it has the following disadvantage.
- Switching the address due to a jump command incurs memory access overhead. If the granularity of subroutines in the implementation is small and they are called frequently, a decrease in GPU processing speed could result.
The effect of converting to subroutines on processing performance is heavily influenced by issues such as memory access conflicts, so it is strongly dependent on the actual implementation of the application.
4.3.2.3. Storage Location
Command buffer access speed is faster in VRAM than in main memory (device memory), so we recommend storing subroutine command buffers in VRAM.
There is some memory access overhead when executing a subroutine command buffer using a jump command, but if the executed command buffer is stored in VRAM, this overhead is decreased.
To store a command buffer in VRAM, first generate it in device memory and then transfer it to VRAM by DMA using nngxAddVramDmaCommand()
. For information about DMA transfers to VRAM, see 4.1.13. Adding a DMA Transfer Command Request.
4.3.2.4. Balance Between Execution and Access Processes
Depending on the content of subroutine command buffers, the processing bottleneck could move between accessing and executing 3D commands.
If a 3D command writes to a register of the rasterization module or a later module, it requires 2 cycles to process, so execution is relatively costly. When the 3D commands are composed of burst commands, execution is even more costly relative to access processing. In this case, the bottleneck is command execution, and the memory access cost incurred by conversion to subroutines is hidden.
If a 3D command writes to a register of a module before the rasterization module, it requires only 1 cycle to process, so execution is light compared with the commands discussed in the previous paragraph. In this case, the bottleneck is more likely to be access processing, and the memory access cost incurred by conversion to subroutines is more likely to affect overall performance.
For information about the relative positioning of each module, see 2.2. Rendering Pipeline.