MmxSplitWord Function

Description

This function splits the input character string into phrase units and returns the processing result for each phrase. If the return value indicates an error, the contents of the processing results (result) and of the parsed string length (processLen) may be undefined.
This function uses the following dictionaries in the specified dictionary set (iwnn->dicSet).
  ■ Integrated dictionary
  ■ User dictionary
  ■ Ancillary word dictionary
  ■ Rule dictionary
  ■ Compressed customized dictionary
  ■ Uncompressed customized dictionary
  ■ Single-kanji dictionary
However, do not include in the dictionary set a customized dictionary whose parts of speech have not been accurately assigned.

Syntax

#include <mw/iwnn/iwnnCTR.h>

s16 MmxSplitWord(
     IWNN_CLASS* iwnn,                            // Parsing information class
     IWNN_RESULT* result,                         // Processing result storage buffer
     u8* processLen,                              // String length after parsing is complete
     const wchar_t* input                         // Input string
);

Arguments

Name                       Description
IN / OUT IWNN_CLASS* iwnn  Parsing information class.
                           An error results if NULL is specified.
OUT IWNN_RESULT* result    Buffer that receives the results of splitting the input.
                           Prepare a buffer large enough to store results for the maximum morphological analysis string length (MM_MAX_MORPHOLOZE_LEN).
                           An error results if NULL is specified.
OUT u8* processLen         Position (string length) reached when parsing of the input is complete.
                           An error results if NULL is specified.
IN const wchar_t* input    The input string to split.
                           An error results if NULL is specified.
                           0 is returned if an empty string ("") is specified.
                           Add a terminator at the end of the string.
                           Because iWnn uses (overwrites) this string internally while splitting the input, do not change its contents until the operation is complete.
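
The argument requirements above can be sketched as a calling pattern. This is only an illustration under stated assumptions: the type layouts, the value of MM_MAX_MORPHOLOZE_LEN, the one-slot-per-character sizing of the result buffer, the mock body of MmxSplitWord, and the helper split_once are all hypothetical stand-ins so that the sketch builds without the SDK header. Against the real SDK, remove the stand-ins and include <mw/iwnn/iwnnCTR.h> instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Stand-ins so the sketch builds without <mw/iwnn/iwnnCTR.h>. The real header
   supplies these; the layouts and the constant's value here are assumptions. */
typedef int16_t s16;
typedef uint8_t u8;
typedef struct { int unused; } IWNN_CLASS;
typedef struct { u8 candidateLen; u8 stemLen; } IWNN_RESULT;
#define MM_MAX_MORPHOLOZE_LEN 50

/* Mock with the documented signature. It does no linguistic analysis: it
   treats the whole input as one phrase so the calling pattern can be run. */
static s16 MmxSplitWord(IWNN_CLASS* iwnn, IWNN_RESULT* result,
                        u8* processLen, const wchar_t* input)
{
    size_t len;

    if (iwnn == NULL || result == NULL || processLen == NULL || input == NULL) {
        return -1;  /* the real function returns a specific negative error code */
    }
    if (input[0] == L'\0') {
        return 0;   /* documented result for an empty string */
    }
    len = wcslen(input);
    if (len > MM_MAX_MORPHOLOZE_LEN) {
        len = MM_MAX_MORPHOLOZE_LEN;
    }
    result[0].candidateLen = (u8)len;
    result[0].stemLen = (u8)len;
    *processLen = (u8)len;
    return 1;       /* arbitrary positive value standing in for normal termination */
}

/* Hypothetical caller that prepares each argument as the table describes. */
static s16 split_once(IWNN_CLASS* iwnn, const wchar_t* text, u8* parsedLen)
{
    /* Assumption: one IWNN_RESULT slot per character of the maximum length. */
    static IWNN_RESULT results[MM_MAX_MORPHOLOZE_LEN];

    *parsedLen = 0;  /* caller-side default; do not rely on it after an error */

    /* text must be terminated and left unmodified until the call returns. */
    return MmxSplitWord(iwnn, results, parsedLen, text);
}
```

A caller would pass a terminated wide string, for example split_once(&iwnn, L"abc", &parsedLen), and check the return value before reading parsedLen or the result buffer.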

Return Value

s16  Negative value: Error
     0: An empty string ("") was specified for the input string
     Other: Normal termination
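
The three return-value cases can be separated with a small helper. This is a minimal sketch: the SplitStatus enum and classify_split_return are hypothetical names introduced for illustration, and s16 is declared here only as a stand-in for the SDK's own typedef.

```c
#include <stdint.h>

typedef int16_t s16;  /* stand-in; the SDK header defines s16 */

/* Hypothetical classification of the documented return-value cases. */
typedef enum {
    SPLIT_ERROR,        /* negative value: error */
    SPLIT_EMPTY_INPUT,  /* 0: an empty string was specified */
    SPLIT_OK            /* any other value: normal termination */
} SplitStatus;

static SplitStatus classify_split_return(s16 ret)
{
    if (ret < 0) {
        return SPLIT_ERROR;
    }
    if (ret == 0) {
        return SPLIT_EMPTY_INPUT;
    }
    return SPLIT_OK;
}
```

Only when the status is SPLIT_OK should the caller read the result buffer and processLen, since both may be undefined after an error.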

Error Code                     Description of Error
NJ_ERR_PARAM_ENVIRONMENT_NULL  A NULL pointer was specified for iwnn
NJ_ERR_PARAM_READING_NULL      A NULL pointer was specified for input
NJ_ERR_PARAM_PROCESS_LEN_NULL  A NULL pointer was specified for processLen
NJ_ERR_PARAM_RESULT_NULL       A NULL pointer was specified for result
NJ_ERR_DIC_BROKEN              One of the following occurred:
                                 ■ The add location for user dictionary information could not be obtained from the user dictionary specified in iwnn->dicSet
                                 ■ The number of words to register is greater than the maximum number of registerable words as defined in the user dictionary header specified in iwnn->dicSet
                                 ■ The data in the user dictionary specified in iwnn->dicSet is corrupted
NJ_ERR_NO_RULE_DIC             A rule dictionary was not set in iwnn->dicSet
NJ_ERR_NO_PARTS_OF_SPEECH      The required part-of-speech information cannot be obtained from the rule dictionary

Get delimited input position

The length of the input string covered by a processing result structure (IWNN_RESULT) returned by the split word function (MmxSplitWord) can be obtained using the MM_GET_CANDIDATE_LEN macro. The boundary between the independent word and the ancillary words within that structure can be obtained using the MM_GET_STEM_LEN macro.
Macros                               Description
MM_GET_STEM_LEN(IWNN_RESULT *)       Gets the length of the independent word portion of the candidate string.
MM_GET_CANDIDATE_LEN(IWNN_RESULT *)  Gets the string length of the candidate string.

Depending on the specified text, only phrases consisting of ancillary words (such as "ですね" and "さ") may be returned. In this case, the MM_GET_STEM_LEN macro returns 0.
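
The relationship between the two macros can be sketched as follows. Everything structural here is an assumption: the IWNN_RESULT layout and both macro bodies are stand-ins written so that the sketch builds without the SDK header, and ancillary_len and is_ancillary_only are hypothetical helpers. The sketch also assumes the stem length is measured from the start of the candidate string, so that the ancillary portion is the remainder.

```c
#include <stdint.h>

typedef uint8_t u8;

/* Stand-ins (assumptions): the real structure and macros come from the SDK. */
typedef struct { u8 candidateLen; u8 stemLen; } IWNN_RESULT;
#define MM_GET_CANDIDATE_LEN(r) ((r)->candidateLen)
#define MM_GET_STEM_LEN(r)      ((r)->stemLen)

/* Length of the ancillary-word part that follows the independent word. */
static u8 ancillary_len(IWNN_RESULT* r)
{
    return (u8)(MM_GET_CANDIDATE_LEN(r) - MM_GET_STEM_LEN(r));
}

/* Nonzero when the phrase has no independent word (stem length is 0). */
static int is_ancillary_only(IWNN_RESULT* r)
{
    return MM_GET_CANDIDATE_LEN(r) > 0 && MM_GET_STEM_LEN(r) == 0;
}
```

For a phrase with a candidate length of 5 and a stem length of 2, ancillary_len yields 3. For a phrase such as "ですね", where the stem length is 0, is_ancillary_only reports true.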
