nlib
nn::nlib::succinct::AhoCorasickBuilder Class Reference final

Creates the index (automaton) used in the Aho-Corasick algorithm.

#include "nn/nlib/succinct/AhoCorasickBuilder.h"

Public Types

typedef bool(* MatchCallback) (const char *first, const char *last, uint32_t nodeid, void *user_obj)
 User-defined callback function.
 

Public Member Functions

size_t GetNumWords () const noexcept
 Gets the number of registered strings or patterns.
 
size_t GetNumBytes () const noexcept
 Gets the total size of the registered strings or patterns in bytes.
 
size_t GetNumNodes () const noexcept
 Gets the number of created automaton nodes.
 
Constructor, Destructor, and Initialization
constexpr AhoCorasickBuilder () noexcept
 Instantiates the object with default parameters (default constructor).
 
 ~AhoCorasickBuilder () noexcept
 Destructor.
 
 AhoCorasickBuilder (AhoCorasickBuilder &&rhs)
 Instantiates the object (move constructor).
 
AhoCorasickBuilder & operator= (AhoCorasickBuilder &&rhs)
 Move assignment operator.
 
void Reset () noexcept
 Resets this object to the state immediately after the default constructor was executed.
 
bool Init () noexcept
 Initializes the object. Returns true if successful.
 
Adding Detection-Target String or Data

Use these functions to add detection-target strings or data. Call Build() after all of them have been added.

bool AddWord (const char *str) noexcept
 Adds a detection target string.
 
bool AddPattern (const void *p, size_t n) noexcept
 Adds detection target data.
 
bool AddWords (const char *str, size_t len) noexcept
 Adds multiple target strings from a single buffer. The strings must be delimited by newlines (CRLF or LF).
 
bool AddWords (const char *str) noexcept
 Adds multiple target strings from a single null-terminated buffer. The strings must be delimited by newlines (CRLF or LF).
 
AhoCorasick * Build () noexcept
 Creates an Aho-Corasick object. Constructs an Aho-Corasick algorithm automaton to detect the added strings or patterns.
 
std::unique_ptr< AhoCorasick > Build2 () noexcept
 Creates the Aho-Corasick object and returns it in a unique_ptr. Constructs an Aho-Corasick algorithm automaton to detect the added strings or patterns.
 
String Matching
void MatchByBuilder (const char *doc, MatchCallback callback, void *user_obj) noexcept
 Inspects the string to detect the target strings.
 
void MatchByBuilder (const char *doc, MatchCallback callback) noexcept
 A version of the above function with the user_obj parameter omitted.
 

Detailed Description

Creates the index (automaton) used in the Aho-Corasick algorithm.

Description
Add target strings or patterns using methods such as AddWord(), and then create the Aho-Corasick object by calling Build(). The caller owns the object returned by Build() and must delete it. Build2() returns the same object wrapped in a unique_ptr.

Definition at line 33 of file AhoCorasickBuilder.h.
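
The add-then-build workflow described above can be illustrated with a small, self-contained automaton. This is a sketch only: ToyAhoCorasick and its members Add, Build, and Match are hypothetical names, and the code is not the nlib implementation, which builds a succinct index. It shows the general technique: insert the patterns into a trie, compute failure links breadth-first, then scan the text once.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy Aho-Corasick automaton (hypothetical; NOT the nlib implementation).
class ToyAhoCorasick {
 public:
    ToyAhoCorasick() : nodes_(1) {}  // node 0 is the root

    // "Add" phase: inserts one pattern into the trie.
    void Add(const std::string& word) {
        assert(!word.empty());
        int cur = 0;
        for (char c : word) {
            int child;
            auto it = nodes_[cur].next.find(c);
            if (it != nodes_[cur].next.end()) {
                child = it->second;
            } else {
                child = static_cast<int>(nodes_.size());
                nodes_[cur].next[c] = child;
                nodes_.emplace_back();
            }
            cur = child;
        }
        nodes_[cur].out.push_back(word.size());
    }

    // "Build" phase: computes failure links in breadth-first order.
    void Build() {
        std::queue<int> q;
        for (const auto& kv : nodes_[0].next) q.push(kv.second);
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            for (const auto& kv : nodes_[u].next) {
                char c = kv.first;
                int v = kv.second;
                int f = nodes_[u].fail;
                while (f != 0 && nodes_[f].next.count(c) == 0) f = nodes_[f].fail;
                auto it = nodes_[f].next.find(c);
                nodes_[v].fail = (it != nodes_[f].next.end()) ? it->second : 0;
                // Inherit the patterns that end at the failure target.
                const std::vector<std::size_t>& inherited = nodes_[nodes_[v].fail].out;
                nodes_[v].out.insert(nodes_[v].out.end(), inherited.begin(), inherited.end());
                q.push(v);
            }
        }
    }

    // Scans text once; returns (start offset, length) for every occurrence.
    std::vector<std::pair<std::size_t, std::size_t>> Match(const std::string& text) const {
        std::vector<std::pair<std::size_t, std::size_t>> hits;
        int s = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            char c = text[i];
            while (s != 0 && nodes_[s].next.count(c) == 0) s = nodes_[s].fail;
            auto it = nodes_[s].next.find(c);
            if (it != nodes_[s].next.end()) s = it->second;
            for (std::size_t len : nodes_[s].out) hits.emplace_back(i + 1 - len, len);
        }
        return hits;
    }

 private:
    struct Node {
        std::map<char, int> next;      // trie edges
        int fail = 0;                  // failure link (root by default)
        std::vector<std::size_t> out;  // lengths of patterns ending here
    };
    std::vector<Node> nodes_;
};
```

With the patterns "he", "she", "his", and "hers" added and Build() called, Match("ushers") reports "she" at offset 1, "he" at offset 2, and "hers" at offset 2. The separate Build step matters because failure links can only be computed once every pattern is in the trie.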

Member Typedef Documentation

◆ MatchCallback

nn::nlib::succinct::AhoCorasickBuilder::MatchCallback

User-defined callback function.

Parameters
[in]  first  Pointer to the start of the detected term.
[in]  last  Pointer to the end of the detected term.
[in]  nodeid  ID of the detected term. IDs are not necessarily consecutive.
[in,out]  user_obj  User data.
Returns
Return true to continue the analysis. Return false to end it.
Description
Called each time a registered string (pattern) is detected. Implement this function to process each detection.

Definition at line 48 of file AhoCorasickBuilder.h.
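
As an illustration of the callback contract, the following sketch defines a function with the documented MatchCallback shape that records each detected term through user_obj and stops after a limit. MatchLog and CollectMatch are hypothetical names. The sketch also assumes that first and last form a half-open range [first, last); this page does not say whether last is inclusive, so check the header before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>
#include <vector>

// Same shape as the MatchCallback typedef documented above.
typedef bool (*MatchCallback)(const char* first, const char* last,
                              uint32_t nodeid, void* user_obj);

// Hypothetical user data: collected terms and IDs, plus a cap on matches.
struct MatchLog {
    std::vector<std::string> terms;
    std::vector<uint32_t> ids;
    std::size_t limit;
};

// Appends each detected term to the MatchLog passed through user_obj.
// ASSUMPTION: [first, last) is half-open; adjust if last is inclusive.
bool CollectMatch(const char* first, const char* last,
                  uint32_t nodeid, void* user_obj) {
    assert(user_obj != nullptr);
    MatchLog* log = static_cast<MatchLog*>(user_obj);
    log->terms.emplace_back(first, last);
    log->ids.push_back(nodeid);
    // true asks the matcher to continue; false asks it to stop.
    return log->terms.size() < log->limit;
}
```

A function like this would be passed as the callback argument of MatchByBuilder, with a pointer to a MatchLog as user_obj. Passing state through user_obj keeps the callback a plain function pointer, which the typedef requires.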

Member Function Documentation

◆ AddPattern()

nn::nlib::succinct::AhoCorasickBuilder::AddPattern ( const void *  p,
size_t  n 
)
noexcept

Adds detection target data.

Parameters
[in]  p  Pointer to the data.
[in]  n  Size of the data in bytes.
Returns
Returns true when successful.

◆ AddWord()

nn::nlib::succinct::AhoCorasickBuilder::AddWord ( const char *  str)
noexcept

Adds a detection target string.

Parameters
[in]  str  Null-terminated string.
Returns
Returns true when successful.

◆ AddWords() [1/2]

nn::nlib::succinct::AhoCorasickBuilder::AddWords ( const char *  str,
size_t  len 
)
noexcept

Adds multiple target strings from a single buffer. The strings must be delimited by newlines (CRLF or LF).

Parameters
[in]  str  Pointer to the buffer.
[in]  len  Length of the buffer in bytes (the strlen value).
Returns
Returns true when successful.
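
To make the expected input format concrete, the sketch below splits a buffer on LF or CRLF the way the description suggests. SplitWords is a hypothetical helper, not an nlib function, and skipping empty lines is an assumption: this page does not state how AddWords treats blank lines or a trailing newline.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a buffer of len bytes into words delimited by LF or CRLF.
// ASSUMPTION: empty lines are skipped; nlib's behavior is not documented here.
std::vector<std::string> SplitWords(const char* str, std::size_t len) {
    assert(str != nullptr || len == 0);
    std::vector<std::string> words;
    std::size_t start = 0;
    for (std::size_t i = 0; i <= len; ++i) {
        if (i == len || str[i] == '\n') {
            std::size_t end = i;
            // Drop the CR of a CRLF pair.
            if (end > start && str[end - 1] == '\r') --end;
            if (end > start) words.emplace_back(str + start, end - start);
            start = i + 1;
        }
    }
    return words;
}
```

For the buffer "apple\r\nbanana\n\ncherry", this yields the three words apple, banana, and cherry, which is the set of strings a caller would expect to be registered.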

◆ AddWords() [2/2]

nn::nlib::succinct::AhoCorasickBuilder::AddWords ( const char *  str)
inline noexcept

Adds multiple target strings from a single null-terminated buffer. The strings must be delimited by newlines (CRLF or LF).

Parameters
[in]  str  Pointer to the null-terminated buffer.
Returns
Returns true when successful.

Definition at line 43 of file AhoCorasickBuilder.h.

◆ Build()

nn::nlib::succinct::AhoCorasickBuilder::Build ( )
noexcept

Creates an Aho-Corasick object. Constructs an Aho-Corasick algorithm automaton to detect the added strings or patterns.

Returns
Pointer to the Aho-Corasick object.

◆ Build2()

nn::nlib::succinct::AhoCorasickBuilder::Build2 ( )
inline noexcept

Creates the Aho-Corasick object and returns it in a unique_ptr. Constructs an Aho-Corasick algorithm automaton to detect the added strings or patterns.

Returns
unique_ptr to the Aho-Corasick object.

Definition at line 46 of file AhoCorasickBuilder.h.
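
The difference between the two return styles is one of ownership. The sketch below uses stand-in names (FakeAutomaton, BuildRaw, BuildOwned) rather than the nlib types so that it compiles on its own. It assumes the raw-pointer factory returns nullptr on failure and that the unique_ptr variant simply wraps the raw pointer. Neither detail is confirmed by this page.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical stand-in for the built object; counts live instances.
struct FakeAutomaton {
    static int live;
    FakeAutomaton() { ++live; }
    ~FakeAutomaton() { --live; }
};
int FakeAutomaton::live = 0;

// Build()-style factory: returns an owning raw pointer.
// ASSUMPTION: nullptr signals failure.
FakeAutomaton* BuildRaw() noexcept {
    return new (std::nothrow) FakeAutomaton();
}

// Build2()-style factory: the same object, owned by a unique_ptr.
std::unique_ptr<FakeAutomaton> BuildOwned() noexcept {
    return std::unique_ptr<FakeAutomaton>(BuildRaw());
}

// Exercises both styles and returns the number of objects still alive.
int DemoOwnership() {
    FakeAutomaton* raw = BuildRaw();
    assert(raw != nullptr);
    delete raw;  // raw-pointer style: the caller must delete
    {
        std::unique_ptr<FakeAutomaton> owned = BuildOwned();
        assert(owned != nullptr);
    }  // unique_ptr style: released automatically at scope exit
    return FakeAutomaton::live;
}
```

DemoOwnership() returns 0 when both objects have been released. The unique_ptr style removes the need for a matching delete on every exit path.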

◆ GetNumBytes()

nn::nlib::succinct::AhoCorasickBuilder::GetNumBytes ( ) const
noexcept

Gets the total size of the registered strings or patterns in bytes.

Returns
Returns the total size, in bytes, of the registered strings or patterns.

◆ GetNumNodes()

nn::nlib::succinct::AhoCorasickBuilder::GetNumNodes ( ) const
noexcept

Gets the number of created automaton nodes.

Returns
Returns the number of created automaton nodes.

◆ GetNumWords()

nn::nlib::succinct::AhoCorasickBuilder::GetNumWords ( ) const
noexcept

Gets the number of registered strings or patterns.

Returns
Returns the number of registered strings or patterns.

◆ MatchByBuilder()

nn::nlib::succinct::AhoCorasickBuilder::MatchByBuilder ( const char *  doc,
MatchCallback  callback,
void *  user_obj 
)
noexcept

Inspects the string to detect the target strings.

Parameters
[in]  doc  Null-terminated string to inspect.
[in]  callback  Callback function called each time a term is detected.
[in,out]  user_obj  User data passed to the callback.
