nlib
nn::nlib::TextReader Class Reference

The class for reading text from streams.

#include "nn/nlib/TextReader.h"

Public Member Functions

errno_t Init () noexcept
 Initializes the text reader.
 
errno_t Open (InputStream *stream) noexcept
 Opens a text reader with a stream specified.
 
int Read () noexcept
 Reads one character from the stream and returns UTF-32 data.
 
int Peek () noexcept
 Returns one character from the start of the stream in UTF-32.
 
int SkipWs () noexcept
 Skips white-space characters (space, newline, tab, and return) in the stream and returns the number that were skipped.
 
bool ReadUntil (size_t *len, char *buf, size_t n, char delim) noexcept
 Reads as many as n bytes of UTF-8 characters until delim and stores them in buf.
 
template<size_t N>
bool ReadUntil (size_t *len, char(&buf)[N], char delim) noexcept
 Calls ReadUntil(len, buf, N, delim).
 
template<class T >
bool ReadUntil (size_t *len, char *buf, size_t n, T pred) noexcept
 Reads as many as n bytes of UTF-8 characters and stores them in buf.
 
template<class T , size_t N>
bool ReadUntil (size_t *len, char(&buf)[N], T pred) noexcept
 Calls ReadUntil(len, buf, N, pred).
 
size_t ReadDecimalString (char *buf, size_t n) noexcept
 Reads as many as n of the characters 0 through 9 and stores them in buf.
 
template<size_t N>
size_t ReadDecimalString (char(&buf)[N]) noexcept
 Calls ReadDecimalString(buf, N).
 
bool Proceed (const char *str, size_t n) noexcept
 Advances the stream by the amount of the text string str.
 
bool Proceed (char c) noexcept
 Advances the stream by the amount of the character specified by c.
 
bool ProceedEx (const char *str) noexcept
 Advances the stream by the amount of the text string str. There is no limit on the length of the text string, and the position of the stream might be changed even if it does not match.
 
bool Close () noexcept
 Closes the text reader.
 
void SetError (errno_t e) const noexcept
 Sets an error value.
 
errno_t GetErrorValue () const noexcept
 Gets the cause of the error when reading has failed.
 
InputStream * GetStream () noexcept
 Gets the stream for the text reader to read.
 
int GetLine () const noexcept
 Gets the current line number.
 
int GetColumn () const noexcept
 Gets the current column.
 
 operator bool () const
 Returns true if no internal error has occurred.
 
Basic Member Functions
 TextReader () noexcept
 Instantiates the object with default parameters (default constructor).
 
virtual ~TextReader () noexcept
 Destructor. The stream is not closed.
 

Detailed Description

The class for reading text from streams.

Description
Reads a UTF-8 text string from a stream and gets one character at a time (UTF-32 or UTF-16).
Newline strings are processed as follows.
  • CRLF is passed as LF.
  • CR is passed as LF.
  • LF is passed as LF.
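The three rules above can be sketched outside of nlib as a small function. This is only an illustration of the documented behavior, not the library's implementation; the name NormalizeNewlines is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of nlib: maps CRLF, CR, and LF to a single LF,
// mirroring the three newline rules listed above.
std::string NormalizeNewlines(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            // CRLF collapses to one LF; a lone CR also becomes LF.
            if (i + 1 < in.size() && in[i + 1] == '\n') ++i;
            out.push_back('\n');
        } else {
            // LF and every other byte pass through unchanged.
            out.push_back(in[i]);
        }
    }
    return out;
}
```

For example, NormalizeNewlines("multibyte \r\nstring") yields "multibyte \nstring".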
If an overlong (non-shortest-form) UTF-8 encoding is detected, an error is generated (EILSEQ). An error (EILSEQ) is also generated if a UTF-8 sequence that encodes a surrogate code point (U+D800-U+DFFF) is detected.
const char str[] = "multibyte \r\nstring";
MemoryInputStream istr;
istr.Init(str);
TextReader r;
// Init takes no arguments; the stream is attached with Open.
if (nlib_is_error(r.Init())) { ERROR; }
if (nlib_is_error(r.Open(&istr))) { ERROR; }
int c;
while ((c = r.Read()) != -1) {
    // c is a UTF-32 value and can be processed in terms of Unicode code points.
    // To convert the whole text to another encoding instead of processing one character at a time, it is better to use a function like unicode::Utf8ToUtf16.
    // "multibyte \nstring" is read one character at a time, in order.
    // Newlines are normalized.
}
if (nlib_is_error(r)) { ERROR; }
// Close is declared as returning bool (true if successful).
if (!r.Close()) { ERROR; }
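The two EILSEQ conditions described above (overlong encodings and sequences that encode U+D800-U+DFFF) can be illustrated with a standalone decoder. This is a sketch of the general UTF-8 validation technique, not nlib's code; DecodeFirstCodePoint is a hypothetical name, and it signals every failure with -1 rather than an errno value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of nlib: decodes the first code point of a
// UTF-8 string. Returns the code point, or -1 for empty, truncated, malformed,
// overlong, or surrogate input.
long DecodeFirstCodePoint(const std::string& utf8) {
    if (utf8.empty()) return -1;
    const unsigned char lead = static_cast<unsigned char>(utf8[0]);
    if (lead < 0x80) return lead;  // ASCII

    std::size_t len;
    unsigned long cp;
    unsigned long min;  // smallest code point that legitimately needs len bytes
    if ((lead & 0xE0) == 0xC0)      { len = 2; cp = lead & 0x1F; min = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min = 0x10000; }
    else return -1;  // stray continuation byte or invalid lead byte
    if (utf8.size() < len) return -1;  // truncated sequence

    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(utf8[i]);
        if ((b & 0xC0) != 0x80) return -1;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min) return -1;                      // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return -1;  // surrogate range
    if (cp > 0x10FFFF) return -1;                 // beyond Unicode
    return static_cast<long>(cp);
}
```

The byte pair C0 AF is a classic overlong encoding of '/', and ED A0 80 encodes U+D800; this sketch rejects both.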
You can add your own checks on the UTF-8 text by deriving from TextReader and overriding the FillBuffer_ member function. The TextReader class itself checks that the UTF-8 is valid and processes newline codes.
The following code is a rough sketch of the implementation of the derived class.
// Declared virtual in the class; the out-of-class definition omits the keyword.
void DerivedClass::FillBuffer_() NLIB_NOEXCEPT {
    TextReader::FillBuffer_();
    // The text string is buffered from GetCur to GetBufEnd.
    // This can be checked and then processed or an error can be generated.
    // To decrease the number of characters, you must set this using the SetBufEnd member function.
    // The number of characters cannot be increased.
}
Examples:
misc/readfile/readfile.cpp, and misc/writefile/writefile.cpp.

Definition at line 20 of file TextReader.h.

Member Function Documentation

§ Close()

bool nn::nlib::TextReader::Close ( )
noexcept

Closes the text reader.

Returns
Returns true if successful.
Description
Clears the reference to the stream, closes the text reader, and detaches the base stream. The base stream is not closed by this operation.

§ GetColumn()

int nn::nlib::TextReader::GetColumn ( ) const
inline noexcept

Gets the current column.

Returns
The current column number, starting from 1.
Description
The function returns 0 and sets the error EBADF if the stream is not open.

Definition at line 108 of file TextReader.h.

§ GetErrorValue()

errno_t nn::nlib::TextReader::GetErrorValue ( ) const
inline noexcept

Gets the cause of the error when reading has failed.

Return values
0       No error occurred.
EINVAL  Invalid argument.
EEXIST  Initialized redundantly.
EBADF   No stream to read.
EIO     Failed to read from the stream for some reason.
EILSEQ  Invalid character found.

Definition at line 105 of file TextReader.h.

§ GetLine()

int nn::nlib::TextReader::GetLine ( ) const
inline noexcept

Gets the current line number.

Returns
The current line number, starting from 1.

Definition at line 107 of file TextReader.h.

§ GetStream()

InputStream * nn::nlib::TextReader::GetStream ( )
inline noexcept

Gets the stream for the text reader to read.

Returns
The pointer to the stream.

Definition at line 106 of file TextReader.h.

§ Init()

errno_t nn::nlib::TextReader::Init ( )
noexcept

Initializes the text reader.

Returns
Returns 0 if successful. Returns EALREADY if already initialized.

§ Open()

errno_t nn::nlib::TextReader::Open ( InputStream *  stream)
noexcept

Opens a text reader with a stream specified.

Parameters
[in]  stream  A stream.
Return values
0       if successful.
EINVAL  if the stream is NULL.
EEXIST  if already opened.
Description
If the text is UTF-8 with a BOM, the BOM is skipped.

§ Peek()

int nn::nlib::TextReader::Peek ( )
inline noexcept

Returns one character from the start of the stream in UTF-32.

Returns
The character that was read (in UTF-32). The function returns -1 if the end of the stream was reached or there was an error.

Definition at line 46 of file TextReader.h.

§ Proceed() [1/2]

bool nn::nlib::TextReader::Proceed ( const char *  str,
size_t  n 
)
noexcept

Advances the stream by the amount of the text string str.

Parameters
[in]  str  A pointer to a UTF-8 text string.
[in]  n    The length of the string, in bytes.
Return values
true   Indicates that the start of the stream matched str.
false  Returned in all other cases.
Description
If the start of the stream matches str, the reading of the stream is advanced by that amount. If it does not match, the stream remains at the current position.
The text string specified for str must be no longer than 200 characters, must end on a UTF-8 character boundary, and must not contain a newline character. Behavior is undefined if str does not conform to these limitations.
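The match-then-advance behavior described above can be modeled without nlib as follows. This is a sketch under the assumption that the stream is an in-memory string with a byte position; ProceedModel and its parameters are hypothetical and do not reflect how TextReader is implemented.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model, not part of nlib: if the unread input starting at *pos
// begins with str, advances *pos past it and returns true. Otherwise leaves
// *pos untouched and returns false.
bool ProceedModel(const std::string& input, std::size_t* pos, const std::string& str) {
    if (*pos > input.size()) return false;
    // compare() clips the range to the end of input, so a short tail simply fails to match.
    if (input.compare(*pos, str.size(), str) != 0) return false;
    *pos += str.size();
    return true;
}
```

With the input "true,false" and a position of 0, matching "true" succeeds and moves the position to 4; a following attempt to match "false" fails (the next character is ',') and leaves the position at 4.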

§ Proceed() [2/2]

bool nn::nlib::TextReader::Proceed ( char  c)
inline noexcept

Advances the stream by the amount of the character specified by c.

Parameters
[in]  c  The character to skip over.
Return values
true   Indicates that the start of the stream matched c.
false  Returned in all other cases.
Description
If the start of the stream matches c, the reading of the stream is advanced by that amount. If it does not match, the stream remains at the current position.
The character specified for c must be an ASCII character and not a newline character.

Definition at line 90 of file TextReader.h.

§ ProceedEx()

bool nn::nlib::TextReader::ProceedEx ( const char *  str)
noexcept

Advances the stream by the amount of the text string str. There is no limit on the length of the text string, and the position of the stream might be changed even if it does not match.

Parameters
[in]  str  A pointer to a UTF-8 text string.
Return values
true   Indicates that the start of the stream matched str.
false  Returned in all other cases.
Description
The text string specified for str must end on a UTF-8 character boundary and must not contain a newline character.

§ Read()

int nn::nlib::TextReader::Read ( )
inline noexcept

Reads one character from the stream and returns UTF-32 data.

Returns
The character that was read (in UTF-32). The function returns -1 if the end of the stream was reached or there was an error.

Definition at line 26 of file TextReader.h.

§ ReadDecimalString()

size_t nn::nlib::TextReader::ReadDecimalString ( char *  buf,
size_t  n 
)
noexcept

Reads as many as n of the characters 0 through 9 and stores them in buf.

Parameters
[out]  buf  The buffer to which the string is stored.
[in]   n    The size of the buffer.
Returns
The number of characters that were read.
Description
buf is not terminated with the null character.
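Because buf is not null-terminated, the caller has to use the returned count or add the terminator itself. The following standalone sketch models that pattern; ReadDecimalModel and ReadDecimalAsString are hypothetical names, and the in-memory string with a byte position stands in for the stream.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model, not part of nlib: copies at most n leading digits from
// input (starting at *pos) into buf, advances *pos, and returns the count.
// Like the documented function, it writes no terminating null character.
std::size_t ReadDecimalModel(const std::string& input, std::size_t* pos,
                             char* buf, std::size_t n) {
    std::size_t count = 0;
    while (count < n && *pos < input.size() &&
           input[*pos] >= '0' && input[*pos] <= '9') {
        buf[count++] = input[(*pos)++];
    }
    return count;
}

// Caller-side pattern: reserve one byte and terminate the buffer yourself.
std::string ReadDecimalAsString(const std::string& input, std::size_t* pos) {
    char buf[32];
    const std::size_t len = ReadDecimalModel(input, pos, buf, sizeof(buf) - 1);
    buf[len] = '\0';
    return std::string(buf);
}
```

Passing sizeof(buf) - 1 as n keeps room for the terminator that the caller appends.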

§ ReadUntil() [1/2]

bool nn::nlib::TextReader::ReadUntil ( size_t *  len,
char *  buf,
size_t  n,
char  delim 
)
noexcept

Reads as many as n bytes of UTF-8 characters until delim and stores them in buf.

Parameters
[out]  len    The number of bytes stored in buf.
[out]  buf    The buffer where the text string is stored.
[in]   n      The size of the buffer.
[in]   delim  The delimiter.
Returns
Returns true if a delimiter was found somewhere within the n bytes. If not, returns false.
Description
delim is not read, and buf is not terminated with the null character. The data is always read in terms of UTF-8 code points.
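The behavior described above (delim left unread, no terminator, code points never split) can be modeled on an in-memory string. This is a sketch under assumptions, not nlib's implementation: ReadUntilModel and Utf8Length are hypothetical, the input is assumed to be valid UTF-8, and the model reports true when delim directly follows a completely filled buffer, which the text above does not specify.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length in bytes of the code point that starts with lead (valid UTF-8 assumed).
std::size_t Utf8Length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    return 4;
}

// Hypothetical model, not part of nlib: copies whole code points from input
// (starting at *pos) into buf until delim is seen or the next code point would
// not fit in n bytes. delim stays unread and buf is not null-terminated.
bool ReadUntilModel(const std::string& input, std::size_t* pos,
                    std::size_t* len, char* buf, std::size_t n, char delim) {
    *len = 0;
    while (*pos < input.size()) {
        if (input[*pos] == delim) return true;  // found; delim is not consumed
        const std::size_t cp = Utf8Length(static_cast<unsigned char>(input[*pos]));
        // Stop rather than split a code point across the buffer or input end.
        if (*pos + cp > input.size() || *len + cp > n) return false;
        std::memcpy(buf + *len, input.data() + *pos, cp);
        *len += cp;
        *pos += cp;
    }
    return false;  // end of input reached without seeing delim
}
```

When the function returns false because the buffer is full, *len still reports how many bytes were stored, so a caller could process that chunk and call again.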

§ ReadUntil() [2/2]

template<class T>
bool nn::nlib::TextReader::ReadUntil ( size_t *  len,
char *  buf,
size_t  n,
T  pred 
)
noexcept

Reads as many as n bytes of UTF-8 characters and stores them in buf.

Template Parameters
T  The type for function objects for making determinations.
Parameters
[out]  len   The number of bytes stored in buf.
[out]  buf   The buffer where the text string is stored.
[in]   n     The size of the buffer.
[in]   pred  A function object.
Returns
Returns true if a delimiter was found somewhere within the n bytes. If not, returns false.
Description
This function calls pred(const char* ptr) to determine whether a delimiter has been reached. The ptr argument of pred points to a UTF-8 character, and one code point of data can be accessed through it.
Use code like the following for the determination. The function is called only at the first byte of each code point.
// Matches the code point U+3042, which is encoded as E3 81 82 in UTF-8.
struct SearchE38182 {
    bool operator()(const char* ptr) const {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(ptr);
        return p[0] == 0xE3 && p[1] == 0x81 && p[2] == 0x82;
    }
};
buf is not terminated with the null character. The data is always read in terms of UTF-8 code points.
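To show how such a predicate is driven, the following standalone sketch calls it once per code point on an in-memory string. It is a model under assumptions, not nlib's implementation: ReadUntilPredModel and CodePointLength are hypothetical names, the input is assumed to be valid UTF-8, and the functor is repeated so that the block is self-contained.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length in bytes of the code point that starts with lead (valid UTF-8 assumed).
std::size_t CodePointLength(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    return 4;
}

// Hypothetical model, not part of nlib: calls pred at the first byte of each
// code point and stops, without consuming it, when pred returns true.
template <class Pred>
bool ReadUntilPredModel(const std::string& input, std::size_t* pos,
                        std::size_t* len, char* buf, std::size_t n, Pred pred) {
    *len = 0;
    while (*pos < input.size()) {
        const std::size_t cp = CodePointLength(static_cast<unsigned char>(input[*pos]));
        if (*pos + cp > input.size()) return false;   // truncated input
        if (pred(input.data() + *pos)) return true;   // one full code point is readable
        if (*len + cp > n) return false;              // never split a code point
        std::memcpy(buf + *len, input.data() + *pos, cp);
        *len += cp;
        *pos += cp;
    }
    return false;
}

// Same predicate as in the text: matches U+3042 (bytes E3 81 82).
struct SearchE38182 {
    bool operator()(const char* ptr) const {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(ptr);
        return p[0] == 0xE3 && p[1] == 0x81 && p[2] == 0x82;
    }
};
```

Because pred is only invoked at code point boundaries and && short-circuits, the functor reads p[1] and p[2] only after p[0] has identified a 3-byte sequence.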

Definition at line 143 of file TextReader.h.

§ SetError()

void nn::nlib::TextReader::SetError ( errno_t  e) const
inline noexcept

Sets an error value.

Parameters
[in]  e  An error value.
Description
If an error value has not been set, the one specified by e is set.
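The "first error wins" rule described above can be sketched with a small class. StickyError is a hypothetical stand-in, not the TextReader implementation; it uses int where the library uses errno_t, and a mutable member because the documented SetError is a const member function.

```cpp
#include <cerrno>

// Hypothetical model, not part of nlib: keeps the first error that was set
// and ignores later ones.
class StickyError {
public:
    void Set(int e) const {
        if (error_ == 0) error_ = e;  // only the first error is recorded
    }
    int Get() const { return error_; }
    // Mirrors the documented operator bool: true if no error has occurred.
    explicit operator bool() const { return error_ == 0; }

private:
    mutable int error_ = 0;
};
```

Setting EIO and then EILSEQ on the same object leaves EIO as the stored value, so the original cause of a failure is not overwritten by follow-on errors.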

Definition at line 102 of file TextReader.h.

§ SkipWs()

int nn::nlib::TextReader::SkipWs ( )
inline noexcept

Skips white-space characters (space, newline, tab, and return) in the stream and returns the number that were skipped.

Returns
The number of skipped white-space characters.
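A minimal model of this behavior on an in-memory string is shown below. SkipWsModel is a hypothetical name, and the sketch counts raw bytes; whether the real function counts a CRLF pair as one character or two after newline normalization is not stated here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model, not part of nlib: skips space, newline, tab, and
// carriage return starting at *pos and returns how many bytes were skipped.
int SkipWsModel(const std::string& input, std::size_t* pos) {
    int skipped = 0;
    while (*pos < input.size()) {
        const char c = input[*pos];
        if (c != ' ' && c != '\n' && c != '\t' && c != '\r') break;
        ++*pos;
        ++skipped;
    }
    return skipped;
}
```

After the call, the position refers to the first non-white-space character, or to the end of the input if only white space remained.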

Definition at line 56 of file TextReader.h.


The documentation for this class was generated from the following files: