UCommon
ucommon::utf8 Class Reference

A core class of utf8 encoded string functions.

#include <unicode.h>


Static Public Member Functions

static unsigned ccount (char *string, ucs4_t character)
 Count occurrences of a unicode character in string.
static size_t chars (unicode_t string)
 Number of chars required to encode a given wchar string.
static size_t chars (ucs4_t character)
 Number of chars required to encode a given unicode character.
static ucs4_t codepoint (char *encoded)
 Convert a utf8 encoded codepoint to a ucs4 character value.
static size_t count (char *string)
 Count utf8 encoded ucs4 codepoints in string.
static char * find (char *string, ucs4_t character, size_t start=0)
 Find first occurrence of character in string.
static ucs4_t get (CharacterProtocol &buffer)
 Get a unicode character from a character protocol.
static char * offset (char *string, ssize_t position)
 Get codepoint offset in a string.
static size_t pack (unicode_t unicode, CharacterProtocol &buffer, size_t size)
 Convert a utf8 string into a unicode data buffer.
static ucs4_t put (ucs4_t character, CharacterProtocol &buffer)
 Push a unicode character to a character protocol.
static char * rfind (char *string, ucs4_t character, size_t end=(size_t)-1l)
 Find last occurrence of character in string.
static unsigned size (char *codepoint)
 Compute character size of utf8 string codepoint.
static ucs4_t * udup (char *string)
 Dup a utf8 string into a ucs4_t string.
static size_t unpack (unicode_t string, CharacterProtocol &buffer)
 Convert a unicode string into utf8.
static ucs2_t * wdup (char *string)
 Dup a utf8 string into a ucs2_t representation.

Static Public Attributes

static char * nil
 A convenient NULL pointer value.
static unsigned ucsize
 Size of "unicode_t" character codes, may not be ucs4_t size.

Detailed Description

A core class of utf8 encoded string functions.

This is a foundation for all utf8 string processing.

Author:
David Sugar

Definition at line 62 of file unicode.h.


Member Function Documentation

static unsigned ucommon::utf8::ccount ( char *  string,
ucs4_t  character 
) [static]

Count occurrences of a unicode character in string.

Parameters:
string  to search in.
character  code to search for.
Returns:
count of occurrences.
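
To make the counting semantics concrete, the following is a minimal standalone sketch, not the ucommon implementation. The name `count_occurrences` is hypothetical, and the sketch assumes standard UTF-8 sequences of one to four bytes; it simply stops at the first malformed sequence, which may differ from what the library does.

```cpp
#include <cstdint>

// Hypothetical helper (not part of ucommon): count how many times the code
// point `target` appears in the NUL-terminated UTF-8 string `s`.
// Assumption: scanning stops at the first malformed sequence.
unsigned count_occurrences(const char *s, std::uint32_t target)
{
    if (!s)
        return 0;

    const unsigned char *p = reinterpret_cast<const unsigned char *>(s);
    unsigned hits = 0;

    while (*p) {
        std::uint32_t cp;
        int extra; // number of continuation bytes that must follow

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else break; // stray continuation byte or invalid lead byte

        ++p;
        bool ok = true;
        for (int i = 0; i < extra; ++i, ++p) {
            if ((*p & 0xC0) != 0x80) { ok = false; break; } // also stops at NUL
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (!ok)
            break;
        if (cp == target)
            ++hits;
    }
    return hits;
}
```

The comparison is made on decoded code points rather than raw bytes, so a multi-byte character such as U+00E9 is matched as a single unit.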
static size_t ucommon::utf8::chars ( unicode_t  string) [static]

Number of chars required to encode a given wchar string.

Parameters:
string  of ucs4 data.
Returns:
number of chars required to encode given string.
static size_t ucommon::utf8::chars ( ucs4_t  character) [static]

Number of chars required to encode a given unicode character.

Parameters:
character  to encode.
Returns:
number of chars required to encode given character.
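
As an illustration of the size calculation, here is a minimal standalone sketch under standard UTF-8 rules (RFC 3629). The name `utf8_encoded_length` is hypothetical and is not a ucommon function; the library may accept a different range of values, for example longer legacy sequences.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of ucommon): number of bytes standard UTF-8
// needs for one code point, or 0 if the value cannot be encoded.
std::size_t utf8_encoded_length(std::uint32_t cp)
{
    if (cp < 0x80)
        return 1;               // 7 bits: plain ASCII
    if (cp < 0x800)
        return 2;               // up to 11 bits
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               // UTF-16 surrogates are not valid scalar values
    if (cp < 0x10000)
        return 3;               // up to 16 bits
    if (cp < 0x110000)
        return 4;               // up to U+10FFFF
    return 0;                   // beyond the Unicode range
}
```

Summing this value over every element of a code point array gives the buffer size a whole string would need, excluding the terminating NUL.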
static ucs4_t ucommon::utf8::codepoint ( char *  encoded) [static]

Convert a utf8 encoded codepoint to a ucs4 character value.

Parameters:
encoded  utf8 codepoint.
Returns:
ucs4 character value, or 0 if invalid.
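
The decoding step can be sketched as follows. This is a minimal standalone illustration of standard UTF-8 decoding, not the ucommon implementation; `decode_utf8` is a hypothetical name, and the sketch does not reject overlong encodings or surrogate values.

```cpp
#include <cstdint>

// Hypothetical helper (not part of ucommon): decode the UTF-8 sequence that
// starts at `s` into a code point value. Returns 0 for a null pointer, an
// empty string, or a malformed or truncated sequence.
std::uint32_t decode_utf8(const char *s)
{
    if (!s)
        return 0;

    const unsigned char *p = reinterpret_cast<const unsigned char *>(s);
    std::uint32_t cp;
    int extra; // number of continuation bytes expected

    if (p[0] < 0x80)
        return p[0];                                   // ASCII (0 if empty)
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
    else
        return 0;                                      // invalid lead byte

    for (int i = 1; i <= extra; ++i) {
        // A continuation byte must look like 10xxxxxx; the NUL terminator
        // fails this test, so a truncated sequence never reads past the end.
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    return cp;
}
```

Because 0 doubles as the error value, a caller cannot distinguish an encoded NUL from invalid input with this return convention alone.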
static size_t ucommon::utf8::count ( char *  string) [static]

Count utf8 encoded ucs4 codepoints in string.

Parameters:
string  of utf8 data.
Returns:
codepoint count, 0 if empty or invalid.
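
A minimal standalone sketch of such a count is shown below, assuming standard UTF-8 sequences of one to four bytes. `count_codepoints` is a hypothetical name and the validation shown (lead byte pattern plus continuation bytes) is an assumption about what "invalid" means, not the library's actual checks.

```cpp
#include <cstddef>

// Hypothetical helper (not part of ucommon): count code points in a
// NUL-terminated UTF-8 string. Returns 0 if the string is null, empty,
// or contains a malformed sequence.
std::size_t count_codepoints(const char *s)
{
    if (!s)
        return 0;

    const unsigned char *p = reinterpret_cast<const unsigned char *>(s);
    std::size_t n = 0;

    while (*p) {
        std::size_t len; // total bytes in this sequence, from the lead byte

        if (*p < 0x80)                len = 1;
        else if ((*p & 0xE0) == 0xC0) len = 2;
        else if ((*p & 0xF0) == 0xE0) len = 3;
        else if ((*p & 0xF8) == 0xF0) len = 4;
        else return 0;                // invalid lead byte

        // Every following byte must be a continuation byte (10xxxxxx).
        // The NUL terminator fails this test, so no read goes past the end.
        for (std::size_t i = 1; i < len; ++i)
            if ((p[i] & 0xC0) != 0x80)
                return 0;

        p += len;
        ++n;
    }
    return n;
}
```

The result is a count of code points, which is generally smaller than the byte length reported by `strlen` for non-ASCII text.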
static char* ucommon::utf8::find ( char *  string,
ucs4_t  character,
size_t  start = 0 
) [static]

Find first occurrence of character in string.

Parameters:
string  to search in.
character  code to search for.
start  offset in string in codepoints.
Returns:
pointer to first instance or NULL if not found.
static ucs4_t ucommon::utf8::get ( CharacterProtocol &  buffer) [static]

Get a unicode character from a character protocol.

Parameters:
buffer  of character protocol to read from.
Returns:
unicode character or EOF on error.
static char* ucommon::utf8::offset ( char *  string,
ssize_t  position 
) [static]

Get codepoint offset in a string.

Parameters:
string  of utf8 data.
position  of codepoint in string, negative offsets are from tail.
Returns:
offset of codepoint or NULL if invalid.
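
The positional lookup can be illustrated with a minimal standalone sketch. `codepoint_offset` is a hypothetical name, the input is assumed to be valid UTF-8, and the treatment of negative positions (where -1 means the last code point) is an assumption about the convention rather than confirmed library behavior.

```cpp
#include <cstddef>

// True if `c` starts a code point, i.e. it is not a 10xxxxxx continuation byte.
static bool is_lead(unsigned char c)
{
    return (c & 0xC0) != 0x80;
}

// Hypothetical helper (not part of ucommon): pointer to the code point at
// index `pos` in the NUL-terminated, valid UTF-8 string `s`. Negative values
// count back from the end (assumed: -1 is the last code point). Returns
// nullptr if `pos` is out of range.
const char *codepoint_offset(const char *s, std::ptrdiff_t pos)
{
    if (!s)
        return nullptr;

    // First pass: count code points by counting non-continuation bytes.
    std::ptrdiff_t total = 0;
    for (const char *p = s; *p; ++p)
        if (is_lead(static_cast<unsigned char>(*p)))
            ++total;

    if (pos < 0)
        pos += total;            // translate a tail-relative index
    if (pos < 0 || pos > total)
        return nullptr;

    // Second pass: advance one whole code point per step.
    const char *p = s;
    while (pos > 0) {
        ++p;                     // step past the lead byte
        while (*p && !is_lead(static_cast<unsigned char>(*p)))
            ++p;                 // skip its continuation bytes
        --pos;
    }
    return p;
}
```

In this sketch a position equal to the code point count yields a pointer to the terminating NUL, which mirrors how an end pointer is commonly treated; whether the library does the same is not stated here.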
static size_t ucommon::utf8::pack ( unicode_t  unicode,
CharacterProtocol &  buffer,
size_t  size 
) [static]

Convert a utf8 string into a unicode data buffer.

Parameters:
unicode  data buffer.
buffer  of character protocol to pack from.
size  of unicode data buffer in codepoints.
Returns:
number of code points converted.
static ucs4_t ucommon::utf8::put ( ucs4_t  character,
CharacterProtocol &  buffer 
) [static]

Push a unicode character to a character protocol.

Parameters:
character  to push to file.
buffer  of character protocol to push character to.
Returns:
unicode character or EOF on error.
static char* ucommon::utf8::rfind ( char *  string,
ucs4_t  character,
size_t  end = (size_t)-1l 
) [static]

Find last occurrence of character in string.

Parameters:
string  to search in.
character  code to search for.
end  offset to start from in codepoints.
Returns:
pointer to last instance or NULL if not found.
static unsigned ucommon::utf8::size ( char *  codepoint) [static]

Compute character size of utf8 string codepoint.

Parameters:
codepoint  in string.
Returns:
size of codepoint as utf8 encoded data, 0 if invalid.
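
The size of an encoded code point can be read from its first byte alone. The sketch below is a minimal standalone illustration under standard UTF-8 rules; `sequence_length` is a hypothetical name, and the library may recognize additional lead byte patterns.

```cpp
// Hypothetical helper (not part of ucommon): byte length of the UTF-8
// sequence whose lead byte is at `cp`, or 0 if that byte cannot start a
// sequence. Only the lead byte is inspected; continuation bytes are not
// validated here.
unsigned sequence_length(const char *cp)
{
    if (!cp)
        return 0;

    unsigned char c = static_cast<unsigned char>(*cp);

    if (c < 0x80)
        return 1;               // 0xxxxxxx: ASCII (this includes NUL)
    if ((c & 0xE0) == 0xC0)
        return 2;               // 110xxxxx
    if ((c & 0xF0) == 0xE0)
        return 3;               // 1110xxxx
    if ((c & 0xF8) == 0xF0)
        return 4;               // 11110xxx
    return 0;                   // continuation byte or invalid lead byte
}
```

A return of 0 gives a scanning loop a clear signal to stop instead of advancing by an unknown amount.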
static size_t ucommon::utf8::unpack ( unicode_t  string,
CharacterProtocol &  buffer 
) [static]

Convert a unicode string into utf8.

Parameters:
string  of unicode data to pack.
buffer  of character protocol to put data into.
Returns:
number of code points converted.

The documentation for this class was generated from the following file: