BUnicodeChar Class Reference

Management of all information about characters.

Static Public Member Functions

static int32 DigitValue (uint32 c)
 Gets the numeric value of c.

static uint32 FromUTF8 (const char **in)
 Transform a UTF-8 string to a UTF-32 character.

static bool IsAlNum (uint32 c)
 Determine if c is alphanumeric.

static bool IsAlpha (uint32 c)
 Determine if c is alphabetic.

static bool IsBase (uint32 c)
 Determine if c can be used with a diacritic.

static bool IsControl (uint32 c)
 Determine if c is a control character.

static bool IsDefined (uint32 c)
 Determine if c is defined.

static bool IsDigit (uint32 c)
 Determine if c is numeric.

static bool IsHexDigit (uint32 c)
 Determine if c is a hexadecimal digit.

static bool IsLower (uint32 c)
 Determine if c is lowercase.

static bool IsPrintable (uint32 c)
 Determine if c is printable.

static bool IsPunctuation (uint32 c)
 Determine if c is a punctuation character.

static bool IsSpace (uint32 c)
 Determine if c is a space.

static bool IsTitle (uint32 c)
 Determine if c is title case.

static bool IsUpper (uint32 c)
 Determine if c is uppercase.

static bool IsWhitespace (uint32 c)
 Determine if c is whitespace.

static uint32 ToLower (uint32 c)
 Transforms c to lowercase.

static uint32 ToTitle (uint32 c)
 Transforms c to title case.

static uint32 ToUpper (uint32 c)
 Transforms c to uppercase.

static void ToUTF8 (uint32 c, char **out)
 Transform a character to UTF-8 encoding.

static int8 Type (uint32 c)
 Gets the type of a character.

static size_t UTF8StringLength (const char *string)
 Counts the characters in the given NUL terminated string.

static size_t UTF8StringLength (const char *string, size_t maxLength)
 Counts the characters in the given string up to maxLength characters.
 

Detailed Description

Management of all information about characters.

This class provides a set of tools for managing the whole set of characters defined by Unicode. This includes information about special sets of characters, such as whether a character is whitespace or alphanumeric. It also provides the uppercase equivalent of a character and determines whether a character can be ornamented with accents.

This class consists entirely of static methods, so you do not have to instantiate it. Call any of the methods directly, passing in the character that you want to examine.

Note that all the functions work with characters encoded in UTF-32. This is not the most usual way to handle characters, but it is the fastest. To convert a UTF-8 string to a UTF-32 character, use the FromUTF8() method.

Since
Haiku R1

Member Function Documentation

◆ DigitValue()

int32 BUnicodeChar::DigitValue ( uint32  c)
static

Gets the numeric value of c.

Returns
The numeric value of the specified Unicode character.
Since
Haiku R1

◆ FromUTF8()

uint32 BUnicodeChar::FromUTF8 ( const char **  in)
static

Transform a UTF-8 string to a UTF-32 character.

If the string contains multiple characters, only the first one is used. This function updates the in pointer so that it points to the next character, ready for the following call.

Returns
The UTF-32 encoded version of in.
Since
Haiku R1
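
The decode-and-advance behavior described above can be sketched in portable C++. This is a minimal illustration, not the Haiku implementation: DecodeOneUTF8 is a hypothetical stand-in name, and the sketch assumes well-formed UTF-8 input without any error handling.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for BUnicodeChar::FromUTF8(); not the Haiku code.
// Decodes one code point starting at *in and advances *in past the bytes
// consumed. Assumes well-formed UTF-8; malformed input is not detected.
uint32_t DecodeOneUTF8(const char** in)
{
    assert(in != nullptr && *in != nullptr);
    const unsigned char* s = reinterpret_cast<const unsigned char*>(*in);
    uint32_t c = s[0];
    int extra = 0;  // number of continuation bytes that follow the lead byte

    if (c >= 0xF0) {         // 11110xxx: four-byte sequence
        c &= 0x07;
        extra = 3;
    } else if (c >= 0xE0) {  // 1110xxxx: three-byte sequence
        c &= 0x0F;
        extra = 2;
    } else if (c >= 0xC0) {  // 110xxxxx: two-byte sequence
        c &= 0x1F;
        extra = 1;
    }
    // Lead bytes below 0x80 are ASCII and stand for themselves.

    for (int i = 1; i <= extra; i++)
        c = (c << 6) | (s[i] & 0x3F);  // each continuation byte adds 6 bits

    *in += 1 + extra;
    return c;
}
```

Calling the function repeatedly on the same cursor walks through a string one character at a time. With the class itself, the equivalent call would be BUnicodeChar::FromUTF8(&cursor).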

◆ IsAlNum()

static bool BUnicodeChar::IsAlNum ( uint32  c)
static

Determine if c is alphanumeric.

Returns
true if the specified Unicode character is an alphabetic or numeric character.
Since
Haiku R1

◆ IsAlpha()

static bool BUnicodeChar::IsAlpha ( uint32  c)
static

Determine if c is alphabetic.

Returns
true if the specified Unicode character is an alphabetic character.
Since
Haiku R1

◆ IsBase()

static bool BUnicodeChar::IsBase ( uint32  c)
static

Determine if c can be used with a diacritic.

Note
IsBase() does not determine if a Unicode character is distinct.
Returns
true if the specified Unicode character is a base form character that can be used with a diacritic.
Since
Haiku R1

◆ IsControl()

static bool BUnicodeChar::IsControl ( uint32  c)
static

Determine if c is a control character.

Example control characters are the non-printable ASCII characters from 0x0 to 0x1F.

Returns
true if the specified Unicode character is a control character.
See also
IsPrintable()
Since
Haiku R1

◆ IsDefined()

static bool BUnicodeChar::IsDefined ( uint32  c)
static

Determine if c is defined.

In Unicode, some code points are not valid or have not been assigned yet. For these code points this method will return false.

Returns
true if the specified Unicode character is defined.
Since
Haiku R1

◆ IsDigit()

static bool BUnicodeChar::IsDigit ( uint32  c)
static

Determine if c is numeric.

Returns
true if the specified Unicode character is a digit.
Since
Haiku R1

◆ IsHexDigit()

static bool BUnicodeChar::IsHexDigit ( uint32  c)
static

Determine if c is a hexadecimal digit.

Returns
true if the specified Unicode character is a hexadecimal digit.
Since
Haiku R1

◆ IsLower()

static bool BUnicodeChar::IsLower ( uint32  c)
static

Determine if c is lowercase.

Returns
true if the specified Unicode character is a lowercase character.
Since
Haiku R1

◆ IsPrintable()

static bool BUnicodeChar::IsPrintable ( uint32  c)
static

Determine if c is printable.

Printable characters are not control characters.

Returns
true if the specified Unicode character is a printable character.
See also
IsControl()
Since
Haiku R1

◆ IsPunctuation()

static bool BUnicodeChar::IsPunctuation ( uint32  c)
static

Determine if c is a punctuation character.

Returns
true if the specified Unicode character is a punctuation character.
Since
Haiku R1

◆ IsSpace()

static bool BUnicodeChar::IsSpace ( uint32  c)
static

Determine if c is a space.

Unlike IsWhitespace() this function will return true for non-breakable spaces. This method is useful for determining if the character will render as an empty space which can be stretched on-screen.

Returns
true if the specified Unicode character is some kind of space character.
See also
IsWhitespace()
Since
Haiku R1

◆ IsTitle()

static bool BUnicodeChar::IsTitle ( uint32  c)
static

Determine if c is title case.

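Title case characters are used when only the first letter of a word is capitalized. Most of them are digraphs such as U+01C5 (ǅ), whose first component is uppercase and whose second component is lowercase.

Returns
true if the specified Unicode character is a title case character.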
Since
Haiku R1

◆ IsUpper()

static bool BUnicodeChar::IsUpper ( uint32  c)
static

Determine if c is uppercase.

Returns
true if the specified Unicode character is an uppercase character.
Since
Haiku R1

◆ IsWhitespace()

static bool BUnicodeChar::IsWhitespace ( uint32  c)
static

Determine if c is whitespace.

This method is essentially the same as IsSpace(), but excludes all non-breakable spaces.

Returns
true if the specified Unicode character is a whitespace character.
See also
IsSpace()
Since
Haiku R1
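
The relationship between IsSpace() and IsWhitespace() described above can be illustrated with a deliberately tiny sketch. The three helper functions below are hypothetical and are not part of BUnicodeChar. They recognize only a handful of sample code points, whereas the real methods cover the full Unicode range.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they know just a few
// sample code points and do not replace the BUnicodeChar methods.
bool IsNonBreakingSpaceSample(uint32_t c)
{
    // U+00A0 NO-BREAK SPACE, U+2007 FIGURE SPACE,
    // U+202F NARROW NO-BREAK SPACE
    return c == 0x00A0 || c == 0x2007 || c == 0x202F;
}

bool IsSpaceSample(uint32_t c)
{
    // U+0020 SPACE and U+2003 EM SPACE, plus the non-breaking spaces
    return c == 0x0020 || c == 0x2003 || IsNonBreakingSpaceSample(c);
}

bool IsWhitespaceSample(uint32_t c)
{
    // Per the description: the space test minus non-breaking spaces
    return IsSpaceSample(c) && !IsNonBreakingSpaceSample(c);
}
```

In this sketch a no-break space (U+00A0) passes the space test but fails the whitespace test, which is the distinction to keep in mind when deciding where a line may be broken.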

◆ ToLower()

uint32 BUnicodeChar::ToLower ( uint32  c)
static

Transforms c to lowercase.

Returns
The lowercase version of the specified Unicode character.
Since
Haiku R1

◆ ToTitle()

uint32 BUnicodeChar::ToTitle ( uint32  c)
static

Transforms c to title case.

Returns
The title case version of the specified Unicode character.
Since
Haiku R1

◆ ToUpper()

uint32 BUnicodeChar::ToUpper ( uint32  c)
static

Transforms c to uppercase.

Returns
The uppercase version of the specified Unicode character.
Since
Haiku R1

◆ ToUTF8()

void BUnicodeChar::ToUTF8 ( uint32  c,
char **  out 
)
static

Transform a character to UTF-8 encoding.

The function itself returns nothing; the UTF-8 encoding of the specified Unicode character is written to the buffer that out refers to.
Since
Haiku R1
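
As a rough illustration of the encoding step, the sketch below writes one code point as UTF-8 through a char** cursor. It is not the Haiku implementation: EncodeOneUTF8 is a hypothetical name, and advancing the cursor past the written bytes is an assumption made here to mirror FromUTF8(), not something the text above states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for BUnicodeChar::ToUTF8(); not the Haiku code.
// Writes the UTF-8 encoding of c at *out and advances *out past the bytes
// written (the advance is an assumption). No NUL terminator is appended.
// The caller must provide room for up to 4 bytes.
void EncodeOneUTF8(uint32_t c, char** out)
{
    assert(out != nullptr && *out != nullptr);
    assert(c <= 0x10FFFF);
    char* s = *out;

    if (c < 0x80) {                 // 1 byte: 0xxxxxxx
        *s++ = static_cast<char>(c);
    } else if (c < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        *s++ = static_cast<char>(0xC0 | (c >> 6));
        *s++ = static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        *s++ = static_cast<char>(0xE0 | (c >> 12));
        *s++ = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        *s++ = static_cast<char>(0x80 | (c & 0x3F));
    } else {                        // 4 bytes: 11110xxx followed by 3 more
        *s++ = static_cast<char>(0xF0 | (c >> 18));
        *s++ = static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        *s++ = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        *s++ = static_cast<char>(0x80 | (c & 0x3F));
    }
    *out = s;
}
```

Under that assumption, several characters can be appended to one buffer by reusing the same cursor, and the caller adds the terminating NUL once at the end.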

◆ Type()

static int8 BUnicodeChar::Type ( uint32  c)
static

Gets the type of a character.

Returns
A member of the unicode_char_category enum.
Since
Haiku R1

◆ UTF8StringLength() [1/2]

size_t BUnicodeChar::UTF8StringLength ( const char *  string)
static

Counts the characters in the given NUL terminated string.

Returns
The number of UTF-8 characters in the NUL terminated string.
See also
BString::CountChars()
Since
Haiku R1

◆ UTF8StringLength() [2/2]

size_t BUnicodeChar::UTF8StringLength ( const char *  string,
size_t  maxLength 
)
static

Counts the characters in the given string up to maxLength characters.

Parameters
string: Does not need to be NUL terminated if you specify a maxLength that is shorter than the maximum length of the string.
maxLength: The maximum length of the string in bytes.
Returns
The number of UTF-8 characters in the NUL terminated string up to maxLength characters.
Since
Haiku R1
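
Counting UTF-8 characters, as both UTF8StringLength() overloads do, can be sketched by counting every byte that is not a continuation byte. The CountUTF8Chars functions below are hypothetical stand-ins, not the Haiku implementation. The bounded variant treats maxLength as a byte count, following the parameter description above, which is an assumption about the intended semantics.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the two UTF8StringLength() overloads.
// A UTF-8 character starts at every byte that is not a continuation byte
// (continuation bytes have the bit pattern 10xxxxxx).
std::size_t CountUTF8Chars(const char* string)
{
    std::size_t count = 0;
    for (; *string != '\0'; string++) {
        if ((static_cast<unsigned char>(*string) & 0xC0) != 0x80)
            count++;
    }
    return count;
}

// Bounded variant: stops at the first NUL or after maxLength bytes,
// whichever comes first (maxLength is assumed to be a byte count).
std::size_t CountUTF8Chars(const char* string, std::size_t maxLength)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < maxLength && string[i] != '\0'; i++) {
        if ((static_cast<unsigned char>(string[i]) & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

Note that in this sketch a byte limit that falls in the middle of a multi-byte sequence still counts the partially read character, because its lead byte lies within the limit.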