This page details how Haiku reads keys from the keyboard, including modifier keys and special characters, and how you can read and process these encoded characters in your application.
Haiku encodes all characters using UTF-8. UTF-8 allows Haiku to represent characters from all over the world while still maintaining backwards compatibility with 7-bit ASCII codes. This means that the most commonly used characters are encoded in just one byte while less common characters can be encoded by extending the character encoding to use two, three, or, rarely, four bytes.
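As a rough illustration of the variable-length encoding described above, the following sketch derives the length of a UTF-8 sequence from its first byte. `Utf8SequenceLength` is a hypothetical helper written for this page, not a function provided by Haiku.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the Haiku API.
// Returns the number of bytes in the UTF-8 sequence that starts with
// leadByte, or 0 if leadByte cannot start a sequence (for example a
// continuation byte of the form 10xxxxxx).
constexpr std::size_t
Utf8SequenceLength(std::uint8_t leadByte)
{
	if (leadByte < 0x80)
		return 1;	// 0xxxxxxx: 7-bit ASCII
	if ((leadByte & 0xE0) == 0xC0)
		return 2;	// 110xxxxx
	if ((leadByte & 0xF0) == 0xE0)
		return 3;	// 1110xxxx
	if ((leadByte & 0xF8) == 0xF0)
		return 4;	// 11110xxx
	return 0;
}
```

The helper only inspects the lead byte; it does not check that the following bytes are valid continuation bytes.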
Each key on the keyboard is assigned a numeric code to identify it to the operating system. Most of the time you should not have to access these codes directly; instead, use one of the constants defined in InterfaceDefs.h such as B_ENTER or read the character from the
The following diagram shows the key codes as they appear on a US 104-key keyboard.
International keyboards each differ a bit but generally share an extra key, located between the left shift key and Z, with the key code 0x69.
Mac keyboards have an equal sign in the keypad with key code 0x6a and some other differences. Often the keys produce the same key code but appear in different locations.
Modifier keys are keys which have no effect on their own but when combined with another key modify the usual behavior of that key.
The following modifier keys are defined in InterfaceDefs.h:
|B_SHIFT_KEY|Transforms lowercase characters into uppercase characters or chooses alternative punctuation characters. The shift key is also used in combination with |
|B_COMMAND_KEY|Produces keyboard shortcuts for common operations such as cut, copy, paste, print, and find.|
|B_CONTROL_KEY|Outputs control characters in terminal. The control key is sometimes also used as an alternative to |
|B_OPTION_KEY|Used in combination with other keys to output special characters such as accented letters and symbols. Because |
|B_MENU_KEY|The Menu key is used to produce contextual menus. Like |
In addition you can access the left and right modifier keys individually with the following constants:
Scroll lock, num lock, and caps lock alter other keys pressed after they are released. They are defined by the following constants:
|B_CAPS_LOCK|Produces uppercase characters. Reverses the effect of |
|B_SCROLL_LOCK|Prevents the terminal from scrolling.|
|B_NUM_LOCK|Informs the numeric keypad to output numbers when on. Reverses the function of |
To get the currently active modifiers use the modifiers() function defined in InterfaceDefs.h. This function returns a bitmap containing the currently active modifier keys. You can create a bit mask of the above constants to determine which modifiers are active.
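The bit mask test described above can be sketched as follows. To keep the example compilable outside Haiku, it uses stand-in constants (`kShiftKey`, `kCommandKey`, `kControlKey`) and two hypothetical helpers; in a real application you would include InterfaceDefs.h, use the `B_*_KEY` constants, and pass the value returned by modifiers().

```cpp
#include <cstdint>

// Stand-in values for illustration only. In a Haiku application, include
// <InterfaceDefs.h> and use B_SHIFT_KEY, B_COMMAND_KEY, and
// B_CONTROL_KEY instead of these.
constexpr std::uint32_t kShiftKey = 0x01;
constexpr std::uint32_t kCommandKey = 0x02;
constexpr std::uint32_t kControlKey = 0x04;

// Hypothetical helper: true if every modifier in mask is set in state.
constexpr bool
AllModifiersActive(std::uint32_t state, std::uint32_t mask)
{
	return (state & mask) == mask;
}

// Hypothetical helper: true if at least one modifier in mask is set.
constexpr bool
AnyModifierActive(std::uint32_t state, std::uint32_t mask)
{
	return (state & mask) != 0;
}
```

Under these assumptions, a check such as `AllModifiersActive(modifiers(), B_COMMAND_KEY | B_SHIFT_KEY)` would detect a Command+Shift combination, while `AnyModifierActive()` suits cases where either key alone is enough.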
The Interface Kit also defines constants for keys that aren't represented by a symbol. These include:
The B_FUNCTION_KEY constant can further be broken down into the following constants:
For Japanese keyboards, two more constants are defined:
The characters produced by each of the key codes are determined by the keymap. The usual way for the user to choose and modify their keymap is the Keymap preference application. A number of alternative keymaps, such as Dvorak and keymaps for different locales, are available.
A full description of the Keymap preflet can be found in the User Guide.
The keymap is a map of the characters produced by each key on the keyboard including the characters produced when combined with the modifier constants described above. The keymap also contains the codes of the modifier keys and tables for dead keys.
To get the current system keymap, create a pointer to a key_map struct and a pointer to a char array and pass their addresses to the get_key_map() function. The key_map struct will be filled out with the current system keymap and the char array will be filled out with the UTF-8 character encodings.
The key_map struct contains a number of fields, which are described in the sections below.
The first section contains a version number and the code assigned to each of the modifier keys.
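|version|The version number of the keymap|
|caps_key, scroll_key, num_key|Lock key codes|
|left_shift_key, right_shift_key|Left and right shift key codes|
|left_command_key, right_command_key|Left and right command key codes|
|left_control_key, right_control_key|Left and right control key codes|
|left_option_key, right_option_key|Left and right option key codes|
|menu_key|Menu key code|
|lock_settings|A bitmap containing the default state of the lock keys|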
To programmatically set a modifier key in the system keymap, use the set_modifier_key() function. You can also programmatically set the state of the num lock, caps lock, and scroll lock keys by calling the set_keyboard_locks() function.
The next section of the key_map struct contains maps of offsets into the array of UTF-8 character encodings filled out in the second parameter of get_key_map(). Since the character maps are filled with UTF-8 characters, they may be 1, 2, 3, or rarely 4 bytes long. The characters are stored as Pascal strings, which are not NUL-terminated. The first byte of the string indicates how many bytes the character is made up of. For example, the string for a horizontal ellipsis (…) character is the four bytes 03 E2 80 A6.
The first byte is 03, meaning that the character is 3 bytes long. The remaining bytes E2 80 A6 are the UTF-8 byte representation of the horizontal ellipsis character. Recall that there is no terminating NUL character for these strings.
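To make the length-prefixed layout concrete, here is a minimal sketch of reading one character out of such a buffer. `EncodedChar`, `CharacterAt`, and `kSampleBuffer` are hypothetical names invented for this example; in an application the buffer would be the char array filled out by get_key_map() and the offset would come from one of the character maps.

```cpp
#include <cstddef>

// Hypothetical type: a view of one character inside the character buffer.
struct EncodedChar {
	const char*	bytes;		// first UTF-8 byte, not NUL-terminated
	std::size_t	length;		// number of UTF-8 bytes, 0 if unmapped
};

// Hypothetical helper: reads the length-prefixed string that starts at
// offset in buffer. The first byte is the length, the rest is UTF-8.
constexpr EncodedChar
CharacterAt(const char* buffer, std::size_t offset)
{
	return EncodedChar{
		buffer + offset + 1,
		static_cast<std::size_t>(static_cast<unsigned char>(buffer[offset]))
	};
}

// Sample buffer for illustration: "a" at offset 0 and the horizontal
// ellipsis (03 E2 80 A6) at offset 2.
constexpr char kSampleBuffer[] = "\x01" "a" "\x03\xE2\x80\xA6";
```

Because the strings carry no terminating NUL, callers should always copy exactly `length` bytes rather than treating `bytes` as a C string.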
Not every key is mapped to a character. If a key is unmapped the character array contains a 0-byte string. Unmapped keys do not produce
Modifier keys should not be mapped into the character array.
The following character maps are defined:
|control_map|Map of characters when the control key is pressed|
|option_caps_shift_map|Map of characters when caps lock is turned on and both the option and shift keys are pressed|
|option_caps_map|Map of characters when caps lock is turned on and the option key is pressed|
|option_shift_map|Map of characters when both shift and option keys are pressed|
|option_map|Map of characters when the option key is pressed|
|caps_shift_map|Map of characters when caps lock is on and the shift key is pressed|
|caps_map|Map of characters when caps lock is turned on|
|shift_map|Map of characters when shift is pressed|
|normal_map|Map of characters when no modifier keys are pressed|
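The relationship between modifier state and character map in the table above can be sketched as a simple selection function. This is an illustration only: the `CharacterMap` enum, the `SelectCharacterMap` helper, and the stand-in modifier constants are hypothetical, and the choice to let the control key take precedence over every other combination is an assumption, not documented behavior.

```cpp
#include <cstdint>

// Stand-in modifier bits for illustration only. In a Haiku application,
// use B_SHIFT_KEY, B_CONTROL_KEY, B_CAPS_LOCK, and B_OPTION_KEY from
// <InterfaceDefs.h> instead.
constexpr std::uint32_t kShiftKey = 0x01;
constexpr std::uint32_t kControlKey = 0x04;
constexpr std::uint32_t kCapsLock = 0x08;
constexpr std::uint32_t kOptionKey = 0x40;

// Hypothetical enum with one value per character map.
enum class CharacterMap {
	Control, OptionCapsShift, OptionCaps, OptionShift, Option,
	CapsShift, Caps, Shift, Normal
};

// Hypothetical helper: picks the character map for a modifier state.
// Assumption: control wins over all other modifiers.
constexpr CharacterMap
SelectCharacterMap(std::uint32_t state)
{
	const bool option = (state & kOptionKey) != 0;
	const bool caps = (state & kCapsLock) != 0;
	const bool shift = (state & kShiftKey) != 0;

	if ((state & kControlKey) != 0)
		return CharacterMap::Control;
	if (option && caps && shift)
		return CharacterMap::OptionCapsShift;
	if (option && caps)
		return CharacterMap::OptionCaps;
	if (option && shift)
		return CharacterMap::OptionShift;
	if (option)
		return CharacterMap::Option;
	if (caps && shift)
		return CharacterMap::CapsShift;
	if (caps)
		return CharacterMap::Caps;
	if (shift)
		return CharacterMap::Shift;
	return CharacterMap::Normal;
}
```

The most specific combinations are tested first so that, for example, option plus shift is not mistaken for option alone.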
Dead keys are keys that do not produce a character until they are combined with another key. Because these keys do not produce a character on their own they are considered "dead" until they are "brought to life" by being combined with another key. Dead keys are generally used to produce accented characters.
Each of the fields below is a 32-element array of offsets to dead key characters. The entries are organized into pairs, so each dead key array can contain up to 16 pairs of dead key characters. The first pair in the array should contain B_SPACE followed by the accent character in the second offset. This identifies which accent character the array holds and defines a space-then-accent pair that represents the unadorned accent character.
The rest of the array is filled with pairs containing an unaccented character followed by the accented character.
|acute_dead_key|Acute dead keys array|
|grave_dead_key|Grave dead keys array|
|circumflex_dead_key|Circumflex dead keys array|
|dieresis_dead_key|Dieresis dead keys array|
|tilde_dead_key|Tilde dead keys array|
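The pair layout of a dead key array can be sketched as a lookup over up to 16 pairs. This example assumes that each array entry is an offset into the buffer of length-prefixed UTF-8 strings and that unused entries point at an empty string at offset 0. `SameCharacter`, `FindAccented`, `kSampleChars`, and `kSampleAcute` are hypothetical names made up for the illustration.

```cpp
#include <cstdint>

// Hypothetical helper: compares the length-prefixed strings stored at
// offsets a and b in buffer.
constexpr bool
SameCharacter(const char* buffer, std::int32_t a, std::int32_t b)
{
	const int length = static_cast<unsigned char>(buffer[a]);
	if (static_cast<unsigned char>(buffer[b]) != length)
		return false;
	for (int i = 1; i <= length; i++) {
		if (buffer[a + i] != buffer[b + i])
			return false;
	}
	return true;
}

// Hypothetical helper: searches the 16 pairs of a dead key array for the
// character at baseOffset and returns the offset of its partner, or -1
// if the character has no entry.
constexpr std::int32_t
FindAccented(const std::int32_t (&deadKey)[32], const char* buffer,
	std::int32_t baseOffset)
{
	for (int pair = 0; pair < 16; pair++) {
		if (SameCharacter(buffer, deadKey[pair * 2], baseOffset))
			return deadKey[pair * 2 + 1];
	}
	return -1;
}

// Sample data: empty string at 0, space at 1, acute accent (C2 B4) at 3,
// "e" at 6, e-acute (C3 A9) at 8, "x" at 11.
constexpr char kSampleChars[] = "\x00" "\x01" " " "\x02\xC2\xB4"
	"\x01" "e" "\x02\xC3\xA9" "\x01" "x";

// First pair: space, accent. Second pair: "e", e-acute. The remaining
// entries are zero and so refer to the empty string at offset 0.
constexpr std::int32_t kSampleAcute[32] = { 1, 3, 6, 8 };
```

With this sample data, looking up the space returns the bare accent and looking up "e" returns the accented letter, while "x" has no pair and yields -1.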
The final section contains bitmaps that indicate which character table is used for each of the above dead keys. The bitmap can contain any of the following constants:
The bitmaps often contain B_OPTION_TABLE because accent characters are generally produced by combining a letter with
|acute_tables|Acute dead keys table bitmap|
|grave_tables|Grave dead keys table bitmap|
|circumflex_tables|Circumflex dead keys table bitmap|
|dieresis_tables|Dieresis dead keys table bitmap|
|tilde_tables|Tilde dead keys table bitmap|