GNU Libidn API Reference Manual
#define STRINGPREP_VERSION
enum Stringprep_rc;
enum Stringprep_profile_flags;
enum Stringprep_profile_steps;
#define STRINGPREP_MAX_MAP_CHARS
struct Stringprep_table_element;
struct Stringprep_table;
typedef Stringprep_profile;
struct Stringprep_profiles;
#define stringprep_nameprep(in, maxlen)
#define stringprep_nameprep_no_unassigned(in, maxlen)
#define stringprep_plain(in, maxlen)
#define stringprep_kerberos5(in, maxlen)
#define stringprep_xmpp_nodeprep(in, maxlen)
#define stringprep_xmpp_resourceprep(in, maxlen)
#define stringprep_iscsi(in, maxlen)
int stringprep_4i (uint32_t *ucs4, size_t *len, size_t maxucs4len, Stringprep_profile_flags flags, const Stringprep_profile *profile);
int stringprep_4zi (uint32_t *ucs4, size_t maxucs4len, Stringprep_profile_flags flags, const Stringprep_profile *profile);
int stringprep (char *in, size_t maxlen, Stringprep_profile_flags flags, const Stringprep_profile *profile);
int stringprep_profile (const char *in, char **out, const char *profile, Stringprep_profile_flags flags);
const char* stringprep_strerror (Stringprep_rc rc);
const char* stringprep_check_version (const char *req_version);
int stringprep_unichar_to_utf8 (uint32_t c, char *outbuf);
uint32_t stringprep_utf8_to_unichar (const char *p);
uint32_t* stringprep_utf8_to_ucs4 (const char *str, ssize_t len, size_t *items_written);
char* stringprep_ucs4_to_utf8 (const uint32_t *str, ssize_t len, size_t *items_read, size_t *items_written);
char* stringprep_utf8_nfkc_normalize (const char *str, ssize_t len);
uint32_t* stringprep_ucs4_nfkc_normalize (uint32_t *str, ssize_t len);
const char* stringprep_locale_charset (void);
char* stringprep_convert (const char *str, const char *to_codeset, const char *from_codeset);
char* stringprep_locale_to_utf8 (const char *str);
char* stringprep_utf8_to_locale (const char *str);
#define STRINGPREP_VERSION "0.6.7"
A string, defined via the C preprocessor, denoting the header file version number. Use it together with stringprep_check_version() to verify that the header file and the run-time library are consistent.
typedef enum {
  STRINGPREP_OK = 0,
  /* Stringprep errors. */
  STRINGPREP_CONTAINS_UNASSIGNED = 1,
  STRINGPREP_CONTAINS_PROHIBITED = 2,
  STRINGPREP_BIDI_BOTH_L_AND_RAL = 3,
  STRINGPREP_BIDI_LEADTRAIL_NOT_RAL = 4,
  STRINGPREP_BIDI_CONTAINS_PROHIBITED = 5,
  /* Error in calling application. */
  STRINGPREP_TOO_SMALL_BUFFER = 100,
  STRINGPREP_PROFILE_ERROR = 101,
  STRINGPREP_FLAG_ERROR = 102,
  STRINGPREP_UNKNOWN_PROFILE = 103,
  /* Internal errors. */
  STRINGPREP_NFKC_FAILED = 200,
  STRINGPREP_MALLOC_ERROR = 201
} Stringprep_rc;
Enumerated return codes of the stringprep() and stringprep_profile() functions (and of the macros that use those functions). The value 0 is guaranteed to always correspond to success.
typedef enum {
  STRINGPREP_NO_NFKC = 1,
  STRINGPREP_NO_BIDI = 2,
  STRINGPREP_NO_UNASSIGNED = 4
} Stringprep_profile_flags;
Stringprep profile flags.
typedef enum {
  STRINGPREP_NFKC = 1,
  STRINGPREP_BIDI = 2,
  STRINGPREP_MAP_TABLE = 3,
  STRINGPREP_UNASSIGNED_TABLE = 4,
  STRINGPREP_PROHIBIT_TABLE = 5,
  STRINGPREP_BIDI_PROHIBIT_TABLE = 6,
  STRINGPREP_BIDI_RAL_TABLE = 7,
  STRINGPREP_BIDI_L_TABLE = 8
} Stringprep_profile_steps;
The steps of the stringprep algorithm. These are only useful if you want to add another profile; study the source code to understand how each step is used.
#define STRINGPREP_MAX_MAP_CHARS 4
Maximum number of code points that can replace a single code point, during stringprep mapping.
struct Stringprep_table_element {
  uint32_t start;
  uint32_t end; /* 0 if only one character */
  uint32_t map[STRINGPREP_MAX_MAP_CHARS]; /* NULL if end is not 0 */
};
struct Stringprep_table {
  Stringprep_profile_steps operation;
  Stringprep_profile_flags flags;
  const Stringprep_table_element *table;
};
struct Stringprep_profiles {
  const char *name;
  const Stringprep_profile *tables;
};
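To illustrate how the start and end fields of a table element might be read, here is a standalone sketch of a range lookup. It is not libidn's code: the struct is a local mirror so that the example compiles without the library header, the names table_element, demo_table and find_element are hypothetical, and the all-zero terminating entry is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_MAP_CHARS 4   /* mirrors STRINGPREP_MAX_MAP_CHARS */

/* Local mirror of struct Stringprep_table_element (hypothetical name). */
struct table_element {
    uint32_t start;
    uint32_t end;                 /* 0 if only one character */
    uint32_t map[MAX_MAP_CHARS];  /* replacement code points, if any */
};

/* Made-up demo data: one single-character mapping and one range. */
static const struct table_element demo_table[] = {
    { 0x0041, 0,      { 0x0061 } },  /* U+0041 maps to U+0061 */
    { 0x0080, 0x009F, { 0 } },       /* range U+0080..U+009F, no mapping */
    { 0, 0, { 0 } }                  /* assumed sentinel: all-zero entry */
};

/* Hypothetical lookup: returns the element covering cp, or NULL. */
const struct table_element *
find_element(const struct table_element *table, uint32_t cp)
{
    for (; table->start != 0 || table->end != 0; table++) {
        /* end == 0 means the element describes a single code point. */
        uint32_t last = table->end ? table->end : table->start;
        if (cp >= table->start && cp <= last)
            return table;
    }
    return NULL;
}
```

With the demo data, find_element(demo_table, 0x0090) lands on the range entry, while a code point outside both entries yields NULL.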
#define stringprep_nameprep(in, maxlen)
Prepare the input UTF-8 string according to the nameprep profile.
The AllowUnassigned flag is true; use stringprep_nameprep_no_unassigned() if you want AllowUnassigned to be false. Returns 0 iff successful, or an error code.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
#define stringprep_nameprep_no_unassigned(in, maxlen)
Prepare the input UTF-8 string according to the nameprep profile.
The AllowUnassigned flag is false; use stringprep_nameprep() if you want AllowUnassigned to be true. Returns 0 iff successful, or an error code.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
#define stringprep_plain(in, maxlen)
Prepare the input UTF-8 string according to the draft SASL ANONYMOUS profile. Returns 0 iff successful, or an error code.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
#define stringprep_xmpp_nodeprep(in, maxlen)
Prepare the input UTF-8 string according to the draft XMPP node identifier profile. Returns 0 iff successful, or an error code.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
#define stringprep_xmpp_resourceprep(in, maxlen)
Prepare the input UTF-8 string according to the draft XMPP resource identifier profile. Returns 0 iff successful, or an error code.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
#define stringprep_iscsi(in, maxlen)
Prepare the input UTF-8 string according to the draft iSCSI stringprep profile. Returns 0 iff successful, or an error code.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
int stringprep_4i (uint32_t *ucs4, size_t *len, size_t maxucs4len, Stringprep_profile_flags flags, const Stringprep_profile *profile);
Prepare the input UCS-4 string according to the stringprep profile, and write back the result to the input string.
The input is not required to be zero terminated (ucs4[len] = 0). The output will not be zero terminated unless ucs4[len] = 0. See stringprep_4zi() instead if your input is zero terminated or if you want the output to be.
Since the stringprep operation can expand the string, maxucs4len indicates how large the buffer holding the string is. This function will not read or write code points outside that size.
The flags are one of the Stringprep_profile_flags values, or 0.
The profile contains the Stringprep_profile instructions to perform. Your application can define new profiles, possibly re-using the generic stringprep tables that will always be part of the library, or use one of the currently supported profiles.
ucs4 : input/output array with string to prepare.
len : on input, length of input array with Unicode code points; on exit, length of output array with Unicode code points.
maxucs4len : maximum length of input/output array.
flags : a Stringprep_profile_flags value, or 0.
profile : pointer to Stringprep_profile to use.
Returns : STRINGPREP_OK iff successful, or a Stringprep_rc error code.
int stringprep_4zi (uint32_t *ucs4, size_t maxucs4len, Stringprep_profile_flags flags, const Stringprep_profile *profile);
Prepare the input zero terminated UCS-4 string according to the stringprep profile, and write back the result to the input string.
Since the stringprep operation can expand the string, maxucs4len indicates how large the buffer holding the string is. This function will not read or write code points outside that size.
The flags are one of the Stringprep_profile_flags values, or 0.
The profile contains the Stringprep_profile instructions to perform. Your application can define new profiles, possibly re-using the generic stringprep tables that will always be part of the library, or use one of the currently supported profiles.
ucs4 : input/output array with zero terminated string to prepare.
maxucs4len : maximum length of input/output array.
flags : a Stringprep_profile_flags value, or 0.
profile : pointer to Stringprep_profile to use.
Returns : STRINGPREP_OK iff successful, or a Stringprep_rc error code.
int stringprep (char *in, size_t maxlen, Stringprep_profile_flags flags, const Stringprep_profile *profile);
Prepare the input zero terminated UTF-8 string according to the stringprep profile, and write back the result to the input string.
Note that you must convert strings entered in the system's locale into UTF-8 before using this function; see stringprep_locale_to_utf8().
Since the stringprep operation can expand the string, maxlen indicates how large the buffer holding the string is. This function will not read or write characters outside that size.
The flags are one of the Stringprep_profile_flags values, or 0.
The profile contains the Stringprep_profile instructions to perform. Your application can define new profiles, possibly re-using the generic stringprep tables that will always be part of the library, or use one of the currently supported profiles.
in : input/output array with string to prepare.
maxlen : maximum length of input/output array.
flags : a Stringprep_profile_flags value, or 0.
profile : pointer to Stringprep_profile to use.
Returns : STRINGPREP_OK iff successful, or an error code.
int stringprep_profile (const char *in, char **out, const char *profile, Stringprep_profile_flags flags);
Prepare the input zero terminated UTF-8 string according to the stringprep profile, and return the result in a newly allocated variable.
Note that you must convert strings entered in the system's locale into UTF-8 before using this function; see stringprep_locale_to_utf8().
The output variable out must be deallocated by the caller.
The flags are one of the Stringprep_profile_flags values, or 0.
The profile specifies the name of the stringprep profile to use. It must be one of the internally supported stringprep profiles.
in : input array with UTF-8 string to prepare.
out : output variable with pointer to newly allocated string.
profile : name of stringprep profile to use.
flags : a Stringprep_profile_flags value, or 0.
Returns : STRINGPREP_OK iff successful, or an error code.
const char* stringprep_strerror (Stringprep_rc rc);
Convert a return code integer to a text string. This string can be used to output a diagnostic message to the user.
STRINGPREP_OK: Successful operation. This value is guaranteed to always be zero; the remaining ones are only guaranteed to hold non-zero values, for logical comparison purposes.
STRINGPREP_CONTAINS_UNASSIGNED: String contains unassigned Unicode code points, which is forbidden by the profile.
STRINGPREP_CONTAINS_PROHIBITED: String contains code points prohibited by the profile.
STRINGPREP_BIDI_BOTH_L_AND_RAL: String contains code points with conflicting bidirectional category.
STRINGPREP_BIDI_LEADTRAIL_NOT_RAL: Leading and trailing characters in string are not of the proper bidirectional category.
STRINGPREP_BIDI_CONTAINS_PROHIBITED: String contains prohibited code points detected by the bidirectional code.
STRINGPREP_TOO_SMALL_BUFFER: Buffer handed to function was too small. This usually indicates a problem in the calling application.
STRINGPREP_PROFILE_ERROR: The stringprep profile was inconsistent. This usually indicates an internal error in the library.
STRINGPREP_FLAG_ERROR: The supplied flag conflicted with the profile. This usually indicates a problem in the calling application.
STRINGPREP_UNKNOWN_PROFILE: The supplied profile name was not known to the library.
STRINGPREP_NFKC_FAILED: The Unicode NFKC operation failed. This usually indicates an internal error in the library.
STRINGPREP_MALLOC_ERROR: malloc() ran out of memory. This is usually a fatal error.
rc : a Stringprep_rc return code.
Returns : a pointer to a statically allocated string containing a description of the error with the return code rc.
const char* stringprep_check_version (const char *req_version);
Check that the version of the library is at minimum the requested one and return the version string; return NULL if the condition is not satisfied. If NULL is passed to this function, no check is done and the version string is simply returned.
See STRINGPREP_VERSION for a suitable req_version string.
req_version : required version number, or NULL.
Returns : version string of the run-time library, or NULL if the run-time library does not meet the required version number.
int stringprep_unichar_to_utf8 (uint32_t c, char *outbuf);
Converts a single character to UTF-8.
c : an ISO 10646 character code.
outbuf : output buffer, must have at least 6 bytes of space. If NULL, the length will be computed and returned and nothing will be written to outbuf.
Returns : number of bytes written.
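To illustrate what such a conversion computes, here is a standalone sketch of UTF-8 encoding for a single code point. It is not libidn's implementation: the name encode_utf8 is hypothetical, it stops at U+10FFFF and therefore emits at most 4 bytes (the library function asks for 6 bytes of space), and it does not reject surrogate code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, not libidn code: writes the UTF-8 form of c to
   out when out is non-NULL and returns the number of bytes needed, or 0
   if c is above U+10FFFF. */
int encode_utf8(uint32_t c, char *out)
{
    unsigned char tmp[4];
    int n;

    if (c < 0x80) {                    /* 1 byte: 0xxxxxxx */
        tmp[0] = (unsigned char)c;
        n = 1;
    } else if (c < 0x800) {            /* 2 bytes: 110xxxxx 10xxxxxx */
        tmp[0] = (unsigned char)(0xC0 | (c >> 6));
        tmp[1] = (unsigned char)(0x80 | (c & 0x3F));
        n = 2;
    } else if (c < 0x10000) {          /* 3 bytes: 1110xxxx + 2 trailers */
        tmp[0] = (unsigned char)(0xE0 | (c >> 12));
        tmp[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        tmp[2] = (unsigned char)(0x80 | (c & 0x3F));
        n = 3;
    } else if (c <= 0x10FFFF) {        /* 4 bytes: 11110xxx + 3 trailers */
        tmp[0] = (unsigned char)(0xF0 | (c >> 18));
        tmp[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        tmp[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        tmp[3] = (unsigned char)(0x80 | (c & 0x3F));
        n = 4;
    } else {
        return 0;
    }

    if (out != NULL)
        for (int i = 0; i < n; i++)
            out[i] = (char)tmp[i];
    return n;
}
```

Passing NULL as out only computes the length, mirroring the behaviour described for outbuf above.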
uint32_t stringprep_utf8_to_unichar (const char *p);
Converts a sequence of bytes encoded as UTF-8 to a Unicode character. If p does not point to a valid UTF-8 encoded character, results are undefined.
p : a pointer to a Unicode character encoded as UTF-8.
Returns : the resulting character.
uint32_t* stringprep_utf8_to_ucs4 (const char *str, ssize_t len, size_t *items_written);
Convert a string from UTF-8 to a 32-bit fixed width representation as UCS-4, assuming valid UTF-8 input. This function does no error checking on the input.
str : a UTF-8 encoded string.
len : the maximum length of str to use. If len < 0, then the string is nul-terminated.
items_written : location to store the number of characters in the result, or NULL.
Returns : a pointer to a newly allocated UCS-4 string. This value must be freed with free().
char* stringprep_ucs4_to_utf8 (const uint32_t *str, ssize_t len, size_t *items_read, size_t *items_written);
Convert a string from a 32-bit fixed width representation as UCS-4 to UTF-8. The result will be terminated with a 0 byte.
str : a UCS-4 encoded string.
len : the maximum length of str to use. If len < 0, then the string is terminated with a 0 character.
items_read : location to store the number of characters read, or NULL.
items_written : location to store the number of bytes written, or NULL. The value stored here does not include the trailing 0 byte.
Returns : a pointer to a newly allocated UTF-8 string. This value must be freed with free(). If an error occurs, NULL will be returned.
char* stringprep_utf8_nfkc_normalize (const char *str, ssize_t len);
Converts a string into canonical form, standardizing such issues as whether a character with an accent is represented as a base character and combining accent or as a single precomposed character.
The normalization mode is NFKC (ALL COMPOSE). It standardizes differences that do not affect the text content, such as the above-mentioned accent representation. It standardizes the "compatibility" characters in Unicode, such as SUPERSCRIPT THREE to the standard forms (in this case DIGIT THREE). Formatting information may be lost but for most text operations such characters should be considered the same. It returns a result with composed forms rather than a maximally decomposed form.
str : a UTF-8 encoded string.
len : length of str, in bytes, or -1 if str is nul-terminated.
Returns : a newly allocated string that is the NFKC normalized form of str.
uint32_t* stringprep_ucs4_nfkc_normalize (uint32_t *str, ssize_t len);
Converts a UCS-4 string into UTF-8 and runs stringprep_utf8_nfkc_normalize().
str : a Unicode string.
len : length of str array, or -1 if str is nul-terminated.
Returns : a newly allocated Unicode string that is the NFKC normalized form of str.
const char* stringprep_locale_charset (void);
Find out the current locale charset. The function respects the CHARSET environment variable, but typically uses nl_langinfo(CODESET) when it is supported. It falls back on "ASCII" if CHARSET isn't set and nl_langinfo isn't supported or doesn't return anything.
Note that this function returns the application's locale's preferred charset (or the thread's locale's preferred charset, if your system supports thread-specific locales). It does not return what the system may be using. Thus, if you receive data from external sources, you cannot in general use this function to guess what charset it is encoded in. To get such data into the locale encoding, use stringprep_convert() to convert from the external representation into the charset returned by this function.
Returns : the character set used by the current locale. It will never return NULL, but uses "ASCII" as a fallback.
char* stringprep_convert (const char *str, const char *to_codeset, const char *from_codeset);
Convert the string from one character set to another using the system's iconv() function.
str : input zero-terminated string.
to_codeset : name of destination character set.
from_codeset : name of origin character set, as used by str.
Returns : newly allocated zero-terminated string which is str transcoded into to_codeset.
char* stringprep_locale_to_utf8 (const char *str);
Convert a string encoded in the locale's character set into UTF-8 by using stringprep_convert().
str : input zero-terminated string.
Returns : newly allocated zero-terminated string which is str transcoded into UTF-8.
char* stringprep_utf8_to_locale (const char *str);
Convert a string encoded in UTF-8 into the locale's character set by using stringprep_convert().
str : input zero-terminated string.
Returns : newly allocated zero-terminated string which is str transcoded into the locale's character set.