CCUtf8_string
Unicode String, in UTF8
A unicode string represented by a utf8 bytestring. This representation is convenient for manipulating normal OCaml strings that are encoded in UTF8.
We perform only basic decoding and encoding between codepoints and bytestrings. For more elaborate operations, please use the excellent Uutf.
status: experimental
val hash : t -> int
val pp : Stdlib.Format.formatter -> t -> unit
val to_string : t -> string
Identity.
Iter of unicode codepoints. Renamed from to_std_seq since 3.0.
val n_chars : t -> int
Number of characters.
val n_bytes : t -> int
Number of bytes.
val empty : t
Empty string.
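As a small illustration of how the two counts differ, the sketch below assumes the containers library is installed and linked, and builds its value with of_string_exn (documented further down). The sample text and the binding names are made up for the example.

```ocaml
(* Sketch only: assumes the containers library is available to the build. *)

(* "héllo" written with explicit UTF-8 bytes: é is the two bytes C3 A9. *)
let s : CCUtf8_string.t = CCUtf8_string.of_string_exn "h\xc3\xa9llo"

(* Codepoints: h, é, l, l, o. *)
let chars = CCUtf8_string.n_chars s

(* Bytes: 1 + 2 + 1 + 1 + 1. *)
let bytes = CCUtf8_string.n_bytes s

(* The empty string has neither codepoints nor bytes. *)
let empty_is_empty =
  CCUtf8_string.n_chars CCUtf8_string.empty = 0
  && CCUtf8_string.n_bytes CCUtf8_string.empty = 0
```

For this input the codepoint count is 5 while the byte count is 6, because é occupies two bytes.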
concat sep l concatenates each string in l, inserting sep in between each string. Similar to String.concat.
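A possible use of concat is sketched below. The signature is not shown on this page, so the sketch assumes it is `t -> t list -> t`, as the description suggests; the containers library is assumed to be installed, and the sample values are made up.

```ocaml
(* Sketch only: assumes containers is installed and that
   concat has type t -> t list -> t, as the description above suggests. *)

(* Separator and parts are validated once, when they enter the module. *)
let sep = CCUtf8_string.of_string_exn ", "
let parts = List.map CCUtf8_string.of_string_exn [ "a"; "b"; "c" ]

(* Join the parts, inserting the separator between neighbours. *)
let joined : string = CCUtf8_string.to_string (CCUtf8_string.concat sep parts)
```

Under those assumptions, joined is the plain string "a, b, c", mirroring what String.concat would produce on ordinary strings.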
Build a string from unicode codepoints. Renamed from of_std_seq since 3.0.
Translate the unicode codepoint to a list of utf-8 bytes. This can be used, for example, in combination with Buffer.add_char on a pre-allocated buffer to add the bytes one by one (despite its name, Buffer.add_char takes individual bytes, not unicode codepoints).
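To make the byte-level view concrete, here is a hand-written sketch of UTF-8 encoding that uses only the OCaml standard library. The helper utf8_bytes is hypothetical: it is not this module's function, and it returns a plain list for simplicity. It only shows how the bytes of one codepoint can be pushed into a Buffer with Buffer.add_char.

```ocaml
(* Hypothetical helper, not part of CCUtf8_string:
   encode one codepoint as its 1 to 4 UTF-8 bytes. *)
let utf8_bytes (u : Uchar.t) : char list =
  let c = Uchar.to_int u in
  let byte n = Char.chr (n land 0xff) in
  if c < 0x80 then [ byte c ]
  else if c < 0x800 then
    [ byte (0xc0 lor (c lsr 6)); byte (0x80 lor (c land 0x3f)) ]
  else if c < 0x10000 then
    [ byte (0xe0 lor (c lsr 12));
      byte (0x80 lor ((c lsr 6) land 0x3f));
      byte (0x80 lor (c land 0x3f)) ]
  else
    [ byte (0xf0 lor (c lsr 18));
      byte (0x80 lor ((c lsr 12) land 0x3f));
      byte (0x80 lor ((c lsr 6) land 0x3f));
      byte (0x80 lor (c land 0x3f)) ]

(* Add every byte of every codepoint to a pre-allocated buffer. *)
let encode_all (codepoints : Uchar.t list) : string =
  let buf = Buffer.create 16 in
  List.iter (fun u -> List.iter (Buffer.add_char buf) (utf8_bytes u)) codepoints;
  Buffer.contents buf

(* U+00E9 (é) followed by U+20AC (€). *)
let encoded = encode_all [ Uchar.of_int 0xe9; Uchar.of_int 0x20ac ]
```

Here encoded holds five bytes: C3 A9 for é, then E2 82 AC for €. Each Buffer.add_char call contributes exactly one byte, not one codepoint.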
val of_string_exn : string -> t
Validate string by checking it is valid UTF8.
val of_string : string -> t option
Safe version of of_string_exn.
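The option-returning variant is convenient when input comes from an untrusted source. The sketch below assumes the containers library is installed; the function name describe and its messages are made up for the example.

```ocaml
(* Sketch only: assumes the containers library is available to the build. *)

(* Hypothetical helper: report whether a raw string is valid UTF-8. *)
let describe (raw : string) : string =
  match CCUtf8_string.of_string raw with
  | Some s -> Printf.sprintf "valid: %d codepoints" (CCUtf8_string.n_chars s)
  | None -> "invalid UTF-8"

(* "hé" is valid; the lone byte FF can never appear in UTF-8. *)
let ok = describe "h\xc3\xa9"
let bad = describe "\xff"
```

Matching on the option keeps the failure case explicit at the call site, instead of relying on an exception from of_string_exn.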
val unsafe_of_string : string -> t
Conversion from a string without validating. CAUTION: this is unsafe and can break all the other functions in this module. Use only if you're sure the string is valid UTF8. Upon iteration, if an invalid substring is met, Malformed will be raised.