Uutf.String
Fold over the characters of UTF encoded OCaml string values.
Note. Since OCaml 4.14, UTF decoders are available in Stdlib.String. You are encouraged to migrate to them.
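As a rough illustration of the migration, the sketch below folds over a UTF-8 string using only Stdlib decoders (OCaml 4.14 or later). The names fold_utf_8_stdlib and count_uchars are hypothetical and not part of Uutf or Stdlib. The way malformed bytes are split into segments follows Stdlib and may differ from what Uutf reports.

```ocaml
(* Hypothetical helper: a Uutf-style fold built on Stdlib.String (>= 4.14). *)
let fold_utf_8_stdlib f acc s =
  let len = String.length s in
  let rec loop acc i =
    if i >= len then acc
    else
      let d = String.get_utf_8_uchar s i in
      (* Number of bytes consumed by this decode, valid or not. *)
      let n = Uchar.utf_decode_length d in
      let v =
        if Uchar.utf_decode_is_valid d
        then `Uchar (Uchar.utf_decode_uchar d)
        else `Malformed (String.sub s i n)
      in
      loop (f acc i v) (i + n)
  in
  loop acc 0

(* Count the validly decoded characters, skipping malformed bytes. *)
let count_uchars s =
  let f n _ = function `Uchar _ -> n + 1 | `Malformed _ -> n in
  fold_utf_8_stdlib f 0 s
```

The folding function receives the accumulator, the byte index and the decode result, in the same order as the folder type described on this page.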
encoding_guess s is the encoding guessed for s coupled with true iff there's an initial BOM.
Note. Initial BOMs are also folded over.
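A minimal sketch of how the guess might be inspected. It requires the uutf library and assumes the guessed encoding is one of `UTF_8, `UTF_16BE or `UTF_16LE, which is not spelled out in the text above. The name describe is hypothetical.

```ocaml
(* Requires the uutf library. [describe] is an illustrative helper. *)
let describe s =
  let enc, has_bom = Uutf.String.encoding_guess s in
  let name =
    match enc with
    | `UTF_8 -> "UTF-8"
    | `UTF_16BE -> "UTF-16BE"
    | `UTF_16LE -> "UTF-16LE"
  in
  (* Append a marker when the string starts with a BOM. *)
  if has_bom then name ^ " (BOM)" else name
```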
type 'a folder =
'a ->
int ->
[ `Uchar of Stdlib.Uchar.t | `Malformed of string ] ->
'a
The type for character folders. The integer is the index in the string where the `Uchar or `Malformed starts.
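For illustration, here is one function that fits the folder shape: it accumulates the index and bytes of every malformed sequence and ignores valid characters. The name collect_malformed is hypothetical.

```ocaml
(* A folder-shaped function: accumulator, start index, decode result. *)
let collect_malformed acc pos = function
  | `Uchar _ -> acc
  | `Malformed bytes -> (pos, bytes) :: acc
```

Since results are consed onto the accumulator, they come out in reverse order of appearance. Apply List.rev at the end if the original order matters.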
val fold_utf_8 : ?pos:int -> ?len:int -> 'a folder -> 'a -> string -> 'a
fold_utf_8 ?pos ?len f a s is f (... (f (f a pos u_0) j_1 u_1) ...) j_n u_n where u_i, j_i are the characters and their start positions in the UTF-8 encoded substring of s starting at pos and len long. The default value for pos is 0 and the default value for len is String.length s - pos.
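A short sketch of the fold in use, assuming the uutf library is installed. It lists the byte index at which each character or malformed sequence starts. The name start_positions is hypothetical.

```ocaml
(* Requires the uutf library. *)
let start_positions s =
  (* Cons each start index, then restore the original order. *)
  List.rev (Uutf.String.fold_utf_8 (fun acc i _ -> i :: acc) [] s)
```

For the string "h\xC3\xA9llo", the two-byte sequence for U+00E9 starts at index 1, so the following character starts at index 3.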
val fold_utf_16be : ?pos:int -> ?len:int -> 'a folder -> 'a -> string -> 'a
fold_utf_16be ?pos ?len f a s is f (... (f (f a pos u_0) j_1 u_1) ...) j_n u_n where u_i, j_i are the characters and their start positions in the UTF-16BE encoded substring of s starting at pos and len long. The default value for pos is 0 and the default value for len is String.length s - pos.
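The sketch below, which assumes the uutf library, pairs each decoded code point with its start index. A surrogate pair occupies four bytes and is reported as a single character. The name code_points_16be is hypothetical.

```ocaml
(* Requires the uutf library. Malformed sequences are dropped here. *)
let code_points_16be s =
  let f acc i = function
    | `Uchar u -> (i, Uchar.to_int u) :: acc
    | `Malformed _ -> acc
  in
  List.rev (Uutf.String.fold_utf_16be f [] s)
```

For example, the bytes 00 41 D8 3D DE 00 hold U+0041 at index 0 followed by the surrogate pair for U+1F600 at index 2.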
val fold_utf_16le : ?pos:int -> ?len:int -> 'a folder -> 'a -> string -> 'a
fold_utf_16le ?pos ?len f a s is f (... (f (f a pos u_0) j_1 u_1) ...) j_n u_n where u_i, j_i are the characters and their start positions in the UTF-16LE encoded substring of s starting at pos and len long. The default value for pos is 0 and the default value for len is String.length s - pos.
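One possible use of this fold is transcoding. The sketch below, which assumes the uutf library, re-encodes a UTF-16LE string as UTF-8 with Stdlib's Buffer. Replacing malformed sequences with U+FFFD is a choice made for this example, not something the library does on its own. The name utf_16le_to_utf_8 is hypothetical.

```ocaml
(* Requires the uutf library. *)
let utf_16le_to_utf_8 s =
  let b = Buffer.create (String.length s) in
  let add () _ = function
    | `Uchar u -> Buffer.add_utf_8_uchar b u
    (* Example policy: substitute the replacement character U+FFFD. *)
    | `Malformed _ -> Buffer.add_utf_8_uchar b Uchar.rep
  in
  Uutf.String.fold_utf_16le add () s;
  Buffer.contents b
```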