UTF-8
When simplicity meets efficiency
🍃 Zero dependencies—meticulously crafted code.
🚀 Blazing fast—almost as fast as light!
🌍 Universal compatibility—Windows, Linux, and macOS.
🛡️ Battle-tested—ready for production.
If you have not already added the library to your project, please review the installation guide for more information.
const utf8 = @import("io").string.utils.utf8;
Convert slice to codepoint
const cp = utf8.decode("😀"); // cp 👉 0x1F600
Convert codepoint to slice
var buf: [4]u8 = undefined;
const len = utf8.encode(0x1F600, &buf); // len 👉 4
// buf 👉 "😀"
Get codepoint length
const len = utf8.getCodepointLength(0x1F600); // 👉 4
Get UTF-8 sequence length
const len = utf8.getFirstByteLength("😀"[0]); // 👉 4
Function | Description |
---|---|
encode | Fast encode of a single Unicode codepoint into a UTF-8 sequence. Returns the number of bytes written. |
decode | Fast decode of a UTF-8 sequence into a Unicode codepoint. Returns the decoded codepoint. |
Function | Description |
---|---|
getCodepointLength | Returns the number of bytes (1-4) needed to encode a codepoint in UTF-8 format. |
getFirstByteLength | Returns the expected number of bytes (1-4) in a UTF-8 sequence based on the first byte. |
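Both length helpers follow from the standard UTF-8 layout: the high bits of the first byte announce the sequence length, and the codepoint's numeric range determines how many bytes it needs. The following is a minimal sketch of that idea, not the library's actual implementation. The names `firstByteLength` and `codepointLength` are hypothetical, and returning 0 for invalid input is an assumption made for this sketch only.

```zig
const std = @import("std");

/// Hypothetical helper: sequence length announced by a leading byte.
/// Returns 0 for bytes that cannot start a sequence (an assumption of this sketch).
fn firstByteLength(first: u8) u3 {
    if ((first & 0x80) == 0x00) return 1; // 0xxxxxxx: ASCII
    if ((first & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((first & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((first & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0; // continuation byte (10xxxxxx) or invalid lead byte
}

/// Hypothetical helper: number of bytes needed to encode a codepoint.
/// Returns 0 above U+10FFFF; surrogates are not checked in this sketch.
fn codepointLength(cp: u21) u3 {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

pub fn main() void {
    // "😀" is U+1F600, encoded as the bytes F0 9F 98 80.
    std.debug.print("{d} {d}\n", .{ firstByteLength("😀"[0]), codepointLength(0x1F600) });
}
```

Both calls in `main` yield 4, matching the usage examples for `getFirstByteLength` and `getCodepointLength`.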
A quick summary with sample performance test results comparing the SuperZIG.io.string.utils.utf8 implementation with its popular competitor, std.unicode.
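For reference, the four operations shown earlier have close counterparts in `std.unicode`. Which `std.unicode` functions the benchmark actually calls is not stated here, so treat the mapping below as an assumption. It is a sketch written against Zig 0.14.0.

```zig
const std = @import("std");

pub fn main() !void {
    // Decode: the slice must hold exactly one encoded codepoint.
    const cp = try std.unicode.utf8Decode("😀"); // cp is 0x1F600

    // Encode: returns the number of bytes written into buf.
    var buf: [4]u8 = undefined;
    const len = try std.unicode.utf8Encode(0x1F600, &buf); // len is 4

    // Bytes needed for a codepoint, and bytes announced by a first byte.
    const cp_len = try std.unicode.utf8CodepointSequenceLength(0x1F600);
    const seq_len = try std.unicode.utf8ByteSequenceLength("😀"[0]);

    std.debug.print("U+{X} {d} {d} {d}\n", .{ cp, len, cp_len, seq_len });
}
```

One visible difference from the `utf8` examples above is that the `std.unicode` functions return error unions, so each call needs `try` or `catch`.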
In summary, io is roughly 3 times faster than std in debug builds and about 5 times faster in release builds, thanks to its optimized implementation. ✨
Debug build (zig build run -- utf8)
Implementation | Scale | Runs | Total Time | Avg Time/Run |
---|---|---|---|---|
std | x10 | 1099 | 2.01s | 1.829ms |
io | x10 | 3158 | 2.043s | 646.947μs |
std | x1k | 10 | 1.885s | 188.5ms |
io | x1k | 32 | 2.031s | 63.473ms |
std | x100k | 1 | 18.938s | 18.938s |
io | x100k | 1 | 6.38s | 6.38s |
Release build (zig build run --release=fast -- utf8)
Implementation | Scale | Runs | Total Time | Avg Time/Run |
---|---|---|---|---|
std | x10 | 10829 | 1.963s | 181.28μs |
io | x10 | 60651 | 1.951s | 32.169μs |
std | x1k | 103 | 1.974s | 19.171ms |
io | x1k | 592 | 2.101s | 3.55ms |
std | x100k | 1 | 1.936s | 1.936s |
io | x100k | 5 | 1.764s | 352.809ms |
It is normal for the values to differ each time the benchmark is run, but in general the ratios between io and std will remain close.
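The speedup at each scale can be recomputed from the Avg Time/Run columns by dividing the std time by the io time. The sketch below does that for the rows in the tables above, with all times converted to milliseconds. The `speedup` helper and the row labels are illustrative names, not part of the benchmark.

```zig
const std = @import("std");

/// Speedup of io over std: ratio of average times per run (same unit for both).
fn speedup(std_avg: f64, io_avg: f64) f64 {
    return std_avg / io_avg;
}

pub fn main() void {
    // Avg Time/Run values copied from the tables, converted to milliseconds.
    const rows = [_]struct { name: []const u8, std_ms: f64, io_ms: f64 }{
        .{ .name = "debug x10", .std_ms = 1.829, .io_ms = 0.646947 },
        .{ .name = "debug x1k", .std_ms = 188.5, .io_ms = 63.473 },
        .{ .name = "debug x100k", .std_ms = 18938.0, .io_ms = 6380.0 },
        .{ .name = "release x10", .std_ms = 0.18128, .io_ms = 0.032169 },
        .{ .name = "release x1k", .std_ms = 19.171, .io_ms = 3.55 },
        .{ .name = "release x100k", .std_ms = 1936.0, .io_ms = 352.809 },
    };
    for (rows) |r| {
        std.debug.print("{s}: {d:.2}x\n", .{ r.name, speedup(r.std_ms, r.io_ms) });
    }
}
```

With these figures the debug rows come out a little under 3x and the release rows around 5.4x to 5.6x.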
The benchmarks were run on a Windows 11 (24H2) machine with an 11th Gen Intel® Core™ i5-1155G7 × 8 processor and 32GB of RAM.
The version of Zig used is 0.14.0.
The source code of this benchmark is in bench/string/utils/utf8.zig.