let str = "ačŘ"
print("str has \(str.characters.count) characters") // 3
print("and \(str.utf8.count) bytes as encoded in UTF-8") // 5
Update (based on your notes)
let s = "✌🏿️"
let arr:[UInt8] = [226, 156, 140, 240, 159, 143, 191, 239, 184, 143]
var arrCchar = arr.map { (uint8) -> Int8 in
    Int8(bitPattern: uint8)
}
arrCchar += [0] // to be null terminated
let str = String.fromCString(&arrCchar)
print(str) // Optional("✌🏿️")
s == str // TRUE !!!!
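
For anyone on a newer toolchain: `String.fromCString` is a Swift 2 API. A minimal sketch of the same round trip, assuming Swift 5 and using the standard library's `String(decoding:as:)`. The escaped literal is an assumption on my side: it spells out the three scalars that the byte array decodes to (U+270C, U+1F3FF, U+FE0F), so the snippet does not depend on how the emoji survives copy and paste.

```swift
// The same string, written as explicit Unicode scalars (assumed equivalent to `s` above).
let s = "\u{270C}\u{1F3FF}\u{FE0F}"

// The UTF-8 bytes from the answer.
let arr: [UInt8] = [226, 156, 140, 240, 159, 143, 191, 239, 184, 143]

// String(decoding:as:) builds a String straight from UTF-8 code units.
// No null terminator and no Int8 conversion are needed, and the result is not optional.
let decoded = String(decoding: arr, as: UTF8.self)

print(decoded == s)          // true
print(Array(s.utf8) == arr)  // true
```

Note that `String(decoding:as:)` replaces invalid byte sequences with U+FFFD instead of failing, so it is not a validity check.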
Character by character
s.characters.forEach { (c) -> () in
    let str = String(c)
    print(str.utf8.map{$0}, "which represents character: ", c)
    str.unicodeScalars.forEach({ (u) -> () in
        print("composed from unicode scalar(s): ", u.debugDescription)
    })
}
/*
[226, 156, 140] which represents character: ✌
composed from unicode scalar(s): "\u{270C}"
[240, 159, 143, 191, 239, 184, 143] which represents character: 🏿️
composed from unicode scalar(s): "\u{0001F3FF}"
composed from unicode scalar(s): "\u{FE0F}"
*/
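
How a string is split into `Character` values depends on the grapheme-breaking rules of the Swift version in use, so the per-character grouping above may come out differently on a newer toolchain. A sketch that walks the Unicode scalars instead, which does not depend on those rules, assuming Swift 5 (the name `breakdown` is just for illustration):

```swift
// The same three scalars as in the example above, written explicitly.
let s = "\u{270C}\u{1F3FF}\u{FE0F}"

// Pair every Unicode scalar with the UTF-8 bytes that encode it.
let breakdown: [(scalar: Unicode.Scalar, bytes: [UInt8])] = s.unicodeScalars.map { scalar in
    (scalar: scalar, bytes: Array(String(Character(scalar)).utf8))
}

for item in breakdown {
    print(item.scalar.debugDescription, "is encoded as", item.bytes)
}
```

The byte arrays printed here should match the ones in the output above: three bytes for U+270C, four for U+1F3FF and three for U+FE0F.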
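Every character in Unicode can be represented by one or more Unicode scalars. A Unicode scalar is a unique 21-bit number (and name) for a character or modifier, such as U+0061 for LATIN SMALL LETTER A ("a"), or U+1F425 for FRONT-FACING BABY CHICK ("🐥").
When a Unicode string is written to a text file or some other storage, these Unicode scalars are encoded in one of several Unicode-defined formats. Each format encodes the string in small chunks known as code units. These include the UTF-8 format (which encodes a string as 8-bit code units) and the UTF-16 format (which encodes a string as 16-bit code units).
// copied from the Apple Developer Swift programming guide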
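
To make the quoted terms concrete, here is a small sketch (assuming Swift 5) that looks at the two scalars mentioned in the quote, plus one character built from two scalars. The variable names are mine, not from the guide.

```swift
let a = "\u{0061}"         // LATIN SMALL LETTER A
let chick = "\u{1F425}"    // FRONT-FACING BABY CHICK
let eAcute = "e\u{0301}"   // "e" followed by COMBINING ACUTE ACCENT

// One scalar, one code unit in both encodings.
print(Array(a.utf8), Array(a.utf16))    // [97] [97]

// One scalar, but four 8-bit code units or two 16-bit code units (a surrogate pair).
print(chick.unicodeScalars.count)       // 1
print(Array(chick.utf8))                // [240, 159, 144, 165]
print(Array(chick.utf16))               // [55357, 56357]

// One character made of two scalars.
print(eAcute.count)                     // 1
print(eAcute.unicodeScalars.count)      // 2
print(Array(eAcute.utf8))               // [101, 204, 129]
```

So "how many bytes" only has an answer once you pick an encoding: `utf8.count` gives the byte count in UTF-8, and `utf16.count * 2` gives it for UTF-16.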