Do strange characters like "�" or "ö" show up in your application? Handling non-English characters in application code, files and databases can be a challenge, to say the least. Whether it's German umlauts, Cyrillic letters, Asian glyphs or emojis, it's always a mess in an international application. In this session you will see why that is and how character handling evolved in computing. You will also see how handling characters in applications and databases can be made less painful. And don't worry if EBCDIC, BOM or ISO-8859-7 are Greek to you and your Unicode is a bit rusty: we'll have a look at them too!

Comments

Katriel Wolfe at 11:56 on 26 Oct 2024

I had some things explained to me that I've always wanted to understand. So well presented for comprehension.

A really enlightening talk, especially for those of us who take encodings and charsets for granted (or just rely on our wonderful ops folks). A high-level understanding of binary is helpful, but even non-CS folks like me can get a lot out of the talk.

Korvin Szanto at 12:00 on 26 Oct 2024

Great talk with lots of information; the speaker was very knowledgeable about the topic.

I finally understand that UTF-8 and Unicode are different. Thank you so much for this talk; it has been eye-opening!

This was so educational! I learned a lot in this talk.

Chris Abbey at 15:22 on 26 Oct 2024

Great deep dive into Unicode vs UTF-8, explaining the differences and when to use one vs the other. I do wish you had given the grapheme_* functions more billing and a bit more explanation of when to use the plain vs mb_ vs grapheme_ versions of various string functions. That might have been a bit more beneficial to developers than the bit-level breakdown of what UTF-8 looks like on the wire. (Though admittedly, that was clearly well appreciated in the room.)