How to tame a 🦄

Andreas Heigl

Saturday 26 October 2024 from 11:00 to 11:50

Talk in English - US at CascadiaPHP 2024
Track Name: Crater Lake
View Slides: https://heiglandreas.github.io/slidedeck/20241026-cascadiaphp24_tameUnicorn/index.html#/
Short URL: https://joind.in/talk/3f944

Do strange characters like "�" or "Ã¶" show up in your application? Handling non-English characters in application code, files and databases can be a challenge, to say the least. Whether it's German umlauts, Cyrillic letters, Asian glyphs or emojis, it's always a mess in an international application. In this session you will see why that is and how character handling has evolved in computing. You will also see how handling characters in applications and databases can be made less painful. And don't worry if EBCDIC, BOM or ISO-8859-7 are Greek to you and your Unicode is a bit rusty: we'll have a look at them too!
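
As a rough illustration of where such characters typically come from (a minimal Python sketch, not taken from the talk or its slides): "Ã¶" usually appears when the UTF-8 bytes of "ö" are read as ISO-8859-1, and "�" (U+FFFD, the replacement character) when ISO-8859-1 bytes are read as UTF-8. The function names are made up for this example.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def misread_latin1_as_utf8(text: str) -> str:
    """Encode as ISO-8859-1, then wrongly decode the bytes as UTF-8.

    Byte sequences that are not valid UTF-8 are replaced with U+FFFD.
    """
    return text.encode("latin-1").decode("utf-8", errors="replace")
```

Calling `misread_utf8_as_latin1("ö")` returns "Ã¶", and `misread_latin1_as_utf8("ö")` returns "�". In both cases the stored bytes are fine; only the assumed encoding is wrong.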

Comments

Katriel Wolfe at 11:56 on 26 Oct 2024 (via Web2 LIVE)

I had some things explained to me that I've always wanted to understand. So well presented for comprehension.

Steve Grunwell at 12:00 on 26 Oct 2024 (via Web2 LIVE)

A really enlightening talk, especially for those of us who take encodings and charsets for granted (or just rely on our wonderful ops folks). A high-level understanding of binary is helpful, but even non-CS folks like me can get a lot out of the talk.

Korvin Szanto at 12:00 on 26 Oct 2024 (via Web2 LIVE)

Great talk with lots of information, and the speaker is very knowledgeable about the topic.

Scott Keck-Warren at 12:07 on 26 Oct 2024 (via Web2 LIVE)

I finally understand that UTF-8 and Unicode are different. Thank you so much for this talk; it has been eye-opening!

Marcella Parker at 13:04 on 26 Oct 2024 (via Web2 LIVE)

This was so educational! I learned a lot in this talk.

Chris Abbey at 15:22 on 26 Oct 2024 (via Web2 LIVE)

Great deep dive into Unicode vs UTF-8, explaining the differences and when to use one vs the other. I do wish you had given the grapheme_* functions more billing and a bit more explanation of when to use the plain vs mb_ vs grapheme_ versions of various string functions. That might have been more beneficial to developers than the bit-level breakdown of what UTF-8 looks like on the wire. (Though admittedly, that was clearly well appreciated in the room.)
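
To sketch the distinction this comment asks about (an illustration added here, not material from the talk): in PHP, `strlen()` counts bytes, `mb_strlen()` counts code points when the encoding is UTF-8, and `grapheme_strlen()` counts grapheme clusters. The Python sketch below mirrors those three counts. The helper names are hypothetical, and the grapheme count is a deliberate simplification that only folds combining marks into the preceding character; it does not implement the full Unicode text segmentation rules, so it will miscount things like emoji joined by zero-width joiners.

```python
import unicodedata


def byte_length(text: str) -> int:
    """Number of bytes in the UTF-8 encoding (comparable to PHP strlen)."""
    return len(text.encode("utf-8"))


def code_point_length(text: str) -> int:
    """Number of Unicode code points (comparable to PHP mb_strlen)."""
    return len(text)


def approx_grapheme_length(text: str) -> int:
    """Rough count of user-perceived characters.

    ASSUMPTION: counts every code point that is not a combining mark.
    This is only an approximation of what grapheme_strlen does.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)
```

For "é" written as "e" followed by U+0301 (combining acute accent), the three functions return 3, 2 and 1. For "🦄" they return 4, 1 and 1.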