Unicode, UTF-8, ASCII, BOM, ISO 10646, multibyte, collation, charsets, and so on: there's a lot of technical jargon when it comes to characters.

With beautiful slides, animated GIFs, and most importantly, in plain English, we will discover character encodings that every programmer must know, and how we can handle Unicode characters in PHP.

There are various languages used in the world, and each language has different scripts and glyphs of various lengths, heights and rules. With emojis getting popular (with their own movies, no less!), it is important to accommodate all these weird-looking characters and to understand how they are represented, what the possible gotchas are, and how to process them.

During the talk, we will take a look at some flawed snippets, and how they can be fixed to process Unicode characters properly.
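As a rough illustration of the kind of flawed snippet meant here, the sketch below contrasts byte-oriented string functions with their multibyte counterparts. It is not taken from the talk: the sample string and variable names are made up for this example, and it assumes PHP 7 or later with the mbstring extension enabled.

```php
<?php
// Hypothetical input (not from the talk): "Héllo 🌍" mixes ASCII,
// an accented letter (2 bytes in UTF-8), and an emoji (4 bytes in UTF-8).
$text = "H\u{00E9}llo \u{1F30D}";

// Flawed: strlen() counts bytes, not characters.
$byteLength = strlen($text);              // 11

// Fixed: mb_strlen() counts code points for the given encoding.
$charLength = mb_strlen($text, 'UTF-8');  // 7

// Flawed: substr() cuts on byte offsets and can split a character in half,
// leaving a string that is no longer valid UTF-8.
$brokenPrefix = substr($text, 0, 2);

// Fixed: mb_substr() cuts on character boundaries.
$safePrefix = mb_substr($text, 0, 2, 'UTF-8');  // "Hé"

var_dump(mb_check_encoding($brokenPrefix, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($safePrefix, 'UTF-8'));   // bool(true)
```

Note that mb_strlen() counts code points, so characters built from several code points (for example, some emoji sequences) may still be counted as more than one.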

We will also take a look at other I/O operations, such as file reads and writes and database connections, where we must pay attention to make sure everything works nicely with Unicode characters.

Location: E105.

Comments

Thomas Berends at 16:49 on 7 Jun 2019

First talk of the day, and it was a great start to the day. I thought I just 'didn't know a lot' about Unicode / UTF-8, but I found out that I knew absolutely nothing. It was a perfect combination of a history of Unicode, the differences between ASCII, UTF-8, UTF-16, etc., and how to apply what we had just learned in PHP. For me, this was certainly one of the better talks of the conference so far.

Thank you Ayesh, I loved it and I hope you get to do your talk way more often.

Andreas Heigl at 21:41 on 8 Jun 2019

Great content and definitely worth hearing for every developer.

Still, in my opinion, the talk would gain from clearer distinctions, such as between charset and encoding or between character and code point.

Arnout Boks at 22:48 on 8 Jun 2019

Good introduction to Unicode and how to handle it in PHP, covering most of the important concepts. The intro felt a bit slow to me, but other than that it was a great talk. Also, kudos for your amazing slides, I can see that a lot of work went into those.

PS: Regarding your slides, I noticed some small errors that you might want to fix in a next version (numbering follows the PPTX version):
* Slide 79 seems to be missing a '\'
* Slide 84 seems to have 'hex' and 'decimal' mixed up
* Slide 99 has '100.000' instead of '10.000' in the right column
* On slide 107, I think that "\x{0045}" should result in an "E" rather than a "D".

Bohuslav Simek at 14:10 on 9 Jun 2019

This was an excellent overview of Unicode in multiple situations, and I won't hesitate to recommend this talk to all PHP developers (I am looking forward to the video recording of this talk)! The only small issue for me was that there wasn't anything new for me, but this is my problem, not the speaker's. ;-)

Bart Deurloo at 14:28 on 9 Jun 2019

Excellent talk with a very nice deck of slides. I enjoyed the simple and graphical explanation of the Unicode standard and UTF-8 structure in particular.

I wish the talk had included some actionable examples of Unicode security. It quickly mentioned a few key points to watch out for, but I would've loved to see some real-life examples. In fairness, this was a 45-minute topic, and I think the speaker was under some pressure because the keynote ended a bit late.