Use this information wisely

paequ2@lemmy.today · 2 months ago

Onno (VK6FLAB)@lemmy.radio · edited 2 months ago

Wow!

This seems to be further evidence that the process for assigning Unicode code points has been thoroughly corrupted.

You can (apparently) copy/paste this on mobile:

“;” (Greek question mark)

“;” (Semicolon)

You can even render it in HTML:

    &#894;
    &#x37E;
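
If you want to check that both references point at the same character, here is a minimal sketch using Python's standard `html` module (the variable names are just mine, for illustration):

```python
import html

# Decode the decimal and hexadecimal character references from above.
decimal_ref = html.unescape("&#894;")
hex_ref = html.unescape("&#x37E;")

# Both should decode to the same single code point, U+037E.
print(decimal_ref == hex_ref)        # True
print(f"U+{ord(decimal_ref):04X}")   # U+037E
```

894 in decimal is 37E in hexadecimal, so the two spellings are interchangeable.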

And it’s included on Wikipedia, because of course it is:

https://en.wikipedia.org/wiki/Question_mark

Because I’m not sure what my mobile client will actually do with this comment, here’s the link to the page for the code point I used:

https://www.compart.com/en/unicode/U+037E

Also there’s plenty of other character joy to be had:

https://web.archive.org/web/20150118083005/http://www.tlg.uci.edu/~opoudjis/unicode/punctuation.html

tisktisk@piefed.social · 2 months ago

If I don’t understand what’s happening here but want to, should I research Unicode in general or something else?

Onno (VK6FLAB)@lemmy.radio · edited 2 months ago

Unicode is a way to encode the things that humans use to write stuff into a computer.

ASCII is another way, for example, as is EBCDIC.

All these methods translate squiggles that we’ve used for centuries into something that can be represented inside a computer.

For example, under ASCII the letter “A” is represented by the number 65.
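
You can see this for yourself with a short Python sketch. I'm assuming here that Python's `cp037` codec is a fair stand-in for "EBCDIC"; it is only one of several EBCDIC code pages.

```python
# The number behind the letter "A" (Unicode agrees with ASCII here).
ascii_number = ord("A")
print(ascii_number)            # 65
print(chr(65))                 # A

# The same letter under an EBCDIC code page (cp037 chosen as an example).
ebcdic_bytes = "A".encode("cp037")
print(list(ebcdic_bytes))      # [193]
```

Same squiggle, different number, depending on which encoding you ask.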

This post is pointing out that there are two characters that look identical, but have different numbers, which means that what the user sees is identical, but what the computer sees is different.

This is the basis for much tomfoolery.
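
To make the two look-alikes concrete, here is a minimal Python sketch using the standard `unicodedata` module. The characters are written as escapes on purpose, so that copy/paste cannot quietly swap one for the other; the variable names are mine.

```python
import unicodedata

semicolon = "\u003b"            # the ordinary semicolon
greek_question_mark = "\u037e"  # the look-alike

# Identical to the eye, different to the computer.
print(semicolon == greek_question_mark)          # False
print(ord(semicolon), ord(greek_question_mark))  # 59 894
print(unicodedata.name(semicolon))               # SEMICOLON
print(unicodedata.name(greek_question_mark))     # GREEK QUESTION MARK

# Unicode normalization (NFC) folds the Greek question mark into the semicolon.
normalized = unicodedata.normalize("NFC", greek_question_mark)
print(normalized == semicolon)                   # True
```

That last step is also why the character may not survive a round trip through some software: anything that normalizes text will turn it back into a plain semicolon.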