Technically, ANSI should be the same as US-ASCII. It refers to the ANSI X3.4 standard, which is simply the ANSI organisation's ratified version of ASCII. Use of the top-bit-set characters is not defined in ASCII/ANSI, as it is a 7-bit character set. However, years of misuse of the term by the DOS and subsequently Windows community have left its practical meaning as “the system codepage of whatever machine is being used”. The system codepage is also sometimes known as ‘mbcs’, since on East Asian systems it can be a multiple-byte-per-character encoding. Some code pages can even use top-bit-clear bytes as trailing bytes in a multibyte sequence, so the system codepage is not even strictly compatible with plain ASCII. On US and Western European default settings, “ANSI” maps to Windows code page 1252. This is not the same as ISO-8859-1 (although it is quite similar). On other machines it could be anything else at all. This makes “ANSI” utterly useless as an external encoding identifier.

Once upon a time Microsoft, like everyone else, used 7-bit character sets, and they invented their own when it suited them, though they kept ASCII as a core subset. Then they realised the world had moved on to 8-bit encodings and that there were international standards around, such as the ISO-8859 family.
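To make two of these points concrete, here is a minimal Python sketch. It assumes Python's built-in `cp1252`, `latin-1` and `cp932` codecs as stand-ins for Windows code page 1252, ISO-8859-1 and a typical Japanese system codepage. The sample bytes and variable names are made up for illustration and do not come from the original discussion.

```python
# Bytes in the 0x80-0x9F range are where code page 1252 and ISO-8859-1 disagree.
raw = b"\x80\x93quoted\x94"

# Code page 1252 maps these bytes to printable characters
# (a euro sign and a pair of curly double quotes).
as_cp1252 = raw.decode("cp1252")

# ISO-8859-1 maps the same bytes to invisible C1 control characters.
as_latin1 = raw.decode("latin-1")

# In code page 932 (a Shift-JIS variant), the second byte of a two-byte
# character may fall in the ASCII range. Katakana "so" ends in 0x5C,
# which is the backslash in plain ASCII.
sjis = "\u30bd".encode("cp932")
has_backslash_byte = b"\\" in sjis

print(repr(as_cp1252))
print(repr(as_latin1))
print(sjis, has_backslash_byte)
```

The same byte string decodes to different text depending on which code page is assumed, which is why a label such as “ANSI” tells a reader on another machine nothing reliable. The last check shows why byte-level code that scans for ASCII characters such as the backslash can misfire on multibyte code pages.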