
UTF-8 / ANSI encoding scheme mix

Wednesday, February 5th, 2020 (Net Web)

Before I tried it, this feature looked hard to build; in fact it is relatively simple. During development I found that PHP 5 (and older versions) converts between UTF-8, ANSI, and other encodings reliably, so a single site can mix several encodings. In the database, a Chinese character takes 3 bytes in UTF-8 but only 2 bytes in GBK (ANSI), so for a Chinese page, ANSI encoding can improve performance by 20% to 30% and save the corresponding traffic.
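
The byte counts above can be checked with a short script. This is only a minimal sketch: it assumes the iconv extension is available, and the sample text (the two characters "中文") is chosen purely for illustration. Since ASCII characters take 1 byte in both encodings, the saving on a real page with markup will be smaller than the one-third seen here.

```php
<?php
// "中文" written as UTF-8 byte escapes, so the result does not depend on
// how this source file itself is saved.
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87";

// Convert the same two characters to GBK (what the article calls ANSI).
$gbk = iconv('UTF-8', 'GBK', $utf8);

// strlen() counts bytes, not characters.
$utf8Bytes = strlen($utf8);
$gbkBytes  = strlen($gbk);

echo "UTF-8: {$utf8Bytes} bytes, GBK: {$gbkBytes} bytes\n";
```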

In a personal-space program, the high-traffic pages are the blog and the micro-blog. For that reason Eachval, the standalone micro-blog project I am about to develop, will support switching between UTF-8 and ANSI: a micro-blogger's pages can use UTF-8, or the encoding can be switched to ANSI. (Why must this be supported from the early stages of development? Because too many database-related features depend on it. Now is the best time to do it; changing it later would mean an incalculable amount of work.)

For high-traffic Chinese official sites, platform pages, and forum pages, mixing UTF-8 and ANSI encodings is an ideal solution. Baidu Post Bar, for example, used a UTF-8/ANSI mix a few years ago and later changed to UTF-8.

UTF-8 is the more compatible encoding, but not every device (client) supports it fully. For example, on an English-version CentOS desktop I installed, Firefox showed every Chinese and full-width character on UTF-8 pages as squares or garbled text, and I still had to install the Chinese language package. Why does this happen? Probably because foreign developers did not fully test multiple languages during development, so the bug remains.

This article shares the UTF-8/ANSI encoding mix, or "quasi UTF-8" encoding. When a browser determines the encoding of a web page, the header() declaration takes priority over the HTML <meta> tag. For example, suppose a page contains both of these declarations:

<?php header('Content-Type: text/html; charset=utf-8'); ?>

<meta http-equiv="Content-Type" content="text/html; charset=gbk">

In HTML5, the second line is <meta charset="gbk" />

In testing, the browser treats the page as UTF-8; the encoding set in the <meta> tag has no effect. The "quasi UTF-8 encoding" in the brackets of this article's title means that the page is actually encoded in GBK while the declaration in the HTML code says UTF-8; without analysis, the page does not look like a GBK page at all. In practice, Chinese GBK pages perform better; UTF-8, however, is the trend for Chinese web pages.

Most Chinese pages that used to be encoded in GB2312 or GBK have now been replaced by UTF-8. To keep the higher performance and still follow the trend, the quasi UTF-8 method can be used; since the <meta> tag is not what the browser acts on, you can write it however you like. The declarations below simply swap the encodings from the example above:

<?php header('Content-Type: text/html; charset=gbk'); ?>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

In HTML5, the second line is <meta charset="utf-8" />
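
Putting the swapped declarations together, a quasi UTF-8 page might be served along the following lines. This is a rough sketch, not the site's actual code: the function name `build_quasi_utf8_page` and the page skeleton are hypothetical, and the body text is assumed to arrive as UTF-8.

```php
<?php
// Hypothetical helper: builds a page whose bytes are GBK while the
// <meta> tag still says utf-8 (the "quasi UTF-8" arrangement).
function build_quasi_utf8_page($bodyUtf8)
{
    $html = '<!DOCTYPE html><html><head>'
          . '<meta charset="utf-8" />'
          . '<title>demo</title></head><body>'
          . $bodyUtf8
          . '</body></html>';

    // The markup is plain ASCII, so only the body text changes bytes.
    $gbk = iconv('UTF-8', 'GBK//IGNORE', $html);
    return $gbk === false ? '' : $gbk;
}

$page = build_quasi_utf8_page("\xE4\xB8\xAD\xE6\x96\x87"); // "中文" in UTF-8

// The HTTP header carries the real encoding; skip it if output has begun.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=gbk');
}
echo $page;
```

Building the page as a string first keeps the conversion in one place, so the header and the bytes actually sent cannot drift apart.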

When you save the PHP file in EmEditor, save a GBK file with the ANSI character encoding and a UTF-8 file with the UTF-8 character encoding; otherwise the text will be garbled.

In testing, the encoding of data written to the database does not follow the header() declaration. Two factors determine the encoding written to the database: the first is the encoding of the PHP file itself; the second is the charset setting in the <meta> tag. So to solve this problem, you can change the seven characters "charset" into any other characters, which makes the <meta> tag ineffective and avoids the garbled text produced when a post is submitted in a different encoding.

Besides these two factors, there is a third: iconv() can change the encoding of the data to be written. In testing, whatever encoding iconv() converts to is the encoding that gets stored. For example, if a page is UTF-8 and you want to write ANSI, convert like this:

$a1 = iconv('UTF-8', 'GBK//IGNORE', $a1);

The value of $a1 written to the database is then in GBK (ANSI) encoding. To go the other way, from GBK to UTF-8, reverse the statement:

$a1 = iconv('GBK', 'UTF-8//IGNORE', $a1);

When a UTF-8 page reads from a GBK database, or a GBK page reads from a UTF-8 database, the same two iconv() conversions above resolve all the garbled text.
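
The two conversions can be wrapped in a pair of small helpers and checked with a round trip. This is a sketch under assumptions: the names `to_gbk` and `to_utf8` are hypothetical, and the "stored" value here is just a variable standing in for a database column.

```php
<?php
// Hypothetical wrappers around the two iconv() calls shown above.
function to_gbk($utf8Text)
{
    $out = iconv('UTF-8', 'GBK//IGNORE', $utf8Text);
    return $out === false ? '' : $out; // iconv() returns false on failure
}

function to_utf8($gbkText)
{
    $out = iconv('GBK', 'UTF-8//IGNORE', $gbkText);
    return $out === false ? '' : $out;
}

$original = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" in UTF-8
$stored   = to_gbk($original);           // what a GBK database would hold
$shown    = to_utf8($stored);            // what a UTF-8 page would display

echo $shown === $original ? "round trip ok\n" : "round trip failed\n";
```

Note that //IGNORE silently drops characters the target encoding cannot represent, so a round trip is only lossless for text that exists in both encodings.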

Copyright: this is an original article from ShuDudu. If you reproduce it, please keep the link: https://www.shududu.com/netweb/UTF-8-ANSI-encoding-scheme-mix.htm