mb_convert_encoding
mb_convert_encoding - Convert character encoding
Description
string mb_convert_encoding ( string $str , string $to_encoding [, mixed $from_encoding = mb_internal_encoding() ] )
Parameters
str
The string being encoded.
to_encoding
The type of encoding that str is being converted to.
from_encoding
Is specified by character code names before conversion. It is either
an array, or a comma separated enumerated list.
If from_encoding is not specified, the internal
encoding will be used.
Return Values
The encoded string.
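As a rough illustration of the two accepted forms of from_encoding, the sketch below passes the same candidate list once as a comma separated string and once as an array. The sample bytes and variable names are made up for the example. With more than one candidate mbstring has to guess the source encoding, so naming a single known encoding is generally the safer choice.

```php
<?php
// Hypothetical input: "café au lait" stored as ISO-8859-1 bytes (é is 0xE9).
$latin1 = "caf\xE9 au lait";

// Single, known source encoding: no guessing involved.
$explicit = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Candidate list as a comma separated string...
$fromList = mb_convert_encoding($latin1, 'UTF-8', 'UTF-8, ISO-8859-1');

// ...and the same candidates as an array.
$fromArray = mb_convert_encoding($latin1, 'UTF-8', array('UTF-8', 'ISO-8859-1'));

echo bin2hex($explicit), "\n"; // 636166c3a9206175206c616974
var_dump($fromList === $fromArray);
```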
Example #1 mb_convert_encoding() example
<?php
/* Convert internal character encoding to SJIS */
$str = mb_convert_encoding($str, "SJIS");

/* Convert EUC-JP to UTF-7 */
$str = mb_convert_encoding($str, "UTF-7", "EUC-JP");

/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");

/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
?>
See Also
mb_detect_order() - Set/Get character encoding detection order
My solution below was slightly incorrect, so here is the correct version (I posted at the end of a long day, never a good idea!)
Again, this is a quick and dirty solution to stop mb_convert_encoding from filling your string with question marks whenever it encounters an illegal character for the target encoding.
<?php
function convert_to ( $source, $target_encoding )
    {
    $encoding = mb_detect_encoding( $source, "auto" );

    // Protect genuine question marks before converting.
    $target = str_replace( "?", "[question_mark]", $source );

    $target = mb_convert_encoding( $target, $target_encoding, $encoding);

    // Any "?" left now was inserted as a substitute character; drop it.
    $target = str_replace( "?", "", $target );

    $target = str_replace( "[question_mark]", "?", $target );

    return $target;
    }
?>
Hope this helps someone! (Admins should feel free to delete my previous, incorrect, post for clarity)
-A
I've been trying to find the charset of a Norwegian txt file (with a lot of ø, æ, å) written on a Mac. I found it this way:
<?php
$text = "A strange string to pass, maybe with some ø, æ, å characters.";
foreach(mb_list_encodings() as $chr){
        echo mb_convert_encoding($text, 'UTF-8', $chr)." : ".$chr."<br>";
}
?>
The line that looks good gives you the encoding it was written in.
Hope this can help someone
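The trial-and-error loop above can be narrowed a little before reading the output by eye. The sketch below is only an illustration: plausible_encodings() is a hypothetical helper, and it relies on mb_check_encoding() to drop candidates for which the bytes are not valid. Single-byte encodings such as ISO-8859-1 accept almost any byte sequence, so several candidates usually survive and a human still has to pick the one that reads correctly.

```php
<?php
// Hypothetical helper: keep only the candidate encodings in which
// the given bytes form a valid string.
function plausible_encodings(string $bytes, array $candidates): array
{
    $valid = array();
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($bytes, $encoding)) {
            $valid[] = $encoding;
        }
    }
    return $valid;
}

// "blåbær" as UTF-8 bytes (sample data for the example).
$bytes = "bl\xC3\xA5b\xC3\xA6r";

// ASCII is rejected because of the high bytes; UTF-8 and ISO-8859-1 both remain.
print_r(plausible_encodings($bytes, array('ASCII', 'UTF-8', 'ISO-8859-1')));
```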
For my last project I needed to convert several CSV files from Windows-1250 to UTF-8. After several days of searching around I found a function that partially solved my problem, but it still did not transform all the characters. So I made this:
function w1250_to_utf8($text) {
    // Map based on the Windows-1250 and ISO-8859-2 code tables.
    // First group: letters that sit at a different byte in ISO-8859-2.
    // Second group: characters missing from ISO-8859-2, passed through as HTML entities.
    $map = array(
        chr(0x8A) => chr(0xA9),
        chr(0x8C) => chr(0xA6),
        chr(0x8D) => chr(0xAB),
        chr(0x8E) => chr(0xAE),
        chr(0x8F) => chr(0xAC),
        chr(0x9C) => chr(0xB6),
        chr(0x9D) => chr(0xBB),
        chr(0xA1) => chr(0xB7),
        chr(0xA5) => chr(0xA1),
        chr(0xBC) => chr(0xA5),
        chr(0x9F) => chr(0xBC),
        chr(0xB9) => chr(0xB1),
        chr(0x9A) => chr(0xB9),
        chr(0xBE) => chr(0xB5),
        chr(0x9E) => chr(0xBE),
        chr(0x80) => '&euro;',
        chr(0x82) => '&sbquo;',
        chr(0x84) => '&bdquo;',
        chr(0x85) => '&hellip;',
        chr(0x86) => '&dagger;',
        chr(0x87) => '&Dagger;',
        chr(0x89) => '&permil;',
        chr(0x8B) => '&lsaquo;',
        chr(0x91) => '&lsquo;',
        chr(0x92) => '&rsquo;',
        chr(0x93) => '&ldquo;',
        chr(0x94) => '&rdquo;',
        chr(0x95) => '&bull;',
        chr(0x96) => '&ndash;',
        chr(0x97) => '&mdash;',
        chr(0x99) => '&trade;',
        chr(0x9B) => '&rsaquo;',
        chr(0xA6) => '&brvbar;',
        chr(0xA9) => '&copy;',
        chr(0xAB) => '&laquo;',
        chr(0xAE) => '&reg;',
        chr(0xB1) => '&plusmn;',
        chr(0xB5) => '&micro;',
        chr(0xB6) => '&para;',
        chr(0xB7) => '&middot;',
        chr(0xBB) => '&raquo;',
    );
    return html_entity_decode(mb_convert_encoding(strtr($text, $map), 'UTF-8', 'ISO-8859-2'), ENT_QUOTES, 'UTF-8');
}
Hey guys. For everybody who's looking for a function that converts an ISO string to UTF-8 or a UTF-8 string to ISO, here's your solution:
public function encodeToUtf8($string) {
     return mb_convert_encoding($string, "UTF-8", mb_detect_encoding($string, "UTF-8, ISO-8859-1, ISO-8859-15", true));
}
public function encodeToIso($string) {
     return mb_convert_encoding($string, "ISO-8859-1", mb_detect_encoding($string, "UTF-8, ISO-8859-1, ISO-8859-15", true));
}
For me these functions are working fine. Give it a try
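A slightly different technique from the detection-based functions above is to validate first and only fall back when validation fails. The sketch below is an illustration under stated assumptions, not the poster's method: to_utf8_or_fallback() is a hypothetical name, and ISO-8859-1 as the fallback is an assumption about where the data comes from. It avoids mb_detect_encoding() entirely, so the outcome does not depend on detection heuristics.

```php
<?php
// Hypothetical helper: leave valid UTF-8 untouched, otherwise treat the
// bytes as the given single-byte fallback encoding (an assumption about
// the data source) and convert them.
function to_utf8_or_fallback(string $bytes, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $fallback);
}

echo bin2hex(to_utf8_or_fallback("caf\xE9")), "\n";     // 636166c3a9 (converted)
echo bin2hex(to_utf8_or_fallback("caf\xC3\xA9")), "\n"; // 636166c3a9 (already UTF-8)
```

The trade-off is that text in some other legacy encoding will be converted wrongly without any warning, so this only fits when the set of possible inputs is known.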
aaron, to discard unsupported characters instead of printing a ?, you might as well simply set the configuration directive:
mbstring.substitute_character = "none"
in your php.ini. Be sure to include the quotes around none. Or at run-time with
<?php
ini_set('mbstring.substitute_character', "none");
?>
Note that `mb_convert_encoding($val, 'HTML-ENTITIES')` does not escape '\'', '"', '<', '>', or '&'.
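One way to get both kinds of escaping is to run htmlspecialchars() for the markup characters and then turn the remaining non-ASCII characters into numeric entities. This is a minimal sketch, assuming UTF-8 input; escape_for_html() is a hypothetical helper name, and mb_encode_numericentity() is used here in place of the 'HTML-ENTITIES' pseudo-encoding, so it emits numeric references rather than named ones.

```php
<?php
// Hypothetical helper, assuming $utf8 really is UTF-8.
function escape_for_html(string $utf8): string
{
    // Escape the markup-significant ASCII characters first.
    $escaped = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');

    // Conversion map: first code point, last code point, offset, mask.
    // Everything from U+0080 upward becomes a decimal numeric entity.
    $map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);

    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

echo escape_for_html("<a href='x'>caf\u{E9} & t\u{E9}</a>");
// &lt;a href=&#039;x&#039;&gt;caf&#233; &amp; t&#233;&lt;/a&gt;
```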
To add to the Flash conversion comment below, here's how I convert back from what I've stored in a database after converting from Flash HTML text field output, in order to load it back into a Flash HTML text field:
function htmltoflash($htmlstr)
{
  return str_replace("&lt;br /&gt;","\n",
    str_replace("<","&lt;",
      str_replace(">","&gt;",
        mb_convert_encoding(html_entity_decode($htmlstr),
        "UTF-8","ISO-8859-1"))));
}
For those wanting to convert from $set to MacRoman, use iconv():
<?php
$string = iconv('UTF-8', 'macintosh', $string);
?>
('macintosh' is the IANA name for the MacRoman character set.)
Instead of ini_set(), you can try this:
mb_substitute_character("none");
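To make the effect of the substitute character concrete, here is a small sketch. The sample string is invented for the example: it contains a euro sign, which ISO-8859-1 cannot represent, so the conversion either marks it with "?" or drops it depending on the setting. The previous setting is saved and restored so the rest of a script is not affected.

```php
<?php
// Sample text with a character (the euro sign) that ISO-8859-1 cannot represent.
$utf8 = "price: 10 \u{20AC}";

// Remember the current setting so it can be restored afterwards.
$previous = mb_substitute_character();

// 0x3F is "?", the usual default substitute character.
mb_substitute_character(0x3F);
$withMark = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// "none" silently discards characters that cannot be converted.
mb_substitute_character('none');
$dropped = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

mb_substitute_character($previous);

var_dump($withMark); // string(11) "price: 10 ?"
var_dump($dropped);  // string(10) "price: 10 "
```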
Why did you use the PHP HTML encode functions? mbstring has its own encoding which is (as far as I tested it) much more useful: HTML-ENTITIES
Example:
$text = mb_convert_encoding($text, 'HTML-ENTITIES', "UTF-8");
mb_substr and probably several other functions work faster in UCS-2 than in UTF-8, and UTF-16 works slower than UTF-8. Here is a test; in it UCS-2 is nearly 50 times faster than UTF-8, and UTF-16 is nearly 6 times slower than UTF-8:
<?php
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
$s='укгез??ш?хз?х?шк2049??лдябчсячмииюсит.июб?рарэ'.'лдэфв??у?й?уй??у857?шаыдларораш??рлоавы';
$s.=$s;$s.=$s;$s.=$s;$s.=$s;$s.=$s;$s.=$s;$s.=$s;

// UTF-8
$t1=microtime(true);
$i=0;
while($i<mb_strlen($s)){
    $a=mb_substr($s,$i,2);
    $i+=2;
    if($i==10)echo$a.'. ';
    }
echo$i.'. ';
echo(microtime(true)-$t1);
echo'<br>';

// UCS-2
$s=mb_convert_encoding($s,'UCS-2','utf8');
mb_internal_encoding('UCS-2');
$t1=microtime(true);
$i=0;
while($i<mb_strlen($s)){
    $a=mb_substr($s,$i,2);
    $i+=2;
    if($i==10)echo mb_convert_encoding($a,'utf8','ucs2').'. ';
    }
echo$i.'. ';
echo(microtime(true)-$t1);
echo'<br>';

// UTF-16
$s=mb_convert_encoding($s,'utf-16','ucs-2');
mb_internal_encoding('utf-16');
$t1=microtime(true);
$i=0;
while($i<mb_strlen($s)){
    $a=mb_substr($s,$i,2);
    $i+=2;
    if($i==10)echo mb_convert_encoding($a,'utf8','utf-16').'. ';
    }
echo$i.'. ';
echo(microtime(true)-$t1);
?>
output:
?х. 1?х. 1?х. 10229282
If you want to convert Japanese to ISO-2022-JP, it is highly recommended to use ISO-2022-JP-MS as the target encoding instead. This includes the extended character set and avoids "?" substitutions in the text. For example, the often used "1 in a circle" ① will then be converted correctly.
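A quick way to check whether a given character survives a particular target encoding is to convert it there and back and compare. The sketch below is an illustration only; survives_round_trip() is a hypothetical helper name. It prints a verdict for both encodings rather than assuming one, since the result for plain ISO-2022-JP may depend on the PHP version.

```php
<?php
// Hypothetical helper: true when $utf8 comes back unchanged after being
// converted to $encoding and decoded again.
function survives_round_trip(string $utf8, string $encoding): bool
{
    $encoded = mb_convert_encoding($utf8, $encoding, 'UTF-8');
    $decoded = mb_convert_encoding($encoded, 'UTF-8', $encoding);
    return $decoded === $utf8;
}

$sample = "\u{2460}"; // the "1 in a circle" character

foreach (array('ISO-2022-JP', 'ISO-2022-JP-MS') as $encoding) {
    printf("%s: %s\n", $encoding, survives_round_trip($sample, $encoding) ? 'ok' : 'lossy');
}
```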
It appears that when dealing with an unknown "from encoding" the function will both throw an E_WARNING and proceed to convert the string from ISO-8859-1 to the "to encoding".
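The behaviour described above is what older PHP versions do. On PHP 8 and later an unrecognized encoding name raises a ValueError instead of a warning. The sketch below is one hedged way to cope with both; convert_or_null() is a hypothetical wrapper, and returning null on failure is a design choice for the example rather than anything mbstring prescribes.

```php
<?php
// Hypothetical wrapper: returns the converted string, or null when the
// conversion could not be carried out.
function convert_or_null(string $text, string $to, string $from): ?string
{
    try {
        // "@" hides the warning that older PHP versions emit here.
        $result = @mb_convert_encoding($text, $to, $from);
    } catch (\ValueError $e) {
        // PHP 8+: an unknown encoding name throws instead of warning.
        return null;
    }
    return is_string($result) ? $result : null;
}

var_dump(convert_or_null('abc', 'UTF-8', 'ISO-8859-1'));
var_dump(convert_or_null('abc', 'UTF-8', 'no-such-encoding'));
```

On PHP 7 the second call may still return a converted string, as the note describes, so checking the name against mb_list_encodings() beforehand is the stricter option.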
For the PHP noobs (like me) working with Flash and PHP: here's a simple snippet of code that worked great for me, getting PHP to show special Danish characters from a Flash email form:
<?php
$escName = mb_convert_encoding($_POST["Name"], "ISO-8859-1", "UTF-8");
$escMessage = mb_convert_encoding($_POST["Message"], "ISO-8859-1", "UTF-8");
?>
rodrigo at bb2 dot co dot jp wrote that iconv works better than mb_convert_encoding. I find that when converting from UTF-8 to Shift_JIS,
$conv_str = mb_convert_encoding($str,$toCS,$fromCS);
works, while
$conv_str = iconv($fromCS,$toCS.'//IGNORE',$str);
removes tildes from $str.
Clean a string for use as a filename by simply replacing all unwanted characters with an underscore (ASCII converts to 7bit). It removes slightly more chars than necessary. Hope it's useful.
$fileName = 'Test:!"$%&/()=?????ü&&';
echo strtr(mb_convert_encoding($fileName,'ASCII'),
    ' ,;:?*#!§$%&/(){}&&=`?|\\\'"',
    '____________________________');
For those who can't use mb_convert_encoding() to convert from one charset to another because of an older version of PHP, try iconv().
I had this problem converting to a Japanese charset:
$txt=mb_convert_encoding($txt,'SJIS',$this->encode);
And I could fix it by using this:
$txt = iconv('UTF-8', 'SJIS', $txt);
Maybe it's helpful for someone else! ;)
When converting Japanese strings to ISO-2022-JP or JIS on PHP >= 5.2.1, you can use "ISO-2022-JP-MS" instead.
Kishu-Izon (platform dependent) characters are converted correctly with this encoding, the same as with eucJP-win or SJIS-win.
As an alternative to Johannes's suggestion for converting strings from other character sets to a 7bit representation while not just deleting Latin diacritics, you might try this:
<?php
$text = iconv($from_enc, 'US-ASCII//TRANSLIT', $text);
?>
The only disadvantage is that it does not convert "ä" to "ae", but it handles punctuation and other special characters better.
-- David
Many people below talk about using
<?php
    mb_convert_encoding($s,'HTML-ENTITIES','UTF-8');
?>
to convert non-ASCII code into HTML-readable stuff. Because my web server is out of my control, I was unable to set the database character set, and whenever PHP made a copy of my $s variable that it had pulled out of the database, it would automatically convert it to nasty latin1 and not leave it in its beautiful UTF-8 glory.
So [insert korean characters here] turned into ?????.
I found myself needing to pass by reference (which of course is deprecated/nonexistent in recent versions of PHP), so I used
<?php
    mb_convert_encoding(&$s,'HTML-ENTITIES','UTF-8');
?>
which worked perfectly until I upgraded. After that I had to use
<?php
    call_user_func_array('mb_convert_encoding', array(&$s,'HTML-ENTITIES','UTF-8'));
?>
Hope it helps someone else out
Here's a tip for anyone using Flash and PHP for storing HTML output submitted from a Flash text field in a database or whatever.
Flash submits its HTML special characters in UTF-8, so you can use the following function to convert those into HTML entity characters:
function utf8html($utf8str)
{
  return htmlentities(mb_convert_encoding($utf8str,"ISO-8859-1","UTF-8"));
}
Be careful when converting from ISO-8859-1 to UTF-8.
Even if you explicitly specify the character encoding of a page as ISO-8859-1 (via headers and strict XML defs), Windows 2000 will ignore that and interpret it as whatever character set it has natively installed.
For example, I wrote char #128 into a page with char encoding ISO-8859-1, and it displayed in Internet Explorer (and Mozilla) as a euro symbol. It should have displayed a box, denoting that char #128 is undefined in ISO-8859-1. The problem was that it was displaying in "Windows: western europe" (my native character set).
This led to confusion when I tried to convert this euro to UTF-8 via mb_convert_encoding(). IE displays UTF-8 correctly, and because PHP correctly converted #128 into a box in UTF-8, IE would show a box.
So all I saw was mb_convert_encoding() converting a euro symbol into a box. It took me a long time to figure out what was going on.
Another sample of recoding without MultiByte enabling.
(Russian koi->win; if the input is already in win encoding, function recode() returns the string unchanged.)
<?php
  // Returns 1 when the byte statistics suggest KOI8, 0 otherwise.
  function detect_encoding($str) {
    $win = 0;
    $koi = 0;
    for($i=0; $i<strlen($str); $i++) {
      if( ord($str[$i]) >224 && ord($str[$i]) < 255) $win++;
      if( ord($str[$i]) >192 && ord($str[$i]) < 223) $koi++;
    }
    if( $win < $koi ) {
      return 1;
    } else return 0;
  }

  function koi_to_win($string) {
    $kw = array(128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 254, 224, 225, 246, 228, 229, 244, 227, 245, 232, 233, 234, 235, 236, 237, 238, 239, 255, 240, 241, 242, 243, 230, 226, 252, 251, 231, 248, 253, 249, 247, 250, 222, 192, 193, 214, 196, 197, 212, 195, 213, 200, 201, 202, 203, 204, 205, 206, 207, 223, 208, 209, 210, 211, 198, 194, 220, 219, 199, 216, 221, 217, 215, 218);
    $wk = array(128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 225, 226, 247, 231, 228, 229, 246, 250, 233, 234, 235, 236, 237, 238, 239, 240, 242, 243, 244, 245, 230, 232, 227, 254, 251, 253, 255, 249, 248, 252, 224, 241, 193, 194, 215, 199, 196, 197, 214, 218, 201, 202, 203, 204, 205, 206, 207, 208, 210, 211, 212, 213, 198, 200, 195, 222, 219, 221, 223, 217, 216, 220, 192, 209);
    $end = strlen($string);
    $pos = 0;
    do {
      $c = ord($string[$pos]);
      if ($c>128) {
        $string[$pos] = chr($kw[$c-128]);
      }
    } while (++$pos < $end);
    return $string;
  }

  function recode($str) {
    $enc = detect_encoding($str);
    if ($enc==1) {
      $str = koi_to_win($str);
    }
    return $str;
  }
?>
To petruzanauticoyahoo?com!ar:
If you don't specify a source encoding, then it assumes the internal (default) encoding. ñ is a multi-byte character whose bytes in your configuration default (often ISO-8859-1) would actually mean Ã±. mb_convert_encoding() is upgrading those characters to their multi-byte equivalents within UTF-8.
Try this instead:
<?php
print mb_convert_encoding( "ñ", "UTF-8", "UTF-8" );
?>
Of course, this call does no real work (for the most part; it can actually be used to strip characters which are not valid for UTF-8).
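The double-encoding effect described in this note can be made visible by looking at the raw bytes. The sketch below uses made-up variable names and writes the sample characters as byte escapes so that it does not depend on how the script file itself is saved.

```php
<?php
// "ñ" as UTF-8 is the two bytes C3 B1.
$n = "\xC3\xB1";

// Wrong source encoding: each byte is read as one ISO-8859-1 character
// (Ã and ±) and re-encoded, which doubles the length.
$double = mb_convert_encoding($n, 'UTF-8', 'ISO-8859-1');
echo bin2hex($double), "\n"; // c383c2b1

// Correct source encoding: the bytes come back unchanged.
$same = mb_convert_encoding($n, 'UTF-8', 'UTF-8');
echo bin2hex($same), "\n"; // c3b1

// A UTF-8 to UTF-8 pass also cleans up bytes that are not valid UTF-8
// (0xFF here); what replaces them depends on mb_substitute_character().
$cleaned = mb_convert_encoding("a\xFFb", 'UTF-8', 'UTF-8');
var_dump(mb_check_encoding($cleaned, 'UTF-8'));
```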
If mb_convert_encoding doesn't work for you, and iconv gives you a headache, you might be interested in this free class I found. It can convert almost any charset to almost any other charset. I think it's wonderful and I wish I had found it earlier. It would have saved me tons of headache.
I use it as a fail-safe, in case mb_convert_encoding is not installed.
This is not my own library, so technically it's not spamming, right? ;)
Hope this helps.
I'd like to share some code to convert Latin diacritics to their
traditional 7bit representation, like, for example,
- à,ç,é,î,... to a,c,e,i,...
- ß to ss
- ä,Ä,... to ae,Ae,...
- ë,... to e,...
(mb_convert "7bit" would simply delete any offending characters).
I might have missed on your country's typographic
conventions--correct me then.
<?php
/**
 * @args string $text line of encoded text
 *       string $from_enc (encoding type of $text, e.g. UTF-8, ISO-8859-1)
 * @returns 7bit representation
 */
function to7bit($text,$from_enc) {
    $text = mb_convert_encoding($text,'HTML-ENTITIES',$from_enc);
    $text = preg_replace(
        array('/&szlig;/','/&(..)lig;/',
             '/&([aouAOU])uml;/','/&(.)[^;]*;/'),
        array('ss',"$1","$1".'e',"$1"),
        $text);
    return $text;
}
?>
Enjoy :-)
Johannes
==
[EDIT BY danbrown AT php DOT net: Author provided the following update on 27-FEB-2012.]
==
An addendum to my "to7bit" function referenced below in the notes.
The function is supposed to solve the problem that some languages require a different 7bit rendering of special (umlauted) characters for sorting or other applications. For example, the German ß ligature is usually written "ss" in 7bit context. Dutch ÿ is typically rendered "ij" (not "y").
The original function works well with word (alphabet) character entities and I've seen it used in many places. But non-word entities cause funny results:
E.g., "©" is rendered as "c", "§" as "s" and "®" as "r".
The following version fixes this by converting non-alphanumeric characters (also chains thereof) to '_'.
<?php
/**
 * @args string $text line of encoded text
 *       string $from_enc (encoding type of $text, e.g. UTF-8, ISO-8859-1)
 * @returns 7bit representation
 */
function to7bit($text,$from_enc) {
    // Caution: without the /u modifier, \W also matches the individual
    // bytes of multi-byte characters in the default locale.
    $text = preg_replace('/\W+/','_',$text);
    $text = mb_convert_encoding($text,'HTML-ENTITIES',$from_enc);
    $text = preg_replace(
        array('/&szlig;/','/&(..)lig;/',
             '/&([aouAOU])uml;/','/&yuml;/','/&(.)[^;]*;/'),
        array('ss',"$1","$1".'e','ij',"$1"),
        $text);
    return $text;
}
?>
Enjoy again,
Johannes
If you are trying to generate a CSV (with extended chars) to be opened in Excel for Mac, the only thing that worked for me was:
<?php mb_convert_encoding( $CSV, 'Windows-1252', 'UTF-8'); ?>
I also tried these:
<?php
iconv('MACINTOSH', 'UTF8', $CSV);
chr(255).chr(254).mb_convert_encoding( $CSV, 'UCS-2LE', 'UTF-8');
?>
But the first one didn't show extended chars correctly, and the second one didn't separate fields correctly.
// mb_convert_encoding($input,'UTF-8','windows-874'); error : Illegal character encoding specified
// so to convert Thai to UTF-8 it is better to use iconv instead
<?php
iconv("windows-874","UTF-8",$input);
?>
When using the Windows Notepad text editor, it is important to note that when you select 'Save As' there is an Encoding selection dropdown. The default encoding is set to ANSI, with the other two options being Unicode and UTF-8. Since most text on the web is in UTF-8 format, it could prove vital to save the .txt file with this encoding, since this function will misread ANSI-encoded text unless you pass the matching from_encoding.