
Getting the Pinyin Initials of Chinese Characters, with Support for the Full GBK Character Set

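2014-06-07 11:09
I searched many websites for source code, but none of it could produce the pinyin initial of the character “芙”. After some digging I found that GB2312-80 splits the Chinese characters it contains into two levels. Level 1 holds the 3755 commonly used characters in rows (区) 16 to 55, ordered by pinyin and then by stroke shape. Level 2 holds 3008 less common characters in rows 56 to 87, ordered by radical and then by stroke count. As a result, most programs can only look up the initial of level-1 characters, and “芙” is evidently a level-2 character. I eventually found the blog post below, which I repost here together with its source code.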

First, some background on how Chinese characters are encoded.


GB2312

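GB2312 is the national standard of the People's Republic of China for Chinese character information interchange. Its full title is 《信息交换用汉字编码字符集——基本集》 (Code of Chinese Graphic Character Set for Information Interchange, Primary Set). It was issued by the National Bureau of Standards (国家标准总局), took effect on May 1, 1981, and is used throughout mainland China. Singapore and other places use it as well.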

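GB2312 contains 7445 graphic characters in total, including simplified Chinese characters, symbols, letters, and Japanese kana; 6763 of them are Chinese characters. The standard specifies that "every graphic character is represented by two bytes, and each byte uses a 7-bit code". By convention the first byte is called the "high byte" and the second the "low byte".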

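GB2312 divides its code table into 94 rows (区), which map to the first byte. Each row has 94 cells (位), which map to the second byte. The two byte values are the row number and the cell number, each plus 32 (20H); the row/cell pair itself is known as the 区位码 (row-cell code). Rows 01-09 hold symbols and digits, rows 16-87 hold Chinese characters, and rows 10-15 and 88-94 are blank areas reserved for future standardization. GB2312 splits its Chinese characters into two levels: level 1 has 3755 common characters in rows 16-55, ordered by pinyin and then by stroke shape; level 2 has 3008 less common characters in rows 56-87, ordered by radical and then by stroke count. GB2312 can therefore represent at most 6763 Chinese characters.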
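
The row-cell arithmetic described above can be sketched in Java. This is only an illustration: the class `QuWei` and its method names are hypothetical and do not appear in the original post, and the row ranges are taken directly from the paragraph above.

```java
// Hypothetical helper (not from the original post): GB2312 row-cell arithmetic.
final class QuWei {
    // GB2312 code: (row + 32) in the high byte, (cell + 32) in the low byte.
    static int toGbCode(int row, int cell) {
        if (row < 1 || row > 94 || cell < 1 || cell > 94) {
            throw new IllegalArgumentException("row and cell must be in 1..94");
        }
        return ((row + 32) << 8) | (cell + 32);
    }

    // Returns 1 for level-1 rows (16-55), 2 for level-2 rows (56-87), 0 otherwise.
    static int levelOfRow(int row) {
        if (row >= 16 && row <= 55) return 1;
        if (row >= 56 && row <= 87) return 2;
        return 0;
    }

    public static void main(String[] args) {
        // Row 16, cell 1 is the first Chinese character slot in the code table.
        System.out.println(Integer.toHexString(toGbCode(16, 1))); // 3021
        System.out.println(levelOfRow(16)); // 1
        System.out.println(levelOfRow(87)); // 2
    }
}
```

Row 87, cell 94 maps to 777EH, which matches the upper end of the Chinese character range in the code table.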

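GB2312 codes range from 2121H to 777EH, which overlaps with ASCII. The usual way to tell the two apart is to set the most significant bit of both bytes of a GB code to 1.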
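
As a rough illustration of the high-bit convention, the sketch below sets the top bit of both bytes of a GB code and then uses that bit to walk a byte sequence that mixes ASCII and two-byte GB codes. The class `HighBit` and its method names are hypothetical, and the scanner assumes well-formed input.

```java
// Hypothetical helper (not from the original post): the high-bit convention.
final class HighBit {
    // Sets the most significant bit of both bytes, e.g. 0x3021 becomes 0xB0A1.
    static int toInternalCode(int gbCode) {
        return gbCode | 0x8080;
    }

    // Counts characters in a byte sequence that mixes ASCII and two-byte GB codes.
    // A byte with the high bit set is taken as the first byte of a two-byte character.
    static int countCharacters(byte[] data) {
        int count = 0;
        int i = 0;
        while (i < data.length) {
            boolean highBitSet = (data[i] & 0x80) != 0;
            i += highBitSet ? 2 : 1;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(toInternalCode(0x3021))); // b0a1
        byte[] mixed = { 'a', (byte) 0xB0, (byte) 0xA1, 'b' };
        System.out.println(countCharacters(mixed)); // 3
    }
}
```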


GBK

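GB2312 includes only 6763 Chinese characters, far fewer than actually exist. As time passes and Chinese-language culture spreads, some characters that used to be rare have become common. For example, the “镕” in 朱镕基 (Zhu Rongji) is not in GB2312-80, so mainland newspapers had to write it as (金+容), (金容), (左金右容), and so on. These notations are inconsistent, which makes display, storage, input, and processing very inconvenient and is bad news for software such as search engines, and there is no standard governing them. Our experience from processing the 1998 People's Daily data was that the hardest part of restoring such out-of-set characters is collecting the full set of notations in use.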

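To solve these problems and to prepare for the adoption of Unicode, the National Information Technology Standardization Technical Committee (全国信息技术标准化技术委员会) issued the 《汉字内码扩展规范》 (Chinese Internal Code Extension Specification, GBK) on December 1, 1995. GBK is fully backward compatible with GB2312 and, looking forward, supports the ISO 10646 international standard, so it acts as a bridge during the transition from the former to the latter. GBK also uses two bytes per character. The overall code range is 8140-FEFE: the first byte is in 81-FE and the second byte is in 40-FE, excluding the XX7F line.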
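
The byte ranges above translate into a simple validity check. The following is a minimal sketch; the class `GbkRange` and its methods are hypothetical names chosen for illustration, and the check only tests whether a byte pair lies inside the GBK code space, not whether a character is actually assigned there.

```java
// Hypothetical helper (not from the original post): GBK byte-range checks.
final class GbkRange {
    // First byte of a GBK double-byte code: 0x81-0xFE.
    static boolean isLeadByte(int b) {
        return b >= 0x81 && b <= 0xFE;
    }

    // Second byte: 0x40-0xFE, with 0x7F excluded.
    static boolean isTrailByte(int b) {
        return b >= 0x40 && b <= 0xFE && b != 0x7F;
    }

    static boolean isDoubleByteCode(int lead, int trail) {
        return isLeadByte(lead) && isTrailByte(trail);
    }

    public static void main(String[] args) {
        System.out.println(isDoubleByteCode(0x81, 0x40)); // true
        System.out.println(isDoubleByteCode(0x81, 0x7F)); // false
    }
}
```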

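GBK contains 21886 Chinese characters and graphic symbols in total, comprising:

* All Chinese characters and non-Chinese symbols in GB2312.

* All Chinese characters in BIG5.

* The remaining CJK Chinese characters in GB13000, the national standard that corresponds to ISO 10646. Together these come to 20902 Chinese characters.

* Other Chinese characters, radicals, and symbols, 984 in total.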

Microsoft has supported GBK since the Simplified Chinese edition of Windows 95, but most search engines currently do not handle GBK characters well.

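The GBK code space is divided into three parts:

* Chinese character area, consisting of:
GBK/2: 0xB0A1-0xF7FE, the 6763 GB2312 Chinese characters in their original order;
GBK/3: 0x8140-0xA0FE, 6080 CJK Chinese characters;
GBK/4: 0xAA40-0xFEA0, 8160 CJK and supplementary Chinese characters.

* Graphic symbol area, consisting of:
GBK/1: 0xA1A1-0xA9FE, the GB2312 symbols plus additional symbols;
GBK/5: 0xA840-0xA9A0, the extended non-Chinese-character area.

* User-defined area:

The unassigned parts of the GBK code space, where users can define their own characters.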
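
The partition above can be expressed as a small classifier. This is a sketch under one explicit assumption: each listed range is treated as a rectangle of first-byte values by second-byte values (for example, GBK/4 is first byte 0xAA-0xFE with second byte 0x40-0xA0, excluding 0x7F). The class `GbkRegion` and the label "other" for everything else are hypothetical.

```java
// Hypothetical helper (not from the original post): which GBK area a code falls in.
final class GbkRegion {
    static String regionOf(int lead, int trail) {
        // Second-byte halves: 0x40-0xA0 (without 0x7F) and 0xA1-0xFE.
        boolean lowTrail = trail >= 0x40 && trail <= 0xA0 && trail != 0x7F;
        boolean highTrail = trail >= 0xA1 && trail <= 0xFE;

        if (lead >= 0xB0 && lead <= 0xF7 && highTrail) return "GBK/2";
        if (lead >= 0x81 && lead <= 0xA0 && (lowTrail || highTrail)) return "GBK/3";
        if (lead >= 0xAA && lead <= 0xFE && lowTrail) return "GBK/4";
        if (lead >= 0xA1 && lead <= 0xA9 && highTrail) return "GBK/1";
        if (lead >= 0xA8 && lead <= 0xA9 && lowTrail) return "GBK/5";
        return "other"; // user-defined or outside the ranges listed above
    }

    public static void main(String[] args) {
        System.out.println(regionOf(0xB0, 0xA1)); // GBK/2
        System.out.println(regionOf(0x81, 0x40)); // GBK/3
        System.out.println(regionOf(0xAA, 0x40)); // GBK/4
    }
}
```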

package com;

import java.io.UnsupportedEncodingException;

/**
* Gets the pinyin initials of Chinese characters.
* Supports the full GBK character set.
* @author Zhao Honghui
* @version 1.0
*/
public class GetPy {

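// Pinyin initials for GBK/2 (the GB2312 Chinese characters), one entry per code:
// first bytes 0xB0-0xF7, 94 second bytes (0xA1-0xFE) each. A space marks an unused code.
private static final String GB_2312 =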
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbp" +
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbpbbbbbbbbbbbbbbbbbb" +
"bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" +
"pbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" +
"bbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccc" +
"ccccccccccccccccccccccccccccccccccczcccccccccccccc" +
"ccccccccccccccccccccccccccccccccccccsccccccccccccc" +
"cccccccccccccccccccccccccccccccccccccccccczccccccc" +
"cccccccccccccccccccccccccccccccccccccccccccccccccc" +
"cccddddddddddddddddddddddddddddddddddddddddddddddd" +
"dddddddddddddddddddddzdddddddddddddddddddddddddddd" +
"dddddddddddddddddddddddddddddddtdddddddddddddddddd" +
"dddddddddddddddddddddddddddddddddddddeeeeeeeeeeeee" +
"eeeeeeeeefffffffffffffffffffffffffffffffffffffffff" +
"ffffffffffffffffffffffffffffffffffffffffffffffffff" +
"fffffffffffffpffffffffffffffffffffgggggggggggggggg" +
"ggggggggggggggggggghggggggggggggghgggggggggggggggg" +
"gggggggggggggggggggggggggggggggggggggggggggggggggg" +
"ggggggggggggggggggggggggggggggggggggggghhhhhhhhhhh" +
"hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhmhhhhhhhhhhh" +
"hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh" +
"hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh" +
"hhhhhhhhhhhhhhhhhhhhjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj" +
"jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj" +
"jjjjjjjjjjjjjjkjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj" +
"jjjjjjjyjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj" +
"jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj" +
"jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj" +
"jjjjjjjjjjjjjjjkkkgkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkh" +
"kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk" +
"kkkkkkkkkkkkkkklllllllllllllllllllllllllllllllllll" +
"llllllllllllllllllllllllllllllllllllllllllllllllll" +
"llllllllllllllllllllllllllllllllllllllllllllllllll" +
"llllllllllllllllllllllllllllllllllllllllllllllllll" +
"llllllllllllllllllllllllllllllllllllllllllllllllll" +
"lllllllllllllmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm" +
"mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm" +
"mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm" +
"mmmmmmmmmmmmmmnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn" +
"nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnooooo" +
"oooppppppppppppppppppppppppppppppppppppppppppppppp" +
"pppppppppppppppppppppppppppppppppppppppppppppppppp" +
"ppppppppppppppppppppppppbqqqqqqqqqqqqqqqqqqqqqqqqq" +
"qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq" +
"qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq" +
"qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqrrrrrrrrrrrrrrrrrr" +
"rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrsssssssss" +
"ssssssssssssssssssssssssssssssssssssssssssssssssss" +
"ssssssssssssssssssssssssssssssssssssssssssssssssss" +
"ssssssssssssssssssssssssssssssssssssssssssssssssss" +
"ssssssssssssssssssssssssssssssssssssssssssssssssss" +
"sssssssssssssssssssssssssssssssssssssssssssssssssx" +
"sssssssssssssssssssssssssssttttttttttttttttttttttt" +
"tttttttttttttttttttttttttttttttttttttttttttttttttt" +
"tttttttttttttttttttttttttttttttttttttttttttttttttt" +
"tttttttttttttttttttttttttttttttttwwwwwwwwwwwwwwwww" +
"wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww" +
"wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww" +
"wwwxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxsx" +
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" +
"xxxxxxxxxxxxxxxxxxxxxjxxxxxxxxxxxxxxxxxxxxxxxxxxxx" +
"xxxxxhxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxcxxxxxxxxx" +
"xxxxxxxxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyy" +
"yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy" +
"yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy" +
"yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy" +
"yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy" +
"yyyyyyyyyyyyyyyyyyyyyyyyxyyyyyyyyyyyyyyyyyyyyyyyyy" +
"yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyzzzzzzzzzzzzzzzzzz" +
"zzzzzzzzzzzzzzzzzzzzzczzzzzzzzzzzzzzzzzzzzzzzzzzzz" +
"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" +
"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" +
"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" +
"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" +
"zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" +
"zzzzz     cjwgnspgcgnesypbtyyzdxykygtdjnnjqmbsjzsc" +
"yjsyyfpgkbzgylywjkgkljywkpjqhytwddzlsymrypywwcckzn" +
"kyygttngjnykkzytcjnmcylqlypysfqrpzslwbtgkjfyxjwzlt" +
"bncxjjjjtxdttsqzycdxxhgckbphffsswybgmxlpbylllhlxst" +
"zmyjhsojnghdzqyklgjhsgqzhxqgkxzzwyscscjxyeyxadzpmd" +
"ssmzjzqjyzcdjzwqjbyzbjgznzcpwhwxhqkmwfbpbydtjzzkxx" +
"ylygxfptyjyyzpszlfchmqshgmxxsxjyqdcsbbqbefsjyhxwgz" +
"kpylqbgldlcdtnmaeddkssngycsgxlyzaypnptsdkdylhgymyl" +
"cxpycjndqjwxqxfyyfjlejpzrxccqwqqsbzkymgplbmjrqcfln" +
"ymyqmtqyrbcjthztqfrxqhxmqjcjlyxgjmshzkbswyemyltxfs" +
"ydsglycjqxsjnqbsctyhbftdcyjdjyyghqfsxwckqkxebptlpx" +
"jzsrmebwhjlpjslyysmdxlclqkxlhxjrzjmfqhxhwywsbhtrxx" +
"glhqhfnmgykldyxzpylggtmtcfpnjjzyljtyanjgbjplqgszyq" +
"yaxbkysecjsznslyzhzxlzcghpxzhznytdsbcjkdlzayfmytle" +
"bbgqyzkggldndnyskjshdlyxbcgyxypkdjmmzngmmclgezszxz" +
"jfznmlzzthcsydbdllscddnlkjykjsycjlkwhqasdknhcsgaeh" +
"daashtcplcpqybsdmpjlpzjoqlcdhjxysprchnwjnlhlyyqyhw" +
"zptczgwwmzffjqqqqyxaclbhkdjxdgmmydqxzllsygxgkjrywz" +
"wyclzmssjzldbydcpcxyhlxchyzjqsfqagmnyxpfrkssbjlyxy" +
"syglnscmhcwwmnzjjlxxhchsyzsttxrycyxbyhcsmxjsznpwgp" +
"xxtaybgajcxlysdccwzocwkccsbnhcpdyznfcyytyckxkybsqk" +
"kytqqxfcwchcykelzqbsqyjqcclmthsywhmktlkjlycxwheqqh" +
"tqkjpqsqscfymmdmgbwhwlgsllystlmlxpthmjhwljzyhzjxht" +
"xjlhxrswlwzjcbxmhzqxsdzpsgfcsglsxymqshxpjxwmyqksmy" +
"plrthbxftpmhyxlchlhlzylxgsssstclsldclrpbhzhxyyfhbb" +
"gdmycnqqwlqhjjzywjzyejjdhpblqxtqkwhlchqxagtlxljxms" +
"ljhtzkzjecxjcjnmfbycsfywybjzgnysdzsqyrsljpclpwxsdw" +
"ejbjcbcnaytwgmpapclyqpclzxsbnmsggfnzjjbzsfzyndxhpl" +
"qkzczwalsbccjxjyzgwkypsgxfzfcdkhjgxtlqfsgdslqwzkxt" +
"mhsbgzmjzrglyjbpmlmsxlzjqzhzyjczydjwfmjklddpmjegxy" +
"hylxhlqyqhkycwcjmyyxnatjhyccxzpcqlbzwwytwsqcmlpmyr" +
"jcccxfpznzzljplxxyztzlgdltcklyrzzgqtkjhhgjljaxfgfj" +
"zslcfdqzlclgjdjcsnzlljpjqdcclcjxmyzftsxgcgsbrzxjqq" +
"ctzhgyqtjqqlzxjylylncyamcstylpdjbyregklzyzhlyszqlz" +
"nwczcllwjqjjjkdgjzolbbzppglghtgzxyjhzmycnqsycyhbhg" +
"xkamtxyxnbskyzzgjzlqjtfcjxdygjqjjpmgwgjjjpkqsbgbmm" +
"cjssclpqpdxcdyykyfcjddyygywrhjrtgznyqldkljszzgzqzj" +
"gdykshpzmtlcpwnjyfyzdjcnmwescyglbtzcgmssllyxqsxxbs" +
"jsbbsgghfjlypmzjnlyywdqshzxtyywhmcyhywdbxbtlmsyyyf" +
"sxjchtxxlhjhfssxzqhfzmzcztqcxzxrttdjhnnyzqqmtqdmmz" +
" ytxmjgdxcdyzbffallztdltfxmxqzdngwqdbdczjdxbzgsqqd" +
"djcmbkzffxmkdmdsyyszcmljdsynsprskmkmpcklgdbqtfzswt" +
"fgglyplljzhgjjgypzltcsmcnbtjbqfkdhpyzgkpbbymtdssxt" +
"bnpdkleycjnyddykzddhqhsdzsctarlltkzlgecllkjlqjaqnb" +
"dkkghpjxzqksecshalqfmmgjnlyjbbtmlyzxdxjpldlpcqdhzy" +
"cbzsczbzmsljflkrzjsnfrgjhxpdhyjybzgdlqcsezgxlblhyx" +
"twmabchecmwyjyzlljjyhlgbdjlslygkdzpzxjyyzlwcxszfgw" +
"yydlyhcljscmbjhblyzlycblydpdqysxqzbytdkyxlyycnrjmp" +
"dqgklcljbcxbjddbblblczqrppxjcjlzcshltoljnmdddlngka" +
"thqhjhykheznmshrphqqjchgmfprxhjgdychgklyrzqlcyqjnz" +
"sqtkqjymszxwlcfqqqxyfggyptqwlmcrnfkkfsyylybmqammmy" +
"xctpshcptxxzzsmphpshmclmldqfyqxszyjdjjzzhqpdszglst" +
"jbckbxyqzjsgpsxqzqzrqtbdkwxzkhhgflbcsmdldgdzdblzyy" +
"cxnncsybzbfglzzxswmsccmqnjqsbdqsjtxxmbltxcclzshzcx" +
"rqjgjylxzfjphymzqqydfqjqlzznzjcdgzygztxmzysctlkpht" +
"xhtlbjxjlxscdqxcbbtjfqzfsltjbtkqbxxjjljchczdbzjdcz" +
"jdcprnpqcjpfczlclzxzdmxmphjsgzgszzqlylwtjpfsyaxmcj" +
"btzyycwmytzsjjlqcqlwzmalbxyfbpnlsfhtgjwejjxxglljst" +
"gshjqlzfkcgnndszfdeqfhbsaqtgylbxmmygszldydqmjjrgbj" +
"tkgdhgkblqkbdmbylxwcxyttybkmrtjzxqjbhlmhmjjzmqasld" +
"cyxyqdlqcafywyxqhz";

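// Pinyin initials for GBK/3: first bytes 0x81-0xA0, 191 second-byte slots (0x40-0xFE) each.
// A space marks a code with no known initial.
private static final String GBK_3 =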
"ksxsm sdqlybjjjgczbjfya jhphsyzgj sn xy ng" +
" lggllyjds yssgyqyd xjyydldwjjwbbftbxthhbczcrfm" +
"qwyfcwdzpyddwyxjajpsfnzyjxxxcxnnxxzzbpysyzhmzbqbzc" +
"ycbxqsbhhxgfmbhhgqcxsthlygymxalelccxzrcsd njjtzzcl" +
"jdtstbnxtyxsgkwyflhjqspxmxxdc lshxjbcfybyxhczbjyzl" +
"wlcz gtsmtzxpqglsjfzzlslhdzbwjncjysnycqrzcwybtyftw" +
"ecskdcbxhyzqyyxzcffzmjyxxsdcztbzjwszsxyrnygmdthjxs" +
"qqccsbxrytsyfbjzgclyzzbszyzqscjhzqydxlbpjllmqxtydz" +
"sqjtzplcgqtzwjbhcjdyfxjelbgxxmyjjqfzasyjnsydk jcjs" +
"zcbatdclnjqmwnqncllkbybzzsyhjqltwlccxthllzntylnzxd" +
"dtcenjyskkfksdkghwnlsjt jymrymzjgjmzgxykymsmjklfxm" +
"tghpfmqjsmtgjqdgyalcmzcsdjlxdffjc f ffkgpkhrcjqcj" +
"dwjlfqdmlzbjjscgckdejcjdlzyckscclfcq czgpdqzjj hdd" +
"wgsjdkccctllpskghzzljlgjgjjtjjjzczmlzyjkxzyzmljkyw" +
"xmkjlkjgmclykjqlblkmdxwyxysllpsjqjqxyqfjtjdmxxllcr" +
"qyjb xgg pjygegdjgnjyjkhqfqzkhyghdgllsdjjxkyoxnzsx" +
"wwxdcskxxjyqscsqkjexsyzhydz ptqyzmtstzfsyldqagylcq" +
"lyyyhlrq ldhsssadsjbrszxsjyrcgqc hmmxzdyohycqgphhy" +
"nxrhgjlgwqwjhcstwasjpmmrdsztxyqpzxyhyqxtpbfyhhdwzb" +
"txhqeexzxxkstexgltxydn hyktmzhxlplbmlsfhyyggbhyqt" +
"xwlqczydqdq gd lls zwjqwqajnytlxanzdecxzwwsgqqdyzt" +
"chyqzlxygzglydqtjtadyzzcwyzymhyhyjzwsxhzylyskqysbc" +
"yw xjzgtyxqsyhxmchrwjpwxzlwjs sgnqbalzzmtjcjktsax" +
"ljhhgoxzcpdmhgtysjxhmrlxjkxhmqxctxwzbkhzccdytxqhlx" +
"hyx syydz znhxqyaygypdhdd pyzndltwxydpzjjcxmtlhbyn" +
"yymhzllhnmylllmdcppxmxdkycydltxchhznaclcclylzsxzjn" +
"zln lhyntkyjpychegttgqrgtgyhhlgcwyqkpyyyttttlhylly" +
"ttplkyzqqzdq nmjzxyqmktfbjdjjdxbtqzgtsyflqgxblzfh" +
" zadpmjhlccyhdzfgydgcyxs hd d axxbpbyyaxcqffqyjxdl" +
"jjzl bjydyqszwjlzkcdtctbkdyzdqjnkknjgyeglfykasntch" +
"blwzbymjnygzyheyfjmctyfzjjhgck lxhdwxxjkyykssmwctq" +
"zlpbzdtwzxzag kwxl lspbclloqmmzslbczzkdcz xgqqdcyt" +
"zqwzqssfpktfqdcdshdtdwfhtdy jaqqkybdjyxtlj drqxxxa" +
"ydrjlklytwhllrllcxylbw z zzhkhxksmdsyyjpzbsqlcxxn" +
"xwmdq gqmmczjgttybhyjbetpjxdqhkzbhfdxkawtwajldyjsf" +
"hblddqjncxfjhdfjjwzpkzypcyzynxff ydbzznytxzembsehx" +
"fzmbflzrsymzjrdjgxhjgjjnzzxhgxhymlpeyyxtgqshxssxmf" +
"mkcctxnypszhzptxwywxyysljsqxzdleelmcpjclxsqhfwwtff" +
"tnqjjjdxhwlyznflnkyyjldx hdynrjtywtrmdrqhwqcmfjdyz" +
"hmyyxjwzqtxtlmrspwwchjb xygcyyrrlmpymkszyjrmysntpl" +
"nbpyyxmykyngjzznlzhhanmpgwjdzmxxmllhgdzxyhxkrycjmf" +
"fxyhjfssqlxxndyca nmtcjcyprrnytyqym sxndlylyljnlxy" +
"shqmllyzljzxstyzsmcqynzlxbnnylrqtryyjzzhsytxcqgxzs" +
"shmkczyqhzjnbh qsnjnzybknlqhznswxkhjyybqlbfl p bkq" +
"zxsddjmessmlxxkwnmwwwydkzggtggxbjtdszxnxwmlptfxlcx" +
"jjljzxnwxlyhhlrwhsc ybyawjjcwqqjzzyjgxpltzftpakqpt" +
"lc xtx hklefdleegqymsawhmljtwyqlyjeybqfnlyxrdsctg" +
"gxyyn kyqctlhjlmkkcgygllldzydhzwpjzkdyzzhyyfqytyzs" +
"ezzlymhjhtwyzlkyywzcskqqtdxwctyjklwqbdqyncs szjlkc" +
"dcdtlzzacqqzzddxyplxzbqjylzllqdzqgyjyjsyxnyyynyjxk" +
"xdazwrdljyyynjlxllhxjcykynqcclddnyyykyhhjcl pb qzz" +
"yjxj fzdnfpzhddwfmyypqjrssqzsqdgpzjwdsjdhzxwybp gp" +
"tmjthzsbgzmbjczwbbzmqcfmbdmcjxljbgjtz mqdyxjzyctyz" +
"tzxtgkmybbcljssqymscx jeglxszbqjjlyxlyctsxmcwfa kb" +
"qllljyxtyltxdphnhfqyzyes sdhwdjbsztfd czyqsyjdzjqp" +
"bs j fbkjbxtkqhmkwjjlhhyyyyywyycdypczyjzwdlfwxwzzj" +
"cxcdjzczlxjjtxbfwpxzptdzbccyhmlxbqlrtgrhqtlf mwwjx" +
"jwcysctzqhxwxkjybmpkbnzhqcdtyfxbyxcbhxpsxt m sxlhk" +
"mzxydhwxxshqhcyxglcsqypdh my ypyyykzljqtbqxmyhcwll" +
"cyl ewcdcmlggqktlxkgndgzyjjlyhqdtnchxwszjydnytcqcb" +
"hztbxwgwbxhmyqsycmqkaqyncs qhysqyshjgjcnxkzycxsbxx" +
"hyylstyxtymgcpmgcccccmztasgqzjlosqylstmqsqdzljqqyp" +
"lcycztcqqpbqjclpkhz yyxxdtddsjcxffllxmlwcjcxtspyxn" +
"dtjsjwxqqjskyylsjhaykxcyydmamdqmlmczncybzkkyflmcsc" +
"lhxrcjjgslnmtjzzygjddzjzk qgjyyxzxxqhheytmdsyyyqlf" +
" zzdywhscyqwdrxqjyazzzdywbjwhyqszywnp azjbznbyzzy" +
"hnscpjmqcy zpnqtbzjkqqhngccxchbzkddnzhjdrlzlsjljyx" +
"ytbgtcsqmnjpjsrxcfjqhtpzsyjwbzzzlstbwwqsmmfdwjyzct" +
"bwzwqcslqgdhqsqlyzlgyxydcbtzkpj gm pnjkyjynhpwsnsz" +
"zxybyhyzjqjtllcjthgdxxqcbywbwzggqrqzssnpkydznxqxjm" +
"y dstzplthzwxwqtzenqzw ksscsjccgptcslccgllzxczqthn" +
"jgyqznmckcstjskbjygqjpldxrgzyxcxhgdnlzwjjctsbcjxbf" +
"zzpqdhjtywjynlzzpcjdsqjkdxyajyemmjtdljyryynhjbngzj" +
"kmjxltbsllrzylcscnxjllhyllqqqlxymswcxsljmc zlnsdwt" +
"jllggjxkyhbpdkmmscsgxjcsdybxdndqykjjtxdygmzzdzslo " +
"yjsjzdlbtxxxqqjzlbylwsjjyjtdzqqzzzzjlzcdzjhpl qplf" +
"fjzysj zfpfzksyjjhxttdxcysmmzcwbbjshfjxfqhyzfsjybx" +
"pzlhmbxhzxfywdab lktshxkxjjzthgxh jxkzxszzwhwtzzzs" +
"nxqzyawlcwxfxyyhxmyyswqmnlycyspjkhwcqhyljmzxhmcnzh" +
"hxcltjplxyjhdyylttxfszhyxxsjbjyayrmlckd yhlrlllsty" +
"zyyhscszqxkyqfpflk ntljmmtqyzwtlll s rbdmlqjbcc qy" +
"wxfzrzdmcyggzjm mxyfdxc shxncsyjjmpafyfnhyzxyezy " +
"sdl zztxgfmyyysnbdnlhpfzdcyfssssn zzdgpafbdbzszbsg" +
"cyjlm z yxqcyxzlckbrbrbzcycjzeeyfgzlyzsfrtkqsxdcm" +
"z jl xscbykjbbrxllfqwjhyqylpzdxczybdhzrbjhwnjtjxl" +
"kcfssdqyjkzcwjl b tzlltlqblcqqccdfpphczlyygjdgwcf" +
"czqyyyqyrqzslszfcqnwlhjcjjczkypzzbpdc jgx gdz f" +
"gpsysdfwwjzjyxyyjyhwpbygxrylybhkjksftzmmkhtyysyyzp" +
"yqydywmtjjrhl tw bjycfnmgjtysyzmsjyjhhqmyrszwtr" +
"tzsskx gqgsptgcznjjcxmxgzt ydjz lsdglhyqgggthszpyj" +
"hhgnygkggmdzylczlxqstgzslllmlcskbljzzsmmytpzsqjcj " +
" zxzzcpshkzsxcdfmwrllqxrfzlysdctmxjthjntnrtzfqyhqg" +
"llg sjdjj tqjlnyhszxcgjzypfhdjspcczhjjjzjqdyb ss" +
"lyttmqtbhjqnnygjyrqyqmzgcjkpd gmyzhqllsllclmholzgd" +
"yyfzsljc zlylzqjeshnylljxgjxlyjyyyxnbzljsszcqqzjyl" +
"lzldj llzllbnyl hxxccqkyjxxxklkseccqkkkcgyyxywtqoh" +
"thxpyxx hcyeychbbjqcs szs lzylgezwmysx jqqsqyyycmd" +
"zywctjsycjkcddjlbdjjzqysqqxxhqjohdyxgmajpchcpljsmt" +
"xerxjqd pjdbsmsstktssmmtrzszmldj rn sqxqydyyzbdsln" +
"fgpzmdycwfdtmypqwytjzzqjjrjhqbhzpjhnxxyydyhhnmfcpb" +
"zpzzlzfmztzmyftskyjyjzhbzzygh pzcscsjssxfjgdyzyhzc" +
"whcsexfqzywklytmlymqpxxskqjpxzhmhqyjs cjlqwhmybdhy" +
"ylhlglcfytlxcjscpjskphjrtxteylssls yhxscznwtdwjslh" +
"tqdjhgydphcqfzljlzptynlmjllqyshhylqqzypbywrfy js y" +
"p yrhjnqtfwtwrchygmm yyhsmzhngcelqqmtcwcmpxjjfyysx" +
"ztybmstsyjdtjqtlhynpyqzlcxznzmylflwby jgsylymzctdw" +
"gszslmwzwwqzsayysssapxwcmgxhxdzyjgsjhygscyyxhbbzjk" +
"ssmalxycfygmqyjycxjlljgczgqjcczotyxmtthlwtgfzkpzcx" +
"kjycxctjcyh xsgckxzpsjpxhjwpjgsqxxsdmrszzyzwsykyzs" +
"hbcsplwsscjhjlchhylhfhhxjsx lnylsdhzxysxlwzyhcldyh" +
"zmdyspjtqznwqpsswctst zlmssmnyymjqjzwtyydchqlxkwbg" +
"qybkfc jdlzllyylszydwhxpsbcmljscgbhxlqrljxysdwxzsl" +
"df hlslymjljylyjcdrjlfsyjfnllcqyqfjy szlylmstdjcyh" +
"zllnwlxxygyygxxhhzzxczqzfnwpypkpypmlgxgg dxzzkzfbx" +
"xlzptytswhzyxhqhxxxywzyswdmzkxhzphgchj lfjxptzthly" +
"xcrhxshxkjxxzqdcqyl jlkhtxcwhjfwcfpqryqxyqy gpggsc" +
"sxngkchkzxhflxjbyzwtsxxncyjjmwzjqrhfqsyljzgynslgtc" +
"ybyxxwyhhxynsqymlywgyqbbzljlpsytjzhyzwlrorjkczjxxy" +
"xchdyxyxxjddsqfxyltsfxlmtyjmjjyyxltcxqzqhzlyyxzh n" +
"lrhxjcdyhlbrlmrllaxksllljlxxxlycry lccgjcmtlzllyzz" +
"pcw jyzeckzdqyqpcjcyzmbbcydcnltrmfgyqbsygmdqqzmkql" +
"pgtbqcjfkjcxbljmswmdt ldlppbxcwkcbjczhkphyyhzkzmp" +
"jysylpnyyxdb";

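// Pinyin initials for GBK/4: first bytes 0xAA-0xFE, 97 second-byte slots (0x40-0xA0) each.
// A space marks a code with no known initial.
private static final String GBK_4 =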
"kxxmzjxsttdzxxbzyshjpfxpqbyljqkyzzzwl zgfwyctjxjpy" +
"yspmsmydyshqy zchmjmcagcfbbhplxtyqx djgxdhkxxnbhrm" +
"lnjsltsmrnlxqjyzlsqglbhdcgyqyyhwfjybbyjyjjdpqyapfx" +
"cgjscrssyz lbzjjjlgxzyxyxsqkxbxxgcxpld wetdwwcjmbt" +
"xchxyxxfxllj fwdpzsmylmwytcbcecblgdbqzqfjdjhymcxtx" +
"drmjwrh xcjzylqdyhlsrsywwzjymtllltqcjzbtckzcyqjzqa" +
"lmyhwwdxzxqdllqsgjfjljhjazdjgtkhsstcyjfpszlxzxrwgl" +
"dlzr lzqtgslllllyxxqgdzybphl x bpfd hy jcc dmzpp" +
"z cyqxldozlwdwyythcqsccrsslfzfp qmbjxlmyfgjb m jwd" +
"n mmjtgbdzlp hsymjyl hdzjcctlcl ljcpddqdsznbgzxxcx" +
"qycbzxzfzfjsnttjyhtcmjxtmxspdsypzgmljtycbmdkycsz z" +
"yfyctgwhkyjxgyclndzscyzssdllqflqllxfdyhxggnywyllsd" +
"lbbjcyjzmlhl xyyytdlllb b bqjzmpclmjpgehbcqax hhhz" +
"chxyhjaxhlphjgpqqzgjjzzgzdqybzhhbwyffqdlzljxjpalxz" +
"daglgwqyxxxfmmsypfmxsyzyshdzkxsmmzzsdnzcfp ltzdnmx" +
"zymzmmxhhczjemxxksthwlsqlzllsjphlgzyhmxxhgzcjmhxtx" +
"fwkmwkdthmfzzydkmsclcmghsxpslcxyxmkxyah jzmcsnxyym" +
"mpmlgxmhlmlqmxtkzqyszjshyzjzybdqzwzqkdjlfmekzjpezs" +
"wjmzyltemznplplbpykkqzkeqlwayyplhhaq jkqclhyxxmlyc" +
"cyskg lcnszkyzkcqzqljpmzhxlywqlnrydtykwszdxddntqd" +
"fqqmgseltthpwtxxlwydlzyzcqqpllkcc ylbqqczcljslzjxd" +
"dbzqdljxzqjyzqkzljcyqdypp pqykjyrpcbymxkllzllfqpyl" +
"llmsglcyrytmxyzfdzrysyztfmsmcl ywzgxzggsjsgkdtggzl" +
"ldzbzhyyzhzywxyzymsdbzyjgtsmtfxqyjssdgslnndlyzzlrx" +
"trznzxnqfmyzjzykbpnlypblnzz jhtzkgyzzrdznfgxskgjtt" +
"yllgzzbjzklplzylxyxbjfpnjzzxcdxzyxzggrs jksmzjlsjy" +
"wq yhqjxpjzt lsnshrnypzt wchklpszlcyysjylybbwzpdwg" +
"cyxckdzxsgzwwyqyytctdllxwkczkkcclgcqqdzlqcsfqchqhs" +
"fmqzlnbbshzdysjqplzcd cwjkjlpcmz jsqyzyhcpydsdzngq" +
"mbsflnffgfsm q lgqcyybkjsrjhzldcftlljgjhtxzcszztjg" +
"gkyoxblzppgtgyjdhz zzllqfzgqjzczbxbsxpxhyyclwdqjjx" +
"mfdfzhqqmqg yhtycrznqxgpdzcszcljbhbzcyzzppyzzsgyhc" +
"kpzjljnsc sllxb mstldfjmkdjslxlsz p pgjllydszgql l" +
"kyyhzttnt tzzbsz ztlljtyyll llqyzqlbdzlslyyzyfszs" +
"nhnc bbwsk rbc zm gjmzlshtslzbl q xflyljqbzg st" +
"bmzjlxfnb xjztsfjmssnxlkbhsjxtnlzdntljjgzjyjczxygy" +
"hwrwqnztn fjszpzshzjfyrdjfcjzbfzqchzxfxsbzqlzsgyft" +
"zdcszxzjbqmszkjrhyjzckmjkhchgtxkjqalxbxfjtrtylxjhd" +
"tsjx j jjzmzlcqsbtxhqgxtxxhxftsdkfjhzxjfj zcdlllt" +
"qsqzqwqxswtwgwbccgzllqzbclmqqtzhzxzxljfrmyzflxys x" +
"xjk xrmqdzdmmyxbsqbhgcmwfwtgmxlzpyytgzyccddyzxs g " +
"yjyznbgpzjcqswxcjrtfycgrhztxszzt cbfclsyxzlzqmzlmp" +
" lxzjxslbysmqhxxz rxsqzzzsslyflczjrcrxhhzxq dshjsj" +
"jhqcxjbcynsssrjbqlpxqpymlxzkyxlxcjlcycxxzzlxlll hr" +
"zzdxytyxcxff bpxdgygztcqwyltlswwsgzjmmgtjfsgzyafsm" +
"lpfcwbjcljmzlpjjlmdyyyfbygyzgyzyrqqhxy kxygy fsfsl" +
"nqhcfhccfxblplzyxxxkhhxshjzscxczwhhhplqalpqahxdlgg" +
"gdrndtpyqjjcljzljlhyhyqydhz zczywteyzxhsl jbdgwxpc" +
" tjckllwkllcsstknzdnqnttlzsszyqkcgbhcrrychfpfyrwq" +
"pxxkdbbbqtzkznpcfxmqkcypzxehzkctcmxxmx nwwxjyhlstm" +
"csqdjcxctcnd p lccjlsblplqcdnndscjdpgwmrzclodansyz" +
"rdwjjdbcxwstszyljpxloclgpcjfzljyl c cnlckxtpzjwcyx" +
"wfzdknjcjlltqcbxnw xbxklylhzlqzllzxwjljjjgcmngjdzx" +
"txcxyxjjxsjtstp ghtxdfptffllxqpk fzflylybqjhzbmddb" +
"cycld tddqlyjjwqllcsjpyyclttjpycmgyxzhsztwqwrfzhjg" +
"azmrhcyy ptdlybyznbbxyxhzddnh msgbwfzzjcyxllrzcyxz" +
"lwjgcggnycpmzqzhfgtcjeaqcpjcs dczdwldfrypysccwbxgz" +
"mzztqscpxxjcjychcjwsnxxwjn mt mcdqdcllwnk zgglcczm" +
"lbqjqdsjzzghqywbzjlttdhhcchflsjyscgc zjbypbpdqkxwy" +
"yflxncwcxbmaykkjwzzzrxy yqjfljphhhytzqmhsgzqwbwjdy" +
"sqzxslzyymyszg x hysyscsyznlqyljxcxtlwdqzpcycyppnx" +
"fyrcmsmslxglgctlxzgz g tc dsllyxmtzalcpxjtjwtcyyjb" +
"lbzlqmylxpghdlssdhbdcsxhamlzpjmcnhjysygchskqmc lwj" +
"xsmocdrlyqzhjmyby lyetfjfrfksyxftwdsxxlysjslyxsnxy" +
"yxhahhjzxwmljcsqlkydztzsxfdxgzjksxybdpwnzwpczczeny" +
"cxqfjykbdmljqq lxslyxxylljdzbsmhpsttqqwlhogyblzzal" +
"xqlzerrqlstmypyxjjxqsjpbryxyjlxyqylthylymlkljt llh" +
"fzwkhljlhlj klj tlqxylmbtxchxcfxlhhhjbyzzkbxsdqc j" +
"zsyhzxfebcqwyyjqtzyqhqqzmwffhfrbntpcjlfzgppxdbbztg" +
" gchmfly xlxpqsywmngqlxjqjtcbhxspxlbyyjddhsjqyjxll" +
"dtkhhbfwdysqrnwldebzwcydljtmxmjsxyrwfymwrxxysztzzt" +
"ymldq xlyq jtscxwlprjwxhyphydnxhgmywytzcs tsdlwdcq" +
"pyclqyjwxwzzmylclmxcmzsqtzpjqblgxjzfljjytjnxmcxs c" +
"dl dyjdqcxsqyclzxzzxmxqrjhzjphfljlmlqnldxzlllfypny" +
"ysxcqqcmjzzhnpzmekmxkyqlxstxxhwdcwdzgyyfpjzdyzjzx " +
"rzjchrtlpyzbsjhxzypbdfgzzrytngxcqy b cckrjjbjerzgy" +
" xknsjkljsjzljybzsqlbcktylccclpfyadzyqgk tsfc xdk" +
"dyxyfttyh wtghrynjsbsnyjhkllslydxxwbcjsbbpjzjcjdz" +
"bfxxbrjlaygcsndcdszblpz dwsbxbcllxxlzdjzsjy lyxfff" +
"bhjjxgbygjpmmmpssdzjmtlyzjxswxtyledqpjmygqzjgdblqj" +
"wjqllsdgytqjczcjdzxqgsgjhqxnqlzbxsgzhcxy ljxyxydfq" +
"qjjfxdhctxjyrxysqtjxyebyyssyxjxncyzxfxmsyszxy schs" +
"hxzzzgzcgfjdltynpzgyjyztyqzpbxcbdztzc zyxxyhhsqxsh" +
"dhgqhjhgxwsztmmlhyxgcbtclzkkwjzrclekxtdbcykqqsayxc" +
"jxwwgsbhjyzs csjkqcxswxfltynytpzc czjqtzwjqdzzzqz" +
"ljjxlsbhpyxxpsxshheztxfptjqyzzxhyaxncfzyyhxgnxmywx" +
"tcspdhhgymxmxqcxtsbcqsjyxxtyyly pclmmszmjzzllcogxz" +
"aajzyhjmzxhdxzsxzdzxleyjjzjbhzmzzzqtzpsxztdsxjjlny" +
"azhhyysrnqdthzhayjyjhdzjzlsw cltbzyecwcycrylcxnhzy" +
"dzydtrxxbzsxqhxjhhlxxlhdlqfdbsxfzzyychtyyjbhecjkgj" +
"fxhzjfxhwhdzfyapnpgnymshk mamnbyjtmxyjcthjbzyfcgty" +
"hwphftwzzezsbzegpbmtskftycmhbllhgpzjxzjgzjyxzsbbqs" +
"czzlzccstpgxmjsftcczjz djxcybzlfcjsyzfgszlybcwzzby" +
"zdzypswyjgxzbdsysxlgzybzfyxxxccxtzlsqyxzjqdcztdxzj" +
"jqcgxtdgscxzsyjjqcc ldqztqchqqjzyezwkjcfypqtynlmkc" +
"qzqzbqnyjddzqzxdpzjcdjstcjnxbcmsjqmjqwwjqnjnlllwqc" +
"qqdzpzydcydzcttf znztqzdtjlzbclltdsxkjzqdpzlzntjxz" +
"bcjltqjldgdbbjqdcjwynzyzcdwllxwlrxntqqczxkjld tdgl" +
" lajjkly kqll dz td ycggjyxdxfrskstqdenqmrkq hgkd" +
"ldazfkypbggpzrebzzykyqspegjjglkqzzzslysywqzwfqzylz" +
"zlzhwcgkyp qgnpgblplrrjyxcccyyhsbzfybnyytgzxylxczw" +
"h zjzblfflgskhyjzeyjhlplllldzlyczblcybbxbcbpnnzc r" +
" sycgyy qzwtzdxtedcnzzzty hdynyjlxdjyqdjszwlsh lbc" +
"zpyzjyctdyntsyctszyyegdw ycxtscysmgzsccsdslccrqxyy" +
"elsm xztebblyylltqsyrxfkbxsychbjbwkgskhhjh xgnlycd" +
"lfyljgbxqxqqzzplnypxjyqymrbsyyhkxxstmxrczzywxyhymc" +
"l lzhqwqxdbxbzwzmldmyskfmklzcyqyczqxzlyyzmddz ftqp" +
"czcyypzhzllytztzxdtqcy ksccyyazjpcylzyjtfnyyynrs y" +
"lmmnxjsmyb sljqyldzdpqbzzblfndsqkczfywhgqmrdsxycyt" +
"xnq jpyjbfcjdyzfbrxejdgyqbsrmnfyyqpghyjdyzxgr htk " +
"leq zntsmpklbsgbpyszbydjzsstjzytxzphsszsbzczptqfzm" +
"yflypybbjgxzmxxdjmtsyskkbzxhjcelbsmjyjzcxt mljshrz" +
"zslxjqpyzxmkygxxjcljprmyygadyskqs dhrzkqxzyztcghyt" +
"lmljxybsyctbhjhjfcwzsxwwtkzlxqshlyjzjxe mplprcglt " +
"zztlnjcyjgdtclklpllqpjmzbapxyzlkktgdwczzbnzdtdyqzj" +
"yjgmctxltgcszlmlhbglk njhdxphlfmkyd lgxdtwzfrjejz" +
"tzhydxykshwfzcqshknqqhtzhxmjdjskhxzjzbzzxympagjmst" +
"bxlskyynwrtsqlscbpspsgzwyhtlksssw hzzlyytnxjgmjszs" +
"xfwnlsoztxgxlsmmlbwldszylkqcqctmycfjbslxclzzclxxks" +
"bjqclhjpsqplsxxckslnhpsfqqytxy jzlqldtzqjzdyydjnzp" +
"d cdskjfsljhylzsqzlbtxxdgtqbdyazxdzhzjnhhqbyknxjjq" +
"czmlljzkspldsclbblzkleljlbq ycxjxgcnlcqplzlznjtzlx" +
"yxpxmyzxwyczyhzbtrblxlcczjadjlmmmsssmybhb kkbhrsxx" +
"jmxsdynzpelbbrhwghfchgm klltsjyycqltskywyyhywxbxq" +
"ywbawykqldq tmtkhqcgdqktgpkxhcpthtwthkshthlxyzyyda" +
"spkyzpceqdltbdssegyjq xcwxssbz dfydlyjcls yzyexcyy" +
"sdwnzajgyhywtjdaxysrltdpsyxfnejdy lxllqzyqqhgjhzyc" +
"shwshczyjxllnxzjjn fxmfpycyawddhdmczlqzhzyztldywll" +
"hymmylmbwwkxydtyldjpyw xjwmllsafdllyflb bqtzcqlj" +
"tfmbthydcqrddwr qnysnmzbyytbjhp ygtjahg tbstxkbtzb" +
"kldbeqqhqmjdyttxpgbktlgqxjjjcthxqdwjlwrfwqgwqhckry" +
"swgftgygbxsd wdfjxxxjzlpyyypayxhydqkxsaxyxgskqhykf" +
"dddpplcjlhqeewxksyykdbplfjtpkjltcyyhhjttpltzzcdlsh" +
"qkzjqyste eywyyzy xyysttjkllpwmcyhqgxyhcrmbxpllnqt" +
"jhyylfd fxzpsftljxxjbswyysksflxlpplbbblbsfxyzsylff" +
"fscjds tztryysyffsyzszbjtbctsbsdhrtjjbytcxyje xbne" +
"bjdsysykgsjzbxbytfzwgenhhhhzhhtfwgzstbgxklsty mtmb" +
"yxj skzscdyjrcwxzfhmymcxlzndtdh xdjggybfbnbpthfjaa" +
"xwfpxmyphdttcxzzpxrsywzdlybbjd qwqjpzypzjznjpzjlzt" +
" fysbttslmptzrtdxqsjehbzyj dhljsqmlhtxtjecxslzzspk" +
"tlzkqqyfs gywpcpqfhqhytqxzkrsg gsjczlptxcdyyzss qz" +
"slxlzmycbcqbzyxhbsxlzdltcdjtylzjyyzpzylltxjsjxhlbr" +
"ypxqzskswwwygyabbztqktgpyspxbjcmllxztbklgqkq lsktf" +
"xrdkbfpftbbrfeeqgypzsstlbtpszzsjdhlqlzpmsmmsxlqqnk" +
"nbrddnxxdhddjyyyfqgzlxsmjqgxytqlgpbqxcyzy drj gtdj" +
"yhqshtmjsbwplwhlzffny gxqhpltbqpfbcwqdbygpnztbfzj" +
"gsdctjshxeawzzylltyybwjkxxghlfk djtmsz sqynzggswqs" +
"phtlsskmcl yszqqxncjdqgzdlfnykljcjllzlmzjn scht" +
"hxzlzjbbhqzwwycrdhlyqqjbeyfsjxwhsr wjhwpslmssgztt" +
"ygyqqwr lalhmjtqjcmxqbjjzjxtyzkxbyqxbjxshzssfjlxmx" +
" fghkzszggylcls rjyhslllmzxelgl xdjtbgyzbpktzhkzj" +
"yqsbctwwqjpqwxhgzgdyfljbyfdjf hsfmbyzhqgfwqsyfyjgp" +
"hzbyyzffwodjrlmftwlbzgycqxcdj ygzyyyyhy xdwegazyhx" +
"jlzythlrmgrxxzcl ljjtjtbwjybjjbxjjtjteekhwslj lp" +
"sfyzpqqbdlqjjtyyqlyzkdksqj yyqzldqtgjj js cmraqth" +
"tejmfctyhypkmhycwj cfhyyxwshctxrljhjshccyyyjltktty" +
"tmxgtcjtzaxyoczlylbszyw jytsjyhbyshfjlygjxxtmzyylt" +
"xxypzlxyjzyzyybnhmymdyylblhlsyygqllscxlxhdwkqgyshq" +
"ywljyyhzmsljljxcjjyy cbcpzjmylcqlnjqjlxyjmlzjqlycm" +
"hcfmmfpqqmfxlmcfqmm znfhjgtthkhchydxtmqzymyytyyyzz" +
"dcymzydlfmycqzwzz mabtbcmzzgdfycgcytt fwfdtzqssstx" +
"jhxytsxlywwkxexwznnqzjzjjccchyyxbzxzcyjtllcqxynjyc" +
"yycynzzqyyyewy czdcjyhympwpymlgkdldqqbchjxy " +
" " +
" sypszsjczc cqytsjljjt ";

/**
* Returns the pinyin initial of every character in a string.
* The lookup data is large, so the complete GBK table is split into three
* parts following the GBK specification: GBK/2 holds the GB2312-compatible
* Chinese characters, and GBK/3 and GBK/4 hold the extension characters.
* Each part has its own index formula.
* ASCII input is returned unchanged.
* A Chinese character is replaced by its pinyin initial.
* A Chinese character whose pronunciation is unknown yields a blank entry
* (leading and trailing blanks are trimmed from the result).
* @param hzString the text to convert
* @return the pinyin initials, in lower case
* @throws UnsupportedEncodingException if the JVM does not provide the GBK charset
*/
public static String getGBKpy(String hzString) throws UnsupportedEncodingException {
    /*
     * Performance: processing a large string (132055 bytes, i.e. 70577 chars)
     * 1000 times took 44.474 s.
     */
    if (hzString == null || hzString.length() == 0)
        return "";

    StringBuilder pyBuffer = new StringBuilder();
    byte[] eB = hzString.getBytes("GBK");
    int len = eB.length;

    // Walk the GBK bytes
    int pyi = 0;
    while (pyi < len) {
        int ch1code = eB[pyi] & 0xFF; // unsigned value of the first byte
        pyi = pyi + 1;
        if (ch1code < 128) {
            // Plain ASCII
            pyBuffer.append((char) ch1code);
            continue;
        }
        // GBK double-byte character: stop if the second byte is missing
        if (pyi >= len)
            break;
        int ch2code = eB[pyi] & 0xFF;
        pyi = pyi + 1;

        String table;
        int no;
        if (ch1code >= 176 && ch1code <= 247 && ch2code > 160) {
            // GB2312 Chinese characters (GBK/2) are handled first
            table = GB_2312;
            no = (ch1code - 176) * 94 + (ch2code - 160);
        } else if (ch1code >= 170 && ch1code <= 254 && ch2code <= 160) {
            // Look up GBK_4
            table = GBK_4;
            no = (ch1code - 170) * 97 + (ch2code - 63);
        } else if (ch1code >= 129 && ch1code <= 160) {
            // Look up GBK_3
            table = GBK_3;
            no = (ch1code - 129) * 191 + (ch2code - 63);
        } else {
            // Not a GBK Chinese character
            continue;
        }
        // Skip codes that fall outside the table instead of throwing
        if (no >= 1 && no <= table.length())
            pyBuffer.append(table.charAt(no - 1));
    }
    return pyBuffer.toString().trim().toLowerCase();
}

public static void main(String[] args) throws Exception{
System.out.println(GetPy.getGBKpy("光谷金融港镕軍國"));
}
}
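
The three index formulas used by `getGBKpy` can be isolated so the arithmetic is easy to check without the lookup strings. The sketch below is a hypothetical helper (the class `PinyinIndex` and its method names are not part of the original class) that returns the zero-based offset into each table, which is the same value as `no - 1` in the listing.

```java
// Hypothetical helper (not from the original post): zero-based table offsets.
final class PinyinIndex {
    // GB_2312 table: first byte 0xB0-0xF7, second byte 0xA1-0xFE, 94 per row.
    static int gb2312Index(int lead, int trail) {
        return (lead - 0xB0) * 94 + (trail - 0xA1);
    }

    // GBK_3 table: first byte 0x81-0xA0, second byte 0x40-0xFE, 191 slots per row.
    static int gbk3Index(int lead, int trail) {
        return (lead - 0x81) * 191 + (trail - 0x40);
    }

    // GBK_4 table: first byte 0xAA-0xFE, second byte 0x40-0xA0, 97 slots per row.
    static int gbk4Index(int lead, int trail) {
        return (lead - 0xAA) * 97 + (trail - 0x40);
    }

    public static void main(String[] args) {
        System.out.println(gb2312Index(0xB0, 0xA1)); // 0
        System.out.println(gb2312Index(0xB0, 0xC5)); // 36
        System.out.println(gbk3Index(0x81, 0x40));   // 0
        System.out.println(gbk4Index(0xAA, 0x40));   // 0
    }
}
```

The last code of each area gives the size each table needs: offset 6767 for `GB_2312` (0xF7FE), 6111 for `GBK_3` (0xA0FE), and 8244 for `GBK_4` (0xFEA0). The row widths 191 and 97 include a slot for the unused second byte 0x7F, so the tables need a placeholder at that position.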