Solving Java internationalization problems with native2ascii
2006-06-07 08:58
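The JDK ships with a handy tool, native2ascii, which converts between
native characters and Unicode escapes according to a specified encoding.
This article gives a few examples of how to use it.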
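1. Interactive mode:
Run native2ascii with no arguments. The cursor stops on the next line,
where you can type the characters to convert. For example, type "中国"
and press Enter, and the screen shows the corresponding Unicode escapes
"\u4e2d\u56fd".
Press Ctrl+C to exit.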
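The conversion shown in interactive mode can be approximated in a few lines of Java. This is only a rough sketch, not the tool's actual source: the class name Native2AsciiSketch and the method toAscii are made up for illustration, and the sketch assumes that every character outside 7-bit ASCII is escaped, using lowercase hex digits as in the output above.

```java
public class Native2AsciiSketch {

    // Replace each UTF-16 code unit above 0x7F with a backslash-u escape
    // made of four lowercase hex digits; ASCII passes through unchanged.
    // (Assumption: the real tool may treat some characters differently.)
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7f) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The article's two example characters, written as escapes so that
        // this source file itself stays pure ASCII.
        String input = "\u4e2d\u56fd";
        System.out.println(toAscii(input));
    }
}
```

The input string is written with escapes on purpose: a source file that contains only ASCII compiles the same way whatever encoding the compiler assumes, which is the problem native2ascii exists to solve.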
2. Full mode
1) Native char -> Unicode char
native2ascii -encoding XXX input_file output_file
First, the encoding parameter: on Windows it corresponds to the code
page. For example, 936 is Simplified Chinese, 950 is Traditional Chinese,
949 is Korean, and so on.
All code pages supported by the current OS can be looked up under this
registry key:
HKEY_CLASSES_ROOT\MIME\Database\Codepage
Example:
native2ascii -encoding MS936 c:/temp1.txt c:/temp2.txt
2) Unicode char -> Native char
native2ascii -reverse -encoding XXX input_file output_file
Example:
native2ascii -reverse -encoding MS936 c:/temp1.txt c:/temp2.txt
Note: if you specify an encoding, the matching fonts must be installed
for the converted file to display correctly.
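The -reverse direction can be sketched in Java in the same spirit. The class name ReverseSketch and the method fromAscii are hypothetical. The sketch only turns backslash-u escapes back into characters inside a string. Writing the result out in a native encoding such as MS936 is not covered here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReverseSketch {

    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Turn every escape sequence back into the character it denotes;
    // all other text is copied through unchanged.
    static String fromAscii(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or '\' in the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = fromAscii("\\u4e2d\\u56fd");
        // Compare against the expected characters instead of printing them,
        // since console output depends on the terminal's encoding.
        System.out.println(decoded.equals("\u4e2d\u56fd"));
    }
}
```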
The encoding names for each country are listed below.
SYNOPSIS
native2ascii [options] [inputfile [outputfile]]
DESCRIPTION
The Java compiler and other Java tools can only process files which contain Latin-1 and/or Unicode-encoded (\udddd notation) characters. native2ascii converts files which contain other character encodings into files containing Latin-1 and/or Unicode-encoded characters.
If outputfile is omitted, standard output is used for output. If, in addition, inputfile is omitted, standard input is used for input.
OPTIONS
-reverse
Perform the reverse operation: convert a file with Latin-1 and/or Unicode encoded characters to one with native-encoded characters.
-encoding encoding_name
Specify the encoding name which is used by the conversion procedure. The default encoding is taken from System property file.encoding. The encoding_name string must be a string taken from the first column of the table below.
-------------------------------------------------------------
Converter Class   Description
-------------------------------------------------------------
8859_1            ISO 8859-1
8859_2            ISO 8859-2
8859_3            ISO 8859-3
8859_4            ISO 8859-4
8859_5            ISO 8859-5
8859_6            ISO 8859-6
8859_7            ISO 8859-7
8859_8            ISO 8859-8
8859_9            ISO 8859-9
Big5              Big5, Traditional Chinese
CNS11643          CNS 11643, Traditional Chinese
Cp037             USA, Canada(Bilingual, French), Netherlands, Portugal, Brazil, Australia
Cp1006            IBM AIX Pakistan (Urdu)
Cp1025            IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia(FYR)
Cp1026            IBM Latin-5, Turkey
Cp1046            IBM Open Edition US EBCDIC
Cp1097            IBM Iran(Farsi)/Persian
Cp1098            IBM Iran(Farsi)/Persian (PC)
Cp1112            IBM Latvia, Lithuania
Cp1122            IBM Estonia
Cp1123            IBM Ukraine
Cp1124            IBM AIX Ukraine
Cp1125            IBM Ukraine (PC)
Cp1250            Windows Eastern European
Cp1251            Windows Cyrillic
Cp1252            Windows Latin-1
Cp1253            Windows Greek
Cp1254            Windows Turkish
Cp1255            Windows Hebrew
Cp1256            Windows Arabic
Cp1257            Windows Baltic
Cp1258            Windows Vietnamese
Cp1381            IBM OS/2, DOS People's Republic of China (PRC)
Cp1383            IBM AIX People's Republic of China (PRC)
Cp273             IBM Austria, Germany
Cp277             IBM Denmark, Norway
Cp278             IBM Finland, Sweden
Cp280             IBM Italy
Cp284             IBM Catalan/Spain, Spanish Latin America
Cp285             IBM United Kingdom, Ireland
Cp297             IBM France
Cp33722           IBM-eucJP - Japanese (superset of 5050)
Cp420             IBM Arabic
Cp424             IBM Hebrew
Cp437             MS-DOS United States, Australia, New Zealand, South Africa
Cp500             EBCDIC 500V1
Cp737             PC Greek
Cp775             PC Baltic
Cp838             IBM Thailand extended SBCS
Cp850             MS-DOS Latin-1
Cp852             MS-DOS Latin-2
Cp855             IBM Cyrillic
Cp857             IBM Turkish
Cp860             MS-DOS Portuguese
Cp861             MS-DOS Icelandic
Cp862             PC Hebrew
Cp863             MS-DOS Canadian French
Cp864             PC Arabic
Cp865             MS-DOS Nordic
Cp866             MS-DOS Russian
Cp868             MS-DOS Pakistan
Cp869             IBM Modern Greek
Cp870             IBM Multilingual Latin-2
Cp871             IBM Iceland
Cp874             IBM Thai
Cp875             IBM Greek
Cp918             IBM Pakistan(Urdu)
Cp921             IBM Latvia, Lithuania (AIX, DOS)
Cp922             IBM Estonia (AIX, DOS)
Cp930             Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026
Cp933             Korean Mixed with 1880 UDC, superset of 5029
Cp935             Simplified Chinese Host mixed with 1880 UDC, superset of 5031
Cp937             Traditional Chinese Host mixed with 6204 UDC, superset of 5033
Cp939             Japanese Latin Kanji mixed with 4370 UDC, superset of 5035
Cp942             Japanese (OS/2) superset of 932
Cp948             OS/2 Chinese (Taiwan) superset of 938
Cp949             PC Korean
Cp950             PC Chinese (Hong Kong, Taiwan)
Cp964             AIX Chinese (Taiwan)
Cp970             AIX Korean
EUCJIS            JIS, EUC Encoding, Japanese
GB2312            GB2312, EUC encoding, Simplified Chinese
GBK               GBK, Simplified Chinese
ISO2022CN         ISO 2022 CN, Chinese
ISO2022CN_CNS     CNS 11643 in ISO-2022-CN form, T. Chinese
ISO2022CN_GB      GB 2312 in ISO-2022-CN form, S. Chinese
ISO2022KR         ISO 2022 KR, Korean
JIS               JIS, Japanese
JIS0208           JIS 0208, Japanese
KOI8_R            KOI8-R, Russian
KSC5601           KS C 5601, Korean
MS874             Windows Thai
MacArabic         Macintosh Arabic
MacCentralEurope  Macintosh Latin-2
MacCroatian       Macintosh Croatian
MacCyrillic       Macintosh Cyrillic
MacDingbat        Macintosh Dingbat
MacGreek          Macintosh Greek
MacHebrew         Macintosh Hebrew
MacIceland        Macintosh Iceland
MacRoman          Macintosh Roman
MacRomania        Macintosh Romania
MacSymbol         Macintosh Symbol
MacThai           Macintosh Thai
MacTurkish        Macintosh Turkish
MacUkraine        Macintosh Ukraine
SJIS              Shift-JIS, Japanese
UTF8              UTF-8
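Which of these names a given JVM accepts, and what default it falls back to, can be checked from Java itself. The following is a minimal sketch. The class EncodingNameSketch and its helper canonicalName are made up for illustration, and it assumes that the java.nio.charset lookup is a reasonable stand-in for the names native2ascii accepts, which may not hold for every JDK version.

```java
import java.nio.charset.Charset;

public class EncodingNameSketch {

    // Return the canonical charset name for a name or alias,
    // or null if this JVM does not support it.
    static String canonicalName(String name) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name).name() : null;
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return null;
        }
    }

    public static void main(String[] args) {
        // The system property the man page names as the source of the default.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // A few names from the table; the result depends on the installed JDK.
        for (String name : new String[] {"UTF8", "GBK", "Big5", "SJIS"}) {
            System.out.println(name + " -> " + canonicalName(name));
        }
    }
}
```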