
Converting between GB codes and Unicode in Java: discussion welcome

2007-05-09 16:58
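Character encoding matters a great deal in Java. Below is a piece of code that converts between GB codes and Unicode. Comments on its strengths, weaknesses, and possible improvements are welcome.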

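import java.io.*;
import java.util.Hashtable;

/**
 * Converts between two-byte GB Chinese characters and Unicode,
 * using a mapping table loaded from a file.
 *
 * @author Administrator
 */
class GBUnicode {
    // GB high byte, GB low byte and Unicode value of each of the 6763 table entries
    byte high[] = new byte[6763], low[] = new byte[6763];
    char unichar[] = new char[6763];
    // Maps a Unicode character to its index in the arrays above
    Hashtable<Character, Integer> UniGB;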

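    /**
     * Loads the mapping table. As the parsing below implies, each entry is the two
     * GB bytes of a character, then ':', then its Unicode value in decimal. The 6763
     * entries must be in GB code order, because GB2Uni looks characters up by position.
     */
    public GBUnicode(String table_file) throws IOException
    {
        DataInputStream tables = new DataInputStream(
                new BufferedInputStream(new FileInputStream(table_file)));
        int i, n = 0;
        byte b, bl, bh, num[] = new byte[20];

        UniGB = new Hashtable<Character, Integer>(7000, 1);
        try {
            while (n < 6763) {
                // readByte() throws EOFException on a truncated table;
                // (byte)read() would spin forever in the ':' search on the -1 returned at end of file
                do {
                    bh = tables.readByte();
                } while ((char) bh <= ' '); // find first non-blank char
                bl = tables.readByte();
                high[n] = bh;
                low[n] = bl;
                do {
                    b = tables.readByte();
                } while (b != (byte) ':'); // find ':'
                do {
                    b = tables.readByte();
                } while ((char) b <= ' '); // find next non-blank char to read as number
                i = 0;
                while ((char) b >= '0' && (char) b <= '9') {
                    num[i++] = b;
                    b = (byte) tables.read(); // -1 at end of file simply ends the number
                }
                unichar[n] = (char) Integer.parseInt(new String(num, 0, i, "US-ASCII"));
                if (UniGB.get(Character.valueOf(unichar[n])) != null)
                    System.out.println("Duplicated : " + unichar[n]);
                UniGB.put(Character.valueOf(unichar[n]), Integer.valueOf(n));
                n = n + 1;
            }
        } finally {
            tables.close();
        }
    }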

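    // Maps a GB code to its position in the table (0..6762), or -1 if it is not a GB Chinese character
    private int getGBindex(byte high, byte low) {
        int i, j;
        i = (high & 0xFF) - 0xB0; // row: high bytes 0xB0..0xF7 give 0..71
        j = (low & 0xFF) - 0xA1;  // column: low bytes 0xA1..0xFE give 0..93
        if (i < 0 || i > 71 || j < 0 || j > 93)
            return -1;
        if (i < 39) { // L1 Chinese
            return (i * 94 + j);
        }
        else if (i == 39) { // last row of L1 Chinese holds only 89 characters
            if (j > 88)
                return -1;
            return (i * 94 + j);
        }
        else { // L2 Chinese, shifted down by the 5 unused codes at the end of row 39
            return (i * 94 + j - 5);
        }
    }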

    // Returns the two GB bytes for a Unicode character, or null if it is not in the table
    public byte[] Uni2GB(char unicode) {
        Integer index = (Integer) UniGB.get(Character.valueOf(unicode));
        if (index == null)
            return null;
        byte ch[] = new byte[2];
        ch[0] = high[index.intValue()];
        ch[1] = low[index.intValue()];
        return ch;
    }

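    // Returns the Unicode character for a GB code, or 0 if the bytes are not a GB Chinese character
    public char GB2Uni(byte high, byte low) {
        int index = getGBindex(high, low);
        if (index == -1) // not GB Chinese
            return 0;
        return (unichar[index]);
    }
}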
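
One possible improvement, offered as a sketch and not as a drop-in replacement: the JDK can usually do this conversion itself through `java.nio.charset.Charset`, with no external table file. The code below assumes the running JDK ships the optional "GB2312" charset (it is not one of the charsets every Java platform must support). The class name `GbCharsetSketch` and its method names are made up for illustration.

```java
import java.nio.charset.Charset;

public class GbCharsetSketch {
    // Assumption: the "GB2312" charset is available in this JDK;
    // Charset.forName throws UnsupportedCharsetException if it is not.
    static final Charset GB = Charset.forName("GB2312");

    // Unicode -> GB bytes. Unlike Uni2GB above, an unmappable character
    // does not give null: getBytes substitutes the charset's replacement byte.
    static byte[] uni2gb(char c) {
        return String.valueOf(c).getBytes(GB);
    }

    // GB bytes -> Unicode. Malformed input decodes to a replacement
    // character instead of the 0 returned by GB2Uni above.
    static char gb2uni(byte high, byte low) {
        return new String(new byte[] {high, low}, GB).charAt(0);
    }

    public static void main(String[] args) {
        byte[] gb = uni2gb('\u554A'); // the first Chinese character of the GB table
        System.out.printf("%02X %02X%n", gb[0] & 0xFF, gb[1] & 0xFF); // prints "B0 A1"
        System.out.println((int) gb2uni((byte) 0xB0, (byte) 0xA1));   // prints 21834
    }
}
```

The hand-written table is still useful when you need to control exactly which characters are accepted, or when the charset is missing from a stripped-down runtime.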