
[Java] Reinforcing the Basics (char and String) with Examples (Part 1)

2017-03-27 23:02
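Recently, on a project, I used replaceAll() to strip the dots out of a string and did not get the result I expected. I made a note of the problem at the time, and since I happen to be free today (no work, hooray), I am taking the chance to write it up properly. The blog has gone more than a week without an update, so it is good to be back.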

Test code first!

The code below falls into three parts:

(1) The difference between replace and replaceAll

(2) Working with char values

(3) Understanding the String class

Code:

package test;

public class replaceTest {
    public static void main(String[] args) {
        /* Part 1: replace vs. replaceAll (related topics: CharSequence, Pattern, Matcher; see the source code for the exact differences) */
        String literal_s = "\\ab\\.";
        System.out.println(literal_s.replace("\\.", ""));    // prints \ab : the literal two-character text \. is removed, not just the dot
        System.out.println(literal_s.replaceAll("\\.", "")); // prints \ab\ : the regex \. matches only the dot
        String s = "192.168.102.1";
        System.out.println(s.replace(".", ""));    // prints 1921681021
        System.out.println(s);                     // prints 192.168.102.1: replace returned a new string and left s unchanged
        System.out.println(s.replaceAll(".", "")); // prints an empty line: in a regex, . matches any character, so everything is replaced
        // replaceAll(String regex, String replacement) and split(String regex) both take a regex;
        // an unescaped . matches any character, so this split returns an empty array
        String[] split_arr = s.split(".");
        System.out.println(s.replaceAll("\\.", "")); // prints 1921681021
        /* Part 2: the char type */
        char c = '\\';
        // char b = '\'; // does not compile: the backslash must be escaped
        char d = '/';
        String zhuanyi_s = "\\a"; // length 2
        System.out.println(zhuanyi_s.length()); // prints 2
        char[] c_arr = zhuanyi_s.toCharArray(); // array contents: [\, a]
        /* Part 3: String (related topics: stack, heap, constant pool, references, values) */
        String a = "a";
        String a_1 = "a";
        System.out.println(a == a_1); // true: both refer to the same object in the string constant pool
        String b_obj = new String("test");
        String b_obj_copy = new String("test");
        System.out.println(b_obj == b_obj_copy); // false: two distinct objects
    }
}
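The test code assigns the result of split(".") but never prints it. The following is a minimal sketch of what that call returns, next to two escaped alternatives. The class name SplitDotDemo and the helper countParts are made up for this illustration and are not part of the original test.

```java
import java.util.regex.Pattern;

public class SplitDotDemo {
    // Hypothetical helper: how many pieces does splitting s on the given regex produce?
    public static int countParts(String s, String regex) {
        return s.split(regex).length;
    }

    public static void main(String[] args) {
        String s = "192.168.102.1";
        // An unescaped "." matches every character, so every piece is empty;
        // split discards trailing empty strings, which leaves an empty array.
        System.out.println(countParts(s, "."));                // 0
        // Escaping the dot turns it into a literal separator.
        System.out.println(countParts(s, "\\."));              // 4
        // Pattern.quote produces an escaped form of the separator.
        System.out.println(countParts(s, Pattern.quote("."))); // 4
    }
}
```

Pattern.quote is handy when the separator comes from a variable and may contain regex metacharacters.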


1. The difference between replace and replaceAll

1.1 Source code of replace:

/**
* Replaces each substring of this string that matches the literal target
* sequence with the specified literal replacement sequence. The
* replacement proceeds from the beginning of the string to the end, for
* example, replacing "aa" with "b" in the string "aaa" will result in
* "ba" rather than "ab".
*
* @param  target The sequence of char values to be replaced
* @param  replacement The replacement sequence of char values
* @return  The resulting string
* @since 1.5
*/
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
            this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}


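Looking inside replace, the target is treated as a literal string: its characters no longer carry any special regular-expression meaning. Pattern.LITERAL is documented as follows: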

/**
* Enables literal parsing of the pattern.
*
* <p> When this flag is specified then the input string that specifies
* the pattern is treated as a sequence of literal characters.
* Metacharacters or escape sequences in the input sequence will be
* given no special meaning.
*
* <p>The flags CASE_INSENSITIVE and UNICODE_CASE retain their impact on
* matching when used in conjunction with this flag. The other flags
* become superfluous.
*
* <p> There is no embedded flag character for enabling literal parsing.
* @since 1.5
*/
public static final int LITERAL = 0x10;


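In other words, this flag makes the target string be interpreted literally. Some characters carry special meaning in a regular expression: for example, . matches any character, and \. (written "\\." in a Java string literal) matches a literal dot. When \. is passed as the target of replace(CharSequence target, CharSequence replacement), however, it stands for the two characters \ and . rather than for a dot. That is why, in the test code above, replace turns literal_s into \ab, whereas replaceAll turns it into \ab\.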
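The effect of the flag can be seen directly by compiling the same pattern text with and without Pattern.LITERAL. This is only a sketch: the class LiteralFlagDemo and the helper mask are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

public class LiteralFlagDemo {
    // Hypothetical helper: replace every match of pattern in input with "#".
    // flags is handed straight to Pattern.compile (0 means no flags).
    public static String mask(String input, String pattern, int flags) {
        return Pattern.compile(pattern, flags).matcher(input).replaceAll("#");
    }

    public static void main(String[] args) {
        // As a regex, "." matches each character; as a literal, only the dot.
        System.out.println(mask("a.b", ".", 0));               // ###
        System.out.println(mask("a.b", ".", Pattern.LITERAL)); // a#b

        // The input below is the five characters a . b \ .
        // As a regex, \. matches each dot; as a literal, only the two-character text \.
        System.out.println(mask("a.b\\.", "\\.", 0));               // a#b\#
        System.out.println(mask("a.b\\.", "\\.", Pattern.LITERAL)); // a.b#
    }
}
```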

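Notes:

The metacharacters mentioned above are regular-expression metacharacters. For background, see:

(1) Regular expressions, metacharacters: http://www.runoob.com/regexp/regexp-metachar.html

(2) String, StringBuilder and similar classes implement the CharSequence interface. For more on CharSequence, see:

String, CharSequence, StringBuilder and StringBuffer compared: http://www.fengfly.com/plus/view-214077-1.html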

1.2 Source code of replaceAll

This is replaceAll in the String class; do not confuse it with replaceAll in the Matcher class.

/**
* Replaces each substring of this string that matches the given <a
* href="../util/regex/Pattern.html#sum">regular expression</a> with the
* given replacement.
*
* <p> An invocation of this method of the form
* <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
* yields exactly the same result as the expression
*
* <blockquote>
* <code>
* {@link java.util.regex.Pattern}.{@link
* java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
* java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
* java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
* </code>
* </blockquote>
*
*<p>
* Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
* replacement string may cause the results to be different than if it were
* being treated as a literal replacement string; see
* {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
* Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
* meaning of these characters, if desired.
*
* @param   regex
*          the regular expression to which this string is to be matched
* @param   replacement
*          the string to be substituted for each match
*
* @return  The resulting {@code String}
*
* @throws  PatternSyntaxException
*          if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}


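As the source shows, the regex parameter of replaceAll is a regular-expression string, so the argument is compiled and matched as a regular expression (whereas String.replace takes its target literally). Compare this with the test code above to see the difference.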
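For the original problem of stripping the dots out of an address, the following sketch lines up four ways that all give the same result. The class and method names are illustrative only.

```java
import java.util.regex.Pattern;

public class ReplaceAllDemo {
    // Escape the dot by hand so the regex matches a literal dot.
    public static String stripDotsEscaped(String s) {
        return s.replaceAll("\\.", "");
    }

    // Let Pattern.quote build the escaped regex.
    public static String stripDotsQuoted(String s) {
        return s.replaceAll(Pattern.quote("."), "");
    }

    // Skip regex syntax altogether with the literal replace.
    public static String stripDotsLiteral(String s) {
        return s.replace(".", "");
    }

    // The expanded form that the replaceAll Javadoc describes as equivalent.
    public static String stripDotsExpanded(String s) {
        return Pattern.compile("\\.").matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String s = "192.168.102.1";
        System.out.println(stripDotsEscaped(s));  // 1921681021
        System.out.println(stripDotsQuoted(s));   // 1921681021
        System.out.println(stripDotsLiteral(s));  // 1921681021
        System.out.println(stripDotsExpanded(s)); // 1921681021
    }
}
```

When no regex feature is needed, the literal replace is probably the least error-prone choice.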

Summary:

As the two functions show, both end up calling Matcher.replaceAll. The difference is that String.replace compiles target with Pattern.LITERAL and, before calling Matcher.replaceAll, passes the replacement argument through Matcher.quoteReplacement(replacement.toString()). The source of quoteReplacement is:

/**
* Returns a literal replacement <code>String</code> for the specified
* <code>String</code>.
*
* This method produces a <code>String</code> that will work
* as a literal replacement <code>s</code> in the
* <code>appendReplacement</code> method of the {@link Matcher} class.
* The <code>String</code> produced will match the sequence of characters
* in <code>s</code> treated as a literal sequence. Slashes ('\') and
* dollar signs ('$') will be given no special meaning.
*
* @param  s The string to be literalized
* @return  A literal string replacement
* @since 1.5
*/
public static String quoteReplacement(String s) {
    if ((s.indexOf('\\') == -1) && (s.indexOf('$') == -1))
        return s;
    StringBuilder sb = new StringBuilder();
    for (int i=0; i<s.length(); i++) {
        char c = s.charAt(i);
        if (c == '\\' || c == '$') {
            sb.append('\\');
        }
        sb.append(c);
    }
    return sb.toString();
}


As the source shows, if replacement contains a \ or $ character, quoteReplacement prefixes it with \ to escape it.
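A short sketch of why this escaping matters when the replacement text itself contains a dollar sign. The template text, the PRICE placeholder and all method names below are invented for the example.

```java
import java.util.regex.Matcher;

public class QuoteReplacementDemo {
    // Thin wrapper so the escaped form can be inspected.
    public static String quoted(String replacement) {
        return Matcher.quoteReplacement(replacement);
    }

    // Fill the hypothetical PRICE placeholder, treating price as plain text.
    public static String fillPrice(String template, String price) {
        return template.replaceAll("PRICE", Matcher.quoteReplacement(price));
    }

    // Returns true if using the raw replacement makes replaceAll throw.
    public static boolean rawReplacementFails(String template, String price) {
        try {
            template.replaceAll("PRICE", price);
            return false;
        } catch (IndexOutOfBoundsException | IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(quoted("$5"));                             // \$5
        System.out.println(fillPrice("cost: PRICE", "$5"));           // cost: $5
        // String.replace quotes the replacement internally, so it is safe too.
        System.out.println("cost: PRICE".replace("PRICE", "$5"));     // cost: $5
        // Unquoted, $5 is read as a reference to group 5, which does not exist.
        System.out.println(rawReplacementFails("cost: PRICE", "$5")); // true
    }
}
```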

The source of Matcher.replaceAll is as follows:

/**
* Replaces every subsequence of the input sequence that matches the
* pattern with the given replacement string.
*
* <p> This method first resets this matcher.  It then scans the input
* sequence looking for matches of the pattern.  Characters that are not
* part of any match are appended directly to the result string; each match
* is replaced in the result by the replacement string.  The replacement
* string may contain references to captured subsequences as in the {@link
* #appendReplacement appendReplacement} method.
*
* <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
* the replacement string may cause the results to be different than if it
* were being treated as a literal replacement string. Dollar signs may be
* treated as references to captured subsequences as described above, and
* backslashes are used to escape literal characters in the replacement
* string.
*
* <p> Given the regular expression <tt>a*b</tt>, the input
* <tt>"aabfooaabfooabfoob"</tt>, and the replacement string
* <tt>"-"</tt>, an invocation of this method on a matcher for that
* expression would yield the string <tt>"-foo-foo-foo-"</tt>.
*
* <p> Invoking this method changes this matcher's state.  If the matcher
* is to be used in further matching operations then it should first be
* reset.  </p>
*
* @param  replacement
*         The replacement string
*
* @return  The string constructed by replacing each matching subsequence
*          by the replacement string, substituting captured subsequences
*          as needed
*/
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}


The following example helps in understanding Matcher.replaceAll.

Example code:

/* Understanding replaceAll in Matcher */
// requires java.util.regex.Pattern and java.util.regex.Matcher
Pattern p1 = Pattern.compile("cat");
Matcher m1 = p1.matcher("one cat two cats in the yard");
System.out.println(m1.replaceAll("\\$2")); // prints: one $2 two $2s in the yard
// Throws java.lang.IndexOutOfBoundsException: No group 2. The exception comes from appendReplacement, which replaceAll calls; see its source code.
System.out.println(m1.replaceAll("$2"));
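The exception above arises because the pattern "cat" has no capturing groups. The sketch below adds one, so that a group reference becomes valid, and also spells out the find / appendReplacement / appendTail loop by hand. GroupReferenceDemo and replaceAllManually are hypothetical names, and the loop is a simplified imitation of Matcher.replaceAll, not the JDK code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReferenceDemo {
    // Hand-written imitation of the loop inside Matcher.replaceAll.
    public static String replaceAllManually(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the processed replacement.
            m.appendReplacement(sb, replacement);
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "one cat two cats in the yard";
        // (cat) defines group 1, so $1 refers to the matched text.
        System.out.println(text.replaceAll("(cat)", "<$1>"));          // one <cat> two <cat>s in the yard
        System.out.println(replaceAllManually(text, "(cat)", "<$1>")); // same output
        // An escaped dollar sign is inserted as plain text.
        System.out.println(replaceAllManually(text, "cat", "\\$2"));   // one $2 two $2s in the yard
    }
}
```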


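2. Understanding char

char c = '\\'; // defines the backslash character (correct form)
// char b = '\'; // this form does not compile
char d = '/';
String zhuanyi_s = "\\a"; // length 2
char[] c_arr = zhuanyi_s.toCharArray(); // array contents: [\, a]


From the above we can see:

(1) '\\' denotes a single backslash character, so it has length 1.

(2) Internally, the String class is backed by a char[] array (in the JDK version quoted here; since JDK 9 the backing array is a byte[]).

According to the source, String has a private char[] member named value. One commonly used constructor is: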

/**
* Initializes a newly created {@code String} object so that it represents
* the same sequence of characters as the argument; in other words, the
* newly created string is a copy of the argument string. Unless an
* explicit copy of {@code original} is needed, use of this constructor is
* unnecessary since Strings are immutable.
*
* @param  original
*         A {@code String}
*/
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
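The points about escaping and about the character array can be checked from the outside, without touching the private value field. The sketch below relies only on public String methods; the class and helper names are made up for the illustration.

```java
public class CharArrayDemo {
    // The char '\\' is one character; widened to int it is its code point.
    public static int backslashCode() {
        return '\\';
    }

    // Build a String from src, then modify src, and return the String.
    public static String fromArrayThenMutate(char[] src) {
        String built = new String(src); // the constructor copies the array
        src[0] = 'z';
        return built;
    }

    public static void main(String[] args) {
        System.out.println(backslashCode()); // 92
        System.out.println("\\a".length());  // 2

        String s = "\\a";
        char[] chars = s.toCharArray(); // a fresh copy of the characters
        chars[0] = '/';
        System.out.println(s);                 // \a  (unchanged)
        System.out.println(new String(chars)); // /a

        System.out.println(fromArrayThenMutate(new char[] {'a', 'b'})); // ab
    }
}
```

Because both directions copy, code outside the class never gets a handle on a String's internal array, which is part of what keeps String immutable.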


3. Understanding the String class

Test code:

String a = "a";
String a_new = new String("a");
System.out.println(a == a_new); // false
String a_1 = "a";
System.out.println(a == a_1); // true: both refer to the same object in the string constant pool
String b_obj = new String("test");
String b_obj_copy = new String("test");
System.out.println(b_obj == b_obj_copy); // false: two distinct objects


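Comparing the references a and a_1 shows that both refer to the same object: they share a single string in the constant pool.

b_obj and b_obj_copy show that each new creates a separate object, so the two references differ. (An open question: do these two objects still share the same underlying character data? The constructor quoted in section 2 assigns this.value = original.value, which suggests that in that JDK version they do. I will come back to this once I understand it better.)

Conclusions:

(1) Two variables assigned the same string literal refer to the same object.

(2) Two variables each created with a constructor refer to different objects, even when their contents are equal.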
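The two conclusions can be put side by side with equals and intern, which the test code does not use. This is a minimal sketch; StringIdentityDemo and sameObject are names invented for it.

```java
public class StringIdentityDemo {
    // Reference comparison: true only if x and y are the very same object.
    public static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String literal = "test";
        String built = new String("test");

        System.out.println(sameObject(literal, "test"));         // true: same pooled literal
        System.out.println(sameObject(literal, built));          // false: new creates a separate object
        System.out.println(literal.equals(built));               // true: same contents
        System.out.println(sameObject(literal, built.intern())); // true: intern returns the pooled object
    }
}
```

For comparing contents, equals is the method to use; == only answers whether two references point at one object.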

One open question:

How is the value member of the String class initialized, and where does this char array come from?

Technical discussion is welcome.

wkang1993@outlook.com