
[Java] Fundamentals Review (char and String) with Examples (Part 1)

2017-03-27 23:02

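Recently, while using replaceAll() in a project to strip the dots out of a string, I did not get the result I expected. I made a note of the problem at the time, and since I have today off (yay~), I am taking the chance to write it up. The blog has gone more than a week without an update, so it is good to be back!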

Test code first!

The code below has three parts:

(1) the difference between replace and replaceAll

(2) working with char values

(3) understanding the String class

Code:

package test;

public class replaceTest {
    public static void main(String[] args) {
        /* Part 1: replace vs. replaceAll (related topics: CharSequence, Pattern, Matcher; see the source code for the exact differences) */
        String literal_s = "\\ab\\.";
        System.out.println(literal_s.replace("\\.", ""));    // prints \ab : the two-character literal \. is removed, not just the dot
        System.out.println(literal_s.replaceAll("\\.", "")); // prints \ab\ : the regex \. matches only the dot
        String s = "192.168.102.1";
        System.out.println(s.replace(".", ""));      // prints 1921681021
        System.out.println(s);                       // prints 192.168.102.1 : replace returned a new string, s itself is unchanged
        System.out.println(s.replaceAll(".", ""));   // prints an empty line: as a regex, . matches any character, so everything is removed
        String[] split_arr = s.split(".");           // replaceAll(String regex, String replacement) and split(String regex) both take a regex; an unescaped . matches every character, so split_arr is empty (length 0)
        System.out.println(s.replaceAll("\\.", "")); // prints 1921681021
        /* Part 2: understanding char */
        char c = '\\';
        // char b = '\'; // this form does not compile
        char d = '/';
        String zhuanyi_s = "\\a";                    // length 2
        System.out.println(zhuanyi_s.length());      // prints 2
        char[] c_arr = zhuanyi_s.toCharArray();      // array contents: [\, a]
        /* Part 3: understanding String (related topics: stack, heap, constant pool, references, values) */
        String a = "a";
        String a_1 = "a";
        System.out.println(a == a_1);                // true: both refer to the same object in the string constant pool
        String b_obj = new String("test");
        String b_obj_copy = new String("test");
        System.out.println(b_obj == b_obj_copy);     // false: two distinct objects
    }
}

1. Differences between replace and replaceAll

1.1. Source of replace:

/**
* Replaces each substring of this string that matches the literal target
* sequence with the specified literal replacement sequence. The
* replacement proceeds from the beginning of the string to the end, for
* example, replacing "aa" with "b" in the string "aaa" will result in
* "ba" rather than "ab".
*
* @param  target The sequence of char values to be replaced
* @param  replacement The replacement sequence of char values
* @return  The resulting string
* @since 1.5
*/
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
            this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}

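Looking inside replace (in the JDK version quoted here), target is treated as a literal string: its characters no longer carry any regular-expression meaning. Pattern.LITERAL is documented as follows: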

/**
* Enables literal parsing of the pattern.
*
* <p> When this flag is specified then the input string that specifies
* the pattern is treated as a sequence of literal characters.
* Metacharacters or escape sequences in the input sequence will be
* given no special meaning.
*
* <p>The flags CASE_INSENSITIVE and UNICODE_CASE retain their impact on
* matching when used in conjunction with this flag. The other flags
* become superfluous.
*
* <p> There is no embedded flag character for enabling literal parsing.
* @since 1.5
*/
public static final int LITERAL = 0x10;

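In other words, with this flag the target string is taken at face value. Some characters have a special meaning in a regular expression: for example, . matches any character, and \. (written "\\." in a Java string literal) matches a literal dot. When "\\." is passed as the target of replace(CharSequence target, CharSequence replacement), however, it is treated as the two characters \ and ., not as an escaped dot. This is why, in the test code above, literal_s becomes \ab after replace but \ab\ after replaceAll.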
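The effect of Pattern.LITERAL can be seen in isolation by compiling the same pattern string with and without the flag. The following is a minimal sketch; the class name LiteralFlagDemo and its two helper methods are made up for illustration and are not part of the JDK.

```java
import java.util.regex.Pattern;

class LiteralFlagDemo {
    // Compiles `pattern` as a regular expression and removes every match from `input`.
    static String removeAsRegex(String input, String pattern) {
        return Pattern.compile(pattern).matcher(input).replaceAll("");
    }

    // Compiles `pattern` with Pattern.LITERAL, so metacharacters and escapes lose their meaning.
    static String removeAsLiteral(String input, String pattern) {
        return Pattern.compile(pattern, Pattern.LITERAL).matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String ip = "192.168.102.1";
        System.out.println(removeAsRegex(ip, "."));     // prints an empty line: . matches every character
        System.out.println(removeAsLiteral(ip, "."));   // prints 1921681021: only the dots are removed
        System.out.println(removeAsLiteral(ip, "\\.")); // prints 192.168.102.1: the text \. never occurs
    }
}
```

The last call mirrors the replace("\\.", "") case from the test code: with the literal flag, the backslash is just another character to search for.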

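Notes:

The metacharacters mentioned above are regular-expression metacharacters. For background, see:

(1) Regular expressions - metacharacters: http://www.runoob.com/regexp/regexp-metachar.html

(2) Classes such as String and StringBuilder implement the CharSequence interface. For more on CharSequence, see:

String, CharSequence, StringBuilder and StringBuffer compared: http://www.fengfly.com/plus/view-214077-1.html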

1.2. Source of replaceAll

This is String.replaceAll, not to be confused with Matcher.replaceAll.

/**
* Replaces each substring of this string that matches the given <a
* href="../util/regex/Pattern.html#sum">regular expression</a> with the
* given replacement.
*
* <p> An invocation of this method of the form
* <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
* yields exactly the same result as the expression
*
* <blockquote>
* <code>
* {@link java.util.regex.Pattern}.{@link
* java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
* java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
* java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
* </code>
* </blockquote>
*
*<p>
* Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
* replacement string may cause the results to be different than if it were
* being treated as a literal replacement string; see
* {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
* Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
* meaning of these characters, if desired.
*
* @param   regex
*          the regular expression to which this string is to be matched
* @param   replacement
*          the string to be substituted for each match
*
* @return  The resulting {@code String}
*
* @throws  PatternSyntaxException
*          if the regular expression's syntax is invalid
*
* @see java.util.regex.Pattern
*
* @since 1.4
* @spec JSR-51
*/
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}

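As the source shows, the regex parameter of replaceAll is a regular-expression string: the argument is compiled and matched as a regular expression, whereas String.replace treats its target literally. Comparing the results in the test code above makes the difference clear.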
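Since replaceAll and split both compile their first argument as a regular expression, a literal dot has to be escaped, either by hand or with Pattern.quote. The sketch below shows both options; the class DotHandlingDemo and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DotHandlingDemo {
    // Removes literal dots by escaping the metacharacter by hand.
    static String stripDotsEscaped(String s) {
        return s.replaceAll("\\.", "");
    }

    // Removes literal dots by letting Pattern.quote build a literal pattern.
    static String stripDotsQuoted(String s) {
        return s.replaceAll(Pattern.quote("."), "");
    }

    // Splits on literal dots.
    static String[] splitOnDots(String s) {
        return s.split("\\.");
    }

    public static void main(String[] args) {
        String ip = "192.168.102.1";
        System.out.println(stripDotsEscaped(ip));             // prints 1921681021
        System.out.println(stripDotsQuoted(ip));              // prints 1921681021
        System.out.println(Arrays.toString(splitOnDots(ip))); // prints [192, 168, 102, 1]
        System.out.println(ip.split(".").length);             // prints 0: every character is a separator
    }
}
```

When the text to remove is fixed and contains no pattern logic, plain replace(".", "") gives the same result without any escaping.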

Summary:

Both methods end up calling Matcher.replaceAll. The differences are that String.replace compiles target with Pattern.LITERAL, and that it preprocesses the replacement argument with Matcher.quoteReplacement(replacement.toString()) before calling Matcher.replaceAll. The source of quoteReplacement is shown below:

/**
* Returns a literal replacement <code>String</code> for the specified
* <code>String</code>.
*
* This method produces a <code>String</code> that will work
* as a literal replacement <code>s</code> in the
* <code>appendReplacement</code> method of the {@link Matcher} class.
* The <code>String</code> produced will match the sequence of characters
* in <code>s</code> treated as a literal sequence. Slashes ('\') and
* dollar signs ('$') will be given no special meaning.
*
* @param  s The string to be literalized
* @return  A literal string replacement
* @since 1.5
*/
public static String quoteReplacement(String s) {
    if ((s.indexOf('\\') == -1) && (s.indexOf('$') == -1))
        return s;
    StringBuilder sb = new StringBuilder();
    for (int i=0; i<s.length(); i++) {
        char c = s.charAt(i);
        if (c == '\\' || c == '$') {
            sb.append('\\');
        }
        sb.append(c);
    }
    return sb.toString();
}

As the source shows, any \ or $ character in the replacement is prefixed with a \ to escape it.

The source of Matcher.replaceAll is as follows:
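A small sketch of what quoteReplacement does in practice: it lets a replacement that contains $ or \ be inserted verbatim by replaceAll. The class QuoteReplacementDemo, the method fillPrice and the sample strings are assumptions made up for this illustration.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replaces every "price" token in the template with literalText, taken verbatim.
    static String fillPrice(String template, String literalText) {
        return template.replaceAll("price", Matcher.quoteReplacement(literalText));
    }

    public static void main(String[] args) {
        // Each $ and \ gets a \ in front of it.
        System.out.println(Matcher.quoteReplacement("$5\\each")); // prints \$5\\each
        System.out.println(fillPrice("cost: price", "$5"));       // prints cost: $5
        // String.replace applies the same quoting internally in the source quoted above.
        System.out.println("cost: price".replace("price", "$5")); // prints cost: $5
    }
}
```

Without the quoting step, "cost: price".replaceAll("price", "$5") would treat $5 as a reference to capture group 5 and fail, because the pattern has no such group.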

/**
* Replaces every subsequence of the input sequence that matches the
* pattern with the given replacement string.
*
* <p> This method first resets this matcher.  It then scans the input
* sequence looking for matches of the pattern.  Characters that are not
* part of any match are appended directly to the result string; each match
* is replaced in the result by the replacement string.  The replacement
* string may contain references to captured subsequences as in the {@link
* #appendReplacement appendReplacement} method.
*
* <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
* the replacement string may cause the results to be different than if it
* were being treated as a literal replacement string. Dollar signs may be
* treated as references to captured subsequences as described above, and
* backslashes are used to escape literal characters in the replacement
* string.
*
* <p> Given the regular expression <tt>a*b</tt>, the input
* <tt>"aabfooaabfooabfoob"</tt>, and the replacement string
* <tt>"-"</tt>, an invocation of this method on a matcher for that
* expression would yield the string <tt>"-foo-foo-foo-"</tt>.
*
* <p> Invoking this method changes this matcher's state.  If the matcher
* is to be used in further matching operations then it should first be
* reset.  </p>
*
* @param  replacement
*         The replacement string
*
* @return  The string constructed by replacing each matching subsequence
*          by the replacement string, substituting captured subsequences
*          as needed
*/
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}

The following example helps in understanding Matcher.replaceAll.

Example code:

/* Understanding Matcher.replaceAll (requires java.util.regex.Pattern and java.util.regex.Matcher) */
Pattern p1 = Pattern.compile("cat");
Matcher m1 = p1.matcher("one cat two cats in the yard");
System.out.println(m1.replaceAll("\\$2")); // prints: one $2 two $2s in the yard
System.out.println(m1.replaceAll("$2"));   // throws java.lang.IndexOutOfBoundsException: No group 2; the exception comes from appendReplacement, which replaceAll calls (see its source)
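The exception arises because the pattern cat defines no capture groups, so there is nothing for $2 to refer to. When the pattern does define groups, $n references work as described in the Javadoc. A minimal sketch follows, with a made-up class GroupReferenceDemo and made-up sample inputs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupReferenceDemo {
    // Rewrites each "key=value" pair as "value=key" using numbered group references.
    static String swapPairs(String input) {
        Matcher m = Pattern.compile("(\\w+)=(\\w+)").matcher(input);
        return m.replaceAll("$2=$1");
    }

    // Wraps each occurrence of `word` in brackets; $0 refers to the whole match.
    static String bracketMatches(String input, String word) {
        return Pattern.compile(Pattern.quote(word)).matcher(input).replaceAll("[$0]");
    }

    public static void main(String[] args) {
        System.out.println(swapPairs("a=1, b=2"));                                  // prints 1=a, 2=b
        System.out.println(bracketMatches("one cat two cats in the yard", "cat")); // prints one [cat] two [cat]s in the yard
    }
}
```

Group 0 always exists, which is why $0 is safe even for a pattern without parentheses.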

2. Understanding char

char c = '\\';                          // a backslash character (correct form)
// char b = '\';                        // this form does not compile
char d = '/';
String zhuanyi_s = "\\a";               // length 2
char[] c_arr = zhuanyi_s.toCharArray(); // array contents: [\, a]

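From the above we can see that:

(1) '\\' denotes a single backslash character. The escape sequence takes two characters in source code but has length 1, which is why "\\a" has length 2.

(2) Internally, String is backed by a char[] array.

According to the source of the JDK version quoted here, String has a private char[] field named value (since JDK 9 this field is a byte[] accompanied by an encoding flag). One commonly used constructor is shown below: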

/**
* Initializes a newly created {@code String} object so that it represents
* the same sequence of characters as the argument; in other words, the
* newly created string is a copy of the argument string. Unless an
* explicit copy of {@code original} is needed, use of this constructor is
* unnecessary since Strings are immutable.
*
* @param  original
*         A {@code String}
*/
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
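Although a String is backed by an array internally, that array is never handed out: toCharArray returns a copy, so changing the copy leaves the string untouched. A short sketch with a made-up class name, CharArrayDemo:

```java
class CharArrayDemo {
    // Returns the characters of s; toCharArray hands back a fresh copy on every call.
    static char[] charsOf(String s) {
        return s.toCharArray();
    }

    public static void main(String[] args) {
        String escaped = "\\a";                // two characters: a backslash and 'a'
        char[] chars = charsOf(escaped);
        System.out.println(chars.length);      // prints 2
        chars[1] = 'b';                        // modifies only the copy
        System.out.println(escaped);           // prints \a
        System.out.println(new String(chars)); // prints \b
    }
}
```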

3. Understanding the String class

Test code:

String a = "a";
String a_new = new String("a");
System.out.println(a == a_new);          // false
String a_1 = "a";
System.out.println(a == a_1);            // true: both refer to the same object in the string constant pool
String b_obj = new String("test");
String b_obj_copy = new String("test");
System.out.println(b_obj == b_obj_copy); // false: two distinct objects

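Comparing the references a and a_1 shows that both point to the same address: they share one string constant in the pool.

Comparing b_obj and b_obj_copy shows that each new creates a separate object, so the two references point to different addresses. (One open question: do the two objects still share the same underlying character data? Judging from the String(String original) constructor quoted in section 2, which copies the reference original.value rather than the array itself, they appear to in that JDK version. I will come back to this once I understand it better.)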

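Conclusions:

(1) Two variables assigned the same string literal refer to the same object.

(2) Two variables created through the constructor refer to different objects, even when the contents are identical.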
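These two conclusions are about reference identity (==). To compare contents, equals is the usual tool, and intern() returns the pooled instance for a given content. A minimal sketch; the class StringIdentityDemo and the helper sameObject are illustrative names only.

```java
class StringIdentityDemo {
    // True only when both references point to the very same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String literal = "test";
        String constructed = new String("test");
        System.out.println(sameObject(literal, constructed));          // prints false: distinct objects
        System.out.println(literal.equals(constructed));               // prints true: same contents
        System.out.println(sameObject(literal, constructed.intern())); // prints true: intern() returns the pooled string
    }
}
```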

An open question:

How is the value field of String initialized, and where does that char array come from?

Technical discussion is welcome.

wkang1993@outlook.com
