String Series Source Code Analysis 02 - AbstractStringBuilder in Detail
2014-07-04 15:16
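Class fields:
AbstractStringBuilder has two fields, value and count. value is a char array that stores the character sequence; count is the number of characters in that array that are actually in use.
    /**
     * value is a char array used for character storage
     */
    char value[];

    /**
     * count is the number of characters in use
     */
    int count;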
Constructors:
    /**
     * The no-arg constructor is required for serialization of subclasses
     */
    AbstractStringBuilder() {
    }

    /**
     * Creates an AbstractStringBuilder with the specified capacity
     */
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }
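The difference between the two fields can be observed from the outside through StringBuilder, a public subclass: capacity() reflects the size of the backing array, while length() reflects count. The following is a minimal sketch; the class name CapacityVsLength and the helper measure are made up for illustration.

```java
class CapacityVsLength {
    /** Returns {capacity, length} of a builder created with the given capacity and content. */
    static int[] measure(int capacity, String content) {
        // StringBuilder(int) passes the capacity on to AbstractStringBuilder(int capacity)
        StringBuilder sb = new StringBuilder(capacity);
        sb.append(content);
        return new int[] { sb.capacity(), sb.length() };
    }

    public static void main(String[] args) {
        int[] r = measure(32, "abc");
        // The array has room for 32 chars, but only 3 are in use
        System.out.println("capacity=" + r[0] + ", length=" + r[1]); // capacity=32, length=3
    }
}
```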
Core methods:
1. ensureCapacity
public void ensureCapacity(int minimumCapacity);
Ensures that the capacity of the character array is at least the specified minimum. If the current capacity is less than the minimumCapacity argument, a new, larger internal array is allocated. The new capacity is the larger of the minimumCapacity argument and:
2 * old capacity + 2
Growing by a multiple of the current capacity is also how many other JDK classes (for example ...) expand their storage.
    /**
     * Ensures that the capacity is at least equal to the specified minimum.
     * If the current capacity is less than the argument, then a new internal
     * array is allocated with greater capacity. The new capacity is the
     * larger of:
     * <ul>
     * <li>The <code>minimumCapacity</code> argument.
     * <li>Twice the old capacity, plus <code>2</code>.
     * </ul>
     * If the <code>minimumCapacity</code> argument is nonpositive, this
     * method takes no action and simply returns.
     *
     * @param minimumCapacity the minimum desired capacity
     */
    public void ensureCapacity(int minimumCapacity) {
        // Only act when minimumCapacity is greater than zero
        if (minimumCapacity > 0)
            ensureCapacityInternal(minimumCapacity);
    }
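The contract described in the Javadoc can be checked through StringBuilder. The sketch below is illustrative only; EnsureCapacityDemo and capacityAfter are hypothetical names, and the expected values follow from the documented rule (the larger of minimumCapacity and twice the old capacity plus 2).

```java
class EnsureCapacityDemo {
    /** Capacity of a builder with the given initial capacity after ensureCapacity(minimumCapacity). */
    static int capacityAfter(int initial, int minimumCapacity) {
        StringBuilder sb = new StringBuilder(initial);
        sb.ensureCapacity(minimumCapacity);
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(16, 17));  // 34: 2 * 16 + 2 is larger than 17
        System.out.println(capacityAfter(16, 100)); // 100: the requested minimum is larger than 34
        System.out.println(capacityAfter(16, 10));  // 16: already large enough, nothing happens
        System.out.println(capacityAfter(16, -1));  // 16: nonpositive argument is ignored
    }
}
```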
2. ensureCapacityInternal
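This method has the same contract as ensureCapacity, but it is never synchronized.
private void ensureCapacityInternal(int minimumCapacity);
    /**
     * This method has the same contract as ensureCapacity, but is
     * never synchronized.
     */
    private void ensureCapacityInternal(int minimumCapacity) {
        // Overflow-conscious comparison: the subtraction still gives the
        // right answer when minimumCapacity has wrapped around to a negative int
        if (minimumCapacity - value.length > 0)
            expandCapacity(minimumCapacity);
    }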
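Why the check is written as a subtraction rather than a plain comparison is easier to see with concrete numbers. The sketch below is an illustration with hypothetical names (OverflowConsciousCheck, naive, overflowConscious); it only models the two styles of comparison on plain ints and does not touch any real builder.

```java
class OverflowConsciousCheck {
    /** Plain comparison: misses a requested size that has wrapped around to a negative int. */
    static boolean naive(int minimumCapacity, int currentLength) {
        return minimumCapacity > currentLength;
    }

    /** Subtraction form used in ensureCapacityInternal. */
    static boolean overflowConscious(int minimumCapacity, int currentLength) {
        return minimumCapacity - currentLength > 0;
    }

    public static void main(String[] args) {
        int currentLength = Integer.MAX_VALUE - 5;
        int requested = currentLength + 10; // wraps around to a negative value

        System.out.println(requested < 0);                               // true
        System.out.println(naive(requested, currentLength));             // false: growth silently skipped
        System.out.println(overflowConscious(requested, currentLength)); // true: growth is attempted
    }
}
```

With the subtraction form the wrapped request still reaches expandCapacity, which can then fail loudly instead of letting the caller write past the end of the array.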
3. expandCapacity
This method is the concrete implementation of the expansion semantics of ensureCapacity, but without any size check or synchronization.
void expandCapacity(int minimumCapacity);
    /**
     * This implements the expansion semantics of ensureCapacity with no
     * size check or synchronization.
     */
    void expandCapacity(int minimumCapacity) {
        int newCapacity = value.length * 2 + 2; // default capacity of the new char array
        if (newCapacity - minimumCapacity < 0)
            /*
             * The default new capacity is smaller than the requested
             * minimum, so use the requested minimum instead
             */
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {
            if (minimumCapacity < 0) // int overflow
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        value = Arrays.copyOf(value, newCapacity); // allocate the new char array via the Arrays utility class
    }
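To follow the three branches of expandCapacity without allocating huge arrays, the size computation can be pulled out into a pure function. This is a sketch that mirrors the listing above; the class GrowthPolicySketch and the method newCapacity are hypothetical names, not part of the JDK API.

```java
class GrowthPolicySketch {
    /** Mirrors the size computation of expandCapacity, without allocating anything. */
    static int newCapacity(int oldCapacity, int minimumCapacity) {
        int newCapacity = oldCapacity * 2 + 2;      // may wrap around to a negative int
        if (newCapacity - minimumCapacity < 0)      // doubling is not enough
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {                      // doubling overflowed
            if (minimumCapacity < 0)                // the request itself overflowed
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;        // clamp to the largest int
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(16, 17));   // 34
        System.out.println(newCapacity(16, 100));  // 100
        // 2 * 1_200_000_000 + 2 does not fit in an int, so the result is clamped
        System.out.println(newCapacity(1_200_000_000, 1_200_000_001) == Integer.MAX_VALUE); // true
    }
}
```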
4. charAt
Returns the char value (code unit) at the specified index of this character sequence. It simply returns the corresponding element of the char array, which is very straightforward.
public char charAt(int index) {}
    /**
     * Returns the <code>char</code> value in this sequence at the specified index.
     * The first <code>char</code> value is at index <code>0</code>, the next at index
     * <code>1</code>, and so on, as in array indexing.
     * <p>
     * The index argument must be greater than or equal to
     * <code>0</code>, and less than the length of this sequence.
     *
     * <p>If the <code>char</code> value specified by the index is a
     * <a href="Character.html#unicode">surrogate</a>, the surrogate
     * value is returned.
     *
     * @param index the index of the desired <code>char</code> value.
     * @return the <code>char</code> value at the specified index.
     * @throws IndexOutOfBoundsException if <code>index</code> is
     * negative or greater than or equal to <code>length()</code>.
     */
    public char charAt(int index) {
        // Reject invalid index values
        if ((index < 0) || (index >= count))
            throw new StringIndexOutOfBoundsException(index);
        return value[index];
    }
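Two details of charAt are worth trying out: the bound is count rather than the array length, and a surrogate is returned as-is. The sketch below uses StringBuilder and the character U+1F600, written as the surrogate pair \uD83D\uDE00; CharAtDemo, at and throwsFor are illustrative names.

```java
class CharAtDemo {
    static char at(String content, int index) {
        return new StringBuilder(content).charAt(index);
    }

    /** True if charAt rejects the index with an IndexOutOfBoundsException. */
    static boolean throwsFor(String content, int index) {
        try {
            at(content, index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The backing array is longer than 3, but index 3 is still rejected
        // because only count (here 3) characters are in use
        System.out.println(throwsFor("abc", 3));                   // true
        // Index 1 holds the high surrogate of U+1F600; charAt returns it unchanged
        System.out.println(at("a\uD83D\uDE00", 1) == '\uD83D');    // true
    }
}
```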
5. codePointAt
Returns the Unicode code point at the specified index of this character sequence. Internally it delegates to a static method of Character, the wrapper class of char.
A note on the term "code point" (which I render in Chinese as 码位值): a code point is the numeric value assigned to a character in the Unicode character set, so each code point stands for one actual Unicode character. Why are these code-point methods needed? Because one Java char is not always one Unicode character. A char is a 16-bit UTF-16 code unit (the original design followed UCS-2, the now-obsolete fixed-width predecessor of UTF-16), so it can take at most 65,536 distinct values, far fewer than the more than 110,000 characters Unicode defines today. For the characters added to Unicode beyond that range, Java has to combine two chars (a surrogate pair) to represent one Unicode character. As a result, the number of chars in a String is not necessarily equal to the number of Unicode characters.
public int codePointAt(int index) {}
    /**
     * Returns the character (Unicode code point) at the specified
     * index. The index refers to <code>char</code> values
     * (Unicode code units) and ranges from <code>0</code> to
     * {@link #length()}<code> - 1</code>.
     *
     * <p> If the <code>char</code> value specified at the given index
     * is in the high-surrogate range, the following index is less
     * than the length of this sequence, and the
     * <code>char</code> value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the <code>char</code> value at the given index is returned.
     *
     * @param index the index to the <code>char</code> values
     * @return the code point value of the character at the
     * <code>index</code>
     * @exception IndexOutOfBoundsException if the <code>index</code>
     * argument is negative or not less than the length of this
     * sequence.
     */
    public int codePointAt(int index) {
        // Reject invalid index values
        if ((index < 0) || (index >= count)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAt(value, index);
    }
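The surrogate-pair behaviour is easiest to see on a string that contains one supplementary character. The sketch below assumes the sample text "a", U+1F600, "b" (four chars in total); CodePointAtDemo and codePointAt are illustrative names wrapping the StringBuilder method.

```java
class CodePointAtDemo {
    // 'a', then U+1F600 as the surrogate pair D83D DE00, then 'b'
    static final String SAMPLE = "a\uD83D\uDE00b";

    static int codePointAt(int index) {
        return new StringBuilder(SAMPLE).codePointAt(index);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(codePointAt(0))); // 61: plain 'a'
        System.out.println(Integer.toHexString(codePointAt(1))); // 1f600: high + low surrogate combined
        System.out.println(Integer.toHexString(codePointAt(2))); // de00: low surrogate on its own
        System.out.println(Integer.toHexString(codePointAt(3))); // 62: plain 'b'
    }
}
```

Note that the index is still counted in chars: index 2 points into the middle of the pair, so only the low surrogate is returned there.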
6. codePointBefore
Returns the Unicode code point immediately before the specified index. The implementation is similar to codePointAt.
public int codePointBefore(int index) {}
    /**
     * Returns the character (Unicode code point) before the specified
     * index. The index refers to <code>char</code> values
     * (Unicode code units) and ranges from <code>1</code> to {@link
     * #length()}.
     *
     * <p> If the <code>char</code> value at <code>(index - 1)</code>
     * is in the low-surrogate range, <code>(index - 2)</code> is not
     * negative, and the <code>char</code> value at <code>(index -
     * 2)</code> is in the high-surrogate range, then the
     * supplementary code point value of the surrogate pair is
     * returned. If the <code>char</code> value at <code>index -
     * 1</code> is an unpaired low-surrogate or a high-surrogate, the
     * surrogate value is returned.
     *
     * @param index the index following the code point that should be returned
     * @return the Unicode code point value before the given index.
     * @exception IndexOutOfBoundsException if the <code>index</code>
     * argument is less than 1 or greater than the length
     * of this sequence.
     */
    public int codePointBefore(int index) {
        int i = index - 1;
        if ((i < 0) || (i >= count)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointBefore(value, index);
    }
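A small sketch of codePointBefore on the same kind of input, again assuming the sample text "a", U+1F600, "b"; the names CodePointBeforeDemo and before are made up for this example.

```java
class CodePointBeforeDemo {
    // 'a', then U+1F600 as the surrogate pair D83D DE00, then 'b'
    static final String SAMPLE = "a\uD83D\uDE00b";

    static int before(int index) {
        return new StringBuilder(SAMPLE).codePointBefore(index);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(before(1))); // 61: the 'a' before index 1
        System.out.println(Integer.toHexString(before(2))); // d83d: only the high surrogate precedes index 2
        System.out.println(Integer.toHexString(before(3))); // 1f600: the complete pair precedes index 3
        System.out.println(Integer.toHexString(before(4))); // 62: index 4 equals length(), which is allowed
    }
}
```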
7. codePointCount
Counts the exact number of Unicode code points within the specified index range. Note that this is not the number of chars, which is where it differs from length(): length() counts code units.
public int codePointCount(int beginIndex, int endIndex) {}
    /**
     * Returns the number of Unicode code points in the specified text
     * range of this sequence. The text range begins at the specified
     * <code>beginIndex</code> and extends to the <code>char</code> at
     * index <code>endIndex - 1</code>. Thus the length (in
     * <code>char</code>s) of the text range is
     * <code>endIndex-beginIndex</code>. Unpaired surrogates within
     * this sequence count as one code point each.
     *
     * @param beginIndex the index to the first <code>char</code> of
     * the text range.
     * @param endIndex the index after the last <code>char</code> of
     * the text range.
     * @return the number of Unicode code points in the specified text
     * range
     * @exception IndexOutOfBoundsException if the
     * <code>beginIndex</code> is negative, or <code>endIndex</code>
     * is larger than the length of this sequence, or
     * <code>beginIndex</code> is larger than <code>endIndex</code>.
     */
    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        return Character.codePointCountImpl(value, beginIndex, endIndex-beginIndex);
    }
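The difference between length() and codePointCount can be seen on a sequence with one supplementary character. This is a sketch with illustrative names (CodePointCountDemo, count); the sample text is assumed to be "a", U+1F600, "b".

```java
class CodePointCountDemo {
    // 'a', then U+1F600 as the surrogate pair D83D DE00, then 'b'
    static final String SAMPLE = "a\uD83D\uDE00b";

    static int count(int beginIndex, int endIndex) {
        return new StringBuilder(SAMPLE).codePointCount(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(SAMPLE.length()); // 4 code units
        System.out.println(count(0, 4));     // 3 code points: the pair counts once
        System.out.println(count(1, 2));     // 1: a high surrogate cut off from its partner
        System.out.println(count(2, 4));     // 2: a lone low surrogate plus 'b'
    }
}
```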
8. offsetByCodePoints
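Returns the index within this sequence that is offset from the given index by codePointOffset code points. My understanding is that it converts an offset counted in code points into an index counted in code units. To get the code point at position i, use the following:
    String greeting = "Hello";
    int i = 1; // position counted in code points
    int index = greeting.offsetByCodePoints(0, i);
    int cp = greeting.codePointAt(index);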
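With a pure ASCII string such as "Hello" the code point position and the char index coincide, so the conversion only becomes visible once a supplementary character is involved. The sketch below applies the same two calls to a StringBuilder holding "a", U+1F600, "b"; OffsetByCodePointsDemo, charIndexOf and nthCodePoint are hypothetical names.

```java
class OffsetByCodePointsDemo {
    // 'a', then U+1F600 as the surrogate pair D83D DE00, then 'b'
    static final String SAMPLE = "a\uD83D\uDE00b";

    /** Char index of the i-th code point (0-based), counted from the start. */
    static int charIndexOf(int i) {
        return new StringBuilder(SAMPLE).offsetByCodePoints(0, i);
    }

    /** The i-th code point (0-based) of the sample text. */
    static int nthCodePoint(int i) {
        StringBuilder sb = new StringBuilder(SAMPLE);
        return sb.codePointAt(sb.offsetByCodePoints(0, i));
    }

    public static void main(String[] args) {
        System.out.println(charIndexOf(1)); // 1: 'a' occupies one char
        System.out.println(charIndexOf(2)); // 3: the pair occupies two chars
        System.out.println(Integer.toHexString(nthCodePoint(1))); // 1f600
        System.out.println((char) nthCodePoint(2));               // b
    }
}
```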
public int offsetByCodePoints(int index, int codePointOffset) {}
    /**
     * Returns the index within this sequence that is offset from the
     * given <code>index</code> by <code>codePointOffset</code> code
     * points. Unpaired surrogates within the text range given by
     * <code>index</code> and <code>codePointOffset</code> count as
     * one code point each.
     *
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within this sequence
     * @exception IndexOutOfBoundsException if <code>index</code>
     * is negative or larger then the length of this sequence,
     * or if <code>codePointOffset</code> is positive and the subsequence
     * starting with <code>index</code> has fewer than
     * <code>codePointOffset</code> code points,
     * or if <code>codePointOffset</code> is negative and the subsequence
     * before <code>index</code> has fewer than the absolute value of
     * <code>codePointOffset</code> code points.
     */
    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > count) {
            throw new IndexOutOfBoundsException();
        }
        return Character.offsetByCodePointsImpl(value, 0, count, index, codePointOffset);
    }
9. append
Appends an object (in practice, the string representation of the object).
public AbstractStringBuilder append(Object obj) {}
    /**
     * Appends the string representation of the <code>Object</code>
     * argument.
     * <p>
     * The argument is converted to a string as if by the method
     * <code>String.valueOf</code>, and the characters of that
     * string are then appended to this sequence.
     *
     * @param obj an <code>Object</code>.
     * @return a reference to this object.
     */
    public AbstractStringBuilder append(Object obj) {
        return append(String.valueOf(obj)); // use String.valueOf to obtain the object's string representation
    }
10. append
Appends a String.
public AbstractStringBuilder append(String str) {}
    /**
     * Appends the specified string to this character sequence.
     * <p>
     * The characters of the <code>String</code> argument are appended, in
     * order, increasing the length of this sequence by the length of the
     * argument. If <code>str</code> is <code>null</code>, then the four
     * characters <code>"null"</code> are appended.
     * <p>
     * Let <i>n</i> be the length of this character sequence just prior to
     * execution of the <code>append</code> method. Then the character at
     * index <i>k</i> in the new character sequence is equal to the character
     * at index <i>k</i> in the old character sequence, if <i>k</i> is less
     * than <i>n</i>; otherwise, it is equal to the character at index
     * <i>k-n</i> in the argument <code>str</code>.
     *
     * @param str a string.
     * @return a reference to this object.
     */
    public AbstractStringBuilder append(String str) {
        if (str == null) str = "null"; // a null argument is appended as the text "null"
        int len = str.length();
        if (len == 0) return this; // nothing to do for an empty String
        int newCount = count + len;
        if (newCount > value.length)
            expandCapacity(newCount); // grow if the result would not fit in the current char array
        str.getChars(0, len, value, count); // copy the String into the char array with String.getChars
        count = newCount; // update count (the number of characters in use)
        return this;
    }
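Both append variants can be exercised through StringBuilder. The following sketch is only an illustration (AppendDemo and build are made-up names); it shows the handling of null, the conversion of an object through String.valueOf, the no-op for an empty string, and the chaining made possible by returning this.

```java
class AppendDemo {
    static String build() {
        StringBuilder sb = new StringBuilder();
        String missing = null;              // a null String reference
        Object boxed = Integer.valueOf(42); // goes through append(Object)

        sb.append("a")       // plain String
          .append(missing)   // appended as the four characters "null"
          .append(boxed)     // appended as String.valueOf(boxed), that is "42"
          .append("");       // empty String: nothing is appended
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build());          // anull42
        System.out.println(build().length()); // 7
    }
}
```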
To be continued...