
String Source Code Analysis Series 02 - AbstractStringBuilder in Detail

2014-07-04 15:16

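Class fields:

AbstractStringBuilder has two fields, value and count. value is a char array that stores the character sequence; count is the number of characters in that array that are actually in use.

/**
 * value is a char array used for character storage
 */
char[] value;

/**
 * count is the number of characters in use
 */
int count;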

Constructors:

/**
 * The no-arg constructor is necessary for serialization of subclasses
 */
AbstractStringBuilder() {
}

/**
 * Creates an AbstractStringBuilder with the specified capacity
 */
AbstractStringBuilder(int capacity) {
    value = new char[capacity];
}

Core methods:

1. ensureCapacity

public void ensureCapacity(int minimumCapacity);

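Ensures that the capacity of the char array is at least the specified minimum. If the current capacity is smaller than the minimumCapacity argument, a new, larger internal array is allocated. By default the new capacity is:

2 * old capacity + 2

unless minimumCapacity is larger still, in which case minimumCapacity is used.

Multiplicative growth of this kind is also how many other JDK classes (for example ...) expand their capacity.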

/**
* Ensures that the capacity is at least equal to the specified minimum.
* If the current capacity is less than the argument, then a new internal
* array is allocated with greater capacity. The new capacity is the
* larger of:
* <ul>
* <li>The <code>minimumCapacity</code> argument.
* <li>Twice the old capacity, plus <code>2</code>.
* </ul>
* If the <code>minimumCapacity</code> argument is nonpositive, this
* method takes no action and simply returns.
*
* @param   minimumCapacity   the minimum desired capacity.
*/
public void ensureCapacity(int minimumCapacity) {
    // only act when the minimumCapacity argument is greater than zero
    if (minimumCapacity > 0)
        ensureCapacityInternal(minimumCapacity);
}
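
As a usage illustration of the contract above, the following minimal sketch calls ensureCapacity on java.lang.StringBuilder, a public subclass of AbstractStringBuilder. The class name EnsureCapacityDemo and the helper capacityAfter are made up for this example; only the guaranteed parts of the contract are relied on (the capacity ends up at least as large as requested, and a nonpositive or already satisfied request changes nothing).

```java
class EnsureCapacityDemo {
    // Hypothetical helper: builds a StringBuilder with the given initial
    // capacity, calls ensureCapacity(requested), and reports the capacity.
    static int capacityAfter(int initialCapacity, int requested) {
        StringBuilder sb = new StringBuilder(initialCapacity);
        sb.ensureCapacity(requested);
        return sb.capacity();
    }

    public static void main(String[] args) {
        // A request above the current capacity forces a larger array.
        System.out.println(capacityAfter(16, 100) >= 100); // true
        // A nonpositive request is ignored, so the capacity stays 16.
        System.out.println(capacityAfter(16, -5)); // 16
        // A request that is already satisfied also leaves the capacity alone.
        System.out.println(capacityAfter(16, 10)); // 16
    }
}
```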

2. ensureCapacityInternal

This method has the same contract as ensureCapacity, but it is never synchronized.
private void ensureCapacityInternal(int minimumCapacity);

/**
* This method has the same contract as ensureCapacity, but is
* never synchronized.
*/
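private void ensureCapacityInternal(int minimumCapacity) {
    // overflow-conscious comparison (not a memory check): the subtraction also
    // handles a minimumCapacity that has wrapped around to a negative value
    if (minimumCapacity - value.length > 0)
        expandCapacity(minimumCapacity);
}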

3. expandCapacity

This method is the concrete implementation of ensureCapacity's expansion semantics, without any size check or synchronization.
void expandCapacity(int minimumCapacity);

/**
* This implements the expansion semantics of ensureCapacity with no
* size check or synchronization.
*/
void expandCapacity(int minimumCapacity) {
    int newCapacity = value.length * 2 + 2; // default capacity of the new char array
    if (newCapacity - minimumCapacity < 0)
        /*
         * The default new capacity is smaller than the requested minimum,
         * so use the requested minimum instead.
         */
        newCapacity = minimumCapacity;
    if (newCapacity < 0) {
        if (minimumCapacity < 0) // integer overflow: the requested size is not representable
            throw new OutOfMemoryError();
        newCapacity = Integer.MAX_VALUE;
    }
    value = Arrays.copyOf(value, newCapacity); // copy into a new, larger array via Arrays.copyOf
}
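
To make the arithmetic easier to follow, here is a small sketch that reproduces only the capacity calculation of expandCapacity as a pure function, without allocating anything. GrowthPolicySketch and newCapacity are names invented for this illustration; the logic is copied from the listing above, so it describes that JDK version and not necessarily others.

```java
class GrowthPolicySketch {
    // Same decision logic as expandCapacity above, but it returns the chosen
    // capacity instead of reallocating the array.
    static int newCapacity(int oldCapacity, int minimumCapacity) {
        int newCapacity = oldCapacity * 2 + 2;
        if (newCapacity - minimumCapacity < 0) {
            // doubling is not enough, take the requested minimum
            newCapacity = minimumCapacity;
        }
        if (newCapacity < 0) {
            if (minimumCapacity < 0) {
                // the request itself overflowed int
                throw new OutOfMemoryError();
            }
            // doubling overflowed int but the request is valid: clamp
            newCapacity = Integer.MAX_VALUE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(16, 17));   // 34: doubling wins
        System.out.println(newCapacity(16, 100));  // 100: the minimum wins
        // 2 * 1_200_000_000 + 2 overflows int, so the result is clamped.
        System.out.println(newCapacity(1_200_000_000, 1_200_000_001)); // 2147483647
    }
}
```

Comparing with `newCapacity - minimumCapacity < 0` rather than `newCapacity < minimumCapacity` is what keeps the third case correct: the doubled value has wrapped to a negative number, and the subtraction form does not mistake it for "too small".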

4. charAt

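Returns the char value (code unit) at the given index of this sequence. After a bounds check it simply returns the array element, which is very simple.
public char charAt(int index) {}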

/**
* Returns the <code>char</code> value in this sequence at the specified index.
* The first <code>char</code> value is at index <code>0</code>, the next at index
* <code>1</code>, and so on, as in array indexing.
* <p>
* The index argument must be greater than or equal to
* <code>0</code>, and less than the length of this sequence.
*
* <p>If the <code>char</code> value specified by the index is a
* <a href="Character.html#unicode">surrogate</a>, the surrogate
* value is returned.
*
* @param      index   the index of the desired <code>char</code> value.
* @return     the <code>char</code> value at the specified index.
* @throws     IndexOutOfBoundsException  if <code>index</code> is
*             negative or greater than or equal to <code>length()</code>.
*/
public char charAt(int index) {
    // invalid index
    if ((index < 0) || (index >= count))
        throw new StringIndexOutOfBoundsException(index);
    return value[index];
}
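
One detail worth illustrating is that the bounds check uses count, not value.length. The sketch below (CharAtDemo and isRejected are hypothetical names) uses StringBuilder to show that an index inside the allocated capacity but beyond the used length is still rejected. It catches IndexOutOfBoundsException, the documented type, since the concrete subclass thrown is an implementation detail.

```java
class CharAtDemo {
    // Hypothetical helper: true if charAt(index) is rejected as out of bounds.
    static boolean isRejected(StringBuilder sb, int index) {
        try {
            sb.charAt(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Capacity 16, but only 2 characters are in use.
        StringBuilder sb = new StringBuilder(16).append("ab");
        System.out.println(sb.charAt(1));        // b
        System.out.println(isRejected(sb, 2));   // true: within capacity, beyond length
        System.out.println(isRejected(sb, -1));  // true: negative index
    }
}
```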

5. codePointAt

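Returns the Unicode code point at the given index of this sequence. Internally it delegates to a static method of Character, the wrapper class for char.

Note: a code point is the numeric value that identifies a single character in the Unicode character set. Why are these code-point methods needed? Because one Java char is not always one Unicode character. char was designed around UCS-2, a now-obsolete fixed-width predecessor of UTF-16, so it has at most 65,536 distinct values, far fewer than the more than 110,000 characters Unicode defines today. Java therefore represents each character added beyond that range as two chars (a surrogate pair), so the number of chars in a String is not necessarily the number of Unicode characters.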

public int codePointAt(int index) {}

/**
* Returns the character (Unicode code point) at the specified
* index. The index refers to <code>char</code> values
* (Unicode code units) and ranges from <code>0</code> to
* {@link #length()}<code> - 1</code>.
*
* <p> If the <code>char</code> value specified at the given index
* is in the high-surrogate range, the following index is less
* than the length of this sequence, and the
* <code>char</code> value at the following index is in the
* low-surrogate range, then the supplementary code point
* corresponding to this surrogate pair is returned. Otherwise,
* the <code>char</code> value at the given index is returned.
*
* @param      index the index to the <code>char</code> values
* @return     the code point value of the character at the
*             <code>index</code>
* @exception  IndexOutOfBoundsException  if the <code>index</code>
*             argument is negative or not less than the length of this
*             sequence.
*/
public int codePointAt(int index) {
    // invalid index
    if ((index < 0) || (index >= count)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointAt(value, index);
}
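
The surrogate-pair behaviour described in the Javadoc can be seen with a short sketch. The sample text is an assumption chosen for illustration: the letter a, then U+1F600 written as the surrogate pair \uD83D\uDE00, then the letter b. CodePointAtDemo and at are made-up names.

```java
class CodePointAtDemo {
    // 'a', then U+1F600 as a surrogate pair, then 'b': 4 chars, 3 code points.
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Hypothetical helper: code point at a char index of the sample text.
    static int at(int index) {
        return new StringBuilder(SAMPLE).codePointAt(index);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(at(0))); // 61
        // Index 1 is a high surrogate followed by a low surrogate,
        // so the combined supplementary code point is returned.
        System.out.println(Integer.toHexString(at(1))); // 1f600
        // Index 2 is the low surrogate on its own, returned as is.
        System.out.println(Integer.toHexString(at(2))); // de00
        System.out.println(Integer.toHexString(at(3))); // 62
    }
}
```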

6. codePointBefore

Returns the Unicode code point immediately before the given index. The implementation is similar to codePointAt.
public int codePointBefore(int index) {}

/**
* Returns the character (Unicode code point) before the specified
* index. The index refers to <code>char</code> values
* (Unicode code units) and ranges from <code>1</code> to {@link
* #length()}.
*
* <p> If the <code>char</code> value at <code>(index - 1)</code>
* is in the low-surrogate range, <code>(index - 2)</code> is not
* negative, and the <code>char</code> value at <code>(index -
* 2)</code> is in the high-surrogate range, then the
* supplementary code point value of the surrogate pair is
* returned. If the <code>char</code> value at <code>index -
* 1</code> is an unpaired low-surrogate or a high-surrogate, the
* surrogate value is returned.
*
* @param     index the index following the code point that should be returned
* @return    the Unicode code point value before the given index.
* @exception IndexOutOfBoundsException if the <code>index</code>
*            argument is less than 1 or greater than the length
*            of this sequence.
*/
public int codePointBefore(int index) {
    int i = index - 1;
    if ((i < 0) || (i >= count)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointBefore(value, index);
}
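
A brief sketch of how codePointBefore looks backwards from an index, again on an illustrative text containing one surrogate pair (the sample string and the names CodePointBeforeDemo and before are assumptions for this example):

```java
class CodePointBeforeDemo {
    // 'a', then U+1F600 as the pair \uD83D\uDE00, then 'b'.
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Hypothetical helper: code point that ends just before the char index.
    static int before(int index) {
        return new StringBuilder(SAMPLE).codePointBefore(index);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(before(1))); // 61
        // The char at index 1 is a high surrogate with nothing paired
        // before it, so the surrogate value itself is returned.
        System.out.println(Integer.toHexString(before(2))); // d83d
        // Chars 1 and 2 form a complete pair that ends before index 3.
        System.out.println(Integer.toHexString(before(3))); // 1f600
        System.out.println(Integer.toHexString(before(4))); // 62
    }
}
```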

7. codePointCount

Counts the Unicode code points in the given index range. Note that this is not the number of chars, which is how it differs from length(): length() counts code units.

public int codePointCount(int beginIndex, int endIndex) {}

/**
* Returns the number of Unicode code points in the specified text
* range of this sequence. The text range begins at the specified
* <code>beginIndex</code> and extends to the <code>char</code> at
* index <code>endIndex - 1</code>. Thus the length (in
* <code>char</code>s) of the text range is
* <code>endIndex-beginIndex</code>. Unpaired surrogates within
* this sequence count as one code point each.
*
* @param beginIndex the index to the first <code>char</code> of
* the text range.
* @param endIndex the index after the last <code>char</code> of
* the text range.
* @return the number of Unicode code points in the specified text
* range
* @exception IndexOutOfBoundsException if the
* <code>beginIndex</code> is negative, or <code>endIndex</code>
* is larger than the length of this sequence, or
* <code>beginIndex</code> is larger than <code>endIndex</code>.
*/
public int codePointCount(int beginIndex, int endIndex) {
    if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
        throw new IndexOutOfBoundsException();
    }
    return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
}
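
The difference between length() and codePointCount can be checked with a small sketch. The sample text and the names CodePointCountDemo and count are assumptions made for illustration.

```java
class CodePointCountDemo {
    // 'a', then U+1F600 as the pair \uD83D\uDE00, then 'b'.
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Hypothetical helper: code points in [beginIndex, endIndex) of the sample.
    static int count(int beginIndex, int endIndex) {
        return new StringBuilder(SAMPLE).codePointCount(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(SAMPLE.length()); // 4 code units
        System.out.println(count(0, 4));     // 3 code points
        // The range [0, 2) cuts the pair in half; the lone high surrogate
        // still counts as one code point.
        System.out.println(count(0, 2));     // 2
        System.out.println(count(1, 3));     // 1: exactly the surrogate pair
    }
}
```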

8. offsetByCodePoints

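Returns the char index reached by moving codePointOffset code points away from the given index. As I understand it, the method converts an offset measured in code points into an index measured in code units.
To get the code point at position i (counted in code points), use the following:

String greeting = "Hello";
int i = 1; // example position, counted in code points
int index = greeting.offsetByCodePoints(0, i);
int cp = greeting.codePointAt(index);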

public int offsetByCodePoints(int index, int codePointOffset) {}

/**
* Returns the index within this sequence that is offset from the
* given <code>index</code> by <code>codePointOffset</code> code
* points. Unpaired surrogates within the text range given by
* <code>index</code> and <code>codePointOffset</code> count as
* one code point each.
*
* @param index the index to be offset
* @param codePointOffset the offset in code points
* @return the index within this sequence
* @exception IndexOutOfBoundsException if <code>index</code>
*   is negative or larger then the length of this sequence,
*   or if <code>codePointOffset</code> is positive and the subsequence
*   starting with <code>index</code> has fewer than
*   <code>codePointOffset</code> code points,
*   or if <code>codePointOffset</code> is negative and the subsequence
*   before <code>index</code> has fewer than the absolute value of
*   <code>codePointOffset</code> code points.
*/
public int offsetByCodePoints(int index, int codePointOffset) {
    if (index < 0 || index > count) {
        throw new IndexOutOfBoundsException();
    }
    return Character.offsetByCodePointsImpl(value, 0, count,
                                            index, codePointOffset);
}
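
With plain ASCII text the code-point offset and the char index coincide, so the effect of offsetByCodePoints only becomes visible once a surrogate pair is involved. The following sketch uses an illustrative sample text; OffsetByCodePointsDemo and codePointAtPosition are hypothetical names.

```java
class OffsetByCodePointsDemo {
    // 'a', then U+1F600 as the pair \uD83D\uDE00, then 'b'.
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Hypothetical helper: the code point at a position counted in code
    // points (not chars), using the two-step idiom shown earlier.
    static int codePointAtPosition(StringBuilder sb, int position) {
        int index = sb.offsetByCodePoints(0, position);
        return sb.codePointAt(index);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(SAMPLE);
        // Skipping 2 code points from index 0 passes 'a' (1 char) and the
        // surrogate pair (2 chars), landing on char index 3.
        System.out.println(sb.offsetByCodePoints(0, 2));  // 3
        // A negative offset walks backwards: from 4, past 'b' and the pair.
        System.out.println(sb.offsetByCodePoints(4, -2)); // 1
        System.out.println((char) codePointAtPosition(sb, 2)); // b
    }
}
```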

9. append
Appends an object (more precisely, the object's string representation).
public AbstractStringBuilder append(Object obj) {}

/**
* Appends the string representation of the <code>Object</code>
* argument.
* <p>
* The argument is converted to a string as if by the method
* <code>String.valueOf</code>, and the characters of that
* string are then appended to this sequence.
*
* @param   obj   an <code>Object</code>.
* @return  a reference to this object.
*/
public AbstractStringBuilder append(Object obj) {
    return append(String.valueOf(obj)); // use String.valueOf to obtain the object's string representation
}
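
Because the conversion goes through String.valueOf, a null reference is appended as the text "null" instead of causing a NullPointerException. A minimal sketch using StringBuilder (AppendObjectDemo and appended are names invented for this example):

```java
class AppendObjectDemo {
    // Hypothetical helper: appends obj through the append(Object) overload
    // and returns the resulting text.
    static String appended(Object obj) {
        return new StringBuilder("x=").append(obj).toString();
    }

    public static void main(String[] args) {
        System.out.println(appended(null));         // x=null
        System.out.println(appended(42));           // x=42 (boxed Integer, via toString)
        System.out.println(appended(Boolean.TRUE)); // x=true
    }
}
```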

10. append
Appends a String.
public AbstractStringBuilder append(String str) {}

/**
* Appends the specified string to this character sequence.
* <p>
* The characters of the <code>String</code> argument are appended, in
* order, increasing the length of this sequence by the length of the
* argument. If <code>str</code> is <code>null</code>, then the four
* characters <code>"null"</code> are appended.
* <p>
* Let <i>n</i> be the length of this character sequence just prior to
* execution of the <code>append</code> method. Then the character at
* index <i>k</i> in the new character sequence is equal to the character
* at index <i>k</i> in the old character sequence, if <i>k</i> is less
* than <i>n</i>; otherwise, it is equal to the character at index
* <i>k-n</i> in the argument <code>str</code>.
*
* @param   str   a string.
* @return  a reference to this object.
*/
public AbstractStringBuilder append(String str) {
    if (str == null) str = "null"; // a null argument is treated as the text "null"
    int len = str.length();
    if (len == 0) return this; // nothing to do when the String to append is empty
    int newCount = count + len;
    if (newCount > value.length)
        expandCapacity(newCount); // grow if the result would not fit in the current char array
    str.getChars(0, len, value, count); // copy the String into the char array via String.getChars
    count = newCount; // update count (the number of characters in use)
    return this;
}
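
To see the pieces working together, here is a deliberately simplified sketch of a builder that follows the same append steps as the listing above. MiniBuilder is a hypothetical class written for this article, not JDK code, and it assumes small inputs: the growth step omits the integer-overflow handling that expandCapacity performs.

```java
import java.util.Arrays;

class MiniBuilder {
    private char[] value; // backing storage
    private int count;    // number of chars in use

    MiniBuilder(int capacity) {
        value = new char[capacity];
    }

    MiniBuilder append(String str) {
        if (str == null) str = "null";
        int len = str.length();
        if (len == 0) return this;
        int newCount = count + len;
        if (newCount > value.length) {
            // Simplified growth: larger of (2 * old + 2) and the needed size.
            // Assumption: sizes stay far below Integer.MAX_VALUE.
            value = Arrays.copyOf(value, Math.max(value.length * 2 + 2, newCount));
        }
        str.getChars(0, len, value, count); // copy chars to the tail of the array
        count = newCount;
        return this;
    }

    int length() { return count; }

    int capacity() { return value.length; }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        MiniBuilder b = new MiniBuilder(2);
        b.append("ab").append(null).append("");
        System.out.println(b);            // abnull
        System.out.println(b.length());   // 6
        System.out.println(b.capacity()); // 6: max(2 * 2 + 2, 6)
    }
}
```

Returning this from append is what allows the chained calls in main, the same design the JDK method uses.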

To be continued...
