Writing It Up Myself: A Brief Look at the HashMap Source Code
2012-04-18 10:56
http://boy00fly.iteye.com/blog/1139845
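The forum already has plenty of analyses of the HashMap source code, many of them very good, but I still want to write up my own notes.
III. HashMap
Let's start with the class declaration of HashMap.
Java code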
public class HashMap<K, V> extends AbstractMap<K, V> implements Map<K, V>, Cloneable, Serializable
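1. How HashMap stores its data
HashMap stores its data in an array of linked lists. Let's first get an intuitive picture from a diagram:
In the diagram, 1, 2, 4 and 15 are values stored in the HashMap. Which slot each value lands in is determined by applying the hash function to its key and then computing an array index from the result.
A natural question is: "Can two keys end up with the same index?" Yes, that is quite likely to happen. The follow-up question is: "If they do, where does the value go?"
This is where the linked list comes in; the values 4 and 15 in the diagram are exactly this case. PS: more details follow below.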
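To make the array-of-linked-lists idea concrete, here is a minimal sketch. It is not HashMap's actual code: the class ChainedBucketsSketch, its Node type, and the simplified index computation (masking the key with table.length - 1) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a tiny fixed-size table with chaining. All names are hypothetical.
class ChainedBucketsSketch {
    // One node of a bucket's singly linked list.
    static final class Node {
        final int key;
        final String value;
        final Node next;

        Node(int key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Node[] table = new Node[4]; // length is a power of two

    // Simplified index computation (an assumption of this sketch).
    int indexOf(int key) {
        return key & (table.length - 1);
    }

    // Insert at the head of the bucket's list (no duplicate-key check here).
    void add(int key, String value) {
        int i = indexOf(key);
        table[i] = new Node(key, value, table[i]);
    }

    // Collect the values stored in one bucket, head first.
    List<String> bucket(int i) {
        List<String> out = new ArrayList<>();
        for (Node n = table[i]; n != null; n = n.next)
            out.add(n.value);
        return out;
    }

    public static void main(String[] args) {
        ChainedBucketsSketch m = new ChainedBucketsSketch();
        m.add(1, "a"); // 1 & 3 = 1
        m.add(5, "b"); // 5 & 3 = 1, collides with key 1
        m.add(2, "c"); // 2 & 3 = 2
        System.out.println(m.bucket(1)); // [b, a]
        System.out.println(m.bucket(2)); // [c]
    }
}
```

Keys 1 and 5 collide, so both end up in the list at index 1, with the most recently added entry at the head.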
2. Several important member variables
Java code
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 16;
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
/**
* The table, resized as necessary. Length MUST Always be a power of two.
*/
transient Entry[] table;
/**
* The number of key-value mappings contained in this map.
*/
transient int size;
/**
* The next size value at which to resize (capacity * load factor).
* @serial
*/
int threshold;
/**
* The load factor for the hash table.
*
* @serial
*/
final float loadFactor;
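DEFAULT_INITIAL_CAPACITY: "capacity" here does not mean the number of mappings the HashMap can hold; it is the default length of the table array, and it must be a power of two.
MAXIMUM_CAPACITY: the maximum length of the table array, 2 to the 30th power.
table: holds all of the key-value mappings.
Let's first look at an excerpt of the Entry source:
Java code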
static class Entry<K, V> implements Map.Entry<K, V> // class declaration
{
    // the important fields
    final K key;
    V value;
    Entry<K, V> next;
    final int hash;
    // constructor and methods omitted
}
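Entry is a static nested class of HashMap. Its fields are mostly self-explanatory; next refers to the next Entry in the same linked list.
In the diagram above, for example, the next of the Entry holding 15 points to the Entry holding 4, while the next of the Entry holding 1 is null because nothing follows it in its list.
size: the number of key-value mappings stored in the HashMap. PS: this is not the length of the table array.
DEFAULT_LOAD_FACTOR: the default load factor, 0.75f.
threshold: the resize threshold. Once the size of the HashMap has reached threshold, adding another entry grows the table array by 100%, that is, to twice its previous length.
You may ask: "How is the value of threshold computed?" The source makes this clear; here is an excerpt from the constructor:
Java code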
// Find a power of 2 >= initialCapacity
int capacity = 1;
while (capacity < initialCapacity)
    capacity <<= 1;
this.loadFactor = loadFactor;
threshold = (int)(capacity * loadFactor);
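Here initialCapacity is a constructor parameter, the requested initial capacity. As you can see, it is not used directly: the loop ensures that
the table array is created with a length that is a power of two and not smaller than initialCapacity. threshold is then that capacity multiplied by loadFactor, truncated to an int.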
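The rounding loop and the threshold formula can be tried in isolation. The following is a small sketch that copies the two expressions from the constructor excerpt into hypothetical helper methods (CapacitySketch, roundUpToPowerOfTwo and threshold are names invented here). It assumes the argument is at most 1 << 30, which the real constructor guarantees by clamping to MAXIMUM_CAPACITY first.

```java
// Illustrative helpers; the names are made up for this example.
class CapacitySketch {
    // Same loop as the constructor excerpt: smallest power of two >= initialCapacity.
    // Assumes initialCapacity <= 1 << 30, otherwise the shift would overflow.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Same expression as the constructor excerpt.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {1, 10, 16, 17, 100}) {
            int capacity = roundUpToPowerOfTwo(requested);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + threshold(capacity, 0.75f));
        }
    }
}
```

For example, a requested capacity of 10 is rounded up to 16, which gives a threshold of 12 with the default load factor, while 17 is rounded up to 32 with a threshold of 24.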
3. Constructors
Java code
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor)
{
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}
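The constructor above is the important one; the other constructors all delegate to it. With the explanation above in mind it is easy to follow, so I won't go over it again.
4. Several important methods
put(K key, V value)
Java code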
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value)
{
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K, V> e = table[i]; e != null; e = e.next)
    {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
        {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
This method is important and worth analyzing carefully, so let's go through it step by step:
1. When key == null, look at the source of putForNullKey(V value):
Java code
/**
* Offloaded version of put for null keys
*/
private V putForNullKey(V value)
{
    for (Entry<K, V> e = table[0]; e != null; e = e.next)
    {
        if (e.key == null)
        {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(0, null, value, 0);
    return null;
}
Java code
/**
* Adds a new entry with the specified key, value and hash code to
* the specified bucket. It is the responsibility of this
* method to resize the table if appropriate.
*
* Subclass overrides this to alter the behavior of put method.
*/
void addEntry(int hash, K key, V value, int bucketIndex)
{
    Entry<K, V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K, V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}
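putForNullKey first walks the linked list at table[0] to see whether an Entry with a null key is already stored. If so, it replaces that Entry's value; otherwise it inserts a new Entry at table[0].
PS: this shows that the Entry with a null key always lives at table[0], and its hash is 0 as well.
2. When key != null, the hash is first computed from the key, and the table index i is derived from that hash. The method then walks the linked list at table[i]
to see whether the key is already present. If it is, the value is replaced; if not, a new Entry is created and inserted at the head of the list, with its next pointing to the Entry that was previously at table[i].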
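The article does not show the body of indexFor, so the following sketch rests on an assumption: that the index is obtained by masking the hash with length - 1, which is how this JDK-era HashMap is commonly quoted. Under that assumption, the example illustrates why the table length must be a power of two: the mask then gives the same result as a non-negative modulo, even for negative hash values. The class name IndexForSketch is invented for this example.

```java
// Illustrative only: indexFor's body is an assumption, not quoted from the article.
class IndexForSketch {
    // Assumed implementation: keep only the low bits of the hash.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two for the mask to act as a modulo
        for (int h : new int[] {0, 5, 16, 21, -1, Integer.MIN_VALUE}) {
            int masked = indexFor(h, length);
            int modulo = Math.floorMod(h, length);
            System.out.println(h + " -> " + masked + " (floorMod: " + modulo + ")");
        }
    }
}
```

With a length that is not a power of two, the mask would skip some indexes entirely, so some buckets could never be used.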
get(Object key)
Java code
/**
* Returns the value to which the specified key is mapped,
* or {@code null} if this map contains no mapping for the key.
*
* <p>More formally, if this map contains a mapping from a key
* {@code k} to a value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise
* it returns {@code null}. (There can be at most one such mapping.)
*
* <p>A return value of {@code null} does not <i>necessarily</i>
* indicate that the map contains no mapping for the key; it's also
* possible that the map explicitly maps the key to {@code null}.
* The {@link #containsKey containsKey} operation may be used to
* distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key)
{
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K, V> e = table[indexFor(hash, table.length)]; e != null; e = e.next)
    {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
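get is simple too. A null key is handled as a special case: the linked list at table[0] is walked, and the value is returned if a null-key Entry is found, otherwise null. For any other key the list at the computed index is walked in the same way.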
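The behaviour described in the Javadoc of put and get can be observed from the outside with the public java.util.HashMap API. This short demo (the class name NullSemanticsDemo is made up) shows the return value of put, the null key, and why containsKey is needed to tell "mapped to null" apart from "not present".

```java
import java.util.HashMap;
import java.util.Map;

class NullSemanticsDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();

        System.out.println(map.put("a", "1")); // null: there was no previous mapping
        System.out.println(map.put("a", "2")); // 1: the replaced value is returned

        map.put(null, "for-null-key"); // a null key is allowed
        map.put("b", null);            // so is a null value

        System.out.println(map.get(null));      // for-null-key
        System.out.println(map.get("b"));       // null, although "b" is present
        System.out.println(map.get("missing")); // null, because the key is absent

        System.out.println(map.containsKey("b"));       // true
        System.out.println(map.containsKey("missing")); // false
    }
}
```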
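containsKey(Object key) and containsValue(Object value)
Here is the idea:
containsKey computes the table index for the key (a null key is of course handled specially) and then walks the linked list at table[index].
containsValue has no shortcut and needs two nested loops. Here is the source:
Java code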
public boolean containsValue(Object value)
{
    if (value == null)
        return containsNullValue();
    Entry[] tab = table;
    for (int i = 0; i < tab.length; i++)
        for (Entry e = tab[i]; e != null; e = e.next)
            if (value.equals(e.value))
                return true;
    return false;
}
As you can see, it walks the array and, for each slot, the linked list in it.
remove(Object key)
remove simply calls removeEntryForKey, so let's look at the source of that method:
Java code
/**
* Removes and returns the entry associated with the specified key
* in the HashMap. Returns null if the HashMap contains no mapping
* for this key.
*/
final Entry<K, V> removeEntryForKey(Object key)
{
    int hash = (key == null) ? 0 : hash(key.hashCode());
    int i = indexFor(hash, table.length);
    Entry<K, V> prev = table[i];
    Entry<K, V> e = prev;
    while (e != null)
    {
        Entry<K, V> next = e.next;
        Object k;
        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
        {
            modCount++;
            size--;
            if (prev == e)
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }
    return e;
}
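Again, here is the idea:
The table index for the key is computed, which identifies the linked list. The list is walked to find the Entry for the key, that Entry is removed, and the list is joined back together.
For the details of the algorithm, read the source directly; I won't go over them here.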
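The "remove the Entry and join the list back together" step is the classic unlink operation on a singly linked list. Below is a stripped-down sketch of just that step, using the same prev/e/next pattern as removeEntryForKey. The class UnlinkSketch, its Node type and the head field (standing in for table[i]) are hypothetical, and keys are assumed to be non-null.

```java
// Illustrative only: a single bucket's list, with names invented for this example.
class UnlinkSketch {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    Node head; // plays the role of table[i]

    // Remove the first node whose key equals the argument; return it, or null if absent.
    Node remove(String key) {
        Node prev = head;
        Node e = prev;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e)
                    head = next;      // e was the head: the bucket now starts at next
                else
                    prev.next = next; // bypass e so that prev links straight to next
                return e;
            }
            prev = e;
            e = next;
        }
        return null;
    }

    // Render the remaining keys, head first, separated by spaces.
    String dump() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next)
            sb.append(n.key).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        UnlinkSketch list = new UnlinkSketch();
        list.head = new Node("a", new Node("b", new Node("c", null)));
        list.remove("b");
        System.out.println(list.dump()); // a c
        list.remove("a");
        System.out.println(list.dump()); // c
    }
}
```

The prev == e test is true only while both still point at the head, which is how the code distinguishes "remove the first node" from "remove a later node" without a separate flag.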
entrySet()
Java code
/**
* Returns a {@link Set} view of the mappings contained in this map.
* The set is backed by the map, so changes to the map are
* reflected in the set, and vice-versa. If the map is modified
* while an iteration over the set is in progress (except through
* the iterator's own <tt>remove</tt> operation, or through the
* <tt>setValue</tt> operation on a map entry returned by the
* iterator) the results of the iteration are undefined. The set
* supports element removal, which removes the corresponding
* mapping from the map, via the <tt>Iterator.remove</tt>,
* <tt>Set.remove</tt>, <tt>removeAll</tt>, <tt>retainAll</tt> and
* <tt>clear</tt> operations. It does not support the
* <tt>add</tt> or <tt>addAll</tt> operations.
*
* @return a set view of the mappings contained in this map
*/
public Set<Map.Entry<K, V>> entrySet()
{
    return entrySet0();
}
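Why talk about this method? The forum has many discussions about which way of iterating over a map is most efficient. Once you understand HashMap's internal data structure, the answer is simple:
just walk the table array, which is as efficient as it gets. That is exactly what iterating over entrySet does, with the result presented as a Set.
PS: we won't go into the details of this method; if you are interested, read the source yourself.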
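As a usage illustration, here are two common ways to iterate over a map through the public API. Both compute the same result; the entrySet version reads each value straight from the entry, while the keySet version performs an extra get lookup per key. The class and method names are invented for this example, and no timing claims are made.

```java
import java.util.HashMap;
import java.util.Map;

class EntrySetIteration {
    // Iterate over the entries: key and value come from the same Entry object.
    static int sumViaEntrySet(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : map.entrySet())
            sum += e.getValue();
        return sum;
    }

    // Iterate over the keys and look each value up again with get.
    static int sumViaKeySet(Map<String, Integer> map) {
        int sum = 0;
        for (String key : map.keySet())
            sum += map.get(key);
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        System.out.println(sumViaEntrySet(map)); // 6
        System.out.println(sumViaKeySet(map));   // 6
    }
}
```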
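That's all on HashMap for now. There is of course plenty we haven't covered, such as the hash computation itself, which I consider a separate topic.
You can set it aside while getting to know HashMap; hashing is a large subject that I won't go into here, and interested readers can look it up on Google.
PS: I have attached a jar that simulates data structure operations, and it is very good. I recommend downloading it. Run it with the command java -jar visualization.jar.
It includes a simulation of hashing.
See also the article "Java Map 集合类简介" (An Introduction to the Java Map Collection Classes).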
visualization.jar (572.9 KB)