Notes on the G1 Algorithm Paper

2016-07-20 00:14
GC terminology

• Compaction – The garbage collection phase that defragments the heap, moves objects in memory, remaps all affected references, and frees contiguous memory regions.

• Concurrent – A type of garbage collection algorithm in which GC is done while the application is running.

• Copying – A garbage collector that performs mark/sweep/compact all at once by copying live objects to a new area in memory.

• Dead object – An object that is no longer being referenced by the application.

• GC safepoint – A point or range in a thread’s execution where the collector can identify all the references in the thread’s execution stack.

• Generational – Objects in memory are split between a young generation and an old generation, which are garbage collected separately.

• Incremental – Garbage collects only a portion of the heap at a time.

• Live object – One that is still being referenced by the application.

• Marking – The garbage collection phase that identifies all live objects in the heap.

• Monolithic – Garbage collects the entire heap at once.

• Mutator – Your application, which is changing (mutating) references in memory.

• Parallel – A collector that uses multiple threads.

• Pause – Time period when the application is stopped while garbage collection is occurring.

• Precise – A precise collector knows exactly where every possible object reference is.

• Promotion – Moving an object from the young generation to the old generation of the heap.

• Remembered Set – Tracks all references into the young generation from the outside so the collector doesn’t have to scan for them.

• Roots – Starting points for the garbage collector to find live objects.

• Stop-the-World – Indicates that the garbage collector stops application processing to collect the heap.

• Sweeping – The garbage collection phase that locates dead objects and frees the memory they occupy.

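The card table is used during concurrent marking, together with a write barrier. While the GC is marking objects, if a mutator modifies an object that the GC has already marked, that object is recorded in the card table so that the GC visits its fields again and marks them a second time. This re-marking step requires a stop-the-world pause.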

2 Data Structures

2.1 Heap Layout/Heap Regions/Allocation

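The heap is divided into equal-sized heap regions, each of which is a contiguous range of addresses.

Allocation within a heap region works by advancing a top pointer, which separates the allocated part of the region from the unallocated part.

TLAB: thread-local allocation buffer. Each thread has its own TLAB and allocates from it first.

CAS: compare-and-swap, the atomic operation threads use to claim a TLAB in the shared current allocation region.

current allocation region: the heap region currently being used for allocation.

Empty heap regions are kept on a linked list.

Larger objects are allocated directly in the current allocation region, bypassing the TLAB.

An object larger than 3/4 of a heap region is called a humongous object. A humongous object is placed in dedicated heap regions (a contiguous sequence of regions if one is not enough) that hold nothing else.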
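
The allocation scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 MiB region size and the names `HeapRegion`, `Tlab`, and `is_humongous` are invented here and do not come from the paper or from HotSpot. A real collector would also use CAS where the sketch bumps the shared `top` pointer.

```python
from dataclasses import dataclass

REGION_SIZE = 1 << 20                    # assumed region size (1 MiB), illustration only
HUMONGOUS_LIMIT = REGION_SIZE * 3 // 4   # the 3/4 rule from the text


@dataclass
class HeapRegion:
    start: int       # address of the first byte of the region
    top: int = -1    # first unallocated address; initialised in __post_init__

    def __post_init__(self):
        self.top = self.start

    @property
    def end(self) -> int:
        return self.start + REGION_SIZE

    def allocate(self, size: int):
        """Bump-pointer allocation: return the old top, or None if it does not fit."""
        if self.top + size > self.end:
            return None
        addr = self.top
        self.top += size     # shared pointer: a real VM would advance it with CAS
        return addr


class Tlab:
    """A private chunk carved out of the current allocation region (hypothetical class)."""

    def __init__(self, region: HeapRegion, size: int):
        base = region.allocate(size)
        if base is None:
            raise MemoryError("current allocation region is full")
        self.top = base
        self.end = base + size

    def allocate(self, size: int):
        if self.top + size > self.end:
            return None      # caller has to request a new TLAB
        addr = self.top
        self.top += size     # no CAS needed: only the owning thread allocates here
        return addr


def is_humongous(size: int) -> bool:
    return size > HUMONGOUS_LIMIT
```

Small objects go through `Tlab.allocate`, larger ones call `HeapRegion.allocate` on the current allocation region directly, and anything for which `is_humongous` returns true would get its own region(s).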

2.2 Remembered Set

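Remembered sets are introduced so that the live objects in the young generation can be found without having to traverse the old generation.

Each heap region has an associated remembered set, which records the locations that may contain pointers to live objects in that region.

Card table: every 512 bytes of a region correspond to one byte in the card table.

A remembered set is a set of cards (implemented as a hash table).

Each thread has an associated RS log buffer, which holds the cards that thread has recently modified. There is also a global set of filled RS buffers.

In the actual implementation, to support parallel GC, a region has several remembered sets, one for each parallel GC thread.

RS write barrier: whenever a pointer field of an object is modified, extra code must run to update the remembered set.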

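RS write barrier

x.f = y; // the original store
// ... other code ...

// RS write barrier (rX holds x, rY holds y)
rTemp = rX XOR rY
rTemp = rTemp >> LogOfHeapRegionSize // zero iff rX and rY are in the same region
rTemp = (rY == NULL) then 0 else rTemp // a null store never needs recording
if (rTemp == 0) goto filtered
call rs_enqueue(rX)
filtered:
// ... other code ...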


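What the code above does:

If x and y are in the same region (or y is NULL), nothing is done: the store does not need to be recorded in a remembered set.

rs_enqueue reads the card table entry for the object at rX. If the entry is already dirty, nothing is done. Otherwise the entry is marked dirty and a pointer to the card is appended to the thread's remembered set log.

When an RS log buffer becomes full (256 entries by default), it is added to the global set of filled RS buffers and the thread is given a new, empty RS log buffer.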
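
The filter and the enqueue step can be written out as a runnable Python sketch. It is a toy model, not the paper's or HotSpot's code: addresses are plain integers, address 0 stands in for NULL, the card table is a dict rather than a byte array, and the 1 MiB region size and the `MutatorThread` class are assumptions made for the example. The 512-byte card size and the 256-entry buffer come from the text.

```python
LOG_REGION_SIZE = 20        # assumed: 1 MiB regions
CARD_SHIFT = 9              # 512-byte cards, as in the text
LOG_BUFFER_CAPACITY = 256   # default from the text
NULL = 0                    # address 0 stands in for a null pointer

card_table = {}             # card index -> True when dirty (a byte array in a real VM)
filled_buffers = []         # global set of filled RS log buffers


class MutatorThread:
    """Hypothetical per-thread state: just the RS log buffer."""

    def __init__(self):
        self.log_buffer = []

    def rs_enqueue(self, x_addr):
        card = x_addr >> CARD_SHIFT
        if card_table.get(card):
            return                      # already dirty, so already logged
        card_table[card] = True
        self.log_buffer.append(card)
        if len(self.log_buffer) == LOG_BUFFER_CAPACITY:
            filled_buffers.append(self.log_buffer)
            self.log_buffer = []        # fresh empty buffer for this thread

    def write_barrier(self, x_addr, y_addr):
        """Run after the store x.f = y."""
        tmp = (x_addr ^ y_addr) >> LOG_REGION_SIZE
        if y_addr == NULL:
            tmp = 0
        if tmp == 0:
            return                      # filtered: same region, or null store
        self.rs_enqueue(x_addr)
```

The XOR-and-shift test works because two addresses in the same region agree on every bit above the region-size boundary, so those bits cancel to zero.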

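A concurrent RS thread waits until the number of filled RS buffers reaches a predefined threshold (5 buffers by default). It then processes filled RS buffers until the number remaining drops to 1/4 of that threshold.

For each buffer, the RS thread processes every card table pointer entry it contains.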
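
The start and stop thresholds can be illustrated with a small Python function. This is a sketch under assumptions: `refine_if_needed` and `process_card` are hypothetical names, and rounding "1/4 of 5" down to 1 with integer division is a choice made here, not something the text specifies.

```python
from collections import deque

INITIATING_THRESHOLD = 5                     # default from the text
STOP_THRESHOLD = INITIATING_THRESHOLD // 4   # assumed rounding: 5 // 4 == 1


def refine_if_needed(filled_buffers: deque, process_card) -> int:
    """One wake-up of the RS thread; returns how many buffers were drained."""
    if len(filled_buffers) < INITIATING_THRESHOLD:
        return 0                             # not enough work yet: keep waiting
    drained = 0
    while len(filled_buffers) > STOP_THRESHOLD:
        buffer = filled_buffers.popleft()
        for card in buffer:                  # every card entry in the buffer
            process_card(card)
        drained += 1
    return drained
```

Using two different thresholds gives hysteresis: the thread does not wake up for every single buffer, but once it is running it works the backlog down well below the trigger point.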

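Some cards are hot, meaning they are written to frequently. Processing of hot cards is deferred until the next evacuation pause.

Hot cards are tracked with an auxiliary card table that records how many times each card has been dirtied since the last pause (this table is cleared at every pause).

When a card is processed, its count is incremented. Once the count exceeds the hotness threshold (4 by default), the card is added to a circular buffer called the hot queue (1K entries by default). The hot queue is also processed during a pause and is then emptied.

If the hot queue is already full, one card is taken from the head of the queue and processed immediately, and the new card is then added at the tail.

The RS thread therefore processes two kinds of cards: those that have not reached the hotness threshold, and those evicted from the hot queue.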
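
A minimal Python sketch of the hot-card bookkeeping follows. The `HotCardTracker` class, its method names, and the `refine` callback are assumptions made for illustration; only the threshold of 4 and the queue size of 1K come from the text. The sketch does not deduplicate cards inside the hot queue.

```python
from collections import deque

HOT_THRESHOLD = 4            # default from the text
HOT_QUEUE_CAPACITY = 1024    # "1K" from the text


class HotCardTracker:
    def __init__(self, refine, capacity=HOT_QUEUE_CAPACITY):
        self.refine = refine        # callback that actually processes a card
        self.capacity = capacity
        self.counts = {}            # auxiliary card table: card -> times dirtied
        self.hot_queue = deque()

    def process(self, card):
        """Called by the RS thread for each card taken from a log buffer."""
        self.counts[card] = self.counts.get(card, 0) + 1
        if self.counts[card] <= HOT_THRESHOLD:
            self.refine(card)       # not hot: process right away
            return
        if len(self.hot_queue) == self.capacity:
            self.refine(self.hot_queue.popleft())   # evict the oldest hot card
        self.hot_queue.append(card)                 # defer the hot card

    def at_pause(self):
        """At a pause: process the whole hot queue, then reset the counts."""
        while self.hot_queue:
            self.refine(self.hot_queue.popleft())
        self.counts.clear()
```

Deferring hot cards avoids re-scanning the same 512 bytes over and over between pauses; the bounded queue keeps the deferred work from growing without limit.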

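How the RS thread processes a card:

First, it resets the card's entry in the card table to clean, so that other threads can dirty it again.

It then examines the pointer fields of every object whose modification may have dirtied the card, looking for pointers that point outside the containing heap region. If such a pointer is found, the card is inserted into the remembered set of the referenced region.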
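
These two steps can be sketched in Python over a toy heap model. Everything about the model is an assumption for the example: `objects_on_card` is a hypothetical mapping from a card index to the objects on that card, each object being just a list of pointer-field values, and regions are taken to be 1 MiB. A real VM would walk object headers instead.

```python
from collections import defaultdict

CARD_SHIFT = 9          # 512-byte cards
LOG_REGION_SIZE = 20    # assumed: 1 MiB regions
NULL = 0


def refine_card(card, card_table, objects_on_card, remembered_sets):
    # Step 1: clean the entry first, so a concurrent store can re-dirty it.
    card_table[card] = False
    card_region = (card << CARD_SHIFT) >> LOG_REGION_SIZE
    # Step 2: look at every pointer field of every object on the card.
    for fields in objects_on_card.get(card, []):
        for target in fields:
            if target == NULL:
                continue
            target_region = target >> LOG_REGION_SIZE
            if target_region != card_region:
                # Cross-region pointer: record the card in the
                # remembered set of the region being pointed to.
                remembered_sets[target_region].add(card)
```

Cleaning the entry before scanning matters for ordering: a store that races with the scan dirties the card again and gets it re-enqueued, so the update is not lost.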

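2.3 Evacuation Pauses

Parallel processing:

The collection set is chosen sequentially.

GC threads then compete to claim the following tasks:

Scan the pending RS log buffers to update the remembered sets

Scan the remembered sets and the other roots

Evacuate the live objects

There is no explicit synchronization among the GC threads for these tasks, beyond ensuring that each task is performed by only one thread.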
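
The "compete to claim tasks" pattern can be illustrated with a short Python sketch. It shows only the claiming mechanism, not the real work or any ordering between GC phases; `run_pause` and the use of a `queue.Queue` as the shared task pool are assumptions made for the example.

```python
import queue
import threading


def run_pause(tasks, num_workers=4):
    """Run every task exactly once, letting worker threads compete for them."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)
    done = []    # list.append is safe to call from several threads in CPython

    def worker(worker_id):
        while True:
            try:
                task = pending.get_nowait()   # claiming is the only coordination point
            except queue.Empty:
                return                        # nothing left to claim
            done.append((task(), worker_id))

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return done
```

Which worker ends up with which task is not deterministic, which is the point: the threads balance the load among themselves without a central scheduler.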