
Daily laravel-20160628|TaggedCache

2016-04-03 21:50
Table of Contents

1 Introduction
2 Generations
2.1 Performance Considerations
2.2 Measurement
3 Sizing the Generations
3.1 Total Heap
3.2 The Young Generation
3.2.1 Young Generation Guarantee
4 Types of Collectors
4.1 When to Use the Throughput Collector
4.2 The Throughput Collector
4.2.1 Adaptive Sizing
4.2.2 AggressiveHeap
4.2.3 Measurements with the Throughput Collector
4.3 When to Use the Concurrent Low Pause Collector
4.4 The Concurrent Low Pause Collector
4.4.1 Overhead of Concurrency
4.4.2 Young Generation Guarantee
4.4.3 Full Collections
4.4.4 Floating Garbage
4.4.5 Pauses
4.4.6 Concurrent Phases
4.4.7 Measurements with the Concurrent Collector
4.4.8 Parallel Minor Collection Options with the Concurrent Collector
4.5 When to Use the Incremental Low Pause Collector
4.6 The Incremental Low Pause Collector
4.6.1 Measurements with the Incremental Collector
5 Other Considerations
6 Conclusion
7 Other Documentation
7.1 Example of Output
7.2 Frequently Asked Questions

 



1 Introduction

The Java(TM) 2 Platform, Standard Edition (J2SE(TM) platform) is used
for a wide variety of applications, from small applets on desktops to
web services on large servers. In the J2SE platform version 1.4.1 two
new garbage collectors were introduced to make a total of four garbage
collectors from which to choose. How should that choice be made, and
what are the consequences of that choice? This document describes some
of the general features shared by all the garbage collectors. It then
discusses tuning options to take the best advantage of those features
in the context of the default single-threaded, stop-the-world
collector. Finally, it discusses the specific features of the three
other collectors and the criteria for choosing one of the four
collectors.

When does garbage collection performance matter to the user? For
many applications it doesn't. That is, the application can perform
within its specifications in the presence of garbage collection with
pauses of modest frequency and duration. An example where this is not
the case (when the default collector is used) would be a large
application that scales well to a large number of threads, processors,
and sockets, and a large amount of memory.

Amdahl observed that most workloads cannot be perfectly parallelized;
some portion is always sequential and does not benefit from
parallelism. This is also true for the J2SE platform. In particular,
virtual machines for the Java(TM) platform up to and including version
1.3.1 do not have parallel garbage collection, so the impact of
garbage collection on a multiprocessor system grows relative to an
otherwise parallel application.

The graph below models an ideal system that is perfectly scalable
with the exception of garbage collection. The red line is an
application spending only 1% of the time in garbage collection on a
uniprocessor system. This translates to more than a 20% loss in
throughput on 32 processor systems. At 10% of the time in garbage
collection (not considered an outrageous amount of time in garbage
collection in uniprocessor applications) more than 75% of throughput
is lost when scaling up to 32 processors.
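
The figures in this paragraph follow from Amdahl's law. The sketch below is an illustration only, assuming that garbage collection is the sole serial portion and that everything else scales linearly; the class and method names are made up for this example.

```java
// Sketch: throughput of an otherwise perfectly parallel application
// when a fixed fraction of uniprocessor time is serial garbage collection.
class GcAmdahl {
    // gcFraction: share of uniprocessor run time spent in (serial) GC.
    // Returns the speedup on `processors` CPUs relative to one CPU.
    static double speedup(double gcFraction, int processors) {
        return 1.0 / (gcFraction + (1.0 - gcFraction) / processors);
    }

    // Fraction of the ideal (linear) throughput that is lost.
    static double throughputLoss(double gcFraction, int processors) {
        return 1.0 - speedup(gcFraction, processors) / processors;
    }

    public static void main(String[] args) {
        for (double gc : new double[] {0.01, 0.10}) {
            System.out.printf("GC %.0f%% on 1 CPU -> %.1f%% loss on 32 CPUs%n",
                    gc * 100, throughputLoss(gc, 32) * 100);
        }
    }
}
```

Under these assumptions the loss comes out at roughly 24% for 1% GC time and roughly 76% for 10% GC time, consistent with the "more than 20%" and "more than 75%" quoted above.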



This shows that negligible speed issues when developing on small
systems may become principal bottlenecks when scaling up to large
systems. However, small improvements in reducing such a bottleneck can
produce large gains in performance. For a sufficiently large system it
becomes well worthwhile to tune the garbage collector.

The default collector should be the first choice for garbage
collection and will be adequate for the majority of applications.
Each of the other collectors has some added overhead and/or
complexity, which is the price for specialized behavior. If the
application doesn't need the specialized behavior of the alternate
collectors, use the default collector. The exception to this rule is
large applications that are heavily threaded and run on hardware with
a large amount of memory and a large number of processors. For such
applications, first try the aggressive heap option
(-XX:+AggressiveHeap) described below.

This document was written using the J2SE platform, version 1.4.2, on
the Solaris(TM) Operating Environment (SPARC(R) Platform Edition) as
the base platform, because it provides the most scalable hardware and
software for the J2SE platform. However, the descriptive text applies
to other supported platforms, including Linux, Microsoft Windows, and
the Solaris Operating Environment (x86 Platform Edition), to the
extent that scalable hardware is available. Although command line
options are consistent across platforms, some platforms may have
defaults different from those described here.



2 Generations

One strength of the J2SE platform is that it shields the
complexity of memory allocation and garbage collection from the
developer. However, once garbage collection is the principal
bottleneck, it is worth understanding some aspects of this hidden
implementation. Garbage collectors make assumptions about the way
applications use objects, and these are reflected in tunable
parameters that can be adjusted for improved performance without
sacrificing the power of the abstraction.

An object is considered garbage when it can no longer be reached
from any pointer in the running program. The most straightforward
garbage collection algorithms simply iterate over every reachable
object. Any objects left over are then considered garbage. The time
this approach takes is proportional to the number of live objects,
which is prohibitive for large applications maintaining lots of live
data.
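
As a rough illustration of the "iterate over every reachable object" approach, the following sketch marks everything reachable from a set of roots in a toy object graph. The Node type and all names here are hypothetical and stand in for heap objects; this is not how the virtual machine is implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilitySketch {
    // Hypothetical heap object: a name plus outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Visits every object reachable from the roots. The work done is
    // proportional to the number of live objects, not to the garbage.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b);   // b is reachable through the root a
        c.refs.add(a);   // c points at a, but nothing points at c
        Set<Node> live = reachable(Arrays.asList(a));
        for (Node n : Arrays.asList(a, b, c)) {
            System.out.println(n.name + (live.contains(n) ? " live" : " garbage"));
        }
    }
}
```

Anything not in the returned set is garbage, which is why the cost of this naive scheme grows with the amount of live data.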

Beginning with the J2SE platform, version 1.2, the virtual machine
incorporated a number of different garbage collection algorithms that
are combined using generational collection. While naive garbage
collection examines every live object in the heap, generational
collection exploits several empirically observed properties of most
applications to avoid extra work.

The most important of these observed properties is infant mortality.
The blue area in the diagram below is a typical distribution for the
lifetimes of objects. The X axis is object lifetimes measured in bytes
allocated. The byte count on the Y axis is the total bytes in objects
with the corresponding lifetime. The sharp peak at the left represents
objects that can be reclaimed (i.e., have "died") shortly after being
allocated. Iterator objects, for example, are often alive for the
duration of a single loop.



Some objects do live longer, and so the distribution stretches out
to the right. For instance, there are typically some objects
allocated at initialization that live until the process exits.
Between these two extremes are objects that live for the duration of
some intermediate computation, seen here as the lump to the right of
the infant mortality peak. Some applications have very different
looking distributions, but a surprisingly large number possess this
general shape. Efficient collection is made possible by focusing on
the fact that a majority of objects "die young".

To optimize for this scenario, memory is managed in generations, or
memory pools holding objects of different ages. Garbage collection
occurs in each generation when the generation fills up. Objects are
allocated in a generation for younger objects, the young generation,
and because of infant mortality most objects die there. When the young
generation fills up it causes a minor collection. Minor collections
can be optimized assuming a high infant mortality rate. The costs of
such collections are, to the first order, proportional to the number
of live objects being collected. A young generation full of dead
objects is collected very quickly. Some surviving objects are moved to
a tenured generation. When the tenured generation needs to be
collected there is a major collection that is often much slower
because it involves all live objects.

The diagram below shows minor collections occurring at intervals long
enough to allow many of the objects to die between collections. It is
well-tuned in the sense that the young generation is large enough (and
thus the period between minor collections long enough) that the minor
collection can take advantage of the high infant mortality rate. This
situation can be upset by applications with unusual lifetime
distributions, or by poorly sized generations that cause collections
to occur before objects have had time to die.

The default garbage collector is meant to be used by applications
large and small. Its default parameters were designed to be effective
for most small applications. The default parameters aren't optimal for
many server applications. This leads to the central tenet of this
document:

  If the garbage collector has become a bottleneck, you may wish to
  customize the generation sizes. Check the verbose garbage collector
  output, and then explore the sensitivity of your individual
  performance metric to the garbage collector parameters.

 

The default arrangement of generations looks something like this.



At initialization, a maximum address space is virtually reserved but
not allocated to physical memory unless it is needed. The complete
address space reserved for object memory can be divided into the young
and tenured generations.

The young generation consists of eden plus two survivor spaces.
Objects are initially allocated in eden. One survivor space is empty
at any time, and serves as the destination of the next copying
collection of any live objects in eden and the other survivor space.
Objects are copied between survivor spaces in this way until they are
old enough to be tenured, or copied to the tenured generation.

Other virtual machines, including the production virtual machine for
the J2SE platform, version 1.2 for the Solaris Operating Environment,
used two equally sized spaces for copying rather than one large eden
plus two small spaces. This means the options for sizing the young
generation are not directly comparable; see the Performance FAQ for an
example.

One portion of the tenured generation, called the permanent
generation, is special because it holds all the reflective data of the
virtual machine itself, such as class and method objects.



2.1 Performance Considerations

There are two primary measures of garbage collection performance.
Throughput is the percentage of total time not spent in garbage
collection, considered over long periods of time. Throughput includes
time spent in allocation (but tuning for speed of allocation is
generally not needed). Pauses are the times when an application
appears unresponsive because garbage collection is occurring.

Users have different requirements of garbage collection. For
example, some consider the right metric for a web server to be
throughput, since pauses during garbage collection may be tolerable,
or simply obscured by network latencies. However, in an interactive
graphics program even short pauses may negatively affect the user
experience.

Some users are sensitive to other considerations. Footprint is the
working set of a process, measured in pages and cache lines. On
systems with limited physical memory or many processes, footprint may
dictate scalability. Promptness is the time between when an object
becomes dead and when the memory becomes available, an important
consideration for distributed systems, including remote method
invocation (RMI).

In general, a particular generation sizing chooses a trade-off between
these considerations. For example, a very large young generation may
maximize throughput, but does so at the expense of footprint,
promptness, and pause times. Young generation pauses can be minimized
by using a small young generation at the expense of throughput. To a
first approximation, the sizing of one generation does not affect the
collection frequency and pause times for another generation.

There is no one right way to size generations. The best choice is
determined by the way the application uses memory as well as user
requirements. For this reason the virtual machine's default garbage
collector may not be optimal, and may be overridden by the user with
command line options, described below.



2.2 Measurement

Throughput and footprint are best measured using metrics particular to
the application. For example, throughput of a web server may be tested
using a client load generator, while footprint of the server might be
measured on the Solaris Operating Environment using the pmap command.
On the other hand, pauses due to garbage collection are easily
estimated by inspecting the diagnostic output of the virtual machine
itself.

The command line argument -verbose:gc prints information at every
collection. Note that the format of the -verbose:gc output is subject
to change between releases of the J2SE platform. For example, here is
output from a large server application:

  [GC 325407K->83000K(776768K), 0.2300771 secs]
  [GC 325816K->83372K(776768K), 0.2454258 secs]
  [Full GC 267628K->83769K(776768K), 1.8479984 secs]

Here we see two minor collections and one major one. The numbers
before and after the arrow (325407K->83000K in the first line)
indicate the combined size of live objects before and after garbage
collection, respectively. After minor collections the count includes
objects that aren't necessarily alive but can't be reclaimed, either
because they are directly alive, or because they are within or
referenced from the tenured generation. The number in parentheses
(776768K in the first line) is the total available space, not counting
the space in the permanent generation; it is the total heap minus one
of the survivor spaces. The minor collection took about a quarter of a
second (0.2300771 secs in the first line).
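
To make the field layout concrete, here is a minimal sketch that pulls the four numbers out of one -verbose:gc line of the shape quoted above. It assumes exactly that 1.4.2-style shape; since the format is subject to change between releases, the pattern would need adjusting for other versions. The class name and fields are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VerboseGcLine {
    // Matches lines such as
    // "[GC 325407K->83000K(776768K), 0.2300771 secs]".
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean major;      // true for "Full GC" lines
    final long beforeK;       // occupancy before the collection
    final long afterK;        // occupancy after the collection
    final long totalK;        // total available space
    final double seconds;     // duration of the collection

    private VerboseGcLine(Matcher m) {
        major = m.group(1).equals("Full GC");
        beforeK = Long.parseLong(m.group(2));
        afterK = Long.parseLong(m.group(3));
        totalK = Long.parseLong(m.group(4));
        seconds = Double.parseDouble(m.group(5));
    }

    // Returns null when the line does not have the expected shape.
    static VerboseGcLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        return m.matches() ? new VerboseGcLine(m) : null;
    }

    public static void main(String[] args) {
        VerboseGcLine gc = parse("[GC 325407K->83000K(776768K), 0.2300771 secs]");
        System.out.println("reclaimed " + (gc.beforeK - gc.afterK) + "K in "
                + gc.seconds + " s, major=" + gc.major);
    }
}
```

Summing the seconds field over a run gives a quick estimate of total pause time, which can then be compared against elapsed time to approximate throughput.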

The format for the major collection in the third line is similar.
The flag -XX:+PrintGCDetails prints additional information about the
collections. This additional output is liable to change with each
version of the virtual machine, since it follows the needs of the
development of the Java Virtual Machine. An example of the output with
-XX:+PrintGCDetails for the J2SE platform, version 1.4.2 is shown
here.

  [GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

This indicates that the minor collection recovered about 98% of the
young generation (DefNew: 64575K->959K(64576K)) and took about 46
milliseconds (0.0457646 secs). The usage of the entire heap was
reduced to about 51% (196016K->133633K(261184K)), and there was some
slight additional overhead for the collection (over and above the
collection of the young generation), as indicated by the final time
(0.0459067 secs).
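
The percentages quoted for this line can be checked with a few lines of arithmetic. The sketch below simply recomputes them from the numbers in the sample output; the helper names are made up for illustration.

```java
class GcDetailArithmetic {
    // Share of the occupied space that a collection reclaimed.
    static double reclaimedFraction(long beforeK, long afterK) {
        return (double) (beforeK - afterK) / beforeK;
    }

    // Occupancy after the collection as a share of total capacity.
    static double occupancy(long afterK, long capacityK) {
        return (double) afterK / capacityK;
    }

    public static void main(String[] args) {
        // Figures taken from the DefNew sample line above.
        System.out.printf("young generation reclaimed: %.1f%%%n",
                100 * reclaimedFraction(64575, 959));
        System.out.printf("heap occupancy afterwards:  %.1f%%%n",
                100 * occupancy(133633, 261184));
        System.out.printf("overhead beyond the young collection: %.4f ms%n",
                (0.0459067 - 0.0457646) * 1000);
    }
}
```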

The flag -XX:+PrintGCTimeStamps will additionally print a time stamp
at the start of each collection.

  111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

The collection starts about 111 seconds into the execution of the
application. The minor collection starts at about the same time.
Additionally, information is shown for a major collection, delineated
by Tenured. The tenured generation usage was reduced to about 10%
(18154K->2311K(24576K)), and the collection took about 0.13 seconds
(0.1290354 secs).



3 Sizing the Generations

A number of parameters affect generation size. The following diagram
illustrates the difference between committed space and virtual space
in the heap. At initialization of the virtual machine, the entire
space for the heap is reserved. The size of the space reserved can be
specified with the -Xmx option. If the value of the -Xms parameter is
smaller than the value of the -Xmx parameter, not all of the space
that is reserved is immediately committed to the virtual machine. The
uncommitted space is labeled "virtual" in this figure. The different
parts of the heap (permanent generation, tenured generation, and young
generation) can grow to the limit of the virtual space as needed.
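
The committed and reserved sizes can be observed from inside a running program through java.lang.Runtime. The following is a small sketch; it assumes that totalMemory() roughly corresponds to the committed heap and maxMemory() to the limit governed by -Xmx, and the class and method names are invented for this example.

```java
class HeapSpaces {
    // Returns {committed, max, free} in bytes as reported by the runtime.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // heap currently committed
        long max = rt.maxMemory();         // upper bound the heap may grow to
        long free = rt.freeMemory();       // unused part of the committed heap
        return new long[] {committed, max, free};
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024 * 1024;
        System.out.println("committed:          " + s[0] / mb + " MB");
        System.out.println("maximum:            " + s[1] / mb + " MB");
        System.out.println("still \"virtual\":    " + (s[1] - s[0]) / mb + " MB");
        System.out.println("free in committed:  " + s[2] / mb + " MB");
    }
}
```

Running it with -Xms smaller than -Xmx should show a non-zero "virtual" amount, which disappears when the two options are set to the same value.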

 

Some of the parameters are ratios of one part of the heap to another.
For example, the parameter NewRatio denotes the relative size of the
tenured generation to the young generation. These parameters are
discussed below.





3.1 Total Heap

Since collections occur when generations fill up, the frequency of
collections is inversely proportional to the amount of memory
available, so throughput generally improves as memory is added. Total
available memory is the most important factor affecting garbage
collection performance.

By default, the virtual machine grows or shrinks the heap at each
collection to try to keep the proportion of free space to live objects
at each collection within a specific range. This target range is set
as a percentage by the parameters -XX:MinHeapFreeRatio=<minimum> and
-XX:MaxHeapFreeRatio=<maximum>, and the total size is bounded below by
-Xms and above by -Xmx. The default parameters for the Solaris
Operating Environment (SPARC Platform Edition) are shown in this
table:

  -XX:MinHeapFreeRatio=    40
  -XX:MaxHeapFreeRatio=    70
  -Xms                     3670k
  -Xmx                     64m
 
With these parameters, if the percentage of free space in a generation
falls below 40%, the size of the generation will be expanded so as to
have 40% of the space free, assuming the size of the generation has
not already reached its limit. Similarly, if the percentage of free
space exceeds 70%, the size of the generation will be shrunk so as to
have only 70% of the space free, as long as shrinking the generation
does not decrease it below the minimum size of the generation.
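
The resizing rule can be written down as a small calculation. This is a simplified sketch of the policy as described in the text, not the virtual machine's actual code: it ignores alignment and any incremental growth steps, and all names are hypothetical.

```java
class FreeRatioSizing {
    // Returns the new capacity of a generation after a collection that
    // left `live` units in use, keeping the free share between
    // minFreePct and maxFreePct and the result within the given bounds.
    static long resize(long capacity, long live, int minFreePct, int maxFreePct,
                       long minCapacity, long maxCapacity) {
        double freePct = 100.0 * (capacity - live) / capacity;
        long target = capacity;
        if (freePct < minFreePct) {
            // Expand until minFreePct of the generation is free.
            target = (long) Math.ceil(live / (1.0 - minFreePct / 100.0));
        } else if (freePct > maxFreePct) {
            // Shrink until only maxFreePct of the generation is free.
            target = (long) Math.floor(live / (1.0 - maxFreePct / 100.0));
        }
        return Math.max(minCapacity, Math.min(maxCapacity, target));
    }

    public static void main(String[] args) {
        // 20% free is below the 40% minimum, so the generation grows.
        System.out.println(resize(100, 80, 40, 70, 10, 1000));
        // 90% free is above the 70% maximum, so the generation shrinks.
        System.out.println(resize(100, 10, 40, 70, 10, 1000));
        // 50% free is inside the range, so nothing changes.
        System.out.println(resize(100, 50, 40, 70, 10, 1000));
    }
}
```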

 
Large server applications often experience two problems with these
defaults. One is slow startup, because the initial heap is small and
must be resized over many major collections. A more pressing problem
is that the default maximum heap size is unreasonably small for most
server applications. The rules of thumb for server applications are:

- Unless you have problems with pauses, try granting as much memory as
  possible to the virtual machine. The default size (64MB) is often
  too small.

- Setting -Xms and -Xmx to the same value increases predictability by
  removing the most important sizing decision from the virtual
  machine. On the other hand, the virtual machine can't compensate if
  you make a poor choice.

- Be sure to increase the memory as you increase the number of
  processors, since allocation can be parallelized.

 

A description of other virtual machine options can be found at
http://java.sun.com/docs/hotspot/VMOptions.html



3.2 The Young Generation

The second most influential knob is the proportion of the heap
dedicated to the young generation. The bigger the young generation,
the less often minor collections occur. However, for a bounded heap
size a larger young generation implies a smaller tenured generation,
which will increase the frequency of major collections. The optimal
choice depends on the lifetime distribution of the objects allocated
by the application.

By default, the young generation size is controlled by NewRatio. For
example, setting -XX:NewRatio=3 means that the ratio between the young
and tenured generation is 1:3. In other words, the combined size of
the eden and survivor spaces will be one fourth of the total heap
size.
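
The NewRatio arithmetic can be sketched as follows. This is an illustration of the ratio only; it assumes "total heap" means young plus tenured and ignores the permanent generation and any alignment the virtual machine applies. The names are invented for this example.

```java
class NewRatioSizing {
    // tenured : young = newRatio : 1, so young is 1/(newRatio + 1) of the heap.
    static long youngSize(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    static long tenuredSize(long heapBytes, int newRatio) {
        return heapBytes - youngSize(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        long heap = 64 * mb;
        int newRatio = 3; // as in -XX:NewRatio=3
        System.out.println("young:   " + youngSize(heap, newRatio) / mb + " MB");
        System.out.println("tenured: " + tenuredSize(heap, newRatio) / mb + " MB");
    }
}
```

For a 64 MB heap and NewRatio=3 this gives a 16 MB young generation and a 48 MB tenured generation.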

The parameters NewSize and MaxNewSize bound the young generation size
from below and above. Setting these equal to one another fixes the
young generation, just as setting -Xms and -Xmx equal fixes the total
heap size. This is useful for tuning the young generation at a finer
granularity than the integral multiples allowed by NewRatio.



3.2.1 Young Generation Guarantee

In an ideal minor collection the live objects are copied from one
part of the young generation (the eden space plus the first survivor
space) to another part of the young generation (the second survivor
space). However, there is no guarantee that all the live objects will
fit into the second survivor space. To ensure that the minor
collection can complete even if all the objects are live, enough free
memory must be reserved in the tenured generation to accommodate all
the live objects. In the worst case, this reserved memory is equal to
the size of eden plus the objects in the non-empty survivor space.
When there isn't enough memory available in the tenured generation for
this worst case, a major collection will occur instead. This policy is
fine for small applications, because the memory reserved in the
tenured generation is typically only virtually committed but not
actually used. But for applications needing the largest possible heap,
an eden bigger than half the virtually committed size of the heap is
useless: only major collections would occur. Note that the young
generation guarantee applies to all of the collectors with the
exception of the throughput collector. The throughput collector will
proceed with a young generation collection, and if the tenured
generation cannot accommodate all the promotions from the young
generation, both generations are collected.
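
The guarantee amounts to a simple worst-case comparison, sketched below. The method names and the idea of passing the sizes in as plain numbers are assumptions made for illustration; the virtual machine's internal check is not shown in this document.

```java
class YoungGenerationGuarantee {
    // True when the tenured generation could absorb every object now in
    // eden and the occupied survivor space, i.e. the worst case in which
    // all of them turn out to be live.
    static boolean guaranteeHolds(long edenUsed, long survivorUsed, long tenuredFree) {
        return tenuredFree >= edenUsed + survivorUsed;
    }

    // With a full eden the guarantee needs at least an eden's worth of
    // tenured space, so an eden larger than half the heap can never be
    // covered, even if the rest of the heap were entirely free.
    static boolean edenCanEverBeCovered(long edenSize, long heapSize) {
        return heapSize - edenSize >= edenSize;
    }

    public static void main(String[] args) {
        System.out.println(guaranteeHolds(8_000, 500, 10_000)); // minor collection can proceed
        System.out.println(guaranteeHolds(8_000, 500, 8_000));  // a major collection occurs instead
        System.out.println(edenCanEverBeCovered(40, 64));       // eden above half the heap
    }
}
```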

 
If desired, the parameter SurvivorRatio can be used to tune the size
of the survivor spaces, but this is often not as important for
performance. For example, -XX:SurvivorRatio=6 sets the ratio between
each survivor space and eden to be 1:6. In other words, each survivor
space will be one eighth of the young generation (not one seventh,
because there are two survivor spaces).
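
The "one eighth, not one seventh" point is easy to verify numerically. The sketch below works through the split for an assumed 16 MB young generation; the names are hypothetical and alignment is ignored.

```java
class SurvivorRatioSizing {
    // eden : survivor = survivorRatio : 1, and there are two survivor
    // spaces, so the young generation is (survivorRatio + 2) survivor units.
    static long survivorSize(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    static long edenSize(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorSize(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        long young = 16 * mb;
        int survivorRatio = 6; // as in -XX:SurvivorRatio=6
        System.out.println("each survivor: " + survivorSize(young, survivorRatio) / mb + " MB");
        System.out.println("eden:          " + edenSize(young, survivorRatio) / mb + " MB");
    }
}
```

With these inputs each survivor space is 2 MB (one eighth of 16 MB) and eden is 12 MB.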

If survivor spaces are too small, copying collection overflows
directly into the tenured generation. If survivor spaces are too
large, they will be uselessly empty. At each garbage collection the
virtual machine chooses a threshold number of times an object can be
copied before it is tenured. This threshold is chosen to keep the
survivors half full. An option, -XX:+PrintTenuringDistribution, can be
used to show this threshold and the ages of objects in the new
generation. It is also useful for observing the lifetime distribution
of an application.

Here are the default values for the Solaris Operating Environment
(SPARC Platform Edition):

  NewRatio         2 (client JVM: 8)
  NewSize          2228k
  MaxNewSize       unlimited
  SurvivorRatio    32

The maximum size of the young generation will be calculated from the
maximum size of the total heap and NewRatio. The "unlimited" default
value for MaxNewSize means that the calculated value is not limited by
MaxNewSize unless a value for MaxNewSize is specified on the command
line.
 
The rules of thumb for server applications are:

- First decide the total amount of memory you can afford to give the
  virtual machine. Then graph your own performance metric against
  young generation sizes to find the best setting.

- Unless you find problems with excessive major collection or pause
  times, grant plenty of memory to the young generation.

- Increasing the young generation becomes counterproductive at half
  the total heap or less (whenever the young generation guarantee
  cannot be met).

- Be sure to increase the young generation as you increase the number
  of processors, since allocation can be parallelized.

 



4 Types of Collectors

The discussion to this point has been about the default collector.
In the J2SE platform, version 1.4.2 there are three additional
collectors. Each is a generational collector which has been
implemented to emphasize the throughput of the application or low
garbage collection pause times.

- The throughput collector: this collector uses a parallel version of
  the young generation collector. It is used if the -XX:+UseParallelGC
  option is passed on the command line. The tenured generation
  collector is the same as the default collector.

- The concurrent low pause collector: this collector is used if the
  -XX:+UseConcMarkSweepGC option is passed on the command line. The
  concurrent collector is used to collect the tenured generation and
  does most of the collection concurrently with the execution of the
  application. The application is paused for short periods during the
  collection. A parallel version of the young generation copying
  collector is used with the concurrent collector (i.e., if
  -XX:+UseConcMarkSweepGC is used on the command line then the flag
  UseParNewGC is also set to true if it is not otherwise explicitly
  set on the command line).

- The incremental (sometimes called train) low pause collector: this
  collector is used only if -Xincgc is passed on the command line. By
  careful bookkeeping, the incremental garbage collector collects just
  a portion of the tenured generation at each minor collection, trying
  to spread the large pause of a major collection over many minor
  collections. However, it is even slower than the default tenured
  generation collector when considering overall throughput.

 

Note that -XX:+UseParallelGC should not be used with
-XX:+UseConcMarkSweepGC. The argument parsing in the J2SE platform,
version 1.4.2 should only allow legal combinations of command line
options for garbage collectors, but earlier releases may not detect
all illegal combinations and the results for illegal combinations are
unpredictable.

Always try the default collector on your application before trying
one of the other collectors. Tune the heap size for your application
and then consider what requirements of your application are not being
met. Based on the latter, consider using one of the other collectors.



4.1 When to Use the Throughput Collector

Use the throughput collector when you want to improve the performance
of your application with larger numbers of processors. In the default
collector garbage collection is done by one thread, and therefore
garbage collection adds to the serial execution time of the
application. The throughput collector uses multiple threads to execute
a minor collection and so reduces the serial execution time of the
application. A typical situation is one in which the application has a
large number of threads allocating objects. In such an application it
is often the case that a large young generation is needed.



4.2 The Throughput Collector

The throughput collector is a generational collector similar to the
default collector but with multiple threads used to do the minor
collection. The major collections are essentially the same as with the
default collector. By default on a host with N CPUs, the throughput
collector uses N garbage collector threads in the collection. The
number of garbage collector threads can be controlled with a command
line option (see below). On a host with 1 CPU the throughput collector
will likely not perform as well as the default collector because of
the additional overhead for the parallel execution (e.g.,
synchronization costs). On a host with 2 CPUs the throughput collector
generally performs as well as the default garbage collector, and a
reduction in the minor garbage collector pause times can be expected
on hosts with more than 2 CPUs.

The throughput collector can be enabled by using the command line flag
-XX:+UseParallelGC.

The number of garbage collector threads can be controlled with the
ParallelGCThreads command line option (-XX:ParallelGCThreads=<desired
number>). The size of the heap needed with the throughput collector
is, to first order, the same as with the default collector. Turning on
the throughput collector should just make the minor collection pauses
shorter. Because there are multiple garbage collector threads
participating in the minor collection there is a small possibility of
fragmentation due to promotions from the young generation to the
tenured generation during the collection. Each garbage collection
thread reserves a part of the tenured generation for promotions, and
the division of the available space into these "promotion buffers" can
cause a fragmentation effect. Reducing the number of garbage collector
threads will reduce this fragmentation effect, as will increasing the
size of the tenured generation.



4.2.1 Adaptive Sizing

A feature available with the throughput collector in the J2SE
platform, version 1.4.1 and later releases is the use of adaptive
sizing (-XX:+UseAdaptiveSizePolicy), which is on by default. Adaptive
sizing keeps statistics about garbage collection times, allocation
rates, and the free space in the heap after a collection. These
statistics are used to make decisions regarding changes to the sizes
of the young generation and tenured generation so as to best fit the
behavior of the application. Use the command line option -verbose:gc
to see the resulting sizes of the heap.



4.2.2 AggressiveHeap

The -XX:+AggressiveHeap option inspects the machine resources (size of
memory and number of processors) and attempts to set various
parameters to be optimal for long-running, memory allocation-intensive
jobs. It was originally intended for machines with large amounts of
memory and a large number of CPUs, but in the J2SE platform, version
1.4.1 and later it has shown itself to be useful even on four
processor machines. With this option the throughput collector
(-XX:+UseParallelGC) is used along with adaptive sizing
(-XX:+UseAdaptiveSizePolicy). The physical memory on the machine must
be at least 256MB before AggressiveHeap can be used. The size of the
initial heap is calculated based on the size of the physical memory
and attempts to make maximal use of the physical memory for the heap
(i.e., the algorithms attempt to use heaps nearly as large as the
total physical memory).



4.2.3 Measurements with the Throughput Collector

The verbose garbage
collector output is the same for the throughput collector as with the
default collector.



4.3 When to Use the Concurrent Low Pause Collector

Use the concurrent low pause collector if your application would
benefit from shorter garbage collector pauses and can afford to share
processor resources with the garbage collector when the application is
running. Typically, applications which have a relatively large set of
long-lived data (a large tenured generation) and run on machines with
two or more processors tend to benefit from the use of this collector.
However, this collector should be considered for any application with
a low pause time requirement. Optimal results have been observed for
interactive applications with tenured generations of a modest size on
a single processor.



4.4 The Concurrent Low Pause Collector

The concurrent low pause collector is a generational collector similar
to the default collector. The tenured generation is collected
concurrently with this collector.

This collector attempts to reduce the pause times needed to collect
the tenured generation. It uses a separate garbage collector thread to
do parts of the major collection concurrently with the application
threads. The concurrent collector is enabled with the command line
option -XX:+UseConcMarkSweepGC. For each major collection the
concurrent collector will pause all the application threads for a
brief period at the beginning of the collection and toward the middle
of the collection. The second pause tends to be the longer of the two
pauses, and multiple threads are used to do the collection work during
that pause. The remainder of the collection is done with a garbage
collector thread that runs concurrently with the application. The
minor collections are done in a manner similar to the default
collector, although multiple garbage collector threads are used to
reduce the minor collection times. See "Parallel Minor Collection
Options with the Concurrent Collector" below for information on using
multiple threads with the concurrent low pause collector.

The techniques used in the concurrent collector (for the collection of
the tenured generation) are described at:
http://research.sun.com/techrep/2000/abstract-88.html



4.4.1 Overhead of Concurrency

The concurrent collector trades processor resources (which would
otherwise be available to the application) for shorter major
collection pause times. The concurrent part of the collection is done
by a single garbage collection thread. On an N processor system, when
the concurrent part of the collection is running, it will be using
1/Nth of the available processor power. On a uniprocessor machine it
would be fortuitous if it provided any advantage. It conceivably could
break up a single long pause into several shorter pauses (a pause
being defined in this case as the absence of any application threads
running), but that is not the intent of the concurrent collector. The
concurrent collector also has some additional overhead costs that
will take away from the throughput of the application, and some
inherent disadvantages (e.g., fragmentation) for some types of
applications. On a two processor machine there is a processor
available for application threads while the concurrent part of the
collection is running, so running the concurrent garbage collector
thread does not "pause" the application. There may be reduced pause
times, as intended for the concurrent collector, but again fewer
processor resources are available to the application and some
slowdown of the application should be expected. As N increases, the
reduction in processor resources due to the running of the concurrent
garbage collector thread becomes smaller, and the advantages of the
concurrent collector become greater.
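
The 1/N estimate above can be worked through for a few processor
counts. This is only an illustrative sketch of that arithmetic: the
class and method names are hypothetical, and it ignores the
additional overhead costs mentioned in the text.

```java
// Hypothetical helper illustrating the 1/N estimate from the text.
final class ConcurrentOverhead {

    // Share of total processor power used by the single concurrent
    // collector thread on a machine with the given number of processors.
    static double collectorShare(int processors) {
        if (processors < 1) {
            throw new IllegalArgumentException("processors must be >= 1");
        }
        return 1.0 / processors;
    }

    // Share left for application threads while the concurrent phase runs.
    static double applicationShare(int processors) {
        return 1.0 - collectorShare(processors);
    }

    public static void main(String[] args) {
        int[] counts = {1, 2, 4, 8};
        for (int n : counts) {
            System.out.printf("N=%d: collector %.1f%%, application %.1f%%%n",
                    n, 100 * collectorShare(n), 100 * applicationShare(n));
        }
        // The same estimate for the machine this program is running on.
        int here = Runtime.getRuntime().availableProcessors();
        System.out.printf("this machine (N=%d): application keeps %.1f%%%n",
                here, 100 * applicationShare(here));
    }
}
```

With N=1 the application share is zero while the concurrent thread
runs, which matches the remark that a uniprocessor gains little. With
N=2 the application keeps half of the processor power, and the share
grows as N increases.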



4.4.2 Young Generation Guarantee

As with the default collector, a minor collection may require enough
space in the tenured generation to accommodate all the objects in
eden and one survivor space. Because fragmentation can occur in a
concurrent collection, the requirement for this guarantee is more
severe with the concurrent collector. There has to be enough
contiguous space available in the tenured generation for all the
objects in eden and one survivor space, because there is no a priori
way (except at a significant performance cost) to know the
distribution of the object sizes in eden and the one survivor space.
A larger heap is almost always needed when the concurrent collector
is used as compared to the default collector. As with the default
collector, the space in the tenured generation must be reserved but
does not actually have to be used. As a rough estimate, choose the
young generation and tenured generation sizes that would be
appropriate for the default collector, and then increase the tenured
generation size by the equivalent of the young generation size for
the concurrent collector. This is a very rough approximation and the
correct values are application dependent.
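
The rough estimate above is simple arithmetic, sketched here for
illustration only. The class name, method names and the sample sizes
(64 MB young, 192 MB tenured) are hypothetical; the result is a
starting point for measurement, not a recommendation.

```java
// Hypothetical helper for the rough sizing rule in the text:
// tenured size for the concurrent collector =
//     tenured size for the default collector + young generation size.
final class ConcurrentHeapEstimate {

    // Tenured generation size (MB) to try with the concurrent collector.
    static long tenuredForConcurrent(long tenuredDefaultMb, long youngMb) {
        return tenuredDefaultMb + youngMb;
    }

    // Total of young plus enlarged tenured generation (MB).
    static long totalHeap(long tenuredDefaultMb, long youngMb) {
        return youngMb + tenuredForConcurrent(tenuredDefaultMb, youngMb);
    }

    public static void main(String[] args) {
        long youngMb = 64;     // assumed young generation size under the default collector
        long tenuredMb = 192;  // assumed tenured generation size under the default collector
        System.out.println("tenured for concurrent collector: "
                + tenuredForConcurrent(tenuredMb, youngMb) + " MB");
        // -Xmn sets the young generation size and -Xmx the maximum heap size.
        System.out.println("starting point: -XX:+UseConcMarkSweepGC -Xmn" + youngMb
                + "m -Xmx" + totalHeap(tenuredMb, youngMb) + "m");
    }
}
```

For the assumed sizes this gives a 256 MB tenured generation and a
320 MB total heap, compared with 256 MB in total under the default
collector.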



4.4.3 Full Collections

The concurrent collector uses a single garbage collector thread that
runs simultaneously with the application threads with the goal of
completing the collection of the tenured generation before it becomes
full. In normal operation, the concurrent collector is able to do
most of its work with the application threads still running, so only
brief pauses are seen by the application threads. As a fallback, if
the concurrent collector is unable to finish before the tenured
generation fills up, the application is paused and the collection is
completed with all the application threads stopped. Such collections
with the application stopped are referred to as full collections and
are a sign that some adjustments need to be made to the concurrent
collection parameters.



4.4.4 Floating Garbage

A garbage collector works to find the live objects in the heap.
Because application threads and the garbage collector thread run
concurrently, objects that are found to be alive by the garbage
collector thread may become dead by the time collection finishes.
Such objects are referred to as floating garbage. The amount of
floating garbage depends on the length of the concurrent collection
(a longer collection gives the application threads more time to
discard objects) and on the particulars of the application. As a
rough rule of thumb, try increasing the size of the tenured
generation by 20% to account for the floating garbage. Floating
garbage is collected at the next garbage collection.
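
The 20% rule of thumb can be applied as follows. This is a sketch
with hypothetical names and sample sizes; rounding up to a whole
megabyte is a choice made here, not something the rule prescribes.

```java
// Hypothetical helper applying the 20% floating garbage allowance.
final class FloatingGarbageMargin {

    // Returns the tenured size in MB increased by 20%, rounded up to a
    // whole megabyte. Uses integer arithmetic: ceil(size * 6 / 5).
    static long withMargin(long tenuredMb) {
        if (tenuredMb < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        return (tenuredMb * 6 + 4) / 5;
    }

    public static void main(String[] args) {
        long[] samples = {100, 250, 256};  // assumed tenured sizes in MB
        for (long mb : samples) {
            System.out.println(mb + " MB -> " + withMargin(mb) + " MB");
        }
    }
}
```

For example, a 250 MB tenured generation becomes 300 MB, and 256 MB
becomes 308 MB after rounding up from 307.2.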



4.4.5 Pauses

The concurrent collector pauses an application twice during a
concurrent collection cycle. The first pause is to mark as live the
objects directly reachable from the roots (e.g., objects on thread
stacks, static objects and so on) and from elsewhere in the heap
(e.g., the young generation). This first pause is referred to as the
initial mark. The second pause comes at the end of the marking phase
and finds objects that were missed during the concurrent marking
phase due to the concurrent execution of the application threads. The
second pause is referred to as the remark.



4.4.6 Concurrent Phases

The concurrent marking occurs between the initial mark and the
remark. During the concurrent marking the concurrent garbage
collector thread is executing and using processor resources that
would otherwise be available to the application. After the remark
there is a concurrent sweeping phase which collects the dead objects.
During this phase the concurrent garbage collector thread is again
taking processor resources from the application. After the sweeping
phase the concurrent collector sleeps until the start of the next
major collection.
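
The order of the two pauses and the two concurrent phases described
so far can be summarized in a small model. This is an illustrative
sketch only; the type and method names are hypothetical and do not
correspond to anything inside the JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one concurrent collection cycle as described above.
final class ConcurrentCycle {

    // Phases in the order they occur; the flag records whether the
    // application threads are paused during the phase.
    enum Phase {
        INITIAL_MARK(true),
        CONCURRENT_MARK(false),
        REMARK(true),
        CONCURRENT_SWEEP(false);

        private final boolean pausesApplication;

        Phase(boolean pausesApplication) {
            this.pausesApplication = pausesApplication;
        }

        boolean pausesApplication() {
            return pausesApplication;
        }
    }

    // Returns the phases that pause the application, in cycle order.
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<Phase>();
        for (Phase phase : Phase.values()) {
            if (phase.pausesApplication()) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Phase phase : Phase.values()) {
            System.out.println(phase + (phase.pausesApplication()
                    ? ": application paused"
                    : ": runs concurrently with the application"));
        }
    }
}
```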



4.4.7 Measurements with the Concurrent Collector

Below is output for -verbose:gc with -XX:+PrintGCDetails (some
details have been removed). Note that the output for the concurrent
collector is interspersed with the output from the minor collections.
Typically many minor collections will occur during a concurrent
collection cycle. The CMS-initial-mark: indicates the start of the
concurrent collection cycle. The CMS-concurrent-mark: indicates the
end of the concurrent marking phase, and CMS-concurrent-sweep: marks
the end of the concurrent sweeping phase. Not discussed before is the
precleaning phase, indicated by CMS-concurrent-preclean:, which
represents work that can be done concurrently and is in preparation
for the remark phase (CMS-remark:). The final phase is indicated by
CMS-concurrent-reset: and is in preparation for the next concurrent
collection.

[GC [1 CMS-initial-mark: 13991K(20288K)] 14103K(22400K), 0.0023781 secs]
[GC [DefNew: 2112K->64K(2112K), 0.0837052 secs] 16103K->15476K(22400K), 0.0838519 secs]
...
[GC [DefNew: 2077K->63K(2112K), 0.0126205 secs] 17552K->15855K(22400K), 0.0127482 secs]
[CMS-concurrent-mark: 0.267/0.374 secs]
[GC [DefNew: 2111K->64K(2112K), 0.0190851 secs] 17903K->16154K(22400K), 0.0191903 secs]
[CMS-concurrent-preclean: 0.044/0.064 secs]
[GC[1 CMS-remark: 16090K(20288K)] 17242K(22400K), 0.0210460 secs]
[GC [DefNew: 2112K->63K(2112K), 0.0716116 secs] 18177K->17382K(22400K), 0.0718204 secs]
[GC [DefNew: 2111K->63K(2112K), 0.0830392 secs] 19363K->18757K(22400K), 0.0832943 secs]
...
[GC [DefNew: 2111K->0K(2112K), 0.0035190 secs] 17527K->15479K(22400K), 0.0036052 secs]
[CMS-concurrent-sweep: 0.291/0.662 secs]
[GC [DefNew: 2048K->0K(2112K), 0.0013347 secs] 17527K->15479K(27912K), 0.0014231 secs]
[CMS-concurrent-reset: 0.016/0.016 secs]
[GC [DefNew: 2048K->1K(2112K), 0.0013936 secs] 17527K->15479K(27912K), 0.0014814 secs]
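
Lines like those above can be picked apart mechanically when a log is
long. The following is a minimal sketch that recognizes only the two
line shapes shown in this example; the class name and the regular
expressions are assumptions written for this sample, and other JVM
versions or options may print different text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the sample output above; not a general log parser.
final class ConcurrentLogLines {

    // Matches e.g. "[CMS-concurrent-mark: 0.267/0.374 secs]".
    private static final Pattern CONCURRENT = Pattern.compile(
            "\\[CMS-concurrent-([a-z]+): (\\d+\\.\\d+)/(\\d+\\.\\d+) secs\\]");

    // Matches the initial mark and remark lines, with or without a
    // space after "[GC", and captures the pause time in seconds.
    private static final Pattern PAUSE = Pattern.compile(
            "\\[GC ?\\[1 CMS-(initial-mark|remark): \\d+K\\(\\d+K\\)\\] "
            + "\\d+K\\(\\d+K\\), (\\d+\\.\\d+) secs\\]");

    // Returns the concurrent phase name ("mark", "preclean", "sweep",
    // "reset"), or null if the line is not a concurrent phase line.
    static String concurrentPhase(String line) {
        Matcher m = CONCURRENT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Returns the pause time in seconds for an initial mark or remark
    // line, or -1.0 if the line is neither.
    static double pauseSeconds(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() ? Double.parseDouble(m.group(2)) : -1.0;
    }

    public static void main(String[] args) {
        String[] sample = {
            "[GC [1 CMS-initial-mark: 13991K(20288K)] 14103K(22400K), 0.0023781 secs]",
            "[CMS-concurrent-mark: 0.267/0.374 secs]",
            "[GC[1 CMS-remark: 16090K(20288K)] 17242K(22400K), 0.0210460 secs]"
        };
        for (String line : sample) {
            String phase = concurrentPhase(line);
            if (phase != null) {
                System.out.println("concurrent phase finished: " + phase);
            } else if (pauseSeconds(line) >= 0) {
                System.out.println("pause of " + pauseSeconds(line) + " secs");
            }
        }
    }
}
```

Minor collection lines (DefNew) match neither pattern and are skipped
by this sketch.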

 

The initial mark pause is typically short relative to the minor
collection pause time. The times of the concurrent phases (concurrent
mark, concurrent precleaning, and concurrent sweep) may be relatively
long (as in the example a