Java theory and practice: Safe construction techniques
2011-04-26 23:51
http://www.ibm.com/developerworks/java/library/j-jtp0618/index.html
Summary: The Java language offers a flexible and seemingly simple threading facility that makes it easy to incorporate multithreading into your applications. However, concurrent programming in Java applications is more complicated than it looks: there are several subtle (and not so subtle) ways to create data races and other concurrency hazards in Java programs. In this installment of Java theory and practice, Brian looks at a common threading hazard: allowing the this reference to escape during construction. This harmless-looking practice can cause unpredictable and undesirable results in your Java programs.
Testing and debugging multithreaded programs is extremely difficult, because concurrency hazards often do not manifest themselves uniformly or reliably. Most threading problems are unpredictable by their nature, and may not occur at all on certain platforms (like uniprocessor systems) or below a certain level of load. Because testing multithreaded programs for correctness is so difficult and bugs can take so long to appear, it becomes even more important to develop applications with thread safety in mind from the beginning. In this article, we're going to explore how a particular thread-safety problem -- allowing the this reference to escape during construction (which we'll call the escaped reference problem) -- can create some very undesirable results. We'll then establish some guidelines for writing thread-safe constructors.
Following "safe construction" techniques
Analyzing programs for thread-safety violations can be very difficult and requires specialized experience. Fortunately, and perhaps surprisingly, creating thread-safe classes from the outset is not as difficult, although it requires a different specialized skill: discipline. Most concurrency errors stem from programmers attempting to break the rules in the name of convenience, perceived performance benefits, or just plain laziness. Like many other concurrency problems, you can avoid the escaped reference problem by following a few simple rules when you write constructors.
Hazardous race conditions
Most concurrency hazards boil down to some sort of data race. A data race, or race condition, occurs when multiple threads or processes are reading and writing a shared data item, and the final result depends on the order in which the threads are scheduled. Listing 1 gives an example of a simple data race in which a program may print either 0 or 1, depending on the scheduling of the threads.
Listing 1. Simple data race
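    public class DataRace {
        static int a = 0;

        public static void main(String[] args) {
            // Start the second thread, then write to the shared variable
            new MyThread().start();
            a = 1;
        }

        public static class MyThread extends Thread {
            public void run() {
                // May print 0 or 1, depending on scheduling and visibility
                System.out.println(a);
            }
        }
    }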
The second thread could be scheduled immediately, printing the initial value of 0 for a. Alternatively, the second thread might not run immediately, resulting in the value 1 being printed instead. The output of this program may depend on the JDK you are using, the scheduler of the underlying operating system, or random timing artifacts. Running it multiple times could produce different results.
Visibility hazards
There is actually another data race in Listing 1, besides the obvious race of whether the second thread starts executing before or after the first thread sets a to 1. The second race is a visibility race: the two threads are not using synchronization, which would ensure visibility of data changes across threads. Because there's no synchronization, if the second thread runs after the assignment to a is completed by the first thread, changes made by the first thread may or may not be immediately visible to the second thread. It is possible that the second thread might still see a as having a value of 0 even though the first thread already assigned it a value of 1. This second class of data race, where two threads are accessing the same variable in the absence of proper synchronization, is a complicated subject, but fortunately you can avoid this class of data race by using synchronization whenever you are reading a variable that might have been last written by another thread, or writing a variable that might next be read by another thread. We won't be exploring this type of data race further here, but see the "Synching up with the Java Memory Model" sidebar and the Resources section for more information on this complicated issue.
Synching up with the Java Memory Model
The synchronized keyword in Java programming enforces mutual exclusion: it ensures that only one thread is executing a given block of code at a given time. But synchronization -- or the lack thereof -- also has other more subtle consequences on multiprocessor systems with weak memory models (that is, platforms that don't necessarily provide cache coherency). Synchronization ensures that changes made by one thread become visible to other threads in a predictable manner. On some architectures, in the absence of synchronization, different threads may see memory operations appear to have been executed in a different order than they actually were executed. This is confusing, but normal -- and critical for achieving good performance on these platforms. If you just follow the rules -- synchronize every time you read a variable that might have been written by another thread or write a variable that may be read next by another thread -- then you won't have any problems. See the Resources section for more information.
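As a rough sketch of the rule above, the variant of Listing 1 below routes every read and write of the shared value through methods synchronized on the same lock. The names SharedValue and SynchronizedDataRace are illustrative assumptions, not part of the original listing. Synchronization removes the visibility hazard only; it does not decide which thread runs first, so the program may still print either 0 or 1.

```java
// Hypothetical variant of Listing 1: all access to the shared value
// goes through methods synchronized on the same lock (the SharedValue).
class SharedValue {
    private int a = 0;

    // A write made under the lock is visible to any thread that
    // later acquires the same lock.
    synchronized void set(int value) {
        a = value;
    }

    synchronized int get() {
        return a;
    }
}

class SynchronizedDataRace {
    static final SharedValue shared = new SharedValue();

    public static void main(String[] args) {
        Thread reader = new Thread(() -> System.out.println(shared.get()));
        reader.start();
        // The ordering race remains (0 or 1 may be printed), but if set()
        // completes before the reader's get(), the reader is guaranteed
        // to see 1 rather than a stale 0.
        shared.set(1);
    }
}
```

The design choice here is to hide the field behind synchronized accessors so that no caller can read or write it without holding the lock.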
Don't publish the "this" reference during construction
One of the mistakes that can introduce a data race into your class is to expose the this reference to another thread before the constructor has completed. Sometimes the reference is explicit, such as directly storing this in a static field or collection, but other times it can be implicit, such as when you publish a reference to an instance of a non-static inner class in a constructor. Constructors are not ordinary methods -- they have special semantics for initialization safety. An object is assumed to be in a predictable, consistent state after the constructor has completed, and publishing a reference to an incompletely constructed object is dangerous. Listing 2 shows an example of introducing this sort of race condition into a constructor. It may look harmless, but it contains the seeds of serious concurrency problems.
Listing 2. Introducing race condition into a constructor
    public class EventListener {

        public EventListener(EventSource eventSource) {
            // do our initialization
            ...

            // register ourselves with the event source
            eventSource.registerListener(this);
        }

        public void onEvent(Event e) {
            // handle the event
        }
    }
On first inspection, the EventListener class looks harmless. The registration of the listener, which publishes a reference to the new object where other threads might be able to see it, is the last thing that the constructor does. But even ignoring all the Java Memory Model (JMM) issues such as differences in visibility across threads and memory access reordering, this code is still in danger of exposing an incompletely constructed EventListener object to other threads. Consider what happens when EventListener is subclassed, as in Listing 3:
Listing 3. Subclassing EventListener
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class RecordingEventListener extends EventListener {
        private final List<Event> list;

        public RecordingEventListener(EventSource eventSource) {
            // registers "this" with the event source before list is assigned
            super(eventSource);
            list = Collections.synchronizedList(new ArrayList<Event>());
        }

        public void onEvent(Event e) {
            list.add(e);
            super.onEvent(e);
        }

        public Event[] getEvents() {
            return list.toArray(new Event[0]);
        }
    }
Because the Java language specification requires that a call to super() be the first statement in a subclass constructor, our not-yet-constructed event listener is already registered with the event source before we can finish the initialization of the subclass fields. Now we have a data race for the list field. If the event source decides to send an event from within the registration call, or we just get unlucky and an event arrives at exactly the wrong moment, RecordingEventListener.onEvent() could get called while list still has the default value of null, and would then throw a NullPointerException. Methods like onEvent() shouldn't have to code against final fields not being initialized.
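The failure described above can be reproduced deterministically in a single thread if the event source delivers an event during registration. The sketch below assumes minimal stand-ins for Event, EventSource, and EventListener (the original article does not show them), and EscapeDemo is a hypothetical driver class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal stand-ins for the article's types; their shapes are assumptions.
class Event { }

class EventSource {
    // This source happens to deliver an event during registration.
    void registerListener(EventListener listener) {
        listener.onEvent(new Event());
    }
}

class EventListener {
    EventListener(EventSource eventSource) {
        // "this" escapes before any subclass constructor body has run
        eventSource.registerListener(this);
    }

    public void onEvent(Event e) { }
}

class RecordingEventListener extends EventListener {
    private final List<Event> list;

    RecordingEventListener(EventSource eventSource) {
        super(eventSource);  // onEvent() runs in here, while list is still null
        list = Collections.synchronizedList(new ArrayList<Event>());
    }

    @Override
    public void onEvent(Event e) {
        list.add(e);         // throws NullPointerException during super()
        super.onEvent(e);
    }
}

class EscapeDemo {
    // Returns true if constructing the listener fails with a NullPointerException.
    static boolean constructionFails() {
        try {
            new RecordingEventListener(new EventSource());
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("construction failed: " + constructionFails());
    }
}
```

With multiple threads the same failure depends on timing, which is what makes it so hard to catch in testing.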
The problem with Listing 2 is that EventListener published a reference to the object being constructed before construction was complete. While it might have looked like the object was almost fully constructed, and therefore passing this to the event source seemed safe, looks can be deceiving. Publishing the this reference from within the constructor, as in Listing 2, is a time bomb waiting to explode.
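One common way to defuse this, sketched here under assumptions rather than taken from the article, is to keep the constructor private and register the listener from a static factory method once construction has finished. The Listener interface, the EventSource shape, and the names SafeEventListener and create are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class Event { }

// Hypothetical listener interface and event source.
interface Listener {
    void onEvent(Event e);
}

class EventSource {
    private final List<Listener> listeners = new ArrayList<>();

    synchronized void registerListener(Listener listener) {
        listeners.add(listener);
    }

    synchronized int listenerCount() {
        return listeners.size();
    }
}

class SafeEventListener implements Listener {
    private final String name;
    private int handled = 0;

    // Private constructor: initializes state only, never publishes "this".
    private SafeEventListener(String name) {
        this.name = name;
    }

    // Factory method: construct completely first, then register.
    static SafeEventListener create(EventSource source, String name) {
        SafeEventListener listener = new SafeEventListener(name);
        source.registerListener(listener);
        return listener;
    }

    @Override
    public synchronized void onEvent(Event e) {
        handled++;
    }

    String name() {
        return name;
    }
}
```

Because the constructor is private, no subclass can slip its own initialization in after the registration, which is the situation that broke Listing 3.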
Don't implicitly expose the "this" reference
It is possible to create the escaped reference problem without using the this reference at all. Non-static inner classes maintain an implicit copy of the this reference of their parent object, so creating an anonymous inner class instance and passing it to an object visible from outside the current thread has all the same risks as exposing the this reference itself. Consider Listing 4, which has the same basic problem as Listing 2, but without explicit use of the this reference:
Listing 4. No explicit use of this reference
    public class EventListener2 {

        public EventListener2(EventSource eventSource) {
            // The anonymous inner class instance carries an implicit
            // reference to the enclosing EventListener2 object
            eventSource.registerListener(
                new EventListener() {
                    public void onEvent(Event e) {
                        eventReceived(e);
                    }
                });
        }

        public void eventReceived(Event e) {
        }
    }
The EventListener2 class has the same disease as its EventListener cousin in Listing 2: a reference to the object under construction is being published -- in this case indirectly -- where another thread can see it. If we were to subclass EventListener2, we would have the same problem where the subclass method could be called before the subclass constructor completes.
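A possible repair for the inner-class case, again only a sketch, is to create the inner-class instance in the constructor but keep it in a private field, and hand it to the event source from a factory method afterwards. The Listener interface, the fire method on EventSource, and the names SafeEventListener2 and newInstance are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Event { }

// Hypothetical listener interface and event source.
interface Listener {
    void onEvent(Event e);
}

class EventSource {
    private final List<Listener> listeners = new ArrayList<>();

    synchronized void registerListener(Listener listener) {
        listeners.add(listener);
    }

    // Delivers an event to every registered listener.
    synchronized void fire(Event e) {
        for (Listener listener : listeners) {
            listener.onEvent(e);
        }
    }
}

final class SafeEventListener2 {
    private final Listener listener;
    private int received = 0;

    private SafeEventListener2() {
        // Creating the inner-class instance here is fine as long as it
        // stays in a private field and is not handed to anyone yet.
        listener = new Listener() {
            public void onEvent(Event e) {
                eventReceived(e);
            }
        };
    }

    static SafeEventListener2 newInstance(EventSource source) {
        SafeEventListener2 safe = new SafeEventListener2();
        source.registerListener(safe.listener);  // published only after construction
        return safe;
    }

    synchronized void eventReceived(Event e) {
        received++;
    }

    synchronized int receivedCount() {
        return received;
    }
}
```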
Don't start threads from within constructors
A special case of the problem in Listing 4 is starting a thread from within a constructor, because often when an object owns a thread, either that thread is an inner class or we pass the this reference to its constructor (or the class itself extends the Thread class). If an object is going to own a thread, it is best if the object provides a start() method, just like Thread does, and starts the thread from the start() method instead of from the constructor. While this does expose some implementation details (such as the possible existence of an owned thread) of the class via the interface, which is often not desirable, in this case the risks of starting the thread from the constructor outweigh the benefit of implementation hiding.
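The start() approach might look like the following sketch. The SumWorker class, its await() method, and the summing task are hypothetical and chosen only to keep the example small.

```java
// Hypothetical class that owns a thread but never starts it in the constructor.
class SumWorker {
    private final int limit;
    private final Thread thread;
    private long result;  // written by the worker thread, read after join()

    SumWorker(int limit) {
        this.limit = limit;
        // Create the thread here, but do not start it: nothing runs against
        // this object until the caller invokes start() after construction.
        this.thread = new Thread(this::compute, "sum-worker");
    }

    // Starts the owned thread; call this only on a fully constructed object.
    void start() {
        thread.start();
    }

    // Waits for the worker to finish and returns its result. join() makes
    // the worker's writes visible to the caller. If the wait is interrupted,
    // the interrupt flag is restored and the result may be incomplete.
    long await() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    private void compute() {
        long sum = 0;
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        result = sum;
    }
}

class SumWorkerDemo {
    public static void main(String[] args) {
        SumWorker worker = new SumWorker(100);
        worker.start();
        System.out.println(worker.await());  // prints 5050
    }
}
```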
What do you mean by "publish"?
Not all uses of the this reference during construction are harmful, only those that publish the reference where other threads can see it. Determining whether it is safe to share the this reference with another object requires detailed understanding of that object's visibility and what that object will do with the reference. Listing 5 contains some examples of safe and unsafe practices with respect to letting the this reference escape during construction:
Listing 5. Safe and unsafe practices with this
    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Set;

    public class Safe {
        private Object me;
        private Set set = new HashSet();
        private Thread thread;

        public Safe() {
            // Safe because "me" is not visible from any other thread
            me = this;

            // Safe because "set" is not visible from any other thread
            set.add(this);

            // Safe because MyThread won't start until construction is complete
            // and the constructor doesn't publish the reference
            thread = new MyThread(this);
        }

        public void start() {
            thread.start();
        }

        private class MyThread extends Thread {
            private Object theObject;

            public MyThread(Object o) {
                this.theObject = o;
            }

            ...
        }
    }

    public class Unsafe {
        public static Unsafe anInstance;
        public static Set set = new HashSet();
        private Set mySet = new HashSet();
        private Thread thread;

        public Unsafe() {
            // Unsafe because anInstance is globally visible
            anInstance = this;

            // Unsafe because SomeOtherClass.anInstance is globally visible
            SomeOtherClass.anInstance = this;

            // Unsafe because SomeOtherClass might save the "this" reference
            // where another thread could see it
            SomeOtherClass.registerObject(this);

            // Unsafe because set is globally visible
            set.add(this);

            // Unsafe because we are publishing a reference to mySet
            mySet.add(this);
            SomeOtherClass.someMethod(mySet);

            // Unsafe because the "this" object will be visible from the new
            // thread before the constructor completes
            thread = new MyThread(this);
            thread.start();
        }

        public Unsafe(Collection c) {
            // Unsafe because "c" may be visible from other threads
            c.add(this);
        }
    }
As you can see, many of the unsafe constructs in the Unsafe class bear a significant resemblance to the safe constructs in the Safe class. Determining whether the this reference can become visible to another thread can be tricky. The best strategy is to avoid using the this reference at all (directly or indirectly) in constructors. In reality, however, that's not always possible. Just remember to be very careful with the this reference and with creating instances of non-static inner classes in constructors.
More reasons not to let references escape during construction
The practices detailed above for thread-safe construction take on even more importance when we consider the effects of synchronization. For example, when thread A starts thread B, the Java Language Specification (JLS) guarantees that all variables that were visible to thread A when it starts thread B are visible to thread B, which is effectively like having an implicit synchronization in Thread.start(). If we start a thread from within a constructor, the object under construction is not completely constructed, and so we lose these visibility guarantees.
Because of some of its more confusing aspects, the JMM is being revised under Java Community Process JSR 133, which will (among other things) change the semantics of volatile and final to bring them more in line with general intuition. For example, under the current JMM semantics, it is possible for a thread to see a final field have more than one value over its lifetime. The new memory model semantics will prevent this, but only if a constructor is defined properly -- which means not letting the this reference escape during construction.
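Even without any memory-model subtleties, an escaped reference is enough to make a final field appear to change value. The single-threaded sketch below only illustrates the effect; another thread reading through the published reference could observe the same thing. The names Registry, Leaky, and FinalFieldDemo are hypothetical.

```java
// Hypothetical globally visible holder, standing in for any place
// where another thread could pick up the reference.
class Registry {
    static Leaky published;
    static int valueSeenAtPublication;

    static void publish(Leaky leaky) {
        published = leaky;
        // Reads the final field through the escaped reference.
        valueSeenAtPublication = leaky.value;
    }
}

class Leaky {
    final int value;

    Leaky() {
        Registry.publish(this);  // "this" escapes before value is assigned
        value = 42;
    }
}

class FinalFieldDemo {
    // Returns { value seen during construction, value seen afterwards }.
    static int[] observe() {
        Leaky leaky = new Leaky();
        return new int[] { Registry.valueSeenAtPublication, leaky.value };
    }

    public static void main(String[] args) {
        int[] seen = observe();
        System.out.println(seen[0] + " then " + seen[1]);  // prints "0 then 42"
    }
}
```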
Conclusion
Making a reference to an incompletely constructed object visible to another thread is clearly undesirable. After all, how can we tell the properly constructed objects from the incomplete ones? But by publishing a reference to this from inside a constructor -- either directly or indirectly through inner classes -- we do just that, and invite unpredictable results. To prevent this hazard, try to avoid using this, creating instances of inner classes, or starting threads from constructors. If you cannot avoid using this either directly or indirectly in a constructor, be very sure that you are not making the this reference visible to other threads.
Resources
Doug Lea's Concurrent Programming in Java, Second Edition (Addison-Wesley, 1999) is a masterful book on the subtle issues surrounding multithreaded programming in Java applications.
Synchronization and the Java Memory Model is an excerpt from Doug Lea's book that focuses on the actual meaning of synchronized.
"Double-checked locking: Clever, but broken" (JavaWorld, February 2001) and "Can double-checked locking be fixed?" (JavaWorld, May 2001) explore the JMM and the surprising consequences of failing to synchronize in certain situations.
In "Double-checked locking and the Singleton pattern" (developerWorks, May 2002), Peter Haggar gives a step-by-step explanation of how strange things can happen when you fail to synchronize.
Semantics of Multithreaded Java (PDF) details the proposed changes in the Java Memory Model as a result of JSR 133.
In "Writing multithreaded Java applications" (developerWorks, February 2001), Alex Roetter gives a basic overview of threads, synchronization, and locking in Java classes.
Find other Java technology content in the developerWorks Java technology zone.
About the author
Brian Goetz is a software consultant and has been a professional software developer for the past 15 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California. See Brian's published and upcoming articles in popular industry publications.