C++11 Concurrency Tutorial – Part 2 : Protect shared data
2014-03-26 13:19
In the previous article, we saw how to start threads to execute some code in parallel. All the code executed in the threads was independent. In the general case, however, you will often share objects between threads, and as soon as you do, you face another problem: synchronization.

We will look at this problem through a simple piece of code.
Synchronization issues
As an example, we will take a simple Counter structure. This structure has a value and methods to increment or decrement that value:

```cpp
struct Counter {
    int value;

    Counter() : value(0) {}

    void increment() {
        ++value;
    }

    void decrement() {
        --value;
    }
};
```
There is nothing new here. Now, let’s start some threads and make some increments:
```cpp
int main() {
    Counter counter;

    std::vector<std::thread> threads;
    for (int i = 0; i < 5; ++i) {
        threads.push_back(std::thread([&counter]() {
            for (int i = 0; i < 100; ++i) {
                counter.increment();
            }
        }));
    }

    for (auto& thread : threads) {
        thread.join();
    }

    std::cout << counter.value << std::endl;

    return 0;
}
```

Again, nothing new here. We launch five threads, and each one increments the counter a hundred times. After all the threads have finished their work, we print the value of the counter.
If we run this program, we would expect it to print 500. But that is not the case: no one can say what this program will print. Here are some results I obtained on my computer:
442
500
477
400
422
487
The problem is that the increment is not an atomic operation. As a matter of fact, an increment is made of three operations:
Read the current value of value
Add one to the current value
Write that new value to value
When you run this code in a single thread, there is no problem: each part of the operation simply executes one after another. But when you have several threads, you can run into trouble. Imagine this situation:

Thread 1: reads the value, gets 0, adds 1, so value = 1
Thread 2: reads the value, gets 0, adds 1, so value = 1
Thread 1: writes 1 back to the field value and returns 1
Thread 2: writes 1 back to the field value and returns 1

Two increments were performed, but the counter only advanced by one: one update was lost.
These situations come from what we call interleaving. Interleaving describes the possible orders in which several threads can execute a set of statements. Even with only three operations and two threads, there are many possible interleavings; with more threads and more operations, it becomes almost impossible to enumerate them all. The problem can also occur when a thread gets preempted between the instructions of an operation.
There are several solutions to fix this problem:
Semaphores
Atomic references
Monitors
Condition codes
Compare and swap
etc.
In this blog post, we will learn how to use semaphores to fix this problem. As a matter of fact, we will use a special kind of semaphore called a mutex. A mutex is a very simple object: only one thread can hold the lock on a mutex at any given time. This simple (and powerful) property allows us to use it to fix synchronization problems.
Use a mutex to make our Counter thread-safe
In the C++11 threading library, mutexes live in the mutex header, and the class representing a mutex is std::mutex. A mutex has two important methods: lock() and unlock(). As their names indicate, the first lets a thread obtain the lock and the second releases it. The lock() method is blocking: a thread only returns from lock() once the lock has been obtained.
To make our Counter struct thread-safe, we add a std::mutex member to it and then lock()/unlock() that mutex in every member function of the object:
```cpp
struct Counter {
    std::mutex mutex;
    int value;

    Counter() : value(0) {}

    void increment() {
        mutex.lock();
        ++value;
        mutex.unlock();
    }
};
```

If we now test this implementation with the same thread-launching code as before, the program always displays 500.
Exceptions and locks
Now, let's see what happens in another case. Imagine that the Counter has a decrement operation that throws an exception when the value is 0:

```cpp
struct Counter {
    int value;

    Counter() : value(0) {}

    void increment() {
        ++value;
    }

    void decrement() {
        if (value == 0) {
            throw "Value cannot be less than 0";
        }

        --value;
    }
};
```
You want to access this structure concurrently without modifying the class. So you create a wrapper with locks for this class:
```cpp
struct ConcurrentCounter {
    std::mutex mutex;
    Counter counter;

    void increment() {
        mutex.lock();
        counter.increment();
        mutex.unlock();
    }

    void decrement() {
        mutex.lock();
        counter.decrement();
        mutex.unlock();
    }
};
```

This wrapper works well in most cases, but when an exception occurs in the decrement method, you have a big problem. Indeed, if an exception is thrown, unlock() is never called, so the lock is never released and your program is completely blocked. To fix this, you have to catch the exception, unlock the mutex, and rethrow:
```cpp
void decrement() {
    mutex.lock();

    try {
        counter.decrement();
    } catch (...) {
        // Release the lock before propagating the exception.
        // catch (...) also handles the const char* thrown
        // by Counter::decrement.
        mutex.unlock();
        throw;
    }

    mutex.unlock();
}
```
The code is not difficult, but it is starting to look ugly. Now imagine a function with 10 different exit points: you would have to call unlock() at each of them, and the probability of forgetting one is high. Even higher is the risk that you won't add the unlock() call when you later add a new exit point to the function.
The next section gives a very nice solution to this problem.
Automatic management of locks
When you want to protect a whole block of code (a function in our case, but it could be the body of a loop or some other control structure), there is a good way to avoid forgetting to release the lock: std::lock_guard.

This class is a simple smart manager for a lock. When the std::lock_guard is created, it automatically calls lock() on the mutex, and when the guard is destructed, it releases the lock. You can use it like this:
```cpp
struct ConcurrentSafeCounter {
    std::mutex mutex;
    Counter counter;

    void increment() {
        std::lock_guard<std::mutex> guard(mutex);
        counter.increment();
    }

    void decrement() {
        std::lock_guard<std::mutex> guard(mutex);
        counter.decrement();
    }
};
```

Much nicer, isn't it?
With this solution, you do not have to handle every exit path of the function yourself: they are all handled by the destructor of the std::lock_guard instance.
Conclusion
We are now done with semaphores. In this article, you learned how to protect shared data using mutexes from the C++11 threads library.

Keep in mind that locks are slow: when you use them, you make sections of the code sequential. If you want a highly parallel application, there are other solutions that perform much better than locks, but they are out of the scope of this article.
Next
In the next blog post of this series, I will talk about more advanced concepts for mutexes and how to use condition variables to solve small concurrent programming problems.

from http://www.baptiste-wicht.com/2012/03/cp11-concurrency-tutorial-part-2-protect-shared-data/