您的位置:首页 > 其它

标准非STL容器 : bitset

2016-04-19 11:12 344 查看
1. 概念

什么是“标准非STL容器”?标准非STL容器是指“可以认为它们是容器,但是他们并不满足STL容器的所有要求”。前文提到的容器适配器stack、queue及priority_queue都是标准非STL容器的一部分。此外,valarray也是标准非STL容器。

bitset:一种高效位集合操作容器。

2. API

bitset提供的api:

(constructor) Construct bitset (public member function)

operator[] Access bit (public member function)

set Set bits (public member function)

reset Reset bits (public member function )

flip Flip bits (public member function)

to_ulong Convert to unsigned long integer (public member function)

to_string Convert to string (public member function)

count Count bits set (public member function)

size Return size (public member function)

test Return bit value (public member function )

any Test if any bit is set (public member function)

none Test if no bit is set (public member function)

3. 源码剖析

SGI bitset部分实现源码

[cpp] view
plain copy

template<size_t _Nb>

class bitset : private _Base_bitset<__BITSET_WORDS(_Nb)>

{

private:

typedef _Base_bitset<__BITSET_WORDS(_Nb)> _Base;

typedef unsigned long _WordT;

private:

void _M_do_sanitize() {

_Sanitize<_Nb%__BITS_PER_WORD>::_M_do_sanitize(this->_M_hiword());

}

.....

}

[cpp] view
plain copy

#define __BITS_PER_WORD (CHAR_BIT*sizeof(unsigned long))

#define __BITSET_WORDS(__n) \

((__n) < 1 ? 1 : ((__n) + __BITS_PER_WORD - 1)/__BITS_PER_WORD)

[cpp] view
plain copy

template<size_t _Nw>

struct _Base_bitset {

typedef unsigned long _WordT;

_WordT _M_w[_Nw]; // 0 is the least significant word.

_Base_bitset( void ) { _M_do_reset(); }

_Base_bitset(unsigned long __val) {

_M_do_reset();

_M_w[0] = __val;

}

static size_t _S_whichword( size_t __pos )

{ return __pos / __BITS_PER_WORD; }

static size_t _S_whichbyte( size_t __pos )

{ return (__pos % __BITS_PER_WORD) / CHAR_BIT; }

static size_t _S_whichbit( size_t __pos )

{ return __pos % __BITS_PER_WORD; }

static _WordT _S_maskbit( size_t __pos )

{ return (static_cast<_WordT>(1)) << _S_whichbit(__pos); }

_WordT& _M_getword(size_t __pos) { return _M_w[_S_whichword(__pos)]; }

_WordT _M_getword(size_t __pos) const { return _M_w[_S_whichword(__pos)]; }

_WordT& _M_hiword() { return _M_w[_Nw - 1]; }

_WordT _M_hiword() const { return _M_w[_Nw - 1]; }

void _M_do_and(const _Base_bitset<_Nw>& __x) {

for ( size_t __i = 0; __i < _Nw; __i++ ) {

_M_w[__i] &= __x._M_w[__i];

}

}

void _M_do_or(const _Base_bitset<_Nw>& __x) {

for ( size_t __i = 0; __i < _Nw; __i++ ) {

_M_w[__i] |= __x._M_w[__i];

}

}

void _M_do_xor(const _Base_bitset<_Nw>& __x) {

for ( size_t __i = 0; __i < _Nw; __i++ ) {

_M_w[__i] ^= __x._M_w[__i];

}

}

节选上述代码,可以得到:

1. bitset继承_Base_bitset,具体操作封装在_Base_bitset中

2. bitset 的size作为模板参数(非类型模板参数的一个要求是,编译器能在编译期就能把参数确定下来),因此,bitset大小在编译期固定,不支持插入和删除元素

3. 各种位操作,性能高

4._Base_bitset使unsigned long作为底层存储,不支持指针、引用、迭代器

5. 使用 _WordT _M_w[_Nw];分配内存,因此在栈中定义bitset需要注意大小(和STL标准容器堆内存分配区别开)。

eg,下面的代码将栈溢出(测试机器栈内存10M)

[cpp] view
plain copy

void fun()

{

const int n = 800000000;

bitset<n> a;

cout << a.size() << endl;

}

int main(int argc, char** argv)

{

fun();

return 0;

}

大内存分配可以分配在堆中,如下:

[cpp] view
plain copy

const int n = 800000000;

bitset<n> *a = new(std::nothrow) bitset<n>;

if(a)

{

cout << a->size() << endl;

delete a;

a = NULL;

}

4. vector<bool>及deque<bool>

bitset高效,但是size必须在编译器确定,不支持插入和删除。因此,一个可能的替代品是vector<bool>和deque<bool>

两者的区别:

vector<bool>不是一个STL容器,并且不容纳bool(like bitse底层t机制)

deque<bool>是一个STL容器,它保存真正的bool值

分别运行

[cpp] view
plain copy

deque<bool> a;

a[0] = 0;

bool* b = &a[0];

cout << *b << endl;



[cpp] view
plain copy

vector<bool> a;

a[0] = 0;

bool* b = &a[0];

cout << *b << endl;

将会发现:

使用deque<bool>正确,而是用vector<bool>会报错:“cannot convert `std::_Bit_reference*' to `bool*' in initialization“

但是,deque简直是在践踏内存。

使用deque<bool>

[cpp] view
plain copy

int main(int argc, char** argv)

{

deque<bool> a(10000000000);

sleep(100);

return 0;

}

内存使用:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

23612 work 25 0 9990m 9.8g
720 S 0.0 65.0 0:39.35 test

使用vector<bool>

[cpp] view
plain copy

int main(int argc, char** argv)

{

vector<bool> a(10000000000);

sleep(100);

return 0;

}

内存使用:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

23909 work 25 0 1198m 1.2g
716 S 0.0 7.8 0:01.31 test

使用bitset

[cpp] view
plain copy

int main(int argc, char** argv)

{

const unsigned long int n = 10000000000;

bitset<n> *a = new(std::nothrow) bitset<n>;

sleep(100);

return 0;

}

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

24439 work 25 0 1198m 1.2g
712 S 30.7 7.8 0:00.92 test

10亿个bool,vector<bool>和bitset使用内存1198M,deque<bool>则是9990M

5. 总结

在需要对位集合进行操作的时候,如何操作集合大小比较固定,优先选择高效的bitset;

如果需要动态增删元素,或者编译期间无法确定集合大小,则可以考虑vector<bool>,deque<bool>内存开销太大,基本上不考虑。

参考:
http://www.sgi.com/tech/stl/download.html http://www.cplusplus.com/reference/stl/vector/ http://www.cplusplus.com/reference/stl/bitset/
扩展阅读:


Vector specialization: vector<bool>

The vector class template has a special template specialization for the bool type.

This specialization is provided to optimize for space allocation: In this template specialization, each element occupies only one bit (which is eight times less than the
smallest type in C++:char).

The references to elements of a bool vector
returned by the vector members are not references tobool objects,
but a special member type which is a reference to a single bit, defined inside thevector<bool> class
specialization as:

[cpp] view
plain copy

class vector<bool>::reference {

friend class vector;

reference(); // no public constructor

public:

~reference();

operator bool () const; // convert to bool

reference& operator= ( const bool x ); // assign from bool

reference& operator= ( const reference& x ); // assign from bit

void flip(); // flip bit value.

}
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: