Heap Sort [Algorithm]

2010-12-07 19:35

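The usual background:

Insertion sort is O(n^2) in the worst case, but its inner loop is tight, so for small inputs it is a fast in-place sorting algorithm (in place: it works inside the array itself, with only a constant number of elements stored outside it at any time). Merge sort has an asymptotic running time of O(n lg n), but its merge step does not operate in place (it needs auxiliary storage proportional to the range being merged). Heap sort combines the advantages of the two: it sorts n numbers in place in O(n lg n) time [1].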

Analysis

      A heap is a typical complete binary tree, stored in an array of n elements.

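      With indices starting at 0, the leaves start at index n/2 (integer division, i.e. rounded down). Proof: for any node with index i, the left and right children are 2*i+1 and 2*i+2. Node i is a non-leaf exactly when its left child exists, i.e. 2*i+1 <= n-1, which gives i <= n/2 - 1. So the last non-leaf node is n/2 - 1, and the next index, n/2, is the first leaf. (With indices starting at 1, as in [1], the corresponding positions are n/2 and n/2+1.)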

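      The key to heap sort is an adjustment function: each call recursively adjusts one subtree so that it keeps the max-heap (or min-heap) property. One detail of this function is the range it acts on, which is passed in as a parameter.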

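Implementation

In debug mode, the program uses macros to add some "scaffolding": output inserted into the program to observe how it runs, together with code for measurement and component testing [2]. Such aids are a bit like tooling in industrial manufacturing, that is, the process equipment used in production: a collective term for the tools used during manufacturing, including cutting tools, fixtures, molds, measuring tools, inspection tools, auxiliary tools, bench tools, and workstation equipment [3].

Code with scaffolding does not look so tidy. Like a construction site full of tooling, it is certainly not as neat as a finished high-rise. The complete code follows: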

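#include <iostream>
#include <algorithm>
#include <vector>
#include <limits>     // std::numeric_limits
#include <cstdlib>    // srand, rand, system
#include <ctime>
#include <cassert>

using namespace std;

// Swap two ints. As a non-template function it is preferred over
// std::swap when called with int arguments.
void swap(int& a, int& b)
{
    int temp = a;
    a = b;
    b = temp;
}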

/** Adjust node i so that the subtree rooted at i keeps the max-heap property
@param ptr   pointer to the array
@param range the adjustment acts on the index range [0, range-1]
@param i     index of the node being adjusted
**/
void max_heapify(int* ptr, int range, int i)
{
    int left = i*2+1;
    int right = i*2+2;

    int largest;    // index of the largest element among left, i, right

    if ( left < range && ptr[left] > ptr[i] )
        largest = left;
    else
        largest = i;

    if ( right < range && ptr[right] > ptr[largest] )
        largest = right;

    if (largest!=i)
    {
        swap(ptr[i], ptr[largest]);

        // the swap may have broken the heap property one level down
        max_heapify(ptr, range, largest);
    }
}

// Build a max-heap bottom-up: adjust every non-leaf node,
// from the last parent back to the root.
void build_max_heap(int* ptr, int length)
{
    int end = length/2;    // end: first leaf position; end-1: last non-leaf node

    for (int i=end-1; i>=0; i--)
        max_heapify(ptr, length, i);
}

// Sort ptr[0..length-1] in ascending order, in place.
void heap_sort(int* ptr, int length)
{
    build_max_heap(ptr, length);

#ifdef _DEBUG
    cout << "The max heap is: " << endl;
    for (int t(0); t<length; t++) cout << ptr[t] << " ";
    cout << endl;

    cout << "Heap sort process: " << endl;
#endif

    // lastId > 0 (rather than just lastId) also handles length == 0 safely
    for (int lastId(length-1); lastId > 0; lastId--)
    {
        // move the current maximum to its final position
        swap(ptr[0], ptr[lastId]);

#ifdef _DEBUG
        for (int t(0); t<lastId; t++) cout << ptr[t] << " ";
        cout << "--";
#endif
        max_heapify(ptr, lastId, 0);    //adjust [0, lastId-1].

#ifdef _DEBUG
        for (int t(lastId); t<length; t++) cout << ptr[t] << " ";
        cout << endl;
#endif
    }
}

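// Despite its name, this is a "less than" comparator: with it, make_heap
// builds a max-heap and sort_heap produces ascending order.
bool larger(int left, int right)
{
    return left < right;
}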

int main ()
{
#define P_2

#ifdef P_1
    {
        int myints[] = {10,20,30,5,15};
        vector<int> v(myints,myints+5);

        make_heap (v.begin(),v.end());
        sort_heap (v.begin(),v.end());

        cout << "final sorted range :";
        for (unsigned i=0; i<v.size(); i++) cout << " " << v[i];
        cout << endl;
    }
#endif // P_1

#ifdef P_2
    {
        const int test_num = 10000000;    // 10 million
        srand((unsigned)time(0));

        cout << "Test number " << "\t:\t\t" << test_num << std::endl;

        int* testPtr = new int[test_num];
        int* testPtr1 = new int[test_num];

        // fill both arrays with the same random data
        for(int idx=0; idx<test_num; idx++)
            testPtr1[idx] = testPtr[idx] = rand();

        clock_t t = clock();
        heap_sort(testPtr, test_num);
        cout << "heap_sort " << "\t:\t\t" << clock() - t << std::endl;

        t = clock();
        make_heap(testPtr1, testPtr1+test_num, larger);
        std::sort_heap(testPtr1, testPtr1+test_num, larger);
        cout << "std::sort_heap " << "\t:\t\t" << clock() - t << std::endl;

        // Test the result validity
        for(int idx=0; idx<test_num; idx++)
            if (testPtr[idx]!=testPtr1[idx])
                cout << "wrong";

        delete[] testPtr;     testPtr = 0;
        delete[] testPtr1;    testPtr1 = 0;
    }
#endif // P_2

#ifdef P_3
    {
        int myints[] = {5,2,4,7,1,3,6};
        int length = sizeof(myints)/sizeof(myints[0]);

        heap_sort(myints, length);
        cout << endl;
        for (int i(0); i<length; i++) cout << myints[i] << " ";
        cout << endl;
    }
#endif // P_3

    system("PAUSE");    // Windows only: keeps the console window open

    return 0;
}

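My own heap_sort is somewhat less efficient than the STL's sort_heap (its main purpose was to verify the idea). The results are below; the times are raw clock() ticks: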

Test number     :               10000000
heap_sort       :               7734
std::sort_heap  :               5556

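Postscript

There are two ways to build a heap: one starts from the last parent node and adjusts the heap in reverse order; the other builds the heap by insertion, starting from the second element. I assumed the STL builds its heap by insertion, but the C++ standard requires std::make_heap to perform at most 3N comparisons, a bound that repeated insertion cannot guarantee in the worst case, so implementations more likely use some variant of the bottom-up scheme. The idea of building a max-heap by insertion: first place a minimal value at the end of the heap, which still satisfies the max-heap property; then raise that key to its real value in a loop, keeping the max-heap property.

I always had one question: is the heap built by insertion unique? Test result: for the same input, insertion build, adjustment build, and the STL's build all give different heaps.

The code is as follows: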

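// Set ptr[pos] to key (assumed not smaller than the old value)
// and move it up toward the root until the max-heap property holds.
void heap_increase_key(int* ptr, int pos, int key)
{
    ptr[pos] = key;

    int parent = (pos-1)/2;
    // pos > 0 stops at the root explicitly instead of relying on
    // (0-1)/2 truncating to 0
    while( pos>0 && ptr[parent]<ptr[pos] )
    {
        swap(ptr[pos], ptr[parent]);
        pos = parent;

        parent = (pos-1)/2;
    }
}

// Insert key at index lastId, the first slot past the current heap [0, lastId-1].
void heap_insert_key(int* ptr, int lastId, int key)
{
    // requires #include <limits>
    int imin = std::numeric_limits<int>::min();
    ptr[lastId] = imin;

    heap_increase_key(ptr, lastId, key);
}

// Build a max-heap by inserting ptr[1..length-1] one at a time;
// ptr[0] alone is the initial heap.
void insert_build_max_heap(int* ptr, int length)
{
    for (int i=1; i<length; i++)
        heap_insert_key(ptr, i, ptr[i]);
}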

add - 2010/12/8 p.m

[1] Introduction to Algorithms - Heapsort

[2] Programming Pearls - Confessions of a Coder

[3] Baidu Baike - 工装 (tooling) http://baike.baidu.com/view/907594.htm

add - 2011/9/21 p.m

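Find_Top_K problem: find the largest K elements in an array of N elements:

Build a min-heap, K-Min-Heap, from the first K elements [0, K-1];

Traverse elements K through N-1, comparing each with the top of the heap:

if it is not larger, skip it;

if it is larger, replace the top of the heap with it and adjust K-Min-Heap so that it keeps the min-heap property;

Each K-Heap operation takes O(log2(K)) time, so the whole scan takes O(N log2(K)).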
