#EPI#Find running median from a stream of integers
2015-07-29 11:01
To take the median of a continuous stream when the data set is not too large, maintain two heaps: a max heap that stores the smaller half of the numbers and a min heap that stores the larger half. The median can then be read from the roots of the two heaps alone. For the first two elements, add the smaller one to the maxHeap on the left and the bigger one to the minHeap on the right. Then process the stream data one by one:

Step 1: Add the next item to one of the heaps. If the next item is smaller than the maxHeap root, add it to maxHeap; otherwise add it to minHeap.
Step 2: Balance the heaps (after this step the heaps will either be the same size or one of them will contain 1 more item). If the number of elements in one heap exceeds the other by more than 1, remove the root element from the larger heap and add it to the other one.

Then at any given time you can calculate the median like this:
If the heaps contain an equal number of elements, median = (root of maxHeap + root of minHeap) / 2.0. Otherwise, median = root of the heap with more elements.

If there is a very large amount of data: counting sort

If you can't hold all the items in memory at once, this problem becomes much harder. The heap solution requires you to hold all the elements in memory at once, which is not possible in many real-world applications of this problem.

Instead, as you see numbers, keep track of the number of times you see each integer. Assuming 4-byte integers, that's 2^32 buckets, or at most 2^33 integers (a key and a count for each int), which is 2^35 bytes or 32 GB. It will likely be much less than this because you don't need to store the key or count for entries whose count is 0 (i.e. like a defaultdict in Python). With a hash-based map, inserting each new integer takes constant time on average.

Then at any point, to find the median, just use the counts to determine which integer is the middle element: walk the values in sorted order, accumulating counts until you reach the middle position. The cost does not depend on the length of the stream, only on the number of distinct values, which is bounded by 2^32 (a large constant, but constant nonetheless).

reference:
http://stackoverflow.com/questions/10657503/find-running-median-from-a-stream-of-integers
https://gist.github.com/Vedrana/3675434

import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

// Given a stream of unsorted integers, find the median element in sorted order at any given time.
// http://www.ardendertat.com/2011/11/03/programming-interview-questions-13-median-of-integer-stream/
// Variant of the steps above: every number goes into maxHeap first, then the heaps are
// rebalanced depending on whether the count before the insert was even or odd.
public class MedianOfIntegerStream {

    // Min heap holding the larger half of the numbers seen so far.
    private final Queue<Integer> minHeap;
    // Max heap holding the smaller half; it holds one extra element when the count is odd.
    private final Queue<Integer> maxHeap;
    private int numOfElements;

    public MedianOfIntegerStream() {
        minHeap = new PriorityQueue<Integer>();
        maxHeap = new PriorityQueue<Integer>(10, new MaxHeapComparator());
        numOfElements = 0;
    }

    public void addNumberToStream(Integer num) {
        maxHeap.add(num);
        if (numOfElements % 2 == 0) {
            // Even count before this insert: maxHeap keeps the extra element.
            // If its root now exceeds minHeap's root, swap the two roots to restore the ordering.
            if (!minHeap.isEmpty() && maxHeap.peek() > minHeap.peek()) {
                Integer maxHeapRoot = maxHeap.poll();
                Integer minHeapRoot = minHeap.poll();
                maxHeap.add(minHeapRoot);
                minHeap.add(maxHeapRoot);
            }
        } else {
            // Odd count before this insert: move the largest of the smaller half across
            // so that both heaps end up the same size.
            minHeap.add(maxHeap.poll());
        }
        numOfElements++;
    }

    public Double getMedian() {
        if (numOfElements == 0) {
            throw new IllegalStateException("no numbers have been added yet");
        }
        if (numOfElements % 2 != 0) {
            return maxHeap.peek().doubleValue();
        }
        // Convert to double before adding so the sum cannot overflow int.
        return (maxHeap.peek().doubleValue() + minHeap.peek()) / 2.0;
    }

    private static class MaxHeapComparator implements Comparator<Integer> {
        @Override
        public int compare(Integer o1, Integer o2) {
            // Integer.compare avoids the overflow that o2 - o1 can cause for large values.
            return Integer.compare(o2, o1);
        }
    }

    public static void main(String[] args) {
        MedianOfIntegerStream streamMedian = new MedianOfIntegerStream();

        streamMedian.addNumberToStream(1);
        System.out.println(streamMedian.getMedian()); // should be 1

        streamMedian.addNumberToStream(5);
        streamMedian.addNumberToStream(10);
        streamMedian.addNumberToStream(12);
        streamMedian.addNumberToStream(2);
        System.out.println(streamMedian.getMedian()); // should be 5

        streamMedian.addNumberToStream(3);
        streamMedian.addNumberToStream(8);
        streamMedian.addNumberToStream(9);
        System.out.println(streamMedian.getMedian()); // should be 6.5
    }
}
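The counting approach described above for very large streams has no code in this post, so here is a minimal sketch of how it might look. The class name CountingMedian and its methods are made up for illustration and do not come from the references. It also swaps in a TreeMap instead of a hash map so that the counts can be walked in sorted order; that makes each insert O(log d) for d distinct values rather than constant time.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name: a sketch of the counting approach, not code from the references.
class CountingMedian {
    // value -> number of times it has been seen; only values actually seen get an entry.
    private final TreeMap<Integer, Long> counts = new TreeMap<>();
    private long total = 0;

    // Record one more occurrence of value.
    void add(int value) {
        counts.merge(value, 1L, Long::sum);
        total++;
    }

    // Walk the counts in ascending key order until the middle position(s) are reached.
    double median() {
        if (total == 0) {
            throw new IllegalStateException("no numbers have been added yet");
        }
        long lowerIndex = (total - 1) / 2; // 0-based position of the lower middle element
        long upperIndex = total / 2;       // equals lowerIndex when total is odd
        long seen = 0;
        Integer lower = null;
        for (Map.Entry<Integer, Long> e : counts.entrySet()) {
            seen += e.getValue();
            if (lower == null && seen > lowerIndex) {
                lower = e.getKey();
            }
            if (seen > upperIndex) {
                // Add as doubles so the sum cannot overflow int.
                return (lower.doubleValue() + e.getKey()) / 2.0;
            }
        }
        throw new AssertionError("unreachable: counts always sum to total");
    }

    public static void main(String[] args) {
        CountingMedian m = new CountingMedian();
        m.add(1);
        System.out.println(m.median()); // 1.0
        for (int x : new int[] {5, 10, 12, 2}) {
            m.add(x);
        }
        System.out.println(m.median()); // 5.0
        for (int x : new int[] {3, 8, 9}) {
            m.add(x);
        }
        System.out.println(m.median()); // 6.5
    }
}
```

Memory grows with the number of distinct values rather than with the length of the stream, which is the point of the technique. The trade-off is that each median query scans up to all distinct values.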