Heap

Heap

Table of content

Basic concept

  1. Heap is data structure organized by a binary tree in which the value of a parent node is greater than its children (max heap).
  2. Heap in Python is implemented with the heapq module including the following functionalities
    1. heapq.heappush(heap, item): push an item into heap
    2. heapq.heappop(heap): pop an item from heap
    3. heapq.pushpop(heap, item): push an item into heap and pop an item from heap
    4. heapq.heapify(list): transform a list into heap
    5. heapq.heapreplace(heap, tiem): pop an item from heap and push an new item from heap
  3. Time complexity of a binary heap
    1. Insertion: O(logN)
    2. Deletion of min: O(logN)
    3. Remove: O(logN)
    4. findMin: O(1)
  4. Time Complexity compare to other data structures

    Structure Insertion Deletion of Min Remove findMin sort
    heap O(logN) O(logN) O(logN) O(1) O(NlogN)
    orderedarray O(N) O(1) O(1) O(N) O(N)
    unordered array O(1) O(N) O(1) O(N) O(NlogN)
    BST O(logN) O(logN) O(logN) O(logN) O(N)

Related coding exercises

Kth Largest Element in an Array

  1. Kth Largest Element in an Array problem is defined as to find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.
  2. The problem can be approached with a heap data structure where
    1. First to construct a heap array with all numbers in O(nlogn) time
    2. Then pop k times, each pop operation uses O(1) time
  3. An example Python solution using heap and internal sort is given as the following
    class Solution:
     # @param {integer[]} nums
     # @param {integer} k
     # @return {integer}
     def findKthLargest(self, nums, k):
         '''
         '''
         #return solution_sort(nums,k)
         return solution_heap(nums,k)
    import heapq
    def solution_heap(nums,k):
     h = []
     for num in nums:
         h.append(-num)
     heapq.heapify(h)
     print h 
     res = []
     for i in xrange(k):
         res = (heapq.heappop(h))
         print res
     return -res
     pass
    def solution_sort(nums,k):
     '''
     use build-in sort
     '''
     if len(nums) == 0: return None
     nums.sort()
     return nums[len(nums)-k]
     pass
    

Sliding Window Maximum

  1. Sliding Window Maximum problem is defined as given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position.
  2. heap approach: add element into a heap one by another, in each step
    1. Pop an element from heap until the index is within the k different from the current index
    2. The solution roughly takes O(nlogn) time
  3. ‘deque’ approach:
    1. Maintain a deque structure
    2. In each iteration
      1. if the deque is empty, add current number into the head of deque
      2. pop the element from the tail of the queue when its index is out of range
  4. An example Python solution is given as the following
    class Solution:
     # @param {integer[]} nums
     # @param {integer} k
     # @return {integer[]}
     def maxSlidingWindow(self, nums, k):
         return solution(nums,k)
    def solution(nums,k):
     '''
     29
     '''
     if len(nums) == 0: return nums
     h = []
     res = []
     for i in xrange(k):
         heapq.heappush( h, (-nums[i],i) )
     (val,i) = heapq.heappop(h)
     res.append(-val)
     heapq.heappush( h, (-nums[i],i) )
     print res
     for j in xrange(k,len(nums)):
         heapq.heappush( h, (-nums[j],j) )
         (val,i) = heapq.heappop(h)
         while i<j-k+1:
             (val,i) = heapq.heappop(h)
         res.append(-val)
         heapq.heappush( h,(-nums[i],i) )
     return res
    

Merge k Sorted Lists

  1. Merge k Sorted Lists problem is to merge k sorted linked lists and return it as one sorted list.
  2. The problem can be solved by maintaining a heap with $k$ elements, and in each step
    1. pop an element from the heap
    2. add one element from the corresponding list into heap
  3. An example Python solution is given as the following
    # Definition for singly-linked list.
    # class ListNode:
    #     def __init__(self, x):
    #         self.val = x
    #         self.next = None
    class Solution:
     # @param {ListNode[]} lists
     # @return {ListNode}
     def mergeKLists(self, lists):
         return solution(lists)
    def solution(lists):
     if len(lists) == 0: return None
     if len(lists) == 1: return lists[0] 
     res = []
     h = []
     for i in xrange(len(lists)):
         if lists[i] != None:
             heapq.heappush(h, (lists[i].val,i) )
             lists[i] = lists[i].next
     while len(h)>0:
         (val,i) = heapq.heappop(h)
         res.append(val)
         if lists[i] != None:
             heapq.heappush(h, (lists[i].val,i))
             lists[i] = lists[i].next
     if len(res) == 0: return None
     print res
     head = ListNode(res[0])
     p = head
     for i in xrange(1,len(res)):
         tmp = ListNode(res[i])
         p.next = tmp
         p=p.next
     return head
    

The Skyline Problem

  1. The goal of The Skyline Problem is to find a contour formed by a collection of buildings.
  2. The problem can also be solve with a heap structure in O(nlogn) time where n is the number of buildings from the input.
  3. In particular, the solution will keep a active building together with a heap of buildings and in each iteration
    1. it takes in a new building
    2. if the new building is overlap with the active building, the algorithm will either put it into the heap if it is not as high as the active one or use it as a active building and push the previous active building into the heap
    3. if the new building is not overlap with the active building, the algorithm will work out the contour of the building in the heap.
  4. An example Python solution is given as the following
    class Solution:
     # @param {integer[][]} buildings
     # @return {integer[][]}
     def getSkyline(self, buildings):
         if len(buildings) == 0: return []
         res = []
         heap = []
         activebuilding = ( buildings[0][2],buildings[0][0],buildings[0][1] )
         res.append( [activebuilding[1],activebuilding[0]] )
         i=0
         while i<len(buildings)-1:
             i+=1
             currentbuilding = (buildings[i][2],buildings[i][0],buildings[i][1])
             if len(buildings) < 10: print i,heap,res,activebuilding,currentbuilding
             if currentbuilding[1] > activebuilding[2]:
                 # current start on the right
                 # process current heap
                 if len(heap) == 0:
                     res.append( [activebuilding[2],0] )
                     activebuilding = currentbuilding
                     res.append( [activebuilding[1],activebuilding[0]] )
                 else:
                     while len(heap) > 0:
                         (a,b,c) = heapq.heappop(heap)
                         newactivebuilding = (-a,b,c)
                         if newactivebuilding[2] > activebuilding[2]:
                             break
                     if newactivebuilding == None:
                         res.append( [activebuilding[2],0] )
                         activebuilding = currentbuilding
                         res.append( [activebuilding[1],activebuilding[0]] )
                     else:
                         res.append( [activebuilding[2],newactivebuilding[0]] )
                         activebuilding = newactivebuilding
                         i-=1
             elif currentbuilding[1] < activebuilding[2]:
                 # current start on the left
                 if currentbuilding[0]>activebuilding[0]:
                     heapq.heappush(heap,(-activebuilding[0],activebuilding[1],activebuilding[2]))
                     activebuilding = currentbuilding
                     print '--', activebuilding
                     res.append([activebuilding[1],activebuilding[0]])
                 else:
                     heapq.heappush(heap,(-currentbuilding[0],currentbuilding[1],currentbuilding[2]))
             elif currentbuilding[1] == activebuilding[2]:
                 # current start on the same position
                 if currentbuilding[0] >= activebuilding[0]:
                     activebuilding = currentbuilding
                 else:
                     heapq.heappush(heap,(-currentbuilding[0],currentbuilding[1],currentbuilding[2]))
                     (a,b,c) = heapq.heappop(heap)
                     currentbuilding = (-a,b,c)
                     res.append([activebuilding[2],currentbuilding[0]])
                     activebuilding = currentbuilding
         if len(buildings) < 10: print 'heap', heap,res
         # no more element
         if len(heap) == 0:
             res.append( [activebuilding[2],0] )
         while len(heap) > 0:
             if len(buildings) < 10:print heap
             (a,b,c) = heapq.heappop(heap)
             currentbuilding = (-a,b,c)
             while len(heap) >0 and currentbuilding[2] <= activebuilding[2]:
                 (a,b,c) = heapq.heappop(heap)
                 currentbuilding = (-a,b,c)
             if currentbuilding[2] > activebuilding[2]:
                 res.append( [activebuilding[2],currentbuilding[0]] )
                 activebuilding = currentbuilding
         res.append([activebuilding[2],0])
         # process res
         newres = []
         for i in xrange(len(res)-1):
             if res[i][0] != res[i+1][0]:
                 newres.append(res[i])
         newres.append(res[len(res)-1])
         print 'newres',newres
         return newres
    
Hongyu Su 03 August 2015