Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Kth Largest Element In a Stream

Difficulty: Easy Source: NeetCode

Problem

Design a class to find the kth largest element in a stream. Note that it is the kth largest element in sorted order, not the kth distinct element.

Implement the KthLargest class:

  • KthLargest(int k, int[] nums) — initializes the object with the integer k and the stream of integers nums.
  • int add(int val) — appends the integer val to the stream and returns the element representing the kth largest element in the stream.

Example 1: Input: k = 3, nums = [4, 5, 8, 2], then add(3), add(5), add(10), add(9), add(4) Output: [4, 5, 5, 8, 8]

Constraints:

  • 1 <= k <= 10^4
  • 0 <= nums.length <= 10^4
  • -10^4 <= nums[i] <= 10^4
  • -10^4 <= val <= 10^4
  • At most 10^4 calls will be made to add
  • It is guaranteed that there will be at least k elements in the array when add is called

Prerequisites

Before attempting this problem, you should be comfortable with:

  • Heaps / Priority Queues — understanding min-heap and max-heap structure and operations
  • Python’s heapq moduleheappush, heappop, and the fact that Python only provides a min-heap

1. Brute Force

Intuition

Every time add is called, insert the new value into a sorted list, then return the element at index -k (the kth from the end). Sorting on every call is simple but wasteful — we redo work we already did.

Algorithm

  1. In __init__, store k and concatenate val into a sorted list.
  2. In add(val), append val, sort the list, then return sorted_list[-k].

Solution

import bisect

class KthLargest:
    def __init__(self, k, nums):
        self.k = k
        self.stream = sorted(nums)

    def add(self, val):
        bisect.insort(self.stream, val)
        return self.stream[-self.k]


# Test 1
obj = KthLargest(3, [4, 5, 8, 2])
print(obj.add(3))   # 4
print(obj.add(5))   # 5
print(obj.add(10))  # 5
print(obj.add(9))   # 8
print(obj.add(4))   # 8

# Test 2
obj2 = KthLargest(1, [])
print(obj2.add(-3))  # -3
print(obj2.add(-2))  # -2

Complexity

  • Time: O(n log n) init, O(log n) per add (with bisect insert)
  • Space: O(n)

2. Min-Heap of Size K

Intuition

We only care about the top k largest elements — everything smaller is irrelevant. A min-heap of size exactly k keeps exactly those elements, with the smallest of the top-k sitting at the heap root. That root is our answer: it’s the kth largest. Whenever the heap exceeds size k, pop the minimum to evict the element that just fell out of the top-k.

Algorithm

  1. In __init__, heapify the input and pop elements until the heap has exactly k items.
  2. In add(val), push val onto the heap. If the heap size exceeds k, pop the minimum. Return heap[0] (the kth largest).

Solution

import heapq

class KthLargest:
    def __init__(self, k, nums):
        self.k = k
        self.heap = nums[:]
        heapq.heapify(self.heap)
        # Trim down to k elements
        while len(self.heap) > k:
            heapq.heappop(self.heap)

    def add(self, val):
        heapq.heappush(self.heap, val)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)
        return self.heap[0]  # root = kth largest


# Test 1
obj = KthLargest(3, [4, 5, 8, 2])
print(obj.add(3))   # 4
print(obj.add(5))   # 5
print(obj.add(10))  # 5
print(obj.add(9))   # 8
print(obj.add(4))   # 8

# Test 2 — stream starts empty
obj2 = KthLargest(1, [])
print(obj2.add(5))   # 5
print(obj2.add(3))   # 5
print(obj2.add(7))   # 7

# Test 3 — k equals length of initial nums
obj3 = KthLargest(2, [3, 1, 2])
print(obj3.add(0))  # 1

Complexity

  • Time: O(n log n) init, O(log k) per add
  • Space: O(k)

Common Pitfalls

Forgetting to trim the heap in __init__. If you skip the initial trim, the heap can hold more than k elements and heap[0] will not be the kth largest — it will be the overall minimum.

Using a max-heap unnecessarily. A max-heap of size n would let you find the kth largest, but it stores all elements and costs O(log n) per operation instead of O(log k). The min-heap of size k is both simpler and faster.

Off-by-one on size check. After pushing, the heap can be size k + 1 at most. The pop should happen when len(heap) > k, not >= k.