Question

Given an array of integers, find out whether there are two distinct indices i and j in the array such that the absolute difference between nums[i] and nums[j] is at most t and the absolute difference between i and j is at most k.

Example 1:

Input: nums = [1,2,3,1], k = 3, t = 0
Output: true

Example 2:

Input: nums = [1,0,1,1], k = 1, t = 2
Output: true

Example 3:

Input: nums = [1,5,9,1,5,9], k = 2, t = 3
Output: false

Solution

Inspired by bucket sort, we can achieve linear time complexity in our problem using buckets as window.

Bucket sort is a sorting algorithm that works by distributing the elements of an array into a number of buckets. Each bucket is then sorted individually, using a different sorting algorithm. For example, we have 8 unsorted integers, 20, 13, 4, 49, 41, 32, 9, 18. We create 5 buckets covering the inclusive ranges of [0,9], [10,19], [20, 29], [30, 39], [40, 49] individually. Each of the eight elements is in a particular bucket. For element with value x, its bucket label is x / w and here we have w = 10. Sort each bucket using some other sorting algorithm and then collect all of them bucket by bucket.

Back to our problem, the critical issue we are trying to solve is: For a given element x is there an item in the window that is within the range of [x-t, x+t]?

Could we do this in constant time?

Let us consider an example where each element is a person’s birthday. Your birthday, say some day in March, is the new element x. Suppose that each month has 30 days and you want to know if anyone has a birthday within t = 30 days of yours. Immediately, we can rule out all other months except February, March, April.

The reason we know this is because each birthday belongs to a bucket we called month! And the range covered by the buckets are the same as distance t which simplifies things a lot. Any two elements that are not in the same or adjacent buckets must have a distance greater than t.

We apply the above bucketing principle and design buckets covering the ranges of ..., [0,t], [t+1, 2t+1], ......,[0,t],[t+1,2t+1],.... We keep the window using this buckets. Then, each time, all we need to check is the bucket that x belongs to and its two adjacent buckets. Thus, we have a constant time algorithm for searching almost duplicate in the window.

One thing worth mentioning is the difference from bucket sort – Each of our buckets contains at most one element at any time, because two elements in a bucket means “almost duplicate” and we can return early from the function. Therefore, a HashMap with an element associated with a bucket label is enough for our purpose.

class Solution(object):
    def getId(self, n, w):
        if n >= 0:
            return n // w
        else:
            return n // w - 1

    def containsNearbyAlmostDuplicate(self, nums, k, t):
        """
        :type nums: List[int]
        :type k: int
        :type t: int
        :rtype: bool
        """
        if k <= 0 or t < 0:
            return False

        w = t + 1
        buckets = dict()
        for i in range(len(nums)):
            n = nums[i]
            id = self.getId(n, w)
            if id in buckets:
                return True
            if id - 1 in buckets and -t <= n - buckets[id - 1] <= t:
                return True
            if id + 1 in buckets and -t <= n - buckets[id + 1] <= t:
                return True
            buckets[id] = n
            if i >= k:
                del buckets[self.getId(nums[i - k], w)]

        return False