Wednesday, 27 December 2017

Algorithms | Interpolation Search: Beating Binary Search

Interpolation search is an algorithm to search given key in a uniformly distributed sorted array. It is an improvement over Binary Search when the keys are uniformly distributed in a sorted array.

Binary search always chooses the middle element for comparison, discarding one half of the search space or the other. However, Interpolation search tries to predict where the key would lie in the search space via a linear interpolation, reducing the search space to the part before or after the estimated position if the key is not found there.

A linear interpolation of values, 'key' should be found at position:

pos = low + [(key - array[low]) * (high-low) / (array[high]-array[low])]

here,
array[]: Array where keys are stored.
key: Key to be searched
low: Starting index in array[]
high: Ending index in array[]

Algorithm
It works same as Binary search except above interpolation logic.

1. Calculate the value of "pos" using the linear interpolation formula.
2. if it is a match, return the position of the item, and exit.
3. if the key < array[pos]
3.a. Calculate the linear interpolation of the left sub-array.
Otherwise,
3.b. Calculate the linear interpolation in the right sub-array.
4. Repeat until a match is found or the sub-array reduces to zero.

Pseudo Code
int interpolationSearch(int[] array, int x) {
int low = 0;
int high = array.length - 1;
int pos;

while ((array[high] != array[low]) && (x >= array[low]) && (x <= array[high])) {
pos = low + ((x - array[low]) * (high - low) / (array[high] - array[low]));

if (array[pos] < x) {
low = pos + 1;
} else if (x < array[pos]) {
high = pos - 1;
} else {
return pos;
}
}

if (x == array[low]) {
return low;
} else {
return -1;
}
}

Complexity
This approach works faster than Binary search when the keys are uniformly distributed. Average case complexity of interpolation search is O(log log N) if the keys are uniformly distributed, where N is the number of keys.

However, the worst case complexity is O(N) e.g., searching for 1000 in 1, 2, 3, …, 999, 1000, 10^9.