Think of the phonebook example:
- When you are searching by surname, you open the book in the middle and then see whether the name you are looking for is in the first half or the second half of the phonebook. Let’s say it’s in the first half.
- You then search the first half of the phonebook only. It’s like you’re searching a new, smaller phonebook. And so it goes on, with the phonebook being halved each time.
- However, you can’t use this technique if you’re searching by phone number, because the book is not sorted on phone number. Opening the phonebook in the middle and looking at a phone number doesn’t tell you whether the number you’re looking for is in the first part of the book or the second.
In order to use binary search, your list must be sorted on the search key.
However, notice that you have to be able to jump into the middle of the data. Certain data structures never support this kind of direct access, and so you cannot use binary search on them. Linked lists, for instance, don’t offer any means of direct access and accordingly binary search cannot be used on a linked list, even if it’s sorted.
[Thinking point: Your friend reads this last paragraph and disagrees. He says that binary search can be used with linked lists because the java.util.LinkedList class has a get() method such that myList.get(n) returns the nth element of the list. What is your response to him?]
In order to use binary search, your list must support direct access.
Searching through the phonebook by phone number would mean that you had to start at the very first entry and just go through the whole book from start to finish. This is called linear search. Sometimes you might find the phone number at the beginning of the book (a stroke of luck) and sometimes you might have to search all the way through. On average you would have to search N/2 items.
Searching through the phonebook by surname means that you can use binary search. The worst case scenario for binary search is that the number of items you have to examine is equal to the number of times that you can cut N in half, ie log2 N. The is very, very efficient. You can double the number of items in the list and binary search only has to do one more check.
Linear search is O(N). Binary search is O(log2 N).
Index files can be used to search data that is not sorted on the physical medium (ie the disk). It takes some time to build the indexes, but in future they can be used to locate data very quickly. Indexes sometimes require a lot of storage space though.
A Bubblesort of N items runs through the list N times. Running through the list once means examining all N items. Running through the list N times means examining each item N times; a total of N^2 operations.
Bubblesort is O(N^2).