One of the past paper questions in the File Organization presentation asks what you would do if you had 10,000 large data objects that you wanted to sort, but you couldn’t read them all into RAM at the same time because they are too big.
The answer to this is to read small batches of data into RAM, sort them there using some sorting algorithm like quicksort, and then write them back to the disk in order. Do this repeatedly and you will have, say, 100 batches of 100 sorted objects. Then you can take pairs of 100 objects and merge them into 200 sorted objects, then take pairs of 200 objects and merge them into 400 objects, 800, 1600, and so on until the whole data set is merged.
The point is that you can merge two sorted lists without reading the lists into RAM first. This is a useful technique.In fact it is one of the HL program dossier mastery factors.
Imagine you have list A and list B. This is the algorithm to merge them in ascending order:
Read an item from A and an item from B off the disk While there are still items Compare the two items you have If item A < item B then write item A to the disk read an item from A If item B < item A write item B to the disk read an item from B Loop
In this way you only ever have two items in RAM at the same time.
- Create a class FileMerger to perform this merge on the two lists given below. You will need to save the lists to two separate .txt files.
- The class should have three private class variables of type File called inputFileOne, inputFileTwo and outputFile.
- The class should have a constructor which takes two file paths and initializes the files to be sorted (ie creates new File objects and assigns them to inputFileOne and inputFileTwo).
- The class should have a method merge, which merges the inputFileOne and inputFileTwo into outputFile.
- The file path of outputFile should be a class string constant.
You should email this work to me as a java file (code) and jpeg image (sample output) before the beginning of next Wednesday’s lesson (April 18). The lists you need are below.
Save this data to a file called BlockE.txt
Cruz, Ma. Victoria
De Villa, Nathan Dela
Mohd Farid Shah, Aina
Villafuerte, Vincenzo Renato
Save this data to a file called BlockH.txt
Quah, Isabel Pei-Yi
R Pannirsilvam, Sanjeevan