Collections are boxes which allow you to store and group multiple items. However, there are so many collection classes inside the Collection framework. Why is there a need for different collections? How do you choose which to use? Here we understand the characteristics of each collection and learn how to choose which one to use. By understanding the characteristics we can know which operations will run fast or slow on these collections, thus matching our task requirement to the correct collection.
Array
Strictly speaking, the array is not part of the collections framework. However it should be discussed because sometimes a simple array is the best solution instead of using a collection.
An array is a fixed-length collection accessed by index (number). As such it is fast for random access — accessing a random object within the collection, by its index number. Iteration take O(n) time, which is considered fast. Searching in an array is slow, since you need to iterate over the whole collection to find the object.
As such, use an array when you need to access the objects by index, and require little searching. You should also use an array if you have a small number of elements, as it is simpler to use than using fanciful collections.
ArrayList
An ArrayList is the next simplest collection after the array. It has similar characteristics as the array. Being accessed by index, it also allows fast random access. At the same time, it automatically grows as you add elements into the ArrayList.
Therefore use the ArrayList if you cannot pre-determine the size of your collection, or the size will grow and shrink over time. If the size is fixed or can be determined, use an array.
Vector
Vectors existed before ArrayLists, and used to be the default “variable-length array”. It behaves exactly like an ArrayList: it is accesed by index, allows fast random access and grows automatically.
The ONLY major difference is that a Vector is threadsafe. This means that a Vector can be accessed by multiple threads at the same time, e.g. one thread calling add() and one calling remove() on the same Vector. An unsynchronized ArrayList that participates in thread operations may encounter exceptions or incorrect behaviour. As a side-effect of being thread-safe, a Vector is slightly (very slightly) slower in executing its operations.
Use a Vector if your application is multi-threaded. Some developers use Vectors through out and never use ArrayLists, because the performance difference is too slight to even be noticed.
Note: Vectors are considered “legacy” implementations, ArrayLists are supposed to replace Vectors as part of the Collections framework. The correct way to create a synchronized ArrayList is to use the Collections.synchronizedList() method.
LinkedList
A linked list is a chain of objects, which allows easy insertion/deletion of objects in the middle of the list. To insert/delete an object into a middle of an ArrayList requires shifting of other objects in the list. This is absent in the LinkedList, since it can be achieved by switching the pointers in the chain. As a result, it is slow in random access, but still allows fast iteration.
If add/remove form the bulk of your operations, access the list by iteration with little searching, you should consider using a LinkedList.
Hashtable
A hashtable uses a map-based interface: it allows you to retrieve objects based on object keys. Since all retrieval is based on keys, it provides very fast random access but slow iteration. Searching by key is constant time, but searching a value in a hashtable requires iteration.
Beware not to overuse hashtables, since it is overhyped as being “faster than vectors”. They are only faster if you access elements by key, with little iteration. Use hashtables if you need mapping of one object to another. Also note that hashtables require unique keys.
HashMap
Just like Vectors and ArrayLists, a HashMap is an un-synchronized version of the Hashtable. Use a HashMap if your application is NOT multi-threaded, and you are concerned with the little efficiency increase. Most developers never use HashMaps at all; they stick to Hashtables.
Note: Hashtables are considered “legacy” implementations, HashMaps are supposed to replace Hashtables as part of the Collections framework. The correct way to create a synchronized HashMap is to use the Collections.synchronizedList() method.
HashSet
A HashSet ensure unique elements in the “Set”, as in a mathematical set. Use a HashSet if you require the feature of uniqueness. As it is backed by a HashMap, it is also un-synchronized.
Others
- TreeMap – A mapped collection (key -> value) where its keys are sorted in a tree. Allows very fast random access, AND fast ordered iteration. Takes up more memory as a result of maintaining the tree as well as the HashMap.
- TreeSet – Set implementation backed by a TreeMap. Results in unique, ordered elements.
- LinkedHashMap – A mapped collection (key -> value) where keys are doubly-linked. Allows very fast random access, AND forward/backward traversal.
- LinkedHashSet – Set implementation backed by a LinkedHashMap. Results in unique traversable elements.
Ordering is based on the object’s natural order (must implement Comparable). These special collections should only be used when you really require their combined features.
Manipulating Collections
The collections framework contains a number of utility functions to operate on a collection, such as sorting, searching, reversing, shuffling, etc. See the algorithms section at http://java.sun.com/j2se/1.5.0/docs/guide/collections/reference.html to discover what you can do without coding it yourself.
For example, to sort a List containing Customers (which are Comparable), simply do
ArrayList customerList = ...;
Collections.sort(customerList);
Also read the article on Equality and Hashcodes to understand how collections identify its objects.
Manipulating/Converting Arrays
Since arrays are not within the Collections framework, they cannot use the algorithms shown above. However, the java.util.Arrays class provides basic sorting and searching methods to arrays. You can also convert an array into a list and vice-versa to use Collection functions, or feed a method which requires an array as its parameter.
To convert an array into a list, use the Arrays.asList() method. To convert a list into an array, use the toArray() method.
Customer[] customerArray = ...;
// Array -> List
List customerList = Arrays.asList(customerArray);
// List -> Array
customerArray = customerList.toArray(customerArray);
Conclusion
Learning about collections simplify our code when working with multiple similar objects. Using the correct collection for the task will also increase performance of the application. The Java Collections Framework provides a number of common methods to manipulate collections, and you can easily convert between arrays and Lists.