When they say HashMap is not thread-safe, it means it is not.

We had this deadlock-like behavior where the UI froze sometimes when performing a particular action. But when run with our profiler, it did not report a deadlock when it froze.

After hours of tracing, the cause was identified as an infinite loop on the HashMap.get() method, running in the AWT Event Dispatch thread. The HashMap is the constraints object in a javax.swing.SpringLayout. Why would that happen???

More tracing revealed that there were multiple threads trying to add components to this container, triggering SpringLayout to add new constraints to its map. With reasonable luck the map hits its threshold and tries to resize itself. If you understand hashing you know that this is when it picks a new table size and re-hashes all the objects into its new buckets. When two of this happen at the same time the linked list screws up, and with more luck a later element in the link list points to the next element which is earlier in the same list. The worse part is this is a hidden problem, it doesn’t “surface” until you try to get() an object that’s even later than the later element, that’s when it iterates the bucket past the later element, back to the earlier element, and repeats.

The typical solution to swap the HashMap for a Hashtable is not available here, because the map exists in the core JDK. So it’s a good time to re-iterate to the team the significance of “single threaded model” of AWT/Swing. To be explicit, this issue was resolved by queuing the adding of components to the container on the Event Dispatch thread. Of course a more direct approach would be to synchronize the points where the components were added, but you shouldn’t be doing such things outside the AWT Event Dispatch thread anyway.

When I googled about the HashMap.get() infinite loop I realize it occurs in other areas as well, e.g. a Servlet’s session may be represented as a HashMap. Concurrent put()s and then a get() can cause the same problem.

6 Comments »

  1. Jun Xu said,

    February 21, 2011 at 8:48 pm

    Hi ,
    I got the same issue and want to know how this happened. I also think that the cause is that “When two of this happen at the same time the linked list screws up”, but I just can’t figure it out when checking the source code in JDK.

    As the newTable is created in method scope (void resize(int newCapacity)) for each thread, there is no way I can imagine that we get circular link in the list

    Entry[] newTable = new Entry[newCapacity];

    Could you elaborate more on how the “screw up” thing will happen in multi-thread conext ?

    Thanks,
    Tony

    void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
    threshold = Integer.MAX_VALUE;
    return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
    }

    /**
    * Transfers all entries from current table to newTable.
    */
    void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
    Entry e = src[j];
    if (e != null) {
    src[j] = null;
    do {
    Entry next = e.next;
    int i = indexFor(e.hash, newCapacity);
    e.next = newTable[i];
    newTable[i] = e;
    e = next;
    } while (e != null);
    }
    }
    }

  2. geek said,

    February 22, 2011 at 10:02 am

    You’re right that the newTable is local, but the elements that are to go in are not.

    In the transfer(Entry[]) method called by resize(int) method, the src variable references the table which is not local, and it performs direct mutation on the e.next property. In a multi-threaded situation there is a possibility that the same Entry e gets assigned at the same time before src[j] gets null-ed.

    I see other possibilities such as having two newTable-s floating around may mean e.next may point to either newTable and the final table will only be one of the two newTable-s but I have not sequenced that out in detail.

  3. Jun Xu said,

    February 22, 2011 at 5:03 pm

    Thanks so much. You’re talking about two threads starting the resize at the same time and I agree that this most possible way to create a circular link. The hard thing is to “imagine” a scenario that will create a circular link either within one newTable or between two newTables – I’ve tried serveral case with your hints in which the shared variable element e get used in two threads concurrently but just can’t figure out one that really makes the link a circle– seems that all the linkages created between Entries are unidirectional, so even two newTables can refer to one Entry, that Entry will not link back to any of the newTable, because every new Entry is inserted ahead of all existing entries in every linked list !

  4. geek said,

    February 23, 2011 at 8:11 am

    That’s true. In this case that might not be the direct cause to the circular link, although it does corrupt the table. You got me interested in this again but I’ve not the luxury of time to dig into it at this moment. Could it be other mutator methods playing tricks (put running at the same time with a resize? But no other existing element should point to the new one yet!)

    Do share with me if you have findings!

  5. Jun Xu said,

    February 23, 2011 at 11:03 pm

    Thanks. I do find one scenario that produces a circular link. Try to show it step by step as below (you may need to re-format the table before you can read it).

    source code from JDK:
    void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
    Entry e = src[j];
    if (e != null) {
    src[j] = null;
    do {
    (line479) Entry next = e.next;
    (line480) int i = indexFor(e.hash, newCapacity);
    (line481) e.next = newTable[i];
    (line482) newTable[i] = e;
    (line483) e = next;
    } while (e != null);
    }
    }
    }

    original table, src[i]==A, A->B

    Time(heap view) newTable1[j](Thread-1) Time(heap view) newTable2[j](Thread-2)
    t1(line479,e==A,next==B) EMPTY t1(line479,e==A,next==B) EMPTY

    t2(line482,e.next==null) A t2(line482,e.next==null) A

    t3(line483,e==B) A t3(line483,e==B) A

    t4(line479,e==B,next=null) A t4(line483, HANGUP) A

    t5(line482,e.next==A) B->A t5(line483, HANGUP) A

    t6(line483,e==null) B->A t6(line483, HANGUP) A

    t7(done) B->A t7(line479, e==B,next==B.next==A) A

    t8(done) B->A t8(line482, e.next==A) B->A

    t9(done) B->A t9(line483, e==A) B->A

    t10(done) B->A t10(line479,e==A, next==null) B->A

    t11(done) B->A t11(line482,e.next==B) B->A->B (circular link)

  6. Yet another story about high cpu utilization… « performance stories said,

    October 2, 2012 at 1:18 am

    […] http://geek.starbean.net/?p=343(In particular, see the comment from February 23, 2011 at 11:03 pm by Jun Xu). […]

RSS feed for comments on this post · TrackBack URL

Leave a Comment