mysql ibdata keeps growing

Didn’t manage to collect complete information for this post, so I’ll just write whatever I have.

It all began when our development database server ran out of disk space and crashed. We mounted a temporary virtual hard disk and moved the database there, and all was well for a while. With df and du we narrowed our culprit to mysql’s ibdata, which was growing so fast that we will run out of disk space again soon enough. Public information tells us ibdata is supposed to always grow, but we do not expect our dev ibdata to grow at this rate. After multiple searches this blog finally closes us in.

SHOW ENGINE INNODB STATUS

With the command, the innodb history list length was a very large number (>1mil) and keeps going up. A check with our test and production databases show that the length goes at most to a few hundred and drops back down — a significant difference. Restarting the database doesn’t help.

SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX

This showed us we had 16 XA transactions, that were started 2 days ago, but never committed or rolled back. They are not locked, so their trx_mysql_thread_id is 0. We immediately linked the causes together. Stuck XA transactions -> history list growing.

XA RECOVER

According to mysql docs, this command can rollback the XA transactions. The user comment was especially helpful on how to reproduce the xid, reproduced verbatim here:


To rollback the transaction, first get its xid:

mysql> xa recover;

+----------+--------------+--------------+------------------------------------------------------------+
| formatID | gtrid_length | bqual_length | data                                                       |
+----------+--------------+--------------+------------------------------------------------------------+
|   131075 |           30 |           28 | 1-a00640d:c09d:4ac454ef:b284c0a00640d:c09d:4ac454ef:b284c2 |
+----------+--------------+--------------+------------------------------------------------------------+
1 row in set (2.13 sec)

The xid is present in this output, but you have to perform a little string manipulation to get it. The format of a xid is: gtrid,bqual,formatID. The column 'data' contains a concatenation of 'gtrid' and 'bqual'. The columns 'gtrid_length' and 'bqual_length' specify how many bytes each of these values uses; use them to split apart 'data'. In this example, the result is:

mysql> xa rollback '1-a00640d:c09d:4ac454ef:b284c0','a00640d:c09d:4ac454ef:b284c2',131075;

ERROR 1402 (XA100): XA_RBROLLBACK: Transaction branch was rolled back

The tricky part here was that, my data had binary characters, which I couldn’t directly copy and paste in the MySQL Workbench. I couldn’t bear to write a program to read the value and write it back either, so I was poking around for solutions on that. From the same mysql doc page,

gtrid and bqual must be string literals, each up to 64 bytes (not characters) long. gtrid and bqual can be specified in several ways. You can use a quoted string ('ab'), hex string (0x6162, X'ab'), or bit value (b'nnnn').

Good, I could write the xid in hex. So I right-clicked the data column, and “Open Value in Viewer”. In the binary tab I copied down the hex values and reconstructed the xid as described by the helpful comment.

XA ROLLBACK X'7e3ae860eb21de21b84d392cb03bf8363b41482b9b1207f6e6823355012e91858c',X'526801e7500b06fd05a2f5882d20be19982a46aed4b6c26dc63887',4264851;

Viola, one by one the transactions were gone. Once the last one was rolled back, the history list started to decrease and behave in a similar pattern as our other databases, and the ibdata stopped growing at the crazy rate. The one last part I haven’t figured out is: how do I copy the hex values from MySQL Workbench, or how do I show the XA RECOVER data column in hex?

Comments

ConcurrentHashSet

You could use a map with a useless value as well and use containsKey, but using a Set makes the semantics clearer.

Collections.newSetFromMap(
	new ConcurrentHashMap<YourObject, Boolean>());

Comments

Delegates are not retained

When you assign a delegate, remember to retain it manually based on the lifetime of the source. Otherwise you will get a “EXC_BAD_ACCESS code=1” (invalid pointer) when the source attempts to fire the event to the delegate (e.g. when setting UITabBarController.delegate).

Comments

Firefox marquee restarts prematurely

I was trying to fix a HTML marquee bug today, where only on Firefox the marquee restarts itself before the whole message scrolls to the end. After stripping down the page the cause appears to be Firefox observing the length of the original text (which was just a single space character).

The HTML code below reproduces it. Click on “Change Marquee Text” to see the difference between the two marquees.


<html>
<body>
<marquee id="m" style="color:white;background:black;width:300px">At first</marquee>
<br/>
<button onclick="document.getElementById('m').innerHTML='Later, the marquee text became longer but Firefox does not recognize it.'">Change Marquee Text</button>
<br/>
<marquee id="m" style="color:white;background:black;width:300px">Later, the marquee text became longer but Firefox does not recognize it.</marquee>
</body>
</html>

Not worth to further investigate here, I just dynamically re-rendered a container div for this marquee.

Comments

double HTTP requests

We first discovered that our webapp was receiving duplicate HTTP AJAX requests from clients that results in a database insert. Fortunately jQuery had a nocache timestamp as part of the AJAX request so we could recognize it as a duplicate and reject the 2nd request.

As we tried to narrow down the cause we found that it happens on both GET/POST, as well as on a variety of browsers and network providers. After days of trying to reproducing the behavior we resorted to using iMacro to do browser automation and finally manage to reproduce it occasionally. What was surprising was that we were receiving the response of the 2nd response, which was the rejection message! (while the database insert was successful) It was absolutely confusing and we set up the automation again with Wireshark.

The packet analysis confirmed that the client browser was sending the duplicate request. However we also scanned Firebug and confirmed that only a single AJAX call was made! It didn’t make any sense to any of us until I noticed a pattern in the Wireshark logs that the 1st request made was always terminated without a response (either a TCP Connection Reset or a proper TCP Connection Close initiated by the server), and the 2nd request would be made on another kept-alive TCP connection. Following that symptom with Google I chanced upon a blog that highlighted HTTP 1.1 RFC 8.2.4.

If an HTTP/1.1 client sends a request which includes a request body, but which does not include an Expect request-header field with the “100-continue” expectation, and if the client is not directly connected to an HTTP/1.1 origin server, and if the client sees the connection close before receiving any status from the server, the client SHOULD retry the request.

Could this be it? We mocked up a HTTP server using raw Java ServerSockets by intentionally closing the first connection and allowing the second. We logged the requests on the server-side and voila! The mock server showed that it received two HTTP requests, Firebug showed that it sent one only, and Wireshark showed that the browser sent two.

Essentially, we just proved that the browsers adhered to HTTP 1.1 RFC 8.2.4…….

Comments (6)