mysql ibdata keeps growing

Didn’t manage to collect complete information for this post, so I’ll just write whatever I have.

It all began when our development database server ran out of disk space and crashed. We mounted a temporary virtual hard disk and moved the database there, and all was well for a while. With df and du we narrowed the culprit down to MySQL’s ibdata file, which was growing so fast that we would soon run out of disk space again. Public information tells us ibdata is expected to only ever grow, but we did not expect our dev ibdata to grow at this rate. After multiple searches, a blog post finally pointed us in the right direction.

SHOW ENGINE INNODB STATUS

The output showed an InnoDB history list length that was very large (over 1 million) and still climbing. On our test and production databases the length peaks at a few hundred and then drops back down, a significant difference. Restarting the database didn’t help.
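To watch that number over time rather than eyeballing the status dump, the value can be pulled out of the text. This is only a sketch: it assumes the status output contains a line of the form "History list length N" (which is how it appears in the TRANSACTIONS section), and the class and method names are made up for illustration.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the history list length from the text
// returned by SHOW ENGINE INNODB STATUS.
final class HistoryListLength {
    // Assumes the line looks like "History list length 1234".
    private static final Pattern LINE =
            Pattern.compile("(?m)^History list length (\\d+)\\s*$");

    static OptionalLong parse(String statusText) {
        Matcher m = LINE.matcher(statusText);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }
}
```

Logging the parsed value every minute or so is enough to tell a steadily climbing list from one that rises and falls.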

SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX

This showed us we had 16 XA transactions that were started 2 days ago but never committed or rolled back. Their trx_mysql_thread_id was 0, so no live connection owned them. We immediately linked the two together: stuck XA transactions -> history list growing.

XA RECOVER

According to the MySQL docs, XA RECOVER lists the XA transactions that are in the PREPARED state, and XA ROLLBACK can then roll them back. A user comment on that page was especially helpful on how to reconstruct the xid, reproduced verbatim here:


To rollback the transaction, first get its xid:

mysql> xa recover;

+----------+--------------+--------------+------------------------------------------------------------+
| formatID | gtrid_length | bqual_length | data                                                       |
+----------+--------------+--------------+------------------------------------------------------------+
|   131075 |           30 |           28 | 1-a00640d:c09d:4ac454ef:b284c0a00640d:c09d:4ac454ef:b284c2 |
+----------+--------------+--------------+------------------------------------------------------------+
1 row in set (2.13 sec)

The xid is present in this output, but you have to perform a little string manipulation to get it. The format of a xid is: gtrid,bqual,formatID. The column 'data' contains a concatenation of 'gtrid' and 'bqual'. The columns 'gtrid_length' and 'bqual_length' specify how many bytes each of these values uses; use them to split apart 'data'. In this example, the result is:

mysql> xa rollback '1-a00640d:c09d:4ac454ef:b284c0','a00640d:c09d:4ac454ef:b284c2',131075;

ERROR 1402 (XA100): XA_RBROLLBACK: Transaction branch was rolled back

The tricky part was that my data contained binary characters, which I couldn’t directly copy and paste in MySQL Workbench. I couldn’t bear to write a program to read the value and write it back either, so I poked around for other solutions. From the same MySQL doc page:

gtrid and bqual must be string literals, each up to 64 bytes (not characters) long. gtrid and bqual can be specified in several ways. You can use a quoted string ('ab'), hex string (0x6162, X'ab'), or bit value (b'nnnn').

Good, I could write the xid in hex. So I right-clicked the data column and chose “Open Value in Viewer”. In the binary tab I copied down the hex values and reconstructed the xid as described by the helpful comment.

XA ROLLBACK X'7e3ae860eb21de21b84d392cb03bf8363b41482b9b1207f6e6823355012e91858c',X'526801e7500b06fd05a2f5882d20be19982a46aed4b6c26dc63887',4264851;

Voilà, one by one the transactions were gone. Once the last one was rolled back, the history list started to decrease and behave like the one on our other databases, and the ibdata stopped growing at the crazy rate. The one part I haven’t figured out is: how do I copy the hex values from MySQL Workbench, or how do I show the XA RECOVER data column in hex?
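On that last question: as far as I know, MySQL 5.7.5 and later accept XA RECOVER CONVERT XID, which prints the data column in hex. On older servers the string manipulation can be scripted. The sketch below is the program I didn't want to write at the time. It assumes the raw bytes of the data column are already in hand (for example via JDBC ResultSet.getBytes, which is not shown), and the class and method names are hypothetical.

```java
// Hypothetical helper: builds an XA ROLLBACK statement with hex literals
// from one row of XA RECOVER output.
final class XidHex {

    // Hex-encodes bytes[from, to) as lowercase pairs.
    static String hex(byte[] bytes, int from, int to) {
        StringBuilder sb = new StringBuilder((to - from) * 2);
        for (int i = from; i < to; i++) {
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    // data is gtrid followed by bqual; the two lengths say where to split it.
    static String rollbackStatement(byte[] data, int gtridLength,
                                    int bqualLength, long formatId) {
        if (gtridLength < 0 || bqualLength < 0
                || gtridLength + bqualLength != data.length) {
            throw new IllegalArgumentException("lengths do not match data");
        }
        return "XA ROLLBACK X'" + hex(data, 0, gtridLength)
                + "',X'" + hex(data, gtridLength, data.length)
                + "'," + formatId + ";";
    }
}
```

For the four bytes of "abcd" with gtrid_length 2, bqual_length 2 and formatID 1, this produces XA ROLLBACK X'6162',X'6364',1; which can be pasted straight into the client.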

Delegates are not retained

Delegate properties are normally not retained by the object that holds them. When you assign a delegate, keep your own strong reference to it for as long as the source may call it. Otherwise you will get an “EXC_BAD_ACCESS code=1” (invalid pointer) when the source attempts to fire an event to the delegate (e.g. with UITabBarController.delegate).

Firefox marquee restarts prematurely

I was trying to fix an HTML marquee bug today where, on Firefox only, the marquee restarts before the whole message has scrolled to the end. After stripping down the page, the cause appears to be that Firefox keeps using the length of the original text (which was just a single space character).

The HTML code below reproduces it. Click on “Change Marquee Text” to see the difference between the two marquees.

[HTML sample not preserved. Surviving marquee text: “At first” and “Later, the marquee text became longer but Firefox does not recognize it.”]

It wasn’t worth investigating further here, so I just dynamically re-rendered a container div for this marquee.

double HTTP requests

We first discovered that our webapp was receiving duplicate HTTP AJAX requests from clients, each of which resulted in a database insert. Fortunately jQuery had a nocache timestamp as part of the AJAX request, so we could recognize the 2nd request as a duplicate and reject it.
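The rejection logic can be as small as a set of tokens already seen. The following is a minimal sketch, not our actual code: it assumes each request carries a session identifier plus the nocache value (jQuery sends it as the `_` query parameter when caching is disabled), and the class name and key format are invented for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical duplicate filter keyed on session id + nocache timestamp.
final class DuplicateRequestFilter {
    // Thread-safe set; entries are never evicted in this sketch.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // Returns true the first time a (session, token) pair is seen,
    // false for every repeat of the same pair.
    boolean firstTimeSeen(String sessionId, String nocacheToken) {
        return seen.add(sessionId + '|' + nocacheToken);
    }
}
```

A real deployment would need to expire old entries (the set above grows without bound), but it shows why the timestamp alone was enough to tell the two requests apart.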

As we tried to narrow down the cause, we found that it happens on both GET and POST, as well as on a variety of browsers and network providers. After days of trying to reproduce the behavior we resorted to using iMacros for browser automation and finally managed to reproduce it occasionally. What was surprising was that the client was receiving the response to the 2nd request, which was the rejection message (while the database insert from the 1st request was successful)! It was absolutely confusing, so we set up the automation again with Wireshark.

The packet analysis confirmed that the client browser was sending the duplicate request. However, we also checked Firebug and confirmed that only a single AJAX call was made! It didn’t make sense to any of us until I noticed a pattern in the Wireshark logs: the 1st request was always terminated without a response (either a TCP connection reset or a proper TCP connection close initiated by the server), and the 2nd request was then made on another kept-alive TCP connection. Searching for that symptom, I chanced upon a blog that highlighted section 8.2.4 of RFC 2616 (HTTP/1.1):

If an HTTP/1.1 client sends a request which includes a request body, but which does not include an Expect request-header field with the “100-continue” expectation, and if the client is not directly connected to an HTTP/1.1 origin server, and if the client sees the connection close before receiving any status from the server, the client SHOULD retry the request.

Could this be it? We mocked up an HTTP server using raw Java ServerSockets that intentionally closed the first connection and served the second. We logged the requests on the server side and voilà! The mock server showed that it received two HTTP requests, Firebug showed that the browser sent only one, and Wireshark showed that the browser sent two.
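Our mock was thrown away afterwards, so what follows is a reconstruction of the idea rather than the original code. It is a sketch under assumptions: port 8080, a fixed "ok" body, and no attempt to read the request body. The first connection is closed as soon as the request line has been read, and every later connection gets a normal response.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical mock: drops the first connection, answers every later one.
final class DropFirstServer {

    // null means "close without replying"; only connection 1 gets that.
    static String responseFor(int connectionNumber) {
        if (connectionNumber == 1) {
            return null;
        }
        String body = "ok";
        return "HTTP/1.1 200 OK\r\n"
                + "Content-Length: " + body.length() + "\r\n"
                + "Connection: close\r\n\r\n"
                + body;
    }

    public static void main(String[] args) throws IOException {
        int port = 8080; // assumed port
        try (ServerSocket server = new ServerSocket(port)) {
            for (int n = 1; ; n++) {
                try (Socket client = server.accept()) {
                    BufferedReader in = new BufferedReader(new InputStreamReader(
                            client.getInputStream(), StandardCharsets.ISO_8859_1));
                    String requestLine = in.readLine();
                    System.out.println("connection " + n + ": " + requestLine);
                    String response = responseFor(n);
                    if (response == null) {
                        continue; // try-with-resources closes the socket unanswered
                    }
                    // Skip the remaining request headers (body is not read).
                    String line;
                    while ((line = in.readLine()) != null && !line.isEmpty()) { }
                    OutputStream out = client.getOutputStream();
                    out.write(response.getBytes(StandardCharsets.US_ASCII));
                    out.flush();
                }
            }
        }
    }
}
```

Pointing a browser's AJAX call at this server and comparing its console log against Firebug is enough to see the silent retry.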

Essentially, we had just proved that the browsers adhere to section 8.2.4 of RFC 2616.

Trimpath unterminated string literal in IE

We’re using an ancient library called TrimPath to render templated data with JavaScript, and a peer hit the “unterminated string literal” error in eval() on IE only. I helped solve it by extracting the offending string and isolating the cause to a ‘\n’ within the string. It was easy to cut away the additional newlines, and IE was happy. (This assumes that the newlines are decorative and not a necessary part of the output.)

However, I wasn’t satisfied without knowing what caused the newline to be parsed wrongly, and why on IE only. The good thing about open source is that we get to dig in ourselves to troubleshoot issues.

After stepping through the code, the cause was identified: multiple consecutive \n characters are not parsed correctly. Apparently someone hit this at least 5 years ago and there was a fix for it that detects the additional \n and swallows it. But it still fails if there are more \n characters.

Why IE only? We were using script tags as the template, and in IE script.innerHTML is prefixed with an additional \n.

The following snippet will fail in IE, but pass in FF.

[HTML snippet not preserved.]

If another newline is added (with no additional whitespace), FF will fail with the same error.
If <textarea> instead of <script> is used, it will pass, but adding more newlines will also cause it to fail.

As an aside, I’m not sure how our project came to use 1.1.2 when the latest version on the download site is 1.0.38 (legacy stuff…). But a search on the web shows it’s not just us.

TimeUnit

How many times have you had a timestamp in milliseconds (a long) and needed to compare it to a duration written as 1000 * 60 * 60 * 24 * … ? I stumbled upon a more readable and less error-prone way of representing, say, 5 hours: java.util.concurrent.TimeUnit.


private static final long FIVE_HOURS = TimeUnit.MILLISECONDS.convert(5, TimeUnit.HOURS);
...
if (timestamp < System.currentTimeMillis() - FIVE_HOURS) {
    // do something.
}

The above checks whether a timestamp is more than five hours old. The constant FIVE_HOURS has the value 18000000. TimeUnit covers nanoseconds, microseconds, milliseconds, and so on, all the way up to days.
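An equivalent spelling that some find easier to read is TimeUnit.HOURS.toMillis(5), which converts in the other direction of the call. The sketch below uses it and also passes the current time in as a parameter so the check can be tested; the class and method names are just for illustration.

```java
import java.util.concurrent.TimeUnit;

// Illustrative variant of the five-hour check using toMillis.
final class OverdueCheck {
    // Same value as TimeUnit.MILLISECONDS.convert(5, TimeUnit.HOURS).
    static final long FIVE_HOURS_MS = TimeUnit.HOURS.toMillis(5);

    // True when timestampMs lies more than five hours before nowMs.
    static boolean isOverdue(long timestampMs, long nowMs) {
        return timestampMs < nowMs - FIVE_HOURS_MS;
    }
}
```

In production code the second argument would simply be System.currentTimeMillis().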

However, if the calendar date or time zone matters to you (e.g. daylight saving time), then you should use the Calendar API or JodaTime rather than this "duration"-based TimeUnit.
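To make the difference concrete, here is a small sketch of a "day" that is not 24 hours long. Note the substitution: it uses java.time (Java 8 and later) instead of Calendar or JodaTime, purely because it ships with the JDK and keeps the example short. The class name is made up.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Illustrative only: measures how many real hours pass between a local
// date-time and the same wall-clock time on the next calendar day.
final class DstDayLength {
    static long hoursUntilSameTimeNextDay(LocalDateTime start, ZoneId zone) {
        ZonedDateTime from = start.atZone(zone);
        ZonedDateTime to = from.plusDays(1); // calendar arithmetic, DST-aware
        return Duration.between(from, to).toHours();
    }
}
```

For noon on 13 March 2021 in America/New_York the result is 23, because clocks went forward that night; a fixed TimeUnit.DAYS.toMillis(1) offset would land an hour off the wall-clock time.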

maven compile jrxml to jasper

Unless your Jasper reports change at runtime, .jrxml templates should be compiled into .jasper files at build time. You then need neither JDT at runtime nor a re-compile of the report each time it is run.

If you’re on maven, simply paste the usage guide into your pom and change your JasperCompileManager.compileReport(InputStream) into JRLoader.loadObject(InputStream).

Change SVN commit message

The day finally came when I copy/pasted the wrong bug number into an SVN commit message. Usually errors in SVN commit messages can be ignored, but this particular one could lead to much confusion later. Luckily there is a way to edit the commit message after it’s committed. As usual I paste the command here instead of just the reference link, as I already have some posts with dead reference links.


svn propset -r 12345 --revprop svn:log "#1234 description"

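where 12345 is the revision number, and the text in quotes is the corrected commit message. Run it inside a working copy, or append the repository URL as the target. The repository must have a pre-revprop-change hook that permits the change, otherwise Subversion rejects it.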

I still prefer to get commit messages right the first time, but humans are just humans…

Ref: http://subversion.apache.org/faq.html#change-log-msg

Netbeans Subversion 1.6 vs 1.7

I installed the latest and greatest TortoiseSVN on a new PC (which happened to be built for SVN 1.7), and happily upgraded my working copy when prompted. Newer is better, isn’t it? After all, I didn’t like having a .svn folder in every directory, which 1.7 got rid of with centralized metadata storage.

To my horror, Netbeans later wouldn’t update my workspace anymore, since the current version has no way to support 1.7 yet other than through the command line client. Fine, I installed the 1.7 CLI over my 1.6 CLI, but the Subversion features in Netbeans became sluggish and often showed incorrect change flags until I manually activated Show Changes each time.

Downgrading to 1.6 was no easy feat. I installed TortoiseSVN 1.6.16, judging it to be the latest release bound to SVN 1.6. I was lucky my SVN server is local, so checking out my workspace again was not that tough (I know some places where it would take hours, and I haven’t found a “downgrade working copy” method that didn’t require a checkout).

Then Netbeans complained that the working copy wasn’t compatible anymore with my SVN 1.7 CLI. I cleared the SVN path setting, but couldn’t find any option to force it to revert to using the built-in bindings. Reinstalling the plugin did not help either. Half an hour later I resorted to a brute-force search for “subversion” in the Netbeans folder and my home folder, and I finally found this file:


C:\Users\%USERNAME%\.netbeans\7.1\config\Preferences\org\netbeans\modules\subversion.properties

where the option “forcedCommandline” was set to true. Changing it to false and restarting my IDE got me out of the situation and ended my fight with SVN 1.7. I still like the idea of the new working copy format, so hopefully a new NB SVN plugin will be released soon.