Archive for January, 2007

Read Byte Streams with Java IO

Maybe it was the search terms I used, but I couldn’t find simple, straightforward examples of reading binary streams with Java IO. It’s not that I don’t know how, but I think such examples are useful for developers who are less familiar with IO. So, here they are. The first example shows reading a byte at a time; the second shows reading in chunks.

// Example 1
InputStream in = somewhere.getInputStream();

int c;
while ((c = in.read()) != -1) {
	// process read byte
	System.out.print((char) c);
}

The input stream is whatever you want to read from, such as a new FileInputStream() or socket.getInputStream(). An integer c is declared to hold each byte as it is read. The read() method returns an integer in the range 0-255, which is the value of the byte read. Why not use a Java byte? Because the Java byte is signed and has the range -128 to 127, which leaves no spare value to signal the end of the stream. Returning an int gives you the “benefit” of getting the actual unsigned value, so you can process it directly.
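To make the signed/unsigned difference concrete, here is a minimal sketch. The class name SignedByteDemo and the helpers toUnsigned and readOne are made up for illustration, and a ByteArrayInputStream stands in for a real file or socket stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class SignedByteDemo {
    // Hypothetical helper: maps a signed Java byte (-128..127)
    // to the 0..255 value that read() would report for it.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Hypothetical helper: reads one byte back from an in-memory stream.
    static int readOne(byte b) {
        InputStream in = new ByteArrayInputStream(new byte[] { b });
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xFF;
        System.out.println(raw);             // -1, the signed view
        System.out.println(toUnsigned(raw)); // 255, the unsigned view
        System.out.println(readOne(raw));    // 255, what read() returns
    }
}
```

If you need the value back as a byte (to store it in a byte array, for instance), a plain (byte) cast on the int does the reverse conversion.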

The while line is the most confusing part of the loop. If we resolve the inner bracket first, in.read() reads a byte from the input stream and stores it in c. The bracket now resolves to the value of c, which is compared against -1. This comparison is done because read() returns -1 when the end of the stream is reached. That breaks the processing loop and allows the program to continue.

If a valid value is read, execution enters the loop body, where you can process the integer c. In this example it just prints the read byte. The cast to char is necessary, or else it will print the numeric code of the byte.

// Example 2
InputStream in = somewhere.getInputStream();
OutputStream out = new ByteArrayOutputStream(); // for example

byte[] buf = new byte[1024];
int len;

while ((len = in.read(buf)) != -1) {
	// process byte buffer
	out.write(buf, 0, len);
}

In this example we also have an input stream to read from; additionally, we prepare an output stream where we will store the read bytes. You should change it to suit whatever your purpose was for reading the stream. For this method we’ll need two variables: a byte array buffer (buf) for storing the read bytes, and an integer (len) which represents the number of bytes actually read. There’s no fixed number for the buffer size; it’ll work whether you put 10 or 100000 for now. I’ll explain the effect later.

Next we reach the confusing while loop again, and this time it’s even more complicated. Let’s resolve the inner bracket again. What happens here is that in.read(buf) modifies the byte array to store the content of the read bytes. This means that when you execute in.read(buf) by itself, the contents of buf before and after this statement might be different. The number returned tells you how many bytes were read and stored in that byte array. The inner bracket now resolves to the value of len, which is matched against -1. This is because in.read(buf) returns -1 if the end of the stream is reached, so we can terminate the while loop.

If some bytes were read, we go into the while loop and write the read bytes to the output stream. We need to specify that we want to write len bytes starting at offset 0, because the input stream may have read fewer bytes than the size of the buffer. This may be due to network latency, or it could be the last chunk of a file whose size is not a multiple of the buffer size. If len were not specified, we might write rubbish left in the byte array by a previous read, thus corrupting the data.
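The corruption is easy to reproduce. The following is only a sketch: the class ChunkCopyDemo and its copy method are made-up names, and an in-memory stream with a deliberately tiny 4-byte buffer stands in for a real source. The useLen flag switches between the correct write and the buggy one that ignores len.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ChunkCopyDemo {
    // Hypothetical helper: copies text through a buffer of bufSize bytes.
    // useLen = true writes only the bytes just read (correct);
    // useLen = false writes the whole buffer every time (buggy).
    static String copy(String text, int bufSize, boolean useLen) {
        InputStream in = new ByteArrayInputStream(
                text.getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[bufSize];
        int len;
        try {
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, useLen ? len : buf.length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(copy("HELLOWORLD", 4, true));  // HELLOWORLD
        System.out.println(copy("HELLOWORLD", 4, false)); // HELLOWORLDOR
    }
}
```

The last read only fills two bytes ("LD"), so the buffer still holds "OR" from the previous chunk, and the buggy version writes those two stale bytes as well.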

Now that you understand the loop (hopefully), I’ll explain the effect of the buffer size. If you have a small buffer, you’ll need to run the loop more times to read a given amount of data. If you have a big buffer, you’ll loop fewer times for the same amount of data. Then why not assign a VERY BIG buffer? This depends on the amount of memory your application can spare. Allocating a big buffer means the byte buffer will take up that much memory, even if only a small part of it is used. So the decision on the size of the buffer depends on whether you are constrained by processing power or by memory, and on the typical size of the data read. You don’t need a 1MB buffer to read 1KB streams.
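To put rough numbers on this, here is a small sketch that counts how many times the loop runs for the same 10000 bytes with different buffer sizes. BufferSizeDemo and countReads are names invented for this illustration. Note that it reads from memory, where every read fills the buffer if enough data is left; a socket may hand back fewer bytes per read, so real counts can be higher.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferSizeDemo {
    // Hypothetical helper: counts loop iterations needed to drain
    // dataSize bytes from an in-memory stream using a bufSize buffer.
    static int countReads(int dataSize, int bufSize) {
        InputStream in = new ByteArrayInputStream(new byte[dataSize]);
        byte[] buf = new byte[bufSize];
        int reads = 0;
        try {
            while (in.read(buf) != -1) {
                reads++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reads;
    }

    public static void main(String[] args) {
        System.out.println(countReads(10000, 10));     // 1000
        System.out.println(countReads(10000, 1024));   // 10
        System.out.println(countReads(10000, 100000)); // 1
    }
}
```

The 100000-byte buffer finishes in a single pass, but 90% of it is never touched, which is the memory trade-off described above.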

The second, buffered method is more efficient than the one used in Example 1, which reads byte by byte. Therefore the second method is preferred.



WEP, or Wired Equivalent Privacy, is a wireLESS standard for protecting data transmitted over a WLAN. Since wireless signals travel over the air, they can be tapped easily. The data may include your login credentials for websites or applications, sensitive emails, etc.

WEP uses a key which the user must enter into the router as well as all participating nodes. The key is used to admit the user to the network and subsequently to encrypt all traffic. Users will still be able to decrypt and see the data sent by another user on the same network, just as if they had physical access to the Ethernet wire on a wired network. The problem with WEP is that it is not secure; by intercepting a large number of encrypted packets, a cracker is able to recover the key. There are also other known problems with WEP that cannot be solved with a bigger key.

After this major security problem was discovered, WPA (Wi-Fi Protected Access) was quickly created to replace WEP. Once the 802.11i specification was complete, WPA2 was introduced to comply with the new standard. WPA allows for two modes of operation: a “Personal” mode, where a Pre-Shared Key (PSK) is used for authentication and encryption, or an “Enterprise” mode, where an IEEE 802.1X authentication server is used. [1]

The personal mode works similarly to WEP: all users enter the “Network Key” to gain access to the network, then all traffic is protected. This scheme is suitable for home networks and small offices, where there are few machines and they seldom change.

No chance to try the authentication server mode yet…

[1] Wi-Fi Protected Access – Wikipedia, the free encyclopedia



Stripes is an “easy-to-use” web framework meant to overthrow Struts, as described in [1]. I have not tried it myself, but I quite agree with the disadvantages of Struts listed there, especially its steep learning curve. The tight integration between the components and the cryptic errors have also made incremental development difficult. Stripes makes it easy for a new developer to get productive in less time, and it should be especially easy for an existing Struts developer to switch over because of the large similarities between the two.

You can read the post about Struts at [2].

[1] Stripes vs. Struts
[2] Apache Jakarta Struts (Action 1)



I was recently introduced to a streaming TV application that allows you to watch overseas channels. It turns out it is based on the popular BitTorrent technology for streaming. What’s interesting about this form is that it has the inverse properties of traditional Internet broadcasting. In traditional (uni-)broadcasts, the fewer people watching, the better the quality; as more people watch, the quality drops. Using BT, the more people watch, the better the quality, since there are more peers on the network and everyone becomes a re-broadcaster.

It also takes just one ordinary subscriber to the desired channel service to act as the source. Despite there being only one source, if the content is popular (such as soccer matches), the peers will quickly make the swarm very big. It also does not matter if the original source’s bandwidth is not that high. As long as there are enough people in the swarm and you have good download bandwidth, your stream is likely to be served well by the other peers.


Code markup

Yes, like many others I have realized the troublesome-ness of posting code in WordPress. There’s no need for me to reiterate the problem, since I feel this site [1] has covered it perfectly. The problem analysis, alternatives such as off-the-shelf plugins, selected solution, usage and tests look just like a perfect project to me.

However, the plugin didn’t work for me at first; nothing was being escaped at all, and the same symptoms appeared. The plugin doesn’t work with WP 2.0’s WYSI[N]WYG editor, and there was advice in the post comments to disable the editor. Despite my disabling it at Options>Writing, the Visual Rich Editor still persistently appeared! Finally this quicktip [2] taught me the correct way to disable it, so now I am writing without the Visual Editor! I’ll probably be better off staying this way with the TechBlog, given the amount of code I post.

[1] WordPress Plugin: Code Autoescape
[2] QuickTip: Turn Off WordPress 2.0 Visual Editor