C++ compilation

Java goes through a single compilation step to translate source code (.java) into bytecode (.class). C++ goes through a few more steps to reach binary code (.exe/.dll).

Source Code (.cpp) -> PREPROCESSOR -> COMPILER -> Object Code (.o/.obj)
Object Code (.o/.obj) -> LINKER (+ Libraries) -> Executable Code (.exe/.dll)

The preprocessor handles the # preprocessor directives. For example, #include pastes the contents of a header file into that position, #define defines preprocessor symbols and performs text substitution, and #ifdef/#endif keep only the relevant conditional sections.

The compiler then translates the preprocessed source code into object code, similar to the Java compiler step. This step catches all syntax errors. The compiler must be able to locate all #include files, either on the standard paths or through the additional include directories specified by the /I command-line option.

The linker then produces the final binary from the object code. Linker errors occur here if the referenced compiled libraries cannot be found or if symbols remain unresolved.

The end result of compiling and linking is usually an .exe file, which can be directly executed on a Windows machine. There are also options to create a dynamic library (DLL) or a static library (LIB). A dynamic library is looked up at runtime, and is usually shared among multiple programs. A static library is compiled into the executable directly, and does not require the existence of the .lib during execution. A program referencing a DLL will be smaller than one referencing a LIB, since the DLL code stays inside the DLL.

To call another function, the function must be “declared” before the calling line. A declaration may be the actual implementation (the definition), or just a prototype specifying the function name, return type and parameters. This is why header files containing those declarations are #included at the top of a source file, so the functions can be referenced; the actual implementations are connected later by the linker.
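As a small illustration of how these pieces fit together (the file names and the greet() function are my own made-up example, not from any particular project):

    // greet.h - declarations only; the include guard is a preprocessor directive
    #ifndef GREET_H
    #define GREET_H
    void greet(const char* name);   // prototype: name, return type and parameters
    #endif

    // greet.cpp - the actual implementation, compiled separately into greet.o
    #include "greet.h"
    #include <cstdio>
    void greet(const char* name) {
        std::printf("Hello, %s\n", name);
    }

    // main.cpp - can call greet() because #include pastes the declaration above this line
    #include "greet.h"
    int main() {
        greet("world");   // the linker later connects this call to the code in greet.o
        return 0;
    }

Compiling main.cpp on its own succeeds because the declaration is visible; it is only the link step that fails if greet.o or its library is missing, which is exactly the linker error described above.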

Testing

(Warning: This is an incoherent post – there has been no attempt to organize the paragraphs such that it flows in a thoughtful manner. These thoughts just poured out of my mind at that moment in time…)

Software testing is important for maintaining the quality of the software produced. Good testing is difficult, and is hindered by time, management, and sometimes the testing techniques themselves.

To me, good tests must be documented. This may come in the simple form of documented test cases, although I strongly prefer automated tests that perform those test cases. Automated tests get run far more often than tests that need people to sit down, key in pre-defined input and check the outputs.

Using humans as testers has an additional advantage: they introduce human error. Good software should be able to recover and remain stable under human error conditions (preferably all kinds of error conditions). To simulate this, automated tests should introduce elements of randomness. For example, free-text input such as text boxes should be tested with all possible input characters from the keyboard, as well as random combinations of input keys. The former ensures that the system can handle all characters, and the latter simulates “a monkey banging the keyboard”.

Applying randomness in testing catches more errors than always using fixed input, say “John Smith” for full names, or even worse, “test”. Can your system accept “Thomas D’Cruz”? Or a 94-character-long Indian full name? Yet once you catch an error with random input, can you reproduce the same input to verify that your system now handles that case?

This is a common flaw in automated random testing. Test sequences MUST be logged (to files) so that the erroneous test can be repeated with the exact same input, and the bug verified (as a bug) and later verified (as fixed). Threaded programs may not return the same output given the same input, but the logged information still serves as a debug log that helps you reproduce the error. If the error cannot be reproduced, it cannot be verified.
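As a rough sketch of that idea in Java (NameValidator is a made-up stand-in for whatever component is under test), logging the random seed is usually enough to make a random test repeatable:

    import java.util.Random;

    public class RandomInputTest {
        public static void main(String[] args) {
            long seed = System.currentTimeMillis();
            System.out.println("TEST SEED = " + seed);  // log the seed so this exact run can be repeated
            Random random = new Random(seed);           // the same seed reproduces the same "monkey" input

            for (int i = 0; i < 1000; i++) {
                String input = randomString(random, 1 + random.nextInt(100));
                if (!NameValidator.accepts(input)) {    // NameValidator is hypothetical
                    System.out.println("Rejected input: [" + input + "]");
                }
            }
        }

        private static String randomString(Random random, int length) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < length; i++) {
                sb.append((char) (32 + random.nextInt(95)));  // any printable ASCII character
            }
            return sb.toString();
        }
    }

Re-running the test with a seed taken from the log replays the exact same input sequence, so a failure found by a random run can be turned into a repeatable case.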

Frameworks such as JUnit help you combine several tests into a test suite so that automated regression testing is simplified. Hopefully every refactoring or change to the code base can be accompanied by a test run.
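For instance, in the JUnit 3 style (the individual test class names below are invented placeholders), a suite simply collects the test classes so one run covers them all:

    import junit.framework.Test;
    import junit.framework.TestSuite;

    public class AllTests {
        public static Test suite() {
            TestSuite suite = new TestSuite("Regression tests");
            suite.addTestSuite(NameValidatorTest.class);  // placeholder test classes
            suite.addTestSuite(LoginTest.class);
            return suite;
        }
    }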

New beginning

No entry for the last week; I was getting used to my new work environment. Due to the shift to technologies like SWT and XMPP, my posts will be geared towards those topics. For now, I’ll need to understand SWT:

– How to access shells and displays correctly
– How to create and use simple SWT widgets
– Managing/switching between windows and composites
– SWT layouts, maybe creating SWT-AWT layout adapters

Quote

“If the cost of breaking into a system is significantly higher than the benefits gained by attacking the system, and the cost of protecting the system is lower than the value of what is being protected, then we call a system secure.”

– Jian Zhong, senior software architect, ActioNet Inc.

Although I posted it once on the Starbean forum before, I feel this quote is very apt for justifying how much security is desired. Perfection cannot be achieved; there is no unbreakable system. Spending too little on security can compromise the system, while overspending wastes resources, time and money with no further benefit.

[1] http://www.javaworld.com/javaworld/jw-0 … ee-p3.html

Get Real! No Perfect Code

No commercial system is perfect – I added “commercial” because you could technically create a one-liner program that prints text to the screen. It matches its specification exactly, to print a line of text, and is thus perfect code.

Commercial systems, ones that you can really sell and earn bucks with, are far from perfect. Bugs occur due to individuals – pure carelessness, lack of skill, too much code for a single programmer to handle; due to the team – poor design, poor management, poor leadership, wrong use of tools, technology, architecture or design; due to customers – poor requirements, miscommunication, scope creep; plus so many other reasons for any bug.

Level 1 people still expect bugless code: clients who do not accept products with known bugs, or developers who try to fix every known bug in pursuit of that bugless state. Level 2 developers understand this imperfection, but pick which bugs to fix haphazardly: the most common mistake is fixing bugs simply because they are “easy to fix”. Such a bug may not have much impact or severity, while fixing it increases the chances of introducing new bugs (as described below). High-level people like Eric use a process to determine which bugs are worth fixing.

Eric approaches bugs in a well-defined process: After a bug is reported and confirmed, a decision must be made to FIX or NOT FIX the bug. This is because:

  1. There is a time constraint to fix all bugs
  2. Fixing a bug may introduce more bugs

The bug is analysed in terms of Severity, Frequency, Cost and Risk. The first two relate to the client: what the impact is and how often it occurs. The last two relate to the developers: how difficult or how long it is to fix, and the probability of introducing more bugs.

The recommendation is to plot the first two factors on a graph. Always fix stuff in the top-right corner and never fix stuff in the bottom-left. The factor ratings might change over time, for example a miscalculated risk. Revisiting the factors when they change allows us to make better decisions (including rolling back bug fixes).
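A toy sketch of that quadrant rule (the numeric thresholds are my own illustration, not Eric’s):

    public class BugTriage {
        // severity and frequency rated 1-10 by whoever triages the bug
        public static String decide(int severity, int frequency) {
            if (severity >= 7 && frequency >= 7) return "FIX";        // top-right corner
            if (severity <= 3 && frequency <= 3) return "DON'T FIX";  // bottom-left corner
            return "WEIGH COST AND RISK";  // the middle ground needs the developer-side factors
        }
    }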

[1] is a condensed version of what Eric wrote in [2] originally.

[1] http://technology.guardian.co.uk/weekly … 95,00.html
[2] http://software.ericsink.com/articles/F … tions.html

RFID = vulnerable

RFID, short for radio frequency identification, has recently become the de facto identification mechanism. It requires no line of sight, can be embedded as a passive tag (no onboard battery), and can be small enough to be implanted under human skin.

However, as in the early stages of the Internet, security was not a concern and was not considered in its design. This leads to problems, especially as many RFID tags are used in security applications such as building passes and passports. Other uses include tracking shipments, retail stock and library books. The tags may also store credit information, which can be used in gaming centres, public transport, petrol stations and toll gates.

Data on an RFID tag can easily be read by a scanner. Tags are usually unencrypted AND unlocked, meaning they can be read in the clear and even overwritten. One researcher read a hotel keycard and transferred the information onto the RFID tag of a tub of cream cheese (a retail food product), then used the cheese to open his hotel door!

Building passes can also be scanned by just walking past the person keeping the pass in his wallet or bag. The signal can be re-emitted at the building door to gain access easily. Walking along a shelf of library books with an emitter can potentially erase all information in those tags if they are left unlocked. Free gas top-ups are possible, and have been tried and tested.

I wonder if our EZLink, ERP, NLB, and building passes in Singapore have the same security issues…

http://www.wired.com/wired/archive/14.05/rfid.html

OpenID

I tried OpenID last month, if I remember correctly, after I saw it on LiveJournal. Basically I searched for a PHP OpenID server solution, “installed” it on my server, and set up an identity for myself. I tested the identity by using it on LiveJournal comments. LiveJournal is the main website that supports OpenID, though some other sites are beginning to offer OpenID as an alternative authentication mechanism.

OpenID asserts that you own a particular URL, meaning it authenticates your identity. It does not provide trust or authorisation. It is advertised as a truly decentralized authentication system, as opposed to central authentication systems like Microsoft Passport. We’ll look at how it works, followed by the other benefits OpenID claims to achieve.

From the user’s point of view, he accesses a web site which requires his identity. An OpenID logo is shown with a textbox for him to type his OpenID identity. He submits his identity URL and is shown his identity server’s login page. Once he logs in with his identity server’s id/password, he is returned to the original web site, authenticated with that identity.

Behind the scenes, once the original web site picks up the identity URL, it fetches that URL to find the identity server location declared in the page’s header. The site redirects the user to the identity server, along with the URL that the identity server should return to after authentication. The identity server takes over, authenticates the user, and redirects the user back to the URL given by the original website, along with a “key”. The user submits this key to the original website, and the web site verifies with the identity server that the key is valid. The original web site is then satisfied that the identity URL provided belongs to the user.

The good things about this are:
(warning: example URLs are fictitious, they may not represent actual web sites)

1. Only HTML required for identity URL.
Any web page can be your identity URL; you need not have PHP or ASP running, just HTML. This is because you just need to insert a special <link> tag in the HTML <head> section which points to your identity server. The identity server can be totally different from your identity URL.

E.g. your identity is http://sned.blogserver.com/ where you can only put HTML pages, and you registered your OpenID with http://www.openid.net/

2. Single ID
Your OpenID can be used on any site that supports OpenID; it does not require you to have an account with the site you are visiting. You do not have to sign up or remember multiple passwords. You just need one OpenID identity.

E.g. you leave comments at LiveJournal.com using your OpenID registered at http://www.openid.net/, without a LiveJournal account.

3. Decentralized authentication, freedom to choose your identity server
There is no central OpenID server that stores all your login information. All of it is on your identity server, which can be run by any company. Even if your identity server goes out of business, or turns evil and starts charging $, you can switch to another OpenID server while preserving your identity URL, by just changing the tag in your page to point to the new server.

E.g. your identity is http://sned.blogserver.com/ and you use http://openid.microsoft.com/ as your identity server. Later you change your server to http://www.openid.net, but still use http://sned.blogserver.com/ as your identity.

Yet with all these benefits, adoption is slow. Use of OpenID is still mostly seen in blog comments. This is because while OpenID can establish that you own an identity, the web site has to TRUST the identity server. I can easily set up my own identity server which authenticates me regardless of the id/password I use (tried and tested). However, once a trusted network of OpenID servers is set up and used by popular web sites, it may become a widespread identity system.

AJAX – Asynchronous Javascript And XML

A discussion on using AJAX instead of traditional binding to build datagrids prompted me to write this. AJAX, still considered new technology, is being tried out by many developers, but sadly in more wrong ways than one.

Since AJAX involves JavaScript and XML, it can be used in most web-based projects, including Java (servlets), PHP, .NET, etc. AJAX lets JavaScript create an XMLHttpRequest, call server-side functions and receive XML responses. The response can then be used to update the GUI through DHTML without refreshing the whole page.

So far the most common example I’ve seen is free-text auto-completion, such as when composing a web-based email. The user types a few letters and a combo-box-like list appears below, allowing the user to select a completed email address retrieved from his address book on the server.
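On the server side this is just an ordinary request handler that returns a small XML document instead of a full page. A minimal servlet sketch (the class name and the AddressBook lookup are hypothetical):

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class AddressSuggestServlet extends HttpServlet {
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String prefix = request.getParameter("prefix");  // the few letters the user has typed
            response.setContentType("text/xml");
            PrintWriter out = response.getWriter();
            out.println("<addresses>");
            // AddressBook.find() is a made-up lookup into the user's stored contacts
            for (String email : AddressBook.find(prefix)) {
                out.println("  <address>" + email + "</address>");
            }
            out.println("</addresses>");
        }
    }

The XMLHttpRequest on the page calls this servlet as the user types, and the returned <address> elements are used to fill the drop-down list.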

Other valid uses I’ve seen are updating RSS tickers (the Starbean forum) and updating dependent combo-boxes (selecting a category updates the list of products). Generally I feel that AJAX should be used for updating information on a page that would otherwise require a “postback” type of update in Web 1.0. (btw, if you’re not sure about Web 2.0 or Web 1.0, go google it up.)

Having talked about all these good points, here come the bad ones, and WHY I personally think they’re bad:

1. Using AJAX to emulate server-side include (eg separate menu.html and use AJAX to load menu.html on every page)

AJAX is aimed at updating the UI after it has been displayed. Using AJAX to load another page during onLoad makes the client-side browser issue another server request on top of the original one. The code required (at the moment) is also much more complicated and bug-prone. If the server software (such as IIS/Tomcat/Apache) supports server-side includes (SSI), and most of the time the page already requires generation by the server anyway, why not use them? With SSI the page is assembled completely on the server side and sent to the client directly.

2. ELIMINATE page refresh by loading all links using AJAX (e.g. all links and form submits are AJAX calls that rewrite the current page)

For what? Don’t like the “Loading …” in the status bar? Maybe you shouldn’t be using the web then. This requires the entire site’s logic to be embedded in one single page. You get the idea.

3. Databinding using AJAX

This point is arguable; it depends on the implementation. If the purpose is to update the table after it is displayed, it may still be justifiable. But if the binding is done during onLoad, it defeats the purpose in the same way as point 1. BTW, I don’t think the data should be directly bound; instead, AJAX can be used to download updates to the table periodically.

4. Server-side client-side validation

OK, this came from thedailywtf, but this guy really didn’t know what he was doing. He somehow felt that client-side date validation was so difficult that he wrote his JavaScript validation function to make an AJAX call to the server, which implemented a Web Service that accepts a String and returns whether the input is a valid date. This could have been accomplished with CLIENT-SIDE JavaScript, and various FREE ready-made scripts are available for download. To its credit, AJAX works well with web services since they communicate in the same language, XML, but one bad thing is enough to kill many other goods.

If you think otherwise, drop me a comment.

RSS – Really Simple Syndication

Just to fill in the “missing” entry on 9 May, I’m gonna talk about RSS. Many sites offer RSS, but I’ve been slow to catch up on what it is. It stands for Really Simple Syndication, and is a way of staying updated with the new stuff on a web site.

The best use is on news and blog websites, where the headlines of updated news are published via RSS. RSS clients then regularly check for updates and the user is informed of anything new on the site (either manually or automatically).

Basically, the server side (the news/blog website) publishes an RSS feed, which is a small XML file containing a list of the updated items. Each item contains a title, a URL link, a date and a text description. RSS clients, known as aggregators, regularly poll the XML file to download any updates for the user. The updates may be in link form, so the user can click through to the website. This greatly increases traffic for the news site, and allows users to stay updated with site content.
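A bare-bones aggregator is little more than an XML parser on a timer. A sketch using the standard Java DOM parser (the feed URL is only an example, and a real aggregator needs polling, caching and error handling):

    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class FeedPoller {
        public static void main(String[] args) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse("http://example.com/rss.xml");  // example feed URL only

            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                String title = item.getElementsByTagName("title").item(0).getTextContent();
                String link  = item.getElementsByTagName("link").item(0).getTextContent();
                System.out.println(title + " -> " + link);  // each item also carries a date and description
            }
        }
    }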

Aggregators come in a few forms:
* standalone applications – which a user must keep running in the background;
* plug-ins – such as a Mozilla or email client plug-in that shows as an icon in the host program;
* websites – a web application that does aggregation;
* scripts – PHP, ASP or other web-based scripts that dedicate a small space on the site to display the updated content.

Each type has its advantages – standalone apps usually support notifications such as a sound or popup when news arrives; plug-ins reduce the number of running applications you have; websites let you see which news you’ve already read regardless of the machine you use; scripts let you integrate RSS into any of your websites.

I have tried RSS and installed a custom PHP/AJAX RSS script on the Starbean forum. I modified the script to do a nicer fade in/out when swapping the news items from CNA.

Competing technologies to RSS include Atom and RDF, which are also XML formats for online syndication.