"true" client side testing best practices

When performing a LoadTest, user actions are simulated. This implies that mouse or keyboard actions are executed by a script, based on a scenario, and that the script waits for a response on the screen.

The response on the screen can be determined using APIs that give information about the windows present, or about the controls on those windows. For instance: the script waits until a window with the caption "Microsoft Word" is active.

Another way of determining whether a response is given is by comparing the content of the screen with a bitmap. For instance: the script waits until an empty document is displayed in Microsoft Word.

The difference between the two techniques is that a window caption is present as soon as the application is launched (even if the application is still loading), while checking the content of the screen is closer to the way users interact in a session. So looking at a screen region is more accurate, and it prevents assumptions (best practice #9 in loadtesting best practices) like "how much time should we wait between launching an application and clicking on a menu?".
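To make the "no assumptions" point concrete, here is a minimal Python sketch of waiting on a condition instead of sleeping for a guessed amount of time. The helper name `wait_until` and the stand-in condition are hypothetical and not part of any loadtest product. In a real script the condition would be a caption check or a screen-region check.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.05):
    """Poll `condition` until it is true and return the elapsed seconds.

    Raises TimeoutError when the condition is still false after `timeout`.
    """
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)


# Stand-in condition for the demo: becomes true on the third poll.
polls = {"count": 0}


def ready_on_third_poll():
    polls["count"] += 1
    return polls["count"] >= 3


elapsed = wait_until(ready_on_third_poll, timeout=2.0, interval=0.01)
print(f"ready after {polls['count']} polls, {elapsed:.3f} s")
```

The returned elapsed time is the measured response time, so the script never has to guess how long to wait.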

In this article I will discuss some best practices for “true” client side testing (best practice #12 in loadtesting best practices).

What is “true” client side testing?

First of all, let me explain what I mean by “true” client side testing.

During a LoadTest a user’s actions are simulated, or in other words, we’re simulating a user. A user controls a workstation via input devices like a keyboard or a mouse, and receives the results on a display. The workstation is connected, via a network, to servers / virtual desktops. For the record: users don’t care what’s “behind” the curtain (the workstation).

Server side testing

Traditional loadtesting applications use a process on the server to simulate user actions. This process can be a script (batch, VBScript, AutoIt, you name it) or a proprietary application. Executing the process on the server side enables the script to “see” and “control” the session, and thus to simulate user actions. This is what I call “server side testing”.

A downside of this method is that the process doesn’t see what happens between the server and the user. In other words, the effects of the network components/connection are not visible to the loadtest. Another downside is that the script uses the system clock of the server (or virtual desktop). If the server is virtualized, a hypervisor (VMM) under heavy load might cause clock drift, affecting the accuracy of the measurement (best practice #13 in loadtesting best practices).

Client side testing

Like a person in real life, a loadtest can use the content of the screen on the client device (a workstation) to determine whether an action is completed (for instance, launching an application). When a loadtest application is able to “look” at the content of the screen from the client, the effects of the network components/connection are included, which results in a more accurate measurement.

I frequently use DeNamiK LoadGen to simulate user actions and perform LoadTests. With DeNamiK LoadGen I’m able to simulate a user by controlling the keyboard and mouse and waiting for results by looking at the content of the screen. In other words: I tell the DeNamiK User Action Framework (DUAF) to wait until a certain bitmap (hash) is visible on the screen of the client, and then I continue with the next action.
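As an illustration of waiting for a bitmap (hash), here is a small Python sketch. It is not DUAF code: the frames are plain in-memory pixel grids standing in for repeated screen captures, SHA-256 is an assumed stand-in for whatever hash the tool really uses, and the names `region_hash`, `first_matching_frame` and `make_frame` are hypothetical.

```python
import hashlib

WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def region_hash(frame, left, top, width, height):
    """Hash the RGB values of one rectangular region of a frame."""
    digest = hashlib.sha256()
    for row in frame[top:top + height]:
        for red, green, blue in row[left:left + width]:
            digest.update(bytes((red, green, blue)))
    return digest.hexdigest()


def first_matching_frame(frames, region, expected_hash):
    """Return the index of the first frame whose region matches, else None."""
    for index, frame in enumerate(frames):
        if region_hash(frame, *region) == expected_hash:
            return index
    return None


def make_frame(rows_drawn, size=4):
    """Build a size x size frame where only the top `rows_drawn` rows are drawn."""
    return [[BLACK if y < rows_drawn else WHITE for _ in range(size)]
            for y in range(size)]


# An image that builds up row by row, as it might over a slow WAN link.
frames = [make_frame(rows) for rows in range(5)]   # 0..4 rows drawn
region = (0, 0, 4, 4)                              # left, top, width, height
expected = region_hash(make_frame(4), *region)     # hash of the finished image

print(first_matching_frame(frames, region, expected))  # prints 4
```

Only the last frame, where the image is complete, matches. The partially drawn frames are skipped, just as a user would keep waiting.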

Summary

So, that’s what I mean by “true” client side testing: testing against the content of the screen on the workstation, which includes the effects of the network components/connection (like a WAN) and of the remoting protocol. If a large image is displayed via a slow WAN connection, resulting in blocks building up slowly on the screen, the script waits until it is fully displayed (like a user would) instead of continuing because the caption exists.

“true” client side best practices

Now that I’ve explained what I mean by “true” client side testing, here are some best practices. Like the loadtesting best practices, they are based on my experience.

1. Use screen regions, unless…

Use the content of the screen as a condition in your script as much as possible, unless this is impossible or requires assumptions. Since the content of the screen, a bitmap, is the most accurate condition, this is the preferred method, especially because it prevents assumptions like “wait x milliseconds before executing a command” or “the document is loaded”.

Use other (API based) functions when they are more accurate, or when they are needed to determine the location of a window.

2. Position of windows

Determine the position of windows using API functions, and continue from there using screen regions. An API is capable of retrieving properties of windows, like the window metrics, which is more efficient than searching for a bitmap at every pixel position on the screen.
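A short Python sketch of this idea: the window rectangle comes from an API call, and the screen region is stored relative to the window. The `Rect` type, the `region_in_window` helper and the numbers are made up for illustration. On Windows the rectangle could come from a call such as `GetWindowRect`, which is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int


def region_in_window(window, offset_x, offset_y, width, height):
    """Translate a window-relative region into absolute screen coordinates."""
    fits = (offset_x >= 0 and offset_y >= 0
            and offset_x + width <= window.width
            and offset_y + height <= window.height)
    if not fits:
        raise ValueError("region does not fit inside the window")
    return Rect(window.left + offset_x, window.top + offset_y, width, height)


# Window rectangle as an API call might report it (values are made up).
word_window = Rect(left=120, top=80, width=800, height=600)

# A 10x10 region located 40 px right of and 150 px below the window's corner.
print(region_in_window(word_window, 40, 150, 10, 10))
# Rect(left=160, top=230, width=10, height=10)
```

Because the region is relative to the window, the same check keeps working when the window opens at a different position.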

3. Use small regions

The region on the screen that is compared with a bitmap (hash) should be as small as possible. This has two benefits. First, performance: the smaller the region, the less work it takes to calculate the hash or compare the bitmaps. Second, the smaller the region, the smaller the chance that its content is changed by another process (see the next best practice).

4. Use static content

The content of the region that is compared with a bitmap (hash) should be static. When the content of the region on the screen changes, the bitmap (hash) won’t match and the script won’t continue. Examples of content that is not static are the name of a document, a username or the system time. Although these are valid items to check, in most conditions they’re irrelevant.
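The effect of small, static regions can be shown with a toy Python example. The "screen" below is a hypothetical 6x4 grid of numbers, with one pixel standing in for a clock. Real screens are of course larger, and the helper names are invented for this sketch.

```python
def crop(frame, left, top, width, height):
    """Return one rectangular region of a frame as an immutable bitmap."""
    return tuple(tuple(row[left:left + width]) for row in frame[top:top + height])


def make_screen(clock_digit):
    """Build a 6x4 toy screen: a static icon on the left, a 'clock' on the right."""
    screen = [[0] * 6 for _ in range(4)]
    screen[1][1] = 9                # static icon pixel
    screen[0][5] = clock_digit      # pixel that changes every minute
    return screen


recorded = make_screen(clock_digit=1)   # screen when the script was recorded
replayed = make_screen(clock_digit=2)   # same screen a minute later

whole_screen = (0, 0, 6, 4)             # left, top, width, height
icon_only = (0, 0, 3, 3)

print(crop(recorded, *whole_screen) == crop(replayed, *whole_screen))  # False
print(crop(recorded, *icon_only) == crop(replayed, *icon_only))        # True
```

The large region fails only because of the clock, which has nothing to do with the condition being tested. The small region around the icon still matches.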

5. Re-use regions and bitmap(hash)

When you’re testing a certain region on the screen multiple times, for instance when you’re checking whether the content of a document is displayed, try to re-use the same region and matching bitmap (hash). This way, when the environment changes, the script can be updated easily, and it makes troubleshooting a lot easier. A good example is the progress bar in a browser like Internet Explorer. I frequently check the progress bar to see whether the page is loaded: if the progress bar isn’t visible (in IE), the page is loaded. By re-using the region and matching bitmap (hash), for example in a function, I’m able to quickly find and solve problems.
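One way to organise this re-use, sketched in Python under assumptions: a single catalogue holds every named region with its recorded hash, and one helper does the comparison. The names, coordinates and hash strings are placeholders. `hash_region` stands for whatever function your tool offers to hash a region of the live screen.

```python
# One shared catalogue: name -> (region, expected hash). Values are placeholders.
CHECKS = {
    "browser_progress_bar": ((300, 580, 10, 10), "hash-of-progress-bar"),
    "word_empty_document": ((200, 200, 10, 10), "hash-of-empty-page"),
}


def check_matches(name, hash_region):
    """Compare the named region on screen with its recorded hash.

    `hash_region` is a callable supplied by the caller that takes
    (left, top, width, height) and returns the hash currently on screen.
    """
    region, expected_hash = CHECKS[name]
    return hash_region(*region) == expected_hash


def page_loaded(hash_region):
    """The page counts as loaded once the progress bar is no longer shown."""
    return not check_matches("browser_progress_bar", hash_region)


# Fake screens for the demo, standing in for real screen captures.
def fake_screen_while_loading(left, top, width, height):
    return "hash-of-progress-bar" if (left, top) == (300, 580) else "other"


def fake_screen_when_done(left, top, width, height):
    return "other"


print(page_loaded(fake_screen_while_loading))  # False
print(page_loaded(fake_screen_when_done))      # True
```

When the environment changes, only the entry in `CHECKS` needs a new region or hash. Every script step that calls `page_loaded` picks up the change.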

6. Mouse events

Mouse events, like mouse button down or up events, should be executed when the content of the screen matches a certain condition. For instance, when the Start button (I keep calling it that) is pressed, you’ll want to wait before clicking on ‘All Programs’. As long as the menu isn’t visible, there’s no point in clicking (in fact, if you do, the script will fail). This is an ideal situation where comparing the content of the screen with a small bitmap (10×10) helps to prevent assumptions.

7. Color depth

The hash of a bitmap, representing a certain region on the screen, is calculated with a formula that uses the color of each pixel. If one of the pixels changes, the hash changes. This means that if the color depth (16bpp, 24bpp, etc.) changes, the hash will change. This sounds logical, and it is, but it has a big impact on the script. If the script is created with a different color depth than the one used when it is executed, all hashes are invalid and need to be recreated (loads of work!).

Also keep in mind that not only the color depth of the session is important; the color depth of the client is important as well. The hash of a bitmap changes when the color depth of the client is changed (for instance from 24bpp to 16bpp), even when the color depth of the session remains equal (for instance 16bpp).
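A small Python sketch shows why a lower color depth invalidates hashes. It assumes the common RGB565 layout for 16bpp (5 bits red, 6 bits green, 5 bits blue) and uses SHA-256 as a stand-in hash. The layout and hash used by a given client or tool may differ.

```python
import hashlib


def to_rgb565_and_back(pixel):
    """Quantize a 24bpp pixel to 16bpp (RGB565), then expand it to 24bpp again."""
    red, green, blue = pixel
    r5, g6, b5 = red >> 3, green >> 2, blue >> 3
    # Expand by bit replication, so 0 stays 0 and the maximum maps to 255.
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


def bitmap_hash(pixels):
    """Hash a flat sequence of (red, green, blue) pixels."""
    digest = hashlib.sha256()
    for red, green, blue in pixels:
        digest.update(bytes((red, green, blue)))
    return digest.hexdigest()


region_24bpp = [(200, 100, 50), (201, 101, 51), (255, 255, 255)]
region_16bpp = [to_rgb565_and_back(pixel) for pixel in region_24bpp]

print(region_16bpp)
# [(206, 101, 49), (206, 101, 49), (255, 255, 255)]
print(bitmap_hash(region_24bpp) == bitmap_hash(region_16bpp))  # False
```

Two slightly different 24bpp colors collapse into the same 16bpp color, and neither survives unchanged, so every hash recorded at 24bpp stops matching.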

8. Speedscreen / font smoothing (etc)

Remoting protocol techniques like speedscreen, progressive display or other techniques that use image compression change the pixels of a bitmap. In other words: the content of the screen can be displayed in a different way each time you display it (to be honest, there are usually around 3 to 4 hashes). This means that the hash of the bitmap changes when a different compression (or other technique) is used. Some of these techniques kick in only under certain conditions (like a slow WAN or heavy load), which makes it difficult to collect all hashes the first time.

In the script you should be able to enter multiple hashes for the same bitmap. In DeNamiK LoadGen this is done by passing an array of bitmap hashes:

New String() {"hash1", "hash2", "hash3"}
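The matching logic behind such an array is simple. Here it is as a Python sketch with placeholder hash values. The function name `matches_any` is hypothetical and this is not LoadGen's implementation.

```python
def matches_any(actual_hash, accepted_hashes):
    """True when the hash on screen equals any recorded variant."""
    return actual_hash in accepted_hashes


# Placeholder values, mirroring the array shown above.
ACCEPTED = frozenset({"hash1", "hash2", "hash3"})

print(matches_any("hash2", ACCEPTED))  # True
print(matches_any("hash9", ACCEPTED))  # False
```

A hash that is not in the set means either that the screen is not ready yet, or that a compression variant has not been recorded yet. That is why collecting all variants up front matters.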

Font smoothing, which is enabled by default since Microsoft Windows Server 2008, smooths the edges of fonts. This makes the content look better and more readable for users, but makes it difficult to get a static bitmap (hash).

Best practice is either to avoid using regions that are influenced by these optimization techniques or to disable the technique. Neither of the two is ideal, but (for now) they are necessary. I’ve heard rumours that DeNamiK is developing a new algorithm to make this ‘workaround’ unnecessary. But as long as I haven’t seen it, this is the way to go!

Any comments, additions or questions? Please leave a comment, send me an e-mail or a tweet.

Ingmar Verheij