Cursor Lock Optimizations

My most recent major version of Multimon Cursor Lock was pretty CPU efficient, using around 0.15 to 0.3 seconds of CPU time over two minutes with a poll rate of 20 ms. Still, I was noticing a large number of page faults while the program was polling: about 100 for each poll. A page fault occurs when a needed “page” of memory is not currently mapped into the process’s working set. In the worst case (a hard fault), the page has to be loaded back in from disk, such as from the page file (swap space on a hard drive); a soft fault is resolved from memory that is already in RAM. Of course, page faults are not fatal, but they do lower performance.

Tonight, I finally decided to investigate the page faults and see if there was a fix. It was really as simple as watching Process Explorer’s process performance tab while stepping through the code. It didn’t take long to find the culprits: CreateToolhelp32Snapshot() and Process32Next(). (I use these functions to enumerate the processes on a system. Of course, using EnumProcesses() would have been a better way, but that API function is only available on 2000/XP+ and I wanted to support as far back as 98.) Creating the snapshot costs at least 20 page faults, and each process looped over adds a few more. Since I saw no memory actually being swapped, I figure the system module that contains these legacy functions must be at fault (pun intended).
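For anyone curious, the Toolhelp enumeration described above looks roughly like this. It is a minimal sketch, not the actual Multimon Cursor Lock source: the function name ListProcessIds is made up for illustration, and the #ifdef _WIN32 guard is only there so the snippet still compiles (and returns an empty list) on other platforms.

```cpp
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <tlhelp32.h>
#endif

// Hypothetical helper: returns the IDs of all running processes.
// On non-Windows builds it returns an empty list.
std::vector<unsigned long> ListProcessIds() {
    std::vector<unsigned long> ids;
#ifdef _WIN32
    // Taking the snapshot is the expensive step discussed in the post.
    HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (snapshot == INVALID_HANDLE_VALUE) {
        return ids;
    }

    PROCESSENTRY32 entry;
    entry.dwSize = sizeof(entry);  // must be set before Process32First

    // Walk the snapshot one process at a time.
    if (Process32First(snapshot, &entry)) {
        do {
            ids.push_back(entry.th32ProcessID);
        } while (Process32Next(snapshot, &entry));
    }

    CloseHandle(snapshot);
#endif
    return ids;
}
```

Running something like this every 20 ms means a fresh snapshot and a full walk of the process list on every single poll, which is where the faults pile up.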

Seeing that my process enumeration code was causing about 100 faults every time it ran (give or take, depending on how many processes are running), I came up with a simple optimization so that it runs far less often. The program now remembers the ID of the process it last locked to and, on the next poll, checks whether that process’s window still has focus. Thus, while the cursor is locked to a program (typically a game), there is no need to enumerate processes and no page faults are caused. When I went to look at the performance of the optimized code, I actually had trouble seeing whether it used any CPU at all. Apparently, the usage is so much smaller than the performance counter’s resolution that after five minutes of locked polling, I couldn’t see any increase in CPU time. Because the usage is so immeasurably small, it was amusingly difficult to say in the changelog how much the performance had increased.
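The idea boils down to a fast path in the poll routine. Here is a rough sketch of it, not the real code: LockPoller, foregroundProcess, isLockTarget, and the enumerations counter are all hypothetical names. The two checks are passed in as callbacks so the snippet stays self-contained; on Windows, I would expect the cheap one to be built from something like GetForegroundWindow() and GetWindowThreadProcessId(), and the expensive one to be the Toolhelp enumeration.

```cpp
#include <cstdint>
#include <functional>

// All names here are hypothetical, for illustration only.
using ProcessId = std::uint32_t;

struct LockPoller {
    // Cheap check: ID of the process that owns the focused window (0 = none).
    std::function<ProcessId()> foregroundProcess;
    // Expensive check: enumerates processes to decide whether pid is a lock target.
    std::function<bool(ProcessId)> isLockTarget;

    ProcessId lockedPid = 0;  // process locked to on the previous poll (0 = none)
    int enumerations = 0;     // how many times the expensive check has run

    // Returns true if the cursor should be locked after this poll.
    bool poll() {
        const ProcessId focused = foregroundProcess();

        // Fast path: the process we locked to last time still has focus,
        // so the process enumeration is skipped entirely.
        if (lockedPid != 0 && lockedPid == focused) {
            return true;
        }

        // Slow path: focus changed (or nothing was locked), so re-check.
        lockedPid = 0;
        ++enumerations;
        if (focused != 0 && isLockTarget(focused)) {
            lockedPid = focused;
            return true;
        }
        return false;
    }
};
```

While the same window keeps focus, every poll after the first takes the fast path, so the expensive callback only runs again once focus moves elsewhere.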

Also, I’ve written a new column recently; it’s in the guides section, but it really doesn’t fit that category. I’d love to rearrange my column categories to include a software one, but unfortunately the URLs for columns are based on their category. Anyways, if you enjoyed this post, you’ll probably enjoy this column as well. It’s the same kind of reverse engineering goodness that gets us geeks all hot.
