2008/06/11

My disk is thrashing every 4-5 minutes and the system is blocked for a few seconds each time

Have you ever had this happen?
I was reminded of this (still unfixed) problem a few days ago, and I guess it is a problem many people experience, but not everyone knows how to solve.
But first a little background:
The system INI files are updated very frequently during normal WPS operation. Since they are critical to the correct operation of the WPS, the system consolidates all the mods to disk every 4-5 minutes. When it does this, the actual amount of work depends on the size of the INIs, how fragmented they are, and how many mods there are to consolidate.
Sometime around Warp 4 someone at IBM decided it was more secure to make the OS write to the system INI files in write-through mode. What this means is that writes to the system INI files do not go into the file system cache, but instead go directly to disk.
This may seem like a good idea at first sight, but in fact it isn't. If you think about it, this means the system doesn't optimize writes (i.e. pages that are updated, but whose content has not changed, are written to disk nonetheless), and it also generally means that the system spends more time writing to the INIs, so the risk of a power interruption hitting in the middle of a write goes up instead of down. Also consider this: if power is interrupted while the mods are in cache but not yet written to disk, you lose the last changes you made in the WPS. But if power is interrupted while the mods are being written to disk, you get corrupted INIs and you have to restore from a backup. So it is best to minimize the time spent writing the INIs to disk, instead of forcing an immediate write as was done. Using the write-back cache is the best way to ensure both data consistency and that the INIs on disk are as up to date as possible.
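To make the difference concrete, here is a toy model in Python. It has nothing to do with the real OS/2 cache code: the class names, the page granularity and the workload are all made up for illustration. It only counts how many physical page writes reach the "disk" when the same few pages are updated over and over between two flushes.

```python
class ToyDisk:
    """Stands in for the real disk; counts physical page writes."""
    def __init__(self):
        self.pages = {}
        self.writes = 0

    def write_page(self, page_no, data):
        self.pages[page_no] = data
        self.writes += 1


class WriteThrough:
    """Every update goes straight to disk, changed or not."""
    def __init__(self, disk):
        self.disk = disk

    def update(self, page_no, data):
        self.disk.write_page(page_no, data)

    def flush(self):
        pass  # nothing is ever pending


class WriteBack:
    """Updates stay in memory; each dirty page is written once per flush."""
    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}

    def update(self, page_no, data):
        self.dirty[page_no] = data  # later updates overwrite earlier ones

    def flush(self):
        for page_no, data in self.dirty.items():
            self.disk.write_page(page_no, data)
        self.dirty.clear()


def simulate(cache_cls, updates):
    """Apply all updates through the given cache, flush once, return write count."""
    disk = ToyDisk()
    cache = cache_cls(disk)
    for page_no, data in updates:
        cache.update(page_no, data)
    cache.flush()
    return disk.writes


# Hypothetical workload: 3 pages, each "updated" 100 times with identical content.
updates = [(i % 3, "same") for i in range(300)]

print(simulate(WriteThrough, updates))  # 300
print(simulate(WriteBack, updates))     # 3
```

In this sketch the write-through path touches the disk 300 times, while the write-back path touches it 3 times, so the window in which a power cut could catch a write in progress is much shorter. A real file system cache is far more complicated, of course, but the principle is the same.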
What happens to a lot of users (and I'm sure everyone hits it sooner or later) is that this non-cached process of writing the INIs back to disk starts to become a nuisance once some kind of threshold is crossed (presumably on the size of the INIs). The system suddenly blocks while the disk is thrashing, it doesn't accept input for several seconds, then it behaves normally for another 4-5 minutes. Then it goes through the whole process again, and again, and again... you get the idea. Very irritating.
Peter Fitzsimmons found a way to (safely) patch the system so that it returns to the old behavior of using write-back caching even on INI writes. Later, Carsten Arnold packaged Peter's patch so that it can be easily installed and uninstalled.
You can find it here. Note that you need to reboot after applying it.
Also note that, since this package patches PMMERGE.DLL, which is a frequently updated DLL, you might need to reapply this patch from time to time.
I have been using this patch for many years, and I have never had problems with it.