🛠️ Hacking Task Manager: Fixing the Slow Column Sorting

For a long time I’ve had a problem with Task Manager on Windows 10: switching the process sorting between name and CPU usage is extremely slow. On my computer it takes a whopping 2-3 seconds, during which the program just freezes. This is unacceptable for an essential program that I use daily.

I recently picked up a license for the fantastic Superluminal profiler (not sponsored), and I figured this would be a perfect opportunity to put my new profiler to use. Let’s see if I can figure out why it is so slow, and perhaps even fix it.

Why Is Sorting So Slow?

When I click on the name or CPU column to sort the process list, Task Manager freezes for a good 2-3 seconds. This made absolutely no sense to me. At the moment of writing this I have 340 processes running on my machine, which is quite a few, but it is not so many that the time it takes to sort them should be noticeable to a human.

I started up Superluminal and captured some data while I was switching between the different sorting modes. After letting it automatically download all of the symbols, we can immediately see the functions that get called when sorting, and sure enough, it is taking over 2 seconds to sort the list view.

profiler shows that sorting takes a long time

I could see that it is doing mostly UI related work, mainly calls to a function called DirectUI::Element::Insert. It makes sense that we are bound by the UI logic, because the sorting should be virtually instant in this context. But is the code just so terribly inefficient that shuffling around the UI elements would take 2 seconds? Let’s dig deeper.

Looking at the call stack I could see that most of the time in Insert is actually spent in an event handler, DirectUI::Element::OnPropertyChanged. Going deeper, this eventually calls a function called NotifyAccessibilityEvent, which calls into the kernel. That seemed interesting, because that is a significant amount of time spent on something related to accessibility, but I don’t use any accessibility features on this computer.

I spent some time trying to figure out what exactly this function is doing. Since it calls down into the kernel, I looked up the kernel function it ends up in, NtUserNotifyWinEvent. That function seems to be undocumented, but there is a documented function called just NotifyWinEvent, which probably calls the Nt function internally. The documentation states very helpfully:

Signals the system that a predefined event occurred. If any client applications have registered a hook function for the event, the system calls the client’s hook function.

Further down, something called Microsoft Active Accessibility is mentioned, which sounded relevant. It turned out to be another huge rabbit hole of technical overviews and COM APIs, so I decided to go back to the profiler and have a look at that function again.

Searching for NotifyAccessibilityEvent in the call stack again I could see a lot of white in the timeline. But after zooming in, I realized that the function is actually called a lot more often than I thought, just from different call stacks. Just eyeballing it, it sure looks like it could account for 90% of the time spent.

callgraph

As can be seen here, many of the function calls that are not highlighted are also that function: accessibility event everywhere

Fixing The Problem

While I gave up on figuring out why the function is called and what it is doing, I already had some ideas of how to approach fixing the problem. My main idea was that if I got some more information about the function, I could install a function detour and just return, essentially replacing the function with a no-op, hoping that it doesn’t do anything vital for Task Manager to function.

Looking over the assembly to try to figure out how the function behaves and what the signature might look like, I got very lucky and noticed something very interesting. At the start of the function there is a breakout case that checks a global variable, and if that is set, it just returns and does nothing. Perfect, that is just what I wanted to do! But even more interestingly, Superluminal had pulled out a symbol name for that global variable, DirectUI::g_fDisableNotifyAccessibilityEvent. Hah, it’s like the developers knew that you might want to disable this function.

disassembly
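
In C++ terms, the early-out seen in the disassembly might look roughly like the sketch below. This is a guess at the shape of the code, not the actual DirectUI source: the namespace, the call counter, and the function body are hypothetical stand-ins. Only the idea of a global flag that is checked near the top of the function comes from the disassembly.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for DirectUI::g_fDisableNotifyAccessibilityEvent.
// The disassembly compares it against zero as a 32-bit value.
std::uint32_t g_fDisableNotifyAccessibilityEvent = 0;

// Hypothetical counter standing in for the expensive call into the kernel.
int g_kernelCalls = 0;

void NotifyAccessibilityEvent() {
    // Early-out: when the flag is non-zero the function does nothing.
    if (g_fDisableNotifyAccessibilityEvent != 0)
        return;

    ++g_kernelCalls;  // placeholder for the real notification work
}

}  // namespace sketch
```

If the real function has this shape, setting the flag to a non-zero value turns every call into a cheap compare and return, which is the effect I was after with the detour idea.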

The fix then should be very simple. I just need to set the global variable to 1, and the existing code should do all the work. So I broke out a debugger, got the program running, found the NotifyAccessibilityEvent function, and resolved the address of the global variable from the cmp instruction. Then I changed the byte at the memory location to 0x1.

debugger

Now, checking back in Task Manager, doing a quick sort test, and voila! Sorting is instant. All I had to do was flip a bit to disable a function related to an accessibility system.

With a solution on hand, the only thing left was to create a patch for it, so that I don’t have to use a debugger every time I want to start Task Manager.

Building A Tool To Patch It

I was a bit curious about the g_fDisableNotifyAccessibilityEvent variable, and whether it perhaps has a registry option, or some other way to configure it persistently. So I did some digging in the disassembly, but found that there were no uses of this variable other than the read in NotifyAccessibilityEvent. Maybe this is just a development option that was left in the release version? No shortcuts then.

I didn’t want to apply a persistent patch to Taskmgr.exe, since I didn’t know if this could have any unintended negative effects, and I don’t want Task Manager to stop working. So my idea was that if I could write a program that is able to resolve the address of the NotifyAccessibilityEvent function at runtime, it could do the same thing that I did in the debugger: get the address of the global variable and write a 0x1. Then I can choose to apply the patch whenever I want.

This was the plan:

  1. Find the running taskmgr.exe process
  2. Find the address of the NotifyAccessibilityEvent function
  3. Read the machine code from the start of that function and find the cmp instruction
  4. Resolve the address of the global variable
  5. Write 0x1 into the global variable

The most common and maybe a bit easier way to get access to the process’ memory space is to inject your own DLL. But I didn’t want to deal with that and decided to find a way to do it without DLL injection.

To find the running taskmgr.exe process, we can iterate over the running processes. When we have found the process, we can iterate through its modules to find the DLL that has the function we are looking for. I have created helper functions for this so that we can focus on the interesting parts.

DWORD id = FindProcess(TEXT("Taskmgr.exe"));
if (id == 0)
	return 1;

// Use the process ID to open the process with full permissions so that we can access the process'
// memory. OpenProcess returns NULL on failure, so check the handle before using it.
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, id);
EXPECT(hProcess);

// Now find the module handle for the DLL containing the NotifyAccessibilityEvent function.
const TCHAR* moduleName = TEXT("C:\\WINDOWS\\system32\\DUI70.dll");
HMODULE hTargetModule = FindModule(hProcess, moduleName);

Since it turns out that NotifyAccessibilityEvent is an export, we don’t need to parse the PDB to find the function address. Instead we can use the GetProcAddress function. For this to work we need to load the DLL into our own process. We can use a flag to tell Windows that we only want to look at the module, not execute anything.

// Load the DLL into our process (but do not execute anything!) so that we can use GetProcAddress.
HMODULE hModuleInMyProcess = LoadLibraryEx(moduleName, NULL, DONT_RESOLVE_DLL_REFERENCES);
EXPECT(hModuleInMyProcess);

// This part would have been simpler if we had gone the DLL injection route, but it turns out
// that it is not that hard to calculate the address of the function in the target process'
// address space.
//
// As explained in this answer, the offset of the function is always the same relative to the
// module handle, so we can subtract the module handle in our process and then add the module 
// handle from the live task manager process.
// https://stackoverflow.com/a/26397667
char* procAddress = (char*)GetProcAddress(hModuleInMyProcess, "NotifyAccessibilityEvent");
EXPECT(procAddress);
void* procAddressInTarget = procAddress - (char*)hModuleInMyProcess + (char*)hTargetModule;
EXPECT(FreeLibrary(hModuleInMyProcess));
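
The address arithmetic above can be isolated into a small helper, sketched here with plain integers so that it is easy to check by hand. The function name and parameter names are made up for illustration, and the sketch assumes both processes map the same DLL file, so that the offset from the module base is identical in both.

```cpp
#include <cstdint>

// Translate an address inside our own mapping of a module into the address
// that the same code has in another process.
// Assumption: both processes map the same file, so the offset from the
// module base is the same in both address spaces.
std::uint64_t TranslateToTarget(std::uint64_t addressInMyProcess,
                                std::uint64_t moduleBaseInMyProcess,
                                std::uint64_t moduleBaseInTarget) {
    // Offset of the function from the start of the module.
    const std::uint64_t offset = addressInMyProcess - moduleBaseInMyProcess;
    // The same offset applied to the base of the module in the target.
    return moduleBaseInTarget + offset;
}
```

As a worked example with invented base addresses: if the DLL were loaded at 0x180000000 in our process and at 0x7FFDFBC70000 in Task Manager, a function found at 0x180087570 locally would sit at 0x7FFDFBCF7570 in Task Manager.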

Now comes the fun part. With the address of the function we can find the cmp instruction and parse out the address of the global variable that is used as the comparand. This way of doing things is not perfect, as it will break if the content of the function changes in a future update. But it will do for this proof of concept.

// NotifyAccessibilityEvent:
// 00007FFDFBCF7570 | 48:897424 20     | mov qword ptr ss:[rsp+20],rsi       |
// 00007FFDFBCF7575 | 57               | push rdi                            |
// 00007FFDFBCF7576 | 48:81EC 80000000 | sub rsp,80                          |
// 00007FFDFBCF757D | 48:8B05 74C61500 | mov rax,qword ptr ds:[7FFDFBE53BF8] |
// 00007FFDFBCF7584 | 48:33C4          | xor rax,rsp                         |
// 00007FFDFBCF7587 | 48:894424 70     | mov qword ptr ss:[rsp+70],rax       |
// 00007FFDFBCF758C | 833D 29F61500 00 | cmp dword ptr ds:[7FFDFBE56BBC],0   | <- Target instruction
// 00007FFDFBCF7593 | 48:8BFA          | mov rdi,rdx                         |
const int offsetOfInstruction = 0x1c;
const int sizeOfInstruction = 0x7;
unsigned char buffer[offsetOfInstruction + sizeOfInstruction];
size_t nBytes = 0;

// Read in a chunk of machine code and verify that the instruction bytes are what we expect so
// that we don't break something in case the function changes in the future.
EXPECT(ReadProcessMemory(hProcess, procAddressInTarget, buffer, sizeof(buffer), &nBytes));
EXPECT(nBytes == sizeof(buffer));
EXPECT(
	buffer[offsetOfInstruction + 0] == 0x83 && // opcode: group 1 instruction with an 8-bit immediate
	buffer[offsetOfInstruction + 1] == 0x3D && // ModRM: /7 selects cmp, operand is [rip + disp32]
	buffer[offsetOfInstruction + 6] == 0x00);  // immediate operand 0x0

// The first operand is the address of the global variable, stored as a signed 32-bit offset
// relative to the address of the next instruction.
int32_t relAddr = *(int32_t*)&buffer[offsetOfInstruction + 2];
// To get the absolute address we add the address of the next instruction to the relative address.
void* globalVariableAddress =
    ((char*)procAddressInTarget + offsetOfInstruction + sizeOfInstruction) + relAddr;
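
To make the RIP-relative arithmetic easier to check, here is a standalone sketch of the same decoding step that works on a plain byte array and integer addresses. The helper name and signature are made up for illustration, and it only handles the single cmp encoding shown in the listing above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decode `cmp dword ptr [rip+disp32], imm8` (bytes: 83 3D <disp32> <imm8>)
// located at `instructionAddress` and compute the absolute address of its
// memory operand. Returns false when the bytes do not match that encoding.
// Assumes a little-endian host, which holds on x86-64.
bool ResolveCmpOperand(const unsigned char* code, std::size_t size,
                       std::uint64_t instructionAddress,
                       std::uint64_t* operandAddress) {
    const std::size_t kInstructionSize = 7;
    if (size < kInstructionSize || code[0] != 0x83 || code[1] != 0x3D)
        return false;

    // The displacement is a signed 32-bit value, so the variable can sit
    // before or after the instruction.
    std::int32_t displacement = 0;
    std::memcpy(&displacement, code + 2, sizeof(displacement));

    // RIP-relative operands are relative to the address of the NEXT instruction.
    const std::uint64_t nextInstruction = instructionAddress + kInstructionSize;
    *operandAddress = nextInstruction + static_cast<std::int64_t>(displacement);
    return true;
}
```

Fed the seven bytes 83 3D 29 F6 15 00 00 at address 0x7FFDFBCF758C from the listing, this gives 0x7FFDFBCF7593 + 0x15F629 = 0x7FFDFBE56BBC, which matches the address the debugger shows for the cmp operand.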

Now we have the pointer we need to write to the variable.

// We expect the value to be 0x0, if it is not, something might be wrong. Read the current value 
// and check.
unsigned char value;
EXPECT(ReadProcessMemory(hProcess, globalVariableAddress, &value, sizeof(value), &nBytes));
EXPECT(nBytes == sizeof(value));

if (value == 0)
{
	unsigned char newValue = 1;
	EXPECT(WriteProcessMemory(hProcess, globalVariableAddress, &newValue,
	                          sizeof(newValue), &nBytes));
	EXPECT(nBytes == sizeof(newValue));
}

That’s it. To apply the patch, we simply have to make sure Task Manager is running, then execute our program with administrator privileges. If you do not execute it with elevated permissions it will not be able to write to Task Manager’s memory, since Task Manager always runs elevated. Now we could create a script to run the patch whenever we want.

If you want to read the full code of the patch you can find it here: code.cpp.