Memory Leaks Detection: A Different Approach
There are different ways to manage dynamically allocated memory
Experienced C/C++ programmers know what it means to properly manage dynamically allocated memory to avoid memory leaks. Michael presents an alternative approach.
Experienced C/C++ programmers know about the need to properly manage dynamically allocated memory to avoid memory leaks. Unfortunately, many of us still find ourselves in a tough position when there is a memory leak in the software. How is it detected in the first place? Typically, the task manager (or another tool that shows memory-use statistics) shows that the memory used by the process is constantly growing. Calling that a leak implies that memory use is expected to remain constant. But the program must allocate memory, at least at the beginning of its life. A more precise description, therefore, is that the process has been running for a while, the input rate is constant, and memory use continues to climb. The interpretation of the input rate depends on the purpose of the program; for a web server, for instance, it can be network traffic throughput or the number of requests per second.
How do you attack this problem? Assume that your first attempts at just looking at the code failed, and you need to get help from an automatic memory-leak detection tool. While your favorite tool may use unique techniques to trace memory allocation/deallocation and different algorithms to organize that information at runtime, it most probably works like this:
1. From the moment the program starts, it traces each memory allocation (probably with additional information, such as a call stack).
2. It registers all released memory.
3. Before the program terminates, it prints out information about each unreleased memory block.
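The three steps above can be sketched in a few lines of C++. This is only an illustration of the bookkeeping, not the code of any particular tool: the class name BasicLeakTracker and its methods are hypothetical, and the caller reports allocations by hand instead of having them trapped automatically.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping core of a conventional leak detector.
// A real tool would intercept malloc/free or operator new/delete; here the
// caller reports each event explicitly so the sketch stays self-contained.
// Not thread-safe: a real tool would guard the map with a lock.
class BasicLeakTracker {
    struct Block {
        std::size_t size;
        std::string where;  // stands in for a captured call stack
    };
    std::map<const void*, Block> live_;

public:
    // Step 1: record every allocation from the moment the program starts.
    void on_alloc(const void* p, std::size_t size, std::string where) {
        live_[p] = Block{size, std::move(where)};
    }

    // Step 2: forget a block as soon as it is released.
    void on_free(const void* p) { live_.erase(p); }

    // Step 3: called before exit; every block still registered is reported.
    std::vector<std::string> report() const {
        std::vector<std::string> lines;
        for (const auto& entry : live_) {
            lines.push_back(entry.second.where + ": " +
                            std::to_string(entry.second.size) + " bytes");
        }
        return lines;
    }
};
```

Note that report() lists everything still registered, whether or not it was ever meant to be released.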
When Good Isn't Good Enough
In many cases, memory-leak reports won't give you much useful information, for two reasons: they complain about many nonexistent memory leaks, and they miss many real ones.
Take the false alarms first. Many memory allocations are never released by design. These include static members allocated at the beginning of the process, and Singleton objects created later using a lazy initialization strategy (for instance, your web server may load a static file into memory when it is first requested). Other examples include custom memory pools, which allocate up to a certain maximum number of objects and then reuse them for the rest of the process's life. In all these cases, the memory is never released. Although it's good practice to clean up all resources, it doesn't always happen. I can even give you a reason for not releasing those objects: doing so slows the shutdown of your program. More important, all the memory used by the process is automatically reclaimed by the operating system (on most modern systems) when it exits, and while the process is running, the size of all that static memory is known in advance and is strictly limited.
The second reason is that such reports miss what can be called "Java-style memory leaks." That is, although an object is no longer needed (at least according to the program's logic), it is never deleted because a valid pointer to it still exists, probably forgotten in some cache or other container. If the object is properly wrapped in a smart pointer, it will be deleted at shutdown and won't appear in the report as a memory leak. Needless to say, this is a common reason for constantly growing memory.
Any remaining classical memory leaks, which can be detected in the usual way, may still be overlooked because they're hard to spot against the background of false alarms.
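A Java-style leak of the kind described above might look like the following sketch. The names Session, SessionCache, and simulate_requests are made up for illustration. The cache holds a smart pointer to every session and never erases one, so memory grows with each request, yet everything is freed when the cache is destroyed and an exit-time report shows nothing.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical session type; the static counter tracks how many are alive.
struct Session {
    explicit Session(std::string u) : user(std::move(u)) { ++live; }
    ~Session() { --live; }
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;

    std::string user;
    static int live;
};
int Session::live = 0;

class SessionCache {
public:
    std::shared_ptr<Session> open(const std::string& user) {
        auto session = std::make_shared<Session>(user);
        sessions_[next_id_++] = session;  // the cache keeps its own reference
        return session;
    }
    // Deliberate bug: there is no close() or expiry, so entries are never erased.
    std::size_t size() const { return sessions_.size(); }

private:
    std::map<int, std::shared_ptr<Session>> sessions_;
    int next_id_ = 0;
};

// Serves n requests and returns how many sessions the cache still holds.
std::size_t simulate_requests(int n) {
    SessionCache cache;
    for (int i = 0; i < n; ++i) {
        auto session = cache.open("visitor");
        // ... request handled; the local reference goes away here ...
    }
    return cache.size();  // n sessions still alive, though none is needed
}   // cache destroyed: every Session is freed, so an exit-time report is clean
```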
Designing a Better Tool
So let's make a wish list of requirements for a memory-leaks detection tool (I'm only talking about management of runtime info regarding memory use, and not about techniques to trap allocations/deallocations made by the program):
- It should let you begin registering new memory allocations at any arbitrary point in time, so that you can let the application initialize properly first.
- It should let you stop monitoring memory activity and get a report of current leaks at any time.
Returning to the web server example, assume that it holds a session context per user: the information needed to process all requests made by the same visitor. Because you don't want to keep this information forever, you release all related objects 20 minutes after the visitor's last request. Figure 1 shows a schematic lifetime of session objects, assuming that you start monitoring at time 1 and end at time 7.
At time 7 (when you end memory monitoring), the session objects for Ron and Superman are still alive, and they were allocated during the monitoring interval. As a result, those session objects would be listed as memory leaks, even though you know that they aren't. Such false alarms are exactly what you were trying to prevent.
As a solution, you need to rethink the second requirement and replace it with:
- You should be able to stop registering new memory allocations at some time (b) after monitoring started at time (a), while still updating the state on each released object.
- At a later time (c), you want to get a report of all memory blocks allocated in the time period (a)-(b) that were not released within the (a)-(c) interval.
For the imaginary web server, assume you've started the process at 10:00. You can begin monitoring at 10:10, turn off tracing of new allocations at 10:15, and only at 10:40 get the final report of all memory leaks (for objects allocated in the five minutes between 10:10 and 10:15). Such flexibility in activation times, and the separate handling of new and delete, are key points in building a better memory-leaks detection tool.
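One possible way to model the three moments (start registering, stop registering, report) is sketched below. WindowedLeakTracker and its method names are assumptions made for this example, and allocations are reported by hand because the techniques for trapping them are outside the scope of this discussion. The essential point is that on_alloc respects the recording switch while on_free never does.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tracker with separate control over allocation and release
// processing. Not thread-safe; a real tool would add locking.
class WindowedLeakTracker {
public:
    void start_recording() { recording_ = true; }   // time (a), e.g. 10:10
    void stop_recording()  { recording_ = false; }  // time (b), e.g. 10:15

    // New allocations are registered only inside the (a)-(b) window.
    void on_alloc(const void* p, std::string where) {
        if (recording_) tracked_[p] = std::move(where);
    }

    // Releases are always processed, even after recording has stopped.
    void on_free(const void* p) { tracked_.erase(p); }

    // Time (c), e.g. 10:40: blocks allocated in (a)-(b) and still not released.
    std::vector<std::string> report() const {
        std::vector<std::string> leaks;
        for (const auto& entry : tracked_) leaks.push_back(entry.second);
        return leaks;
    }

private:
    bool recording_ = false;
    std::map<const void*, std::string> tracked_;
};
```

With this arrangement, a session created at 10:12 and released at 10:35 drops out of the report, while a block allocated at 10:12 and never released is still listed at 10:40.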
When thinking about additional features it might have, I would add:
- Monitor only memory allocations made from a specific thread or group of threads.
Again, this is a good way to focus on specific areas of processing, ignoring unnecessary noise in reports. For example, a web server may have a pool of threads that handles user requests. You suspect that the memory leak is hiding in the code processing those requests, so you want to filter out all allocations made from other parts of the code. In many cases, you still want to process all memory deallocations, even in other threads (for example, you may send objects from processor threads to a special logger thread, and have the latter release those objects).
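Thread selection can be sketched with a per-thread flag, which is one design among several (a set of thread IDs would work too). ThreadFilteredTracker and monitor_this_thread are hypothetical names. Allocations are registered only when the calling thread has opted in, while releases are accepted from any thread, which covers the case of a logger thread freeing objects created by a request thread.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical tracker that filters allocations by thread but not releases.
class ThreadFilteredTracker {
public:
    // Opts the calling thread in or out of allocation tracking.
    static void monitor_this_thread(bool on) { monitored_ = on; }

    // Registered only if the calling thread has opted in.
    void on_alloc(const void* p, std::string where) {
        if (!monitored_) return;
        std::lock_guard<std::mutex> lock(mu_);
        tracked_[p] = std::move(where);
    }

    // Processed regardless of which thread releases the block.
    void on_free(const void* p) {
        std::lock_guard<std::mutex> lock(mu_);
        tracked_.erase(p);
    }

    // Number of tracked blocks that have not been released yet.
    std::size_t outstanding() const {
        std::lock_guard<std::mutex> lock(mu_);
        return tracked_.size();
    }

private:
    static thread_local bool monitored_;  // one flag per thread, off by default
    mutable std::mutex mu_;               // the map is shared between threads
    std::map<const void*, std::string> tracked_;
};

thread_local bool ThreadFilteredTracker::monitored_ = false;
```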
External Tools Versus Integrated Solutions
There are trade-offs in deciding which tool is most appropriate for your project. The first step is choosing between an independent external tool and an integrated solution that requires some changes in your source code.
The clear advantage of the first option is ease of use: you just plug it into the application, with no extra work required. Going back to the list of expectations for the memory-debugging tool, you will probably need to give up the last requirement (selection of threads) because external tools are not aware of the internal layout of the process. You will also need to guess the best times to start and end monitoring, without knowledge of the current state of the program.
On the other hand, adding simple code to your web server allows you to monitor specific threads while it processes a single request. You may prefer to invest some effort now, and build an infrastructure that will make memory debugging much easier in the future. In the long run, it will save you a lot of time.
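As an illustration of how little code the integrated route may need, the sketch below switches tracing on for the current thread only while one request is being handled. Everything here is hypothetical: g_trace_allocations is the switch an allocation hook would consult, note_allocation stands in for that hook, and handle_request stands in for the server's real handler.

```cpp
#include <atomic>
#include <string>

// Hypothetical per-thread switch consulted by the allocation hook.
thread_local bool g_trace_allocations = false;

// Stand-in for the tracker: counts allocations seen while tracing was on.
std::atomic<int> g_traced{0};

// Stand-in for the allocation hook a real tool would install.
void note_allocation() {
    if (g_trace_allocations) ++g_traced;
}

// RAII guard: tracing is on for this thread while the guard is alive,
// and the previous setting is restored even if the handler throws.
class ScopedAllocationTrace {
public:
    ScopedAllocationTrace() : previous_(g_trace_allocations) {
        g_trace_allocations = true;
    }
    ~ScopedAllocationTrace() { g_trace_allocations = previous_; }
    ScopedAllocationTrace(const ScopedAllocationTrace&) = delete;
    ScopedAllocationTrace& operator=(const ScopedAllocationTrace&) = delete;

private:
    bool previous_;
};

// Hypothetical request handler: one added line enables tracing for the request.
std::string handle_request(const std::string& path) {
    ScopedAllocationTrace trace;
    note_allocation();  // stands in for an allocation made while serving
    return "200 OK " + path;
}
```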
Existing Tools?
Are there existing tools that implement the functionality I've examined here? To be truthful, I'm not aware of any commercial memory-profiling tool that does, but there are a couple of open-source projects that implement this approach:
- Windows Leaks Detector (sourceforge.net/projects/winleak) for Win32 attaches to any running process (no source code required), provides a basic UI to start and end monitoring memory activity, and finally produces a report of all memory leaks with full call stack information. It also lets you automatically add debugger breakpoints when memory allocation occurs.
- LeakTracer (www.andreasen.org/LeakTracer) for Linux requires special compilation of your project.