Thursday, October 27, 2016

Slowing Program Execution -- Slow Your Jets


Earlier this week I was assigned debugging a time-critical bug on a financial system.  Reproducing the issue with sufficient consistency is a problem, the suspicion is that slowing the execution of a processing thread would allow reproducing the issue with a high degree of certainty.  The recommendation was to recompile the system with an artificially introduced delay in the processing thread.

As an alternative of creating a one-off build and deploying on a highly used system I thought it would be worthwhile to see if GDB would serve the purpose instead.  The idea would be to attach to the process remotely, introduce a breakpoint within the processing thread which when reached would 'sleep' sufficiently to slow the processing thread sufficiently.

Let's start with a very simple program.


$ cat -n main.cpp
     1 #include <stdio.h>
     2 #include <sys/time.h>
     3 #include <unistd.h>
     4
     5
     6 double currentTime()
     7 {
     8  struct timeval start;
     9  gettimeofday(&start, NULL);
    10  const double mtime=start.tv_sec + (double)start.tv_usec/1000000.0;
    11  return mtime;
    12 }
    13
    14 int main(int argc, char* argv[])
    15 {
    16  printf("%s:%d) main process initializing\n",__FILE__,__LINE__);
    17  while(1)
    18  {
    19    printf("(%s:%d) %lf\n",__FILE__,__LINE__,currentTime());
    20    ::usleep(500000);
    21  }
    22  printf("%s:%d) main process terminating\n",__FILE__,__LINE__);
    23 }



If we assume the while-loop represents the processing thread we wish to slow, setting a breakpoint at line 20 will start us off.  Then, we need to register a sequence of commands with hitting the breakpoint, specifically sleeping for X seconds, then continuing on with the program.  The following GDB command file does just that:


$ cat -n gdb.cmd
     1 b main.cpp:20
     2 commands
     3 silent
     4 !sleep 3.0
     5 cont
     6 end
     7 r


Line 1 adds the breakpoint, followed by lines 2-6 consist of the commands run when tripping the breakpoint.  Line 3, 'silent' disables the usual message about stopping at the breakpoint, if you elect to not silence the message you'll likely need to set pagination off otherwise the debugger will pause once the terminal is filled with debugging content and prompts you to continue.  Line 4 executes a shell command, sleeping 3 seconds, followed by line 5 which continues the program.  Line 6 ends the command sequence.
The last line, line 7, will begin running the program after registering the breakpoint.  As an alternative, you could issue the 'r' command interactively.

Compile the program as you would normally with debugging symbols.

$ gcc -g main.cpp -o main


Run via debugger, and associated GDB script by issuing:

$ gdb -x ./gdb.cmd ./main
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./main...done.
Breakpoint 1 at 0x4006de: file main.cpp, line 20.
main.cpp:16) main process initializing
(main.cpp:19) 1476848880.178461
(main.cpp:19) 1476848883.681235
(main.cpp:19) 1476848887.184019
(main.cpp:19) 1476848890.686872
(main.cpp:19) 1476848894.189695
(main.cpp:19) 1476848897.692520
(main.cpp:19) 1476848901.195763
(main.cpp:19) 1476848904.699169
(main.cpp:19) 1476848908.201601
(main.cpp:19) 1476848911.704499
(main.cpp:19) 1476848915.207383
(main.cpp:19) 1476848918.710270
(main.cpp:19) 1476848921.326889
^C
(main.cpp:19) 1476848921.826974
(main.cpp:19) 1476848922.327060
(main.cpp:19) 1476848922.828102
(main.cpp:19) 1476848923.329105
(main.cpp:19) 1476848923.829189
(main.cpp:19) 1476848924.329266
Program received signal SIGINT, Interrupt.
(gdb) q
Quit anyway? (y or n) y
$


Notice by the time stamps that the loop is running at ~3 secs, at least until hitting Control-C when resumes normal execution without the debugger breakpoint.

Seems like a viable alternative.  I've intentionally overlooked the complexity of attaching to a remote process but the proof-of-concept seems viable.

Cheers.

No comments:

Post a Comment