This simple video demonstrated the value of visualizing complex algorithms, hopefully something that is incorporated into CS coursework. The value of divide-n-conquer techniques can be more easily understood with a side-by-side comparison.
We can generate an equivalent video by monitoring the list of numbers, generate a graph at each step and render the sequence of graphs into a video.
In order to generate each graph, we could modify the sorting algorithm to create and store a graph image. This approach however would require modification to the source code. An alternative, use a debugger to monitor the list of numbers and print out the list of numbers whenever it is modified. Then, post-run, we could use the debugger output to create individual graphs and stitch them together.
So, our approach will take the form:
1) create C++ sorting algorithm
2) run w/debugger and monitor the state of the input array
3) parse the debugger log, creating a graphical representation of the input array at each step
4) stitch each frame into a video
1 - Sorting Algorithm
Let's create a simple C++ example with a variety of sorting algorithms. We'll create a randomized list of integers and prepare to send it through a sorting routine.
$ cat -n main.cpp
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <vector>
4
5 #ifndef N
6 #define N 100
7 #endif
8
9 void swap(int& a, int& b)
10 {
11 const int tmp=a;
12 a=b;
13 b=tmp;
14 }
15
16 void display(int list[], int size)
17 {
18 for(int i=0; i<size; i++)
19 {
20 printf("%d\n",list[i]);
21 }
22 }
23
24 //============Selection Sort============
25 void selectionSort(int list[], int size)
26 {
27 for(int i=0; i<size-1; i++)
28 {
29 for(int j=i+1; j<size; j++)
30 {
31 if (list[j] < list[i])
32 {
33 swap(list[j],list[i]);
34 }
35 }
36 }
37 }
38 //======================================
39
40 //==============Quick Sort==============
41 int partition(int *a,int start,int end)
42 {
43 int pivot=a[end];
44 //P-index indicates the pivot value index
45
46 int P_index=start;
47 int i,t; //t is temporary variable
48
49 //Here we will check if array value is
50 //less than pivot
51 //then we will place it at left side
52 //by swapping
53
54 for(i=start;i<end;i++)
55 {
56 if(a[i]<=pivot)
57 {
58 t=a[i];
59 a[i]=a[P_index];
60 a[P_index]=t;
61 P_index++;
62 }
63 }
64 //Now exchanging value of
65 //pivot and P-index
66 t=a[end];
67 a[end]=a[P_index];
68 a[P_index]=t;
69
70 //at last returning the pivot value index
71 return P_index;
72 }
73
74 void Quicksort(int *a,int start,int end)
75 {
76 if(start<end)
77 {
78 int P_index=partition(a,start,end);
79 Quicksort(a,start,P_index-1);
80 Quicksort(a,P_index+1,end);
81 }
82 }
83
84 void quickSort(int list[], int size)
85 {
86 Quicksort(list,0,size);
87 }
88
89 void insertionSort(int list[], int size) {
90
91 int valueToInsert;
92 int holePosition;
93 int i;
94
95 // loop through all numbers
96 for(i = 1; i < size; i++) {
97
98 // select a value to be inserted.
99 valueToInsert = list[i];
100
101 // select the hole position where number is to be inserted
102 holePosition = i;
103
104 // check if previous no. is larger than value to be inserted
105 while (holePosition > 0 && list[holePosition-1] > valueToInsert) {
106 list[holePosition] = list[holePosition-1];
107 holePosition--;
108 printf(" item moved : %d\n" , list[holePosition]);
109 }
110
111 if(holePosition != i) {
112 printf(" item inserted : %d, at position : %d\n" , valueToInsert,holePosition);
113 // insert the number at current hole
114 list[holePosition] = valueToInsert;
115 }
116
117 printf("Iteration %d#:",i);
118 }
119 }
120
121 void sort(int list[], int size)
122 {
123 //selectionSort(list,size);
124 //quickSort(list,size);
125 insertionSort(list,size);
126 }
127
128 int main(int argc, char* argv[])
129 {
130 printf("(%s:%d) main process initializing\n",__FILE__,__LINE__);
131 std::vector<int> L;
132
133 for(int i=0; i<N; ++i)
134 {
135 L.push_back(rand()%100);
136 }
137 sort(L.data(), L.size());
138
139 display(L.data(), L.size());
140
141 printf("(%s:%d) main process terminating\n",__FILE__,__LINE__);
142 }
On line 5-8, we declare N which represents the size of the array size, then on lines 131-136 we allocate a list of integers and populate them with randomized numbers. Line 137 we provide the list as an argument to the sort() method.
2 - Run w/Debugger
Macros are evil so let me justify the use of N in the C++ source file. Typically, such a macro would be replaced with a static constant allowing more type-safe implementation. In this example though the size of the array needs to be known by the GDB commands as well. The makefile assigns the value of N, passing it in to the compiler as a pre-compiler directive and creates the gdb command file which also needs to know the value of N.
$ cat -n Makefile
1 CC=g++
2 MAIN=main
3 N=100
4
5 all: video.mp4
6
7 ${MAIN}: ${MAIN}.cpp
8 ${CC} -D N=${N} -g $< -o $@
9
10 run: ${MAIN}
11 ${SH} ./${MAIN}
12
13 plot: gdb.log
14 ${SH} ./plotter $<
15
16 gdb.cmd:
17 ${SH} echo "set pagination off" > $@
18 ${SH} echo "break sort" >> $@
19 ${SH} echo "commands" >> $@
20 ${SH} echo " watch *(int(*)[${N}])(list)" >> $@
21 ${SH} echo " commands" >> $@
22 ${SH} echo " cont" >> $@
23 ${SH} echo " end" >> $@
24 ${SH} echo " cont" >> $@
25 ${SH} echo "end" >> $@
26 ${SH} echo "run" >> $@
27 ${SH} echo "quit" >> $@
28
29 gdb.log: gdb.cmd ${MAIN}
30 ${SH} gdb -x gdb.cmd ${MAIN} > $@
31
32 video.mp4: plot
33 ${SH} ffmpeg -i frame%05d.png $@
34
35 clean:
36 ${RM} ${MAIN} *.o gdb.log gdb.cmd
37 ${RM} *.png *.mp4
Notice on line 3 we define our array size as 100. On line 8 we pass in the value when compiling the C++ source. On line 20 we create a gdb command file that we will use to run with the debugger. I've recently found it useful for makefiles to create such input files as part of the build process rather than author manually and preserve in version control. The gdb.cmd file is generated as part of the build process and in this case allows the makefile to preserve a consistent value for N in the gdb command file and the C++ source file.
Notice on line 29, we define a gdb.log target, which runs the executable with the necessary gdb command script and generates a gdb log file. Line 20 is the key to the gdb script, it attaches a watchpoint to array, when the array changes it spits out the old/new value of the array.
3 - Parse the Debugger Log
The gdb log file takes the form of:
$ cat -n gdb.log | head -50
1 GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.3) 7.7.1
2 Copyright (C) 2014 Free Software Foundation, Inc.
3 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
4 This is free software: you are free to change and redistribute it.
5 There is NO WARRANTY, to the extent permitted by law. Type "show copying"
6 and "show warranty" for details.
7 This GDB was configured as "x86_64-linux-gnu".
8 Type "show configuration" for configuration details.
9 For bug reporting instructions, please see:
10 <http://www.gnu.org/software/gdb/bugs/>.
11 Find the GDB manual and other documentation resources online at:
12 <http://www.gnu.org/software/gdb/documentation/>.
13 For help, type "help".
14 Type "apropos word" to search for commands related to "word"...
15 Reading symbols from main...done.
16 Breakpoint 1 at 0x400dcc: file main.cpp, line 125.
17
18 Breakpoint 1, sort (list=0x604270, size=100) at main.cpp:125
19 125 insertionSort(list,size);
20 Watchpoint 2: *(int(*)[100])(list)
21 Watchpoint 2: *(int(*)[100])(list)
22
23 Old value = {83, 86, 77, 15, 93, 35, 86, 92, 49, 21, 62, 27, 90, 59, 63, 26, 40, 26, 72, 36, 11, 68, 67, 29, 82, 30, 62, 23, 67, 35, 29, 2, 22, 58, 69, 67, 93, 56, 11, 42, 29, 73, 21, 19, 84, 37, 98, 24, 15, 70, 13, 26, 91, 80, 56, 73, 62, 70, 96, 81, 5, 25, 84, 27, 36, 5, 46, 29, 13, 57, 24, 95, 82, 45, 14, 67, 34, 64, 43, 50, 87, 8, 76, 78, 88, 84, 3, 51, 54, 99, 32, 60, 76, 68, 39, 12, 26, 86, 94, 39}
24 New value = {83, 86, 86, 15, 93, 35, 86, 92, 49, 21, 62, 27, 90, 59, 63, 26, 40, 26, 72, 36, 11, 68, 67, 29, 82, 30, 62, 23, 67, 35, 29, 2, 22, 58, 69, 67, 93, 56, 11, 42, 29, 73, 21, 19, 84, 37, 98, 24, 15, 70, 13, 26, 91, 80, 56, 73, 62, 70, 96, 81, 5, 25, 84, 27, 36, 5, 46, 29, 13, 57, 24, 95, 82, 45, 14, 67, 34, 64, 43, 50, 87, 8, 76, 78, 88, 84, 3, 51, 54, 99, 32, 60, 76, 68, 39, 12, 26, 86, 94, 39}
When the list array is modified the watchpoint breakpoint is tripped which results in lines 23-24, outputting the old/new values.
4 - Stitch Frames into Video
Last bit, we need a simple Python script that is capable of parsing the gdb log file and generate a image representative of the state of the array.
$ cat -n plotter
1 #!/usr/bin/python
2 import sys;
3 import re;
4 import matplotlib.pyplot as plt;
5
6 def plot(fileName):
7 print "plotting %s"%(fileName);
8 with open(fileName,'r') as fp:
9 C=fp.read();
10
11 i=0;
12 for line in C.split('\n'):
13 m=re.match('New value = {(.*)}.*',line);
14 if m:
15 L=[int(el) for el in m.group(1).split(',')];
16 print L;
17 plt.ylim(0,100);
18 plt.bar(range(len(L)),L,label='');
19 plt.savefig('frame%05d.png'%(i));
20 plt.close();
21 i+=1;
22 pass;
23
24 #---main---
25 plot(sys.argv[1]);
This script will parse the 'New value =' lines and generate a bar graph that represents the order of the array. For example, an initial value of a 10 element array could look like this;
Generating a series of images, during each step of the sorting routine and stitching them together into a video using FFmpeg.
Final results, i give you;
Selection Sort
Insertion Sort
Quick Sort
No comments:
Post a Comment