Single-stepping with breakpoints is an inefficient debugging method

The interactive debugger, with its breakpoints and single-stepping, is a major invention in the history of software development. But I think it is like the graphical user interface: something that exists mainly to lower the barrier to entry. In essence, it is an extremely inefficient way to debug.

In my early years (more than ten years of development experience before 2005) I was extremely dependent on such debuggers, and used every version from Turbo C to Visual C++ with great care. Any tool feels natural after ten years of use, and I believed I could not work without this kind of tool. After 2005, however, my focus shifted to cross-platform development. Perhaps because I could not find the right graphical tool on Linux at first, I had some time to reflect on the problem. GDB is powerful, but its graphical front ends were not as polished then as today's versions; the mainstream choices, Insight and DDD, each had small problems and were not very pleasant to use. So I began to change my usual way of developing. Beyond improving code quality by writing simple code that obviously has no problems, I relied more on Code Review and consciously added log output to locate bugs.

Later, my development focus shifted from client-side graphics to the server side, and the drawbacks of interrupting a running program with a debugger became even more apparent. In software with a C/S structure, code runs on both sides at once. If you single-step one side at human interaction speed while the other side is driven by the machine, it is very hard to keep the software running normally.

In recent years some new work has gradually brought me back to Windows development. I found that after another ten years of training, using an interactive debugger no longer gives me any advantage. Often my fingers mechanically press the step button while my mind is reading the code on the screen ahead of the execution point. Frequently, before the run even reaches the position that triggers the bug, I suddenly see where the mistake is. After this happened many times, I naturally began to question my old methods: what is it that makes the debugger inefficient?

Sometimes when chatting with people about how to locate bugs, I half-jokingly say: open your editor and stare at the code; stare long enough, and the bug will naturally stand out. It is a joke, but in my view no debugging method beats Code Review, whether the code is your own or someone else's that you have stepped in to fix. The first and most important thing is to understand the overall structure of the program.

A program always consists of code segments executed one after another in sequence. A sequentially executed segment is very stable: the input state at its entry determines its output. If we know the input state, we can usually skip the process and look directly at the result, because such code has a unique execution path no matter how long it is. It is branches that make the execution flow process data differently depending on intermediate states. When reasoning about the correctness of code, every branch point must be considered: under what conditions does the code take this branch, and under what conditions does it take that one? It is fair to say that branches determine the complexity of code; this is roughly what McCabe code complexity measures.
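As a rough illustration (my own sketch, not from the original text): McCabe's cyclomatic complexity can be counted as the number of decision points plus one, so every if, loop condition, or case adds another path the reviewer must hold in mind.

```cpp
#include <cstdio>

// One decision point: complexity = 1 + 1 = 2.
// A reviewer has exactly two paths to check.
int abs_value(int x) {
    if (x < 0)
        return -x;
    return x;
}

// Three decision points (loop condition, if, else-if): complexity = 4.
// Reviewing this function means holding four paths in your head at once.
int signed_magnitude_sum(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {   // decision 1
        if (a[i] > 0)               // decision 2
            sum += a[i];
        else if (a[i] < 0)          // decision 3
            sum -= a[i];
    }
    return sum;
}

int main() {
    int data[] = { 3, -1, 0, 5 };
    std::printf("%d %d\n", abs_value(-7), signed_magnitude_sum(data, 4));
    return 0;
}
```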

The overall McCabe complexity of a whole piece of software certainly exceeds the limit a human brain can process. But we can usually divide the software into highly cohesive, loosely coupled structures to reduce its complexity. A highly cohesive module can be isolated from the outside, letting us focus our analysis on its interior. Once the code in focus is small enough, the brain can handle the complete set of execution flows, including every branch. Observing execution with a debugger is different: each run follows a single path determined by real input data. To locate a bug you must design an input state that triggers it, and for an inner module that is not always easy. Analyzing a module with the brain works differently: as long as the McCabe complexity is not too high, the brain can process all execution paths almost in parallel. That is, while your eyes scan the code, your brain analyzes many paths at once, pruning away the less important branches as it goes. Of course, like any skill, the speed of analysis, the width (complexity) you can handle, and the correctness of your pruning all take training to develop. Overusing interactive debugging tools interferes with this training. A brain shaped by the tool cares more about what is right in front of its eyes: where execution is now, where the next breakpoint should go (to improve debugging efficiency), what the values of this group of variables are at this moment; and cares less about the question: if the input were different, how would the program run? The tool has already pruned those other paths away, waiting for you to design the next input before it will show them to you.

Interactive debugging tools typically lack the ability to backtrack: they show you the present without recording the past. Improved tools can fix some of this, but not all. A common scene: you set the next breakpoint, and when the debugger stops you find the state is already abnormal. You can only conclude that the problem lies somewhere between the previous breakpoint and the current position; if you want to go back and see what happened in between, and what the intermediate states were, the tool is powerless. If instead the process is deduced in the brain, everything is a static map; backtracking is no different from moving forward, just a shift of focus to another point on the timeline. That is why a well-trained programmer can see at a glance where a bug is, while a master of the debugger has to run the program two or three times to find it. Training the brain sufficiently is of course much harder than learning to use a debugger, but it is worth it. I don't know whether other students have had a similar experience: when I took part in programming contests for school students in middle school, the papers were not all programming problems. The preliminary rounds in particular were generally written tests, with many problems that gave you a program and its input and asked you to write down the output. Thanks to that experience, I received this kind of training from the moment I started learning. In middle school, my time on a real machine was very limited; most of my time went to ordinary schoolwork. To write the game programs I wanted to play, I could only write code in a notebook during class, and once it was written, simulate it in my head to see whether there were bugs, so I could fix them before reaching the machine and make the most of my limited machine time. Those experiences made me feel that reading code is not boring at all; it is a way of improving efficiency.

Using Code Review as the main way to locate bugs also pushes you to write programs with less complexity (and therefore fewer errors), because you come to know the limits of complexity your current brain can deal with. I once watched an interview with Linus that touched on reducing branches. Talking about taste in code, he gave a small example: a routine that removes a node from a singly linked list. The head of a linked list is generally structured differently from the middle: every node other than the head is referenced by the next pointer of the node before it, while the head node is the exception, referenced by a different data structure. In Linus's bad example, the code tests whether the node being removed is the head; in the good example, the reference to the current node is held in a pointer-to-pointer variable, which for the head node simply points into that different data structure, so the extra special-case judgment (for the head node) disappears and one piece of code handles everything. In a fragment of only five or six lines, where the semantics already seem perfectly clear and hard to get wrong, Linus still emphasized this point. I think it shows the instinct of reducing code complexity at work while writing code.
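A minimal sketch of the two versions, reconstructed from memory of that interview (the type and function names here are mine, and both versions assume target is actually in the list):

```cpp
struct entry {
    entry *next;
    // ... payload ...
};

struct list {
    entry *head;
};

// Bad taste: the head is a special case, so the reader must
// consider one extra branch.
void remove_entry_v1(list *l, entry *target) {
    entry *prev = nullptr;
    entry *walk = l->head;
    while (walk != target) {
        prev = walk;
        walk = walk->next;
    }
    if (prev == nullptr)
        l->head = target->next;     // removing the head
    else
        prev->next = target->next;  // removing any other node
}

// Good taste: an indirect pointer refers either to l->head or to some
// node's next field, so the head needs no special handling at all.
void remove_entry_v2(list *l, entry *target) {
    entry **indirect = &l->head;
    while (*indirect != target)
        indirect = &(*indirect)->next;
    *indirect = target->next;
}
```

The second version has one branch fewer to reason about, which is exactly the reduction in complexity the example is meant to show.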

For other people's projects, you cannot control the quality of the code. But long-term Code Review training helps you divide a piece of software into modules quickly. Usually you draw on your knowledge of the relevant domain, and of the design patterns common in similar software, to guess what modules it is likely to have. This step requires understanding of the domain, and should not get buried in implementation details. Starting out by running the software under a debugger to get a rough feel for its flow is a method I do not recommend: the field of view is too narrow, and a great deal of time is spent observing only one path. Nor do you have to read strictly top-down or bottom-up. You can guess at the module division from the file structure of the source tree, pick one module, read it, find the parts it connects to, and feel your way outward from there. For projects that need to be built, reading can even proceed while you wait for the first compilation to finish; there is no need to wait until the build succeeds and the program runs. You don't even need to download the code to your machine: GitHub's friendly web interface is already comfortable enough to read in a browser, and with an iPad you can read it lying in bed.

One reason I dislike C++ is that C++ code is hard to interpret when you start reading from somewhere in the middle. The literal text of the code can correspond to many different actual operations, so there is not enough certainty. Function overloading and operator overloading are invisible in the local code. Even when you see a variable name, without going to the surrounding context and the header files it is hard to tell whether it is a local variable or a class member (their scopes differ enormously, so the pruning strategy the brain uses during analysis is completely different). You may see a variable and take it for an input value, until the end of the function reveals that it can carry output, and looking back at the declaration you discover it is a reference parameter. With templates and generics it is worse still: even the data types are uncertain, and from the local code you cannot know what operations are actually involved until the template is instantiated. Reading a C++ project therefore keeps forcing you to cross-reference different parts of the code, which adds far too much burden on the brain.
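A small made-up example of the kind of ambiguity I mean (all names here are hypothetical):

```cpp
#include <string>

struct Parser {
    std::string token;  // member variable

    // Inside this body, "token" is the parameter shadowing the member;
    // a reader starting here cannot know that without checking the class.
    // And at a call site like parse(t, n), nothing reveals that count is
    // a reference, and therefore actually an output of the function.
    void parse(const std::string &token, int &count) {
        count = static_cast<int>(token.size());  // writes through the reference
    }
};
```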

So, is Code Review in the brain alone enough? If your ability were unlimited, I think it would be. As experience accumulates, the complexity of code I can read directly by eye keeps exceeding what I could manage the year before. But there is always a point where it is not enough, and then the best approach is to add log output as an auxiliary means. Think about it: what do we actually want to know when we use an interactive debugging tool? Nothing more than the execution path of the program (did it really run through here?), what the state of the variables was when the program ran here, and whether anything abnormal happened. Log output does exactly the same job. A line of log output on each key path expresses the execution path of the program; important variables written into the log let you query the program's state at that moment. How to output logs effectively naturally takes training. And don't worry too much about the performance cost of log output: a performance fluctuation of 20% in the final software is insignificant.
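A minimal sketch of what logging a key path together with its important variables can look like (the helper and all names are my own illustration, not any particular library):

```cpp
#include <cstdarg>
#include <cstdio>
#include <ctime>

// Minimal log helper: a timestamp plus a printf-style message.
// A real project would add levels, source locations, and a runtime switch.
static void log_msg(const char *fmt, ...) {
    std::fprintf(stderr, "[%ld] ", static_cast<long>(std::time(nullptr)));
    va_list ap;
    va_start(ap, fmt);
    std::vfprintf(stderr, fmt, ap);
    va_end(ap);
    std::fputc('\n', stderr);
}

static int handle_request(int id, int size) {
    log_msg("handle_request enter: id=%d size=%d", id, size);  // key path + state
    if (size <= 0) {
        log_msg("handle_request reject: bad size=%d", size);   // abnormal branch recorded
        return -1;
    }
    log_msg("handle_request ok: id=%d", id);
    return 0;
}

int main() {
    handle_request(1, 128);
    handle_request(2, -5);  // the log shows exactly which path rejected it, and why
    return 0;
}
```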

Compared with interactive debugging tools, logs are far better for backtracking and querying the past. As an aid to Code Review, what our brain really needs is only confirmation of its judgments: verify that the program runs along the path deduced in the head, and that the internal states are normal. And unlike a debugging tool, logging does not interrupt the running process, which matters even more for software whose parts run and interact in parallel, such as a C/S structure.

In fact, preserving state information is just as important when using interactive debugging tools. I believe many people are like me: when debugging a program they sometimes add temporary global variables and write intermediate states into them, then inspect those values during the interactive session. These temporary state containers are really acting as a log.
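For instance, a sketch of that habit (entirely my own illustration): a few temporary globals forming an in-memory log that the debugger can inspect at any time:

```cpp
#include <cstdio>

// Temporary debug globals: a tiny ring buffer of the last states seen.
// They can be inspected live from a debugger ("print g_last_ids" in gdb)
// and are, in effect, a log kept in memory.
constexpr int kDebugRing = 8;
static int g_last_ids[kDebugRing];
static int g_debug_pos = 0;

static void debug_record(int id) {
    g_last_ids[g_debug_pos++ % kDebugRing] = id;
}

static void process(int id) {
    debug_record(id);  // remember the state before anything can go wrong
    // ... real work ...
}

int main() {
    for (int i = 0; i < 20; i++)
        process(i);
    std::printf("last recorded id: %d\n", g_last_ids[(g_debug_pos - 1) % kDebugRing]);
    return 0;
}
```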

The advantage of text logs is that you can use text-processing tools to do a second round of information extraction on them. grep, awk, vim, Python, and Lua are all good means of analyzing logs; if the log is huge, or sits on a remote machine, you will hardly find anything more effective or faster. Much of the time, the cost of repeatedly re-running the program to reproduce a bug is higher than the cost of producing one detailed log and then analyzing it.

So, is it still worth learning to use interactive debugging tools? I think it is; they can still come in handy occasionally. This is especially true when a program crashes: you can attach to the process and observe its state at the moment of the crash. Most operating systems can also dump the state of a crashed process for later analysis. These tasks do require debugging tools. But tracing back from the faint clues in that static snapshot to what happened before the crash still demands a good enough understanding of the code itself. Since I don't spend much time on this, I think command-line GDB is enough. For analyzing damaged stack frames, or writing scripts to examine complex data structures, the command-line version is more flexible and applies more widely; its less convenient interaction and the extra learning cost are acceptable.