In this technique, one analyzes the code (static analysis) or traces generated by
running the code (dynamic analysis) to learn about the design, and indirectly about
how software engineers think and work. One might compare the programming or
architectural styles of several software engineers by analyzing their use of various
constructs, or the values of various complexity metrics.
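To make the comparison of constructs and complexity metrics concrete, the following is a minimal sketch of static analysis using Python's standard `ast` module. It is an illustration only, not a method described in this text: the function name `construct_profile`, the set of counted constructs, and the rough McCabe-style complexity figure (decision points plus one) are all assumptions chosen for the example.

```python
import ast
from collections import Counter

# Node types treated as decision points in this rough, McCabe-style count
# (an assumption made for the example, not a standard definition).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def construct_profile(source: str) -> dict:
    """Return construct counts and a rough complexity figure for one source text."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    counts = Counter(type(node).__name__ for node in nodes)
    branches = sum(isinstance(node, BRANCH_NODES) for node in nodes)
    return {
        "functions": counts["FunctionDef"],
        "classes": counts["ClassDef"],
        "loops": counts["For"] + counts["While"],
        "comprehensions": counts["ListComp"] + counts["DictComp"]
        + counts["SetComp"] + counts["GeneratorExp"],
        "complexity": branches + 1,
    }


# Two hypothetical programmers solving the same task in different styles.
sample_a = """
def total(xs):
    result = 0
    for x in xs:
        if x > 0:
            result += x
    return result
"""

sample_b = """
def total(xs):
    return sum(x for x in xs if x > 0)
"""

profile_a = construct_profile(sample_a)
profile_b = construct_profile(sample_b)
print(profile_a)
print(profile_b)
```

Comparing the two profiles shows one author favouring explicit loops and the other favouring comprehensions. A real study would aggregate such profiles over many files per programmer and would need to justify the choice of metrics.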
Advantages: The source code is usually readily available and contains a very large
amount of information ready to be mined.
Disadvantages: Extracting useful information from source code requires parsers
and other analysis tools, and we have found that such technology is not always
mature. Although parsers used in compilers are of high quality, the parsers needed
for certain kinds of analysis can be quite different; for example, they typically
need to analyze the code without it being pre-processed. We have developed some
techniques for dealing with this surprisingly difficult task (Somé and Lethbridge,
1998). In addition, when analyzing old legacy systems created by multiple
programmers over many years, it can be hard to tease apart the various independent
variables (programmers, activities, etc.) that give rise to different styles,
metrics, etc.
Examples: Keller et al. (1999) use static analysis techniques involving
template-matching to uncover design patterns in source code. They point out, “… that
it is these patterns of thought that are at the root of many of the key elements of
large-scale software systems, and that, in order to comprehend these systems, we need
to recover and understand the patterns on which they were built.”
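As a rough illustration of what structural template-matching over source code can look like, the sketch below searches Python classes for a Singleton-like shape. It is not the technique or tool of Keller et al.; the template (a class attribute initialised to `None` together with a classmethod that touches that attribute) and the names `looks_like_singleton` and `find_pattern_candidates` are hypothetical, chosen only for this example.

```python
import ast


def looks_like_singleton(cls: ast.ClassDef) -> bool:
    """Hypothetical template: a class attribute set to None, plus a
    classmethod that reads or writes an attribute with the same name."""
    none_attrs = set()
    for stmt in cls.body:
        if (isinstance(stmt, ast.Assign)
                and isinstance(stmt.value, ast.Constant)
                and stmt.value.value is None):
            none_attrs.update(t.id for t in stmt.targets if isinstance(t, ast.Name))

    for stmt in cls.body:
        if not isinstance(stmt, ast.FunctionDef):
            continue
        is_classmethod = any(
            isinstance(d, ast.Name) and d.id == "classmethod"
            for d in stmt.decorator_list
        )
        if not is_classmethod:
            continue
        # Attribute names accessed anywhere inside the method body.
        touched = {n.attr for n in ast.walk(stmt) if isinstance(n, ast.Attribute)}
        if none_attrs & touched:
            return True
    return False


def find_pattern_candidates(source: str) -> list:
    """Return the names of classes in `source` that match the template."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef) and looks_like_singleton(node)
    ]


sample = """
class Registry:
    _instance = None

    @classmethod
    def instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
"""

candidates = find_pattern_candidates(sample)
print(candidates)
```

A match only flags a candidate. A template this simple will produce false positives and miss variants, so a human analyst would still need to confirm each match.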
Williams et al. (2000) were interested in the value added by pair programming
over individual programming. As one of the measures in their experiment, they
looked at the number of test cases passed by pairs versus individual programmers.
They found that the pairs generated higher-quality code, as evidenced by a significantly
higher number of test cases passed.
Reporting guidelines: The documents (e.g. source code) that provide the basis for
the analysis should be carefully described. The nature of the processing on the data
also needs to be detailed. Additionally, any special processing considerations
should be described.