A Code Duplication Survey - Which Industry Has the Highest Technical Debt?

Duplicated code is the immediate product of cut & paste development, the most common form of software reuse to date.
By duplicating code, developers implement new functionality using already available code with similar specifications.
To this end, available code is first cloned and then slightly modified to meet new requirements.
Typical modifications include variable renaming, insertion/deletion of a small number of lines of code, and a different formatting of the source code.
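To make these modifications concrete, here is a minimal Python sketch of how a clone that differs only in variable names and formatting can still be recognized. It is an illustration, not the technique used by the survey. The function name `normalize` and the sample fragments are hypothetical. The sketch does not handle inserted or deleted lines, which need an approximate comparison rather than an exact one.

```python
import io
import keyword
import tokenize


def normalize(source: str) -> list[str]:
    """Reduce Python source to a token list that ignores naming and layout.

    Identifiers are replaced by the placeholder "ID"; comments, line breaks
    and indentation tokens are dropped. Keywords, operators and literals
    are kept as written.
    """
    skipped = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skipped:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            result.append("ID")  # variable renaming no longer matters
        else:
            result.append(tok.string)
    return result


# Hypothetical fragments: the clone is renamed and reformatted,
# the unrelated fragment uses a different operator.
original = "total = price * quantity\n"
clone = "sum_due   =   unit_cost*count  # copied and renamed\n"
unrelated = "total = price + quantity\n"

print(normalize(original) == normalize(clone))
print(normalize(original) == normalize(unrelated))
```

The first comparison treats the renamed, reformatted fragment as a clone of the original, while the second does not, because the operator differs.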
The main advantage of cut & paste development as a form of software reuse is the high coding productivity and the small intellectual strain on the developer.
However, in the long term, code duplication accumulates into a high technical debt for the company developing the software, leading to unmaintainable code.
A recent cross industry survey has tried to assess the size of the technical debt based on existing code duplication.
The trigger of this study was the high amount of duplication observed by some companies in the commercial embedded world.
Anonymous sources have reported up to 25% code duplication, which is rather surprising given the resource-constrained nature of the segment.
The study was performed on open source projects covering six segments: Office, Financial/Enterprise, Databases, Communication, Network/Embedded, and Software Development.
From each segment, three relevant projects have been selected for analysis based on popularity as reflected by SourceForge and Google.
The result of the analysis shows that the Financial/Enterprise industry segment has the highest technical debt.
With up to 35% duplication on average, the software projects in this segment run a very high risk of becoming unmaintainable.
The runner-up is the Network/Embedded segment with up to 15% duplication, a rather high figure for the field.
For the remaining segments, the detected amount of duplication is around 5%.
The open question of this study is: How does the commercial world compare with the open source arena? Arguably, open source projects are more concerned with technology development than with making products run in every possible context.
This may imply that more duplication is to be found in the commercial world, as a result of making the same functionality available in more contexts (e.g., porting parts of the software to other platforms).
This hypothesis is supported by the higher amount of duplication reported in the commercial embedded segment compared to the study results.
Consequently, the actual code duplication amount in the commercial Financial/Enterprise segment could be even higher than reported by the study.
In conclusion, code duplication can be a major cause of high technical debt and can lead to unmaintainable software.
Therefore, one should constantly monitor the amount of duplication and factor out common functionality when possible.
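One simple way to monitor duplication is sketched below in Python. This is an assumption-laden illustration, not the metric used by the survey: the function `duplication_ratio`, the window size of three lines, and the sample input are all hypothetical. It counts the share of non-blank lines that fall inside a run of consecutive lines occurring more than once.

```python
from collections import defaultdict


def duplication_ratio(lines: list[str], window: int = 3) -> float:
    """Return the fraction of non-blank lines covered by a repeated window.

    A window is `window` consecutive non-blank lines, compared after
    stripping leading and trailing whitespace. Every line inside a window
    that occurs at more than one position counts as duplicated.
    """
    code = [line.strip() for line in lines if line.strip()]
    if len(code) < window:
        return 0.0

    # Map each window of lines to the positions where it starts.
    positions = defaultdict(list)
    for start in range(len(code) - window + 1):
        positions[tuple(code[start:start + window])].append(start)

    # Collect the indices of all lines that belong to a repeated window.
    duplicated = set()
    for starts in positions.values():
        if len(starts) > 1:
            for start in starts:
                duplicated.update(range(start, start + window))

    return len(duplicated) / len(code)


# Hypothetical input: a three-line block pasted twice around one other line.
sample = """
a = read()
b = parse(a)
store(b)
log("first done")
a = read()
b = parse(a)
store(b)
""".splitlines()

ratio = duplication_ratio(sample)
print(f"{ratio:.0%} of lines are duplicated")
```

In the sample, six of the seven non-blank lines belong to the pasted block. Factoring that block into a single function called twice would bring the ratio to zero. A check like this only finds exact copies; combining it with identifier normalization would also catch renamed clones.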