Code bloat
Code bloat is the production of code that is perceived as unnecessarily long, slow, or otherwise wasteful of resources. Code bloat can be caused by inadequacies in the language in which the code is written, inadequacies in the compiler used to compile the code, or by a programmer. The term generally refers to source code size (as produced by the programmer), but is sometimes used to refer instead to the generated code size or even the binary file size.

Common Causes

Often, bloated code can result from a programmer who simply uses more lines of code than the optimal solution to a problem.

Some reasons for programmer-derived code bloat are:
  • overuse of object-oriented (OOP) constructs: classes and inheritance, when overused, can lead to messy and confusing designs that often take many more lines of code than an optimal solution.
  • incorrect usage of design patterns: OOP developers will often attempt to "force" design patterns as solutions to problems that do not need them
  • overuse of OOP methods/functions/procedures: breaking an algorithm up into many methods allows developers to reuse those methods to solve other problems. However, this often adds code bloat, makes the code difficult, if not impossible, to read and debug, and reduces algorithmic efficiency.
  • declarative programming: implementing a declarative programming style in an imperative or OOP language often leads to code bloat.
  • excessive loop unrolling that is not justified by improved performance
  • excessive use of multiple conditional if statements instead of, for instance, a lookup table
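
The last item in the list above can be illustrated with a minimal sketch. The month-name task and the names monthNameIfChain, monthNameLookup and MONTH_NAMES are hypothetical, chosen only for illustration. Both functions return the same result for every input, but the second replaces the chain of conditionals with a single array index.

```javascript
// Hypothetical example: map a month number (1-12) to a short name.

// Bloated form: one conditional per case.
function monthNameIfChain(month) {
    if (month === 1) { return 'Jan'; }
    if (month === 2) { return 'Feb'; }
    if (month === 3) { return 'Mar'; }
    if (month === 4) { return 'Apr'; }
    if (month === 5) { return 'May'; }
    if (month === 6) { return 'Jun'; }
    if (month === 7) { return 'Jul'; }
    if (month === 8) { return 'Aug'; }
    if (month === 9) { return 'Sep'; }
    if (month === 10) { return 'Oct'; }
    if (month === 11) { return 'Nov'; }
    if (month === 12) { return 'Dec'; }
    return undefined; // out of range
}

// Compact form: a lookup table replaces the chain of conditionals.
var MONTH_NAMES = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                   'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

function monthNameLookup(month) {
    // An out-of-range index yields undefined, matching the if chain.
    return MONTH_NAMES[month - 1];
}
```

Adding a thirteenth case to the first form means another statement; in the second form it means another table entry.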



Some naïve implementations of the template system employed in C++ are examples of inadequacies in the compiler used to compile the language. A naïve compiler implementing this feature can introduce a version of each method of a template class for every type it is used with. This in turn leads to compiled methods that may never be used, thus resulting in code bloat. More sophisticated compilers and linkers detect the superfluous copies and discard them, or avoid generating them at all, reducing the bloat. Thus template code can result in smaller binaries, because a compiler is allowed to discard this kind of dead code.


Some examples of native compiler derived bloat include:
  • dead code: code which is executed but whose result is never used.
  • redundant calculations: re-evaluating expressions that have already been calculated once. Such redundant calculations are often generated when implementing "bounds checking" code to prevent buffer overflow. Sophisticated compilers calculate such things exactly once, eliminating the subsequent redundant calculations, using common subexpression elimination and loop-invariant code motion.
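
The two optimizations named in the last item can be sketched by hand at the source level. The functions below are hypothetical and exist only for illustration; an optimizing compiler would apply the equivalent transformation to its internal representation rather than to source text, and no claim is made here about what any particular JavaScript engine does.

```javascript
// Common subexpression elimination (hypothetical example).
// Naive form: b * c is evaluated twice.
function pairNaive(b, c, g, e) {
    var a = b * c + g;
    var d = b * c * e;
    return [a, d];
}

// After elimination: the shared product is computed exactly once.
function pairEliminated(b, c, g, e) {
    var product = b * c;
    return [product + g, product * e];
}

// Loop-invariant code motion (hypothetical example).
// Naive form: width * height is re-evaluated on every iteration.
function scaleNaive(values, width, height) {
    var result = [];
    for (var i = 0; i < values.length; i++) {
        result.push(values[i] * (width * height));
    }
    return result;
}

// After hoisting: the invariant product moves in front of the loop.
function scaleHoisted(values, width, height) {
    var area = width * height; // does not depend on i
    var result = [];
    for (var i = 0; i < values.length; i++) {
        result.push(values[i] * area);
    }
    return result;
}
```

In both pairs the two versions return identical results; only the number of multiplications performed differs.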

Examples

The following JavaScript algorithm has a large number of redundant variables, unnecessary logic and inefficient string concatenation.


// Complex
function TK2getImageHTML(size, zoom, sensor, markers) {
    var strFinalImage = "";
    var strHTMLStart = '<img src="';
    var strHTMLEnd = '" alt="The map"/>';
    var strURL = "http://maps.google.com/maps/api/staticmap?center=";
    var strSize = '&size='+ size;
    var strZoom = '&zoom='+ zoom;
    var strSensor = '&sensor='+ sensor;

    strURL += markers[0].latitude;
    strURL += ",";
    strURL += markers[0].longitude;
    strURL += strSize;
    strURL += strZoom;
    strURL += strSensor;

    for (var i = 0; i < markers.length; i++) {
        strURL += markers[i].addMarker;
    }

    strFinalImage = strHTMLStart + strURL + strHTMLEnd;
    return strFinalImage;
};


The same logic can be stated more efficiently as follows:


// Simplified
TK2.getImageHTML = function(size, zoom, sensor, markers) {
    var url = [ 'http://maps.google.com/maps/api/staticmap',
        '?center=', markers[0].latitude, ',', markers[0].longitude,
        '&size=', size,
        '&zoom=', zoom,
        '&sensor=', sensor ];
    for (var i = 0; i < markers.length; i++) {
        url.push(markers[i].addMarker);
    }
    return '<img src="' + url.join('') + '" alt="The map" />';
};

Code density of different languages

The difference in code density between various computer languages is so great that often less memory is needed to hold both a program written in a "compact" language (such as a domain-specific programming language, Microsoft P-Code, or threaded code) and an interpreter for that compact language (written in native code), than to hold that program written directly in native code.
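
The arrangement described above, a compact program plus a shared interpreter, can be sketched as follows. The three-instruction stack language, the opcode names and the run function are all hypothetical, invented for this illustration; the sketch shows only the structure and does not measure memory use.

```javascript
// Hypothetical compact language: each instruction is one small integer,
// and OP_PUSH is followed by one operand.
var OP_PUSH = 0; // push the next value onto the stack
var OP_ADD = 1;  // pop two values, push their sum
var OP_MUL = 2;  // pop two values, push their product

// The interpreter is written once and shared by every compact program.
function run(program) {
    var stack = [];
    for (var pc = 0; pc < program.length; pc++) {
        switch (program[pc]) {
        case OP_PUSH:
            pc++; // step over the operand
            stack.push(program[pc]);
            break;
        case OP_ADD:
            stack.push(stack.pop() + stack.pop());
            break;
        case OP_MUL:
            stack.push(stack.pop() * stack.pop());
            break;
        default:
            throw new Error('unknown opcode ' + program[pc]);
        }
    }
    return stack.pop();
}

// (2 + 3) * 4 encoded as eight small integers.
var program = [OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL];
var result = run(program); // 20
```

The cost of the interpreter is paid once, while each additional program costs only its short instruction sequence.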

Performance implications

In many cases, when two programs implement the same functionality, the larger program will also run slower than the smaller program. There are, however, a few cases where there is a space-time tradeoff; in these cases, a larger program can run faster than a smaller one.
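
A minimal sketch of such a tradeoff, using hypothetical function names: the first version counts the set bits of a byte with a short loop, while the second spends 256 table entries so that each later query is a single array index. Whether the table version is actually faster depends on the engine and the hardware, so no timings are claimed here.

```javascript
// Smaller program: count the set bits of a byte with a loop.
function bitCountLoop(value) {
    var count = 0;
    for (var bits = value & 0xFF; bits !== 0; bits >>= 1) {
        count += bits & 1;
    }
    return count;
}

// Larger program: precompute all 256 answers once.
var BIT_COUNTS = [];
for (var n = 0; n < 256; n++) {
    BIT_COUNTS.push(bitCountLoop(n));
}

// Each query is now a single table lookup instead of a loop.
function bitCountTable(value) {
    return BIT_COUNTS[value & 0xFF];
}
```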

Reducing bloat

Some techniques for reducing code bloat include:
  • refactoring a commonly used code sequence into a subroutine and calling that subroutine from several locations, rather than copying and pasting the code at each of those locations
  • re-using subroutines that have already been written (perhaps with additional parameters), rather than rewriting them from scratch as a new routine
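
Both techniques can be sketched together. The formatting task and all function names below are hypothetical, chosen only for illustration: the "before" functions repeat the same sequence at two locations, and the "after" version moves that sequence into one subroutine with an additional parameter.

```javascript
// Before: the same code sequence is pasted at two locations.
function formatUserBefore(user) {
    var name = user.name.trim();
    name = name.charAt(0).toUpperCase() + name.slice(1);
    return 'User: ' + name;
}

function formatCityBefore(city) {
    var name = city.name.trim();
    name = name.charAt(0).toUpperCase() + name.slice(1);
    return 'City: ' + name;
}

// After: one shared subroutine; the label becomes an additional parameter.
function formatLabelled(label, item) {
    var name = item.name.trim();
    name = name.charAt(0).toUpperCase() + name.slice(1);
    return label + ': ' + name;
}

function formatUser(user) { return formatLabelled('User', user); }
function formatCity(city) { return formatLabelled('City', city); }
```

A later fix to the capitalization logic then has to be made in one place rather than in every pasted copy.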

See also

  • Overloading in Polymorphism (computer science)
  • Software bloat
  • minimalism (computing)
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 