Obfuscated code
Encyclopedia
Obfuscated code is source
Source code
In computer science, source code is text written using the format and syntax of the programming language that it is being written in. Such a language is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source...

 or machine code
Machine code
Machine code or machine language is a system of impartible instructions executed directly by a computer's central processing unit. Each instruction performs a very specific task, typically either an operation on a unit of data Machine code or machine language is a system of impartible instructions...

 that has been made difficult to understand for humans. Programmers may deliberately obfuscate
Obfuscation
Obfuscation is the hiding of intended meaning in communication, making communication confusing, wilfully ambiguous, and harder to interpret.- Background :Obfuscation may be used for many purposes...

 code to conceal its purpose (security through obscurity
Security through obscurity
Security through obscurity is a pejorative referring to a principle in security engineering, which attempts to use secrecy of design or implementation to provide security...

) or its logic to prevent tampering, deter reverse engineering
Reverse engineering
Reverse engineering is the process of discovering the technological principles of a device, object, or system through analysis of its structure, function, and operation...

, or as a puzzle
Puzzle
A puzzle is a problem or enigma that tests the ingenuity of the solver. In a basic puzzle, one is intended to put together pieces in a logical way in order to come up with the desired solution...

 or recreational challenge for someone reading the source code. Programs known as obfuscators transform readable code into obfuscated code using various techniques.

Overview

The architecture and characteristics of some languages may make them easier to obfuscate than others. C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

, C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

, and Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...

 are some examples of languages easy to obfuscate.

Recreational obfuscation

Writing and reading obfuscated source code can be a brain teaser
Brain teaser
A brain teaser is a form of puzzle that requires thought to solve. It often requires thinking in unconventional ways with given constraints in mind; sometimes it also involves lateral thinking. Logic puzzles and riddles are specific types of brain teasers....

 for programmers. A number of programming contests reward the most creatively obfuscated code: the International Obfuscated C Code Contest
International Obfuscated C Code Contest
The International Obfuscated C Code Contest is a programming contest for the most creatively obfuscated C code. It was held annually between 1984 and 1996, and thereafter in 1998, 2000, 2001, 2004, 2005 and 2006....

, Obfuscated Perl Contest
Obfuscated Perl contest
The Obfuscated Perl Contest was a competition for programmers of Perl which was held annually between 1996 and 2000. Entrants to the competition aimed to write "devious, inhuman, disgusting, amusing, amazing, and bizarre Perl code"...

, and International Obfuscated Ruby Code Contest.

Types of obfuscations include simple keyword substitution, use or non-use of whitespace to create artistic effects, and self-generating or heavily compressed programs.

Short obfuscated Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...

 programs may be used in signatures
Signature block
A signature block is a block of text automatically appended at the bottom of an e-mail message, Usenet article, or forum post. This has the effect of "signing off" the message and in a reply message of indicating that no more response follows...

 of Perl programmers. These are JAPHs ("Just another Perl hacker
Just another Perl hacker
Just another Perl hacker, or JAPH, typically refers to a Perl program which prints "Just another Perl hacker," . Short JAPH programs are often used as signatures in online forums, or as T-shirt designs...

").

Examples

This is a winning entry from the International Obfuscated C Code Contest written by Ian Phillipps in 1988 and subsequently reverse engineered by Thomas Ball.


/*
LEAST LIKELY TO COMPILE SUCCESSFULLY:
Ian Phillipps, Cambridge Consultants Ltd., Cambridge, England
  1. include

main(t,_,a)
char
a;
{
return!

0 t<3?

main(-79,-13,a+
main(-87,1-_,
main(-86, 0, a+1 )

+a)):

1,
t<_?
main(t+1, _, a )

main ( -94, -27+t, a )
&&t 2 ?_
<13 ?

main ( 2, _+1, "%s %d %d\n" )
9:16:

t<0?
t<-72?
main( _, t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l,+,/n{n+,/+#n+,/#;\
  1. q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/+k#;\

q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# ){nl]!/n{n#'; \
r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#\
\
n'wk nw' iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c ;;\
{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;\
  1. 'rdq#w! nr'/ ') }+}{rl#'{n' ')# }'+}##(!!/")

t<-50?
_*a ?
putchar(31[a]):

main(-65,_,a+1)
main((*a

'/') + t, _, a + 1 )

0
main ( 2, 2 , "%s")
  • a

'/'||

main(0,

main(-61,*a, "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry")

,a+1);}


It is a C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 program that when compiled and run will generate the 12 verses of The 12 Days of Christmas
The Twelve Days of Christmas (song)
"The Twelve Days of Christmas" is an English Christmas carol that enumerates a series of increasingly grand gifts given on each of the twelve days of Christmas. Although first published in England in 1780, textual evidence may indicate the song is French in origin...

. It contains all the strings required for the poem in an encoded form within the code.

A non-winning entry from the same year, the next example illustrates creative use of whitespace; it generates mazes of arbitrary length:


char*M,A,Z,E=40,J[40],T[40];main(C){for(*J=A=scanf(M="%d",&C);
-- E; J[ E] =T
[E ]= E) printf("._"); for || (printf("\n|"
) , A = 39 ,C --
) ; Z || printf (M ))M[Z]=Z[A-(E =A[J-Z])&&!C
& A T[ A]
|6<<27
Modern C compilers don't allow constant strings to be overwritten, which can be avoided by changing "*M" to "M[3]" and omitting "M=".

The following example by Óscar Toledo Gutiérrez, Best of Show entry in the 19th IOCCC, implements a 8080 emulator complete with terminal and disk controller, capable of booting CP/M-80 and running CP/M applications,

  1. include

#define n(o,p,e)=y=(z=a(e)%16 p x%16 p o,a(e)p x p o),h(
#define s 6[o]
#define p z=l[d(9)]|l[d(9)+1]<<8,1<(9[o]+=2)||++8[o]
#define Q a(7)
#define w 254>(9[o]-=2)||--8[o],l[d(9)]=z,l[1+d(9)]=z>>8
#define O )):((
#define b (y&1?~s:s)>>"\6\0\2\7"[y/2]&1?0:(
#define S )?(z-=
#define a(f)*((7&f)-6?&o[f&7]:&l[d(5)])
#define C S 5 S 3
#define D(E)x/8!=16+E&198+E*8!=x?
#define B(C)fclose((C))
#define q (c+=2,0[c-2]|1[c-2]<<8)
#define m x=64&x?*c++:a(x),
#define A(F)=fopen((F),"rb+")
unsigned char o[10],l[78114],*c=l,*k=l
#define d(e)o[e]+256*o[e-1]
  1. define h(l)s=l>>8&1|128&y|!(y&255)*64|16&z|2,y^=y>>4,y^=y<<2,y^=~y>>1,s|=y&4

+64506; e,V,v,u,x,y,z,Z; main(r,U)char**U;{

{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { ; } } { { { } } } { { ; } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }
{ { { } } } { { { } } } { { { } } } { { { } } }

for(v A((u A((e A((r-2?0:(V A(1[U])),"C")
),system("stty raw -echo min 0"),fread(l,78114,1,e),B(e),"B")),"A")); 118-(x*c++); (y=x/8%8,z=(x&199)-4 S 1 S 1 S 186 S 2 S 2 S 3 S 0,r=(y>5)*2+y,z=(x&
207)-1 S 2 S 6 S 2 S 182 S 4)?D(0)D(1)D(2)D(3)D(4)D(5)D(6)D(7)(z=x-2 C C C C
C C C C+129 S 6 S 4 S 6 S 8 S 8 S 6 S 2 S 2 S 12)?x/64-1?((0 O a(y)=a(x) O 9
[o]=a(5),8[o]=a(4) O 237*c++?((int (*))(2-*c++?fwrite:fread))(l+*k+1[k]*
256,128,1,(fseek(y=5[k]-1?u:v,((3[k]|4[k]<<8)<<7|2[k])<<7,Q=0),y)):0 O y=a(5
),z=a(4),a(5)=a(3),a(4)=a(2),a(3)=y,a(2)=z O c=l+d(5) O y=l[x=d(9)],z=l[++x]
,x[l]=a(4),l[--x]=a(5),a(5)=y,a(4)=z O 2-*c?Z||read(0,&Z,1),1&*c++?Q=Z,Z=0:(
Q=!!Z):(c++,Q=r=V?fgetc(V):-1,s=s&~1|r<0) O++c,write(1,&7[o],1) O z=c+2-l,w,
c=l+q O p,c=l+z O c=l+q O s^=1 O Q=q[l] O s|=1 O q[l]=Q O Q=~Q O a(5)=l[x=q]
,a(4)=l[++x] O s|=s&16|9159?Q+=96,1:0,y=Q,h(s<<8)
O l[x=q]=a(5),l[++x]=a(4) O x=Q%2,Q=Q/2+s%2*128,s=s&~1|x O Q=l[d(3)]O x=Q /
128,Q=Q*2+s%2,s=s&~1|x O l[d(3)]=Q O s=s&~1|1&Q,Q=Q/2|Q<<7 O Q=l[d(1)]O s=~1
&s|Q>>7,Q=Q*2|Q>>7 O l[d(1)]=Q O m y n(0,-,7)y) O m z=0,y=Q|=x,h(y) O m z=0,
y=Q^=x,h(y) O m z=Q*2|2*x,y=Q&=x,h(y) O m Q n(s%2,-,7)y) O m Q n(0,-,7)y) O
m Q n(s%2,+,7)y) O m Q n(0,+,7)y) O z=r-8?d(r+1):s|Q<<8,w O p,r-8?o[r+1]=z,r
[o]=z>>8:(s=~40&z|2,Q=z>>8) O r[o]--||--o[r-1]O a(5)=z=a(5)+r[o],a(4)=z=a(4)
+o[r-1]+z/256,s=~1&s|z>>8 O ++o[r+1]||r[o]++O o[r+1]=*c++,r[o]=*c++O z=c-l,w
,c=y*8+l O x=q,b z=c-l,w,c=l+x) O x=q,b c=l+x) O b p,c=l+z) O a(y)=*c++O r=y
,x=0,a(r)n(1,-,y)s<<8) O r=y,x=0,a(r)n(1,+,y)s<<8))));
system("stty cooked echo"); B((B((V?B(V):0,u)),v)); }


An example of a JAPH
Just another Perl hacker
Just another Perl hacker, or JAPH, typically refers to a Perl program which prints "Just another Perl hacker," . Short JAPH programs are often used as signatures in online forums, or as T-shirt designs...

:


@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print


This slowly displays the text "Just another Perl / Unix hacker", multiple characters at a time, with delays. An explanation can be found here.

Some Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

 examples can be found in the official Python programming FAQ.

Disadvantages of obfuscation

At best, obfuscation merely makes it time-consuming, but not impossible, to reverse engineer a program. In Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

 it also limits the use of the Reflection API on the obfuscated code.

Obfuscating software

A variety of tools exists to perform or assist with code obfuscation.
These include experimental research tools created by academics, hobbyist tools,
commercial products written by professionals, and open-source software
Open-source software
Open-source software is computer software that is available in source code form: the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, improve and at times also to distribute the software.Open...

.
There also exist deobfuscation tools that attempt to perform the reverse
transformation.

Although the majority of commercial obfuscation solutions work by transforming
either program source
code, or platform-independent bytecode as used by
Java and
.NET, there are also some that work with C and
C++ - languages that are typically compiled to native code.

See also

  • AARD code
    AARD code
    The AARD code was a segment of obfuscated machine code that is included in several executables, including the installer and WIN.COM, in a beta release of Microsoft Windows 3.1. It was a block of code which was XOR encrypted, self-modifying, and deliberately obfuscated, using various undocumented...

  • ActionScript code protection
    ActionScript code protection
    ActionScript code protection. ActionScript is the main language for developing flash products.Code obfuscation is the process of transforming code into a form that is unintelligible to human...

  • Decompilation
  • Esoteric programming language
    Esoteric programming language
    An esoteric programming language is a programming language designed as a test of the boundaries of computer programming language design, as a proof of concept, or as a joke...

  • Quine
  • Polymorphic code
    Polymorphic code
    In computer terminology, polymorphic code is code that uses a polymorphic engine to mutate while keeping the original algorithm intact. That is, the code changes itself each time it runs, but the function of the code will not change at all...

  • Hardware obfuscation
    Hardware obfuscation
    Hardware obfuscation is a technique by which the description or the structure of electronic hardware is modified to intentionally conceal its functionality, which makes it significantly more difficult to reverse-engineer...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK