Computer Virus-Antivirus Coevolution
The battle to conquer computer viruses is far from won,
but new and improved antidotes are controlling the field.
Carey Nachenberg
January 1997/Vol. 40, No. 1 COMMUNICATIONS OF THE ACM

As recently as six years ago, computer viruses were
considered an urban myth by many. At the time, only a
handful of PC viruses had been written and infection was
relatively uncommon. Today the situation is very different.
As of November 1996, virus writers have programmed more
than 10,000 DOS-based computer viruses.

In addition to the sheer increase in the number of
viruses, the virus writers have also become more clever.
Their newer creations are significantly more complex and
difficult to detect and remove. These “improvements” can
be at least partially attributed to the efforts of antivirus
producers. As antivirus products improve and detect the
“latest and greatest” viruses, the virus authors invent new
and more devious ways to hide their progeny.

This coevolution has led to the creation of the most
complex class of virus to date: the polymorphic computer
virus. The polymorphic virus avoids detection by mutating
itself each time it infects a new program; each mutated
infection is capable of performing the same tasks as its
parent, yet it may look entirely different.

These cunning viruses simply cannot be detected
cost-effectively using traditional antivirus scanning
algorithms. Fortunately, the antivirus producers have
responded, as they have in the past, with an equally creative
solution to the polymorphic virus threat. Many antivirus
programs are now starting to employ a technique known as
generic decryption to detect even the most complex
polymorphic viruses quickly and cost-effectively.

A computer virus is a self-replicating computer program
that spreads by attaching itself to executable files or
system areas on diskettes. Recently, we have also
encountered a new type of virus that infects application
data files that contain macros. These viruses are
constructed entirely of application macros and use the
macro language to propagate themselves.
In addition to their ability to replicate, some computer
viruses also deliver a
payload
—a portion of the virus pro-
gram that is designed to damage the host machine, display
a message, or do some other mischief without the computer
operator’s consent. This article focuses primarily on how
computer viruses replicate and obscure themselves.
The vast majority of computer viruses have been
designed specifically for IBM-based PCs running the DOS
and Windows operating systems. In terms of sophistication
and functionality, these DOS-based viruses are generations
ahead of viruses written for other operating systems and
platforms. Consequently, this article examines how the
antivirus community has tackled these DOS-based viruses.
Nonetheless, the concepts presented apply for viruses
found on all operating systems and computer platforms.
The first file-infecting viruses were simple machine lan-
guage programs that had the ability to attach and spread
identical copies of themselves from program to program.
When the user executed an infected program, the virus
would take control of the computer and infect additional
files. After the virus completed its mischief, it would trans-
fer control to the host program and allow it to function
normally. This type of virus is called a “parasitic” computer
virus, since it does not kill its host; instead, the host acts as
a carrier.
Since these viruses replicated identical copies of their
machine code and data each time they infected a new pro-
gram, they proved easy for antivirus products to detect.
Every time antivirus researchers received a new virus,
they would analyze the virus and identify a sequence of
machine-language instructions that was both unique to
the virus and present in every one of its infections; this
sequence of bytes is known as a virus signature. The
researcher could then insert this signature into the
antivirus program so that it could search for the virus.
The first antivirus programs were glorified string-
searching programs. The antivirus program would open
each executable file on the computer and scan its entire
contents for a current set of virus signatures. If any of the
signatures were found anywhere in the target program, the
antivirus program would report the infection to the user. If
none of the signatures were found, the antivirus program
would report that the file was uninfected.
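The string-searching approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any real product: the signature names and byte patterns in `SIGNATURES` are invented, and `scan_bytes` is a hypothetical helper.

```python
from typing import Dict, List

# Hypothetical signature table: the names and byte patterns are invented
# for illustration and do not correspond to real viruses.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": b"\x90\x90\xeb\xfe\x13\x37",
}


def scan_bytes(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures found anywhere in `data`."""
    return [name for name, pattern in signatures.items() if pattern in data]


# A stand-in for an uninfected file image, and the same image with a
# signature-bearing blob appended to it.
clean = b"MZ" + bytes(64)
infected = clean + SIGNATURES["Example.B"] + bytes(8)

print(scan_bytes(clean))
print(scan_bytes(infected))
```

Every signature is compared against the whole file, so the cost grows with both the number of signatures and the file size, which is the speed problem discussed next.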
This detection technique worked well for quite a while.
However, as the number of viruses steadily increased,
antivirus programs had to become more clever in order to
run at reasonable speeds. After all, scanning for hundreds
of virus signatures through hundreds of kilobytes of exe-
cutable programs was a very slow process, especially on
4.77 megahertz computers! These speed problems led to
one of the great innovations in early antivirus programs.
Most computer viruses, including many
of today’s complex viruses, either append
themselves onto the end or place them-
selves at the beginning of their host exe-
cutable file. Furthermore, almost all
computer viruses are less than 4KB in
length. Antivirus researchers recognized these patterns and
optimized their virus scanners accordingly. The result of
this improvement was antivirus programs that could detect
the vast majority of file viruses by scanning less than 8KB
of each executable program. This technique dramatically
increased the speed and efficiency of antivirus products and
is still being used today in some products.
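As a rough sketch of this optimization, the scanner below looks only at the first and last 4KB of a file image. The 4KB window follows the figure quoted above; the function name, the signature table, and the sample data are assumptions made for the example.

```python
from typing import Dict, List

WINDOW = 4 * 1024  # assumed window size, taken from the 4KB figure in the text

# Invented signature, for illustration only.
SIGS: Dict[str, bytes] = {"Example.A": bytes.fromhex("deadbeef0102")}


def head_tail_scan(data: bytes, signatures: Dict[str, bytes],
                   window: int = WINDOW) -> List[str]:
    """Search only the first and last `window` bytes of a file image."""
    if len(data) <= 2 * window:
        regions = [data]  # small file: the two windows cover everything
    else:
        regions = [data[:window], data[-window:]]
    return [name for name, pattern in signatures.items()
            if any(pattern in region for region in regions)]


appended = bytes(20000) + SIGS["Example.A"]                # blob at the end of the file
buried = bytes(10000) + SIGS["Example.A"] + bytes(10000)   # blob in the middle

print(head_tail_scan(appended, SIGS))
print(head_tail_scan(buried, SIGS))
```

The second call shows the trade-off: a pattern that sits outside both windows is never examined, which is acceptable only because so few viruses place themselves there.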
While these changes significantly improved the
speed of the virus scanner, even more optimizations were
possible. For example, almost all file-infecting viruses
infect at the entry-point of their host program. The entry-
point of a program is the location in the program of the
first machine-language instruction that is executed when
the program is launched.
As an analogy, consider a notepad with a list of instruc-
tions on how to complete a certain task, such as sending a
fax. (See Diagram 1A.) In order to send a fax, a person can
simply follow the instructions on the pad from start to fin-
ish. Computers interpret programs, which are basically
sequences of simple instructions, in much the same way. In
our notepad example, the entry-point of the notepad is the
line that contains the first instruction in the list. This
would be the first instruction that a human would look at
when they start on their task.
When a virus infects a new program, it places itself at
the entry-point of the host program or modifies the
machine-language instructions at the host’s entry-point to
transfer control to its virus body. In the latter case, the virus
body is usually located at the end of the host program. (See
Diagram 1B.)
When the user executes a program, the operating sys-
tem loads the program into the computer’s RAM and then
instructs the CPU to start executing the instructions at the
program’s entry-point. Viruses infect at the entry-point of
their host because it guarantees that when the user executes
a program, the virus is immediately given control of the
computer.
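To make the modified entry-point concrete, the notepad programs of Diagram 1 can be modeled as numbered steps, and the chain of "skip" instructions can be followed from the entry-point to the first step that does real work. This is a toy model under stated assumptions: the dictionaries restate the diagram, while `jump_target`, `first_executed_step`, and the hop limit are hypothetical names and choices.

```python
import re
from typing import Dict, Optional

# The notepad "programs" from Diagram 1, keyed by line number.
BEFORE: Dict[int, str] = {
    1: "Insert document in fax machine.",
    2: "Dial the phone number.",
    3: "Hit the SEND button on the fax.",
    4: "Wait for completion. If a problem occurs, go back to step 1.",
    5: "End task.",
}
AFTER: Dict[int, str] = {
    **BEFORE,
    1: "Skip to step 6.",                  # entry-point overwritten by the virus
    6: "VIRUS instructions",
    7: "VIRUS instructions",
    8: "VIRUS instructions",
    9: "Insert document in fax machine.",  # original step 1, stored by the virus
}


def jump_target(instruction: str) -> Optional[int]:
    """Return the destination of an unconditional 'Skip to step N.' line."""
    match = re.fullmatch(r"Skip to step (\d+)\.", instruction)
    return int(match.group(1)) if match else None


def first_executed_step(program: Dict[int, str], entry: int = 1,
                        max_hops: int = 16) -> int:
    """Follow skips from the entry-point to the first non-skip instruction."""
    location = entry
    for _ in range(max_hops):  # hop limit guards against skip loops
        target = jump_target(program[location])
        if target is None:
            return location
        location = target
    raise ValueError("too many consecutive skips")


print(first_executed_step(BEFORE))
print(first_executed_step(AFTER))
```

In the uninfected notepad, execution starts doing work at line 1; in the infected one, control lands on the first viral line.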
Since most computer viruses were found to infect or
modify the entry-point of a host program, antivirus authors
realized they could make their AV programs even faster.
They redesigned their scanners to scrutinize only the
instructions that can be reached directly from the
entry-point in each executable program. This type of honed
search is known as entry-point scanning. The entry-point
scanner works in the following manner:

1. Establish a variable E that contains the target program’s
entry-point location.
2. The entry-point scanner examines the machine-language
instruction at location E in the suspected program.
3. If this instruction transfers control to another instruction
(as in Diagram 1B), then set E to the location of the
destination instruction and go back to step 2.
4. Search the bytes at location E for virus signatures.

Therefore, if this algorithm were applied to the program in
Diagram 1A, the entry-point scanner would finish with an
E value of 1, and search for viruses starting at the very first
instruction. If applied to the program in Diagram 1B, the
entry-point scanner would finish with an E value of 6, and
start searching for viruses at the first viral instruction.

Clearly this technique greatly reduces the number of
bytes that must be searched for viruses and dramatically
improves antivirus performance. This entry-point
scanning technique continues to be used by many
popular antivirus programs.

A. Before
1  Insert document in fax machine. (Program entry-point)
2  Dial the phone number.
3  Hit the SEND button on the fax.
4  Wait for completion. If a problem occurs, go back to step 1.
5  End task.
6
7
8
9

B. After
1  Skip to step 6. (Virus modified entry-point, transfers control to
   the virus body on line 6.)
2  Dial the phone number.
3  Hit the SEND button on the fax.
4  Wait for completion. If a problem occurs, go back to step 1.
5  End task.
6  VIRUS instructions
7  VIRUS instructions
8  VIRUS instructions
9  Insert document in fax machine. (Stored by the virus)

Diagram 1. Notepad example before and after infection
at the entry-point

Unfortunately, these early advances were soon
countered by the virus writers, who realized they
could make their viruses more difficult to detect if
the viruses did not spread exact copies of themselves
from file to file. This led to the creation of the
encrypted virus.

The encrypted virus consists of two parts: a small
program known as the virus decryption routine, and
an encrypted virus body. The virus decryption routine
is composed of a sequence of machine-language
instructions that can decrypt the encrypted virus
body. The encrypted virus body is just an encrypted
version of the same virus machine-language
instructions used by the simple viruses described earlier.

If a user executes an infected program, the virus
decryption routine immediately executes and
decrypts the machine-language instructions that
comprise the rest of the virus. The decryption
routine then transfers control to the newly decrypted
virus body and allows the virus to execute normally. As you
can see, the virus body is visible only when the virus is
executing and in memory.

When the virus infects a new program, it re-encrypts a
copy of the virus body (using a complementary encryption
program that is also carried along in the virus body) and
appends this onto the host program. It also appends the
decryption routine onto the host program. Most encrypting
viruses use a different encryption key each time they
infect a new program; consequently, the encrypted virus
body appears different in each infection. This means the
virus decryption program constitutes the only consistent,
visible sequence of instructions that is passed from
infection to infection. (See Diagram 2.)

The encrypted virus posed new problems for antivirus
researchers. In the past, we could detect viruses by
searching for virus signatures extracted from the virus body.
Clearly, this is not an option with the encrypted virus since
it doesn’t maintain a consistent virus body from infection
to infection. However, the encrypted virus does propagate
an identical copy of its small decryption routine from
infection to infection. Could this be enough to detect the virus?

As it turns out, the answer is “yes” in most cases. The
decryption routines used by most encrypting viruses are
often unique. Most of them are made up of at least 10 to
15 distinct bytes. In fact, many antivirus researchers can
look at a disassembly of a virus decryption routine and
immediately identify the virus’ strain without decrypting
and examining the rest of the virus!

Because of this, we were able to use the same virus
signature scanning technique to detect many of the
encrypting viruses. However, there was one drawback: most
virus decryption routines were compact machine-language
programs. Consequently, our virus detection
signatures for these viruses were also short, and as
signatures got smaller, the likelihood of false
identification and misidentification increased.
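A back-of-the-envelope estimate shows why short signatures are riskier. The sketch below assumes, purely for illustration, that the scanned bytes are uniformly random and independent; real program code is far from random, so actual collision rates are higher than these figures. The function name is hypothetical.

```python
def expected_chance_matches(signature_len: int, scanned_bytes: int) -> float:
    """Expected number of accidental matches of one fixed signature in
    `scanned_bytes` of uniformly random data (an idealized assumption)."""
    positions = max(scanned_bytes - signature_len + 1, 0)
    return positions / 256 ** signature_len


# Compare signature lengths over an 8KB scan window.
for length in (2, 4, 10, 15):
    print(length, expected_chance_matches(length, 8 * 1024))
```

Under this model each extra signature byte cuts the chance of an accidental match by a factor of 256, so a signature drawn from a very compact decryption routine has much less margin than one drawn from a full virus body.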
False identification occurs when an antivirus
program incorrectly identifies an uninfected pro-
gram as being infected. Misidentification occurs
when an antivirus program correctly identifies that
a program is infected, yet improperly identifies the
strain of the infection. Both of these results are
undesirable. If an antivirus product mistakenly
identifies a program as being infected, significant
time and money must be spent to determine
whether or not the infection is legitimate.
In order to solve these problems, we made one
additional improvement to our virus scanner. We
modified it so that it could perform a secondary ver-
ification when it located a questionable virus
decryption routine signature. The scanner would
attempt to decrypt the contents of the would-be
virus with the assumption the virus employed one
of a number of simple encryption techniques. As it
turns out, a high percentage of encrypted viruses use
easily decryptable encryption schemes, so this veri-
fication was often successful.
By performing this secondary verification, the antivirus
program could definitively identify whether or not the pro-
gram in question was infected by a virus. This technique
has been named “x-raying,” since the antivirus program
looks through the encryption at the insides of the virus.
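The secondary verification can be sketched as a brute-force search over the keys of one simple cipher. Here the "simple encryption technique" is assumed to be a single-byte XOR, chosen only because it is easy to show; `xor_bytes`, `xray`, and the sample signature are hypothetical.

```python
from typing import Optional


def xor_bytes(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR to every byte; XOR is its own inverse."""
    return bytes(b ^ key for b in data)


def xray(region: bytes, body_signature: bytes) -> Optional[int]:
    """Try every single-byte XOR key on the suspect region.

    Returns the key under which `body_signature` appears in the decrypted
    bytes, or None if no key reveals it.
    """
    for key in range(256):
        if body_signature in xor_bytes(region, key):
            return key
    return None


# Stand-in for a signature taken from a decrypted virus body.
BODY_SIGNATURE = b"VIRUS instructions"
suspect = xor_bytes(b"\x00\x01" + BODY_SIGNATURE + b"\x02\x03", 0x5A)

print(BODY_SIGNATURE in suspect)      # the body is hidden in the raw bytes
print(xray(suspect, BODY_SIGNATURE))  # the key that exposes it, if any
```

Because the check matches a long body signature rather than a short decryptor signature, a hit is much stronger evidence of infection and of the exact strain.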
These encrypted viruses pose few problems for modern
antivirus programs; they can be detected quickly and easily.
Today, however, the polymorphic virus provides huge
headaches for antivirus researchers.
A. Encrypted with a key value of 1 (see line 6). Notice how lines 7, 8
and 9 are encrypted; compare with Diagram 1B.
1  Skip to step 6.
2  Dial the phone number.
3  Hit the SEND button on the fax.
4  Wait for completion. If a problem occurs, go back to step 1.
5  End task.
6  Starting at line 7, shift back each letter by one. B goes
   to A, T goes to S, etc. (Virus decryption loop)
7  WJSVT jotusvdujpot (Encrypted "VIRUS instructions")
8  WJSVT jotusvdujpot (Encrypted "VIRUS instructions")
9  Jotfsu epdvnfou jo gby nbdijof. (Encrypted "Insert document in
   fax machine.")

B. Encrypted with a key value of 2 (see line 6).
1  Skip to step 6.
2  Dial the phone number.
3  Hit the SEND button on the fax.
4  Wait for completion. If a problem occurs, go back to step 1.
5  End task.
6  Starting at line 7, shift back each letter by two. C goes to
   A, U goes to S, etc. (Virus decryption loop)
7  XKTWU kpuvtwevkqpu (Encrypted "VIRUS instructions")
8  XKTWU kpuvtwevkqpu (Encrypted "VIRUS instructions")
9  Kpugtv fqewogpv kp hcz ocejkpg. (Encrypted "Insert document
   in fax machine.")

Diagram 2. The encrypted virus, two different infections
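The letter-shift scheme used in Diagram 2 is easy to reproduce. The sketch below is only a model of the notepad analogy, with a hypothetical `shift_text` helper; it shows how one unchanging body produces different text under each key, and how shifting back restores it.

```python
import string


def shift_text(text: str, key: int) -> str:
    """Shift each ASCII letter forward by `key` places, wrapping within its
    case; a negative key shifts back. Other characters are left alone."""
    shifted = []
    for ch in text:
        if ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        else:
            shifted.append(ch)
    return "".join(shifted)


BODY = "VIRUS instructions"
infection_a = shift_text(BODY, 1)  # encrypted with key 1
infection_b = shift_text(BODY, 2)  # encrypted with key 2

print(infection_a)
print(infection_b)
print(shift_text(infection_a, -1))  # the "decryption loop" for key 1
```

A scanner that searches for the plain body text finds it in neither infection, while the instruction on line 6 of the notepad stays nearly identical from copy to copy.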
The Polymorphic Virus

With biological viruses, mutations overwhelmingly result
in nonviable offspring. However, because of their sheer
numbers, at least some of the mutated offspring are
successful. Computer viruses cannot afford to play this type of
Russian roulette.

If a simple computer virus were to propagate randomly
mutated copies of itself, the odds are the mutated children
would fail to exhibit virus-like properties. In the most
likely scenario, a mutated child virus would cause its host
program to crash any time it was executed. This would
immediately reveal the virus to the user and limit its
ability to successfully reproduce.

Because of these realities, current polymorphic
computer viruses do not mutate at all; instead, they have
specially designed mutation engines that simulate the
process of mutation.

The polymorphic virus is basically an encrypted virus
with a twist. Recall the simple encrypted virus carries an
unchanging machine-language decryption routine from host
to host. An antivirus product can be programmed to search
for the static bytes that comprise this decryption routine.

To address this problem, the polymorphic virus uses its
mutation engine to generate a new decryption routine each
time it infects a new program. The newly constructed
decryption routine is the same functionally as those found
in other infections. However, the sequence of instructions
that comprise this routine may be entirely different.

The mutation engine also generates a complementary
encryption routine that is used to encrypt the static portion
of the virus before it is attached to a new target file. Once a
copy of the virus body has been encrypted (using this
complementary encryption routine), the virus appends the
newly generated decryption routine along with the
encrypted virus body onto the target executable file. Thus,
not only is the virus body encrypted, but the virus
decryption routine uses a different sequence of machine-language
instructions in each infected program. Since the virus
decryption routine varies from infection to infection and
the rest of the virus is encrypted, antivirus programs
cannot detect the virus by searching for fixed signatures.

The virus mutation engine is a complex program. In
fact, most mutation engines are far more complex than
their accompanying virus. The mutation engine must be
able to produce seemingly random programs that can
properly perform decryption. Some of the more complex
mutation engines can generate billions of billions of billions of
different decryption routines, making simple signature
detection impossible.
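The idea of many different routines with one net effect can be illustrated in the notepad analogy. The toy below is not how a real mutation engine works (those emit machine code); it only assumes that a "decryption routine" is a list of letter shifts, and every name in it is hypothetical.

```python
import random
from typing import List


def shift_text(text: str, key: int) -> str:
    """Shift ASCII letters by `key` places, wrapping within their case."""
    def shift(ch: str) -> str:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            return chr((ord(ch) - base + key) % 26 + base)
        return ch
    return "".join(shift(ch) for ch in text)


def make_decryptor(key: int, rng: random.Random, steps: int = 4) -> List[int]:
    """Pick random shifts, then one final shift so that the total undoes
    an encryption shift of `key` (mod 26)."""
    shifts = [rng.randrange(26) for _ in range(steps - 1)]
    shifts.append((-key - sum(shifts)) % 26)
    return shifts


def run_decryptor(shifts: List[int], text: str) -> str:
    """Apply each shift in turn, as a decryption loop would."""
    for amount in shifts:
        text = shift_text(text, amount)
    return text


rng = random.Random(0)
encrypted = shift_text("VIRUS instructions", 5)
for _ in range(3):
    routine = make_decryptor(5, rng)
    print(routine, "->", run_decryptor(routine, encrypted))
```

With four steps this toy already has 26**3 = 17,576 distinct routines that all decrypt the same body, so no single fixed description of "the decryptor" matches every copy.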
While many polymorphic viruses can be detected using
augmented versions of the entry-point and head-tail scan-
ning algorithms, until recently, antivirus researchers had no
efficient way to detect the more complex strains. In the past,
researchers wrote specialized detection programs in C,
assembly language, or special virus detection script
languages to catch these viruses. These detection programs
examined the machine code at the entry-point of each file.
If they found sequences of instructions that resembled those
generated by a given mutation engine, they would report that
the file was infected.
Many antivirus companies still write such specialized
detection programs today. In fact, with some of the more
complex viruses, it is not feasible to detect the virus using
any other technique. Unfortunately, the creation of these
hand-coded, virus-specific detection programs requires
thorough virus analysis and significant programming effort
by a trained antivirus researcher. In addition, because these
detection programs work based on different principles than
the underlying scanner, they often significantly reduce the
efficiency of the antivirus product.
Worst of all, these programs are predisposed to false
identification and misidentification. The decryption rou-
tines created by the more complex engines can take so
many different forms that it is often difficult to discern
between a virus decryptor and a legitimate program.
The Solution

Over the past two years, several antivirus companies have
incorporated generic decryption (GD) technology into
their virus scanners. This technology enables the antivirus
program to detect even the most complex polymorphic
viruses with ease, while maintaining fast scanning speeds.
Using GD, an antivirus researcher can update a virus
scanner to detect new polymorphic viruses within minutes or
hours instead of days or months.

Recall that all current polymorphic viruses have a body
of machine code that is encrypted and copied verbatim from
infection to infection. When a program infected with a
polymorphic virus is launched by the user, the polymorphic
decryption routine executes and decrypts this unchanging
virus body. Once the decryption routine finishes decrypting
the body, it transfers control to the virus so it can spread.

Thus, when a file containing a polymorphic virus is
executed, the virus is guaranteed to decrypt itself and reveal its
innards; otherwise, it would not be able to execute and
constitute a viral threat. The GD technique relies on this
behavior in order to detect polymorphic viruses.

The GD scanner consists of a CPU emulator, a
virus signature scanner, and an emulation control module
(ECM). When the user requests a scan of an executable file,
the GD antivirus program loads the suspect program into a
software-simulated, virtual computer. The program is then
allowed to execute in this virtual computer as if it were
running on a real computer. During this execution, if the
target file is infected with a virus, it can cause absolutely no
damage to the actual PC since it is executing in a
completely contained environment.

At the start of the simulation, the emulator begins
executing the instructions at the infected program’s
entry-point. These instructions usually transfer control to, or
directly constitute, the virus’ decryption routine. Just as on
a real computer, the decryption routine wastes no time
deciphering the body of the virus. The emulation control
module regulates the emulation and continues it as long as
it is required.

The GD scanner constantly monitors the progress of the
emulation session. At regular intervals, it calls upon its signature
scanner to search through modified and potentially
decrypted regions of virtual memory for virus signatures.
Effectively, the virus does all the decryption work for the
antivirus program. As a result, even viruses that employ
arbitrarily complicated encryption schemes can be detected
with ease. Furthermore, the GD scanner can exactly
identify the strain of the virus since it has access to the entire,
decrypted virus body.

In essence, this is like injecting a mouse with a serum
which may or may not contain a virus, and then observing
the mouse for any adverse effects. If the mouse becomes ill
(i.e., the virus manifests itself), we can observe the visible
symptoms and identify the virus. If the mouse stays
healthy, then we can select another vial of serum and repeat
the process.

The benefit of GD is that it provides accurate identification
of polymorphic viruses and dramatically reduces the
possibility of false identification or misidentification. This
behavior results because the scanner detects viruses by
examining the static virus body instead of the polymorphic
section of the virus, which can take numerous forms.

Unfortunately, it is not such an easy chore to create the
perfect generic decryption scanner. Perhaps the most
difficult task in constructing a GD scanner is designing the
emulation control module. If the GD scanner knew that it
would always be called upon to scan infected files, it would
be trivial to design a perfect ECM. Its one and only rule
would be: emulate the target program until one of the viral
strains is definitively identified, then quit.

This situation would be ideal but it is not realistic. The
majority of users do not have virus-infected files on their
computer. Therefore, the ECM must decide how long to