You want to hide portions of your binary using self-modifying code without rewriting existing code in assembler.
The most effective use of self-modifying code is to overwrite a section of vital code with another section of vital code, so that the two sections never exist in memory at the same time. This can be time-consuming and costly to develop; a more expedient technique can be achieved with C macros that decrypt garbage bytes in the code section to proper executable code at runtime. The process involves encrypting the protected code after the binary has been compiled, then decrypting it at runtime, before the protected code is called.
The code presented in this recipe applies to FreeBSD, Linux, NetBSD, OpenBSD, and Solaris. The concepts apply to Unix and Windows in general.
For the code presented in this recipe, we'll be using RC4 to perform our encryption. We've chosen to use RC4 because it is fast and easy to implement. You will need to use the RC4 implementation from Recipe 5.23 or an alternative implementation from somewhere else to use the code we will be presenting.
The actual code to decrypt and replace the code in memory is minimal. The complexity arises from having to obtain the code to be encrypted, encrypting it, and making it accessible to the code that will be decrypting and executing it. A set of macros provides the means to mark replaceable code, and a single function, spc_smc_decrypt( ), performs the decryption of the code. Because we're using RC4, encryption and decryption are performed in exactly the same way, so spc_smc_decrypt( ) can also be used for encryption, which we'll do later on.
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>

/* RC4_CTX, RC4_set_key( ), and RC4( ) come from the RC4 implementation of
 * Recipe 5.23 (or your replacement); include its header here.
 */

#define SPC_SMC_START_BLOCK(label)  void label(void) { }
#define SPC_SMC_END_BLOCK(label)    void _##label(void) { }
#define SPC_SMC_BLOCK_LEN(label)    ((int)((long)_##label - (long)label))
#define SPC_SMC_BLOCK_ADDR(label)   ((unsigned char *)label)
#define SPC_SMC_START_KEY(label)    void key_##label(void) { }
#define SPC_SMC_END_KEY(label)      void _key_##label(void) { }
#define SPC_SMC_KEY_LEN(label)      ((int)((long)_key_##label - (long)key_##label))
#define SPC_SMC_KEY_ADDR(label)     ((unsigned char *)key_##label)
#define SPC_SMC_OFFSET(label)       ((long)label - (long)_start)

extern void _start(void);

/* returns number of bytes encoded */
int spc_smc_decrypt(unsigned char *buf, int buf_len, unsigned char *key,
                    int key_len) {
  RC4_CTX       ctx;
  unsigned long page_size, start;
  size_t        span;

  RC4_set_key(&ctx, key_len, key);

  /* mprotect( ) works on whole pages, and many systems reject an address that
   * is not page-aligned, so widen the range to the enclosing page boundary.
   */
  page_size = (unsigned long)sysconf(_SC_PAGESIZE);
  start     = (unsigned long)buf & ~(page_size - 1);
  span      = (size_t)((unsigned long)buf + buf_len - start);

  /* NOTE: most code segments have read-only permissions, and so must be modified
   * to allow writing to the buffer
   */
  if (mprotect((void *)start, span, PROT_WRITE | PROT_READ | PROT_EXEC)) {
    fprintf(stderr, "mprotect: %s\n", strerror(errno));
    return(0);
  }

  /* decrypt the buffer */
  RC4(&ctx, buf_len, buf, buf);

  /* restore the original memory permissions */
  mprotect((void *)start, span, PROT_READ | PROT_EXEC);

  return(buf_len);
}
The use of mprotect( ), or an equivalent operating system routine for modifying the permissions of a page of memory, is required on most modern operating systems to write to the code segment. This is an inherent weakness of the self-modifying code technique: the call to mprotect( ) is suspicious, and it is trivial to write a utility that searches the disassembly of a program for calls to mprotect( ) that enable write access or take an address in the code segment as the first parameter. The use of mprotect( ) should be obfuscated (see Recipe 12.3 and Recipe 12.9).
Once the binary has been compiled, the protected code will have to be encrypted before it can be executed. The following code demonstrates a utility for encrypting a portion of an ELF executable file based on the contents of another portion of the file. The usage is:
smc_encrypt filename code_offset code_len key_offset key_len
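In the command, code_offset and code_len are the location of the code to be encrypted and the code's length, and key_offset and key_len are the location of the key with which to encrypt the code and the key's length. Both offsets are relative to the program's entry point, as produced by the SPC_SMC_OFFSET macro; the utility converts them to file offsets itself.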
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* defined in the previous listing */
int spc_smc_decrypt(unsigned char *buf, int buf_len, unsigned char *key,
                    int key_len);

/* ELF-specific stuff: field offsets for the 32-bit ELF layout */
#define ELF_ENTRY_OFFSET  24 /* e_hdr e_entry field offset */
#define ELF_PHOFF_OFFSET  28 /* e_hdr e_phoff field offset */
#define ELF_PHESZ_OFFSET  42 /* e_hdr e_phentsize field offset */
#define ELF_PHNUM_OFFSET  44 /* e_hdr e_phnum field offset */
#define ELF_PH_OFFSET_OFF 4  /* p_hdr p_offset field offset */
#define ELF_PH_VADDR_OFF  8  /* p_hdr p_vaddr field offset */
#define ELF_PH_FILESZ_OFF 16 /* p_hdr p_filesz field offset */

static unsigned long elf_get_entry(unsigned char *buf) {
  unsigned long  entry, p_vaddr, p_filesz, p_offset;
  unsigned int   i, phoff;
  unsigned short phnum, phsz;
  unsigned char  *phdr;

  /* read fixed-width fields so the sizes do not depend on the host's long */
  entry = *(uint32_t *)&buf[ELF_ENTRY_OFFSET];
  phoff = *(uint32_t *)&buf[ELF_PHOFF_OFFSET];
  phnum = *(uint16_t *)&buf[ELF_PHNUM_OFFSET];
  phsz  = *(uint16_t *)&buf[ELF_PHESZ_OFFSET];

  phdr = &buf[phoff];
  /* iterate through program headers */
  for (i = 0; i < phnum; i++, phdr += phsz) {
    p_vaddr  = *(uint32_t *)&phdr[ELF_PH_VADDR_OFF];
    p_filesz = *(uint32_t *)&phdr[ELF_PH_FILESZ_OFF];
    /* if entry point is in this program segment */
    if (entry >= p_vaddr && entry < (p_vaddr + p_filesz)) {
      /* calculate offset of entry point */
      p_offset = *(uint32_t *)&phdr[ELF_PH_OFFSET_OFF];
      return(p_offset + (entry - p_vaddr));
    }
  }
  return 0;
}

int main(int argc, char *argv[ ]) {
  unsigned long entry, offset, len, key_offset, key_len;
  unsigned char *buf;
  struct stat   sb;
  int           fd;

  if (argc < 6) {
    printf("Usage: %s filename offset len key_offset key_len\n"
           "       filename:   file to encrypt\n"
           "       offset:     offset in file to start encryption\n"
           "       len:        number of bytes to encrypt\n"
           "       key_offset: offset in file of key\n"
           "       key_len:    number of bytes in key\n"
           "       Values are converted with strtoul with base 0\n",
           argv[0]);
    return 1;
  }

  /* prepare the parameters */
  offset     = strtoul(argv[2], 0, 0);
  len        = strtoul(argv[3], 0, 0);
  key_offset = strtoul(argv[4], 0, 0);
  key_len    = strtoul(argv[5], 0, 0);

  /* memory map the file so we can access it via pointers */
  if (stat(argv[1], &sb)) {
    fprintf(stderr, "Stat failed: %s\n", strerror(errno));
    return 2;
  }
  if ((fd = open(argv[1], O_RDWR | O_EXCL)) < 0) {
    fprintf(stderr, "Open failed: %s\n", strerror(errno));
    return 3;
  }
  buf = mmap(0, sb.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (buf == MAP_FAILED) {
    fprintf(stderr, "Mmap failed: %s\n", strerror(errno));
    close(fd);
    return 4;
  }

  /* get entry point : here we assume ELF example */
  entry = elf_get_entry(buf);
  if (!entry) {
    fprintf(stderr, "Invalid ELF header\n");
    munmap(buf, sb.st_size);
    close(fd);
    return 5;
  }

  /* these are offsets from the entry point */
  offset     += entry;
  key_offset += entry;

  printf("Encrypting %lu bytes at 0x%lX with %lu bytes at 0x%lX\n",
         len, offset, key_len, key_offset);

  /* Because we're using RC4, encryption and decryption are the same operation */
  spc_smc_decrypt(buf + offset, (int)len, buf + key_offset, (int)key_len);

  /* mem-unmap the file */
  msync(buf, sb.st_size, MS_SYNC);
  munmap(buf, sb.st_size);
  close(fd);
  return 0;
}
This program incorporates an ELF file-header parser in the elf_get_entry( ) routine. The program header table entries of the ELF header are searched for the loadable segment containing the entry point. This is done to translate the entry point virtual address into an offset from the start of the file. This is necessary because the offsets generated by the SPC_SMC_OFFSET macro are relative to the program entry point (_start).
The following code provides an example of using the code we've presented in this recipe. The program decrypts itself at runtime, using bogus_routine( ) as a key for decrypting test_routine( ).
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* The SPC_SMC_* macros and spc_smc_decrypt( ) from the first listing must be
 * visible here as well.
 */

SPC_SMC_START_BLOCK(test)
int test_routine(void) {
  int x;

  for (x = 0; x < 10; x++) printf("decrypted!\n");
  return x;
}
SPC_SMC_END_BLOCK(test)

SPC_SMC_START_KEY(test)
/* never called: the machine code of this routine serves only as key material */
int bogus_routine(void) {
  int x, y;

  for (x = 0; x < y; x++) {
    y = x + 256;
    y /= 32;
    x = y * 2 / 24;
  }
  return 1;
}
SPC_SMC_END_KEY(test)

int main(int argc, char *argv[ ]) {
  spc_smc_decrypt(SPC_SMC_BLOCK_ADDR(test), SPC_SMC_BLOCK_LEN(test),
                  SPC_SMC_KEY_ADDR(test), SPC_SMC_KEY_LEN(test));

#ifdef UNENCRYPTED_BUILD
  /* This printf( ) displays the parameters to pass to the smc_encrypt utility on
   * stdout. The printf( ) must be removed, and the program recompiled before
   * running smc_encrypt. Having the printf( ) at the end of the file prevents
   * the offsets from changing after recompilation.
   */
  printf("(offsets from _start)offset: 0x%lX len 0x%X key 0x%lX len 0x%X\n",
         SPC_SMC_OFFSET(SPC_SMC_BLOCK_ADDR(test)), SPC_SMC_BLOCK_LEN(test),
         SPC_SMC_OFFSET(SPC_SMC_KEY_ADDR(test)), SPC_SMC_KEY_LEN(test));
  exit(0);
#endif

  test_routine( );
  return 0;
}
As mentioned in the comment just prior to the printf( ) call in main( ), this program should be compiled with UNENCRYPTED_BUILD defined, then executed to obtain the parameters to the smc_encrypt utility:
/bin/sh> cc -I. smc.c smc_test.c -D UNENCRYPTED_BUILD
/bin/sh> ./a.out
(offsets from _start)offset: 0xB0 len 0x36 key 0xEB len 0x66
The program is then recompiled, with UNENCRYPTED_BUILD not defined in order to remove the printf( ) and exit( ) statements. The smc_encrypt utility is then run on the resulting binary to produce a working program:
/bin/sh> cc -I. smc.c smc_test.c
/bin/sh> smc_encrypt a.out 0xB0 0x36 0xEB 0x66
Self-modifying code is one of the most potent techniques available for protecting binary code; however, it makes the build process more complex, as you can see in the above example. In addition, some processor architectures (such as the x86 line before the Pentium II) cache instructions and do not invalidate this cache when the code segment is written to. To be compatible with these older architectures, you will need to use one of the three ring3 serializing instructions (cpuid, iret, and rsm) to invalidate the cache. This can be performed with a macro:
#define INVALIDATE_CACHE asm volatile( \
          "pushad \n"               \
          "cpuid  \n"               \
          "popad  \n")
The pushad and popad instructions are needed because the cpuid instruction overwrites four general-purpose registers (EAX, EBX, ECX, and EDX). Once again, as with the call to mprotect( ), note that the use of the cpuid instruction is suspicious and will draw attention to the code of the protection. It is better to place the call to the decrypted code far enough away (16 bytes should be sufficient, because only 486 and Pentium CPUs will be affected) from the actual decryption routine so that the decrypted code will not be in the instruction cache.
This implementation of self-decrypting code is a simple one; it could be defeated by pulling the decryption code from the binary, decrypting the protected code, then replacing the call to the decryption routine with nop instructions. This is possible because the size of the encrypted code is the same as the decrypted code; a more robust solution would be to use a stronger encryption method or a compression method, and extract the protected code to a dynamically allocated region of memory. However, such a method requires extensive manipulation of the object files before and after linking. You might consider using a commercially available binary packer to reduce development and testing time.