Author Topic: About these imports (ordinals / hash) (Read 5351 times)

rain_storm · « **on:** July 06, 2008 »

About import by hash: to my understanding, this is how shellcode does import by hash -

1] Get the base address of kernel32 (a few ways exist; the best one of all requires only 16 bytes but doesn't work on Win9x)
2] Use your own version of GetProcAddress to locate the function by looking it up in the export RVA tables
3] Convert the RVA to a VA so that your code can call the function
4] Find LoadLibraryA using the above GetProcAddress
5] Load new libraries using LoadLibraryA, then find their functions using your own version of GetProcAddress

Now, how does the hash fit into all this? Why does the hash not change between ME, XP and Vista, or across service packs? Import by ordinal I don't understand at all, but I know enough to know that it is unreliable (info gained from in4k). Why is import by hash not as unreliable as import by ordinal? How does the hash relate to the address? These hashes are far too large to be offsets within the DLL, so I assume the hash is matched against one of the fields within the export RVA table. Does that mean the hash is a sort of fingerprint for the function? Is it possible for a hash from kernel32 to be identical to a hash from opengl32, and if so, how do you avoid looking up the wrong library?

Edit - Also do you think that this is acceptable behaviour? http://www.pouet.net/prod.php?which=50602
That prod uses an external program to find the address of each function it calls. The funny thing is that the coder is doing some hardcore coding to manually compress the PE header, but if he had gone just a little more hardcore he could have included the code to load the libraries manually, and this code would easily have fitted into the padding (82 bytes of padding) that he left in the header.

Sorry if this is too low level for you guys

mentor · « **Reply #1 on:** July 06, 2008 »

you are pretty much right about everything.

the hash is basically a fingerprint of the function you want to import. you hash the one thing that is guaranteed not to change across windows versions, the name. There can be collisions, but if your hash function is good and you allocate sufficiently many bits for your hash codes, this will never occur in practice.
Collisions between functions in different dll files are not a problem, if you associate each hash with only one dll.
You can always choose some hash function without collisions for the version of the dll files that you have got.
The only thing that can really go wrong then is that another function is introduced into a dll with a name hashing to the same as some function you are importing. Assuming that your hash is perfect and 32-bit, the probability of a new function colliding with one of the m functions you are using is m*2^-32. Even if a bunch of new functions are introduced, this is still negligible.
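To put a number on that estimate, here is a small C sketch. It assumes the hash spreads names uniformly over all `bits`-bit values, and it adds a parameter `k` (the number of newly introduced functions) that is not in the post above; the helper name `collision_bound` is made up for illustration. The result is a simple upper bound, not an exact probability.

```c
#include <assert.h>

/* Upper bound on the chance that any of k newly added exports hashes to
 * the same value as one of the m imports in use, assuming a uniformly
 * distributed hash of `bits` bits: k * m * 2^-bits. */
static double collision_bound(unsigned m, unsigned k, unsigned bits)
{
    double p = (double)m * (double)k;
    unsigned i;

    assert(bits > 0);
    for (i = 0; i < bits; i++)
        p /= 2.0;              /* multiply by 2^-bits, one bit at a time */
    return p > 1.0 ? 1.0 : p;  /* a probability never exceeds 1 */
}
```

With m = 20 imports and a single new function, `collision_bound(20, 1, 32)` comes out at roughly 4.7e-9; a thousand new functions only raise that to roughly 4.7e-6.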

The ordinal of a function is an integer that is guaranteed to uniquely identify the function within the dll file. This is usually just the index of the function. These ordinals can (and will) change between windows versions, so importing by ordinal is not a particularly safe mechanism.
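The following C sketch shows how a name hash leads to an address, and why a shifted index does not matter. It works on a mock of the three parallel arrays in a PE export directory (AddressOfNames, AddressOfNameOrdinals, AddressOfFunctions) rather than on a real mapped DLL. The struct `mock_exports`, the helper names and the particular hash are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the parallel arrays found in a PE export directory. */
typedef struct {
    uintptr_t base;                 /* address the DLL is mapped at         */
    size_t name_count;              /* number of named exports              */
    const char *const *names;       /* stands in for AddressOfNames         */
    const uint16_t *name_ordinals;  /* stands in for AddressOfNameOrdinals  */
    const uint32_t *function_rvas;  /* stands in for AddressOfFunctions     */
} mock_exports;

/* Illustrative 32-bit name hash: rotate left by 6, then xor in the byte. */
static uint32_t name_hash(const char *s)
{
    uint32_t h = 0;

    assert(s != NULL);
    while (*s) {
        h = (h << 6) | (h >> 26);
        h ^= (unsigned char)*s++;
    }
    return h;
}

/* Hash every exported name until one matches, then go
 * name index -> function index -> RVA -> VA. Returns 0 if nothing matches. */
static uintptr_t resolve_by_hash(const mock_exports *e, uint32_t wanted)
{
    size_t i;

    for (i = 0; i < e->name_count; i++) {
        if (name_hash(e->names[i]) == wanted) {
            uint16_t index = e->name_ordinals[i];
            return e->base + e->function_rvas[index];
        }
    }
    return 0;
}
```

Because the match is made on the name, it does not matter at which position the name sits in a given build of the DLL; a stored index or ordinal would have to be correct for every build.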

ROXOR: No, it is a horrible hack.

Cheers
-mentor

rain_storm · « **Reply #2 on:** July 06, 2008 »

Thank you mentor, this has cleared up a lot of confusion for me; now I have a better understanding. It seemed so difficult before, but now I see that it is actually very simple. If things go wrong I can simply use a new hash function and keep repeating until I get it right. I see now how there can be collisions: you are basically storing the function name in an encrypted and compressed format. There is a possibility that two completely different names encrypt to the same hash, much like a checksum is not as unique as they would have you believe.

As for the Roxor prod, I was considering going that route, but some further research has shown that there are better ways of importing without an import table. Just testing the waters before I dive in.

rain_storm · « **Reply #3 on:** July 07, 2008 »

Okay, I have slept on it and this is what I have so far.

The DLL names will be stored as asciiz. The offset to the asciiz DLL name begins the struct of hashes. I will use a dword to store each hash to be loaded for the DLL. The struct is terminated by a zero hash. That looks like this -

Code:

LibraryStruct: dd offset kernel32Name
               dd 0x???????? ; ExitProcess
               dd 0x???????? ; LoadLibraryA
               dd 0x00000000 ; zero hash terminator

I loop through an array of these structs, loading first the library, then each function from that library until the zero hash is reached, with an outer loop for loading more libraries.
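A rough C model of that double loop, under a few assumptions that are not in the post: the name offset is replaced by an index into a `names` array, a zero in the name slot ends the whole table, and the actual loading is left to a hypothetical callback `import_cb`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Called once per (library, hash) pair; a real loader would resolve
 * the hash against the library's exports here. */
typedef void (*import_cb)(const char *dll, uint32_t hash, void *user);

/* Table layout: [name index, hash, hash, ..., 0] repeated.
 * A name index of 0 ends the table (assumed convention).
 * Returns the number of hashes visited. */
static size_t walk_import_table(const uint32_t *table,
                                const char *const *names,
                                import_cb cb, void *user)
{
    size_t total = 0;

    assert(table != NULL && names != NULL);
    while (*table != 0) {                   /* outer loop: one library   */
        const char *dll = names[*table++];  /* LoadLibraryA would go here */
        while (*table != 0) {               /* inner loop: its hashes    */
            if (cb != NULL)
                cb(dll, *table, user);
            table++;
            total++;
        }
        table++;                            /* skip the zero hash terminator */
    }
    return total;
}
```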

I assume that the ProcessName (the name of the function to import) is a zero-terminated string and that the maximum possible StrLen for a ProcessName is 0xFF. This way I can store the StrLen in a single byte of the hash.

The ASCII codes are added to each other in a register. Because the max StrLen is 0xFF (and every ASCII code used will be below 0xFF), the sum of all the letters cannot possibly exceed 0xFFFF, which uses up 2 more bytes of the hash.

Now we know that our ProcessName has StrLen = XX and that ProcessName sumTotal = XXXX

Pretty safe, but it will fail if two ProcessNames use the same letters the same number of times but in a different order. I have one byte of hash left to solve this case of possible collision. I need to store something that is linked to the sequence of the letters in this one byte, and I need to keep the decryption code as tight as possible. Can I just use xor and a rotate here, or is there a better way?
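A minimal C sketch of the length-plus-sum scheme as described, to make the collision case concrete. The byte positions are an assumption (length in the top byte, sum in the next two, low byte left spare), and the function name `len_sum_hash` is made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: [ length : 8 | sum of bytes : 16 | spare : 8 ] */
static uint32_t len_sum_hash(const char *name)
{
    size_t len, i;
    uint32_t sum = 0;

    assert(name != NULL);
    len = strlen(name);
    assert(len <= 0xFF);                /* length must fit in one byte */
    for (i = 0; i < len; i++)
        sum += (unsigned char)name[i];  /* at most 255 * 255 = 0xFE01  */
    return ((uint32_t)len << 24) | (sum << 8);
}
```

Reordering the letters leaves the result unchanged, so "ab" and "ba" both give 0x0200C300. Names that are not permutations of each other can collide as well: "ac" and "bb" have the same length and the same sum.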

anybody got any ideas?

mentor · « **Reply #4 on:** July 07, 2008 »

Ok, your scheme seems reasonably well thought out. Here are some of my thoughts.
-Why have a pointer to the library name, when you could just as well have the name there itself?

-Using a full dword to signal the end of a list seems a bit excessive, if the list is not meant to be compressed. You could use a byte counter instead.

-Addition is a commutative operator, so any permutation of the string will give the same result, which is obviously not a desirable property for a hash function. Besides, you are not really distributing the names very well, as most names will have roughly the same length and sum. Having multiple checks, like length and sum, is probably not so good either, if you want to keep things tiny.

In crinkler we use a simple xor-rotate scheme, where new characters are xor'ed into the hash and rotated. This is obviously not commutative in general.

Code:

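;esi: ptr to zero-terminated name
xor edi, edi		; hash = 0
.hashloop:
	rol edi, 6	; rotate the running hash left by 6 bits
	xor eax, eax	; clear eax so lodsb leaves a zero-extended byte
	lodsb		; al = next character, esi++
	xor edi, eax	; mix the character into the hash
	dec eax		; eax becomes -1 only for the terminating zero
	jge .hashloop	; loop while the character was non-zero
;edi: hash code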
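For readers who find C easier to follow, the loop above can be rendered roughly as below. This is a transcription of the posted snippet only, not a statement about what any particular Crinkler version ships, and the name `xor_rotate_hash` is made up. Note that the terminating zero byte also passes through the loop body, so the hash gets one final rotate after the last character.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t xor_rotate_hash(const char *name)
{
    uint32_t hash = 0;                      /* xor edi, edi  */
    unsigned char c;

    assert(name != 0);
    do {
        hash = (hash << 6) | (hash >> 26);  /* rol edi, 6    */
        c = (unsigned char)*name++;         /* lodsb         */
        hash ^= c;                          /* xor edi, eax  */
    } while (c != 0);                       /* dec eax / jge */
    return hash;
}
```

Order now matters: "AB" hashes to 0x00040080 while "BA" hashes to 0x00043040.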

Some other stuff to think about:
-If you are already hacking the PE and have some annoying unused fields left, why not use them to store some hashes? Interpreting non-hashes as hashes is not a problem, if you never call the 'fake' imports - so why not just import the whole PE header?

cheers
-mentor

stormbringer · « **Reply #5 on:** July 07, 2008 »

some good hash functions here: http://burtleburtle.net/bob/hash/evahash.html

rain_storm · « **Reply #6 on:** July 07, 2008 »

StormBringer, thank you for this link. I have glanced through it; the info in there comes thick and fast, so I will be giving it a more thorough reading asap.

mentor, you are involved in the development of crinkler? Excellent tool btw, very popular on this forum. I love using it.

I will try to adapt your advice to my needs. I recognise that routine from in4k (or something very similar; 'rol edi,6' is ringing strong bells, but last time I saw that I had no idea what it was doing). I will use this for learning only; I also want to feel the satisfaction of DIY. Thanks for sharing, it will come in MEGA useful because it's much smaller than what I was thinking of.

I was planning on storing the DLL names inside the header (wherever they will fit). I want something sequential for accessing them; I did not want to be pointing esi at the string, then back at the hashes ... hey, wait a minute

Of course storing the string inside the struct IS easier, I can just roll from one to the other. Okay, I shut my mouth now because this is one time I should be listening.

BTW, another thing I plan on using (I don't think this has ever been done before; if it has, then I want to know what it's called): I will store my invokes as structures. Everything (params / VA) gets loaded and then pushed on top of the stack (ret is used to call the function). In this way a single invoke function is shared by all function calls, and the opcode for pushing each parameter is eliminated.

Here is a 16 bit mockup. If you run it slowly in a debugger you will see what it is doing:
at first it calls two dummy routines in a sort of initialise phase,
then it calls a dummy routine in an infinite loop.

The dummies simply consume the parameters and then return; it gives a concept of what I am going for. Using this and a well optimised GetProcAddress, is a good 512 byter feasible? (uncompressed)

Code:

 
 ORG 0x100

 D1 = 0x011C
 D2 = 0x0121
 
 BEGINS:LEA   SI,COMMND     ; BE 14 00 ;
        CALL  INVOKE        ; E8 07 00 ;
 ETERNL:PUSH  SI            ; 56       ;
        CALL  INVOKE        ; E8 03 00 ;
        POP   SI            ; 5E       ;
        JMP   ETERNL        ; EB F9    ;
 INVOKE:XOR   AX,AX         ; 33 C0    ; AX = 0
        LODSB               ; AC       ; AL = NUM_PARAMS
        XCHG  AX,CX         ; 91       ; CX = NUM_PARAMS
        JCXZ  RETURN        ; E3 08    ; RETURN ON NUM_PARAMS = 0
        PUSH  OFFSET INVOKE ; 68 0F 01 ; USE INVOKE AS RETURN ADDRESS
 PARAMS:PUSH  [SI]          ; FF 34    ; PUSH PARAMS | VA
        LODSW               ; AD       ; SI++
        LOOP  PARAMS        ; E2 FC    ; LOOP FOR ALL ELEMENTS OF STRUCT
 RETURN:RET                 ; C3       ; LOOP INVOKE | RETURN TO CALLER

 DUMMY1:POP   DX            ;
        POP   DX            ;
        POP   DX            ;
        POP   DX            ;
        RET                 ;

 DUMMY2:POP   DX            ;
        POP   DX            ;
        RET                 ;

 COMMND DB    0x05          ; NUMPARAMS+1
        DW    0x0000        ; PARAM 4
        DW    0x0000        ; PARAM 3
        DW    0x0000        ; PARAM 2
        DW    0x0000        ; PARAM 1
        DW    D1            ; VA DUMMY 1
        DB    0x03          ; NUMPARAMS+1
        DW    0x0000        ; PARAM 2
        DW    0x0000        ; PARAM 1
        DW    D2            ; VA DUMMY 2
        DB    0x00          ; END OUTER LOOP
        DB    0x03          ; NUMPARAMS+1
        DW    0x0000        ; PARAM 2
        DW    0x0000        ; PARAM 1
        DW    D2            ; VA DUMMY 2
        DB    0x00          ; END OUTER LOOP
        DB    0x00          ; END ALL

rain_storm · « **Reply #7 on:** July 07, 2008 »

Okay, trash that idea. It is too hard to work with, needs modification to use return values, and does not compare well to simply keeping parameters in registers. The thinking behind it was that it may save bytes here and there, but you can push 4 parameters in one dword if they are kept in registers.

I like that idea about importing the entire header. I start somewhere where I can fit the DLL name, skip the unusable stuff and then put my hashes in the places I can use. Repeat for each DLL. This has saved me a lot of work, mentor, and I surely would not have picked up on most of that by myself. Thanks for the tips.

Edit - There is plenty of room in the headers. I have shoved code in there and execute the headers; most of the used fields act like nops, but the few bytes that mess things up can easily be skipped over with a short jump. Find-kernel-base fits into the unused part of the PE header plus some bytes at the start of the optional header. Some of GetProcAddress is in the optional header / section table. When I tested on another PC it crashed. And I thought I was just 3 bytes away from MessageBox in 256 bytes. Oh well, those are the joys of learning. The routines I am learning from are already very well optimised, but some safety nets were not needed; stripping those out is all the optimising I have been able to do so far. If only GetProcAddress didn't hog all the registers; preserving the stack would be nice too.
