IT박스

How do I call machine code stored in a char array?

itboxs 2021. 1. 5. 07:50


I'm trying to call native machine code. Here is what I have so far (it fails with a bus error):

char prog[] = {'\xc3'}; // x86 ret instruction

int main()
{
    typedef double (*dfunc)();

    dfunc d = (dfunc)(&prog[0]);
    (*d)();
    return 0;
}

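The function is called correctly and execution reaches the ret instruction. But when it tries to execute the ret instruction, I get a SIGBUS error. Is that because I'm executing code on a page that has not been marked as executable?

So what am I doing wrong here?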


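The first problem is that the location where the prog data is stored is not executable.

On Linux at least, the resulting binary places the contents of global variables in the "data" segment, which in most ordinary cases is not executable.

The second problem is that the code you are calling may be invalid in some way. C has a specific procedure for calling a function, known as the calling convention (you might be using "cdecl", for example). It may not be enough for the called function to simply "ret". It may also need to clean up the stack and so on, otherwise the program will behave unexpectedly. This can become an issue once you get past the first problem.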
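To check the first point on a given system, one option is to look up the mapping that contains the array and inspect its permission flags. The sketch below is Linux-specific and assumes /proc/self/maps is available; the helper name addr_is_executable is made up for illustration and is not part of the original answers.

```c
#include <stdio.h>

/* Illustrative helper (not from the original answers): returns 1 if the
 * mapping containing addr is executable, 0 if it is not, and -1 if that
 * cannot be determined (for example, /proc/self/maps is unavailable). */
int addr_is_executable(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    /* Assumes unsigned long is wide enough to hold an address (true on Linux). */
    unsigned long target = (unsigned long)addr;
    char line[512];
    int result = -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char perms[5];

        /* Each entry begins with "start-end perms", e.g. "...-... rw-p". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;
        if (target >= start && target < end) {
            result = (perms[2] == 'x');   /* third flag is the execute bit */
            break;
        }
    }
    fclose(maps);
    return result;
}
```

Calling addr_is_executable(prog) before the cast to a function pointer would show whether the page holding prog carries the x flag.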


You need to call mprotect to make the page where prog lives executable. The following code makes that call and can then execute the bytes in prog.

#include <unistd.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>

char prog[] = {
   0x55,             // push   %rbp
   0x48, 0x89, 0xe5, // mov    %rsp,%rbp
   0xf2, 0x0f, 0x10, 0x05, // movsd  0x0(%rip),%xmm0
   0x00, 0x00, 0x00, 0x00, //   (32-bit displacement of 0)
   0x5d,             // pop    %rbp
   0xc3,             // retq
};

int main()
{
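    long pagesize = sysconf(_SC_PAGE_SIZE);
    // Round the address of prog down to the start of its page.
    unsigned long start = (unsigned long)prog & ~((unsigned long)pagesize - 1);
    // The length must run from the page start to the end of prog.
    size_t len = ((unsigned long)prog - start) + sizeof(prog);
    int res = mprotect((void*)start, len, PROT_EXEC|PROT_READ|PROT_WRITE);
    if(res)
    {
        perror("mprotect");
        return 1;
    }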
    typedef double (*dfunc)(void);

    dfunc d = (dfunc)(&prog[0]);
    double x = (*d)();
    printf("x=%f\n", x);
    fflush(stdout);
    return 0;
}
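mprotect expects a page-aligned start address, and the length has to reach from that address to the end of the object being covered. The arithmetic can be sketched as a small helper; page_span is a hypothetical name, and the code assumes the page size is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes the page-aligned start address and the
 * length that mprotect needs in order to cover [addr, addr + size).
 * Assumes pagesize is a power of two. */
void page_span(uintptr_t addr, size_t size, uintptr_t pagesize,
               uintptr_t *start, size_t *len)
{
    *start = addr & ~(pagesize - 1);        /* round down to a page boundary */
    *len = (size_t)(addr - *start) + size;  /* offset into the page + object size */
}
```

For example, an object of 14 bytes at address 0x601040 with 4096-byte pages gives a start of 0x601000 and a length of 78 bytes. mprotect then applies the new protection to every page that overlaps that range.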

As everyone has already said, you must ensure prog[] is executable. Unless you are writing a JIT compiler, however, the proper way to do that is to place the symbol in an executable area, either with a linker script or by specifying the section in the C code if the compiler allows it, e.g.:

const char prog[] __attribute__((section(".text"))) = {...}

Virtually all C compilers will let you do this by embedding regular assembly language in your code. Of course it's a non-standard extension to C, but compiler writers recognise that it's often necessary. As a non-standard extension, you'll have to read your compiler manual and check how to do it, but the GCC "asm" extension is a fairly standard approach.

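#include <assert.h>
#include <stdint.h>

void DoCheck(uint32_t dwSomeValue)
{
    uint32_t dwRes;

    // Assumes dwSomeValue is not zero.
    // bsfl stores the index of the lowest set bit of dwSomeValue in dwRes.
    asm ("bsfl %1,%0"
      : "=r" (dwRes)
      : "r" (dwSomeValue)
      : "cc");

    assert(dwRes > 3);
}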

Since it's easy to trash the stack in assembler, compilers often also allow you to identify registers you'll use as part of your assembler. The compiler can then ensure the rest of that function steers clear of those registers.

If you're writing the assembler code yourself, there is no good reason to set up that assembler as an array of bytes. It's not just a code smell - I'd say it is a genuine error which could only happen by being unaware of the "asm" extension which is the right way to embed assembler in your C.
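As a small illustration of how operands and clobbers are declared with the GCC "asm" extension, the sketch below adds two integers with a single x86 instruction and tells the compiler that the flags register is modified. The function name add_u32 is made up for this example, and the plain C fallback is there only so that the code also builds on non-x86 targets.

```c
#include <stdint.h>

/* Illustrative only: adds b to a with one x86 instruction where available,
 * and falls back to plain C on other targets or compilers. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __asm__ ("addl %1, %0"
             : "+r" (a)   /* operand 0: a is both read and written */
             : "r" (b)    /* operand 1: b is only read */
             : "cc");     /* clobber: the instruction changes the flags */
    return a;
#else
    return a + b;
#endif
}
```

Because the operands use the generic "r" constraint, the compiler chooses the registers itself and keeps the rest of the function away from them.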


Essentially this has been clamped down on because it was an open invitation to virus writers. But you can allocate a buffer and fill it with native machine code in straight C; that's no problem. The issue is calling it. Whilst you can try setting up a function pointer with the address of the buffer and calling it, that's highly unlikely to work, and highly likely to break on the next version of the compiler if you somehow do manage to coax it into doing what you want. So the best bet is to resort to a bit of inline assembly to set up the return and jump to the generated code. But if the system protects against this, you'll have to find a way of circumventing the protection, as Rudi described in his answer (which is very specific to one particular system).
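For comparison, a common way to run bytes from a buffer on POSIX systems is to allocate a fresh page with mmap, copy the code in, and only then switch the page to executable with mprotect. This is a minimal sketch under stated assumptions rather than a method from the answers above: it targets x86-64, the six bytes encode mov $42,%eax followed by ret, and run_code42 is a hypothetical name. Some hardened systems refuse to make anonymous memory executable, in which case the function reports failure.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copies a tiny x86-64 routine into a fresh page,
 * makes the page executable, calls it, and stores the result in *out.
 * Returns 0 on success and -1 on failure or on a non-x86-64 target. */
int run_code42(int *out)
{
#if defined(__x86_64__)
    /* mov $42,%eax ; ret */
    static const unsigned char code42[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    size_t len = 4096;   /* assumption: one page is plenty for six bytes */

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, code42, sizeof code42);

    /* Switch from writable to executable rather than mapping it as both. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* Converting an object pointer to a function pointer is not ISO C,
     * but POSIX systems support it. */
    int (*fn)(void) = (int (*)(void))buf;
    *out = fn();

    munmap(buf, len);
    return 0;
#else
    (void)out;
    return -1;
#endif
}
```

The function pointer type matches what the generated code actually does: it leaves an int in eax, so it is called through int (*)(void) rather than a pointer to a function returning double.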


One obvious error is that \xc3 is not returning the double that you claim it's returning.


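You can eliminate the crash by letting the compiler store the array in the read-only section of your process memory (if its contents are known at compile time), for example by declaring the array const. This only helps on platforms where read-only data is mapped as executable, which depends on the toolchain.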

Example:

const char prog[] = {'\xc3'}; // x86 ret instruction

int main()
{
    typedef double (*dfunc)();

    dfunc d = (dfunc)(&prog[0]);
    (*d)();
    return 0;
}

Alternatively, you can compile the code with the executable-stack option: gcc -z execstack.


ReferenceURL : https://stackoverflow.com/questions/39867396/how-to-call-machine-code-stored-in-char-array
