The difference between bmap_blit and bmap_create:

bmap_blit blits into video memory. It does not allocate system memory.

bmap_create stores the image in system memory.

Both methods use video memory, but only bmap_create also allocates system memory. This, and not some compression by the video driver as I previously suspected, is the reason why you can blit twice as many bitmaps as you can create.

A 1.6 GB system memory limit is normal. Theoretically, video memory has no per-process limit, but in practice it does, since the D3D memory manager also uses system memory. This might also depend on the video driver.
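
To make the factor of two concrete, here is a rough back-of-the-envelope calculation in Python. It is not engine code and calls nothing from the engine. It assumes a simplified model in which every copy of a bitmap counts against one shared per-process budget, a blitted bitmap needs one copy and a created bitmap needs two. The function name, the budget, and the bitmap size are hypothetical and chosen only for illustration.

```python
# Simplified model (an assumption, not measured engine behaviour):
# every copy of a bitmap's pixel data counts against one per-process budget.

def max_bitmaps(budget_bytes: int, bitmap_bytes: int, copies: int) -> int:
    """Return how many bitmaps fit if each one needs `copies` copies of its data."""
    return budget_bytes // (bitmap_bytes * copies)

# Hypothetical figures for illustration only.
BUDGET = 1600 * 1024 * 1024   # roughly the 1.6 GB limit mentioned above
BITMAP = 1024 * 1024 * 4      # one 1024x1024 bitmap at 32 bits per pixel (4 MiB)

blitted = max_bitmaps(BUDGET, BITMAP, copies=1)  # bmap_blit: video memory copy only
created = max_bitmaps(BUDGET, BITMAP, copies=2)  # bmap_create: video + system memory copy

print(blitted, created, blitted / created)  # prints: 400 200 2.0
```

The absolute counts depend entirely on the assumed budget and bitmap size. Only the ratio matters here: one extra copy per bitmap halves how many fit.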