That's not right. If you see this it means that it takes 21 cycles if it loops and 16 cycles if it doesn't. For branches (jr cc) it's the same - 5 extra cycles if the branch is taken. (call cc is 7 extra).
Taking into account these extra 5 cycles it might appear that this is a rubbish way to clear memory - but then consider this 'simpler' example:
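ld a,0
ld hl,screensize-1 ; -1 so the loop runs exactly screensize times
ld de,screen_ptr
ld bc,65535 ; i.e. -1
loop:
ld (de),a ; 7
inc de ; 6
add hl,bc ; 11 - carry stays set until hl wraps from 0 to 65535
jr c,loop ; 12 when taken
Wow! That's 36 cycles!
(You have to use add hl,bc because dec hl doesn't set any condition flags. Note that add hl,bc leaves the zero flag alone and only gives you carry to test, so the loop has to end with jr c rather than jr nz.)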
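For comparison, here is a rough sketch of the usual LDIR clear idiom (not necessarily the exact code from earlier in the thread). It assumes your assembler accepts expressions like screen_ptr+1, and that screensize is at least 2 so that bc doesn't end up as 0:

```asm
ld hl,screen_ptr     ; 10 - source: first byte of the area
ld de,screen_ptr+1   ; 10 - destination: the byte after it
ld bc,screensize-1   ; 10 - one less, the first byte is written by hand
ld (hl),0            ; 10 - seed the value that gets copied along
ldir                 ; 21 per byte, 16 for the last one
```

Each pass copies the byte just written into the next address, so the zero ripples through the whole area at 21 cycles a byte, with no loop instructions of your own to pay for.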
The Z80 doesn't have any pages; it has a flat 64KB address space. On the Spectrum 48K this is split into 3 main areas: 16KB of ROM; 16KB holding the screen, variables and program data; and 2x16KB of general purpose memory.
It's the 2nd 16KB that is interesting. The memory bandwidth for that RAM bank is shared between the CPU and the ULA (the display chip). Depending on which bit of the screen the ULA is updating, an access can be stalled for up to 6 extra cycles. That might make the LDIR clear a bit worse because it is doing 6KB of reads as well as 6KB of writes, at an average of about 3 extra cycles per move, but that's still only about 24 cycles per byte (21 + 3).
Check out the 'Contended Memory' section here:
http://www.worldofspectrum.org/faq/reference/48kreference.htm
By the way, Spectaculator is shareware and costs £16 to register. ZX SPIN is free, and it has a built-in assembler - you might want to try that!
Jim