It's software, no hardware tricks. I'm doing this for the Acorn Electron, the BBC Micro's slightly crippled little brother. The Electron was my first computer, for Christmas '83 I think.
The BBC might be able to do this in hardware with some custom screen modes, but the Electron definitely couldn't, and the Electron generally ran at about half the clock rate of the BBC.
Even though it was a bit crippled, it was made to be compatible with the BBC, and the memory layout for the screen is a bitch. It was designed more for text than graphics, so each consecutive 8 bytes go down vertically, then it jumps back up to the top of the next character (or half character, depending on colour mode).
The mode I'm using is just 4 colours at the lowest res, so each byte contains 4 pixels in a stupid layout: the rightmost pixel comes from bits 0 and 4 and the leftmost pixel from bits 3 and 7. Processing that for per-pixel scrolling is way too much work for a 1MHz 6502.
So I've created 4 buffers of 2K each to store the current screen, and I select which one to use depending on X AND 3. The buffers have a more logical layout than the screen, and I've come up with a way to copy them pretty quickly (I'm looking to see if I can speed it up a bit more too).
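To make the layout a bit more concrete, here's a rough Python model of it. It's only a sketch, not the actual 6502 code. It assumes the mode is MODE 5 (160x256, 4 colours, screen memory starting at &5800, no hardware scroll offset), and the names `screen_address`, `pack_pixels`, `unpack_pixels` and `select_buffer` are made up for illustration. The byte offset returned by `select_buffer` is also a guess at how the four buffers might be indexed.

```python
# Sketch only: models the assumed MODE 5 layout, not the real 6502 routine.
SCREEN_BASE = 0x5800          # assumed MODE 5 start address
BYTES_PER_CELL = 8            # 8 consecutive bytes run down one character cell
CELLS_PER_ROW = 40            # 160 pixels / 4 pixels per byte
BYTES_PER_CELL_ROW = BYTES_PER_CELL * CELLS_PER_ROW  # 320

def screen_address(byte_col, y):
    """Address of the byte at byte column 0..39 on scanline y (0..255)."""
    return (SCREEN_BASE
            + (y // 8) * BYTES_PER_CELL_ROW   # which row of character cells
            + byte_col * BYTES_PER_CELL       # which cell along that row
            + (y % 8))                        # which line inside the cell

def pack_pixels(p0, p1, p2, p3):
    """Pack four 2-bit colours into one byte, p0 being the leftmost pixel.

    The leftmost pixel lands in bits 7 and 3, the rightmost in bits 4 and 0.
    """
    byte = 0
    for i, colour in enumerate((p0, p1, p2, p3)):
        shift = 3 - i
        byte |= ((colour >> 1) & 1) << (4 + shift)  # high colour bit
        byte |= (colour & 1) << shift               # low colour bit
    return byte

def unpack_pixels(byte):
    """Inverse of pack_pixels: returns (p0, p1, p2, p3), leftmost first."""
    return tuple((((byte >> (7 - i)) & 1) << 1) | ((byte >> (3 - i)) & 1)
                 for i in range(4))

def select_buffer(x):
    """Hypothetical: which of the 4 buffers to use, and the whole-byte offset."""
    return x & 3, x >> 2
```

The interleaved bits are why shifting a row by one pixel is so awkward: a pixel's two bits sit four places apart, so a plain shift or rotate of the byte doesn't move a pixel cleanly. Choosing a buffer by X AND 3 means the copy only ever has to move whole bytes.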
The other thing is that in the mode I'm using the pixels aren't square, so I'm doubling them up one above the other to make them square, which helps with speed and cuts down on the memory my buffers need.
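The doubling could look something like this in Python. Again it's just a sketch under assumptions: the buffer is taken to be plain row-major (32 bytes per row, 64 rows), the screen is modelled as a `bytearray` using the character-cell layout, and `screen_offset` and `blit_doubled` are hypothetical names.

```python
# Sketch only: copies one 128x64 buffer to the screen, doubling each row.
ROW_BYTES = 32        # 128 pixels / 4 pixels per byte
BUF_ROWS = 64         # buffer rows, each drawn on two scanlines
CELL_ROW_BYTES = 320  # 40 character cells * 8 bytes (assumed MODE 5)

def screen_offset(byte_col, y):
    """Offset from the start of screen memory for a byte column and scanline."""
    return (y // 8) * CELL_ROW_BYTES + byte_col * 8 + (y % 8)

def blit_doubled(buf, screen, col0=0, y0=0):
    """Write every buffer row to two consecutive scanlines.

    Assumes y0 is even, so each pair of scanlines sits inside one character
    cell and the second write is simply the next address.
    """
    for row in range(BUF_ROWS):
        for col in range(ROW_BYTES):
            value = buf[row * ROW_BYTES + col]
            top = screen_offset(col0 + col, y0 + 2 * row)
            screen[top] = value       # upper scanline of the pair
            screen[top + 1] = value   # lower scanline, same cell
```

One thing that falls out of the layout: because a cell is 8 lines tall and the pairs start on even lines, the doubled byte always goes to the very next address, so each byte read from the buffer turns into two adjacent stores.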
Whatever I do with it, it's just going to end up half the screen height and not the full width either, so just 128x64 pixels (after doubling up).
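As a quick sanity check on the numbers, here's the arithmetic in Python. The 160x256 screen size is an assumption (MODE 5); the rest comes straight from the figures above.

```python
# Sketch only: checks the buffer sizes against the stated 128x64 window.
WIDTH, HEIGHT = 128, 64            # window size in (doubled-up) pixels
BITS_PER_PIXEL = 2                 # 4 colours
SCREEN_W, SCREEN_H = 160, 256      # assumed MODE 5 resolution

buffer_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8   # one buffer
total_bytes = 4 * buffer_bytes                        # all four buffers
scanlines_used = HEIGHT * 2                           # each row drawn twice

print(buffer_bytes, total_bytes, scanlines_used)
```

So each buffer is 2048 bytes (2K), the four together take 8K, and the window covers 128 of the 256 scanlines and 128 of the 160 pixels across.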