Optimizing Data Movement Through Software Control of General-Purpose Hardware Caches