Summary: This commit allows transpose to compute in place by using a buffer. It also adds a CRD mode for space-depth transpose (i.e., pixel shuffle).
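
For reviewers unfamiliar with the CRD layout, here is a minimal pure-Python sketch of the depth-to-space index mapping. It assumes "CRD" (column-row-depth) and "DCR" (depth-column-row) carry the same meaning as in the ONNX DepthToSpace operator, which the commit does not state. The function name `depth_to_space` and the nested-list `[C][H][W]` layout are hypothetical and are not taken from this commit's code.

```python
def depth_to_space(x, block, mode="CRD"):
    """Rearrange a [C][H][W] nested list into [C // block**2][H*block][W*block].

    Illustrative only: the mode names are assumed to follow the ONNX
    DepthToSpace convention, not read from this commit's implementation.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (block * block) != 0:
        raise ValueError("channel count must be divisible by block**2")
    c_out = c_in // (block * block)

    out = [[[None] * (w * block) for _ in range(h * block)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(block):        # row offset inside the output block
            for j in range(block):    # column offset inside the output block
                if mode == "CRD":
                    # Output channel is the slowest-varying part of the input channel.
                    src = c * block * block + i * block + j
                elif mode == "DCR":
                    # Block offset is the slowest-varying part of the input channel.
                    src = (i * block + j) * c_out + c
                else:
                    raise ValueError("mode must be 'CRD' or 'DCR'")
                for y in range(h):
                    for xx in range(w):
                        out[c][y * block + i][xx * block + j] = x[src][y][xx]
    return out


# Eight input channels of size 1x1, each holding its own channel index.
x = [[[k]] for k in range(8)]
crd = depth_to_space(x, 2, "CRD")
dcr = depth_to_space(x, 2, "DCR")
```

With this input, CRD places input channels 0..3 in the first output channel, which is the ordering pixel shuffle uses, while DCR interleaves even and odd input channels across the two output channels.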