360PanT: Training-Free Text-Driven 360-Degree Panorama-to-Panorama Translation

University College London
WACV 2025


Preserving boundary continuity when translating 360-degree panoramas remains a significant challenge for existing text-driven image-to-image translation methods. These methods often produce visually jarring discontinuities at the boundaries of the translated panorama, disrupting the immersive experience. To address this issue, we propose 360PanT, a training-free approach to text-based 360-degree panorama-to-panorama translation with boundary continuity. 360PanT achieves seamless translations through two key components: boundary continuity encoding and seamless tiling translation with spatial control. First, boundary continuity encoding constructs an extended input image and thereby embeds the boundary continuity information of the input 360-degree panorama into the noisy latent representation. Second, starting from this noisy latent representation and guided by a target prompt, seamless tiling translation with spatial control generates a translated image whose left and right halves are identical and which adheres to the extended input’s structure and semantic layout. Together, these steps yield a final translated 360-degree panorama with seamless boundary continuity. Experimental results on both real-world and synthesized datasets demonstrate the effectiveness of 360PanT in translating 360-degree panoramas.

Visual Results of Different Methods

To make continuity or discontinuity between the leftmost and rightmost sides of each generated image easy to see, we copy the leftmost area, indicated by the blue dashed box, and paste it onto the rightmost side of the image.
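The copy-and-paste check described above can be sketched in a few lines of NumPy. This is only an illustration, not the code used to produce the figures: the helper name `append_left_strip`, the `strip_width` parameter, and the choice to append the copied strip to the right edge (rather than overwrite existing pixels) are all assumptions.

```python
import numpy as np


def append_left_strip(image: np.ndarray, strip_width: int) -> np.ndarray:
    """Copy the leftmost `strip_width` columns and append them on the right.

    `image` is an (H, W) or (H, W, C) array. In the result, the original
    right edge sits directly next to a copy of the original left edge, so
    any wrap-around discontinuity becomes visible as a seam.
    Hypothetical helper; the strip is appended, which is an assumption.
    """
    width = image.shape[1]
    if not 0 < strip_width <= width:
        raise ValueError("strip_width must be in (0, image width]")
    left_strip = image[:, :strip_width]
    return np.concatenate([image, left_strip], axis=1)


# Toy example: a 2 x 6 single-channel "panorama" and a 2-column strip.
demo = np.arange(12).reshape(2, 6)
checked = append_left_strip(demo, strip_width=2)
```

For a panorama that wraps seamlessly, the junction between column `W - 1` and the appended strip should look like any other interior region of the image.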



360PanT comprises two primary components: boundary continuity encoding and seamless tiling translation with spatial control. The boundary continuity encoding component embeds the boundary continuity information of the input panorama \(I_{in}\) into the noisy latent feature \(x_T\). Subsequently, guided by the target prompt \(C\), \(x_T\) undergoes seamless tiling translation with spatial control to produce the denoised translated latent feature \(x_0\). Finally, the translated 360-degree panorama \(I_{out}\), aligned with the target prompt \(C\), is obtained by cropping from the translated image \(\hat{I}_{out}\).

Visual Results Using Other Control Conditions

