Realism and Fidelity: Two Sides of a Coin in
Deep Joint Source-Channel Coding

Haotian Wu, Weichen Wang, Di You, Pier Luigi Dragotti, and Deniz Gündüz
Imperial College London, Department of Electrical and Electronic Engineering

Abstract

Deep joint source-channel coding (DeepJSCC) offers a promising approach to improving transmission efficiency by jointly leveraging source semantics and channel conditions. While prior work has focused on fidelity under varying channel conditions, recent diffusion-based approaches improve perceptual quality at the cost of high complexity and limited adaptability. In this work, we reveal that fidelity and perceptual realism can be unified in an adaptive DeepJSCC scheme through SNR-aware optimization, eliminating the need for separate models. Specifically, we propose \( \text{W}^2 \)-DeepJSCC, a unified, channel-adaptive framework that dynamically balances fidelity and perceptual realism based on channel conditions. It introduces two key innovations: a saliency-guided perception–fidelity adapter (SG-PFA) and wavelet Wasserstein distortion (WA-WD). SG-PFA enables a single model to adapt across varying channel conditions, preserving semantic realism under poor channel conditions while enhancing fidelity under good ones. WA-WD, inspired by foveal and peripheral vision, provides fine-grained control over the fidelity–realism trade-off in the wavelet domain. As a plug-and-play module, \( \text{W}^2 \)-DeepJSCC integrates seamlessly with existing DeepJSCC architectures. Experiments show that \( \text{W}^2 \)-DeepJSCC significantly outperforms baselines in perceptual metrics while maintaining strong fidelity at high SNRs. Prototype verification demonstrates that the proposed method delivers competitive fidelity and perceptual quality with low complexity, making it a promising alternative for future deployments. A user study further confirms that WA-WD aligns more closely with human perception than existing metrics.

Can a single DeepJSCC model adaptively balance
fidelity and realism across various channel conditions?

Performance of \( \text{W}^2 \)-DeepJSCC.

Visual comparison

Fig. 1 Visual comparison between different schemes when SNR = 0 dB and \( R = 1/24 \). \( \text{W}^2 \)-DeepJSCC achieves the best perceptual realism, closely matching the original image.

Proposed WA-WD for all DeepJSCC schemes.


Fig. 2 A perceptually aligned optimization objective inspired by foveal/peripheral vision, providing fine-grained control over the fidelity–realism trade-off in the wavelet domain.
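To make the objective concrete, below is a minimal PyTorch sketch of a wavelet-domain Wasserstein-style distortion. It assumes a one-level Haar decomposition, box-filtered local statistics, and a precomputed saliency map; the function names (haar_decompose, local_stats, wa_wd), subband weights, and window size are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn.functional as F

def haar_decompose(x):
    # One-level Haar wavelet transform via fixed 2x2 filters.
    # x: (B, C, H, W) with even H and W; returns LL, LH, HL, HH subbands of size (B, C, H/2, W/2).
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
    hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    filt = torch.stack([ll, lh, hl, hh]).unsqueeze(1)            # (4, 1, 2, 2)
    b, c, h, w = x.shape
    y = F.conv2d(x.reshape(b * c, 1, h, w), filt.to(x), stride=2)
    return y.reshape(b, c, 4, h // 2, w // 2).unbind(dim=2)

def local_stats(x, kernel=7):
    # Local mean and standard deviation via box filtering (a stand-in for a pooling scale).
    pad = kernel // 2
    mu = F.avg_pool2d(F.pad(x, [pad] * 4, mode="reflect"), kernel, stride=1)
    var = F.avg_pool2d(F.pad(x * x, [pad] * 4, mode="reflect"), kernel, stride=1) - mu ** 2
    return mu, var.clamp_min(0.0).sqrt()

def wa_wd(x, y, saliency, weights=(1.0, 0.5, 0.5, 0.25)):
    # Illustrative wavelet Wasserstein-style distortion: per subband, blend a pointwise
    # squared error (fidelity, "foveal") with a 1-D Gaussian Wasserstein-2 term on local
    # statistics (realism, "peripheral"), gated by a saliency map in [0, 1].
    loss = x.new_zeros(())
    for w_k, bx, by in zip(weights, haar_decompose(x), haar_decompose(y)):
        s = F.interpolate(saliency, size=bx.shape[-2:], mode="bilinear", align_corners=False)
        mu_x, sd_x = local_stats(bx)
        mu_y, sd_y = local_stats(by)
        realism = (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2        # match local statistics
        fidelity = (bx - by) ** 2                                # pointwise squared error
        loss = loss + w_k * (s * fidelity + (1.0 - s) * realism).mean()
    return loss

In this reading, salient (foveal) regions are pushed toward exact reconstruction while peripheral regions only need to match local texture statistics, which is what yields the fine-grained fidelity-realism control.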

Proposed Saliency-Guided Perception–Fidelity Adapter.


Fig. 3 An SNR-aware modulation mechanism for the proposed loss function that preserves perceptual realism at low SNR while gradually shifting toward high-fidelity reconstruction at high SNR.
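As a rough illustration (not the adapter architecture itself), the sketch below shows how an SNR-dependent weight can shift a training objective from realism-dominated at low SNR to fidelity-dominated at high SNR. The sigmoid schedule, its midpoint and temperature, and the names snr_blend and adaptive_loss are assumptions made for illustration.

import torch

def snr_blend(snr_db, midpoint=5.0, temperature=2.5):
    # Hypothetical SNR-to-weight schedule: low SNR -> weight near 0 (favor realism),
    # high SNR -> weight near 1 (favor fidelity). Midpoint and temperature are assumed values.
    snr = torch.as_tensor(snr_db, dtype=torch.float32)
    return torch.sigmoid((snr - midpoint) / temperature)

def adaptive_loss(x_hat, x, saliency, snr_db, perceptual_fn):
    # One model, one objective: the channel SNR sampled for each training batch
    # modulates the balance between pixel-level fidelity and a perceptual term
    # (e.g. the wa_wd sketch above), so no separate fidelity/realism models are needed.
    lam = snr_blend(snr_db)
    fidelity = ((x_hat - x) ** 2).mean()
    realism = perceptual_fn(x_hat, x, saliency)
    return lam * fidelity + (1.0 - lam) * realism

For example, adaptive_loss(x_hat, x, saliency, snr_db=0.0, perceptual_fn=wa_wd) weights the perceptual term heavily, whereas snr_db=20.0 pushes the objective toward MSE.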

Numerical results across various scenarios


Fig. 4 Experimental results on the Kodak dataset.


Fig. 5 User study evaluating human preference.


Fig. 6 Visual comparisons at SNR = 0 dB and R = 1/24, where our method demonstrates clear advantages across the entire image. Additional examples are available in the paper.


Fig. 7 Prototype verification under different transmitter power levels on the Kodak dataset, with a bandwidth ratio of 1/2. The proposed method offers a clear advantage, preserving fine structures such as textures on the trees and even details beneath the river surface that are lost in the baseline reconstructions.

BibTeX


  @inproceedings{wu2025realism,
    title={Realism and Fidelity: Two Sides of a Coin in Deep Joint Source-Channel Coding},
    author={Wu, Haotian and Wang, Weichen and You, Di and Dragotti, Pier Luigi and Gunduz, Deniz},
    booktitle={NeurIPS 2025 Workshop: AI and ML for Next-Generation Wireless Communications and Networking},
    year={2025}
  }