Clearly the most finely sampled band (H1) should be the most coarsely quantized -- it contains the most data, and quantization errors only affect single pels (they do not propagate through further bands). In general, the coarser the sampling, the larger the range of influence of a quantization error, so the more closely spaced the quantizer levels should be. But how should the step size evolve from one band to the next? Calculation of optimal bit assignments based on the subband p.d.f.s (as in [4], for example) is not possible, because fine-band prediction depends on coarse-band quantization. Instead, the approach here is based on an argument about interaction between the levels, backed by experiments measuring PSNR. It takes no account of the visibility of quantizer noise, and may therefore be subjectively suboptimal. The basic idea is to avoid needlessly introducing non-zero values into the pyramid because of earlier quantization errors.
At any level of the binary pyramid, the accuracy of the prediction is constrained by the quantization error of earlier levels. The quantizer spacing of the current level should be set so that any value which would be quantized to 0 by the preceding level's quantizer, given an exact prediction, is still quantized to 0 given the inexact prediction actually obtained. If the range of prediction error caused by earlier quantization error can be calculated, then the appropriate spacing for the current level is the previous level's spacing plus the maximum error in the prediction.
Two opposite corners of the predictor surround square come from the immediately preceding level, and if closest-opposite-pair is used, their average will form the prediction about half the time. Assuming that quantizer spacing increases with level, this will be the worst case. The worst quantization noise from the previous level will be half the quantizer spacing for that level. Because a two-pel average is used, the prediction error variance will be less than the quantization noise variance, but the worst case will still be an error of half the previous level's spacing. The appropriate spacing for the current level is therefore one-and-a-half times the previous level's spacing.
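The worst-case argument above reduces to a one-line calculation, written out here for reference. The symbols S_prev (spacing of the previous level), S_cur (spacing of the current level) and e_max (maximum prediction error) are introduced only for this illustration and are not the author's notation.

```latex
e_{\max} = \tfrac{1}{2}\, S_{\mathrm{prev}}, \qquad
S_{\mathrm{cur}} = S_{\mathrm{prev}} + e_{\max}
                 = S_{\mathrm{prev}} + \tfrac{1}{2}\, S_{\mathrm{prev}}
                 = \tfrac{3}{2}\, S_{\mathrm{prev}}
```

Read in the other direction, the previous level's spacing is 2/3 (about 0.67) of the current level's spacing.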
The above argument is based on the worst case. Allowing for some erroneous non-zero quantization, and using a statistical approach based on the variances of the preceding levels' difference histograms, would yield a more effective data-rate/distortion tradeoff. However, the result that the current level's spacing should be a fixed multiple of the previous level's would still be valid. Thus, if S(n) is the quantizer step size for band H(n), then given a step size S(1) for H1, the other spacings can be calculated by S(n+1) = aS(n), where a is a constant less than 1. Rather than go further into analysis, a set of experiments was conducted to determine the appropriate scale constant.
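The recurrence S(n+1) = aS(n) can be sketched in a few lines of Python. This is only an illustration: the function name band_step_sizes and its parameters are hypothetical, and the sketch keeps the step sizes as floating-point values, whereas a real coder might round them.

```python
def band_step_sizes(s1, a, num_bands):
    """Return [S(1), ..., S(num_bands)] using S(n+1) = a * S(n).

    s1        -- step size chosen for the finest band, H1
    a         -- scale constant, assumed here to satisfy 0 < a <= 1
    num_bands -- number of bands (hypothetical parameter)
    """
    if num_bands < 1:
        raise ValueError("num_bands must be at least 1")
    if not 0.0 < a <= 1.0:
        raise ValueError("a must satisfy 0 < a <= 1")
    sizes = [float(s1)]
    for _ in range(num_bands - 1):
        # Each coarser band gets a step a times that of the band before it.
        sizes.append(a * sizes[-1])
    return sizes


# Example: a step of 16 for H1 with a = 0.75 over three bands.
print(band_step_sizes(16, 0.75, 3))  # [16.0, 12.0, 9.0]
```

The step sizes shrink geometrically towards the coarser bands, which matches the argument that errors in coarse bands influence more pels.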
Quantizers were defined for a = 0.85, 0.8, 0.75, and 0.67 (the theoretical worst case from above, where S(n) = 1.5S(n+1)), and a range of step sizes for H1. Each test image was then coded and the results graphed. The same trend was observed in all cases. The value a = 0.8 gives the best rate/distortion performance between 32 and 34 dB, with 0.85 being better up to 36 dB and 0.75 being better at low data rates. The difference is attributable to the different shapes of the difference histograms of the various bands. The coarser bands have higher variances, so the effect of coarse quantization on finer levels' prediction is greater. At low data rates, even the coarse bands are quantized coarsely, and this therefore has a disproportionate effect on quality.
The quantizers used in the evaluation programs for BTPC use a = 0.8 for higher data rates and a = 0.75 for lower data rates.