Using the Xilinx DDS and MULTIPLIER cores, I simulated a system with a scalable output. However, after the scaler I notice that the amplitude drops by a factor of two, even when the output scaler is at its maximum positive value (0111..) or its minimum negative value (1000..).
Thinking it through, I see that only one case (1000.. * 1000..) delivers a result whose top two bits differ. So if I clip the value range of the scale factor at the low end to exclude 1000.., the second-highest bit will always equal the top bit, which makes the top bit a redundant copy of the sign. For an M-bit * N-bit signed multiplication, dropping that top bit and using only the remaining (M+N-1) bits of the result should then not lose any information.
Is that right?
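
To sanity-check the reasoning, here is a small brute-force sketch in Python rather than HDL. It assumes both operands are signed two's-complement values, and the widths M = 6 and N = 5 as well as the helper names are my own choices for illustration, not anything taken from the Xilinx cores. It lists every operand pair whose (M+N)-bit product has differing top two bits, first over the full range and then with the scale factor 1000.. excluded.

```python
def top_two_bits_differ(p, width):
    """True if the two most significant bits of p, viewed as a
    width-bit two's-complement number, are different."""
    u = p & ((1 << width) - 1)          # two's-complement bit pattern
    msb = (u >> (width - 1)) & 1
    second = (u >> (width - 2)) & 1
    return msb != second


def offending_pairs(m, n, exclude_min_scale=False):
    """All (signal, scale) pairs whose (m+n)-bit product needs the top bit.

    signal is an m-bit signed value, scale an n-bit signed value.
    With exclude_min_scale=True the scale value 1000.. is left out.
    """
    signal_range = range(-(1 << (m - 1)), 1 << (m - 1))
    scale_low = -(1 << (n - 1)) + (1 if exclude_min_scale else 0)
    scale_range = range(scale_low, 1 << (n - 1))
    return [(a, b)
            for a in signal_range
            for b in scale_range
            if top_two_bits_differ(a * b, m + n)]


M, N = 6, 5  # hypothetical widths, small enough to enumerate
print(offending_pairs(M, N))                          # full range
print(offending_pairs(M, N, exclude_min_scale=True))  # 1000.. excluded
```

For these widths the first list contains only the pair (-32, -16), i.e. 1000.. * 1000.., and the second list is empty, which matches what I expected. It is of course only an enumeration for one pair of small widths, not a proof for general M and N.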