Yes, it's possible (it's called deconvolution). First step: characterize the room response, i.e. its transient (impulse) response. Second step: Fourier transform both the recording and that response function.
Now here's the tricky part: the room multiplies the FFT of the pure input by the FFT of the response, and what you hear is the inverse transform of that product. Deconvolution requires that you divide the FFT of the contaminated recording by the response FFT, then invert the transform. The problem, of course, is that the response FFT has zeroes (or near-zeroes), and the recording includes some noise. Dividing by those values amplifies the noise enormously; I leave the rest to your imagination, it's too distressing to discuss.
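To make the division and its failure mode concrete, here is a minimal NumPy sketch, not a production method. Everything in it is an assumption for illustration: the toy "room" (direct sound plus one echo), the circular convolution, the noise level, and the `eps` damping term, which is a generic regularization trick rather than anything specific to this answer.

```python
import numpy as np

def deconvolve(recording, response, eps=0.0):
    """Divide the recording's FFT by the response's FFT, then invert.

    eps is an assumed regularization constant: 0 gives the plain division,
    eps > 0 damps the bins where the response FFT is close to zero.
    """
    n = len(recording)
    R = np.fft.rfft(recording, n)
    H = np.fft.rfft(response, n)
    # conj(H) / (|H|^2 + eps) reduces to 1 / H when eps == 0.
    X = R * np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.fft.irfft(X, n)

def rms(x):
    return float(np.sqrt(np.mean(np.square(x))))

rng = np.random.default_rng(0)
n = 1024
dry = rng.standard_normal(n)  # stand-in for the pure input

# Toy "room" (hypothetical): direct sound plus one strong echo 40 samples later.
# Its FFT dips to about 0.001 at a few bins, i.e. it has near-zeroes.
response = np.zeros(n)
response[0] = 1.0
response[40] = 0.999

# Circular convolution via the FFT keeps the toy example exact.
wet = np.fft.irfft(np.fft.rfft(dry) * np.fft.rfft(response), n)
noisy = wet + 0.01 * rng.standard_normal(n)  # a little recording noise

err_clean = rms(deconvolve(wet, response) - dry)               # no noise
err_naive = rms(deconvolve(noisy, response) - dry)             # plain division
err_reg = rms(deconvolve(noisy, response, eps=1e-4) - dry)     # damped division
print(err_clean, err_naive, err_reg)
```

Without noise the plain division recovers the input almost exactly; with even a little noise, the bins where the response is nearly zero dominate the error, and the damped version trades a small loss of signal at those bins for a much smaller overall error.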
So, in practice, one examines the response FFT and identifies a few components of the range that DO lend themselves to modeling/deconvolution. More elaborate techniques can be employed to good effect in special cases (maximum entropy filtering, for example, retains quiet periods especially well).
Echo reduction is potentially easier: you low-pass filter and delay a copy of the signal, then subtract it from the original. It's hard to get the phases right, but not impossible. If you leave out the low-pass filter, it IS impossible.
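A rough sketch of that filter, delay, and subtract idea, assuming the echo is a single low-passed, delayed, attenuated copy of the sound. The names `delay`, `gain` and `alpha` and the one-pole filter are illustrative assumptions; in a real room they would have to be estimated, which is where getting the phases right becomes the hard part.

```python
import numpy as np

def one_pole_lowpass(x, alpha):
    """Simple low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]."""
    y = np.empty_like(x)
    acc = 0.0
    for i, v in enumerate(x):
        acc = alpha * v + (1.0 - alpha) * acc
        y[i] = acc
    return y

def delayed(x, d):
    """Shift x later by d samples (d > 0), padding the start with zeros."""
    out = np.zeros_like(x)
    out[d:] = x[:-d]
    return out

def echo_path(x, delay, gain, alpha):
    """Assumed echo model: low-passed, delayed, attenuated copy of x."""
    return gain * delayed(one_pole_lowpass(x, alpha), delay)

def reduce_echo(recording, delay, gain, alpha):
    """Subtract a modeled echo, built from the recording itself."""
    return recording - echo_path(recording, delay, gain, alpha)

def rms(x):
    return float(np.sqrt(np.mean(np.square(x))))

rng = np.random.default_rng(1)
dry = rng.standard_normal(4000)

# Hypothetical echo parameters, assumed known exactly for this toy case.
delay, gain, alpha = 300, 0.5, 0.3
recording = dry + echo_path(dry, delay, gain, alpha)
cleaned = reduce_echo(recording, delay, gain, alpha)

echo_before = rms(recording - dry)
echo_after = rms(cleaned - dry)
print(echo_before, echo_after)
```

The subtraction is not perfect even with exact parameters: because the modeled echo is built from the recording rather than the pure sound, a weaker "echo of the echo" at twice the delay is left behind.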
Easier yet is to remove the boomy sound by bandpass filtering (but this doesn't correctly handle transients/fast changes).
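As an illustration only, a crude bandpass can be sketched by zeroing FFT bins outside a chosen band. The 300 to 3400 Hz band, the 8 kHz sample rate and the test tones are arbitrary assumptions, and a brick-wall mask like this is about the bluntest possible filter: it rings on transients, which is exactly the caveat above.

```python
import numpy as np

def bandpass_fft(x, fs, lo_hz, hi_hz):
    """Zero every FFT bin outside [lo_hz, hi_hz], then invert."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo_hz) | (freqs > hi_hz)] = 0.0
    return np.fft.irfft(spectrum, len(x))

fs = 8000                                   # assumed sample rate, Hz
t = np.arange(fs) / fs                      # one second of samples
boom = 0.8 * np.sin(2 * np.pi * 60 * t)     # low-frequency "boom"
tone = 0.3 * np.sin(2 * np.pi * 1000 * t)   # content we want to keep

filtered = bandpass_fft(boom + tone, fs, 300.0, 3400.0)
print(float(np.max(np.abs(filtered - tone))))
```

On steady tones like these the boom is removed cleanly; on speech or music with sharp attacks, a gentler filter design would be the usual choice.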