The MOSFET idea for the row seems good to me. You could use BJTs, but at 3.2A each you will probably need a second BJT to reduce the current needed from your row decoder. With just one BJT there, you'd probably need to plan on about 100mA of base drive, so another BJT to cut that to a few mA would be indicated. The MOSFET avoids that because its gate takes no steady current, and at the slow rates you are talking about, the current needed to charge the gate won't likely be a problem.
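A rough sketch of the arithmetic behind those base-drive figures. The forced-beta values below (32 for the saturated row switch, 30 for the added driver stage) are assumptions picked for illustration, not datasheet numbers, and the function name is made up for this example.

```python
# Base-drive estimate for a saturated BJT row switch.
# ASSUMPTIONS: both forced-beta values are illustrative guesses.

ROW_CURRENT_A = 3.2        # per-row current from the discussion
FORCED_BETA_SWITCH = 32.0  # assumed forced beta for the 3.2 A switch
FORCED_BETA_DRIVER = 30.0  # assumed forced beta for the extra driver BJT


def base_drive_amps(collector_amps, forced_beta):
    """Base current needed to hold a BJT switch in saturation."""
    return collector_amps / forced_beta


# One BJT: the row decoder must supply the whole base current.
single_stage_ma = base_drive_amps(ROW_CURRENT_A, FORCED_BETA_SWITCH) * 1000.0

# Two BJTs: the decoder only drives the base of the added stage,
# whose collector current is the first stage's base current.
two_stage_ma = base_drive_amps(single_stage_ma / 1000.0, FORCED_BETA_DRIVER) * 1000.0

print(f"one BJT : {single_stage_ma:.1f} mA from the decoder")
print(f"two BJTs: {two_stage_ma:.1f} mA from the decoder")
```

With these assumed betas the single-stage figure lands at about 100mA and the two-stage figure at a few mA, which is the same ballpark as above. Real parts will differ, so check the saturation curves for whatever transistor you pick.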
Power loss really isn't going to be a problem either way. I think you are going to have excess voltage to toss away somewhere anyway. If you used a MOSFET with a really low on-resistance to keep down power loss in the row switch, that voltage would simply have to be taken up by the BJT column driver instead. So there is no good way to win. BJTs on the row driver would be similarly acceptable. My own personal choice would probably be to use the BJTs, mostly because they are easily available from a lot of hobbyist sources, are dead cheap (I pay one cent for a 2N2222 in 1000 qty), and because I'm well used to them.
The column drivers will need to include the current limiting. My own choice is something like I showed you (the A, B, C, and D sections) where the BJT is operated with the base "nailed" to a voltage, the emitter following that voltage closely, and the emitter resistor setting the current through your LED. It's all pretty automatic, then, and it works reasonably well. And if you keep the base voltage high enough, the few tenths of a volt of base-emitter voltage change over a fairly wide range of temps won't affect the current enough to notice.
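To put rough numbers on that, here is a minimal sketch of the emitter-resistor current setting. It assumes a nominal base-emitter drop of 0.7V, treats the LED (collector) current as equal to the emitter current by ignoring base current, and uses a 0.1V drop in Vbe as an example temperature shift. The function names, base voltages, and resistor value are all hypothetical.

```python
# Column current sink: base held at a fixed voltage, emitter resistor
# sets the current.
# ASSUMPTIONS: Vbe is about 0.7 V nominal, base current is ignored
# (collector current taken as equal to emitter current), and every
# component value used below is an example only.

V_BE_NOMINAL = 0.7  # volts, assumed


def led_current_amps(v_base, r_emitter, v_be=V_BE_NOMINAL):
    """LED current set by the voltage left across the emitter resistor."""
    return (v_base - v_be) / r_emitter


def current_shift_percent(v_base, delta_v_be, v_be=V_BE_NOMINAL):
    """Percent change in LED current when Vbe moves by delta_v_be volts."""
    return 100.0 * (-delta_v_be) / (v_base - v_be)


# Example: 5 V on the base and a 215 ohm emitter resistor.
nominal_ma = led_current_amps(5.0, 215.0) * 1000.0

# Vbe falling by 0.1 V (a warm transistor) raises the current a little.
shift_high_base = current_shift_percent(5.0, -0.1)
# The same Vbe change with the base at only 1.5 V matters far more.
shift_low_base = current_shift_percent(1.5, -0.1)

print(f"nominal current: {nominal_ma:.1f} mA")
print(f"shift with 5.0 V base: {shift_high_base:.1f} %")
print(f"shift with 1.5 V base: {shift_low_base:.1f} %")
```

The point of the comparison is that the Vbe change is divided by the voltage across the emitter resistor, so a higher base voltage makes the same temperature drift a smaller fraction of the LED current.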
Yup. There's no win there.
Jon