The most basic difference is that a bipolar transistor requires a continuous current into its control terminal (the base lead), whereas a MOSFET requires essentially none once its gate has been charged. Each has advantages in different situations.
You generally cannot substitute a bipolar transistor for a FET, because the circuit will not have been designed to supply the required base current.
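To make the base-current requirement concrete, here is a minimal Python sketch of the usual first-order estimate: base current is roughly collector current divided by the current gain (beta). The function names, the beta of 100, the 0.7 V base-emitter drop, and the 5 V drive are illustrative assumptions, not values for any particular part.

```python
def base_current(collector_current_a, beta):
    """Steady-state base current needed to support a given collector current."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return collector_current_a / beta


def base_resistor(drive_voltage_v, collector_current_a, beta, v_be=0.7, overdrive=1.0):
    """Series resistor that sets the base current from a voltage source.

    v_be is an assumed base-emitter drop; overdrive > 1 supplies extra
    base current, as is common when the transistor is used as a switch.
    """
    if drive_voltage_v <= v_be:
        raise ValueError("drive voltage must exceed V_BE")
    i_b = overdrive * base_current(collector_current_a, beta)
    return (drive_voltage_v - v_be) / i_b


if __name__ == "__main__":
    # Assumed example: 100 mA load, beta = 100, 5 V drive signal.
    i_b = base_current(0.1, 100)
    r_b = base_resistor(5.0, 0.1, 100)
    print(f"I_B = {i_b * 1e3:.2f} mA, R_B = {r_b:.0f} ohm")
```

A circuit built around a FET typically has no provision for that milliamp or so of continuous drive, which is why dropping in a bipolar part usually fails.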
MOSFETs have three leads: a source, a gate, and a drain. Bipolar transistors also have three leads, but they are called emitter, base, and collector. These leads roughly correspond to one another, i.e., the emitter is like the source, the base is like the gate, and the collector is like the drain. Making the base (gate) more positive (for NPN and N-MOSFETs) or more negative (for PNP and P-MOSFETs) with respect to the emitter (source) causes more current to flow between collector (drain) and emitter (source).
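As a rough illustration of how the control voltage sets the output current in each device, the sketch below uses two common textbook approximations: the exponential (Shockley-style) relation for an NPN transistor in its forward-active region, and the square-law model for an N-MOSFET in saturation. The parameter values (saturation current, thermal voltage, threshold voltage, and the gain factor k) are assumed for illustration only and do not describe a real part.

```python
import math


def bjt_collector_current(v_be, i_s=1e-14, v_t=0.02585):
    """Approximate NPN collector current (A) for a base-emitter voltage (V).

    i_s (saturation current) and v_t (thermal voltage near room
    temperature) are assumed illustrative values.
    """
    return i_s * (math.exp(v_be / v_t) - 1.0)


def mosfet_drain_current(v_gs, v_th=2.0, k=0.5):
    """Approximate N-MOSFET drain current (A) in saturation (square-law model).

    v_th (threshold voltage, V) and k (A/V^2) are assumed illustrative values.
    Below threshold the model treats the device as off.
    """
    v_ov = v_gs - v_th
    if v_ov <= 0:
        return 0.0
    return 0.5 * k * v_ov ** 2


if __name__ == "__main__":
    for v in (0.60, 0.66, 0.72):
        print(f"V_BE = {v:.2f} V -> I_C ~ {bjt_collector_current(v):.3e} A")
    for v in (1.0, 3.0, 4.0):
        print(f"V_GS = {v:.1f} V -> I_D ~ {mosfet_drain_current(v):.3e} A")
```

In both models a larger control voltage gives more output current, but the bipolar device responds exponentially over a few hundred millivolts, while the MOSFET does nothing until the gate passes its threshold.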
This terminology is totally confusing, and, sadly, you just have to get used to it if you want to talk about these things.
MOSFETs are used to construct CMOS devices, and are thus the main type of transistor in microprocessors. They are also good for constructing very large power transistors, which are easier to drive because the gate draws no steady-state current.
Bipolar transistors are generally more useful for analog design, where the lower noise, more easily predicted voltage requirements, and lower control voltages are useful.
For a FET, the electrostatic field of charges on the control terminal (the gate) is used to control the output. MOSFETs have a silicon dioxide layer that insulates the gate from the channel. JFETs use a reverse-biased PN junction's depletion region to isolate the gate from the source and drain. For bipolar transistors, the movement of charges across PN junctions controls the output.