Did you see the section of that application note where they mention hardware-based JTAG controllers that are up to 16 times faster than twiddling the parallel port bits in software? You could buy one of those. They were ISA cards when the note was written, but they are mostly USB today. Some still attach to the parallel port, but they use it to transfer data in parallel to an external hardware serializer rather than toggling each JTAG signal from software.
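To get a feel for why the software approach is slow, here is a rough cost model in C. It is only a sketch: the struct, the field names and the function are mine, not from the application note, and every figure you feed it is something you would have to look up or measure for your own board. It assumes the flash is being driven through the boundary-scan chain, so each bus cycle costs at least one full shift of the chain.

```c
#include <stdio.h>

/* Hypothetical cost model; none of these names come from the app note. */
typedef struct {
    unsigned long chain_bits;       /* boundary-scan chain length, in bits        */
    unsigned long scans_per_write;  /* full chain shifts per flash write cycle    */
    unsigned long port_ops_per_bit; /* port writes per shifted bit (TDI, TCK up,  */
                                    /* TCK down, ...)                             */
    double        port_op_seconds;  /* measured time of one port I/O operation    */
} jtag_cost;

/* Estimated seconds spent on port I/O to perform `flash_writes` write cycles. */
double bitbang_seconds(const jtag_cost *c, unsigned long flash_writes)
{
    double ops = (double)c->chain_bits
               * (double)c->scans_per_write
               * (double)c->port_ops_per_bit
               * (double)flash_writes;
    return ops * c->port_op_seconds;
}
```

Fill in the chain length from the part's BSDL file and time a port write on your own PC. The point is that the cost is a product of all four terms, so a hardware serializer that removes the per-bit port writes attacks the biggest multiplier.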
Another option might be to dispense with JTAG (or use it only to program a small bootstrap routine), then download the bulk of what you want to program through a faster interface to a program running on the 386EX, and have that program write the flash. This only works if you can control the necessary flash signals while the 386EX is awake and executing code, instead of being a slave to the JTAG TAP interface.
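A minimal sketch of what such an on-target loader loop could look like, in C. Everything here is an assumption for illustration: `load_image`, the two hook types and the `sim_*` stand-ins are hypothetical names, and the real `flash_write` hook would have to implement whatever command sequence your flash part needs, which is exactly where the "can you control the signals" question gets answered. The `sim_*` functions only exist so the loop can be tried on a PC without hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hooks; real versions depend on the link and the flash part. */
typedef size_t (*link_read_fn)(uint8_t *buf, size_t max);  /* 0 = end of image */
typedef int (*flash_write_fn)(uint32_t addr, const uint8_t *src, size_t n); /* 0 = ok */

/* Pull the image over the fast link in chunks and write each chunk to flash.
 * Returns the number of bytes written, or -1 if a flash write fails. */
long load_image(link_read_fn link_read, flash_write_fn flash_write, uint32_t base)
{
    uint8_t  buf[256];
    uint32_t addr = base;
    size_t   n;

    while ((n = link_read(buf, sizeof buf)) > 0) {
        if (flash_write(addr, buf, n) != 0)
            return -1;
        addr += (uint32_t)n;
    }
    return (long)(addr - base);
}

/* Host-side stand-ins (not real drivers): a buffer plays the link,
 * and a 1 KB array plays the flash. */
const uint8_t *sim_src;
size_t         sim_left;
uint8_t        sim_flash[1024];

void sim_link_open(const uint8_t *image, size_t len)
{
    sim_src  = image;
    sim_left = len;
    memset(sim_flash, 0xFF, sizeof sim_flash);  /* start from an erased array */
}

size_t sim_link_read(uint8_t *buf, size_t max)
{
    size_t n = sim_left < max ? sim_left : max;
    if (n > 0) {
        memcpy(buf, sim_src, n);
        sim_src  += n;
        sim_left -= n;
    }
    return n;
}

int sim_flash_write(uint32_t addr, const uint8_t *src, size_t n)
{
    if (addr > sizeof sim_flash || n > sizeof sim_flash - addr)
        return -1;                              /* would run off the array */
    memcpy(&sim_flash[addr], src, n);
    return 0;
}
```

On the host you would call `sim_link_open(image, len)` and then `load_image(sim_link_read, sim_flash_write, base)`. On the target, the same loop would be handed a serial or parallel-port receive routine and a flash routine for your particular part. JTAG then only has to get this small loader into place.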