Hi
I am currently upgrading a legacy embedded application that uses a CompactFlash card (CFC) for real-time storage. When the system was designed (circa 2000/2001), the Hitachi controllers prevalent in the CFC market could sustain high write speeds for small numbers of sectors (e.g. 4). Since then, although published card speeds have improved, the small writes used by this system miss their deadlines on newer cards; i.e. it does not work. My first step has been to free up some (precious) RAM to increase my buffering. The buffering is now up to 32 sectors with (typically) 16-sector writes, but I am still not getting anywhere close to the required performance.

I am writing to the CFC card (512MB, 1GB) in True-IDE mode. There are two modes I operate in. In the first there is one write stream, writing up to
16 sectors at a time (16 x 256 x 16 bits). In the second case there are two such streams. Each sector represents about 5.8msec of audio, so a typical write (16 sectors) occurs roughly every 92.9msec per stream. This represents a sustained write speed of 88.2KB/sec per stream. I am writing to the card using the standard WRITE_SECTORS (0x30) operation code. After the data transfer completes, I see a significant delay before the card becomes ready again; long enough that I miss deadlines and have to crash-stop the recording. One write stream works (88.2KB/sec to contiguous sectors in 32-sector LBA blocks); two streams fail consistently (interleaved writes to independent LBAs in 32-sector LBA blocks) on all cards I have tried, except the old Hitachi-based cards, which work great but which I can no longer obtain.

Typically my writes should start aligned on LBA multiples of 16, but this is not currently guaranteed by my code. No write will cross a 32-sector LBA boundary.
Contiguity of the writes is, however, guaranteed; i.e. I write sectors with LBAs 32768-32783, then 32784-32799 (aligned within the 32-sector block), or I could complete the write as 32768-32778, then 32779-32785, then 32786-32799. In some circumstances I start a write part-way through a 32-sector 'cluster'. Note that my FAT clusters are aligned on 32-LBA boundaries (i.e. all cluster start LBAs are multiples of 32).

The delay is such that when I have two such streams running (to different FAT clusters, using a 32-sector cluster size) I cannot meet my deadlines.
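
To make the write pattern concrete, here is a minimal sketch in C of the splitting rule described above: a write covers at most 16 sectors and never crosses a 32-sector boundary. The function names are hypothetical and only illustrate the rule; this is not the actual driver code, and the WRITE_SECTORS call itself is left out.

```c
#include <stdint.h>

#define CLUSTER_SECTORS   32u  /* FAT cluster size in sectors, as described above */
#define MAX_WRITE_SECTORS 16u  /* largest single write, as described above */

/* Hypothetical helper: number of sectors the next write may cover when it
 * starts at `lba` with `remaining` sectors buffered. The result never
 * crosses a 32-sector boundary and never exceeds 16 sectors. */
uint32_t next_write_len(uint32_t lba, uint32_t remaining)
{
    /* Sectors left before the next 32-sector boundary. */
    uint32_t room = CLUSTER_SECTORS - (lba % CLUSTER_SECTORS);
    uint32_t n = remaining;

    if (n > room) {
        n = room;
    }
    if (n > MAX_WRITE_SECTORS) {
        n = MAX_WRITE_SECTORS;
    }
    return n;
}

/* Hypothetical helper: nonzero when a write starts on an LBA multiple of 16. */
int write_is_aligned(uint32_t lba)
{
    return (lba % MAX_WRITE_SECTORS) == 0u;
}
```

For example, starting at LBA 32779 with 21 sectors buffered, this yields a 16-sector write (32779-32794) followed by a 5-sector write (32795-32799); neither starts on a multiple of 16, which is the unaligned case mentioned above.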
My conclusion is that the traffic hitting the card is triggering a lot of internal management, which results in the delays I am seeing. I know these cards ought to be able to sustain writes on the order of several MB/sec, yet I am seeing them fail at 176KB/sec, an order of magnitude below my expectation.
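
For anyone wanting to check the timing budget, the figures can be derived from the sector size and the per-stream data rate. The following is a rough sketch in C, assuming 1KB = 1000 bytes and 512-byte sectors (256 x 16 bits); the function names are made up for illustration.

```c
#include <stdint.h>

#define SECTOR_BYTES     512u    /* 256 x 16-bit words per sector */
#define STREAM_BYTES_SEC 88200u  /* 88.2KB/sec per stream, assuming 1KB = 1000 bytes */

/* Microseconds of audio held in `sectors` sectors of one stream
 * (truncated to a whole microsecond). */
uint32_t audio_usec(uint32_t sectors)
{
    uint64_t bytes = (uint64_t)sectors * SECTOR_BYTES;
    return (uint32_t)((bytes * 1000000u) / STREAM_BYTES_SEC);
}

/* Aggregate data rate in bytes/sec for `streams` concurrent streams. */
uint32_t total_bytes_per_sec(uint32_t streams)
{
    return streams * STREAM_BYTES_SEC;
}
```

Calling audio_usec(32) gives the amount of audio the full 32-sector buffer of one stream can absorb, which is the upper bound on how long the card may stay busy before that stream misses its deadline.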
So what I need to figure out is the following: