High USB throughput requirement

Hi, we are developing a project which requires a sustained USB transfer rate of 4 Mbytes/sec on average. Basically, a microcontroller has to read data from a 16-bit high-speed ADC and send it to USB very quickly. The ADC converts a value every 0.5 usec, so the microcontroller has to read this 16-bit value and send it over USB, i.e. 2 bytes/0.5 usec = 4 Mbytes/sec.

On the PC side also, this 4MB/sec data rate has to be read through and written to a file continuously.

1) What is the best hardware approach to achieve this? Is any microcontroller suitable? Almost all popular 32-bit uCs like the LPC17xx, PIC32 etc. have only full-speed USB (12 Mbits/sec max). It is OK if this cannot be an on-chip USB solution; we can work with external high-speed USB controllers such as the FT232H.

2) What would be the best approach to handle this data flow on the PC software side (we plan to use VB.net or C#)?

Your inputs would be most valuable in ensuring we have a working design the first time.

Reply to
navman

Would it be an option to use Ethernet instead? 100 Mbps Ethernet controllers are fairly easy to find in these popular devices.

Reply to
Arlet Ottens

Since you are only writing the data to disk and not running any real-time control loops with it, buffering the data at the source will greatly reduce protocol overhead.

Even 100 Mbit/s Ethernet should be able to do it. Putting 512 samples (1024 bytes) in each raw Ethernet (or UDP) frame would require about 3900 frames to be sent each second, consuming about 40 % of the 100BaseT capacity.

Have you verified that the physical disk drive is capable of such sustained throughput? HDTV recorders operate at 1-2 Mbytes/s and allow simultaneous read and write access, so this is usually OK. If not, you may have to use some RAID arrangement to write to multiple disks in parallel.

On many virtual memory operating systems, such as VMS, Windows or Linux, memory mapped files are an effective way of handling very large data sets. After the file is opened and mapped (preferably on a defragmented drive), there exists a mapping between the program's virtual memory pages and specific disk blocks on the drive. The program sees a huge virtual memory array (gigabytes or terabytes) which can be accessed just like any other array. A change to a memory location will sooner or later be reflected in the corresponding disk block.

With standard 4096 byte x86 virtual memory pages, 2048 16-bit samples can be written into each page of that array and later transferred to disk. By requesting a flush every 1000 pages, data is written to disk before physical memory consumption becomes excessive. For your data volumes, it would make sense to use a processor and an operating system supporting 64-bit address spaces. I do not know whether your preferred tools support 64-bit addressing.
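As a rough illustration of what that looks like on Win32 (build as 64-bit; "capture.bin", the 4 GB size, the flush interval and the read_adc_sample() routine are my own placeholders, not anything from this thread):

// Minimal sketch of the memory-mapped file idea on Win32.
#include <windows.h>
#include <stdint.h>

extern uint16_t read_adc_sample(void);  // hypothetical: one sample from USB

int main(void)
{
    const uint64_t FILE_SIZE = 1ULL << 32;             // 4 GB capture file
    HANDLE file = CreateFileA("capture.bin", GENERIC_READ | GENERIC_WRITE,
                              0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    HANDLE map  = CreateFileMappingA(file, NULL, PAGE_READWRITE,
                                     (DWORD)(FILE_SIZE >> 32), (DWORD)FILE_SIZE, NULL);
    uint16_t *samples = (uint16_t *)MapViewOfFile(map, FILE_MAP_WRITE, 0, 0, 0);
    if (!samples) return 1;

    const uint64_t PER_PAGE    = 4096 / sizeof(uint16_t);  // 2048 samples/page
    const uint64_t FLUSH_EVERY = PER_PAGE * 1000;          // flush every 1000 pages
    for (uint64_t i = 0; i < FILE_SIZE / sizeof(uint16_t); i++) {
        samples[i] = read_adc_sample();                    // plain array access
        if ((i + 1) % FLUSH_EVERY == 0)                    // push dirty pages out
            FlushViewOfFile(samples + i + 1 - FLUSH_EVERY,
                            FLUSH_EVERY * sizeof(uint16_t));
    }
    UnmapViewOfFile(samples);
    CloseHandle(map);
    CloseHandle(file);
    return 0;
}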

Reply to
upsidedown

What is your definition of "sustained"? Do you need to collect continuously for minutes, hours, days, or months? (I ask because I've designed systems that collect continuously for 6 months on 20mA of battery power---although at only about 3KB/second.)

If the answer is in the hours range, a PC with lots of memory and a few hundred MB dedicated to ping-pong buffers may be the answer.

A really sophisticated solution might involve a custom USB driver stuffing data into very large buffers using DMA. Writing drivers for Windows is a very specialized skill set, though.

Mark Borgerson

Reply to
Mark Borgerson

Why so much memory? Streaming 4MB/sec to a modern SATA drive should be easy without any special tricks. On my desktop here, I'm easily writing around 40MB/sec.

Reply to
Arlet Ottens

You do not have sufficient experience and resources to accomplish this project. Hence the best approach would be an off-the-shelf board from NI.

The ADI Blackfin has high-speed USB.

The best approach is staying with lean and mean C++. C#/.NET is not intended for this kind of application.

Hm. It depends. How much is the value behind your words?

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

Reply to
Vladimir Vassilevsky

Appreciate your opinions. However, I should say that using Ethernet would not be possible in our application, so we have to stick with USB. Also, using NI or Blackfin would not be a cost-effective solution and may be overkill. We are planning something like this:

A/D ----> PIC32--[parallel mode]--->FT232H ---> USB

It seems the best and most cost-effective approach at the moment. Any feedback on the above scheme?

The FT232H is a high-speed USB (480 Mbps) device. In their datasheet they claim transfer rates of 40MB/s, so 4MB/s should be achievable even if that figure was measured under very ideal conditions.

Reply to
navman

I suspect Mark is addressing the "real-time" aspects *in* the PC and how to ensure they *can* be met regardless of the OP's choice of implementation language, etc. (without resorting to the "gee, the 3GHz dual core didn't work, let's try *4* GHz")

Reply to
D Yuniskis

As others have pointed out, there are some missing key requirements (notably data reliability). We have a product with a 500K byte/sec requirement. Data is transmitted device-to-host over a bulk endpoint. If the PC can't drain the bulk endpoint fast enough, data is dropped. We found the bottleneck is the host-side data consumption rate.

You can use an isochronous endpoint for data transmission, but you can expect to drop data as a rule when data consumption rates drop below some threshold.

Somebody else suggested making a custom driver that would DMA the data to memory or disk. That sounds like a slick idea to me. You'll need to fully understand your bottlenecks so that you can properly size your buffers etc.

JJS

Reply to
John Speth

How is "let's try 4 GHz" any different than "let's try a few hundred MB" ?

Reply to
Arlet Ottens

The FT232H in 245 FIFO mode can handle this data rate without a problem and is easy to use. Setting the PC driver latency timer to a lower value than the default 16ms may be needed.
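The host side with FTDI's D2XX API looks roughly like this (the device index, 64 KB transfer size and 2 ms latency value are my assumptions -- check them against the D2XX Programmer's Guide):

// Rough D2XX read loop for an FT232H in 245 FIFO mode.
#include <ftd2xx.h>

int main(void)
{
    FT_HANDLE ft;
    if (FT_Open(0, &ft) != FT_OK) return 1;   // first FTDI device (assumed)

    FT_SetBitMode(ft, 0xFF, 0x40);            // 0x40 = synchronous 245 FIFO
    FT_SetLatencyTimer(ft, 2);                // down from the 16 ms default
    FT_SetUSBParameters(ft, 65536, 65536);    // bigger USB transfer size

    static unsigned char buf[65536];
    DWORD got;
    for (;;) {
        if (FT_Read(ft, buf, sizeof(buf), &got) != FT_OK || got == 0) break;
        // ... hand 'got' bytes to a disk-writer thread here ...
    }
    FT_Close(ft);
    return 0;
}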

FTDI also has UM232H boards for development and testing.

The Cypress FX2LP can do over 30Mbytes/sec too, but is far more difficult to use.

I'd stay away from .NET for handling data like this. I've seen C# code doing this, but the customer had to buy a 2-core 3GHz PC to process data from an AVR and some other chips of similar scale (to defend MS, the customer code was badly written, too).

Doing reasonable C++ code will handle this even on an older PC.

Good luck! (and prepare for a few rounds just in case..)

--
Mikko OH2HVJ
Reply to
Mikko OH2HVJ

The point was that you have more control over getting data from USB into memory than some .NET application trying to get that data *through* the OS, multitasking, etc. and onto the disk. If "The Application" can't keep up with the data collection AND STORAGE aspects at 3GHz, your only remedy is to kick the processor speed up higher until it *can* (because you have little/no control over userland performance in C# applications, etc.)

I.e., I think Mark's point was that you could write a driver that just allocates a huge buffer and sits there busily *filling* it and get PREDICTABLE/reliable behavior a lot easier than adding some user-land application that has to stream that data off to the disk AS FAST AS it is coming in REGARDLESS OF WHAT ELSE IS GOING ON in the system at the time.

Of course, if Mark meant *otherwise*, he can correct me... :>

Reply to
D Yuniskis

And, can you *control* what else is happening in the host while your "critical" application is running? Is it going to start indexing a drive, etc.? Are *you* going to configure and deliver the PC or *hope* that the user knows which services to disable, etc. to ensure capacity is present for your needs?

You're still stuck with *using* that data. I.e., if you use X% (X < 50) of the processor moving data onto disk, then when will you ever have capacity to *process* the data?

This really only works if there is a limit to the total data envelope.

As a cheat, I would be inclined to see if there isn't some existing protocol that you could "trick" into moving (storing) your data on your behalf. (e.g., let it masquerade as streaming video to a DVR application, etc. That way, someone else has done the heavy lifting to meet the real-time aspects)

[I'd also opt for an Ethernet-based solution -- 100M or even gigabit -- to get around USB]
Reply to
D Yuniskis

If you're collecting hours worth of data, a few hundred MB isn't going to be enough to store all the data, so you have to rely on an application to write it to disk anyway.

Reply to
Arlet Ottens

That's not the point.

If you fill a 512B buffer and then pass it through the OS to the disk, you take a big hit moving 512B of data. Even if the OS buffers that at some layer in the I/O subsystem.

OTOH, if you buffer a *track* worth of data (say 512KB - 5MB), then that overhead is greatly reduced. The OS can (and probably will *have* to!) flush it straight to disk instead of just chunking a sector at a time in a buffer/cache.

E.g., my multimedia RTOS deals with nothing smaller than 4KB and *prefers* 4MB "chunks". Any "finer resolution" doesn't buy you (the application) anything -- it's *multimedia* (so the idea of a "short write" is silly).

Likewise, here, the larger the chunks, the more efficiently you can move them through the *existing* OS. Since the disk's data rate must exceed, "on average", the input source's data rate, you can further capitalize on "big buffers" by only creating *two* buffers: the one you are filling and the one that has been filled that you are now emptying (to disk).

[i.e., having two *full* buffers means the system is broken! Think about how trivial this userland code would be -- no FIFOs to manage, etc.]

The larger you can make these, the more variability you can accommodate in the system's response characteristics -- the "elasticity" is built into the overwhelming size of the buffer instead of a multiplicity of buffers as might be the case, otherwise (where memory is more precious).
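That trivial userland loop, sketched out (the 64 MB buffer size is arbitrary, and the USB source is stubbed with memset -- in a real system it would be the USB read):

// Minimal two-buffer ("ping-pong") sketch: one thread fills, one writes to disk.
#include <condition_variable>
#include <cstdio>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

static const size_t BUF_BYTES = 64 * 1024 * 1024;

// Stand-in for the real USB read; just fabricates data for the sketch.
static void fill_from_usb(char *dst, size_t n) { std::memset(dst, 0, n); }

int main()
{
    std::vector<char> buf[2] = { std::vector<char>(BUF_BYTES),
                                 std::vector<char>(BUF_BYTES) };
    std::mutex m;
    std::condition_variable cv;
    int full = -1;               // index of the buffer awaiting disk, or -1
    bool done = false;

    std::thread writer([&] {
        std::FILE *f = std::fopen("capture.bin", "wb");
        if (!f) return;
        for (;;) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return full >= 0 || done; });
            if (full < 0) break;                           // done, nothing pending
            int idx = full;
            lk.unlock();
            std::fwrite(buf[idx].data(), 1, BUF_BYTES, f); // drain to disk
            lk.lock();
            full = -1;                                     // buffer reusable again
            lk.unlock();
            cv.notify_all();
        }
        std::fclose(f);
    });

    int filling = 0;
    for (int pass = 0; pass < 16; ++pass) {      // demo: 16 buffers, then stop
        fill_from_usb(buf[filling].data(), BUF_BYTES);
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return full < 0; });   // two full buffers = broken!
        full = filling;
        lk.unlock();
        cv.notify_all();
        filling ^= 1;                            // swap to the other buffer
    }
    {
        std::lock_guard<std::mutex> lk(m);
        done = true;
    }
    cv.notify_all();
    writer.join();
}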

Reply to
D Yuniskis

I did my test with 512 byte blocks, and got 47 MB/sec writing to the hard disk. Obviously, the OS will use its own RAM buffers to temporarily store data before flushing it to the hard drive. Unless your system is really crippled, it's not going to break a sweat sustaining 4MB/sec.

And if it's *that crippled*, there's no guarantee that having huge buffers is going to help.
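For anyone who wants to repeat that kind of test, a crude version looks like this (note the OS write cache flatters the number unless you write far more than RAM or flush explicitly; the 400 MB total is arbitrary):

// Quick-and-dirty disk write benchmark with 512-byte blocks.
#include <chrono>
#include <cstdio>

int main()
{
    static char block[512] = {0};
    const long long TOTAL = 400LL * 1024 * 1024;     // write 400 MB
    std::FILE *f = std::fopen("bench.bin", "wb");
    if (!f) return 1;
    auto t0 = std::chrono::steady_clock::now();
    for (long long n = 0; n < TOTAL; n += sizeof(block))
        std::fwrite(block, 1, sizeof(block), f);     // one small write at a time
    std::fclose(f);                                  // flushes stdio buffers
    double s = std::chrono::duration<double>(
                   std::chrono::steady_clock::now() - t0).count();
    std::printf("%.1f MB/sec\n", TOTAL / s / (1024 * 1024));
}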

Reply to
Arlet Ottens

If you can get a high enough throughput with an FT2232H, then that will be your easiest choice. Since modules based on these devices are easily and cheaply available, get one and try it out before worrying about other possibilities. You'll probably still need a microcontroller between the FTDI chip and the ADC, but that's an easy choice.

Plan to use something else - .NET is a poor choice for reliable data flow.

If you are using the FTDI chips, you can use their D2XX drivers - you interface with them as standard DLLs (you can look at the demo code on their website, but the example code and wrapper libraries they have are pretty terrible).

If you are doing something else, use libusb (or libusb-win32 if you are stuck with windows rather than an OS that is more efficient for such jobs). Don't even consider writing your own USB drivers for Windows - it's just not worth the time and effort.
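A skeleton of the libusb-1.0 flavour of this (the usual FT232H 0x0403/0x6014 VID/PID and the 0x81 bulk-IN endpoint are assumptions -- check your device's descriptors; libusb-win32 has an older, different API):

// Skeleton bulk-read loop with libusb-1.0.
#include <libusb-1.0/libusb.h>

int main()
{
    libusb_context *ctx;
    if (libusb_init(&ctx) != 0) return 1;

    // 0x0403/0x6014 is the usual FT232H ID, but verify for your hardware.
    libusb_device_handle *h = libusb_open_device_with_vid_pid(ctx, 0x0403, 0x6014);
    if (!h || libusb_claim_interface(h, 0) != 0) return 1;

    unsigned char buf[65536];
    int got;
    for (;;) {
        // 0x81 = bulk IN endpoint 1 (check your descriptors), 1 s timeout.
        if (libusb_bulk_transfer(h, 0x81, buf, sizeof(buf), &got, 1000) != 0) break;
        // ... queue 'got' bytes for the disk writer here ...
    }
    libusb_release_interface(h, 0);
    libusb_close(h);
    libusb_exit(ctx);
    return 0;
}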

Don't expect a design to work the first time - expect to spend some time and effort in prototyping to confirm that you have a good and workable solution. Planning for a "working design the first time" sounds like a PHB with no clue as to how development works, and is aiming for a delivery time and budget disaster.

Reply to
David Brown

No particular reason for picking that large amount of memory other than that embedded systems programmers often seem to forget that the PC has no problems giving you a 100MB chunk of memory. Start big, then reduce things if the user absolutely thinks they need to run Explorer and watch flash videos while collecting irreplaceable data from a system that costs $20K/day to deploy and retrieve! ;-)

Mark Borgerson

Reply to
Mark Borgerson

What he said sounds good to me. If a 1MB buffer sounds good, 100MB sounds even better and gives you a buffer against the iniquities of Windows. Even a small Win7 system ought to have several hundred MB of memory for your app. If it has less, you're pushing the boundaries for a system where you REALLY want that data. If it has that much or more, just restrain your impulse to watch 'Jersey Shore' while you collect a few GB of data that are costing your customer thousands of dollars per day! My usual approach is to allocate a whomping big buffer, then run the app for a few days and look at the high-water mark. If the result is small, I MIGHT consider reducing the buffer size in a well-controlled data acquisition system. There's not much push to do that if the system runs thereafter with half the available RAM never being used.
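The high-water-mark idea in code, roughly (single-threaded sketch; locking between producer and consumer is omitted, and the sizes are illustrative):

// A big ring buffer that records its maximum fill level, so you can
// right-size the allocation after a trial run.
#include <cstddef>
#include <cstdio>
#include <vector>

class WaterMarkBuffer {
    std::vector<char> buf_;
    size_t head_ = 0, tail_ = 0, fill_ = 0, high_ = 0;
public:
    explicit WaterMarkBuffer(size_t bytes) : buf_(bytes) {}
    bool put(const char *src, size_t n) {          // producer side
        if (fill_ + n > buf_.size()) return false; // overrun: buffer too small
        for (size_t i = 0; i < n; ++i)
            buf_[(head_ + i) % buf_.size()] = src[i];
        head_ = (head_ + n) % buf_.size();
        fill_ += n;
        if (fill_ > high_) high_ = fill_;          // track the high-water mark
        return true;
    }
    size_t get(char *dst, size_t n) {              // consumer side
        if (n > fill_) n = fill_;
        for (size_t i = 0; i < n; ++i)
            dst[i] = buf_[(tail_ + i) % buf_.size()];
        tail_ = (tail_ + n) % buf_.size();
        fill_ -= n;
        return n;
    }
    size_t high_water() const { return high_; }
};

int main()
{
    WaterMarkBuffer b(100 * 1024 * 1024);          // start big, as suggested
    char in[4096] = {0}, out[4096];
    b.put(in, sizeof(in));
    b.get(out, sizeof(out));
    std::printf("high-water mark: %zu bytes\n", b.high_water());
}

After a few days' run, high_water() tells you how big the buffer actually needed to be.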

Background: I was teaching computer science in the early 80's when a system with a 1M pixel display, 1MB of main memory, 100MHz clock, and 100MB of disk space was considered an ideal system for development. That was then, this is now. Let not the nattering nabobs of history take away that which Moore's law hath given!

Mark Borgerson

Reply to
Mark Borgerson

Now THAT sounds like a really good idea to me. It takes me back to the days when a company adapted the protocols for writing video data to a VCR to do computer system backups.

I'll pass on that part. I'm not familiar enough with Ethernet protocols to render judgement on the Ethernet vs. USB tradeoffs. To paraphrase Tom Selleck in 'Quigley Down Under': "I said I never had much use for one. Never said I didn't know how to use it." (Speaking of handguns, just after he shoots the bad guy with a revolver.)

That may actually be overstating my familiarity with Ethernet data transfer. My usage of Ethernet for file storage and transfer is hidden under MANY layers of OS protocols. However, I just like the quote and think it has lots of applicability to embedded programming when sophisticated libraries and protocol stacks are widely available.

Mark Borgerson

Reply to
Mark Borgerson
