Here are some more links regarding counter & accumulator carry techniques.
--------------------
Links to early Xilinx counter app notes:
Ultra-Fast Synchronous Counters in XC3000 & XC4000 FPGAs
Loadable Binary Counters in a XC3000 FPGA
pages 15-18 of Xcell Journal #7
--------------------
Haven't found a pdf for XAPP 001 yet:
"
" High-Speed Synchronous Prescaler Counter
" (XAPP 001)
"
" This simple design provides a very basic non-loadable,
" up counter with a count-enable control. However, this
" simplicity permits it to be both the densest and the
" second fastest design.
"
" A prescaler (CEP/CET) technique is used to gain speed,
" permitting the ripple-carry portion of the counter
" eight clock periods in which to settle. Without special
" adaptation, however, this technique precludes loading
" the counter. As a non-loadable counter, three bits can
" be implemented in three CLBs (1 CLB/bit), with the least
" significant six bits requiring only four CLBs; this
" explains the compactness. Only one TILO delay is incurred
" in the ripple-carry path for each three bits.
"

This technique of making the low N bits run fast, with the upper bits running slower by 2^N, should map well into a compact yet fast implementation of a non-loadable binary counter for your Actel part.
I.e., use something like the pcounter scheme for the low few bits, then make the upper bits with a ripple carry, enabled by the carry out of the low bits.
You probably will need to add special timing constraints to get the tools to understand the multicycle carry, and that the ripple chain is a false path after FF reset.
The advantage of this is that you would then need to deskew only the N LSBs of the counter for straight binary output.
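
A rough behavioral sketch of the prescaler idea in Python (not taken from XAPP 001; the class name PrescaledCounter and its parameters are made up for illustration). It assumes the low N bits count on every enabled clock, while the upper bits change only on the low bits' terminal count. It also records how many clocks the upper ripple chain had to settle between updates, which is the multicycle window the timing constraints would have to describe.

```python
class PrescaledCounter:
    """Behavioral model only: fast low bits, slow ripple-carry upper bits.

    Hypothetical illustration; it says nothing about CLB packing or delays.
    """

    def __init__(self, width, low_bits):
        if not 0 < low_bits < width:
            raise ValueError("need 0 < low_bits < width")
        self.low_bits = low_bits
        self.low_mask = (1 << low_bits) - 1
        self.high_mask = (1 << (width - low_bits)) - 1
        self.low = 0    # fast section, counts on every enabled clock
        self.high = 0   # slow section, changes once per 2**low_bits counts
        self.clocks_since_high_change = 0
        self.min_settle_window = None  # smallest settle window observed

    @property
    def value(self):
        return (self.high << self.low_bits) | self.low

    def clock(self, enable=True):
        # Time passes for the ripple chain whether or not counting is enabled.
        self.clocks_since_high_change += 1
        if enable:
            if self.low == self.low_mask:
                # Terminal count of the low bits: the upper bits take their
                # incremented value, which the ripple chain has been
                # computing since the upper bits last changed.
                window = self.clocks_since_high_change
                if self.min_settle_window is None or window < self.min_settle_window:
                    self.min_settle_window = window
                self.high = (self.high + 1) & self.high_mask
                self.clocks_since_high_change = 0
            self.low = (self.low + 1) & self.low_mask
        return self.value


counter = PrescaledCounter(width=8, low_bits=3)
for _ in range(20):
    counter.clock()
print(counter.value, counter.min_settle_window)  # prints: 20 8
```

With low_bits=3 the observed window is eight clocks, which matches the "eight clock periods in which to settle" figure in the app note text quoted above.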
--------------------
ORCA-3 FPGAs had an optional register in the dedicated carry chain:
" Fast-carry logic and routing to adjacent PFUs for
" nibble-wide, byte-wide, or longer arithmetic functions,
" with the option to register the PFU carry-out.
--------------------
More carry-pipelined accumulator references:
(I've mentioned accumulators because they are a more general carry design problem than are counters, and because I know where to look for literature describing high speed pipelined versions.)
"Direct Digital Synthesizers: Theory, Design and Applications", Vankka
lib.tkk.fi/Diss/2000/isbn9512253186/isbn9512253186.pdf
See pages 48-49 for accumulator pipelining techniques.

"Single Chip 500 MHz Function Generator"
P.H. Saul, W. Barber, D.G. Taylor, T. Ward
IEE Proceedings, Vol. 138, No. 2, pp 239-243, April 1991
Reprinted in "Direct Digital Frequency Synthesizers", Kroupa (ed), IEEE Press, 1999
  Fig. 2 shows the one-bit-per-carry accumulator structure
  Fig. 5 shows the accumulator output deskew tree
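
For anyone without access to those papers, here is a generic Python sketch of a one-bit-per-carry pipelined accumulator with input skew and output deskew. It is not reproduced from the cited figures; the class name PipelinedAccumulator and the delay-line layout are assumptions made for illustration. Each bit stage registers its own sum and carry, input bit i is delayed by i clocks, and output bit i is delayed by (width - 1 - i) clocks so that all bits of one result line up again.

```python
from collections import deque


class PipelinedAccumulator:
    """Bit-level model: one registered carry per bit, skewed in, deskewed out."""

    def __init__(self, width):
        self.width = width
        self.sum_reg = [0] * width    # per-stage sum flip-flops
        self.carry_reg = [0] * width  # per-stage registered carry-out
        # Input skew: bit i reaches stage i after i clocks.
        self.in_skew = [deque([0] * i) for i in range(width)]
        # Output deskew: bit i is delayed a further (width - 1 - i) clocks.
        self.out_deskew = [deque([0] * (width - 1 - i)) for i in range(width)]

    def clock(self, word):
        """Apply one clock with input `word`; return the deskewed output."""
        new_sum = [0] * self.width
        new_carry = [0] * self.width
        out = 0
        for i in range(self.width):
            self.in_skew[i].append((word >> i) & 1)
            a = self.in_skew[i].popleft()
            # Carry-in is the neighbour's carry registered on the previous clock.
            cin = self.carry_reg[i - 1] if i > 0 else 0
            total = self.sum_reg[i] + a + cin
            new_sum[i] = total & 1
            new_carry[i] = total >> 1
            self.out_deskew[i].append(new_sum[i])
            out |= self.out_deskew[i].popleft() << i
        self.sum_reg, self.carry_reg = new_sum, new_carry
        return out


acc = PipelinedAccumulator(width=4)
print([acc.clock(w) for w in [3, 5, 0, 0, 0]])  # prints: [0, 0, 0, 3, 8]
```

In this model the critical path per clock is a single full adder, at the cost of (width - 1) clocks of latency and the skew/deskew registers. For a counter the input is constant, so only the output deskew would matter.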
-Brian