The FpgaC internal algorithm is pretty generic, and decides pretty early on to force all internal functions into 4-LUTs. That makes it difficult to later make use of support logic like the H-LUTs on XC4K, the F5 muxes on Virtex parts, or the carry logic on any of these.
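To make the F5 case concrete, here is a minimal sketch of what a fitter would need to recognize: a 5-input function split by Shannon expansion into two 4-LUT truth tables, with the fifth input driving a 2:1 mux between them. This is a software model only. All function names are hypothetical and nothing here is taken from the FpgaC source; the bit ordering of the truth tables is also just an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; not part of FpgaC. */

/* Model of a 4-input LUT: bit i of init is the output for input pattern i. */
unsigned lut4(uint16_t init, unsigned in)
{
    return (init >> (in & 0xFu)) & 1u;
}

/* Shannon expansion on input 4 of a 5-input truth table:
 * lo covers the patterns with input 4 = 0, hi those with input 4 = 1. */
void split5(uint32_t tt, uint16_t *lo, uint16_t *hi)
{
    *lo = (uint16_t)(tt & 0xFFFFu);
    *hi = (uint16_t)(tt >> 16);
}

/* Two 4-LUTs plus a 2:1 mux selected by input 4 (the F5-style mux). */
unsigned eval5(uint16_t lo, uint16_t hi, unsigned in)
{
    unsigned sel = (in >> 4) & 1u;
    return sel ? lut4(hi, in) : lut4(lo, in);
}

/* Returns 1 if the LUT pair plus mux reproduces tt on all 32 input patterns. */
int check5(uint32_t tt)
{
    uint16_t lo, hi;
    unsigned in;

    split5(tt, &lo, &hi);
    for (in = 0; in < 32; in++) {
        if (eval5(lo, hi, in) != ((tt >> in) & 1u))
            return 0;
    }
    return 1;
}

/* Build the truth table of 5-input XOR (odd parity) as a test function. */
uint32_t parity5_table(void)
{
    uint32_t tt = 0;
    unsigned in, b, p;

    for (in = 0; in < 32; in++) {
        p = 0;
        for (b = 0; b < 5; b++)
            p ^= (in >> b) & 1u;
        tt |= (uint32_t)p << in;
    }
    return tt;
}

/* Small self-check: the split must reproduce 5-input parity exactly. */
void demo_parity5(void)
{
    assert(check5(parity5_table()));
}
```

The point of the sketch is that the split is trivial once the function is still known to be a single 5-input expression. After everything has already been flattened to independent 4-LUTs, the fitter has to rediscover that two LUTs share four inputs and feed a mux, which is the harder problem.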
I'm looking for papers which discuss/describe alternative fitting algorithms that make better use of vendor assist logic in FPGAs, particularly for scheduling logic expressions across multiple LUTs with both space and time tradeoffs in mind.