I spent more time than I should have yesterday trying to understand regcomp(), regexec() and regerror() well enough to validate a string containing an e-mail address, i.e. to make sure that its structure is correct and that neither the username nor the domain contains characters it shouldn't.
The upshot was that I couldn't do it. I could not write a regex that would detect spaces in the address, because apparently regcomp() doesn't provide any way to anchor a regex to either end of a string. I ended up with a negated regex that detects invalid characters in the string but hasn't a clue whether it's syntactically correct:
[^.a-zA-Z0-9@_-]
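A minimal sketch of how a negated check like this can be wired up with regcomp(), regexec() and regerror(). The helper name has_invalid_chars() is made up for illustration, and the error handling is kept deliberately short:

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper, not part of any library.
   Returns 1 if s contains a character outside the permitted set,
   0 if it does not, and -1 if the regex failed to compile. */
static int has_invalid_chars(const char *s)
{
    regex_t re;
    char errbuf[128];
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    rc = regcomp(&re, "[^.a-zA-Z0-9@_-]", REG_NOSUB);
    if (rc != 0) {
        /* regerror() turns the regcomp() error code into readable text. */
        regerror(rc, &re, errbuf, sizeof errbuf);
        fprintf(stderr, "regcomp: %s\n", errbuf);
        return -1;
    }

    /* regexec() returns 0 on a match, REG_NOMATCH otherwise. */
    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);

    return rc == 0;   /* a match means a forbidden character was found */
}
```

With this, has_invalid_chars("user@example.com") returns 0 and has_invalid_chars("user name@example.com") returns 1 because of the space. Something like "a@@b" also returns 0, since the check only looks at the character set and not at the structure.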
This does the trick, but no thanks to the man pages: regex(3), which describes the C functions, and regex(7), which describes the regex syntax. Both are poorly formatted and hard to read, and both seem to omit useful information, such as the apparent inability to specify anchor points in strings that DO NOT contain newlines.
So, can any of you do better, i.e. write a regex that CAN validate the syntax of an e-mail address in terms of both its structure and the set of permitted characters in the username and domain parts (the permitted character sets are not the same)?
Also, if anybody can suggest a better tutorial on using these functions or suggest another, better, set of C functions for doing the same job, that would be wonderful.
PS: I did check my old reliable standby text, David Curry's "UNIX Systems Programming for SVR4", but it wasn't helpful in this case because, unusually, the set of functions in the C Standard Library has changed both names and parameters since it was written.