OT: Regular Expressions

I'm an amateur at regular expressions :-(

I know I can search for...

'[a-zA-Z0-9\+\-\.\*\/\(\)\_]+'

and my editor (UltraEdit) will highlight...

'some_text_characters_and_math_operators'

I want to do a replace operation such that I get...

{some_text_characters_and_math_operators}

(In other words just change the opening and closing characters)

How do I do that?

(UltraEdit supports Perl, Unix and their own flavor of regular expression language.)

Thanks!

...Jim Thompson

--
| James E.Thompson, P.E.                           |    mens     |
| Analog Innovations, Inc.                         |     et      |
| Analog/Mixed-Signal ASIC's and Discrete Systems  |    manus    |
| Phoenix, Arizona  85048    Skype: Contacts Only  |             |
| Voice:(480)460-2350  Fax: Available upon request |  Brass Rat  |
| E-mail Icon at http://www.analog-innovations.com |    1962     |
             
 Stormy on the East Coast today... due to Bush's failed policies.
Reply to
Jim Thompson

Depending on what regex compiler it uses, either this:

{[a-zA-Z0-9\+\-\.\*\/\(\)\_]+}

or this:

\{[a-zA-Z0-9\+\-\.\*\/\(\)\_]+\}

By default, regexes match characters as-is, but if the compiler interprets a character as special (like + or ]), you put a backslash in front of it to mean "match this character as-is".

Note that some regex compilers do the opposite for some characters; for example, in Emacs ( and ) are regular characters, but \( and \) are special grouping operators. So either read the manual, or just try both ways.
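To make the "as-is" versus "special" distinction concrete, here is a minimal sketch. It uses Python's re module, which is my assumption and not what UltraEdit runs; Python follows the Perl convention, where + and ( are special unless backslashed. The sample text is made up.

```python
import re

text = "a+b aab (x)"

# Unescaped '+' is special: "one or more of the preceding item".
special = re.findall(r"a+b", text)

# Backslashed '+' means "match a literal plus sign".
literal = re.findall(r"a\+b", text)

# In this flavor '(' starts a group, so a literal parenthesis needs the backslash.
parens = re.findall(r"\(x\)", text)

# re.escape() adds the backslashes for you when a string should match as-is.
escaped = re.escape("a+b")

print(special, literal, parens, escaped)
```

In a flavor that does it the other way round (Emacs, POSIX basic), the parentheses line would be written without the backslashes.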

Reply to
DJ Delorie

Both approaches replace 'desired_text' with {[a-zA-Z0-9\+\-\.\*\/\(\)\_]+} :-(

I need to retain the desired_text.

...Jim Thompson

Reply to
Jim Thompson

Ben at IDM Computer Solutions (UltraEdit) came thru with "tagged" regular expressions... enclose parts to be retained in (...), as in...

'([a-zA-Z0-9\+\-\.\*\/\(\)\_]+)'

Replace with {\1}
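For anyone who wants to try the tagged-expression idea outside the editor, here is a rough equivalent in Python (my choice of tool, not something from the thread). The function name and the sample line are hypothetical; the character class is the same set as above, written without the redundant backslashes.

```python
import re

# Same character set as the search above; inside [...] only '-' needs escaping here.
PATTERN = re.compile(r"'([a-zA-Z0-9+\-.*/()_]+)'")

def quotes_to_braces(line: str) -> str:
    # \1 in the replacement stands for whatever the (...) group matched,
    # so the text between the quotes is kept and only the delimiters change.
    return PATTERN.sub(r"{\1}", line)

# Made-up HSpice-style line, just for illustration.
print(quotes_to_braces("+ rsh='rsh0*(1+tc1*dt)' w='wdrawn-2*dw'"))
```

Quoted text containing a character outside the set (a space, say) is left alone, which matches what the editor search would highlight.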

That will save me so-o-o-o-o much grunt work processing HSpice libraries into PSpice syntax.

Next, I guess I need to learn scripting ;-)

...Jim Thompson

Reply to
Jim Thompson

Ah, I missed that part. Sorry.

Yeah, \1 substitutes for whatever the (...) matched.

Reply to
DJ Delorie

I'm learning. I have several cookbooks, but I think I need a good introductory text.

...Jim Thompson


Reply to
Jim Thompson

If you are going to be parsing complex structures, you might want to look at some more powerful tools. Regular expressions are OK for identifying simple tokens and patterns. But if you try to write RE search and replace scripts to rewrite anything complex, you'll get into trouble quickly.

I've toyed with a Perl module called Parse::RecDescent when I needed to recognize complex statements and extract parameters and variables. The grammar specification starts off at a low level with regular expressions that uniquely identify operators, key words, variables, etc. Higher level expressions are then constructed for each expected combination of the above parts. Eventually, the top level grammar rule represents a parsable file. With each grammar rule, there can be an 'action', which saves or prints the tokens recognized up to that point in any fashion you desire.

Get Perl and then go to

formatting link
to pick up the above module.
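To give a flavor of the layering described above (regexes for tokens at the bottom, grammar rules built on top, an action attached to each rule), here is a hand-written sketch in Python. It is not Parse::RecDescent and does not use its grammar syntax; the grammar, the names tokenize, Parser and evaluate, and the choice of "action" (compute a number from the equation) are all my assumptions for illustration. It handles only + - * / and parentheses, with no unary minus.

```python
import re

# Lowest level: regular expressions that identify individual tokens.
TOKEN = re.compile(
    r"\s*(?:(?P<num>\d+(?:\.\d+)?)|(?P<name>[A-Za-z_]\w*)|(?P<op>[-+*/()]))"
)

def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens, params):
        self.tokens, self.i, self.params = tokens, 0, params

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    # expr := term (('+' | '-') term)*   action: add or subtract
    def expr(self):
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    # term := factor (('*' | '/') factor)*   action: multiply or divide
    def term(self):
        value = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    # factor := number | name | '(' expr ')'   action: look up or convert
    def factor(self):
        kind, text = self.take()
        if kind == "num":
            return float(text)
        if kind == "name":
            return self.params[text]
        if (kind, text) == ("op", "("):
            value = self.expr()
            if self.take() != ("op", ")"):
                raise ValueError("expected ')'")
            return value
        raise ValueError(f"unexpected token {text!r}")

def evaluate(text, params):
    # Top-level rule: one complete expression and nothing after it.
    parser = Parser(tokenize(text), params)
    value = parser.expr()
    if parser.peek() != (None, None):
        raise ValueError("trailing input")
    return value

# Made-up parameter values, just to exercise the parser.
print(evaluate("rsh0*(1+tc1*dt)", {"rsh0": 100.0, "tc1": 0.001, "dt": 25.0}))
```

A parser generator saves you from writing the rule methods by hand, but the structure is the same.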

--
Paul Hovnanian     mailto:Paul@Hovnanian.com
------------------------------------------------------------------
I used to get high on life but lately I've built up a resistance.
Reply to
Paul Hovnanian P.E.

Thanks! I'll check that out!

...Jim Thompson

Reply to
Jim Thompson

If using POSIX extended regex (which the above expression appears to be an example of):

replace '([a-zA-Z0-9\+\-\.\*\/\(\)\_]+)' with '\{\1\}'

Reply to
Jasen Betts

To elaborate, POSIX specifies two styles of regular expressions: basic and extended.

In basic REs, the characters .[\*^$ are special, as are the sequences \(, \), \{ and \}, and \ followed by a digit.

In extended REs, the characters .[\*^$(){}+?| are special; there are no special backslash sequences.

In either case, special characters need to be preceded by a backslash to be used literally.

Some programs (e.g. Emacs) use a hybrid form, which is essentially basic REs except that the sequences \+, \? and \| are interpreted as for +, ?, and | in extended REs.

Various programs and libraries have their own extensions, the most extensive of which is the PCRE (Perl-compatible RE) library.

POSIX regular expression definition:

formatting link

Reply to
Nobody

I do this: 'man 7 regex' and get a Linux manual page; here's an on-line copy:

formatting link

It took me a few times through it to begin to understand it.

The O'Reilly book "sed and awk" has a good chapter on regex too.

Reply to
Jasen Betts

I have that working in UltraEdit except the braces don't need to be "escaped".

...Jim Thompson

Reply to
Jim Thompson

"Jim Thompson" wrote in message news: snipped-for-privacy@4ax.com...

There are two very good introductory tutorials for regular expressions. The first is by Tom Christiansen, one of the Perl gurus. His description was a usenet post some years ago, and it's widely available around the web. Like, for example, here:

formatting link

The other is by Andrew Kuchling, a Python guru. It's also widely available - a pdf version is here:

formatting link

Tom and Andrew both recommend using verbose mode. It is far more readable and understandable, though your editor will probably make it difficult to use. Just imagine trying to troubleshoot this relatively simple regexp if there were no spaces, no formatting, and no comments (it parses a SPICE instantiation):

spice_instance = r"""\s*
    # Instance name (one occurrence). The instance may be either a primitive or a
    # subcircuit. Subcircuit calls begin with 'x', primitive calls begin with anything else.
    (?:
        (?P [a-wyzA-WYZ_]       # Primitive names begin with anything except 'x'
            $INST_ID
        )
        |                       # OR...
        (?P [xX]                # Subcircuit calls begin with 'x':
            $INST_ID
        )
    )
    \s+                         # Whitespace

    # Net list (at least one net is required)
    (?P
        (?:
            $NODE_NAME
            \s+                 # followed by white space
        )+
    )

    # Primitive or subcircuit name (one occurrence)
    (?P
        $COMP_NAME
    )
    (?:\s+|$)                   # followed by white space or EOL,
    (?!\s*=)                    # but not '='

    # Parameter = Value pairs or $keywords (zero or more occurrences)
    (?P
        (?:
            $PARAM_NAME         # Parameter name
            \s*=\s*             # =
            $VALUE              # Value
            (?:\s+|\$|$)        # White space or $ or EOL
        )*                      # zero or more p-v pairs or keywords
    )
    """
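For a smaller, self-contained taste of verbose mode, here is a sketch in Python that matches a single name = 'equation' pair. The group names name and equation, the simplified name rule, and the sample line are my own choices for illustration, not taken from the pattern above.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and treats '#' as a comment
# marker, so each piece of the pattern can sit on its own commented line.
PARAM = re.compile(r"""
    (?P<name> [a-zA-Z_] [a-zA-Z0-9_]* )   # parameter name
    \s* = \s*                             # equals sign, optional spaces around it
    ' (?P<equation> [^']+ ) '             # quoted equation (the quotes are not captured)
    """, re.VERBOSE)

m = PARAM.search("+ rsh = 'rsh0*(1+tc1*dt)'")
print(m.group("name"), m.group("equation"))
```

The same pattern on one line would be ([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*'([^']+)' with numbered rather than named groups.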

-- Mike --

Reply to
Mike

My situation is more straightforward, essentially all find & replace operations (with a random number of spaces, including none, allowed around the "="):

ParamName='MathEquation'

with...

ParamName={MathEquation}

or renaming:

ParamName1='MathEquation'

with...

ParamName2={MathEquation}

or tossing (not used by PSpice):

ParamName='MathEquation'

to...

;ParamName='MathEquation' \n+

Some other more subtle things...

I need to learn a scripting language, so I can process a library in one swell foop, rather than eating days of boredom ;-)

...Jim Thompson

Reply to
Jim Thompson

[snip]

I forgot to say THANKS for the very good references!

...Jim Thompson

Reply to
Jim Thompson

"Jim Thompson" wrote in message news: snipped-for-privacy@4ax.com...

etc, etc.

Understood - that's what a scripting language will do for you, more quickly and efficiently than an editor can. In addition, instead of commenting out some of those equation lines that PSPICE can't understand, you could either calculate the value and substitute that for the equation, or translate the equation into something PSPICE can understand.

For the simple case you show, a regexp matching valid HSPICE parameter names is [a-zA-Z_][a-zA-Z0-9%$#_]*, and assuming your equations are all pretty simple and you don't need to validate them (so your existing regexp matching the equation is good enough), the matching regexp is something like this:

[a-zA-Z_][a-zA-Z0-9%$#_]*\s*=\s*'[a-zA-Z0-9\+\-\.\*\/\(\)\_]+'

You need the parentheses to group the parameter name and the equation:

([a-zA-Z_][a-zA-Z0-9%$#_]*)\s*=\s*'([a-zA-Z0-9\+\-\.\*\/\(\)\_]+)'

To keep the same parameter name and equation, replacing the quotes with braces, replace that with:

\1 = {\2}
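As a quick check of the two-group version, here is how it might look in Python (the thread mentions both Perl and Python; Python is my pick). The constant names, the function name translate, and the test line are hypothetical. The equation class is the same set as above with the redundant backslashes dropped.

```python
import re

NAME = r"[a-zA-Z_][a-zA-Z0-9%$#_]*"     # HSPICE parameter name, as given above
EQUATION = r"[a-zA-Z0-9+\-.*/()_]+"     # simple equations only, not validated
PAIR = re.compile(rf"({NAME})\s*=\s*'({EQUATION})'")

def translate(line: str) -> str:
    # \1 is the parameter name, \2 is the equation between the quotes.
    # Note the replacement also normalizes the spacing around '=' to " = ".
    return PAIR.sub(r"\1 = {\2}", line)

print(translate("+ rsh  ='rsh0*(1+tc1*dt)'"))
```

If the original spacing matters, capture the \s*=\s* part in its own group and put it back in the replacement.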

-- Mike --

Reply to
Mike

It looks like sed could handle these translations easily.

An arbitrary run of spaces (including none) is ' *' in regex.

s/\(ParamName *= *\)'\([^']*\)'/\1{\2}/;

s/ParamName1\( *= *\)'\([^']*\)'/ParamName2\1{\2}/;

s/ParamName\( *= *\)'\([^']*\)'/;ParamName\1{\2}/;

so the sed script would look like

s/\(ParamName *= *\)'\([^']*\)'/\1{\2}/;
s/ParamName1\( *= *\)'\([^']*\)'/ParamName2\1{\2}/;
s/ParamName\( *= *\)'\([^']*\)'/;ParamName\1{\2}/;

with each line repeated as often as needed.

Except probably I'd put the rename and comment-out operations first and do the translates with a catch-all.
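The same ordering idea (decide rename or comment-out first, fall through to a catch-all translate) can be folded into one pass with a replacement function. This is only a sketch in Python, not a port of the sed script: the tables RENAME and DROP and the names in them are placeholders I made up, and commenting out with a leading ';' assumes the parameter sits on its own line.

```python
import re

# Hypothetical tables; fill these in with the real HSPICE parameter names.
RENAME = {"ParamName1": "ParamName2"}
DROP = {"UnusedParam"}

# name, then "=" with optional spaces, then a quoted equation.
ASSIGN = re.compile(r"\b([a-zA-Z_][a-zA-Z0-9_]*)(\s*=\s*)'([^']*)'")

def convert(match):
    name, equals, equation = match.groups()
    if name in DROP:
        # Comment-out case: keep the original text, just prefix ';'.
        return ";" + match.group(0)
    # Rename if listed, otherwise keep the name (the catch-all case),
    # and swap the quotes for braces.
    name = RENAME.get(name, name)
    return f"{name}{equals}{{{equation}}}"

def translate(line: str) -> str:
    return ASSIGN.sub(convert, line)

for sample in ["ParamName1 = 'a*b'", "UnusedParam='x+1'", "w='wd-2*dw'"]:
    print(translate(sample))
```

Running translate over every line of a library file and writing the results out would do the whole conversion in one go.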

Reply to
Jasen Betts

I highly recommend "Mastering Regular Expressions"

A few free tutorials:

--
    W
  . | ,. w ,   "Some people are alive only because
   \|/  \|/     it is illegal to kill them."    Perna condita delenda est
---^----^---------------------------------------------------------------
Reply to
Bob Larter
