[GRASS-dev] Re: regex problem for r.in.wms

Ivan Shmakov ivan at theory.asu.ru
Mon Mar 10 14:58:52 EDT 2008

>>>>> Hamish  <hamish_b at yahoo.com> writes:

 > Hi, re. r.in.wms XML parsing code for layers with spaces in the name

 > given some text like this:

 > DATA="<Name>Foo Bar Baz</Name>"
 > echo "$DATA" | sed -e "s/<Name>\s*\(\w*\)/~\1~/g" -e "s/<\/Name>//g"

 > you get ~Foo~ Bar Baz

 > instead of ~Foo Bar Baz~

 > how to fix that regex?

	First of all, we expand `\w' into ``any letter or digit or the
	underscore character'' [1]:

echo "$DATA" \
    | sed -e "s/<Name>\s*\([[:alpha:][:digit:]_]*\)/~\1~/g" \
          -e "s/<\/Name>//g"
## => ~Foo~ Bar Baz

	Then, we add `[:space:]' to the []-set:

echo "$DATA" \
    | sed -e "s/<Name>\s*\([[:alpha:][:digit:][:space:]_]*\)/~\1~/g" \
          -e "s/<\/Name>//g"
## => ~Foo Bar Baz~

	Finally, I'd recommend using single quotes for the Sed program,
	since it contains no Shell substitutions:

echo "$DATA" \
    | sed -e 's/<Name>\s*\([[:alpha:][:digit:][:space:]_]*\)/~\1~/g' \
          -e 's/<\/Name>//g'
## => ~Foo Bar Baz~
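
	As a possible alternative (not part of the fix above, just a
	sketch), the capture group could take everything up to the next
	`<' and match the closing tag in the same expression.  That way
	names with other punctuation, such as a hyphen, should also
	survive.  The `mark_names' helper and the sample layer name are
	made up for illustration; it assumes each name sits on one line
	and contains no `<':

```shell
# Hypothetical helper: wrap the text of each <Name> element in tildes.
# [^<]* captures any run of characters up to the next '<', so spaces
# and punctuation inside the name are kept.
mark_names() {
    sed -e 's/<Name>[[:space:]]*\([^<]*\)<\/Name>/~\1~/g'
}

# Made-up sample input with a hyphen and a space in the name.
DATA="<Name>Foo-Bar Baz</Name>"
echo "$DATA" | mark_names
## => ~Foo-Bar Baz~
```

	This uses only POSIX character classes, so it does not depend on
	the GNU `\s' and `\w' extensions.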

[1] GNU Sed manual (for GNU Sed 4.1.5).
