[Qgis-user] Automagically remove html from attribute?

Bernd Vogelgesang bernd.vogelgesang at gmx.de
Sat Oct 6 08:35:00 PDT 2018


> On Sat, 2018-10-06 at 12:45 +0200, Bernd Vogelgesang wrote:
>> Hi,
>>
>> We work a lot with GPX files created with the Locus app on Android.
>>
>> Unfortunately, the "desc" field is created with HTML tags (for whatever
>> reason), so extracting the plain-text information from it is quite
>> tedious.
>>
>> Does anyone know a way to get rid of the HTML and preserve only the
>> plain-text information?
>>
>> Example:
>>
>> <!-- desc_gen:start -->
>> <font color="#ff000000"><table width="100%"><tr><td width="100%"
>> align="center">
>> <!-- desc_user:start -->
>> This is the information I would like to keep
>> <!-- desc_user:end -->
>> </td></tr><tr><td><table width="100%"></table></td></tr></
>>
>    A REGEXP like "<[^>]+>" should match each tag: an opening angle
> bracket, everything up to the next closing angle bracket, and that
> bracket itself.   It may be necessary to escape some of the symbols in
> the REGEXP to avoid misinterpretation.
>
>    Avoid a REGEXP like "<.*>", because it will match everything from
> the first "<" to the last ">", which may include other "<" and ">"
> characters and the plain text between them.
>
>    HTH
Hi Fernando,
many thanks for your hint. Regex is definitely the way to go, if only it
were a little more intuitive.

  regexp_replace( "desc",'<[^>]+>','')

in the field calculator did the trick for all entries with well-formed
HTML, so only a few entries with broken HTML are left to process
manually.
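
For anyone who wants to try the pattern outside QGIS first, here is a
minimal sketch in Python. It assumes Python's "re" module treats
'<[^>]+>' the same way the field calculator does for this input; the
helper name strip_tags and the short one_line sample are made up for
illustration, and the whitespace clean-up is an extra step that the
expression above does not perform.

```python
import re

# Same pattern as in the field calculator expression above.
TAG = re.compile(r"<[^>]+>")
# The greedy pattern that Fernando warns against.
GREEDY = re.compile(r"<.*>")


def strip_tags(text):
    """Remove tags and comments, then collapse the leftover whitespace."""
    return " ".join(TAG.sub("", text).split())


# The (truncated) example from the original question.
sample = (
    '<!-- desc_gen:start -->\n'
    '<font color="#ff000000"><table width="100%"><tr><td width="100%"\n'
    'align="center">\n'
    '<!-- desc_user:start -->\n'
    'This is the information I would like to keep\n'
    '<!-- desc_user:end -->\n'
    '</td></tr><tr><td><table width="100%"></table></td></tr></'
)

# Hypothetical one-line input to compare the two patterns.
one_line = "<b>keep</b> this <i>too</i>"

print(strip_tags(sample))
print(strip_tags(one_line))
print(repr(GREEDY.sub("", one_line)))
```

On the truncated sample, the dangling "</" at the end has no closing
">" and therefore survives, which is the kind of broken entry that
still needs manual clean-up. On the one-line input, the greedy pattern
removes the whole string, including the text that should be kept.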

Thanks a lot,
Bernd

>
>> Is there, for example, a way to search for < and > and then delete
>> them and all text within them programmatically?
>>
>>
>> Cheers,
>>
>> Bernd
>>
>
>    Roxo
>


