[Qgis-user] Automagically remove html from attribute?
Bernd Vogelgesang
bernd.vogelgesang at gmx.de
Sat Oct 6 08:35:00 PDT 2018
> On Sat, 2018-10-06 at 12:45 +0200, Bernd Vogelgesang wrote:
>> Hi,
>>
>> We work a lot with gpx files created with the Locus App on Android.
>>
>> Unfortunately, the "desc" field is created with html tags (for whatever
>> reason), so it is quite tedious to extract the plain text
>> information from it.
>>
>> Does anyone know a way to get rid of the html and preserve only the
>> plain text information?
>>
>> Example:
>>
>> <!-- desc_gen:start -->
>> <font color="#ff000000"><table width="100%"><tr><td width="100%"
>> align="center">
>> <!-- desc_user:start -->
>> This is the information I would like to keep
>> <!-- desc_user:end -->
>> </td></tr><tr><td><table width="100%"></table></td></tr></
>>
> A REGEXP like "<[^>]+>" should match each tag on its own: an opening
> "<", everything up to the next ">", and that ">" itself. It may be
> necessary to escape some of the symbols in the REGEXP to avoid
> misinterpretation.
>
> Avoid a REGEXP like "<.*>": because ".*" is greedy, it will match
> everything from the first "<" to the last ">", including any other
> "<" and ">" characters and the text between them.
>
> HTH
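
To make the difference between the two patterns concrete, here is a minimal sketch in Python's `re` module (not QGIS itself). The sample string is invented for illustration and is not taken from a real Locus file.

```python
import re

# Made-up sample with two cells of text, for illustration only.
sample = '<td align="center">keep this</td><td>and this</td>'

# Negated character class: each match stops at the first ">" it meets,
# so every tag is removed individually and the text in between survives.
per_tag = re.sub(r'<[^>]+>', '', sample)

# Greedy ".*": a single match runs from the first "<" to the last ">",
# so the text between the tags is removed as well.
greedy = re.sub(r'<.*>', '', sample)

print(repr(per_tag))  # 'keep thisand this'
print(repr(greedy))   # ''
```

Note that removing the tags with an empty replacement also glues neighbouring text together ("thisand"); replacing with a space instead is one way around that, if it matters for the data.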
Hi Fernando,
Many thanks for your hint. Regex is definitely the way to go, if only it
were a little more intuitive.
regexp_replace( "desc",'<[^>]+>','')
in the field calculator did the trick for all entries with well-formed
html. Only a few entries with broken html are left to process
manually.
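
For anyone who wants to try the same pattern outside QGIS, the following is a rough Python sketch of the expression above. The function name `strip_tags` is hypothetical, and the whitespace clean-up is an extra step that the field calculator expression does not perform. The sample text is the example from this thread, including its cut-off ending.

```python
import re

# Same pattern as in the field calculator expression above.
TAG = re.compile(r'<[^>]+>')

def strip_tags(desc):
    """Remove tags and comments, then collapse leftover whitespace.

    Hypothetical helper for illustration; the whitespace clean-up is
    an addition and not part of the QGIS expression.
    """
    text = TAG.sub('', desc)
    return ' '.join(text.split())

desc = '''<!-- desc_gen:start -->
<font color="#ff000000"><table width="100%"><tr><td width="100%"
align="center">
<!-- desc_user:start -->
This is the information I would like to keep
<!-- desc_user:end -->
</td></tr><tr><td><table width="100%"></table></td></tr></'''

print(strip_tags(desc))
```

The comments and complete tags are removed, but the dangling "</" at the end has no closing ">" and is therefore left in place. That is the kind of broken html that still needs manual attention.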
Thanks a lot,
Bernd
>
>> Is there, e.g., a way to search for < and > and then delete them and
>> all text within programmatically?
>>
>>
>> Cheers,
>>
>> Bernd
>>
>
> Roxo
>