[Qgis-user] Delimited text plugin with fields containing commas

M.E.Dodd m.e.dodd at open.ac.uk
Tue Jun 21 02:10:13 PDT 2011


I should have said that half of the data was entered via the web and the other half came in via a range of different spreadsheets, some of which of course used commas instead of decimal points in numeric fields.  It is only about 10,000 records by about 20 fields, so it was possible to check by hand to some extent.
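
As an illustration of the decimal-comma problem described above, here is a minimal Python sketch that rewrites values such as 48,85 to 48.85 in chosen columns only, leaving free-text fields alone. The function names, the sample row and the list of numeric column indexes are all hypothetical, and it assumes the numeric fields never use a comma as a thousands separator.

```python
import re

# Matches a whole cell written with a decimal comma, e.g. "3,14" or "-0,5".
# Assumption: numeric fields never use commas as thousands separators.
DECIMAL_COMMA = re.compile(r"^\s*([+-]?\d+),(\d+)\s*$")


def normalise_number(value):
    """Return value with a decimal comma turned into a decimal point."""
    match = DECIMAL_COMMA.match(value)
    if match:
        return f"{match.group(1)}.{match.group(2)}"
    return value


def normalise_row(row, numeric_columns):
    """Apply normalise_number only to the listed column indexes."""
    return [
        normalise_number(cell) if i in numeric_columns else cell
        for i, cell in enumerate(row)
    ]


# Hypothetical row: a place name (free text) and two coordinates.
print(normalise_row(["Paris, France", "48,85", "2.35"], {1, 2}))
```

Restricting the rewrite to known numeric columns is what keeps commas in place names and other free text untouched.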

-----Original Message-----
From: Alex Mandel [mailto:tech_dev at wildintellect.com] 
Sent: 21 June 2011 09:44
To: qgis-user at lists.osgeo.org
Subject: Re: [Qgis-user] Delimited text plugin with fields containing commas

If it was being entered on the web it should have gone straight into a database, so there should be no need for a text file; if you need to move things, it should all be db dumps. Personally I'd recommend putting it in SQLite or Postgres to start, then you can move directly to SpatiaLite or PostGIS, or use those from the beginning.
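
For anyone who already has the text file, one possible route into SQLite is sketched below using only the Python standard library. The table name, column names and sample rows are made up for illustration, and an in-memory database stands in for a real file.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the quoted place name contains a comma.
sample = io.StringIO(
    'id,place,count\n'
    '1,"Newark, Delaware",3\n'
    '2,"Milton Keynes",5\n'
)

# ":memory:" keeps the example self-contained; use a file path in practice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, place TEXT, count INTEGER)")

reader = csv.reader(sample)
next(reader)  # skip the header row
# Parameter binding means commas and quotes in the data need no escaping.
conn.executemany("INSERT INTO records VALUES (?, ?, ?)", reader)
conn.commit()

rows = conn.execute("SELECT place FROM records ORDER BY id").fetchall()
print(rows)
```

Once the records are in a table, the delimiter question disappears, since each value is stored in its own column regardless of the characters it contains.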

It just raises the need to really think about data entry in project design and to get good at writing fancy python scripts to clean up bad text files people give you based on regex rules and other magic.

Enjoy,
Alex

On 06/21/2011 12:48 AM, M.E.Dodd wrote:
> I've had a whole range of similar problems with a large database of publicly entered records from many countries. It contains all sorts of strange characters (in many languages), including most characters you'd think of as possible column separators.  It is rather a nightmare to tidy up and analyse, and I end up stripping out loads of different characters just to get it read by a spreadsheet and GIS.  Ideally there might be some completely different separator that could easily be edited in to mark the columns while keeping most of the commas and other characters within the columns. The big issue is how to do that editing in the first place, so that the columns are identified correctly or text can easily be moved between columns if an automatic import gets it wrong.
> I have sorted most of it out now, in my case by long, laborious means, but if someone could come up with a good way of dealing with this kind of messy file (entered by the general public in many countries, so with potentially unpredictable strange characters) that would be very useful.  Before you say we should have been much more restrictive on the web input: we were fairly restrictive, but we still need to allow quite a range of possible inputs in free text in any language.
> 
> From: John Callahan [mailto:john.callahan at udel.edu]
> Sent: 21 June 2011 00:36
> To: tech at wildintellect.com
> Cc: qgis-user
> Subject: Re: [Qgis-user] Delimited text plugin with fields containing commas
> 
> You're correct.  That way probably would be the preferred work-around.
> 
> - John
> 
> 
> On Mon, Jun 20, 2011 at 5:18 PM, Alex Mandel <tech_dev at wildintellect.com> wrote:
> I agree quotes should work, but I've found that many parsers do not
> follow this expectation. As for semicolons, I only meant using them as
> the delimiter and leaving the commas inside your text. That way you can
> tell the parser that ; is the separator between fields.
> 
> Thanks,
> Alex
> 
> On 06/20/2011 01:04 PM, John Callahan wrote:
>> I use semi-colons when I can but have run into situations where 
>> commas are necessary, such as names of places.  I agree with the work-around and I've
>> done that before.   As long as quotes (") are included around the values, it
>> should work, and I believe it was working for a while.
>>
>> - John
>>
>> ***********************************
>> John Callahan, Research Scientist
>> Delaware Geological Survey
>> University of Delaware
>> URL: http://www.dgs.udel.edu
>> *******************************
>>
>>
>> On Mon, Jun 20, 2011 at 3:59 PM, Alex Mandel <tech_dev at wildintellect.com> wrote:
>>
>>> On 06/20/2011 12:35 PM, John Callahan wrote:
>>>> Has anyone seen this problem with the Delimited Text plugin?  I am
>>>> seeing this in today's download of QGIS 1.7 standalone on Windows,
>>>> and on a recent install through OSGeo4W of 1.8-trunk.
>>>>
>>>> "Delimited text" plugin doesn't allow to load csv file with field 
>>>> with commas
>>>> http://hub.qgis.org/issues/2208
>>>>
>>>> - John
>>>
>>> I have had that problem before, with lots of CSV import tools (not
>>> just QGIS). Are you using commas to separate the values too? I usually
>>> have much better success changing that to ; or |, so instead of
>>> "2","test,test","1"
>>> I use
>>> "2";"test,test";"1"
>>>
>>> Easiest way to swap out the delimiter is to use 
>>> OpenOffice/LibreOffice and change it when saving.
>>>
>>> Thanks,
>>> Alex
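
The delimiter swap suggested in the original reply can also be scripted rather than done through OpenOffice/LibreOffice. Below is a rough Python sketch using the standard csv module; the swap_delimiter helper is a hypothetical name, and it assumes the input file quotes its fields consistently with double quotes.

```python
import csv
import io


def swap_delimiter(text, old=",", new=";"):
    """Re-write delimited text with a different field separator.

    Assumes fields containing the old delimiter are wrapped in double quotes.
    """
    reader = csv.reader(io.StringIO(text), delimiter=old, quotechar='"')
    out = io.StringIO()
    writer = csv.writer(
        out,
        delimiter=new,
        quotechar='"',
        quoting=csv.QUOTE_ALL,  # quote every field, as in the example above
        lineterminator="\n",
    )
    writer.writerows(reader)
    return out.getvalue()


converted = swap_delimiter('"2","test,test","1"\n')
print(converted, end="")  # "2";"test,test";"1"
```

The comma inside the quoted field survives because the reader treats it as part of the value, not as a separator.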

