[gdal-dev] GDAL with podofo can't process pdf documents generated by ArcGIS

Andreas Oxenstierna ao at t-kartor.se
Mon Aug 20 23:11:00 PDT 2018


Hi

I have also encountered similar issues with PDFs from other Windows 
software.
The workaround I use is to recreate the PDF in any available software 
that tolerates missing EOF markers, endstream keywords, etc.
Programmatically, this can be done as described in 
https://codedprojects.wordpress.com/2017/06/09/how-to-fix-pypdf-error-eof-marker-not-found/
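
A minimal sketch of that kind of clean-up in Python, using only the 
standard library. It is not taken from the linked article; the names 
normalize_pdf_tail and repair_file are hypothetical. It only handles the 
end-of-file marker: it drops any bytes after the last %%EOF, or appends 
one if none is found. A damaged xref table will most likely still need 
the file to be rewritten by a tolerant PDF tool.

```python
EOF_MARKER = b"%%EOF"


def normalize_pdf_tail(data: bytes) -> bytes:
    """Return data ending right after the last %%EOF marker.

    If no marker is present, one is appended. This does not rebuild
    the xref table or repair streams.
    """
    pos = data.rfind(EOF_MARKER)
    if pos == -1:
        # No marker at all: append one so strict parsers find an end of file.
        return data.rstrip(b"\r\n") + b"\n" + EOF_MARKER + b"\n"
    # Drop anything that follows the final marker.
    return data[: pos + len(EOF_MARKER)] + b"\n"


def repair_file(src_path: str, dst_path: str) -> None:
    """Read src_path, normalize its tail, and write the result to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(normalize_pdf_tail(data))
```

For example, repair_file("input.pdf", "fixed.pdf") writes a copy with a 
clean tail and leaves the original untouched (both file names are 
placeholders).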

> Hi all,
>
> I'm currently working on a map viewer application that uses GDAL for 
> processing geo-referenced map images.  Up till now I've been 
> successfully using the poppler library for PDF support, but am 
> currently trying to shift to the podofo/poppler hybrid approach 
> (podofo library with poppler pdftoppm utility) to work around 
> poppler's GPL licence restrictions.
>
> I have a collection of sample map PDF documents generated by ESRI 
> ArcMap 10 (different documents from different releases in the 10.x 
> release family), which I could successfully process with GDAL/poppler, 
> but most of which fail to load with GDAL/podofo.  The document loading 
> also fails with the stand-alone podofo pdftoppm utility, both with a 
> version that I've built from podofo 0.9.6 source and with the 0.9.3 
> version installed onto my ubuntu xenial machine from the APT package 
> repository.
>
> The typical error message is as follows:
>
>
> Error: An error 5 ocurred during uncompressing the pdf file.
>
>
> PoDoFo encounter an error. Error: 5 ePdfError_UnexpectedEOF
>     Error Description: End of file was reached unxexpectedly.
>     Callstack:
>     #0 Error Source: /build/libpodofo-NltoF1/libpodofo-0.9.3/src/base/PdfParser.cpp:226
>         Information: Unable to load objects from file.
>     #1 Error Source: /build/libpodofo-NltoF1/libpodofo-0.9.3/src/base/PdfParser.cpp:334
>         Information: Unable to load xref entries.
>     #2 Error Source: /build/libpodofo-NltoF1/libpodofo-0.9.3/src/base/PdfParser.cpp:738
>     #3 Error Source: /build/libpodofo-NltoF1/libpodofo-0.9.3/src/base/PdfTokenizer.cpp:339
>
> which seems to indicate an invalid xref table.
>
>
>
>
> I don't think this is a podofo bug as such, as various online pdf 
> validators I've tried also flag the documents as problematic, but 
> several other bits of pdf software I've tried (notably the poppler 
> library utilities) seem to treat it as a non-fatal recoverable error.
>
> Has anyone else come across this and come up with a work-around or fix?
>
> Example problem file to be found at 
> https://www.dropbox.com/s/khlzgz8o2gxq89y/6090_harvest.pdf?dl=0
>
>
> thanks
>
> Richard.


-- 
Best regards

Andreas Oxenstierna
T-Kartor Geospatial AB
mobile: +46 733 206831
mailto: ao at t-kartor.se
http://www.t-kartor.com
