[gdal-dev] GSOC 2016

sarthak agarwal sarthak0415 at gmail.com
Wed Mar 23 13:22:35 PDT 2016


Hello Dmitry,

I fixed the bug (I guess).
Now, coming to my proposal for GSoC: I was thinking of working on project
#4 *Auto-detection of EPSG codes from incomplete WKT.*

What I understood from the project is that we need to predict the EPSG code
of certain files on the basis of some attributes which are available in the
file.

The attributes can be extracted from the file as described in the OSR
tutorial, which I read:
<http://www.gdal.org/osr_tutorial.html#querying_coordinate_system>.
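As a rough illustration of what "attributes" could mean here, the sketch below pulls a few node names out of a WKT string with a regular expression. This is only a stand-in for the OSR calls described in the tutorial, not the OSR API itself; the function name `extract_attributes` and the chosen node list are assumptions made for this example.

```python
import re

def extract_attributes(wkt):
    """Return a dict of coarse attributes found in a WKT string.

    Nodes that are absent map to None, which is the "incomplete WKT" case.
    """
    attrs = {}
    for node in ("PROJCS", "GEOGCS", "DATUM", "SPHEROID", "PROJECTION", "UNIT"):
        # Match NODE["name" and capture the quoted name of the first occurrence.
        match = re.search(node + r'\s*\[\s*"([^"]*)"', wkt)
        attrs[node] = match.group(1) if match else None
    return attrs

# A WKT string without any AUTHORITY node, so the EPSG code is not stated.
sample_wkt = (
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]]'
)
features = extract_attributes(sample_wkt)
print(features)
```

A real implementation would more likely go through the OSR spatial reference object, which already understands nested nodes; the regex version only looks at the first occurrence of each node name.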

To solve this problem I considered several methods, but I think the best
way to solve it is to use machine learning.

The way ML would handle this problem is as follows:

   1. We need to find the EPSG code for a file (testing data).
   2. We have a file with some attributes (projection, datum, etc.).
   3. We need to guess the most suitable class (EPSG code) for that file.
   4. Also, we have many files for which we know the attributes and the
   corresponding class (training data).

The problem is now translated into an ML problem, which can be solved using
the following models:

1. Bayesian statistics <https://en.wikipedia.org/wiki/Posterior_probability>

where,
posterior probability = probability that this file has EPSG code 'a', given
its attributes.
prior probability = probability of occurrence of EPSG code 'a'.

likelihood = probability of seeing such attributes when the EPSG
code is 'a'.


2. Or we can use a simple k-NN classifier, where the classes are the possible
EPSG codes and the dimension of the feature vector is the number of possible
attributes (we need to find a valid and promising weight function).


3. We can use multi-class SVM.

4. Any other suggestion from the community regarding the possible choice of
algorithm.
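To make model 1 concrete, here is a minimal sketch of the prior / likelihood / posterior computation in plain Python. It assumes a naive Bayes formulation (attributes treated as independent given the EPSG code) with add-one smoothing, which is one possible reading of the Bayesian idea and not something fixed by the proposal. The training rows are toy data invented for illustration, and the names `TRAINING`, `train` and `posterior` are hypothetical.

```python
from collections import Counter, defaultdict

# Toy training data (invented for illustration): (attributes, EPSG code).
TRAINING = [
    ({"datum": "WGS_1984", "projection": None}, 4326),
    ({"datum": "WGS_1984", "projection": None}, 4326),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator"}, 32633),
    ({"datum": "North_American_Datum_1983", "projection": None}, 4269),
]

def train(samples):
    """Count classes and attribute values per class."""
    class_counts = Counter(code for _, code in samples)
    # value_counts[code][attribute][value] = number of times seen
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen per attribute
    for attrs, code in samples:
        for name, value in attrs.items():
            value_counts[code][name][value] += 1
            vocab[name].add(value)
    return class_counts, value_counts, vocab

def posterior(attrs, model):
    """Return {EPSG code: posterior probability} for the given attributes."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for code, n in class_counts.items():
        score = n / total  # prior probability of this EPSG code
        for name, value in attrs.items():
            # Likelihood of this attribute value, with add-one smoothing.
            seen = value_counts[code][name][value]
            score *= (seen + 1) / (n + len(vocab[name]))
        scores[code] = score
    norm = sum(scores.values())
    return {code: s / norm for code, s in scores.items()}

model = train(TRAINING)
result = posterior({"datum": "WGS_1984", "projection": None}, model)
best_code = max(result, key=result.get)
print(best_code, result)
```

Returning the full posterior rather than only the best code would let a caller report several candidate EPSG codes with a confidence value when the WKT is too incomplete to decide.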

I am thinking of implementing all of these algorithms (and may add more in
future, depending on the suggestions) and selecting the one which gives the
best performance among them.
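Selecting the best-performing algorithm could be sketched as a leave-one-out comparison, as below. The two candidates (a majority-class baseline and a 1-nearest-neighbour rule that counts weighted attribute mismatches) are simple stand-ins chosen for this example, and the data, the weights and all names are invented assumptions rather than part of the proposal.

```python
from collections import Counter

# Toy labelled data (invented for illustration): (attributes, EPSG code).
DATA = [
    ({"datum": "WGS_1984", "projection": None}, 4326),
    ({"datum": "WGS_1984", "projection": None}, 4326),
    ({"datum": "WGS_1984", "projection": None}, 4326),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator"}, 32633),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator"}, 32633),
    ({"datum": "North_American_Datum_1983", "projection": None}, 4269),
    ({"datum": "North_American_Datum_1983", "projection": None}, 4269),
]

# Hypothetical per-attribute weights for the distance function.
WEIGHTS = {"datum": 2.0, "projection": 1.0}

def majority_classifier(train, query):
    """Baseline: always predict the most frequent EPSG code."""
    return Counter(code for _, code in train).most_common(1)[0][0]

def nearest_classifier(train, query):
    """1-NN: predict the code of the sample with the fewest weighted mismatches."""
    def distance(attrs):
        # Only attributes present in the query are compared.
        return sum(WEIGHTS.get(name, 1.0)
                   for name in query if query[name] != attrs.get(name))
    return min(train, key=lambda sample: distance(sample[0]))[1]

def leave_one_out_accuracy(classifier, data):
    """Hold out each sample in turn and test on it."""
    hits = 0
    for i, (attrs, code) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if classifier(rest, attrs) == code:
            hits += 1
    return hits / len(data)

CANDIDATES = {"majority": majority_classifier, "1-nn": nearest_classifier}
scores = {name: leave_one_out_accuracy(fn, DATA) for name, fn in CANDIDATES.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Any further model (Bayesian, SVM) could be added to `CANDIDATES` as long as it follows the same `(train, query)` calling convention.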

Please give me feedback on my proposal and suggest anything I should
add or change.
Since very little time is left before the deadline, I would like to turn
this into a proposal ASAP with your help.

Regards,
Sarthak

On Tue, Mar 22, 2016 at 8:09 PM, sarthak agarwal <sarthak0415 at gmail.com>
wrote:

> Hello Dmitry,
>
> I have made all the necessary changes and the code now works as
> expected, i.e. if we provide no dbname, it takes the username by default.
>
> Here is the commit with the fix:
> https://github.com/OSGeo/gdal/commit/417f4ed2642c56729f93fdb959e2bf1b9f1fdfb1
>
> Regards,
> Sarthak
>
> On Tue, Mar 22, 2016 at 1:49 AM, Dmitry Baryshnikov <bishop.dev at gmail.com>
> wrote:
>
>> Hi Sarthak,
>>
>> The problem is here
>> https://github.com/sarthak-0415/gdal/blob/trunk/gdal/frmts/postgisraster/postgisrasterdriver.cpp#L78
>>
>> Your code creates a situation where pszDbnameIn can be NULL. Before your
>> changes, that could not happen.
>> So the line "CPLString osKey = pszDbnameIn;" crashes.
>>
>> Don't wait for anybody with your proposal. The new GSoC site is the right
>> place to discuss proposals.
>>
>> Best regards,
>>     Dmitry
>>
>> 20.03.2016 19:41, sarthak agarwal wrote:
>>
>> Hello to all,
>>
>> Sorry for taking too long (exams and travelling).
>>
>> After running a few tests:
>>
>>    - In my opinion, in both versions of the code the error is not in the
>>    GetConnectionInfo function. If you replace "return true" with "return
>>    false" at the end of the function, it won’t fail in either case.
>>    - If you run this
>>    <https://github.com/sarthak-0415/gdal/commit/3e037a84e3392841cda1b4b68d75d205118caa9d>
>>    and this
>>    <https://github.com/sarthak-0415/gdal/commit/26e9383645b177c9e4d2ca8798a3b662901f3b63>
>>    code, it won’t give you the error. The values passed here are correct
>>    (NULL if that is the case for *ppszDbname), so the error is somewhere
>>    else and I am not able to debug it.
>>    - When I try to configure GDAL with the enable-debug option, the
>>    following error occurs:
>>
>>    make[1]: *** [gdalserver] Error 1
>>    make[1]: *** Waiting for unfinished jobs....
>>    /home/sarthak/gsoc2016/repos/gdal/gdal/.libs/libgdal.so: undefined reference to `CPLMutexHolder::CPLMutexHolder(_CPLMutex**, double, char const*, int, int)'
>>    collect2: error: ld returned 1 exit status
>>
>>
>> My current config options are
>> ./configure --prefix=/home/sarthak/gsoc2016/repos/gdal/install/
>> --with-python=yes -enabl-debug=yes
>>
>> The same error occurs for gdalserver, gdalinfo, gdal_translate and
>> gdaladdo. All of them fail with an undefined reference to
>> `CPLMutexHolder::CPLMutexHolder(_CPLMutex**, double, char const*, int, int)'
>>
>> Please review the code and send me your feedback.
>> Also, I would like to continue with the bug alongside my GSoC proposal,
>> for which I have some ideas that I would like to discuss with
>> you. Can we talk on IRC, since the deadline is only 5 days away?
>>
>> Regards,
>> Sarthak
>>
>> On Wed, Mar 16, 2016 at 4:51 AM, Dmitry Baryshnikov <bishop.dev at gmail.com>
>> wrote:
>>
>>> Hi Sarthak,
>>>
>>> The first version is not working (did you test it?):
>>> https://github.com/sarthak-0415/gdal/commit/36344cc26f23202cb289390322c1d295697136bd#diff-31df0e62d00ca09f9f11ad2f29e94b54R2541
>>> Here you try to get an array value with index -1. You need to set ppszDbname
>>> = NULL if no DB name is present in the input parameters.
>>>
>>> The second variant does not work either:
>>> >>> ds = gdal.Open('PG:')
>>> terminate called after throwing an instance of 'std::logic_error'
>>>   what():  basic_string::_M_construct null not valid
>>>
>>> In both cases there is a problem here:
>>> https://github.com/sarthak-0415/gdal/blob/6264d3fc52242fdce858547cc3a0312b04fd638b/gdal/frmts/postgisraster/postgisrasterdataset.cpp#L2743
>>>
>>> Also look at where ppszDbname is used, because before the modifications the
>>> code expected that ppszDbname could not be NULL.
>>>
>>> Best regards,
>>>     Dmitry
>>>
>>> 15.03.2016 13:13, sarthak agarwal wrote:
>>>
>>> Hey Dmitry ,
>>> As discussed on the IRC yesterday,
>>> I made the changes in the code.
>>>
>>> I built two versions of the code:
>>>
>>>    1. The changes suggested by you (use the old trunk code and remove
>>>    the additional checks): link
>>>    <https://github.com/sarthak-0415/gdal/commit/36344cc26f23202cb289390322c1d295697136bd>
>>>    travis <https://travis-ci.org/sarthak-0415/gdal/builds/116070409> .
>>>    a. In this version the Dbname is left empty if not provided by the
>>>    user.
>>>
>>>    2. The version in which:
>>>    a. if the Dbname is provided by the user, then ppszDbname = Dbname;
>>>    b. else use the psql env var PGDATABASE;
>>>    c. else use the username as the database name;
>>>    d. if nothing is available, pass an empty string.
>>>    e. link
>>>    <https://github.com/sarthak-0415/gdal/commit/6264d3fc52242fdce858547cc3a0312b04fd638b>
>>>    travis <https://travis-ci.org/sarthak-0415/gdal/builds/116055868>
>>>
>>> I think both versions should work.
>>>
>>> Regards,
>>> Sarthak
>>>
>>>
>>
>>
>