[gdal-dev] GSOC 2016

Fri Mar 25 01:31:04 PDT 2016

Hi Sarthak,

Thank you for your note, but I already wrote:

 >    Don't wait for anybody with your proposal. The new GSoC site is the
right place to discuss proposals.

So I expected to see your proposal on that site and to comment on it there, 
if needed. As a reminder, the site is https://summerofcode.withgoogle.com/

Best regards,
     Dmitry

On 25.03.2016 10:17, sarthak agarwal wrote:
> The deadline is today.
>
> Sarthak
>
> On Thu, Mar 24, 2016 at 1:52 AM, sarthak agarwal 
> <sarthak0415 at gmail.com> wrote:
>
>     Hello Dmitry,
>
>     I fixed the bug (I guess).
>     Now, on to my GSoC proposal: I was thinking of working on
>     project #4, *Auto-detection of EPSG codes from incomplete WKT.*
>
>     What I understood from the project is that we need to predict the
>     EPSG code of certain files on the basis of some attributes which
>     are available in the file.
>
>     The attributes can be extracted from the file; for this I read
>     the OSR tutorial
>     <http://www.gdal.org/osr_tutorial.html#querying_coordinate_system>.
>
>     To solve this problem I considered many methods, but I think
>     the best way to solve it is with machine learning.
>
>     ML would handle this problem as follows:
>
>      1. We need to find the EPSG code for a file (testing data).
>      2. We have a file with some attributes (projection, datum, etc.).
>      3. We need to guess the most suitable class (EPSG code) for that file.
>      4. We also have many files for which we know the attributes and
>         the corresponding class (training data).
>
>     The problem now becomes an ML classification problem, which can be
>     solved using the following models:
>
>     1. Bayesian statistics
>     <https://en.wikipedia.org/wiki/Posterior_probability>
>
>         where,
>         posterior probability = probability that this file has EPSG
>         code 'a'.
>         prior probability = probability of occurrence of EPSG code 'a'.
>
>         likelihood = probability of seeing such attributes
>         when the EPSG code is 'a'.
>
>
>     2. Or we can use a simple k-NN classifier, where the classes are
>     the possible EPSG codes and the dimension of the feature vector is
>     the number of possible attributes (we need to find a valid and
>     promising weight function).
>
>
>     3. We can use a multi-class SVM.
>
>     4. Any other suggestions from the community regarding the choice
>     of algorithm.
>
>     I plan to implement all of these algorithms (and may add more
>     later, depending on the suggestions) and select the one that
>     gives the best performance.
>
>     Please give me feedback on my proposal and suggest anything I
>     should add or change.
>     Since very little time is left before the deadline, I would like
>     to convert it into a proposal ASAP with your help.
>
>     Regards,
>     Sarthak
>
>     
>
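
For readers of the archive, the Bayesian option in the quoted proposal can be illustrated with a minimal naive Bayes sketch in plain Python. This is only an illustration under stated assumptions, not code from the proposal or from GDAL: the function names `train` and `predict`, the attribute keys, and the four-row training table are all hypothetical, and the attributes are assumed to be independent given the EPSG code. The EPSG codes themselves are real (4326 is WGS 84, 32632 and 32633 are WGS 84 / UTM zones 32N and 33N, 27700 is the British National Grid).

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count classes and attribute values from (attributes, epsg_code) pairs."""
    class_counts = Counter()
    # value_counts[code][attr][value] = training files of that class with that value
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each attribute
    for attrs, code in samples:
        class_counts[code] += 1
        for attr, value in attrs.items():
            value_counts[code][attr][value] += 1
            vocab[attr].add(value)
    return class_counts, value_counts, vocab


def predict(model, attrs):
    """Return the EPSG code with the highest posterior for the given attributes."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_code, best_score = None, -math.inf
    for code, n in class_counts.items():
        score = math.log(n / total)  # log prior of this EPSG code
        for attr, value in attrs.items():
            seen = value_counts[code][attr][value]
            # Add-one smoothing, with one extra slot for values never seen in training.
            score += math.log((seen + 1) / (n + len(vocab[attr]) + 1))
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical toy training table; a real one would be built from the EPSG database.
SAMPLES = [
    ({"datum": "WGS_1984", "projection": "none", "central_meridian": "none"}, 4326),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator", "central_meridian": "15"}, 32633),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator", "central_meridian": "9"}, 32632),
    ({"datum": "OSGB_1936", "projection": "Transverse_Mercator", "central_meridian": "-2"}, 27700),
]
MODEL = train(SAMPLES)

# An "incomplete" description: the datum is missing, only the projection is known.
guess = predict(MODEL, {"projection": "Transverse_Mercator", "central_meridian": "15"})
print(guess)
```

Attributes that are missing from the query are simply skipped, which is what makes this kind of model a natural fit for incomplete WKT; whether it outperforms a direct comparison against the EPSG tables is a separate question that the sketch does not address.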