[gdal-dev] GSOC 2016
Dmitry Baryshnikov
bishop.dev at gmail.com
Fri Mar 25 01:31:04 PDT 2016
Hi Sarthak,
Thank you for your note, but as I already wrote:
> Don't wait for anybody with your proposal. The new GSoC site is the right
> place to discuss proposals.
So I expected to see your proposal on that site and to comment on it if needed.
As a reminder, the site is https://summerofcode.withgoogle.com/
Best regards,
Dmitry
25.03.2016 10:17, sarthak agarwal wrote:
> The deadline is today.
>
> Sarthak
>
> On Thu, Mar 24, 2016 at 1:52 AM, sarthak agarwal
> <sarthak0415 at gmail.com> wrote:
>
> Hello Dmitry,
>
> I fixed the bug (I guess).
> Now, coming to my proposal for GSoC: I was thinking of working
> on project #4, *Auto-detection of EPSG codes from incomplete WKT.*
>
> What I understood from the project is that we need to predict the
> EPSG code of certain files on the basis of some attributes which
> are available in the file.
>
> The attributes can be extracted from the file; to learn how, I read
> the OSR tutorial
> <http://www.gdal.org/osr_tutorial.html#querying_coordinate_system>.
>
> To solve this problem I thought of a lot of methods, but I think
> the best way to solve it will be to use machine learning.
>
> The way ML will handle this problem is as follows:
>
> 1. We need to find the EPSG code for a file (testing data).
> 2. We have a file with some attributes (projection, datum, etc.).
> 3. We need to guess the most suitable class (EPSG code) for that file.
> 4. Also, we have many files for which we know the attributes and
> the corresponding class (training data).
>
> The problem is now translated into an ML problem, which can be
> solved using the following models:
>
> 1. Bayesian statistics
> <https://en.wikipedia.org/wiki/Posterior_probability>
>
> where,
> posterior probability = probability that this file has EPSG
> code 'a'.
> prior probability = probability of occurrence of EPSG code 'a'.
>
> likelihood = how often we saw such attributes
> when the EPSG code is 'a'.
>
>
> 2. Or we can use a simple kNN, where k is the number of possible
> EPSG codes and the dimension of the feature vector is the number of
> possible attributes (we need to find a valid and promising
> weight function).
>
>
> 3. We can use a multi-class SVM.
>
> 4. Any other suggestion from the community regarding the
> choice of algorithm.
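
The Bayesian option in item 1 above could look roughly like the sketch
below. It is only an illustration under stated assumptions: it treats the
attributes as independent given the EPSG code (a naive Bayes model with
add-one smoothing, which the message does not specify), the helper names
train and posterior are hypothetical, and the four training records are
made-up toy data rather than anything extracted from real files.

```python
import math
from collections import Counter, defaultdict

# Toy training data (made up for illustration): attributes -> EPSG code.
TRAINING = [
    ({"datum": "WGS_1984", "projection": "none"}, 4326),
    ({"datum": "WGS_1984", "projection": "none"}, 4326),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator"}, 32633),
    ({"datum": "OSGB_1936", "projection": "Transverse_Mercator"}, 27700),
]


def train(samples):
    """Count how often each EPSG code and each attribute value occurs."""
    class_counts = Counter()              # code -> number of files
    value_counts = defaultdict(Counter)   # (code, attribute) -> value counts
    vocab = defaultdict(set)              # attribute -> values seen anywhere
    for attrs, code in samples:
        class_counts[code] += 1
        for name, value in attrs.items():
            value_counts[(code, name)][value] += 1
            vocab[name].add(value)
    return class_counts, value_counts, vocab


def posterior(model, attrs):
    """Return {code: P(code | attrs)} using only the attributes present."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for code, n in class_counts.items():
        score = math.log(n / total)       # prior probability of the code
        for name, value in attrs.items():
            seen = value_counts[(code, name)][value]
            n_values = len(vocab[name] | {value})
            # Likelihood with add-one smoothing, so unseen values are not 0.
            score += math.log((seen + 1) / (n + n_values))
        log_scores[code] = score
    # Normalise the scores so that they sum to 1.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(weights.values())
    return {c: w / norm for c, w in weights.items()}


model = train(TRAINING)
probs = posterior(model, {"datum": "OSGB_1936",
                          "projection": "Transverse_Mercator"})
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

Because posterior only multiplies in the attributes that are actually
present, a file with an incomplete WKT (for example, one with no datum)
can still be scored against every candidate code.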
>
> I am thinking of actually implementing all of these algorithms (and may
> add more in future, depending on the suggestions) and selecting the one
> that gives the best performance.
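
One possible way to compare candidate algorithms is leave-one-out
accuracy on the labelled files. The sketch below pairs such a harness with
a small kNN classifier as one candidate. Everything in it is an
assumption made for illustration: the function names are hypothetical,
the nine records are made-up toy data, the neighbours vote with uniform
weights (no weight function is given in the message), and k here is the
number of neighbours consulted, which is the usual meaning in kNN.

```python
from collections import Counter

# Toy labelled records (made up for illustration): attributes -> EPSG code.
SAMPLES = (
    [({"datum": "WGS_1984", "projection": "none"}, 4326)] * 3
    + [({"datum": "WGS_1984", "projection": "Transverse_Mercator",
         "central_meridian": "15"}, 32633)] * 3
    + [({"datum": "OSGB_1936", "projection": "Transverse_Mercator",
         "central_meridian": "-2"}, 27700)] * 3
)


def distance(query, record):
    """Number of attributes present in the query that the record contradicts.

    Only the query's keys are compared, so an incomplete query is not
    penalised for the attributes it lacks.
    """
    return sum(1 for key, value in query.items() if record.get(key) != value)


def knn_predict(train, attrs, k=3):
    """Majority vote among the k training records closest to attrs."""
    nearest = sorted(train, key=lambda sample: distance(attrs, sample[0]))[:k]
    votes = Counter(code for _, code in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, predict):
    """Fraction of records predicted correctly when each is held out in turn."""
    hits = 0
    for i, (attrs, code) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if predict(rest, attrs) == code:
            hits += 1
    return hits / len(samples)


accuracy = leave_one_out_accuracy(SAMPLES, knn_predict)
guess = knn_predict(SAMPLES, {"datum": "OSGB_1936"})
print(accuracy, guess)
```

Any other classifier with the same (training records, attributes)
signature can be passed to leave_one_out_accuracy, so the candidates can
be ranked on identical data.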
>
> Please give me feedback on my proposal and suggest anything I could
> add or change.
> Since very little time is left before the deadline, I would like to
> convert this into a proposal ASAP with your help.
>
> Regards,
> Sarthak
>
>
>