[gdal-dev] GSOC 2016

Fri Mar 25 11:43:36 PDT 2016

I have submitted the proposal. Please review it and provide your
feedback.

Sarthak

On Fri, Mar 25, 2016 at 2:01 PM, Dmitry Baryshnikov <bishop.dev at gmail.com>
wrote:

> Hi Sarthak,
>
> Thank you for your note, but I already wrote:
>
> >    Don't wait for anybody with the proposal. The new GSoC site is the
> > right place to discuss proposals.
>
> So I expected to see your proposal on that site and comment on it if
> needed. As a reminder, the site is https://summerofcode.withgoogle.com/
>
> Best regards,
>     Dmitry
>
> 25.03.2016 10:17, sarthak agarwal wrote:
>
> The deadline is today.
>
> Sarthak
>
> On Thu, Mar 24, 2016 at 1:52 AM, sarthak agarwal
> <sarthak0415 at gmail.com> wrote:
>
>> Hello Dmitry,
>>
>> I fixed the bug (I guess).
>> Now, coming to my proposal for GSoC: I was thinking of working on
>> project #4, *Auto-detection of EPSG codes from incomplete WKT*.
>>
>> What I understood from the project is that we need to predict the EPSG
>> code of certain files on the basis of some attributes which are available
>> in the file.
>>
>> The attributes can be extracted from the file; I read about how to do this
>> at <http://www.gdal.org/osr_tutorial.html#querying_coordinate_system>.
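
As a rough illustration of the attribute extraction mentioned above, the sketch below pulls named nodes and PARAMETER values out of a WKT string with regular expressions. It is a pure-Python stand-in, not the OSR API described in the linked tutorial; the function name `wkt_attributes` and the sample WKT (a UTM zone 32N definition with no AUTHORITY node, so the EPSG code is missing) are assumptions made for this example.

```python
import re

# Nodes whose first argument is a quoted name, e.g. DATUM["WGS_1984", ...
NAMED_NODE = re.compile(r'\b(PROJCS|GEOGCS|DATUM|SPHEROID|PROJECTION)\["([^"]*)"')
# Numeric projection parameters, e.g. PARAMETER["central_meridian",9]
PARAMETER = re.compile(r'PARAMETER\["([^"]*)",\s*([-+0-9.eE]+)\]')


def wkt_attributes(wkt):
    """Return a flat dict of named nodes and numeric parameters found in wkt."""
    attrs = {}
    for key, name in NAMED_NODE.findall(wkt):
        # Keep the first occurrence of each node type.
        attrs.setdefault(key, name)
    for name, value in PARAMETER.findall(wkt):
        attrs[name] = float(value)
    return attrs


# Hypothetical incomplete WKT: no AUTHORITY["EPSG", ...] node is present.
SAMPLE_WKT = (
    'PROJCS["WGS 84 / UTM zone 32N",'
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],'
    'PROJECTION["Transverse_Mercator"],'
    'PARAMETER["latitude_of_origin",0],'
    'PARAMETER["central_meridian",9],'
    'PARAMETER["scale_factor",0.9996],'
    'PARAMETER["false_easting",500000],'
    'PARAMETER["false_northing",0],'
    'UNIT["metre",1]]'
)

attrs = wkt_attributes(SAMPLE_WKT)
```

The resulting dictionary (datum, projection, central meridian and so on) is the kind of feature record a classifier could work from. A real implementation would more likely read these values through the OSR API than through regular expressions.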
>>
>> To solve this problem I considered many methods, but I think the best
>> way to solve it is to use machine learning.
>>
>> The way ML would handle this problem is as follows:
>>
>>    1. We need to find the EPSG code for a file (testing data).
>>    2. We have a file with some attributes (projection, datum, etc.).
>>    3. We need to guess the most suitable class (EPSG code) for that file.
>>    4. Also, we have many files for which we know the attributes and the
>>    corresponding class (training data).
>>
>> This problem is now translated into an ML problem which can be solved
>> using one of the following models:
>>
>> 1. Bayesian statistics
>> <https://en.wikipedia.org/wiki/Posterior_probability>
>>
>> where,
>> posterior probability = probability that this file has EPSG code 'a'.
>> prior probability = probability of occurrence of EPSG code 'a'.
>>
>> likelihood = probability of seeing such attributes when the EPSG
>> code is 'a'.
>>
>>
>> 2. Or we can use a simple k-NN classifier, where the classes are the
>> possible EPSG codes and the dimension of the feature vector is the number
>> of possible attributes (we need to find a valid and promising weight
>> function).
>>
>>
>> 3. We can use a multi-class SVM.
>>
>> 4. Any other suggestion from the community regarding the possible choice
>> of algorithm.
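
For the first option in the list above, the relationship between prior, likelihood and posterior can be sketched as a small categorical naive Bayes classifier. This is only an illustration: treating the attributes as conditionally independent, the Laplace smoothing, the function names `train` and `posterior`, and the four training records are all assumptions made for this example, not part of the proposal.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training records: (attributes read from a file, known EPSG code).
TRAINING = [
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator",
      "central_meridian": 9.0}, 32632),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator",
      "central_meridian": 9.0}, 32632),
    ({"datum": "WGS_1984", "projection": "Transverse_Mercator",
      "central_meridian": 15.0}, 32633),
    ({"datum": "WGS_1984"}, 4326),
]


def train(samples):
    """Count how often each code and each (code, attribute, value) occurs."""
    priors = Counter(code for _, code in samples)
    counts = defaultdict(Counter)   # (code, attribute name) -> value counts
    values = defaultdict(set)       # attribute name -> distinct values seen
    for attrs, code in samples:
        for name, value in attrs.items():
            counts[(code, name)][value] += 1
            values[name].add(value)
    return priors, counts, values


def posterior(model, attrs):
    """Return {code: P(code | attrs)} using only the attributes present."""
    priors, counts, values = model
    total = sum(priors.values())
    log_scores = {}
    for code, n in priors.items():
        score = math.log(n / total)                      # log prior
        for name, value in attrs.items():
            seen = counts[(code, name)]
            # Laplace-smoothed likelihood; the extra +1 reserves mass
            # for values never seen in training.
            numerator = seen[value] + 1
            denominator = sum(seen.values()) + len(values[name]) + 1
            score += math.log(numerator / denominator)   # log likelihood
        log_scores[code] = score
    # Normalise so the posteriors sum to 1.
    top = max(log_scores.values())
    weights = {code: math.exp(s - top) for code, s in log_scores.items()}
    norm = sum(weights.values())
    return {code: w / norm for code, w in weights.items()}


model = train(TRAINING)
# An "incomplete" record: the datum is missing.
query = {"projection": "Transverse_Mercator", "central_meridian": 9.0}
probabilities = posterior(model, query)
best_code = max(probabilities, key=probabilities.get)
```

With these toy records the most probable code for the query is 32632. Attributes missing from the query are simply skipped, which is one way such a model could cope with incomplete WKT.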
>>
>> I am thinking of actually implementing all of these algorithms (and may
>> add more in future, depending on suggestions) and selecting the one which
>> gives the best performance.
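
One way to compare candidate algorithms is leave-one-out accuracy on labelled records. The sketch below does this for a small k-NN classifier (here k is the number of neighbours consulted) and a majority-class baseline. The mismatch-count distance, the per-attribute `weights`, all function names and the nine toy records are assumptions made for this example.

```python
from collections import Counter

UTM32 = {"datum": "WGS_1984", "projection": "Transverse_Mercator",
         "central_meridian": 9.0}
UTM33 = {"datum": "WGS_1984", "projection": "Transverse_Mercator",
         "central_meridian": 15.0}
GEOG = {"datum": "WGS_1984"}

# Hypothetical labelled records: three per EPSG code.
SAMPLES = [(UTM32, 32632)] * 3 + [(UTM33, 32633)] * 3 + [(GEOG, 4326)] * 3


def distance(query, candidate, weights):
    """Weighted count of query attributes that the candidate does not match."""
    return sum(weights.get(name, 1.0)
               for name, value in query.items()
               if candidate.get(name) != value)


def knn_classifier(training, query, k=3, weights=None):
    """Majority vote among the k training records closest to the query."""
    weights = weights or {}
    ranked = sorted(training, key=lambda s: distance(query, s[0], weights))
    votes = Counter(code for _, code in ranked[:k])
    return votes.most_common(1)[0][0]


def majority_classifier(training, query):
    """Baseline: always predict the most frequent code in the training set."""
    return Counter(code for _, code in training).most_common(1)[0][0]


def leave_one_out_accuracy(classifier, samples):
    """Hold out each record in turn and predict it from the remaining ones."""
    hits = 0
    for i, (attrs, code) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += classifier(rest, attrs) == code
    return hits / len(samples)


CANDIDATES = {"knn": knn_classifier, "majority": majority_classifier}
scores = {name: leave_one_out_accuracy(fn, SAMPLES)
          for name, fn in CANDIDATES.items()}
best = max(scores, key=scores.get)
```

On this toy data k-NN beats the baseline but still misclassifies the geographic records: a query that carries only a datum matches every record equally well. That is the kind of weakness a better weight function, or a penalty for extra attributes on the candidate, would need to address.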
>>
>> Please give me feedback on my proposal and suggest anything I can
>> add or change.
>> Since very little time is left before the deadline, I would like to
>> convert it into a proposal ASAP with your help.
>>
>> Regards,
>> Sarthak
>> 
>>
>