[GRASS-SVN] r68719 - in grass-addons/grass7: . hadoop hadoop/hd hadoop/hd/hd.db.connect hadoop/hd/hd.esri2vector hadoop/hd/hd.hdfs.in.fs hadoop/hd/hd.hdfs.in.vector hadoop/hd/hd.hdfs.info hadoop/hd/hd.hdfs.out.vector hadoop/hd/hd.hive.csv.table hadoop/hd/hd.hive.execute hadoop/hd/hd.hive.info hadoop/hd/hd.hive.json.table hadoop/hd/hd.hive.load hadoop/hd/hd.hive.select hadoop/hd/hdfsgrass hadoop/hd/hdfswrapper

svn_grass at osgeo.org svn_grass at osgeo.org
Tue Jun 21 06:48:59 PDT 2016


Author: krejcmat
Date: 2016-06-21 06:48:59 -0700 (Tue, 21 Jun 2016)
New Revision: 68719

Added:
   grass-addons/grass7/hadoop/
   grass-addons/grass7/hadoop/hd/
   grass-addons/grass7/hadoop/hd/Makefile
   grass-addons/grass7/hadoop/hd/dependency.py
   grass-addons/grass7/hadoop/hd/hd.db.connect/
   grass-addons/grass7/hadoop/hd/hd.db.connect/Makefile
   grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.html
   grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.py
   grass-addons/grass7/hadoop/hd/hd.esri2vector/
   grass-addons/grass7/hadoop/hd/hd.esri2vector/Makefile
   grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.html
   grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.py
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/Makefile
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.html
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.py
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/Makefile
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.html
   grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.py
   grass-addons/grass7/hadoop/hd/hd.hdfs.info/
   grass-addons/grass7/hadoop/hd/hd.hdfs.info/Makefile
   grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.html
   grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.py
   grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/
   grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/Makefile
   grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.html
   grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.py
   grass-addons/grass7/hadoop/hd/hd.hive.csv.table/
   grass-addons/grass7/hadoop/hd/hd.hive.csv.table/Makefile
   grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.html
   grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.py
   grass-addons/grass7/hadoop/hd/hd.hive.execute/
   grass-addons/grass7/hadoop/hd/hd.hive.execute/Makefile
   grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.html
   grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.py
   grass-addons/grass7/hadoop/hd/hd.hive.info/
   grass-addons/grass7/hadoop/hd/hd.hive.info/Makefile
   grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.html
   grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.py
   grass-addons/grass7/hadoop/hd/hd.hive.json.table/
   grass-addons/grass7/hadoop/hd/hd.hive.json.table/Makefile
   grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.html
   grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.py
   grass-addons/grass7/hadoop/hd/hd.hive.load/
   grass-addons/grass7/hadoop/hd/hd.hive.load/Makefile
   grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.html
   grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.py
   grass-addons/grass7/hadoop/hd/hd.hive.select/
   grass-addons/grass7/hadoop/hd/hd.hive.select/Makefile
   grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.html
   grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.py
   grass-addons/grass7/hadoop/hd/hdfsgrass/
   grass-addons/grass7/hadoop/hd/hdfsgrass/Makefile
   grass-addons/grass7/hadoop/hd/hdfsgrass/__init__.py
   grass-addons/grass7/hadoop/hd/hdfsgrass/grass_map.py
   grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_lib.py
   grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_util.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/
   grass-addons/grass7/hadoop/hd/hdfswrapper/Makefile
   grass-addons/grass7/hadoop/hd/hdfswrapper/__init__.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/base_hook.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/connections.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/hdfs_hook.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/hive_hook.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/hive_table.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/security_utils.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/settings.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/utils.py
   grass-addons/grass7/hadoop/hd/hdfswrapper/webhdfs_hook.py
Log:
addons hadoop: client for hadoop/hive; in progress

Added: grass-addons/grass7/hadoop/hd/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,30 @@
+MODULE_TOPDIR = ..
+
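+# Dependency check: run dependency.py and inspect its output. The script
+# prints a friendly message for each missing Python library and then
+# re-imports everything unguarded, so a missing library yields a traceback
+# containing "File dependency.py"; the filter below turns that into $(error).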
+SHELL_OUTPUT := $(shell python dependency.py 2>&1)
+ifeq ($(filter File dependency.py,$(SHELL_OUTPUT)),)
+    $(info $(SHELL_OUTPUT))
+else
+    $(error $(SHELL_OUTPUT))
+endif
+
+SUBDIRS = \
+    hdfswrapper \
+    hdfsgrass \
+    hd.esri2vector \
+    hd.db.connect \
+    hd.hdfs.in.fs \
+    hd.hdfs.in.vector \
+    hd.hdfs.info \
+    hd.hdfs.out.vector \
+    hd.hive.csv.table \
+    hd.hive.execute \
+    hd.hive.info \
+    hd.hive.json.table \
+    hd.hive.load \
+    hd.hive.select
+
+include $(MODULE_TOPDIR)/include/Make/Dir.make
+
+default: parsubdirs
+
+install: installsubdirs

Added: grass-addons/grass7/hadoop/hd/dependency.py
===================================================================
--- grass-addons/grass7/hadoop/hd/dependency.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/dependency.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,26 @@
+# Check that the Python libraries required by the GRASS Hadoop Framework are
+# importable. A friendly message is printed for each missing library; the
+# unguarded imports at the end then raise ImportError so that the Makefile
+# can detect the failure from the traceback.
+try:
+    import sqlalchemy
+except ImportError:
+    print('sqlalchemy library is missing.')
+
+try:
+    from snakebite.client import Client, HAClient, Namenode
+except ImportError:
+    print('snakebite library is missing.')
+
+try:
+    import thrift
+except ImportError:
+    print('thrift library is missing.')
+
+try:
+    import hdfs
+except ImportError:
+    print('hdfs library is missing.')
+
+import hdfs
+import thrift
+from snakebite.client import Client, HAClient, Namenode
+import sqlalchemy
+

Added: grass-addons/grass7/hadoop/hd/hd.db.connect/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.db.connect/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.db.connect/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.db.connect
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,77 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.db.connect</em> provides a connection manager for the GRASS Hadoop Framework.
+
+
+<p>
+The module stores connection profiles in the
+default GRASS GIS database backend, which is SQLite by default. The usage of
+the connection manager is derived from the current GRASS db.* modules.
+The profile set up as the primary connection is used by all involved modules.
+<h2>NOTES</h2>
+
+<h3>Defining connection</h3>
+Parameters <em>driver</em> and <em>conn_id</em>
+are mandatory for each connection profile. Parameter <em>driver</em> defines the
+protocol for communication with the database and <em>conn_id</em> is a free,
+unique string identifying the connection profile. The other parameters, <em>host</em>,
+<em>port</em>, <em>login</em>, <em>passwd</em>, <em>schema</em> and
+<em>authmechanism</em>, depend on the configuration of the database server. After a new
+connection is added, the module automatically sets it as the active one.
+
+When controlling several Hadoop clusters it is convenient
+to define a connection profile for each of them and switch between the profiles by flag <em>-a</em>
+together with parameters <em>conn_id</em> and <em>driver</em>.
+
+<h3>Local hosts</h3>
+
+To access HDFS from the GRASS Hadoop Framework the driver must
+know all external IP addresses of the master and the workers of the cluster.
+After the client accesses the HDFS daemon (port 50070) it receives a
+message with the local host names and ports of the workers instead of their IP addresses. If the client is
+running on a different machine than the master, these IP addresses and local host names
+must be defined. On Linux systems local hosts are declared in the
+file <em>/etc/hosts</em>.
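+
+A minimal sketch of such an <em>/etc/hosts</em> entry, assuming a worker
+reported as <em>cluster-4-w-0.c.hadoop</em> (hypothetical name) with the
+external address 203.0.113.11:
+
+<div class="code"><pre>
+203.0.113.11   cluster-4-w-0.c.hadoop   cluster-4-w-0
+</pre>
+</div>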
+
+<h2>EXAMPLES</h2>
+
+Defining a connection to a Hive database (hiveserver2 driver):
+
+<div class="code"><pre>
+hd.db.connect driver=hiveserver2 conn_id=hive_spatial  host=cluster-4-m.c.hadoop port=10000 login=matt schema=default
+</pre>
+</div>
+
+<p>
+    Defining a connection to a Hadoop namenode (WebHDFS REST API):
+
+<div class="code"><pre>
+hd.db.connect driver=webhdfs conn_id=hdfs_spatial login=matt host=cluster-4-m.c.hadoop port=50070
+</pre>
+</div>
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,177 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.db.connect
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Connection manager for Hive and Hadoop database
+#% keyword: database
+#% keyword: hdfs
+#%end
+#%option
+#% key: driver
+#% type: string
+#% required: no
+#% description: Type of database driver
+#% options: hiveserver2, hdfs, webhdfs, hive_cli
+#% guisection: Connection
+#%end
+#%option
+#% key: conn_id
+#% type: string
+#% required: no
+#% description: Identifier of the connection (free string)
+#% guisection: Connection
+#%end
+#%option
+#% key: host
+#% description: host
+#% type: string
+#% required: no
+#% guisection: Connection
+#%end
+#%option
+#% key: port
+#% type: integer
+#% required: no
+#% description: Port of db
+#% guisection: Connection
+#%end
+#%option
+#% key: login
+#% type: string
+#% required: no
+#% description: Login
+#% guisection: Connection
+#%end
+#%option
+#% key: passwd
+#% type: string
+#% required: no
+#% description: Password
+#% guisection: Connection
+#%end
+#%option
+#% key: schema
+#% type: string
+#% required: no
+#% description: schema
+#% guisection: Connection
+#%end
+#%option
+#% key: authmechanism
+#% type: string
+#% required: no
+#% options: PLAIN
+#% description: Authentication mechanism type
+#% guisection: Connection
+#%end
+#%option
+#% key: connectionuri
+#% type: string
+#% required: no
+#% description: Connection URI string of the database
+#% guisection: Connection uri
+#%end
+#%option
+#% key: rmid
+#% type: integer
+#% required: no
+#% description: Remove connection by id
+#% guisection: manager
+#%end
+#%flag
+#% key: c
+#% description: Print table of connections
+#% guisection: manager
+#%end
+#%flag
+#% key: p
+#% description: Print active connection
+#% guisection: manager
+#%end
+#%flag
+#% key: r
+#% description: Remove all connections
+#% guisection: manager
+#%end
+#%flag
+#% key: t
+#% description: Test connection by driver
+#% guisection: manager
+#%end
+#%flag
+#% key: a
+#% description: Set active connection by conn_id and driver
+#% guisection: manager
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    # add new connection
+    conn = ConnectionManager()
+    if options['connectionuri']:
+        conn.set_connection_uri(options['connectionuri'])
+        conn.add_connection()
+        conn.test_connection()
+        return
+
+    if options['host'] and options['driver'] and options['conn_id']:
+        conn.set_connection(conn_type=options['driver'],
+                            conn_id=options['conn_id'],
+                            host=options['host'],
+                            port=options['port'],
+                            login=options['login'],
+                            password=options['passwd'],
+                            schema=options['schema']
+                            )
+        conn.add_connection()
+        conn.test_connection()
+        return
+
+    if options['rmid']:
+        conn.remove_conn_Id(options['rmid'])
+        return
+    # print table of connection
+    elif flags['c']:
+        conn.show_connections()
+        return
+    # drop table with connections
+    elif flags['r']:
+        conn.drop_connection_table()
+        conn.show_connections()
+        return
+    # print active connection
+    elif flags['p']:
+        conn.show_active_connections()
+        return
+    elif flags['t']:
+        if options['driver']:
+            conn.test_connection(options['driver'])
+        else:
+            grass.fatal("Parameter <driver> must be set")
+        return
+    elif flags['a']:
+        if options['driver'] and options['conn_id']:
+            conn.set_active_connection(options['driver'], options['conn_id'])
+        else:
+            grass.fatal("Parameters <driver> and <conn_id> must be set")
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.db.connect/hd.db.connect.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.esri2vector/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.esri2vector/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.esri2vector/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.esri2vector
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,45 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.esri2vector</em> converts a Hive table stored as serialized Esri GeoJSON to a GRASS vector map.
+
+<p>
+
+
+<h2>NOTES</h2>
+
+<h3>Usage</h3>
+By default only the geometry features of the map are exported. Parameter <em>attributes</em> specifies the attributes which will be linked to the map.
+See <a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a> for getting a Hive table from the Hadoop server.
+<h2>EXAMPLES</h2>
+
+Conversion of a Hive table stored on the local computer:
+<div class="code"><pre>
+hd.esri2vector out=europe_aggregation attributes='count int,bin_id int' path=/path/to/hive/table
+</pre>
+</div>
+
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,70 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.esri2vector
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Module for converting Esri GeoJSON to a GRASS vector map
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+#%option G_OPT_V_OUTPUT
+#% key: out
+#% required: yes
+#%end
+#%option
+#% key: path
+#% type: string
+#% description: Path to the folder with files
+#%end
+#%option
+#% key: attributes
+#% type: string
+#% description: list of attributes with datatype
+#%end
+
+
+
+import os
+import sys
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import GrassMapBuilderEsriToEsri
+
+def main():
+    files = os.listdir(options['path'])
+    map_string = ''
+    # convert each block (file) of the table to a separate map
+    for block in files:
+        map = '%s_0%s' % (options['out'], block)
+        block = os.path.join(options['path'], block)
+        map_build = GrassMapBuilderEsriToEsri(block,
+                                              map,
+                                              options['attributes'])
+        try:
+            map_build.build()
+            map_string += '%s,' % map
+        except Exception as e:
+            grass.warning("Error: %s\n     Map < %s > conversion failed" % (e, block))
+
+    path, folder_name = os.path.split(options['path'])
+    grass.message("To merge the maps run: v.patch output=%s -e --overwrite input=%s" % (folder_name, map_string))
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()
+


Property changes on: grass-addons/grass7/hadoop/hd/hd.esri2vector/hd.esri2vector.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hdfs.in.fs
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,37 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hdfs.in.fs</em> allows putting data from the local filesystem to HDFS.
+
+<p>
+
+<h2>NOTES</h2>
+
+<h3>Usage</h3>
+The module allows copying any data from the local filesystem to HDFS.
+For uploading external GeoJSON files to HDFS it is
+necessary to modify their standardized format, because the serialization for
+JSON has several formatting requirements. See the usage sketch below.
+
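+<h2>EXAMPLES</h2>
+
+A minimal usage sketch, assuming a GeoJSON file <em>/tmp/points.json</em>
+(hypothetical path) already prepared in the serialized format, uploaded to
+the default GRASS dataset path on HDFS:
+
+<div class="code"><pre>
+hd.hdfs.in.fs driver=webhdfs hdfs=@grass_data_hdfs local=/tmp/points.json
+</pre>
+</div>
+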
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,63 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hdfs.in.fs
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Module for transferring files to HDFS
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+#%option
+#% key: hdfs
+#% type: string
+#% answer: @grass_data_hdfs
+#% required: yes
+#% description: HDFS path or default grass dataset
+#%end
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% options: hdfs,webhdfs
+#% description: HDFS driver
+#%end
+#%option G_OPT_F_INPUT
+#% key: local
+#% guisection: file input
+#%end
+
+
+import os
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import GrassHdfs
+
+
+def main():
+    if options['hdfs'] == '@grass_data_hdfs':
+        LOCATION_NAME = grass.gisenv()['LOCATION_NAME']
+        MAPSET = grass.gisenv()['MAPSET']
+        MAPSET_PATH = os.path.join('grass_data_hdfs', LOCATION_NAME, MAPSET, 'external')
+        options['hdfs'] = MAPSET_PATH
+
+    if options['local']:
+        transf = GrassHdfs(options['driver'])
+        transf.upload(options['local'], options['hdfs'])
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hdfs.in.fs/hd.hdfs.in.fs.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hdfs.in.vector
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,60 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hdfs.in.vector</em> converts a GRASS vector map to serialized GeoJSON and copies it to HDFS.
+
+<p>
+
+
+<h2>NOTES</h2>
+Vector maps in the native GRASS format are not
+suitable for the serialization which is needed to exploit the potential of
+spatial frameworks for Hadoop. The effective, and in most cases the only
+possible, way is to store spatial data in JSON, especially GeoJSON. This format suits
+serialization well and a library for reading it is available in the catalog of Hive.
+
+Module <a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a> supports transformation of a GRASS map to
+GeoJSON format and its transfer to HDFS. Behind the module there are two main steps. Firstly, the map is
+converted to GeoJSON using <em>v.out.ogr</em> and edited to a format which is
+suitable for parsing by the widely used SerDe functions for Hive. After that, the
+custom GeoJSON file is uploaded to the destination on HDFS. By default, the
+HDFS path is set to <em>hdfs://grass_data_hdfs/LOCATION_NAME/MAPSET/vector</em>.
+
+In addition, the hd.hdfs.* package also includes module <a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a> which allows
+transfer of external files to HDFS. Usage of this module becomes important for
+uploading CSV or GeoJSON files created outside of GRASS. For uploading external GeoJSON files to HDFS it is
+necessary to modify their standardized format, because the serialization for
+JSON has several formatting requirements; a sketch is given below. See also the documentation on the wiki page.
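+
+A minimal sketch of the required format, assuming a SerDe that reads one
+JSON record per line (as the commonly used Hive JSON SerDes do): instead of
+a single standard GeoJSON FeatureCollection, every feature is written on
+its own line:
+
+<div class="code"><pre>
+{"type":"Feature","properties":{"cat":1},"geometry":{"type":"Point","coordinates":[15.0,50.0]}}
+{"type":"Feature","properties":{"cat":2},"geometry":{"type":"Point","coordinates":[15.1,50.1]}}
+</pre>
+</div>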
+
+<h2>EXAMPLES</h2>
+
+PUT vector map to HDFS
+<div class="code"><pre>
+hd.hdfs.in.vector  driver=webhdfs  hdfs=/data map=klad_zm10 layer=1
+</pre>
+</div>
+
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,77 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hdfs.in.vector
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Module for exporting vector features to HDFS (GeoJSON)
+#% keyword: database
+#% keyword: hdfs
+#%end
+#%option
+#% key: hdfs
+#% type: string
+#% answer: @grass_data_hdfs
+#% required: yes
+#% description: HDFS path or default grass dataset
+#%end
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% options: webhdfs,hdfs
+#% description: HDFS driver
+#%end
+#%option G_OPT_V_MAP
+#% key: map
+#% required: yes
+#% label: Name of vector map to export to hdfs
+#%end
+#%option G_OPT_V_TYPE
+#% key: type
+#% required: yes
+#%end
+#%option G_OPT_V_FIELD
+#% key: layer
+#% required: yes
+#%end
+
+
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import JSONBuilder, GrassHdfs
+
+
+def main():
+    transf = GrassHdfs(options['driver'])
+    if options['hdfs'] == '@grass_data_hdfs':
+        options['hdfs'] = transf.get_path_grass_dataset()
+
+    grass.message(options['hdfs'])
+    grass_map = {"map": options['map'],
+                 "layer": options['layer'],
+                 "type": options['type'],
+                 }
+
+    json = JSONBuilder(grass_map)
+    json = json.get_JSON()
+
+    grass.message('upload %s' % json)
+
+    transf.upload(json, options['hdfs'])
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hdfs.in.vector/hd.hdfs.in.vector.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.info/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.info/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.info/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hdfs.info
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,35 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hdfs.info</em> allows getting essential information about files and folders stored in HDFS.
+
+<p>
+
+<h2>NOTES</h2>
+
+<h3>Usage</h3>
+The module currently supports several basic operations: checking whether a path exists, recursive listing of directories and
+creating an HDFS directory.
+
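+<h2>EXAMPLES</h2>
+
+A minimal usage sketch, assuming the HDFS directory
+<em>/user/hive/warehouse</em> exists, listing its content recursively:
+
+<div class="code"><pre>
+hd.hdfs.info driver=webhdfs path=/user/hive/warehouse -r
+</pre>
+</div>
+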
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,61 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hdfs.info
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Module for getting information about files and folders in HDFS
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: webhdfs
+#% description: Type of database driver
+#% options: webhdfs, hdfs
+#%end
+#%option
+#% key: path
+#% type: string
+#% required: no
+#% description: Path to check
+#% guisection: Connection
+#%end
+#%flag
+#% key: r
+#% description: Recursive listing
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    conn = ConnectionManager()
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+
+    if options['path']:
+        for path in hive.check_for_content(options['path'], flags['r']):
+            grass.message(path)
+            grass.message(path)
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hdfs.info/hd.hdfs.info.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hdfs.out.vector
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,51 @@
+<h2>DESCRIPTION</h2>
+
+Module <em>hd.hdfs.out.vector</em> allows getting a Hive table and transforming it to a GRASS vector map.
+
+
+<p>
+
+<h2>NOTES</h2>
+
+The module allows getting results from HDFS and creating a GRASS vector map.
+The module downloads a serialized GeoJSON file from HDFS to the local filesystem and in the second step
+modifies the file to standard GeoJSON format. The second step is important to make the
+file consistent with the GeoJSON standard. After that, the format is readable
+by <em>v.in.ogr</em> and can be transformed
+to a native GRASS vector map.
+
+<h2>EXAMPLES</h2>
+
+Copy and transform a Hive table stored as Esri GeoJSON to a GRASS vector map.
+The resulting map will include a table with attributes count and bin_id.
+
+<div class="code"><pre>
+hd.hdfs.out.vector driver=webhdfs out=europe_agg2 attributes='count int,bin_id int' hdfs=/user/hive/warehouse/europe_agg1
+</pre>
+</div>
+
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,114 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hdfs.out.vector
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Module for creating a map from a Hive table by converting Esri GeoJSON to a GRASS map
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% options: webhdfs
+#% description: HDFS driver
+#%end
+#%option
+#% key: table
+#% type: string
+#% description: Name of table for import
+#%end
+#%option
+#% key: hdfs
+#% type: string
+#% description: HDFS path to the table. See hd.hive.info with flag -h
+#%end
+#%option G_OPT_V_OUTPUT
+#% key: out
+#% required: yes
+#%end
+#%flag
+#% key: r
+#% description: Remove temporary files
+#% guisection: data
+#%end
+#%option
+#% key: attributes
+#% type: string
+#% description: list of attributes with datatype
+#% guisection: data
+#%end
+
+import os
+import shutil
+import sys
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import GrassMapBuilderEsriToEsri, GrassHdfs, ConnectionManager
+from hdfsgrass.hdfs_grass_util import get_tmp_folder
+
+def main():
+    tmp_dir = os.path.join(get_tmp_folder(), options['out'])
+
+    if os.path.exists(tmp_dir):
+        shutil.rmtree(tmp_dir)
+
+    transf = GrassHdfs(options['driver'])
+    table_path = options['hdfs']
+
+    if options['table']:
+        conn = ConnectionManager()
+        if not conn.get_current_connection('hiveserver2'):
+            grass.fatal("Cannot connect to hive for table description. "
+                        "Use parameter hdfs without parameter table")
+
+        hive = conn.get_hook()
+        table_path = hive.find_table_location(options['table'])
+        tmp_dir = os.path.join(tmp_dir, options['table'])
+
+
+    if not transf.download(hdfs=table_path,
+                           fs=tmp_dir):
+        return
+
+
+    files = os.listdir(tmp_dir)
+    map_string = ''
+    # convert each downloaded block (file) of the table to a separate map
+    for block in files:
+        map = '%s_%s' % (options['out'], block)
+        block = os.path.join(tmp_dir, block)
+
+        map_build = GrassMapBuilderEsriToEsri(block,
+                                              map,
+                                              options['attributes'])
+        try:
+            map_build.build()
+            map_string += '%s,' % map
+        except Exception as e:
+            grass.warning("Error: %s\n     Map < %s > conversion failed" % (e, block))
+
+    path, folder_name = os.path.split(tmp_dir)
+    grass.message("To merge the maps run: v.patch output=%s -e --overwrite input=%s" % (folder_name, map_string))
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hdfs.out.vector/hd.hdfs.out.vector.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hive.csv.table/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.csv.table/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.csv.table/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hive.csv.table
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,51 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hive.csv.table</em> helps to create a Hive table for storing data in CSV format.
+
+
+<p>
+
+
+<h2>NOTES</h2>
+
+The Spatial Framework for Hadoop from Esri supports reading from
+several file formats. The framework allows creating geometric data types from
+WKB, JSON and GeoJSON. A table can be created from the hiveserver2 command
+line or with module <em>hd.hive.csv.table</em>.
+The definition of the table is provided using the parameters and flags of the module. This helps
+the user to make a CSV based table without advanced knowledge of Hive syntax.
+
+
+<h2>EXAMPLES</h2>
+
+Creating a table for storing coordinates:
+
+<div class="code"><pre>
+hd.hive.csv.table driver=hiveserver2 table=csv columns="x int, y int, z int" stored=textfile -e -d
+</pre>
+</div>
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,128 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hive.csv.table
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Hive table creator
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: hiveserver2
+#% description: Type of database driver
+#% options: hiveserver2
+#%end
+#%option
+#% key: table
+#% type: string
+#% required: yes
+#% description: name of table
+#% guisection: table
+#%end
+#%option
+#% key: columns
+#% type: string
+#% required: yes
+#% description: python dictionary {attribute:datatype}
+#% guisection: table
+#%end
+#%option
+#% key: stored
+#% type: string
+#% required: no
+#% answer: textfile
+#% description: Storage format (e.g. textfile)
+#% guisection: table
+#%end
+#%option
+#% key: outputformat
+#% type: string
+#% required: no
+#% description: Java class for handling the output format
+#% guisection: table
+#%end
+#%option
+#% key: csvpath
+#% type: string
+#% required: no
+#% description: hdfs path specifying input data
+#% guisection: data
+#%end
+#%option
+#% key: partition
+#% type: string
+#% required: no
+#% description: Target partition as a dict of partition columns and values
+#% guisection: data
+#%end
+#%option
+#% key: serde
+#% type: string
+#% description: java class for serialization of json
+#% guisection: table
+#%end
+#%option
+#% key: delimiter
+#% type: string
+#% required: yes
+#% answer: ,
+#% description: CSV delimiter of fields
+#% guisection: data
+#%end
+#%flag
+#% key: o
+#% description: Optional if csvpath for loading data is declared. Overwrite all data in the table.
+#% guisection: data
+#%end
+#%flag
+#% key: d
+#% description: Firstly drop table if exists
+#% guisection: table
+#%end
+#%flag
+#% key: e
+#% description: The EXTERNAL keyword lets you create a table and provide a LOCATION so that Hive does not use a default location for this table. This comes in handy if you already have data generated. When dropping an EXTERNAL table, data in the table is NOT deleted from the file system.
+#% guisection: table
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    conn = ConnectionManager()
+
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+    hive.create_csv_table(table=options['table'],
+                          field=options['columns'],
+                          partition=options['partition'],
+                          delimiter=options['delimiter'],
+                          stored=options['stored'],
+                          serde=options['serde'],
+                          outputformat=options['outputformat'],
+                          external=flags['e'],
+                          recreate=flags['d'],
+                          filepath=options['csvpath'],
+                          overwrite=flags['o'])
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hive.csv.table/hd.hive.csv.table.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hive.execute/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.execute/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.execute/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hive.execute
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,45 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hive.execute</em> allows the user to execute a Hive query (HQL).
+
+
+<p>
+The module allows execution of any Hive command. It works properly only with
+non-optimised query types like creating tables, loading data, dropping databases
+etc., thus queries without formatted output. Use flag <em>-f</em> to fetch and print the results.
+
+
+
+<h2>EXAMPLES</h2>
+
+Drop table europe_agg2:
+
+<div class="code"><pre>
+hd.hive.execute driver=hiveserver2 hql='DROP TABLE europe_agg2'
+</pre>
+</div>
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,61 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hive.execute
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Execute a Hive SQL (HQL) command
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: hiveserver2
+#% description: Type of database driver
+#% options: hive_cli, hiveserver2
+#%end
+#%option
+#% key: hql
+#% type: string
+#% required: yes
+#% description: hive sql command
+#%end
+#%flag
+#% key: f
+#% description: fetch results
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    conn = ConnectionManager()
+
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+    result = hive.execute(options['hql'], flags['f'])
+    if flags['f']:
+        for i in result:
+            print(i)
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hive.execute/hd.hive.execute.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hive.info/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.info/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.info/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hive.info
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,32 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hive.info</em> allows getting essential metadata about the Hive database.
+
+<p>
+
+<h2>NOTES</h2>
+The module currently supports several basic operations: listing tables, describing a table and printing the HDFS location of a table.
+
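+<h2>EXAMPLES</h2>
+
+A minimal usage sketch, assuming a table <em>europe_agg1</em> (hypothetical
+name) exists: print all tables, then describe the table and print its HDFS
+location:
+
+<div class="code"><pre>
+hd.hive.info driver=hiveserver2 -p
+hd.hive.info driver=hiveserver2 table=europe_agg1 -d -h
+</pre>
+</div>
+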
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,85 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hive.info
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Module for getting metadata of tables in Hive
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: hiveserver2
+#% description: Type of database driver
+#% options: hiveserver2
+#%end
+#%option
+#% key: table
+#% type: string
+#% required: no
+#% description: Name of table
+#% guisection: Connection
+#%end
+#%flag
+#% key: p
+#% description: print tables
+#% guisection: table
+#%end
+#%flag
+#% key: d
+#% description: describe table
+#% guisection: table
+#%end
+#%flag
+#% key: h
+#% description: print hdfs path of table
+#% guisection: table
+#%end
+
+
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+
+def main():
+    conn = ConnectionManager()
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+    if flags['p']:
+        hive.show_tables()
+    if flags['d']:
+        if not options['table']:
+            grass.fatal("With flag <d> table must be defined")
+        hive.describe_table(options['table'], True)
+
+    if flags['h']:
+        if not options['table']:
+            grass.fatal("With flag <h> table must be defined")
+        print(hive.find_table_location(options['table']))
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hive.info/hd.hive.info.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hive.json.table/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.json.table/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.json.table/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hive.json.table
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,49 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hive.json.table</em> helps to create a Hive table for storing data in GeoJSON format.
+
+
+<p>
+
+
+<h2>NOTES</h2>
+The Spatial Framework for Hadoop from Esri supports reading from
+several file formats. The framework allows creating geometric data types from
+WKB, JSON and GeoJSON. A table can be created from the hiveserver2 command
+line or with module <em>hd.hive.json.table</em>.
+The definition of the table is provided using the parameters and flags of the module. This helps
+the user to make a GeoJSON based table without advanced knowledge of Hive syntax.
+
+<h2>EXAMPLES</h2>
+
+Creating a table for storing GeoJSON features:
+
+<div class="code"><pre>
+hd.hive.json.table driver=hiveserver2 table=json columns="type string, properties STRUCT<cat:SMALLINT>, geometry string" -e -d
+</pre>
+</div>
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,111 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hive.json.table
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Create Hive spatial table for storing GeoJSON maps
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: hiveserver2
+#% description: Type of database driver
+#% options: hive_cli,hiveserver2
+#% guisection: table
+#%end
+#%option
+#% key: table
+#% type: string
+#% required: yes
+#% description: Name of table
+#% guisection: table
+#%end
+#%option
+#% key: columns
+#% type: string
+#% description: Definition of columns, e.g. "x int, y string"
+#% guisection: table
+#%end
+#%option
+#% key: stored
+#% type: string
+#% required: no
+#% description: Hive storage format (STORED AS clause), e.g. textfile
+#% guisection: table
+#%end
+#%option
+#% key: serde
+#% type: string
+#% required: yes
+#% answer: org.openx.data.jsonserde.JsonSerDe
+#% description: Java class for serialization of JSON
+#% guisection: table
+#%end
+#%option
+#% key: outformat
+#% type: string
+#% description: Java class handling the output format
+#% guisection: table
+#%end
+#%option
+#% key: jsonpath
+#% type: string
+#% description: HDFS path specifying input data
+#% guisection: data
+#%end
+#%flag
+#% key: o
+#% description: Overwrite all data in the table (applies only if <jsonpath> for loading data is declared)
+#% guisection: data
+#%end
+#%flag
+#% key: d
+#% description: Drop the table first if it exists
+#% guisection: table
+#%end
+#%flag
+#% key: e
+#% description: The EXTERNAL keyword lets you create a table and provide a LOCATION so that Hive does not use a default location for this table. This comes in handy if you already have data generated. When dropping an EXTERNAL table, data in the table is NOT deleted from the file system.
+#% guisection: table
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    if not options['columns']:
+        grass.fatal("Parameter <columns> must be defined")
+
+    conn = ConnectionManager()
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+    hive.create_geom_table(table=options['table'],
+                           field=options['columns'],
+                           stored=options['stored'],
+                           serde=options['serde'],
+                           outputformat=options['outformat'],
+                           external=flags['e'],
+                           recreate=flags['d'],
+                           filepath=options['jsonpath'],
+                           overwrite=flags['o'])
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hive.json.table/hd.hive.json.table.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hive.load/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.load/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.load/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hive.load
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,47 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hive.load</em> module allows inserting (loading) data stored in HDFS into a Hive table.
+
+
+<p>
+
+
+<h2>NOTES</h2>
+Module <em>hd.hive.load</em> provides an option to load data into an existing table.
+Availability of this module gives more flexibility when building workflows,
+especially when using Python scripting or the <a href="https://grass.osgeo.org/grass70/manuals/wxGUI.gmodeler.html">graphical modeler of
+GRASS</a>, as sketched below.
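+
+For instance, a minimal Python scripting sketch (table and file names are
+taken from the example below):
+
+<div class="code"><pre>
+from grass.pygrass.modules import Module
+
+Module('hd.hive.load', driver='hiveserver2',
+       table='europe', path='/data/europe_latest_fix.csv')
+</pre>
+</div>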
+
+<h2>EXAMPLES</h2>
+
+The example below corresponds to the HQL command: <em>LOAD DATA INPATH '/data/europe_latest_fix.csv' OVERWRITE INTO TABLE europe;</em>
+<div class="code"><pre>
+hd.hive.load driver=hiveserver2 table=europe path=/data/europe_latest_fix.csv
+</pre>
+</div>
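+
+If the target table is partitioned, the <em>partition</em> parameter can be
+supplied as well (the partition specification below is illustrative):
+
+<div class="code"><pre>
+hd.hive.load driver=hiveserver2 table=europe path=/data/europe_latest_fix.csv partition="year=2016"
+</pre>
+</div>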
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,68 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hive.load
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Load data to Hive table
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: hiveserver2
+#% description: Type of database driver
+#% options: hive_cli, hiveserver2
+#%end
+#%option
+#% key: table
+#% type: string
+#% required: yes
+#% description: Name of table
+#%end
+#%option
+#% key: path
+#% type: string
+#% required: yes
+#% description: Path of HDFS file
+#%end
+#%option
+#% key: partition
+#% type: string
+#% required: no
+#% description: Partition as a dict of columns and values
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    conn = ConnectionManager()
+
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+    hive.data2table(filepath=options['path'],
+                    table=options['table'],
+                    overwrite=True,  # TODO: expose overwrite as a flag
+                    partition=options['partition'])
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hive.load/hd.hive.load.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hd.hive.select/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.select/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.select/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = hd.hive.select
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script

Added: grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.html
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.html	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.html	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,42 @@
+<h2>DESCRIPTION</h2>
+
+<em>hd.hive.select</em> module allows querying Hive tables.
+
+
+<p>
+
+
+<h2>NOTES</h2>
+
+<h2>EXAMPLES</h2>
+
+Below is an example of an HQL query with the output redirected to a file:
+<div class="code"><pre>
+hd.hive.select driver=hiveserver2 hql='SELECT linkid from mwrecord' out='tmp/linkid.hql'
+</pre>
+</div>
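+
+If the <em>out</em> parameter is omitted, the result is printed to standard
+output, so the module can be used directly in shell pipelines:
+
+<div class="code"><pre>
+hd.hive.select driver=hiveserver2 hql='SELECT count(*) FROM mwrecord'
+</pre>
+</div>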
+
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="hd.hdfs.in.fs.html">hd.hdfs.in.fs</a>,
+<a href="hd.hdfs.in.vector.html">hd.hdfs.in.vector</a>,
+<a href="hd.hdfs.out.vector.html">hd.hdfs.out.vector</a>,
+<a href="hd.hdfs.info.html">hd.hdfs.info</a>,
+<a href="hd.hive.execute.html">hd.hive.execute</a>,
+<a href="hd.hive.csv.table.html">hd.hive.csv.table</a>,
+<a href="hd.hive.select.html">hd.hive.select</a>,
+<a href="hd.hive.info.html">hd.hive.info</a>,
+<a href="hd.hive.json.table.html">hd.hive.json.table</a>
+</em>
+
+<p>
+    See also related <a href="http://grasswiki.osgeo.org/wiki/">wiki page</a>.
+
+
+<h2>AUTHOR</h2>
+
+Matej Krejci, <a href="http://geo.fsv.cvut.cz/gwiki/osgeorel">OSGeoREL</a>
+at the Czech Technical University in Prague, developed
+during master thesis project 2016 (mentor: Martin Landa)

Added: grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,77 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+############################################################################
+#
+# MODULE:       hd.hive.select
+# AUTHOR(S):    Matej Krejci (matejkrejci at gmail.com)
+#
+# COPYRIGHT:    (C) 2016 by the GRASS Development Team
+#
+#               This program is free software under the GNU General
+#               Public License (>=v2). Read the file COPYING that
+#               comes with GRASS for details.
+#
+#############################################################################
+
+#%module
+#% description: Execute a HiveQL query and fetch the results
+#% keyword: database
+#% keyword: hdfs
+#% keyword: hive
+#%end
+
+#%option
+#% key: driver
+#% type: string
+#% required: yes
+#% answer: hiveserver2
+#% description: Type of database driver
+#% options: hive_cli, hiveserver2
+#%end
+#%option
+#% key: hql
+#% type: string
+#% required: yes
+#% description: HiveQL command
+#%end
+#%option
+#% key: schema
+#% type: string
+#% required: no
+#% description: Hive database schema
+#%end
+#%option G_OPT_F_OUTPUT
+#% key: out
+#% type: string
+#% required: no
+#% description: Name for output file (if omitted output to stdout)
+#%end
+
+import grass.script as grass
+
+from hdfsgrass.hdfs_grass_lib import ConnectionManager
+
+
+def main():
+    conn = ConnectionManager()
+
+    conn.get_current_connection(options["driver"])
+    hive = conn.get_hook()
+
+    if not options['schema']:
+        options['schema'] = 'default'
+
+    out = hive.get_results(hql=options['hql'],
+                           schema=options['schema'])
+
+    if options['out']:
+        with open(options['out'], 'w') as io:
+            io.writelines(out)
+    else:
+        print(out)
+
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    main()


Property changes on: grass-addons/grass7/hadoop/hd/hd.hive.select/hd.hive.select.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hdfsgrass/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfsgrass/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfsgrass/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,23 @@
+MODULE_TOPDIR = ../..
+
+include $(MODULE_TOPDIR)/include/Make/Other.make
+include $(MODULE_TOPDIR)/include/Make/Python.make
+
+DSTDIR = $(ETC)/hd/hdfsgrass
+
+MODULES = $(wildcard *.py)
+
+PYFILES := $(patsubst %,$(DSTDIR)/%,$(MODULES))
+PYCFILES := $(patsubst %.py,$(DSTDIR)/%.pyc,$(filter %.py,$(MODULES)))
+
+default: $(PYFILES) $(PYCFILES)
+
+install:
+	$(MKDIR) $(INST_DIR)/etc/hd/hdfsgrass
+	@cp -rL $(DSTDIR) $(INST_DIR)/etc/hd
+
+$(DSTDIR):
+	$(MKDIR) -p $@
+
+$(DSTDIR)/%: % | $(DSTDIR)
+	$(INSTALL_DATA) $< $@

Added: grass-addons/grass7/hadoop/hd/hdfsgrass/__init__.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfsgrass/__init__.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfsgrass/__init__.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,12 @@
+import os
+
+import grass_map
+import hdfs_grass_lib
+import hdfs_grass_util
+
+
+for module in os.listdir(os.path.dirname(__file__)):
+    if module == '__init__.py' or module[-3:] != '.py':
+        continue
+    __import__(module[:-3], locals(), globals())
+del module

Added: grass-addons/grass7/hadoop/hd/hdfsgrass/grass_map.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfsgrass/grass_map.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfsgrass/grass_map.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,131 @@
+import os
+import grass.script as grass
+from grass.exceptions import CalledModuleError
+from grass.pygrass.modules import Module
+from grass.script import parse_key_val
+from subprocess import PIPE
+
+class VectorDBInfo:
+    """Class providing information about attribute tables
+    linked to a vector map"""
+    def __init__(self, map):
+        self.map = map
+
+        # dictionary of layer number and associated (driver, database, table)
+        self.layers = {}
+        # dictionary of table and associated columns (type, length, values, ids)
+        self.tables = {}
+
+        if not self._CheckDBConnection(): # -> self.layers
+            return
+
+        self._DescribeTables() # -> self.tables
+
+    def _CheckDBConnection(self):
+        """Check DB connection"""
+        nuldev = open(os.devnull, 'w+')
+        # if map is not defined (happens with vnet initialization) or it doesn't exist
+        try:
+            self.layers = grass.vector_db(map=self.map, stderr=nuldev)
+        except CalledModuleError:
+            return False
+        finally:  # always close nuldev
+            nuldev.close()
+
+        return bool(len(self.layers.keys()) > 0)
+
+    def _DescribeTables(self):
+        """Describe linked tables"""
+        for layer in self.layers.keys():
+            # determine column names and types
+            table = self.layers[layer]["table"]
+            columns = {} # {name: {type, length, [values], [ids]}}
+            i = 0
+            for item in grass.db_describe(table = self.layers[layer]["table"],
+                                          driver = self.layers[layer]["driver"],
+                                          database = self.layers[layer]["database"])['cols']:
+                name, type, length = item
+                # FIXME: support more datatypes
+                if type.lower() == "integer":
+                    ctype = int
+                elif type.lower() == "double precision":
+                    ctype = float
+                else:
+                    ctype = str
+
+                columns[name.strip()] = { 'index'  : i,
+                                          'type'   : type.lower(),
+                                          'ctype'  : ctype,
+                                          'length' : int(length),
+                                          'values' : [],
+                                          'ids'    : []}
+                i += 1
+
+            # check for key column
+            # v.db.connect -g/p returns always key column name lowercase
+            if self.layers[layer]["key"] not in columns.keys():
+                for col in columns.keys():
+                    if col.lower() == self.layers[layer]["key"]:
+                        self.layers[layer]["key"] = col.upper()
+                        break
+
+            self.tables[table] = columns
+
+        return True
+
+    def Reset(self):
+        """Reset"""
+        for layer in self.layers:
+            table = self.layers[layer]["table"] # get table desc
+            for name in self.tables[table].keys():
+                self.tables[table][name]['values'] = []
+                self.tables[table][name]['ids']    = []
+
+    def GetName(self):
+        """Get vector name"""
+        return self.map
+
+    def GetKeyColumn(self, layer):
+        """Get key column of given layer
+
+        :param layer: vector layer number
+        """
+        return str(self.layers[layer]['key'])
+
+    def GetTable(self, layer):
+        """Get table name of given layer
+
+        :param layer: vector layer number
+        """
+        return self.layers[layer]['table']
+
+    def GetDbSettings(self, layer):
+        """Get database settins
+
+        :param layer: layer number
+
+        :return: (driver, database)
+        """
+        return self.layers[layer]['driver'], self.layers[layer]['database']
+
+    def GetTableDesc(self, table):
+        """Get table columns
+
+        :param table: table name
+        """
+        return self.tables[table]
+
+
+class GrassMap(object):
+    def __init__(self, map):
+        self.map = map
+
+    def get_topology(self, map):
+        vinfo = Module('v.info',
+                       map=self.map,
+                       flags='t',
+                       quiet=True,
+                       stdout_=PIPE)
+
+        features = parse_key_val(vinfo.outputs.stdout)
+        return features
+

Added: grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_lib.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_lib.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_lib.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,717 @@
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import inspect
+import logging
+import os
+import sys
+#import mmap
+path = os.path.join(os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))), os.pardir)
+if not path in sys.path:
+    sys.path.append(path)
+
+from hdfswrapper.connections import Connection
+from hdfswrapper import settings
+from sqlalchemy.exc import IntegrityError
+from sqlalchemy import Table
+from hdfs_grass_util import read_dict, save_dict, get_tmp_folder
+from grass.pygrass.modules import Module
+from grass.script.core import PIPE
+import grass.script as grass
+from grass_map  import VectorDBInfo as VectorDBInfoBase
+
+from dagpype import stream_lines, to_stream, filt, nth
+
+class ConnectionManager:
+    """
+    The handler of connection drivers for Hadoop/Hive.
+    The module provides storage of connection profiles in the default
+    GRASS GIS database backend, which is SQLite.
+    The database manager for HDFS allows setting a connection id and its driver.
+    Thus, for each type of database (driver) several user connections can be
+    stored, distinguished by a user defined id (<conn_id> parameter),
+    while each driver can have only one primary connection.
+    >>> conn = ConnectionManager()
+    >>> conn.set_connection(conn_type='hiveserver2',
+    ...                     conn_id='testhiveconn1',
+    ...                     host='172.17.0.2',
+    ...                     port=10000,
+    ...                     login='root',
+    ...                     password='test',
+    ...                     schema='default')
+    >>> conn.drop_connection_table()
+    >>> conn.show_connections()
+    >>> conn.add_connection()
+    >>> conn.test_connection(conn.get_current_Id())
+    >>> conn.remove_conn_Id('testhiveconn1')
+    >>> grass.message(conn.get_current_Id())
+    """
+
+    def __init__(self):
+
+        self.conn_id = None
+        self.conn_type = None
+        self.host = None
+        self.port = None
+        self.login = None
+        self.password = None
+        self.schema = None
+        self.authMechanism = None
+        self.connected = False
+        self.uri = None
+        self.connectionDict = None
+        self.connection = None
+
+        self.session = settings.Session
+
+    def _connect(self):
+        """
+        Perform connection from initialisation.
+        :return: self.connection which is main class of Connection module
+        """
+        if self.uri:
+            self.connectionDict = {'uri': self.uri}
+        else:
+            self.connectionDict = {'host': self.host}
+            if self.login:
+                self.connectionDict['login'] = self.login
+            if self.schema:
+                self.connectionDict['schema'] = self.schema
+            if self.conn_id:
+                self.connectionDict['conn_id'] = self.conn_id
+            if self.conn_type:
+                self.connectionDict['conn_type'] = self.conn_type
+            if self.password:
+                self.connectionDict['password'] = self.password
+            if self.port:
+                self.connectionDict['port'] = self.port
+        self.connection = Connection(**self.connectionDict)
+
+    def add_connection(self):
+        """
+        Add connection to sql database. If connection already exists, is overwritten.
+        :return:
+        """
+        grass.message('***' * 30)
+        grass.message("\n     Adding new connection \n       conn_type: %s  \n" % self.conn_type)
+        self.session.add(self.connection)
+        try:
+            self.session.commit()
+            self._set_active_connection(self.conn_type, self.connectionDict['conn_id'])
+        except IntegrityError as e:
+            grass.message("       ERROR conn_id already exists. Will be overwritten...\n")
+            grass.message('***' * 30)
+            self.session.rollback()
+            self.session.flush()
+            self.remove_conn_Id(self.connectionDict['conn_id'])
+            self.add_connection()
+            grass.message('***' * 30)
+
+    def set_connection(self, conn_id, conn_type,
+                       host=None, port=None,
+                       login=None, password=None,
+                       schema=None, authMechanism=None):
+        """
+        Set new connection.
+        :param conn_id: id of connection
+        :param conn_type: driver
+        :param host:
+        :param port:
+        :param login:
+        :param password:
+        :param schema:
+        :param authMechanism: plain or kerberos
+        :return:
+        """
+
+        if None in [conn_id, conn_type]:
+            grass.fatal("ERROR: no conn_id or conn_type defined")
+            return None
+        self.conn_id = conn_id
+        self.conn_type = conn_type
+        self.host = host
+        self.port = port
+        self.login = login
+        self.password = password
+        self.schema = schema
+        self.authMechanism = authMechanism
+
+        self._connect()
+
+    @staticmethod
+    def drop_connection_table():
+        """
+        Remove all saved connection
+        :return:
+        """
+        from sqlalchemy import MetaData
+        md = MetaData()
+        connTable = Table('connection', md)
+        try:
+            connTable.drop(settings.engine)
+            os.remove(settings.grass_config)
+            grass.message('***' * 30)
+            grass.message("\n     Table of connection has been removed \n")
+            grass.message('***' * 30)
+        except Exception as e:
+            grass.message('***' * 30)
+            grass.message("\n     No table exists\n")
+            grass.message('***' * 30)
+
+    @staticmethod
+    def show_connections():
+        """
+        Show all saved connection
+        :return:
+        """
+        cn = settings.engine.connect()
+        grass.message('***' * 30)
+        grass.message("\n     Table of connection \n")
+        try:
+            result = cn.execute('select * from connection')
+            for row in result:
+                grass.message("       %s\n" % row)
+            cn.close()
+        except Exception as e:
+            grass.message(e)
+            grass.message("        No connection\n")
+        grass.message('***' * 30)
+
+    def set_active_connection(self, conn_type=None, conn_id=None):
+        """
+        Set connection and then set it as active
+        :param conn_type: driver
+        :param conn_id: id of connection
+        :return:
+        """
+        self.set_connection(conn_type=conn_type, conn_id=conn_id)
+        self._set_active_connection()
+
+    def _set_active_connection(self, conn_type=None, conn_id=None):
+        """
+        Set active connection
+        :param conn_type: driver
+        :param conn_id: id of connection
+        :return:
+        """
+        if conn_type is None:
+            conn_type = self.conn_type
+        if conn_id is None:
+            conn_id = self.conn_id
+        cfg = read_dict(settings.grass_config)
+        cfg[conn_type] = conn_id
+        save_dict(settings.grass_config, cfg)
+
+    @staticmethod
+    def get_current_Id(conn_type):
+        """
+        Return id of active connection
+        :param conn_type:
+        :return:
+        """
+        cfg = read_dict(settings.grass_config)
+        if cfg:
+            if conn_type in cfg:
+                return cfg.get(conn_type)
+        else:
+            return None
+
+    @staticmethod
+    def show_active_connections():
+        """
+        Print active connection
+        :return:
+        """
+        cfg = read_dict(settings.grass_config)
+        if cfg:
+            grass.message('***' * 30)
+            grass.message('\n     Primary connection for db drivers\n')
+            for key, val in cfg.iteritems():
+                grass.message('       conn_type: %s -> conn_id: %s\n' % (key, val))
+        else:
+            grass.message('      No connection defined\n')
+        grass.message('***' * 30)
+
+    def get_current_connection(self, conn_type):
+        """
+        Return primary connection for given driver
+        :param conn_type:
+        :return:
+        """
+        idc = self.get_current_Id(conn_type)
+        if idc:
+            self.set_connection(conn_id=idc, conn_type=conn_type)
+            self._connect()
+        else:
+            self.connection = None
+
+        return self.connection
+
+    def get_hook(self):
+        """
+        Return hook of current connection
+        :return:
+        """
+        if self.connection:
+            return self.connection.get_hook()
+        return None
+
+    @staticmethod
+    def remove_conn_Id(id):
+        """
+        Remove connection for given ID
+        :param id:
+        :return:
+        """
+        cn = settings.engine.connect()
+        grass.message('***' * 30)
+        grass.message("\n     Removing connection %s " % id)
+        try:
+            grass.message('       conn_id= %s \n' % id)
+            cn.execute('DELETE FROM connection WHERE conn_id="%s"' % id)
+            cn.close()
+        except Exception as e:
+            grass.message("       ERROR: %s \n" % e)
+            # grass.message('     No connection with conn_id %s'%id)
+        grass.message('***' * 30)
+
+    def set_connection_uri(self, uri):
+        """
+        Set connection from uri
+        :param uri:
+        :return:
+        """
+        self.uri = uri
+        self._connect()
+
+    def test_connection(self, conn_type=None):
+        """
+        Test active connection
+        :param conn_type:
+        :return:
+        """
+        if conn_type is not None:
+            self.get_current_connection(conn_type)
+
+        hook = self.get_hook()
+        if hook:
+            if not hook.test():
+                grass.message('Cannot establish connection')
+                return False
+            return True
+        return False
+
+class HiveTableBuilder:
+    """
+    Abstract class for Hive table maker
+    """
+    def __init__(self,map,layer):
+        self.map = map
+        self.layer = layer
+
+    def get_structure(self):
+        raise NotImplementedError
+
+        table = VectorDBInfoBase(self.map)
+        map_info = table.GetTableDesc(self.map)
+
+        for col in map_info.keys():
+            name = str(col)
+            dtype = map_info[col]['type']
+            if dtype == 'integer':
+                dtype = 'INT'
+            # note: only a basic datatype check is sketched here
+            if dtype.upper() not in ('INT', 'STRING', 'DOUBLE'):
+                grass.fatal('Automatic generation of columns failed, datatype %s is not recognized' % dtype)
+
+    def _get_map(self):
+        raise NotImplementedError
+
+class JSONBuilder:
+    """
+    Class which performs conversion from a GRASS map to serialisable GeoJSON
+    """
+    def __init__(self, grass_map=None, json_file=None):
+
+        self.grass_map = grass_map
+        self.json_file = json_file
+        self.json = ''
+
+    def get_JSON(self):
+        """
+        Return geojson for class  variable grass_map
+        :return:
+        """
+        if self.grass_map:
+            self.json = self._get_grass_json()
+        else:
+            filename, file_extension = os.path.splitext(self.json_file)
+            self.json = os.path.join(get_tmp_folder(), "%s.json" %filename)
+        return self.json
+
+    def rm_last_lines(self, path, rm_last_line=3):
+        """
+        Removing last n lines for given text file
+        :param path:
+        :param rm_last_line:
+        :return:
+        """
+        f = open(path, "r+")
+
+        # Move the pointer (similar to a cursor in a text editor) to the end of the file.
+        f.seek(-rm_last_line, os.SEEK_END)
+
+        pos = f.tell() - 1
+
+        while pos > 0 and f.read(1) != "\n":
+            pos -= 1
+            f.seek(pos, os.SEEK_SET)
+
+        # So long as we're not at the start of the file, delete all the characters ahead of this position
+        if pos > 0:
+            f.seek(pos, os.SEEK_SET)
+            f.truncate()
+
+        f.close()
+
+    @staticmethod
+    def remove_line(filename, lineNumber):
+        """
+        Remove n lines for given file
+        :param filename:
+        :param lineNumber:
+        :return:
+        """
+        with open(filename, 'r+') as outputFile:
+            with open(filename, 'r') as inputFile:
+
+                currentLineNumber = 0
+                while currentLineNumber < lineNumber:
+                    inputFile.readline()
+                    currentLineNumber += 1
+
+                seekPosition = inputFile.tell()
+                outputFile.seek(seekPosition, 0)
+
+                inputFile.readline()
+
+                currentLine = inputFile.readline()
+                while currentLine:
+                    outputFile.writelines(currentLine)
+                    currentLine = inputFile.readline()
+
+            outputFile.truncate()
+
+    def _get_grass_json(self):
+        """
+        Transform GRASS map to GeoJSON
+        :return:
+        """
+        if self.grass_map['type'] not in ['point', 'line', 'boundary', 'centroid',
+                                          'area', 'face', 'kernel', 'auto']:
+            self.grass_map['type'] = 'auto'
+        out = "%s_%s.json" % (self.grass_map['map'],
+                              self.grass_map['layer'])
+
+        out = os.path.join(get_tmp_folder(), out)
+        if os.path.exists(out):
+            os.remove(out)
+        out1 = Module('v.out.ogr',
+                      input=self.grass_map['map'],
+                      layer=self.grass_map['layer'],
+                      type=self.grass_map['type'],
+                      output=out,
+                      format='GeoJSON',
+                      stderr_=PIPE,
+                      overwrite=True)
+
+        grass.message(out1.outputs["stderr"].value.strip())
+
+        self.rm_last_lines(out, 3)
+        # remove first 5 lines and last 3 to ensure format for serialization
+        for i in range(5):  # todo optimize
+            self.remove_line(out, 0)
+
+        return out
+
+class GrassMapBuilder(object):
+    """
+    Base class for creating GRASS map from GeoJSON
+    """
+    def __init__(self, json_file, map,attributes):
+        self.file = json_file
+        self.map = map
+        self.attr=attributes
+
+    def build(self):
+        raise NotImplementedError
+
+    def remove_line(self, lineNumber):
+        with open(self.file, 'r+') as outputFile:
+            with open(self.file, 'r') as inputFile:
+
+                currentLineNumber = 0
+                while currentLineNumber < lineNumber:
+                    inputFile.readline()
+                    currentLineNumber += 1
+
+                seekPosition = inputFile.tell()
+                outputFile.seek(seekPosition, 0)
+
+                inputFile.readline()
+
+                currentLine = inputFile.readline()
+                while currentLine:
+                    outputFile.writelines(currentLine)
+                    currentLine = inputFile.readline()
+
+            outputFile.truncate()
+
+    def _get_wkid(self):
+        """
+        Parse epsg from wkid (esri json)
+        :return:
+        :rtype:
+        """
+        with open(self.file, 'r') as f:
+            first_line = f.readline()
+            if first_line.find('wkid') != -1:
+                return self._find_between(first_line,'wkid":','}')
+
+    def _rm_null(self):
+        """
+        First line sometimes include null. Must be removed
+        :return:
+        """
+        with open(self.file, 'r') as f:
+            first_line = f.readline()
+            if first_line.find('null') != -1:
+                self.remove_line(0)
+
+    def _find_between(self, s, first, last):
+        """
+        Return string found between two substrings
+        :param s:
+        :param first:
+        :param last:
+        :return:
+        """
+        try:
+            start = s.index( first ) + len( first )
+            end = s.index( last, start )
+            return s[start:end]
+        except ValueError:
+            return ""
+
+    def _create_map(self):
+        """
+        Create map from GeoJSON
+        :return:
+        """
+        out1=Module('v.in.ogr',
+              input=self.file,
+              output=self.map,
+              verbose=True,
+              #stderr_=PIPE,
+              overwrite=True)
+
+        #grass.message(out1.outputs["stderr"].value.strip())
+        #logging.debug(out1.outputs["stderr"].value.strip())
+
+
+    def _prepend_line(self,line):
+        """
+        Prepend line to the text file
+        :param line:
+        :return:
+        """
+        with open(self.file, "r+") as f:
+             old = f.read() # read everything in the file
+             f.seek(0) # rewind
+             f.write("%s\n"%line + old) # write the new line before
+
+    def _append_line(self,line):
+        """
+        Append line to the text file
+        :param line:
+        :return:
+        """
+        with open(self.file, 'a') as file:
+            file.write(line)
+
+    def _get_type(self):
+        """
+        Return type of ESRI simple feature
+        :return:
+        """
+        line = stream_lines(self.file) | nth(0)
+        if line.find('ring') != -1:
+            return ['ring', '"type":"Polygon","coordinates":']
+        if line.find('multipoint') != -1:
+            return ['multipoint', '"type":"MultiPoint","coordinates":']
+        if line.find('paths') != -1:
+            return ['paths', '"type":"MultiLineString","coordinates":']
+
+        if line.find('"x"') != -1:
+            grass.fatal('Point is not supported')
+        if line.find('envelope') != -1:
+            grass.fatal('Envelope is not supported')
+
+    def replace_substring(self,foo,bar):
+        """
+        Replace substring by given string
+        :param foo:
+        :param bar:
+        :return:
+        """
+        path='%s1'%self.file
+        io=open(path,'w')
+        stream_lines(self.file) |\
+        filt(lambda l: l.replace(foo, bar)) |\
+        to_stream(io)
+        self.file=path
+
+class GrassMapBuilderEsriToStandard(GrassMapBuilder):
+    """
+    Class for conversion of serialised GeoJSON to a GRASS map
+    """
+    def __init__(self, json_file, map):
+        super(GrassMapBuilderEsriToStandard, self).__init__(json_file, map, None)
+
+    def build(self):
+        geom_type = self._get_type()
+        self.replace_substring(geom_type[0], geom_type[1])
+        self.replace_substring('}}}','}}},')
+
+        fst_line= ('{"type": "FeatureCollection","crs": '
+                   '{ "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },"features": [')
+        self._prepend_line(fst_line)
+        self._append_line(']}')
+
+        self._create_map()
+
+class GrassMapBuilderEsriToEsri(GrassMapBuilder):
+    """
+    Class for conversion of serialised ESRI GeoJSON to a GRASS map
+    """
+    def __init__(self,json_file, map,attributes):
+        super(GrassMapBuilderEsriToEsri,self).__init__(json_file, map,attributes)
+        if not os.path.exists(self.file):
+            return
+
+    def build(self):
+        self._rm_null()
+        geom_type = self._get_type()
+        wkid = self._get_wkid()
+        self.replace_substring('}}','}},')
+
+        header=self._generate_header(geom_type[1],wkid)
+        self._prepend_line(header)
+        self._append_line(']}')
+
+        self._create_map()
+
+    def _generate_header(self,geom_type,wkid):
+
+        cols=''
+        if self.attr:
+            items=self.attr.split(',')
+            for att in items:
+                col,typ=att.split(' ')
+                if 'int' in typ.lower():
+                    typ='esriFieldTypeInteger'
+                if 'str' in typ.lower():
+                    typ = 'esriFieldTypeString'
+                if 'double' in typ.lower():
+                    typ = 'esriFieldTypeDouble'
+                if 'id' in typ.lower():
+                    typ = 'esriFieldTypeOID'
+
+                cols+='{"name":"%s","type":"%s"},'%(col,typ)
+
+
+        cols = cols[:-1]
+
+        if not wkid:
+            wkid='4326' #TODO g.proj.identify3
+        header =('{"objectIdFieldName":"objectid",'
+                 '"globalIdFieldName":"",'
+                 '"geometryType":"%s",'
+                 '"spatialReference":{"wkid":%s},'
+                 '"fields":[%s],'
+                 '"features": ['%(geom_type,wkid,cols))
+
+        return header
+
+    def _get_type(self):
+        logging.info("Get type for file: %s" % self.file)
+        line = stream_lines(self.file) | nth(0)
+        if line.find('ring') != -1:
+            return ['ring', 'esriGeometryPolygon']
+        if line.find('multipoint') != -1:
+            return ['multipoint', 'esriGeometryMultipoint']
+        if line.find('paths') != -1:
+            return ['paths', 'esriGeometryPolyline']
+        if line.find('"x"') != -1:
+            return ['point', 'esriGeometryPoint']
+        if line.find('envelope') != -1:
+            return ['envelope', 'esriGeometryEnvelope']
+
+class GrassHdfs():
+    """
+    Helper class for interaction between GRASS and HDFS/Hive
+    """
+    def __init__(self, conn_type):
+        self.conn = None
+        self.hook = None
+        self.conn_type = conn_type
+
+        self._init_connection()
+        if self.hook is None:
+            sys.exit("Connection can not establish")  # TODO
+
+    def _init_connection(self):
+        self.conn = ConnectionManager()
+        self.conn.get_current_connection(self.conn_type)
+        self.hook = self.conn.get_hook()
+
+    @staticmethod
+    def printInfo(hdfs, msg=None):
+        grass.message('***' * 30)
+        if msg:
+            grass.message("     %s \n" % msg)
+        grass.message(' path :\n    %s\n' % hdfs)
+        grass.message('***' * 30)
+
+    def get_path_grass_dataset(self):
+        LOCATION_NAME = grass.gisenv()['LOCATION_NAME']
+        MAPSET = grass.gisenv()['MAPSET']
+        dest_path = os.path.join('grass_data_hdfs', LOCATION_NAME, MAPSET, 'vector')
+        self.mkdir(dest_path)
+        return dest_path
+
+    def upload(self, fs, hdfs, overwrite=True, parallelism=1):
+        logging.info('Trying copy: fs: %s to  hdfs: %s   ' % (fs, hdfs))
+        self.hook.upload_file(fs, hdfs, overwrite, parallelism)
+        self.grass.messageInfo(hdfs, "File has been copied to:")
+
+    def mkdir(self, hdfs):
+        self.hook.mkdir(hdfs)
+        self.printInfo(hdfs)
+
+    def write(self, hdfs, data, **kwargs):
+        # Write file to hdfs
+        self.hook.write(hdfs, data, **kwargs)
+        self.printInfo(hdfs)
+
+    def download(self, fs, hdfs, overwrite=True, parallelism=1):
+        logging.info('Trying download : hdfs: %s to fs: %s   ' % (hdfs, fs))
+
+        out = self.hook.download_file( hdfs_path = hdfs,
+                                       local_path = fs,
+                                       overwrite = overwrite,
+                                       parallelism = parallelism)
+        if out:
+            self.printInfo(out)
+        else:
+            grass.message('Copy error!')
+        return out
+
+
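+# Example (a sketch; the local path and connection type below are illustrative):
+#
+#   transf = GrassHdfs('webhdfs')
+#   dest = transf.get_path_grass_dataset()
+#   transf.upload('/tmp/roads.json', dest)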


Property changes on: grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_lib.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_util.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_util.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_util.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,36 @@
+import csv
+import os
+import tempfile
+
+
+def save_dict(fn, dict_rap):
+    f = open(fn, "wb")
+    w = csv.writer(f, delimiter='=')
+    for key, val in dict_rap.items():
+        if val is None or val == '':
+            continue
+        w.writerow([key, val])
+    f.close()
+
+
+def read_dict(fn):
+    if os.path.exists(fn):
+        f = open(fn, 'r')
+        dict_rap = {}
+        try:
+            for key, val in csv.reader(f, delimiter='='):
+                try:
+                    dict_rap[key] = eval(val)
+                except:
+                    val = '"' + val + '"'
+                    dict_rap[key] = eval(val)
+            f.close()
+            return (dict_rap)
+        except IOError as e:
+            print "I/O error({0}): {1}".format(e.errno, e.strerror)
+    else:
+        return {}
+
+
+def get_tmp_folder():
+    return tempfile.gettempdir()
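+
+# Example (a sketch): save_dict/read_dict persist a flat dict as key=value
+# lines, e.g. save_dict('/tmp/conn.cfg', {'hiveserver2': 'testhiveconn1'})
+# writes the line "hiveserver2=testhiveconn1", which read_dict parses back.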


Property changes on: grass-addons/grass7/hadoop/hd/hdfsgrass/hdfs_grass_util.py
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/hadoop/hd/hdfswrapper/Makefile
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/Makefile	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/Makefile	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,23 @@
+MODULE_TOPDIR = ../..
+
+include $(MODULE_TOPDIR)/include/Make/Other.make
+include $(MODULE_TOPDIR)/include/Make/Python.make
+
+DSTDIR = $(ETC)/hd/hdfswrapper
+
+MODULES = $(wildcard *.py)
+
+PYFILES := $(patsubst %,$(DSTDIR)/%,$(MODULES))
+PYCFILES := $(patsubst %.py,$(DSTDIR)/%.pyc,$(filter %.py,$(MODULES)))
+
+default: $(PYFILES) $(PYCFILES)
+
+install:
+	$(MKDIR) $(INST_DIR)/etc/hd/hdfswrapper
+	@cp -rL $(DSTDIR) $(INST_DIR)/etc/hd
+
+$(DSTDIR):
+	$(MKDIR) -p $@
+
+$(DSTDIR)/%: % | $(DSTDIR)
+	$(INSTALL_DATA) $< $@

Added: grass-addons/grass7/hadoop/hd/hdfswrapper/__init__.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/__init__.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/__init__.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,15 @@
+import os
+
+import base_hook
+import connections
+import hdfs_hook
+import hive_hook
+import security_utils
+import settings
+import webhdfs_hook
+
+for module in os.listdir(os.path.dirname(__file__)):
+    if module == '__init__.py' or module[-3:] != '.py':
+        continue
+    __import__(module[:-3], locals(), globals())
+del module

Added: grass-addons/grass7/hadoop/hd/hdfswrapper/base_hook.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/base_hook.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/base_hook.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,57 @@
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import logging
+import os
+import random
+
+from builtins import object
+
+from hdfswrapper import settings
+from hdfswrapper.connections import Connection
+
+CONN_ENV_PREFIX = 'GRASSHIVE_CONN_'
+
+
+class BaseHook(object):
+    """
+    Abstract base class for hooks. Hooks are meant as an interface to
+    interact with external systems. HDFSHook and HiveHook return
+    objects that can handle the connection and interaction to specific
+    instances of these systems, and expose consistent methods to interact
+    with them.
+    """
+
+    def __init__(self, source):
+        pass
+
+    @classmethod
+    def get_connections(cls, conn_id):
+        session = settings.Session()
+
+        db = (session.query(Connection).filter(Connection.conn_id == conn_id).all())
+        if not db:
+            raise Exception(
+                "The conn_id `{0}` isn't defined".format(conn_id))
+        session.expunge_all()
+        session.close()
+        return db
+
+    @classmethod
+    def get_connection(cls, conn_id):
+        environment_uri = os.environ.get(CONN_ENV_PREFIX + conn_id.upper())
+        conn = None
+        if environment_uri:
+            conn = Connection(conn_id=conn_id, uri=environment_uri)
+        else:
+            conn = random.choice(cls.get_connections(conn_id))
+        if conn.host:
+            logging.info("Using connection to: " + conn.host)
+        return conn
+
+    @classmethod
+    def get_hook(cls, conn_id):
+        connection = cls.get_connection(conn_id)
+        return connection.get_hook()
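+
+# Example (a sketch): resolve a hook for a stored connection id and use it.
+# The conn_id 'hive_default' is hypothetical; it must exist in the connection
+# table or be supplied via the GRASSHIVE_CONN_HIVE_DEFAULT environment URI.
+#
+#   hook = BaseHook.get_hook('hive_default')
+#   hook.test()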

Added: grass-addons/grass7/hadoop/hd/hdfswrapper/connections.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/connections.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/connections.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,178 @@
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+from future.standard_library import install_aliases
+
+install_aliases()
+from builtins import bytes
+import json
+import logging
+from urlparse import urlparse
+from sqlalchemy import (
+    Column, Integer, String, Boolean)
+from sqlalchemy.ext.declarative import declarative_base, declared_attr
+# from sqlalchemy.dialects.mysql import LONGTEXT
+from sqlalchemy.orm import synonym
+from hdfswrapper import settings
+
+Base = declarative_base()
+ID_LEN = 250
+# SQL_ALCHEMY_CONN = configuration.get('core', 'SQL_ALCHEMY_CONN')
+# SQL_ALCHEMY_CONN ='sqlite:////home/matt/Dropbox/DIPLOMKA/sqlitedb.db'
+
+ENCRYPTION_ON = False
+
+
+# try:
+#     from cryptography.fernet import Fernet
+#     FERNET = Fernet(configuration.get('core', 'FERNET_KEY').encode('utf-8'))
+#     ENCRYPTION_ON = True
+# except:
+#     pass
+
+class InitStorage:
+    def __init__(self, connection):
+        self.conn = connection
+        self.engine = settings.engine
+        Base.metadata.create_all(self.engine)
+
+
+class Connection(Base):
+    """
+    Placeholder to store information about different database instances
+    connection information. The idea here is that scripts use references to
+    database instances (conn_id) instead of hard coding hostname, logins and
+    passwords when using operators or hooks.
+    """
+    __tablename__ = "connection"
+
+    id = Column(Integer(), primary_key=True)
+    conn_id = Column(String(ID_LEN), unique=True)
+    conn_type = Column(String(500))
+    host = Column(String(500))
+    schema = Column(String(500))
+    login = Column(String(500))
+    _password = Column('password', String(5000))
+    port = Column(Integer())
+    is_encrypted = Column(Boolean, unique=False, default=False)
+    is_extra_encrypted = Column(Boolean, unique=False, default=False)
+    _extra = Column('extra', String(5000))
+
+    def __init__(self, conn_id=None,
+                 conn_type=None,
+                 host=None,
+                 login=None,
+                 password=None,
+                 schema=None,
+                 port=None,
+                 extra=None,
+                 uri=None):
+
+        self.conn_id = conn_id
+        if uri:
+            self.parse_from_uri(uri)
+        else:
+            self.conn_type = conn_type
+            self.host = host
+            self.login = login
+            self.password = password
+            self.schema = schema
+            self.port = port
+            self.extra = extra
+        self.init_connection_db()
+
+    def init_connection_db(self):
+        InitStorage(self)
+
+    def parse_from_uri(self, uri):
+        temp_uri = urlparse(uri)
+        hostname = temp_uri.hostname or ''
+        if '%2f' in hostname:
+            hostname = hostname.replace('%2f', '/').replace('%2F', '/')
+        conn_type = temp_uri.scheme
+        if conn_type == 'postgresql':
+            conn_type = 'postgres'
+        self.conn_type = conn_type
+        self.host = hostname
+        self.schema = temp_uri.path[1:]
+        self.login = temp_uri.username
+        self.password = temp_uri.password
+        self.port = temp_uri.port
+
+    def get_password(self):
+        if self._password and self.is_encrypted:
+            if not ENCRYPTION_ON:
+                raise Exception(
+                    "Can't decrypt, configuration is missing")
+            return FERNET.decrypt(bytes(self._password, 'utf-8')).decode()
+        else:
+            return self._password
+
+    def set_password(self, value):
+        if value:
+            try:
+                self._password = FERNET.encrypt(bytes(value, 'utf-8')).decode()
+                self.is_encrypted = True
+            except NameError:
+                self._password = value
+                self.is_encrypted = False
+
+    @declared_attr
+    def password(cls):
+        return synonym('_password',
+                       descriptor=property(cls.get_password, cls.set_password))
+
+    def get_extra(self):
+        if self._extra and self.is_extra_encrypted:
+            if not ENCRYPTION_ON:
+                raise Exception(
+                    "Can't decrypt `extra`, configuration is missing")
+            return FERNET.decrypt(bytes(self._extra, 'utf-8')).decode()
+        else:
+            return self._extra
+
+    def set_extra(self, value):
+        if value:
+            try:
+                self._extra = FERNET.encrypt(bytes(value, 'utf-8')).decode()
+                self.is_extra_encrypted = True
+            except NameError:
+                self._extra = value
+                self.is_extra_encrypted = False
+
+    @declared_attr
+    def extra(cls):
+        return synonym('_extra',
+                       descriptor=property(cls.get_extra, cls.set_extra))
+
+    def get_hook(self):
+        from  hdfswrapper import hive_hook, webhdfs_hook, hdfs_hook
+
+        if self.conn_type == 'hive_cli':
+            return hive_hook.HiveCliHook(hive_cli_conn_id=self.conn_id)
+        elif self.conn_type == 'hiveserver2':
+            return hive_hook.HiveServer2Hook(hiveserver2_conn_id=self.conn_id)
+        elif self.conn_type == 'webhdfs':
+            return webhdfs_hook.WebHDFSHook(webhdfs_conn_id=self.conn_id)
+        elif self.conn_type == 'hdfs':
+            print(self.conn_id)
+            return hdfs_hook.HDFSHook(hdfs_conn_id=self.conn_id)
+
+    def __repr__(self):
+        return self.conn_id
+
+    @property
+    def extra_dejson(self):
+        """Returns the extra property by deserializing json"""
+        obj = {}
+        if self.extra:
+            try:
+                obj = json.loads(self.extra)
+            except Exception as e:
+                logging.exception(e)
+                logging.error(
+                    "Failed parsing the json for "
+                    "conn_id {}".format(self.conn_id))
+        return obj
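+
+# Example (a sketch, values are hypothetical):
+#   c = Connection(conn_id='hive1', conn_type='hiveserver2',
+#                  host='127.0.0.1', port=10000, login='user', password='pwd')
+#   hook = c.get_hook()  # returns a HiveServer2Hook for this connection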

Added: grass-addons/grass7/hadoop/hd/hdfswrapper/hdfs_hook.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/hdfs_hook.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/hdfs_hook.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,84 @@
+from hdfswrapper.base_hook import BaseHook
+
+try:
+    snakebite_imported = True
+    from snakebite.client import Client, HAClient, Namenode
+except ImportError:
+    snakebite_imported = False
+
+
+class HDFSHookException(Exception):
+    pass
+
+
+class HDFSHook(BaseHook):
+    '''
+    Interact with HDFS. This class is a wrapper around the snakebite library.
+    '''
+
+    def __init__(self, hdfs_conn_id='hdfs_default', proxy_user=None):
+        if not snakebite_imported:
+            raise ImportError(
+                'This HDFSHook implementation requires snakebite, but '
+                'snakebite is not compatible with Python 3 '
+                '(as of August 2015). Please use Python 2 if you require '
+                'this hook  -- or help by submitting a PR!')
+        self.hdfs_conn_id = hdfs_conn_id
+        self.proxy_user = proxy_user
+
+    def get_conn(self):
+        '''
+        Returns a snakebite HDFSClient object.
+        '''
+        use_sasl = False
+        securityConfig = None
+        if securityConfig == 'kerberos':  # TODO: make configuration file for this
+            use_sasl = True
+
+        connections = self.get_connections(self.hdfs_conn_id)
+        client = None
+        # When using HAClient, proxy_user must be the same, so is ok to always take the first
+        effective_user = self.proxy_user or connections[0].login
+        if len(connections) == 1:
+            client = Client(connections[0].host, connections[0].port, use_sasl=use_sasl, effective_user=effective_user)
+        elif len(connections) > 1:
+            nn = [Namenode(conn.host, conn.port) for conn in connections]
+            client = HAClient(nn, use_sasl=use_sasl, effective_user=effective_user)
+        else:
+            raise HDFSHookException("conn_id doesn't exist in the repository")
+        return client
+
+    def test(self):
+        try:
+            client = self.get_conn()
+            print('***' * 30)
+            print("\n    Test connection (ls /) \n")
+            print('***' * 30)
+            print(type(client.count(['/'])))
+            print('-' * 40 + '\n')
+            return True
+        except Exception as e:
+            print("     ERROR: connection cannot be established: %s \n" % e)
+            print('***' * 30)
+            return False
+
+    def download_file(self):
+        raise NotImplementedError
+
+    def mkdir(self):
+        raise NotImplementedError
+
+    def write(self):
+        raise NotImplementedError
+
+    def load_file(self):
+        raise NotImplementedError
+
+    def check_for_path(self):
+        raise NotImplementedError
+
+    def get_cursor(self):
+        raise NotImplementedError
+
+    def execute(self, hql):
+        raise NotImplementedError

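A minimal usage sketch for HDFSHook, assuming a connection stored under the
default id 'hdfs_default' and a reachable namenode (Python 2, since snakebite
is required):

    from hdfswrapper.hdfs_hook import HDFSHook

    hook = HDFSHook(hdfs_conn_id='hdfs_default')
    client = hook.get_conn()          # snakebite Client or HAClient
    for entry in client.ls(['/']):    # snakebite yields a dict per entry
        print(entry['path'])
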
Added: grass-addons/grass7/hadoop/hd/hdfswrapper/hive_hook.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/hive_hook.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/hive_hook.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,464 @@
+from __future__ import print_function
+
+import sys
+import csv
+import logging
+import re
+import subprocess
+
+import pyhs2
+from builtins import zip
+from past.builtins import basestring
+from thrift.protocol import TBinaryProtocol
+from thrift.transport import TSocket, TTransport
+
+import security_utils as utils
+from base_hook import BaseHook
+
+
+from hdfswrapper.hive_table import HiveSpatial
+
+class HiveCliHook(BaseHook, HiveSpatial):
+    """
+    Simple wrapper around the hive CLI.
+
+    It also supports ``beeline``, a lighter CLI that uses JDBC and is
+    replacing the heavier traditional CLI. To enable ``beeline``, set the
+    use_beeline param in the extra field of your connection as in
+    ``{ "use_beeline": true }``
+
+    Note that you can also set default hive CLI parameters via
+    ``hive_cli_params`` in the extra field of your connection as in
+    ``{"hive_cli_params": "-hiveconf mapred.job.tracker=some.jobtracker:444"}``
+
+    The extra connection parameter ``auth`` is passed into the JDBC
+    connection string as-is.
+
+    """
+
+    def __init__(self, hive_cli_conn_id="hive_cli_default", run_as=None):
+        conn = self.get_connection(hive_cli_conn_id)
+        self.hive_cli_params = conn.extra_dejson.get('hive_cli_params', '')
+        self.use_beeline = conn.extra_dejson.get('use_beeline', True)
+        self.auth = conn.extra_dejson.get('auth', 'noSasl')
+        self.conn = conn
+        self.run_as = run_as
+
+    def execute(self, hql, schema=None, verbose=True):
+        """
+        Run an hql statement using the hive cli
+
+        >>> hh = HiveCliHook()
+        >>> result = hh.execute("USE default;")
+        >>> ("OK" in result)
+        True
+        """
+
+        conn = self.conn
+        schema = schema or conn.schema
+        if schema:
+            hql = "USE {schema};\n{hql}".format(**locals())
+        import os
+        import tempfile
+
+        tmp_dir = tempfile.gettempdir()
+        if not os.path.isdir(tmp_dir):
+            os.mkdir(tmp_dir)
+
+        tmp_file = os.path.join(tmp_dir, 'tmpfile.hql')
+        if os.path.exists(tmp_file):
+            os.remove(tmp_file)
+
+        with open(tmp_file, 'w') as f:
+            f.write(hql)
+            f.flush()
+            fname = f.name
+            hive_bin = 'hive'
+            cmd_extra = []
+
+            if self.use_beeline:
+                hive_bin = 'beeline'
+                jdbc_url = "jdbc:hive2://{conn.host}:{conn.port}/{conn.schema}"
+                securityConfig = None  # TODO: read the security mode from a configuration file
+                if securityConfig == 'kerberos':
+                    template = conn.extra_dejson.get('principal', "hive/_HOST@EXAMPLE.COM")
+                    if "_HOST" in template:
+                        template = utils.replace_hostname_pattern(utils.get_components(template))
+
+                    proxy_user = ""
+                    if conn.extra_dejson.get('proxy_user') == "login" and conn.login:
+                        proxy_user = "hive.server2.proxy.user={0}".format(conn.login)
+                    elif conn.extra_dejson.get('proxy_user') == "owner" and self.run_as:
+                        proxy_user = "hive.server2.proxy.user={0}".format(self.run_as)
+
+                    jdbc_url += ";principal={template};{proxy_user}"
+                elif self.auth:
+                    jdbc_url += ";auth=" + self.auth
+
+                jdbc_url = jdbc_url.format(**locals())
+
+                cmd_extra += ['-u', jdbc_url]
+                if conn.login:
+                    cmd_extra += ['-n', conn.login]
+                if conn.password:
+                    cmd_extra += ['-p', conn.password]
+
+            # the statement is passed inline with -e; fname is kept on disk for debugging
+            hive_cmd = [hive_bin, '-e', hql] + cmd_extra
+
+            if self.hive_cli_params:
+                hive_params_list = self.hive_cli_params.split()
+                hive_cmd.extend(hive_params_list)
+            if verbose:
+                logging.info(" ".join(hive_cmd))
+
+            logging.info('hive_cmd= %s\ntmp_dir= %s' % (hive_cmd, tmp_dir))
+
+            sp = subprocess.Popen(
+                hive_cmd,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.STDOUT,
+                cwd=tmp_dir)
+            self.sp = sp
+            stdout = ''
+            for line in iter(sp.stdout.readline, ''):
+                stdout += line
+                if verbose:
+                    logging.info(line.strip())
+            sp.wait()
+
+            if sp.returncode:
+                raise Exception(stdout)
+
+            return stdout
+
+    def show_tables(self):
+        return self.execute('show tables')
+
+    def test(self):
+        try:
+            out = self.show_tables()
+            print("\n     Test connection (show tables;)\n        %s\n" % out)
+            print('***' * 30)
+            return True
+        except Exception as e:
+            print("      ERROR: connection cannot be established:\n      %s\n" % e)
+            print('***' * 30)
+            return False
+
+    def test_hql(self, hql):
+        """
+        Test an hql statement using the hive cli and EXPLAIN
+        """
+        create, insert, other = [], [], []
+        for query in hql.split(';'):  # naive
+            query_original = query
+            query = query.lower().strip()
+
+            if query.startswith('create table'):
+                create.append(query_original)
+            elif query.startswith(('set ',
+                                   'add jar ',
+                                   'create temporary function')):
+                other.append(query_original)
+            elif query.startswith('insert'):
+                insert.append(query_original)
+        other = ';'.join(other)
+        for query_set in [create, insert]:
+            for query in query_set:
+
+                query_preview = ' '.join(query.split())[:50]
+                logging.info("Testing HQL [{0} (...)]".format(query_preview))
+                if query_set == insert:
+                    query = other + '; explain ' + query
+                else:
+                    query = 'explain ' + query
+                try:
+                    self.execute(query, verbose=False)
+                except Exception as e:
+                    message = e.args[0].split('\n')[-2]
+                    logging.info(message)
+                    error_loc = re.search(r'(\d+):(\d+)', message)
+                    if error_loc and error_loc.group(1).isdigit():
+                        lineno = int(error_loc.group(1))
+                        begin = max(lineno - 2, 0)
+                        end = min(lineno + 3, len(query.split('\n')))
+                        context = '\n'.join(query.split('\n')[begin:end])
+                        logging.info("Context :\n {0}".format(context))
+                else:
+                    logging.info("SUCCESS")
+
+    def kill(self):
+        if hasattr(self, 'sp'):
+            if self.sp.poll() is None:
+                print("Killing the Hive job")
+                self.sp.kill()
+
+    def drop_table(self, name):
+        self.execute('DROP TABLE IF EXISTS %s' % name)
+
+class HiveMetastoreHook(BaseHook):
+    """
+    Wrapper to interact with the Hive Metastore
+    """
+
+    def __init__(self, metastore_conn_id='metastore_default'):
+        self.metastore_conn = self.get_connection(metastore_conn_id)
+        self.metastore = self.get_metastore_client()
+
+    def __getstate__(self):
+        # This is for pickling to work despite the thrift Hive client not
+        # being picklable
+        d = dict(self.__dict__)
+        del d['metastore']
+        return d
+
+    def __setstate__(self, d):
+        self.__dict__.update(d)
+        self.__dict__['metastore'] = self.get_metastore_client()
+
+    def get_metastore_client(self):
+        """
+        Returns a Hive thrift client.
+        """
+        from hive_service import ThriftHive
+
+        ms = self.metastore_conn
+        transport = TSocket.TSocket(ms.host, ms.port)
+        transport = TTransport.TBufferedTransport(transport)
+        protocol = TBinaryProtocol.TBinaryProtocol(transport)
+        return ThriftHive.Client(protocol)
+
+    def get_conn(self):
+        return self.metastore
+
+    def check_for_partition(self, schema, table, partition):
+        """
+        Checks whether a partition exists
+
+        >>> hh = HiveMetastoreHook()
+        >>> t = 'streets'
+        >>> hh.check_for_partition('default', t, "ds='2015-01-01'")
+        True
+        """
+        self.metastore._oprot.trans.open()
+        partitions = self.metastore.get_partitions_by_filter(
+            schema, table, partition, 1)
+        self.metastore._oprot.trans.close()
+        if partitions:
+            return True
+        else:
+            return False
+
+    def get_table(self, table_name, db='default'):
+        '''
+        Get a metastore table object
+
+        >>> hh = HiveMetastoreHook()
+        >>> t = hh.get_table(db='default', table_name='static_babynames')
+        >>> t.tableName
+        'static_babynames'
+        >>> [col.name for col in t.sd.cols]
+        ['state', 'year', 'name', 'gender', 'num']
+        '''
+        self.metastore._oprot.trans.open()
+        if db == 'default' and '.' in table_name:
+            db, table_name = table_name.split('.')[:2]
+        table = self.metastore.get_table(dbname=db, tbl_name=table_name)
+        self.metastore._oprot.trans.close()
+        return table
+
+    def get_tables(self, db, pattern='*'):
+        '''
+        Get metastore table objects for tables matching a pattern
+        '''
+        self.metastore._oprot.trans.open()
+        tables = self.metastore.get_tables(db_name=db, pattern=pattern)
+        objs = self.metastore.get_table_objects_by_name(db, tables)
+        self.metastore._oprot.trans.close()
+        return objs
+
+    def get_databases(self, pattern='*'):
+        '''
+        Get metastore databases matching a pattern
+        '''
+        self.metastore._oprot.trans.open()
+        dbs = self.metastore.get_databases(pattern)
+        self.metastore._oprot.trans.close()
+        return dbs
+
+    def get_partitions(
+            self, schema, table_name, filter=None):
+        '''
+        Returns a list of all partitions in a table. Works only
+        for tables with fewer than 32767 partitions (the Java short
+        max value). For subpartitioned tables, the number may easily exceed this.
+
+        >>> hh = HiveMetastoreHook()
+        >>> t = 'dmt30'
+        >>> parts = hh.get_partitions(schema='grassdb', table_name=t)
+        >>> len(parts)
+        1
+        >>> parts
+        [{'ds': '2015-01-01'}]
+        '''
+        self.metastore._oprot.trans.open()
+        table = self.metastore.get_table(dbname=schema, tbl_name=table_name)
+        if len(table.partitionKeys) == 0:
+            raise Exception("The table isn't partitioned")
+        else:
+            if filter:
+                parts = self.metastore.get_partitions_by_filter(
+                    db_name=schema, tbl_name=table_name,
+                    filter=filter, max_parts=32767)
+            else:
+                parts = self.metastore.get_partitions(
+                    db_name=schema, tbl_name=table_name, max_parts=32767)
+
+            self.metastore._oprot.trans.close()
+            pnames = [p.name for p in table.partitionKeys]
+            return [dict(zip(pnames, p.values)) for p in parts]
+
+    def max_partition(self, schema, table_name, field=None, filter=None):
+        '''
+        Returns the maximum value for all partitions in a table. Works only
+        for tables that have a single partition key. For subpartitioned
+        tables, we recommend using signal tables.
+
+        >>> hh = HiveMetastoreHook()
+        >>> t = 'static_babynames_partitioned'
+        >>> hh.max_partition(schema='default', table_name=t)
+        '2015-01-01'
+        '''
+        parts = self.get_partitions(schema, table_name, filter)
+        if not parts:
+            return None
+        elif len(parts[0]) == 1:
+            field = list(parts[0].keys())[0]
+        elif not field:
+            raise Exception(
+                "Please specify the field you want the max "
+                "value for")
+
+        return max([p[field] for p in parts])
+
+    def table_exists(self, table_name, db='default'):
+        '''
+        Check if table exists
+
+        >>> hh = HiveMetastoreHook()
+        >>> hh.table_exists(db='hivedb', table_name='static_babynames')
+        True
+        >>> hh.table_exists(db='hivedb', table_name='does_not_exist')
+        False
+        '''
+        try:
+            self.get_table(table_name, db)
+            return True
+        except Exception:
+            return False
+
+class HiveServer2Hook(BaseHook, HiveSpatial):
+    '''
+    Wrapper around the pyhs2 library
+
+    Note that the default authMechanism is NOSASL; to override it you
+    can specify it in the ``extra`` of your connection in the UI as in
+    ``{"authMechanism": "PLAIN"}``. Refer to the pyhs2 documentation for more details.
+    '''
+
+    def __init__(self, hiveserver2_conn_id='hiveserver2_default'):
+        self.hiveserver2_conn_id = hiveserver2_conn_id
+
+    def get_conn(self):
+        db = self.get_connection(self.hiveserver2_conn_id)
+        # TODO: honour db.extra_dejson.get('authMechanism', 'SASL'); PLAIN is forced for now
+        auth_mechanism = 'PLAIN'
+
+        securityConfig = None
+        if securityConfig == 'kerberos':  # TODO: read the security mode from a configuration file
+            auth_mechanism = db.extra_dejson.get('authMechanism', 'KERBEROS')
+
+        return pyhs2.connect(
+            host=str(db.host),
+            port=int(db.port),
+            authMechanism=str(auth_mechanism),
+            user=str(db.login),
+            password=str(db.password),
+            database=str(db.schema))
+
+    def get_results(self, hql, schema='default', arraysize=1000):
+
+        with self.get_conn() as conn:
+            if isinstance(hql, basestring):
+                hql = [hql]
+            results = {
+                'data': [],
+                'header': [],
+            }
+            for statement in hql:
+                with conn.cursor() as cur:
+                    cur.execute(statement)
+                    records = cur.fetchall()
+                    if records:
+                        results = {
+                            'data': records,
+                            'header': cur.getSchema(),
+                        }
+            return results
+
+    def to_csv(self,
+               hql,
+               csv_filepath,
+               schema='default',
+               delimiter=',',
+               lineterminator='\r\n',
+               output_header=True):
+
+        schema = schema or 'default'
+        with self.get_conn() as conn:
+            with conn.cursor() as cur:
+                logging.info("Running query: " + hql)
+                cur.execute(hql)
+                with open(csv_filepath, 'w') as f:
+                    writer = csv.writer(f, delimiter=delimiter,
+                                        lineterminator=lineterminator)
+                    if output_header:
+                        writer.writerow([c['columnName']
+                                         for c in cur.getSchema()])
+                    i = 0
+                    while cur.hasMoreRows:
+                        rows = [row for row in cur.fetchmany() if row]
+                        writer.writerows(rows)
+                        i += len(rows)
+                        logging.info("Written {0} rows so far.".format(i))
+                    logging.info("Done. Loaded a total of {0} rows.".format(i))
+
+    def get_records(self, hql, schema='default'):
+        """
+        Get a set of records from a Hive query.
+
+        >>> hh = HiveServer2Hook()
+        >>> sql = "SELECT * FROM default.static_babynames LIMIT 100"
+        >>> len(hh.get_records(sql))
+        100
+        """
+        return self.get_results(hql, schema=schema)['data']
+
+    def get_cursor(self):
+        conn = self.get_conn()
+        return conn.cursor()
+
+    def execute(self, hql, fetch=False):
+        with self.get_conn() as conn:
+            with conn.cursor() as cur:
+                logging.info("Running query: " + hql)
+                try:
+                    cur.execute(hql)
+                except Exception as e:
+                    print("Execute error: %s" % e)
+                    return None
+                if fetch:
+                    return cur.fetchall()

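A usage sketch for HiveServer2Hook, assuming a 'hiveserver2_default'
connection pointing at a running HiveServer2; the table name is illustrative:

    from hdfswrapper.hive_hook import HiveServer2Hook

    hook = HiveServer2Hook()
    rows = hook.get_records("SELECT * FROM default.streets LIMIT 10")
    print(len(rows))
    # dump a full query result to CSV
    hook.to_csv("SELECT * FROM default.streets", "/tmp/streets.csv")
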
Added: grass-addons/grass7/hadoop/hd/hdfswrapper/hive_table.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/hive_table.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/hive_table.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,451 @@
+import logging
+from utils import string2dict, find_ST_fnc
+import sys
+import os
+
+class HiveBaseTable(object):
+    """
+    Base class for creating Hive tables - table factory
+    """
+    def __init__(self,
+                 name,
+                 col,
+                 temporary=False,
+                 external=False,
+                 exists=True,
+                 db_name=None,
+                 comment=None,
+                 partitioned=None,
+                 clustered=None,
+                 sorted=None,
+                 skewed=None,
+                 row_format=None,
+                 stored=None,
+                 outputformat=None,
+                 location=None,
+                 tbl_properties=None):
+
+        self.db_name = db_name
+        self.name = name
+        self.temporary = temporary
+        self.external = external
+        self.exists = exists
+        self.col = col
+        self.comment = comment
+        self.partitioned = partitioned
+        self.clustered = clustered
+        self.sorted = sorted
+        self.skewed = skewed
+        self.row_format = row_format
+        self.stored = stored
+        self.outputformat = outputformat
+        self.location = location
+        self.tbl_properties = tbl_properties
+
+        self.hql = ''
+
+    def get_table(self):
+        self._base()
+        self._col()
+        self._partitioned()
+        self._cluster()
+        self._row_format()
+        self._stored()
+        self._location()
+        self._tbl_prop()
+
+        return self.hql
+
+    def _base(self):
+        self.hql = 'CREATE'
+        if self.temporary:
+            self.hql += ' TEMPORARY'
+        if self.external:
+            self.hql += ' EXTERNAL'
+        self.hql += ' TABLE'
+        if self.exists:
+            self.hql += ' IF NOT EXISTS'
+        if self.db_name:
+            self.hql += " %s.%s" % (self.db_name, self.name)
+        else:
+            self.hql += " %s" % self.name
+
+    def _col(self):
+        self.hql += ' (%s)' % self.col
+
+    def _partitioned(self):
+        if self.partitioned:
+            self.hql += ' PARTITIONED BY (%s)' % self.partitioned
+
+    #def _skewed(self):
+    #    if self.skewed:
+    #        self.hql += ' SKEWED BY (%s)' % self.skewed
+
+    def _cluster(self):
+        if self.clustered:
+            self.hql += ' CLUSTERED BY (%s)' % self.clustered
+
+    def _row_format(self):
+        if self.row_format:
+            self.hql += " ROW FORMAT '%s'" % self.row_format
+
+    def _stored(self):
+        # hint:
+        # STORED AS INPUTFORMAT 'com.esri.json.hadoop.UnenclosedJsonInputFormat'
+        # OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
+        if self.stored:
+            self.hql += " STORED AS INPUTFORMAT %s" % self.stored
+            if self.outputformat:
+                self.hql += " OUTPUTFORMAT '%s'" % self.outputformat
+        else:
+            self.hql += " STORED AS TEXTFILE"
+
+    def _location(self):
+        if self.location:
+            self.hql += ' LOCATION %s' % self.location
+
+    def _tbl_prop(self):
+        if self.tbl_properties:
+            self.hql += ' TBLPROPERTIES (%s)' % self.tbl_properties
+
+
+class HiveJsonTable(HiveBaseTable):
+    """
+    Table factory for JSON tables
+    """
+    def __init__(self,
+                 name,
+                 col,
+                 db_name=None,
+                 temporary=False,
+                 external=False,
+                 exists=True,
+                 comment=None,
+                 partitioned=None,
+                 clustered=None,
+                 sorted=None,
+                 skewed=None,
+                 row_format=None,
+                 stored=None,
+                 location=None,
+                 outputformat=None,
+                 tbl_properties=None):
+
+        super(HiveJsonTable, self).__init__(name=name,
+                                            col=col,
+                                            db_name=db_name,
+                                            temporary=temporary,
+                                            external=external,
+                                            exists=exists,
+                                            comment=comment,
+                                            partitioned=partitioned,
+                                            clustered=clustered,
+                                            sorted=sorted,
+                                            skewed=skewed,
+                                            row_format=row_format,
+                                            stored=stored,
+                                            location=location,
+                                            outputformat=outputformat,
+                                            tbl_properties=tbl_properties)
+
+    def get_table(self):
+        self._base()
+        self._col()
+        self._partitioned()
+        self._row_format()
+        self._stored()
+        self._location()
+
+        return self.hql
+
+    def _row_format(self):
+        if self.row_format:
+            self.hql+=" ROW FORMAT SERDE '%s'"%self.row_format
+
+class HiveCsvTable(HiveBaseTable):
+    """
+    Table factory for CSV tables
+    """
+    def __init__(self,
+                 name,
+                 col,
+                 db_name=None,
+                 temporary=False,
+                 external=False,
+                 exists=True,
+                 comment=None,
+                 partitioned=None,
+                 clustered=None,
+                 sorted=None,
+                 skewed=None,
+                 row_format=None,
+                 stored=None,
+                 outputformat=None,
+                 location=None,
+                 delimiter=',',
+                 tbl_properties=None):
+
+        super(HiveCsvTable, self).__init__(name=name,
+                                           col=col,
+                                           db_name=db_name,
+                                           temporary=temporary,
+                                           external=external,
+                                           exists=exists,
+                                           comment=comment,
+                                           partitioned=partitioned,
+                                           clustered=clustered,
+                                           outputformat=outputformat,
+                                           sorted=sorted,
+                                           skewed=skewed,
+                                           row_format=row_format,
+                                           location=location,
+                                           stored=stored,
+                                           tbl_properties=tbl_properties)
+        self.delimiter = delimiter
+
+    def get_table(self):
+        self._base()
+        self._col()
+        self._partitioned()
+        self._row_format()
+        self._stored()
+        self._location()
+        self._tbl_prop()
+
+        return self.hql
+
+    def _row_format(self):
+        if not self.row_format:
+            self.hql += (" ROW FORMAT DELIMITED FIELDS TERMINATED"
+                         " BY '%s'" % self.delimiter)
+        else:
+            self.hql += ' ROW FORMAT %s' % self.row_format
+
+class HiveSpatial(object):
+    """
+    Factory for spatial queries
+    """
+    def execute(self, hql):
+        raise NotImplementedError()
+
+    def show_tables(self):
+        hql = 'show tables'
+        res = self.execute(hql, True)
+        if res:
+            print('***' * 30)
+            print('   show tables:')
+            for i in res:
+                print('         %s' % i[0])
+            print('***' * 30)
+
+    def add_functions(self, fce_dict, temporary=False):
+        """
+        Register Hive UDFs.
+
+        :param fce_dict: mapping of function name to its java class path
+        :type fce_dict: dict
+        :param temporary: create the functions as TEMPORARY
+        :type temporary: bool
+        """
+        hql = ''
+        for key, val in fce_dict.iteritems():
+            if temporary:
+                hql += "CREATE TEMPORARY FUNCTION %s as '%s'\n" % (key, val)
+            else:
+                hql += "CREATE FUNCTION %s as '%s'\n" % (key, val)
+        self.execute(hql)
+
+    def describe_table(self, table, show=False):
+        hql = "DESCRIBE formatted %s" % table
+        out = self.execute(hql, True)
+        if show:
+            for i in out:
+                print(i)
+        return out
+
+    def find_table_location(self, table):
+        out = self.describe_table(table)
+        if out:
+            for cell in out:
+                if 'Location:' in cell[0]:
+                    logging.info("Location of file in hdfs:  %s" % cell[1])
+                    path = cell[1].split('/')
+                    # strip the scheme and authority (hdfs://host:port); TODO: handle Windows paths
+                    path = '/' + '/'.join(path[3:])
+                    logging.info("path to table {} ".format(path))
+                    return path
+        return None
+
+    def esri_query(self, hsql, temporary=True):
+        STfce = ''
+        ST = find_ST_fnc(hsql)
+        tmp = ''
+        if temporary:
+            tmp = 'temporary'
+
+        # build one CREATE FUNCTION statement per ST_ function found in the query
+        for key, vals in ST.iteritems():
+            STfce += "create {tmp} function {key} as '{vals}'\n".format(
+                tmp=tmp, key=key, vals=vals)
+
+        logging.info(STfce)
+
+        hsqlexe = '%s\n%s' % (STfce, hsql)
+        self.execute(hsqlexe)
+
+    def test(self):
+        hql = 'show databases'
+        try:
+            print('***' * 30)
+            res = self.execute(hql, True)
+            print("\n     Test connection (show databases;) \n       %s\n" % res)
+            print('***' * 30)
+            return True
+        except Exception as e:
+            print("     ERROR: connection cannot be established:\n       %s\n" % e)
+            print('***' * 30)
+            return False
+
+    def add_jar(self, jar_list, path=False):
+        """
+        Build an ADD JAR statement for extending the Hive classpath.
+        :param jar_list: list of jars
+        :type jar_list: list
+        :param path: if True, the entries in jar_list must include the full \
+                 path to the jar; otherwise jars are looked up in /usr/local/spatial/jar
+        :type path: bool
+        :return: the ADD JAR statement
+        :rtype: str
+        """
+        hql = ''
+        for jar in jar_list:
+            if jar:
+                if not path:
+                    hql += 'ADD JAR /usr/local/spatial/jar/%s ' % jar
+                else:
+                    hql += 'ADD JAR %s ' % jar
+        logging.info(hql)
+        hql += '\n'
+        return hql
+
+    def create_geom_table(self,
+                          table,
+                          field=None,
+                          serde='org.openx.data.jsonserde.JsonSerDe',
+                          outputformat=None,
+                          stored=None,
+                          external=False,
+                          recreate=False,
+                          filepath=None,
+                          overwrite=None,
+                          partitioned=None,
+                          ):
+
+        tbl = HiveJsonTable(name=table,
+                          col=field,
+                          row_format=serde,
+                          stored=stored,
+                          exists=recreate,
+                          external=external,
+                          partitioned=partitioned,
+                          outputformat=outputformat
+                          )
+
+        hql = tbl.get_table()
+
+        if recreate:
+            self.drop_table(table)
+
+        logging.info(hql)
+        self.execute(hql)
+
+        # load data only after the table has been created
+        if filepath:
+            self.data2table(filepath, table, overwrite)
+
+
+    def data2table(self, filepath, table, overwrite, partition=False):
+        """
+
+        :param filepath: path to hdfs data
+        :type filepath: string
+        :param table: name of table
+        :type table: string
+        :param overwrite: whether to overwrite existing data in the table
+        :type overwrite: bool
+        :param partition: target partition as a dict of partition columns and values
+        :type partition: dict
+        :return:
+        :rtype:
+        """
+
+        hql = "LOAD DATA INPATH '{filepath}' "
+        if overwrite:
+            hql += "OVERWRITE "
+        hql += "INTO TABLE {table} "
+
+        if partition:
+            pvals = ", ".join(
+                ["{0}='{1}'".format(k, v) for k, v in partition.items()])
+            hql += "PARTITION ({pvals})"
+        hql = hql.format(**locals())
+        logging.info(hql)
+        self.execute(hql)
+
+    def create_csv_table(
+            self,
+            filepath,
+            table,
+            delimiter=",",
+            field=None,
+            stored=None,
+            outputformat=None,
+            create=True,
+            serde=None,
+            external=False,
+            overwrite=True,
+            partition=None,
+            tblproperties=None,
+            recreate=False):
+        """
+        Creates a CSV-backed Hive table and optionally loads data from an
+        HDFS path into it.
+
+        Note that the table generated in Hive uses ``STORED AS textfile``
+        which isn't the most efficient serialization format. If a
+        large amount of data is loaded and/or if the table gets
+        queried considerably, you may want to use this operator only to
+        stage the data into a temporary table before loading it into its
+        final destination using a ``HiveOperator``.
+        """
+        tbl = HiveCsvTable(name=table,
+                           col=field,
+                           row_format=serde,
+                           stored=stored,
+                           exists=recreate,
+                           external=external,
+                           outputformat=outputformat,
+                           partitioned=partition,
+                           delimiter=delimiter,
+                           tbl_properties=tblproperties)
+
+        hql = tbl.get_table()
+
+        if recreate:
+            self.drop_table(table)
+
+        logging.info(hql)
+        self.execute(hql)
+
+        # load data only after the table has been created
+        if filepath:
+            self.data2table(filepath, table, overwrite)
+
+
+    def drop_table(self, name):
+        self.execute('DROP TABLE IF EXISTS %s' % name)
+

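The table factories above only assemble HQL strings; nothing is executed
until a hook's execute() runs the result. A sketch of the generated statement
(column spec and location are illustrative):

    from hdfswrapper.hive_table import HiveCsvTable

    tbl = HiveCsvTable(name='streets',
                       col='id INT, name STRING, geom STRING',
                       external=True,
                       location="'/data/streets'",
                       delimiter='|')
    print(tbl.get_table())
    # CREATE EXTERNAL TABLE IF NOT EXISTS streets (id INT, name STRING,
    #   geom STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    #   STORED AS TEXTFILE LOCATION '/data/streets'   (wrapped for readability)
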
Added: grass-addons/grass7/hadoop/hd/hdfswrapper/security_utils.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/security_utils.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/security_utils.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,70 @@
+#!/usr/bin/env python
+# Licensed to Cloudera, Inc. under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  Cloudera, Inc. licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import re
+import socket
+
+# Pattern to replace with hostname
+HOSTNAME_PATTERN = '_HOST'
+
+
+def get_kerberos_principal(principal, host):
+    components = get_components(principal)
+    if not components or len(components) != 3 or components[1] != HOSTNAME_PATTERN:
+        return principal
+    else:
+        if not host:
+            raise IOError("Can't replace %s pattern since host is null." % HOSTNAME_PATTERN)
+        return replace_hostname_pattern(components, host)
+
+
+def get_components(principal):
+    """
+    get_components(principal) -> (short name, instance (FQDN), realm)
+    ``principal`` is the kerberos principal to parse.
+    """
+    if not principal:
+        return None
+    return re.split(r'[/@]', str(principal))
+
+
+def replace_hostname_pattern(components, host=None):
+    fqdn = host
+    if not fqdn or fqdn == '0.0.0.0':
+        fqdn = get_localhost_name()
+    return '%s/%s@%s' % (components[0], fqdn.lower(), components[2])
+
+
+def get_localhost_name():
+    return socket.getfqdn()
+
+
+def get_fqdn(hostname_or_ip=None):
+    # Get hostname
+    try:
+        if hostname_or_ip:
+            fqdn = socket.gethostbyaddr(hostname_or_ip)[0]
+        else:
+            fqdn = get_localhost_name()
+    except IOError:
+        fqdn = hostname_or_ip
+
+    if fqdn == 'localhost':
+        fqdn = get_localhost_name()
+
+    return fqdn

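An example of the principal expansion these helpers perform (the hostname is
illustrative):

    from hdfswrapper import security_utils

    principal = security_utils.get_kerberos_principal(
        'hive/_HOST@EXAMPLE.COM', 'namenode01.example.com')
    print(principal)  # hive/namenode01.example.com@EXAMPLE.COM
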
Added: grass-addons/grass7/hadoop/hd/hdfswrapper/settings.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/settings.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/settings.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,53 @@
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+from __future__ import unicode_literals
+
+import logging
+import os
+import sys
+
+import grass.script as grass
+from sqlalchemy import create_engine
+from sqlalchemy.orm import scoped_session, sessionmaker
+
+BASE_LOG_URL = 'log'
+GISDBASE = grass.gisenv()['GISDBASE']
+LOCATION_NAME = grass.gisenv()['LOCATION_NAME']
+MAPSET = grass.gisenv()['MAPSET']
+MAPSET_PATH = os.path.join(GISDBASE, LOCATION_NAME, MAPSET)
+
+# three slashes: MAPSET_PATH is already absolute, so the path supplies the fourth
+SQL_ALCHEMY_CONN = 'sqlite:///%s' % os.path.join(MAPSET_PATH, 'sqlite', 'sqlite.db')
+
+LOGGING_LEVEL = logging.INFO
+
+engine_args = {}
+if 'sqlite' not in SQL_ALCHEMY_CONN:
+    # Engine args not supported by sqlite
+    engine_args['pool_size'] = 5
+    engine_args['pool_recycle'] = 3600
+
+engine = create_engine(SQL_ALCHEMY_CONN, **engine_args)
+
+Session = scoped_session(
+    sessionmaker(autocommit=False, autoflush=False, bind=engine))
+
+LOG_FORMAT = (
+    '[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s')
+SIMPLE_LOG_FORMAT = '%(asctime)s %(levelname)s - %(message)s'
+
+grass_config = os.path.join(MAPSET_PATH, 'grasshdfs.conf')
+
+
+def configure_logging():
+    logging.root.handlers = []
+    logging.basicConfig(
+        format=LOG_FORMAT, stream=sys.stdout, level=LOGGING_LEVEL)
+
+
+configure_logging()

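settings wires SQLAlchemy to the sqlite database inside the current mapset at
import time, so it must be imported from within a running GRASS session. A
sketch of obtaining a session from the shared factory:

    from hdfswrapper import settings

    session = settings.Session()
    try:
        pass  # query or persist connection records here
    finally:
        session.close()
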
Added: grass-addons/grass7/hadoop/hd/hdfswrapper/utils.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/utils.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/utils.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,61 @@
+import errno
+import json
+import shutil
+from contextlib import contextmanager
+from tempfile import mkdtemp
+
+try:
+    from cryptography.fernet import Fernet
+except ImportError:
+    pass  # cryptography is optional; see generate_fernet_key()
+
+
+@contextmanager
+def TemporaryDirectory(suffix='', prefix=None, dir=None):
+    name = mkdtemp(suffix=suffix, prefix=prefix, dir=dir)
+    try:
+        yield name
+    finally:
+        try:
+            shutil.rmtree(name)
+        except OSError as e:
+            # ENOENT - no such file or directory
+            if e.errno != errno.ENOENT:
+                raise
+
+
+def generate_fernet_key():
+    try:
+        FERNET_KEY = Fernet.generate_key().decode()
+    except NameError:
+        FERNET_KEY = "cryptography_not_found_storing_passwords_in_plain_text"
+    return FERNET_KEY
+
+
+def string2dict(string):
+    try:
+        return json.loads(string.replace("'", '"'))
+    except Exception as e:
+        print('Dictionary is not valid: %s' % e)
+        return None
+
+
+def find_ST_fnc(hsql):
+    '''
+    Parse an HQL query and find ESRI ST_ functions.
+    :param hsql: string of hive query.
+    :type hsql: string
+    :return: dict {ST_fce: com.esri.hadoop.hive.ST_fce} (name: java path)
+    :rtype: dict
+    '''
+    ST = {}
+    for token in hsql.split('('):
+        if 'ST_' in token:
+            # the function name is whatever follows the last 'ST_' marker
+            fc = 'ST_%s' % token.split('ST_')[-1]
+            if fc not in ST:
+                ST[fc] = "com.esri.hadoop.hive.%s" % fc
+    return ST

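An illustrative parse of an HQL string containing ESRI ST_ functions:

    from hdfswrapper.utils import find_ST_fnc

    hql = "SELECT ST_AsText(ST_Point(1, 2)) FROM dual"
    print(find_ST_fnc(hql))
    # {'ST_AsText': 'com.esri.hadoop.hive.ST_AsText',
    #  'ST_Point': 'com.esri.hadoop.hive.ST_Point'}
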
Added: grass-addons/grass7/hadoop/hd/hdfswrapper/webhdfs_hook.py
===================================================================
--- grass-addons/grass7/hadoop/hd/hdfswrapper/webhdfs_hook.py	                        (rev 0)
+++ grass-addons/grass7/hadoop/hd/hdfswrapper/webhdfs_hook.py	2016-06-21 13:48:59 UTC (rev 68719)
@@ -0,0 +1,139 @@
+import logging
+import os
+
+from hdfs import InsecureClient, HdfsError
+
+from base_hook import BaseHook
+
+_kerberos_security_mode = None  # TODO: read the security mode from a configuration file
+if _kerberos_security_mode:
+    try:
+        from hdfs.ext.kerberos import KerberosClient
+    except ImportError:
+        logging.error("Could not load the Kerberos extension for the WebHDFSHook.")
+        raise
+
+
+class WebHDFSHook(BaseHook):
+    """
+    Interact with HDFS. This class is a wrapper around the hdfscli library.
+    """
+
+    def __init__(self, webhdfs_conn_id='webhdfs_default', proxy_user=None):
+        self.webhdfs_conn_id = webhdfs_conn_id
+        self.proxy_user = proxy_user
+
+    def get_conn(self):
+        """
+        Returns a hdfscli InsecureClient object.
+        """
+        nn_connections = self.get_connections(self.webhdfs_conn_id)
+        for nn in nn_connections:
+            try:
+                logging.debug('Trying namenode {}'.format(nn.host))
+                connection_str = 'http://{nn.host}:{nn.port}'.format(nn=nn)
+                if _kerberos_security_mode:
+                    client = KerberosClient(connection_str)
+                else:
+                    proxy_user = self.proxy_user or nn.login
+                    client = InsecureClient(connection_str, user=proxy_user)
+                client.status('/')
+                logging.debug('Using namenode {} for hook'.format(nn.host))
+                return client
+            except HdfsError as e:
+                logging.debug("Read operation on namenode {nn.host} failed with"
+                              " error: {e.message}".format(**locals()))
+        nn_hosts = [c.host for c in nn_connections]
+        no_nn_error = "Read operations failed on the namenodes below:\n{}".format("\n".join(nn_hosts))
+        raise Exception(no_nn_error)
+
+    def test(self):
+        try:
+            path = self.check_for_path("/")
+            print('***' * 30)
+            print("\n   Test <webhdfs> connection (does path / exist) \n    %s \n" % path)
+            print('***' * 30)
+            return True
+
+        except Exception as e:
+            print("\n     ERROR: connection cannot be established: %s" % e)
+            print('***' * 30)
+            return False
+
+    def check_for_path(self, hdfs_path):
+        """
+        Check for the existence of a path in HDFS by querying FileStatus.
+        """
+        c = self.get_conn()
+        return bool(c.status(hdfs_path, strict=False))
+
+    def check_for_content(self, hdfs_path, recursive=False):
+
+        c = self.get_conn()
+        return c.list(hdfs_path, status=recursive)
+
+    def progress(self, hdfs_path, nbytes):
+        # callback invoked by hdfscli while transferring; nbytes is -1 on completion
+        print('progress: chunk_size %s' % nbytes)
+
+    def upload_file(self, source, destination, overwrite=True, parallelism=1,
+                    **kwargs):
+        """
+        Uploads a file to HDFS
+        :param source: Local path to file or folder. If a folder, all the files
+          inside of it will be uploaded (note that this implies that folders empty
+          of files will not be created remotely).
+        :type source: str
+        :param destination: Target HDFS path. If it already exists and is a
+          directory, files will be uploaded inside.
+        :type destination: str
+        :param overwrite: Overwrite any existing file or directory.
+        :type overwrite: bool
+        :param parallelism: Number of threads to use for parallelization. A value of
+          `0` (or negative) uses as many threads as there are files.
+        :type parallelism: int
+        :param \*\*kwargs: Keyword arguments forwarded to :meth:`upload`.
+        """
+        c = self.get_conn()
+        c.upload(hdfs_path=destination,
+                 local_path=source,
+                 overwrite=overwrite,
+                 n_threads=parallelism,
+                 progress=self.progress,
+                 **kwargs)
+        logging.debug("Uploaded file {} to {}".format(source, destination))
+
+    def download_file(self, hdfs_path, local_path, overwrite=True, parallelism=1,
+                      **kwargs):
+        c = self.get_conn()
+        out = c.download(hdfs_path=hdfs_path,
+                         local_path=local_path,
+                         overwrite=overwrite,
+                         n_threads=parallelism,
+                         **kwargs)
+
+        logging.debug("Downloaded file {} to {}".format(hdfs_path, local_path))
+        return out
+
+    def mkdir(self, path, **kwargs):
+        c = self.get_conn()
+        c.makedirs(hdfs_path=path, **kwargs)
+
+        logging.debug("Mkdir file {} ".format(path))
+
+    def write(self, fs, hdfs, **kwargs):
+        client = self.get_conn()
+
+        client.delete(hdfs, recursive=True)
+        # NOTE: writes a hard-coded sample mapping; the fs argument is currently unused
+        model = {
+            '(intercept)': 48.,
+            'first_feature': 2.,
+            'second_feature': 12.,
+        }
+
+        with client.write(hdfs, encoding='utf-8') as writer:
+            for item in model.items():
+                writer.write(u'%s,%s\n' % item)

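A round-trip sketch for WebHDFSHook, assuming a 'webhdfs_default' connection
pointing at a namenode with WebHDFS enabled (paths are illustrative):

    from hdfswrapper.webhdfs_hook import WebHDFSHook

    hook = WebHDFSHook()
    if hook.test():
        hook.mkdir('/tmp/grass_demo')
        hook.upload_file('vector.json', '/tmp/grass_demo/vector.json')
        print(hook.check_for_content('/tmp/grass_demo'))
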


More information about the grass-commit mailing list