[GRASS-SVN] r66518 - in grass-addons/tools/addons: . test test/data

svn_grass at osgeo.org
Fri Oct 16 12:14:06 PDT 2015


Author: wenzeslaus
Date: 2015-10-16 12:14:06 -0700 (Fri, 16 Oct 2015)
New Revision: 66518

Added:
   grass-addons/tools/addons/test/data/g.no.keywords.html
Modified:
   grass-addons/tools/addons/get_page_description.py
   grass-addons/tools/addons/test/data/g.broken.example.html
   grass-addons/tools/addons/test/data/wxGUI.example.html
   grass-addons/tools/addons/test/test_description_extraction.sh
Log:
get the page description for addons index also from meta comment and text

When the page does not have the standard format or content, try to
use the meta description in HTML comments and, when this fails, try
to extract the first sentence from the text.
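
The fallback order described above can be sketched roughly as follows.
This is a simplified illustration, not the committed code: the names
desc_from_meta_comment, first_sentence and fallback_description are
hypothetical, and the sketch works on a whole string rather than line
by line as the script does.

```python
import re

# Marker taken from the commit; everything else here is illustrative.
META_PREFIX = "<!-- meta page description:"


def desc_from_meta_comment(text):
    """Return the text of a meta description comment, or None if absent."""
    if META_PREFIX not in text:
        return None
    rest = text.split(META_PREFIX, 1)[1]
    return rest.split("-->", 1)[0].strip()


def first_sentence(text):
    """Return the text up to the first dot followed by whitespace or end."""
    head = re.split(r"\.(?:\s|$)", text, maxsplit=1)[0]
    # the split removed the dot, so add it back
    return head.lstrip() + "."


def fallback_description(text):
    """Try the meta comment first, then the first sentence of the text."""
    return desc_from_meta_comment(text) or first_sentence(text)
```

As in the committed helper, a text without any sentence-ending dot comes
back whole with a dot appended.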

Also, don't search for the description line in the whole file when
the Keywords section is missing.
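
A minimal sketch of why ending the block at DESCRIPTION as well as
KEYWORDS matters. The function find_description_line is hypothetical,
it assumes the block flag is simply switched on at NAME and off at the
end marker, and it returns the first match instead of keeping the last
one as the script does.

```python
import re

# Patterns as they appear in the commit.
BLOCK_START = re.compile(r"NAME")
BLOCK_END = re.compile(r"KEYWORDS|DESCRIPTION")
DESC_LINE = re.compile(r" - ")


def find_description_line(lines):
    """Return the first ' - ' line between NAME and the block end, or None."""
    in_block = False
    for line in lines:
        if BLOCK_START.search(line):
            in_block = True
        elif BLOCK_END.search(line):
            in_block = False
        elif in_block and DESC_LINE.search(line):
            return line.strip()
    return None
```

With a page like g.no.keywords.html, the " - " line that sits after the
DESCRIPTION heading is outside the block and is therefore ignored.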


Modified: grass-addons/tools/addons/get_page_description.py
===================================================================
--- grass-addons/tools/addons/get_page_description.py	2015-10-16 15:44:52 UTC (rev 66517)
+++ grass-addons/tools/addons/get_page_description.py	2015-10-16 19:14:06 UTC (rev 66518)
@@ -23,13 +23,49 @@
     return text
 
 
+def get_desc_from_comment_meta_line(text):
+    """
+    >>> get_desc_from_comment_meta_line("<!-- meta page description: Abc abc-->")
+    'Abc abc'
+    """
+    text = text.split("<!-- meta page description:", 1)[1]
+    text = text.split("-->", 1)[0]
+    return text.strip()
+
+
+def get_desc_from_desc_text(text):
+    r"""Get description defined as first sentence in the given text.
+
+    Sentence is defined as text which ends with dot and space.
+    The string is expected to contain this. The other case is not handled.
+
+    >>> get_desc_from_desc_text("Abc abc.abc abc.")
+    'Abc abc.abc abc.'
+    >>> get_desc_from_desc_text("Abc abc.abc abc. ")
+    'Abc abc.abc abc.'
+    >>> get_desc_from_desc_text("Abc abc.abc\n abc.\n")
+    'Abc abc.abc\n abc.'
+    """
+    # this matches the sentence but gives also whole string even if it
+    # is not the sentence
+    text = re.split(r"\.(\s|$)", text, 1)[0]
+    # strip spaces at the beginning and add the stripped dot back
+    return text.lstrip() + '.'
+
+
 def main(filename):
     with open(filename) as page_file:
         desc = None
         in_desc_block = False
+        in_desc_section = False
+        desc_section = ''
+        desc_section_num_lines = 0
         desc_block_start = re.compile(r'NAME')
-        desc_block_end = re.compile(r'KEYWORDS')
+        # the incomplete manual pages have NAME followed by DESCRIPTION
+        desc_block_end = re.compile(r'KEYWORDS|DESCRIPTION')
+        desc_section_start = re.compile(r'DESCRIPTION')
         desc_line = re.compile(r' - ')
+        comment_meta_desc_line = re.compile(r'<!-- meta page description:.*-->')
         for line in page_file:
             line = line.rstrip()  # remove '\n' at end of line
             if desc_block_start.search(line):
@@ -39,6 +75,23 @@
             if in_desc_block:
                 if desc_line.search(line):
                     desc = get_desc_from_manual_page_line(line)
+            # if there was nothing in the generated section of the page
+            # try to find manually added meta comments which are placed
+            # at the beginning of the manually edited part of the page
+            if not desc and comment_meta_desc_line.search(line):
+                desc = get_desc_from_comment_meta_line(line)
+            # if there was nothing else, the last thing to try is to get the
+            # first sentence from the description section (which is also the
+            # last item in the file from all the things we are trying)
+            if in_desc_section:
+                desc_section += line + "\n"
+                desc_section_num_lines += 1
+                if desc_section_num_lines > 4:
+                    in_desc_section = False
+            if not desc and desc_section_start.search(line):
+                in_desc_section = True
+        if not desc and desc_section:
+            desc = get_desc_from_desc_text(desc_section)
         if not desc:
             desc = "(incomplete manual page, please fix)"
         # the original script attempted to add also </li> but it was not working

Modified: grass-addons/tools/addons/test/data/g.broken.example.html
===================================================================
--- grass-addons/tools/addons/test/data/g.broken.example.html	2015-10-16 15:44:52 UTC (rev 66517)
+++ grass-addons/tools/addons/test/data/g.broken.example.html	2015-10-16 19:14:06 UTC (rev 66518)
@@ -13,11 +13,10 @@
 
 <h2>DESCRIPTION</h2>
 
-This is a test page which should be emulate a broken manual page.
+This is a test page which should emulate a broken manual page.
 This can happen for example, when module cannot generate a proper
 description (broken imports, not using parser, etc.).
 
-
 <h2>SEE ALSO</h2>
 
 <em>

Copied: grass-addons/tools/addons/test/data/g.no.keywords.html (from rev 66517, grass-addons/tools/addons/test/data/g.broken.example.html)
===================================================================
--- grass-addons/tools/addons/test/data/g.no.keywords.html	                        (rev 0)
+++ grass-addons/tools/addons/test/data/g.no.keywords.html	2015-10-16 19:14:06 UTC (rev 66518)
@@ -0,0 +1,45 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html>
+<head>
+<title>GRASS GIS Manual (test page): r.broken.example</title>
+<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
+</head>
+<body>
+
+<hr class="header">
+
+<h2>NAME</h2>
+<em><b>r.broken.example</b></em>
+
+<h2>DESCRIPTION</h2>
+
+This is a test page which should emulate a broken manual page
+without keywords section.
+This can happen for example, when module cannot generate a proper
+description (broken imports, not using parser, etc.).
+
+ - This line is supposed to look like description line but it is in a wrong place.
+
+<h2>SEE ALSO</h2>
+
+<em>
+<a href="wxGUI.components.html">wxGUI components</a><br>
+</em>
+
+<h2>AUTHORS</h2>
+
+Random Author
+
+<p>
+<i>Data placeholder: 2015-09-06 (Sun, 06 Sep 2015)</i><hr class="header">
+<p>
+<a href="index.html">Main index</a>
+<p>
+© 2003-2015
+<a href="http://grass.osgeo.org">GRASS Development Team</a>,
+GRASS GIS x.x Reference Manual (test page)
+</p>
+
+</div>
+</body>
+</html>

Modified: grass-addons/tools/addons/test/data/wxGUI.example.html
===================================================================
--- grass-addons/tools/addons/test/data/wxGUI.example.html	2015-10-16 15:44:52 UTC (rev 66517)
+++ grass-addons/tools/addons/test/data/wxGUI.example.html	2015-10-16 19:14:06 UTC (rev 66518)
@@ -14,7 +14,7 @@
 
 <h2>DESCRIPTION</h2>
 
-This is a test page which should be similar to a wxGUI manual pages.
+This is a test page which should be similar to wxGUI manual pages.
 
 
 <h2>SEE ALSO</h2>

Modified: grass-addons/tools/addons/test/test_description_extraction.sh
===================================================================
--- grass-addons/tools/addons/test/test_description_extraction.sh	2015-10-16 15:44:52 UTC (rev 66517)
+++ grass-addons/tools/addons/test/test_description_extraction.sh	2015-10-16 19:14:06 UTC (rev 66518)
@@ -8,3 +8,4 @@
 ../get_page_description.py data/r.group.page.html
 ../get_page_description.py data/wxGUI.example.html
 ../get_page_description.py data/g.broken.example.html
+../get_page_description.py data/g.no.keywords.html


