PHP/Mapscript regex problem -- Any volunteers?

Daniel Morissette dmorissette at DMSOLUTIONS.CA
Fri May 13 11:08:51 EDT 2005


Bill Binko wrote:
>
> Just to make sure I understand this, does sizeof(regex_t) change?  Do the
> parameters differ?  I thought that PHP shipped with a POSIX compliant
> implementation of regex.  Perhaps I'm wrong, and there are differences.

Yup, they're completely different (size and contents).

Here is the regex_t definition from PHP 4.3.6's regex.h:

typedef struct {
        int re_magic;
        size_t re_nsub;         /* number of parenthesized subexpressions */
        const char *re_endp;    /* end pointer for REG_PEND */
        struct re_guts *re_g;   /* none of your business :-) */
} regex_t;


I run Fedora Core 2, and the /usr/include/regex.h has the following
(with comments and #defines stripped out):

struct re_pattern_buffer
{
   unsigned char *buffer;
   unsigned long int allocated;
   unsigned long int used;
   reg_syntax_t syntax;
   char *fastmap;
   RE_TRANSLATE_TYPE translate;
   size_t re_nsub;
   unsigned can_be_null : 1;
   unsigned regs_allocated : 2;
   unsigned fastmap_accurate : 1;
   unsigned no_sub : 1;
   unsigned not_bol : 1;
   unsigned not_eol : 1;
   unsigned newline_anchor : 1;
};

typedef struct re_pattern_buffer regex_t;
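
To make the mismatch concrete, here is a minimal sketch that puts both layouts in one file and compares them. The names php_regex_t, gnu_regex_t and layouts_differ() are made up for this illustration, and the stand-ins for reg_syntax_t and RE_TRANSLATE_TYPE are assumptions (the real glibc definitions may differ between versions):

```c
#include <stddef.h>

/* Assumed stand-ins for the GNU typedefs stripped from the listing above. */
typedef unsigned long int reg_syntax_t;
#define RE_TRANSLATE_TYPE char *

struct re_guts;  /* opaque in PHP's bundled regex */

/* Layout copied from PHP 4.3.6's regex.h, renamed for this sketch. */
typedef struct {
    int re_magic;
    size_t re_nsub;
    const char *re_endp;
    struct re_guts *re_g;
} php_regex_t;

/* Layout copied from the system regex.h, renamed for this sketch. */
struct re_pattern_buffer {
    unsigned char *buffer;
    unsigned long int allocated;
    unsigned long int used;
    reg_syntax_t syntax;
    char *fastmap;
    RE_TRANSLATE_TYPE translate;
    size_t re_nsub;
    unsigned can_be_null : 1;
    unsigned regs_allocated : 2;
    unsigned fastmap_accurate : 1;
    unsigned no_sub : 1;
    unsigned not_bol : 1;
    unsigned not_eol : 1;
    unsigned newline_anchor : 1;
};
typedef struct re_pattern_buffer gnu_regex_t;

/* Returns 1 when the two layouts disagree in total size or in the
 * offset of re_nsub, the one field both structs have in common. */
int layouts_differ(void)
{
    return sizeof(php_regex_t) != sizeof(gnu_regex_t)
        || offsetof(php_regex_t, re_nsub) != offsetof(gnu_regex_t, re_nsub);
}
```

Code compiled against one header and handed a regex_t filled in by the other library would read re_nsub (and everything else) from the wrong offset, which is why mixing the two cannot work.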


>
> Actually, couldn't we just link them differently?  Link mapserv, et al
> against the system regex, and leave the php_mapscript.so unlinked.  If the
> definitions are identical, the .so will find the symbols when it is loaded
> into the php executable.
>

If the definitions were identical, the .so's would do the trick by
magic, but unfortunately they're not, as I've shown above.


> It shouldn't require a rebuild: just that the mapserv executable needs to
> link to the dynamic, not static libregex.
>

I don't think mapserver is linking with the static libregex: it uses the
.so if present, following the usual linker rules. However, PHP does link
statically with its built-in regex.

Even if both PHP and MapScript used dynamic linking, the .so's in each
case would likely have different names (e.g. libregex.so.1 vs
libregex.so.2), and the two .so's would both end up being loaded
at runtime, causing the same problems as static linking.

>
> That's a broad brush -- the PHP team may have found several key platforms
> have horrific performance/major bugs in their regex and provided a POSIX
> compliant replacement.  I have no doubt that there are easier ways to
> integrate with this (there are simply too many PHP extensions that use
> regex) -- we just need to figure out how.
>

True. But since PHP is software that is known to link with tens
(hundreds?) of other libs, if they want to package their own copy of
something like regex then they should rename the symbols (e.g.
php_regex_t, etc.) in their local copy to prevent conflicts, or provide
a simple mechanism so that external packages can compile (.h) and link
with (.so) their built-in libs.
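
As a rough sketch of the renaming idea (this is not PHP's actual code): once the bundled copy uses prefixed names, it can sit next to the system regex in the same program without clashing. The php_regcomp() below is a toy stand-in that only counts '(' characters, and both it and both_coexist() are hypothetical names invented for this example:

```c
#include <regex.h>   /* system POSIX regex: regex_t, regcomp, regfree */
#include <stddef.h>

/* Hypothetical prefixed type for a bundled regex copy. */
typedef struct {
    int re_magic;
    size_t re_nsub;
} php_regex_t;

/* Toy stand-in for a bundled regcomp: it does not compile anything,
 * it only counts '(' characters so that re_nsub has a value. */
int php_regcomp(php_regex_t *preg, const char *pattern, int cflags)
{
    (void)cflags;
    preg->re_magic = 0x5048;  /* arbitrary marker value */
    preg->re_nsub = 0;
    for (const char *p = pattern; *p != '\0'; ++p)
        if (*p == '(')
            preg->re_nsub++;
    return 0;
}

/* Uses the system regex and the prefixed copy side by side.
 * Returns 1 if both report the same number of subexpressions. */
int both_coexist(const char *pattern)
{
    regex_t sys_re;
    php_regex_t php_re;
    size_t sys_nsub;

    if (regcomp(&sys_re, pattern, REG_EXTENDED) != 0)
        return 0;
    sys_nsub = sys_re.re_nsub;
    regfree(&sys_re);

    if (php_regcomp(&php_re, pattern, 0) != 0)
        return 0;
    return sys_nsub == php_re.re_nsub;
}
```

Because the two sets of names never overlap, neither the compiler nor the dynamic linker has to choose between them, and a call such as both_coexist("a(b)c") exercises both implementations in one process.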

Right now, a package like MapScript that wants to link with PHP is in a
complete void, unless I missed something obvious.

>
>>So if you or someone else can get a hold of someone in the PHP dev't
>>team and get an explanation of why they force the use of the built-in
>>regex and what they propose as a solution to this mess, then we may be
>>on the way to find a solution.
>>
>
> I'll be happy to give that a shot.
>

Good Luck. I have to admit that I didn't try very hard to find and reach
the right person in the PHP team to discuss those problems. I've just
been frustrated by the way they've dealt with some bugs in the past and
gave up... For instance see their very "constructive" response to PHP
bug 25704: http://bugs.php.net/bug.php?id=25704

I'm sure some MapServer users may think the same of us MapServer
developers when we don't fully understand their problem or fail to
respond to some questions/bugs... that's the unfortunate side-effect of
having too much going on at the same time.

>
> (Erg.  Please take this the right way - I haven't been here long enough
> to lecture...)
>
> That simply isn't a solution.  Not just for me: for the project.  We need
> our users to be able to access the source, and build it effectively.  If
> not, we lose many of the benefits of Open Source.  I know I'm not the only
> one who's picked up Mapserver, found it didn't do something he needed, and
> dug into the code to add it.
>

I see your point completely. FGS is there for those who don't want to
bother with compiling. Honestly, even though I'm a programmer and
use open source exactly for the reasons that you stated, when I install
a new package I much prefer using RPMs or .tar.gz binary packages to
having to deal with compiling and hunting for lib dependencies. 98% of
the time I'll never bother looking at the source code because it will
just work for me.

Thanks for your interest BTW, I hope you can help us find a viable
solution to this *very frustrating* problem.

Daniel
--
------------------------------------------------------------
  Daniel Morissette               dmorissette at dmsolutions.ca
  DM Solutions Group              http://www.dmsolutions.ca/
------------------------------------------------------------