[pdal] questions about iterator and buffer semantics

Mateusz Loskot mateusz at loskot.net
Wed Jul 3 02:21:45 PDT 2013


On 3 July 2013 04:15, Chris Foster <chris.foster at roames.com.au> wrote:
> On 3 July 2013 00:55, Howard Butler <hobu.inc at gmail.com> wrote:
>>> * Exceptions are thrown as a matter of course (even in getDimensionOptional,
>>>  where they are caught and used to return an empty optional).  I'm very
>>>  suspicious of any code which uses exceptions as part of normal flow control.
>>>
>>> * _Lots_ of memory management traffic happens behind the scenes:
>>>  std::ostringstream is used to format an error message as a matter of course
>>>  (rather than only in error situations), several std::string objects are
>>>  created, etc.
>>
>> Agreed. I'll spend some time trying to tighten this up a bit.
>
> I did some performance testing last night - initial results indicate that
> the major problem is throwing and gobbling exceptions unnecessarily.  For a
> buffer size of 1000 and a large file this can be half the runtime!  Not that I
> want small buffers per se, but why use extra memory and thrash the processor
> cache if you don't have to ;-)
>
> For good performance we really need to avoid
>
> try
> {
>     Thing& x = some_function_which_may_throw();
>     // do stuff with x
> }
> catch(Exception&)
> {
>     // Silently swallow the exception, since x not existing isn't
> actually an error
> }

If the function throws often and frequently, yes.

> This is much more efficiently (and conveniently!) written as
>
> Thing* x = some_function_optional();
> if (x)
> {
>     // do stuff with x
> }
>
> Some debate can be had about whether to use Thing* or boost::optional<Thing&>
> here.  To some extent that's just a matter of taste and I tend to prefer a raw
> pointer.  If you have functions named nicely like getDimensionOptional(), it's
> super easy to remember that the returned pointer may not actually be valid and
> must be checked.

The use of a pointer is idiomatic, indeed, but it is not equivalent to
exceptions,
same with boost::optional - it's just syntactic sugar for nullptr in this case.
It is equivalent to the following:

bool some_function_which_may_fail(T& t);
T t;
if (some_function_which_may_fail(t))
    // do stuff with t

If that captures the idea properly, it should be sufficient.

In case there are more complex states required, the only alternative
would be a proper monad:

either<error_code, file_id> load_file(...);
// check what error_code, decide to use file_id

Some freely avaialble implementations here:

https://github.com/camio/Boost.Either
https://github.com/TrademarkPewPew/Boost.Expected

with related v. detailed discussion:
http://lists.boost.org/Archives/boost/2013/06/204738.php


>>> Side note: I expect it's too late to change now, but what benefit do we get by
>>> making getDimensionOptional() return a boost::optional rather than a pointer?
>>> It strikes me as very slightly safer but less convenient when you end up
>>> caching raw pointers in places like the las reader anyway.
>>
>> The code is in transition. I want to move to boost::optional rather than
>> naked pointers. They're more convenient and a bit safer. Mateusz may have
>> some more to say on that account as well
>
> They are (arguably) slightly safer, but more conveient?
>
> Thing* x = some_function_optional();
> if (x)
> {
>     x->bar();
>     x->foo();
>     baz(x);
> }
>
> vs
>
> boost::optional<Thing&> x = some_function_optional();
> if (x)
> {
>     x->bar();
>     x->foo();
>     baz(*x);
> }
>
> or possibly
>
> boost::optional<Thing&> x = some_function_optional();
> if (x)
> {
>     Thing& x2 = *x;
>     x2.bar();
>     x2.foo();
>     baz(x2);
> }
>
> boost::optional definitely demands more typing in this case, and more human
> parsing effort when reading the code.  It's also very slightly less efficient
> because you end up checking validity of x at least twice (four times in
> example two).
>
> Another reason for using optional is that value types don't naturally have a
> special "invalid value".  In this case we're talking about reference semantics
> however and you can always use the NULL pointer which also happens to be very
> convenient to check in an if() statement.

The problem I've got with nullptr is that it does not necessarily mean
"invalid value". It means, "something could not deliver a value",
but if something can deliver and the value delivered is invalid...

My 5 groszy.

Best regards,
--
Mateusz  Loskot, http://mateusz.loskot.net


More information about the pdal mailing list