Chip's Tips for Developers

Contains coding, but not narcotic.

A bit of Haskell

March 7th, 2010 12:16:56 pm pst by Sterling Camden

HaskellWikiI’ve just started dabbling in Haskell, a pure functional language with some compelling features.

As a first exercise in this language, I eschewed the canonical greeting to the cosmos and decided to translate my algorithm for computing the likelihood of a bitmask yielding non-zero that I wrote originally in Ruby.  Here’s what it looks like in Haskell:

import Data.Bits

pmask :: Integer -> Double
pmask mask = pnonzero (bitcount mask)

pnonzero :: Integer -> Double
pnonzero onbits = 1.0 - (pzero onbits)

pzero :: Integer -> Double
pzero onbits = 1.0 / (2 ^ onbits)

bitcount :: Integer -> Integer
bitcount 0 = 0
bitcount n
  | n > 0 = 1 + (bitcount (n .&. (n - 1)))
  | otherwise = error "cannot use negative numbers"

Bit operations are not automatically included as they are in Ruby, hence the “import” statement.  A bit-and operator is thereby defined as “.&.”.  This also imports a “shift” function, but trying to use it here created enough type confusion to thwart my Haskell-novitiate brain.  Hence, I resorted to applying an exponent to 2 instead.

Speaking of types, Haskell can do type inference, but in the example above I found it necessary to specify type signatures to avoid compilation errors.  I like the fact that you can enforce type safety if you want to, but I wish the type inference was a bit smarter (or, alternatively, that I was a bit smarter to understand how it wants to work).  Type signatures are the lines containing a double-colon.  For example, the line pmask :: Integer –> Double means that pmask is the name of a function that takes an Integer argument and returns a Double.  The line following defines that function (in this case — you can place the definition anywhere).  Thus, the pmask function takes an argument named mask and returns the result of calling pnonzero with a single argument (in parentheses) which is the result of calling bitcount with mask as its argument.

I find the use of parentheses interesting.  Function arguments do not need parentheses, but you use parentheses to group a function and its arguments when they compose an argument to another function.  It makes perfectly economical sense – in practice it looks like a cross between the C (et al.) and Lisp usages.

Where Haskell starts getting interesting can be seen at the end of this example.  You’ll notice that we have two distinct definitions of the function bitcount, plus a branch in the second definition.  The former distinction is called “pattern matching”, where which function definition to use is determined by the argument passed.  The latter uses “guards” to perform boolean tests to choose the function definition.  Thus, when the argument to bitcount is 0, 0 will be returned.  When it is any other value, the second form is used.  Within that, if the argument is greater than 0, we use the recursive function definition.  Otherwise (which is a function that always returns true), we throw an error.

Whereas Ruby uses a relaxed form of the C-style semi-colons and braces to delineate statements, and Python uses significant whitespace, Haskell allows both.  But, it’s considered better form to use the significant whitespace, which is called “layout rule”.  I’m not a big fan of using indentation for anything more than readability, but so far it does not seem onerous in Haskell.

Where Haskell truly shines is in its approach to lazy evaluation, which is more radical than any I have used.  This allows things like infinite lists – for instance, [0 ..] is a list of all non-negative integers.  Naturally, evaluating that would never return, but Haskell doesn’t need to.  You can even map a function to the list, which results in another lazy infinite list, without evaluating all the members.  This eliminates the need for continuations, and makes memoization a breeze.  Consider the canonical Fibonacci function:

fib 0 = 0
fib 1 = 1
fib n = fib (n-2) + fib (n-1)

The computational complexity for any one value is roughly O(nn), which means that when you ask for anything higher than about 40, you’ve got quite a wait.  We can reduce that to O(n) by memoizing previously computed results:

fibonacci =
  let fib 0 = 0
      fib 1 = 1
      fib n = fibonacci (n-2) + fibonacci (n-1)
  in  (map fib [0 ..] !!)

How elegant is that?

download

Posted in Haskell | 2 Comments » RSS 2.0 | Sphere it!

Progn for Synergy/DE

June 28th, 2009 4:09:45 pm pst by Sterling Camden

As we have seen, it’s often useful in Synergy/DE to do more than one thing in a single statement – especially when defining macros.  One way you can join multiple operations is with a Boolean operator:

status = (operation1() && operation2())

But this approach has some limitations:

  1. If the first operation returns false, the second operation is not performed.  Of course, you can use that to your advantage as shorthand for an ‘if’.  And if you know that the first operation returns false, you could use || (or) instead of && (and) as shorthand for ‘if not’.  But what if you want to always execute both operations, regardless of their return value?  There isn’t an “I don’t care” operator.
  2. The only return value that you can get out of the expression is a Boolean value.  What if you wanted the actual result of the second operation, for instance?

Common Lisp solves this problem with PROGN:

(progn (operation1 …)
       (operation2 …))

which can take any number of operations (technically “forms”, not necessarily operations), and returns the result of the last one.

The downloadable code below provides a Progn class for Synergy/DE.  Using this class, you can say:

Progn.Do(operation1()).Returning(operation2())

which returns the result of operation2().  You can insert more operations in the middle using the Then() method:

Progn.Do(operation1())
    &.Then(operation2())
    &.Then(operation3())
    &.Returning(operation4())

The static Do() method returns a Progn object, ignoring its parameter.  The Then() method also ignores its parameter, and returns the same object.  The Returning() method merely returns its parameter.  Each of the steps is executed before the method to which it is passed.  In order to minimize casting, Progn defines versions of Do, Then, and Returning that take arguments of type int, decimal, alpha, Var, and object – with each version of Returning returning the same type as its argument.  So the only time you should need to cast the result is when returning an object of a class other than Var or string (which is alpha-compatible).

Of course, the operations above can be any expression that evaluates to a value – they don’t have to be function or method calls.

Combining Progn with Let, you can now construct inline functions with local scope that can also access their defining scope.  Unfortunately, you cannot evaluate these functions lazily.  In other words, you can’t create a closure to be called later on.  So we’re still not as functional as we’d like to be, but we’re taking one more small step in that direction.

UPDATE 2009-09-13: for Synergy/DE 9.3

Posted in SynergyDE | 2 Comments » RSS 2.0 | Sphere it!

Functional Let for Synergy/DE

April 23rd, 2009 9:41:15 am pst by Sterling Camden

Synergy/DE possesses a macro definition syntax that’s very similar to the #define syntax in C, but not quite as robust.  It doesn’t support macros that expand to multiple lines of code.  An individual parameter passed to a macro cannot be continued to the next line, either.  A macro cannot be used as a parameter to itself, nor within its expansion (as of 9.1.5).  But despite these limitations, I find macros to be a powerful tool for either simplifying my code or making it hopelessly obscure.  The basic Synergy/DE macro syntax is:

.define macroname(parameter, …) expansion

Wherever parameter is encountered within expansion it is replaced with whatever is passed to the macro in that argument position when it’s invoked.  In order to be recognized as the desired token within expansion, it must either be syntactically isolated as a token, or escaped with a back-quote (`).  Within strings, you use two back-quotes preceding and one following the identifier.  If you continue a macro to the next line, its expansion is also continued – which is why macros are limited to a single statement.  There is no syntax (yet) for saying, “this is a continuation of my macro, but I want it to be a new statement rather than a continuation of the same statement within the expansion.”

One problem that’s common to any parameterized macro system (be it Lisp’s, C’s, or Synergy/DE’s) is how many times an argument gets evaluated at runtime.  Macro invocation syntax looks like functional syntax, but how arguments get evaluated is quite a different ball game.  Consider the classic “max” macro:

.define max(val1, val2) fif(val1 > val2, val1, val2)

For a definition of the fif macro, see this post.  Now lets try using it:

account.fee = max(5.00, account.CloseQuarter() * 0.01)

Seems innocent enough, we’re going to close out the quarter, which returns the total, from which we’ll compute a fee of 1% of total activity, but not less than $5.00.  If max were a function, then each parameter would be evaluated once before invoking it.  But the max macro expands out into:

account.fee = fif(5.00 > account.CloseQuarter() * 0.01, 5.00, account.CloseQuarter() * 0.01)

which means that if 1% of activity is not less than 5.00, we’ll call account.CloseQuarter() twice.  Actually, because fif is implemented under the hood as a static method, we’ll always call account.CloseQuarter() twice.  Even if account.CloseQuarter() had no side-effects, it’s more than we intended to do.  I could also devise examples of macros in which an argument may not get evaluated at all.  This, too, can cause confusion when you’re looking at the code and thinking of the macro as if it were a function.

One way to solve this problem in Lisp is to use LET.  LET exists in many functional languages to allow you to define a scope for local assignments, but it can also be used to force single evaluation of macro parameters:

(defmacro mymacro (param1)  
  `(let ((,var1 ,param1))
      (maybe do things with ,var1)))

The macro generates a LET statement in the code, assigning the parameter param1 to a local variable var1, which only has the scope of the LET statement.  That assignment forces param1 to be evaluated once and only once within the generated code.  The LET then returns whatever the (maybe do things with var1) statement returns.  Even if there is a var1 declared in the surrounding code, the var1 referenced inside the LET will refer to the parameter’s value.  Other variables defined in the surrounding code may, however, be referenced and used within the macro (which isn’t hygenic, but is sometimes useful).

I have a dream

How could we do something similar in Synergy/DE?  We do have the ability to define a very local scope for variables in version 9, using the DATA statement — but there are at least two reasons why this won’t help us with macros:  (1) a DATA statement has to be on a line by itself, and we don’t have multi-line macro expansions (remember?), and (2) even if we did, we don’t have any way to return a value from a multi-statement block.   Probably the right way to solve this in the language would be to allow inline anonymous functions with access to the local scope (a la JavaScript and other languages with closures) and also provide a syntax for multiline macros.  We could dream, for instance, of something like this:

.define mymacro(param1) function, ^val
.&                          var1, i
.&                      proc
.&                          freturn maybe_do_things_with(var1) ;a macro that might use var1 or not
.&                      end(param1)  ;Invoke it, passing param1

By converting the macro into a function, you’d automatically evaluate arguments only once.  But that’s pie-in-the-sky – how about something I can use today?

Let my parameters go

The downloadable code below works with Synergy/DE version 9.1.5 and above, and provides a form of scoped LET.  The syntax is a bit more cluttered than I’d like, but it’s functional (pun intended).  For instance, you could define our mymacro example as:

.define mymacro(param1) let(assign(“var1”, param1), maybe_do_things_with(valueof(“var1”)))

and we can write a single-evaluation-safe version of the max macro as:

.define max(val1,val2) let(assign("val1",val1) && assign("val2",val2), fif((vv("val1") > vv("val2")), vv("val1"), vv("val2”)))

This version of max takes any two primitives (or Vars) and returns a Var containing the greater of the two values.  Both val1 and val2 will be evaluated once and only once, in their respective assigns.

The basic syntax for let is:

return_value = let(assignments, body)

where

return_value is whatever body returns.  It will have the same type as what body returns, so you may need to cast it.

assignments are one or more “assign” macros, joined with the logical AND operator (&& or .and.)

body is some expression whose result will be returned.

I lied just a bit when I said that the return value is the same as what body returns.  It will be one of the following, whichever is the best match for what body returns:

  • i (which covers boolean, int, and ^val functions)
  • decimal (which covers d and d.)
  • a (which also handles strings)
  • @Var (if it’s a Var)
  • @* (for all other object classes)

The syntax for the assign macro is:

assign(name, value)

where

name is the name of a let variable

value is the value to assign it

Under the hood

The assign macro expands into a static method named Lets.Set.  It returns a Lets object that is passed to let to maintain scope (let actually expands into a static method named Lets.Return).  The logical AND between two Lets objects causes the first one to maintain the scope of the second one.   Thus, we always use only one argument for all assignments, and the second argument is for the body.

The assignment of let variables is maintained in a static Hash of Stacks.  Each member of the hash is the stack of values for a given name.  An assign pushes a new value onto the stack for its name, and the let to which that assign belongs removes that value from the stack before it returns.  Within either a subsequent assign or within the body of the let, you can reference the topmost value on the stack for a given name via the valueof macro.

Since Stacks can only hold objects, if a primitive type is passed to assign, it is boxed in a Var.  The valueof macro (which expands into the static method Lets.Get) always returns an object, so I’ve provided a handy shortcut to cast it as a (Var): the vv macro.  vv(name) is equivalent to (Var)valueof(name).

What is It?

Another handy shortcut is the “it” macro.  “it” provides a reference to the most recently assigned value, without having to know its name.  “it” is also scoped to a let, so in the case of nested lets, it goes back to what it was in the outer let when the inner let returns.  To cast “it” as a Var, you can use the itv macro as a shortcut.

Why wouldn’t you know the name of the most recently assigned value?  Sometimes you don’t care, especially if you only need one temporary variable.  So, to generate a unique name, we have gensym, and as a shortcut for a let with only one assignment that uses gensym for its name, we have the genlet macro:

result = genlet(value, body)

Believe me, “it” will come in handy in that body.  In case you do want to inspect the name of whatever is assigned to “it”, there is the “itsname” macro to provide it to you.

I’ve provided quite a few tests in test_let.dbl.  These also serve as some good examples of usage.

Even though Synergy/DE doesn’t allow macros to be nested in their own parameters, I’ve worked around that for let and assign.  They aren’t actually parametric macros – they’re just text replacements for their static method names.  So you can have a let within an assign, and you can have a let within the body of a let.  The let variable scoping all works as you would expect.  You can also continue a statement to the next line in the middle of a let or an assign.

Back in the 1980s I stretched the DIBOL language (which at that time lacked integers, dynamic memory allocation, and recursion) to implement various DSL interpreters.  In the early 90s, I used the newly available xsubr routine to create a form of object-orientation by convention.  Ken Lidster and I subsequently designed a couple of prototypes for a true OOP implementation, which has finally come to fruition in Synergy/DE version 9.  Now once again, I’m pushing the limits of the language to enable the functional model.  Let us see where it goes from here.

UPDATE 2009-07-06: Change the namespace to “ChipsTips”, and included an updated version of Var.
UPDATE 2009-09-13: Updated for Synergy/DE 9.3

Posted in SynergyDE | 1 Comment » RSS 2.0 | Sphere it!

Functional “if” for Synergy/DE

March 2nd, 2009 12:00:18 pm pst by Sterling Camden

Programmers who have used languages derived from C will be familiar with C’s ternary conditional operator:

test ? iftrue : iffalse
which tests test for truth and returns iftrue if it is true, or iffalse if it isn’t.  No such operator exists in Synergy/DE, and many Synergy/DE programmers are just as glad that it doesn’t.  If you’re used to an imperative style of programming, it feels a bit obscure.  But there are a couple of good reasons why you’d want this sort of capability.  For one, it can save you a temporary variable declaration, as in the following:
if (x > y) then
  which = x
else
  which = y
mymethod(which)
Wouldn’t it be nice if you could just say:
mymethod((x > y) ? x : y)
But by far the biggest need for a functional-style if is in macro definitions.  Synergy/DE macros can be very powerful, but they have at least one glaring weakness:  they are limited to a single program statement (or part of a statement).  That’s only because (a) there’s no way to continue a macro definition to the next line without also continuing its expansion, and (b) the only statement terminator in Synergy/DE is end-of-line.  Besides, the imperative style of the DIBOL if-then-else (i.e., the fact that it doesn’t return anything) means that as soon as you add a condition into a macro you can no longer use your macro as an expression.  For instance, trying to write the C “max” function as a macro leads to frustration:
.define max(val1, val2) if (val1 > val2) then val1 else val2
This doesn’t work at all, because within the conditional, val1 and val2 must resolve to expressions, while in the “then” and “else” clauses they’re expected to be imperative statements.  And even where expressions can be used as statements (method invocations, for instance), you still can’t assign the result.
Until now.
The fif macro
The code below includes a “fif” macro (for Functional If), defined in fif.def, and code to support it in fif.dbl.  To use it, include fif.def at the top of your source file, and then you can use the syntax:
fif(test, iftrue, iffalse)
So you can write the “max” and “min” macros, for instance, as follows:
.define max(val1, val2) fif(val1 > val2, val1, val2)
.define min(val1, val2) fif(val1 < val2, val1, val2)
Caveats
  • Because this macro must call a method to do its work, all of the arguments will be evaluated once.  This is unlike the “if” statement, where only one of the “then” or “else” clauses gets evaluated.
  • This obviously involves more overhead than a simple if statement.  Use only where functional syntax is helpful — but remember that programmers are more expensive than CPU cycles, unless the code is executed very frequently.
  • The types for val1 and val2 must be compatible.  For those purposes, all object classes are compatible with one another — but there are rules for primitive types:  integer and decimal aren’t compatible (you need to convert one of them), but a decimal literal without precision will get converted to integer automatically if the other parameter is integer.  Alphanumeric and string are also compatible.
  • The return value will be one of the following, depending on the types of the values passed:  decimal, integer, alphanumeric, @Var, or object.  String arguments map to alphanumeric (unless the other argument is some other class of object).  All other classes map to object, so you’ll probably need to cast the result where non-primitive types are involved.

Implementation

In order to convert the multi-line, imperative DIBOL “if” statement into an expression, we have to use a function of one form or another.  Here’s where the static type system of Synergy/DE gets in our way — there is no way to designate a single function or method that can return any type.  Furthermore, bare functions cannot return objects (only primitive types), so we’re stuck using a static method.  But that actually works to our advantage, because we can use parametric polymorphism to define a separate version of our method for each argument/return type.

You’ll see in fif.dbl that I saved myself a bunch of typing (hah) by including the “Test” method’s code four times, defining the type for each inclusion.  I also used .ifndef to include the outer code on the first pass.  This is a trick I learned from Ken Lidster.  It’s not all that readable, but it’s certainly useful here.  Use cautiously, though, where noobies may tread.

You could add a .include for other types, if you have specific classes that you use frequently.  That would save you the trouble of casting the return value from the object (@*) version of the method in those cases.

The include file (fif.def) merely imports this code and defines a macro to call the Fif_.Test method.  The macro simply reduces syntactic weight — you could call the method directly, but I think “fif” is more readable than “Fif_.Test”.

Tests

The file test_fif.dbl provides some simple (but not comprehensive) tests for fif, using assertions.  It tries out the various types and automatic conversions.  See the comments in that file for details.

bld.bat contains the commands to build test_fif.dbr on Windows and Unix platforms.  ‘dbr test_fif’ to run it, and the tests succeed if you only get a STOP message.

comments.add(fif(you.liked(this), new Compliment(), new Flame()))

UPDATE 2009-04-22: Changed the macro definition to allow nesting of fifs.
UPDATE 2009-04-22: Added implicit support for Var
UPDATE 2009-07-06: Changed namespace to “ChipsTips”.
UPDATE 2009-09-13: Updated for Synergy/DE 9.3

Posted in SynergyDE | 3 Comments » RSS 2.0 | Sphere it!

Better Tag Cloud