Cake’s Progress

Cake is progressing rather slowly at the moment. Its next release (0.2) is about receiving requests, and thus will involve lots of networking code. Since I know very little about network programming I am first working my way through "UNIX Network Programming", Volume 1.

In the meantime, to still do some Haskell coding, I’m trying to solve some Perl and Ruby quizzes in Haskell. I’ll post my solutions plus explanations here.

Perl Quiz, Haskell Solution: Plusified Equations

The Quiz

We are given two positive integers $L and $R we need to find Plusified
expressions of both for which Eval($E_L) == Eval($E_R). So what is a plusified
expression? It is an expression where we can choose whether to add a single
"+" between any consecutive digit. So for example the number 123 has the
following plusified expression:

  • 123
  • 12+3
  • 1+23
  • 1+2+3

So if we are given 123 and 96 we can form the following plusified equation:

  • 12+3 == 9+6

Your mission is to write a Perl program (or an equivalent program in any
programming language) that will find all solutions to the plusified equation
of two numbers given as input. To normalise the output we’ll rule that:

  1. The equations should be given one at each line.
  2. They will be sorted so consecutive digits will take precedence over "+"'s.
  3. A "+" has no surrounding spaces.
  4. The = sign does have a preceding and following space.

A Haskell Solution

Listing of the complete solution:

module Main where

import Data.List (iterate)
import System (getArgs)

split :: Char -> String -> [String]
split delim s =
  let (s', s'') = break (== delim) s
  in s' : case s'' of
        delim : _ -> split delim (tail s'')
            otherwise -> []

plusify :: String -> [String]
plusify ""       = [""]
plusify (c : "") = [c : ""]
plusify (c : cs) = concatMap (\cs' -> [c : cs', c : '+' : cs']) (plusify cs)

combis :: String -> String -> [(String, String)]
combis x y =
  [(l, r) | l <- plusify x, r <- plusify y, sum' l == sum' r]
    where sum' = sum . map (\s -> read s :: Int) . split '+'

main :: IO ()
main = do
  args <- getArgs
  mapM_ (putStrLn . \(l, r) -> l ++ " = " ++ r) (combis (args!!0) (args!!1))
  return ()

Paperclip I18n call to arms

Paperclip is a fantastic library used by many many many developers, its github project page shows 1500 watchers but I have a feeling even more than that use the plugin in some shape or form.

The one thing that has been missing for me is proper I18n support and not basic interpolations using the :message option. I recently chanced upon an issue in the github issues registry (#14) which included a fork that implemented a basic version of I18n support but did not go far enough in my opinion. So after taking some advice from validates_timeliness, I have created a fork with goes a little bit further and adds two new messages for the attachment size.

The only thing now is to get some support for this fork so it gets patched into paperclip.

So without further ado, please support this change by leaving a message in the google group thread regarding this change:

Google Group Message – I18n changes and additions

And voting for this issue to be resolved:

Github Issue – Support for I18n

Also, if you have any questions, comments or advice regarding the fork, please don’t hesitate to leave a message below.

Thinking Sphinx DateTime MVAs

One of the most popular open-source indexers for web development has to be Sphinx, an sql full-text search engine. Sphinx has been used by many high profile site, including The Pirate Bay, Craigs List, and NetLog, to name the big ones (more here).

But this post is not about Sphinx, but its Ruby helper library, Thinking Sphinx. In fact, its not just about Thinking Sphinx, finding and fixing a problem yourself.

Thinking Sphinx is an amazing open source gem/plugin/library for Rails, masterminded by the great Pat Allan (@pat), and available to all on github. In fact, with 89 contributors and climbing, this library has had some tender love and care for many caring developers. If you are a Ruby developer and you are interested in indexing, have a look at Thinking Sphinx today, it is simple to install and use, fast and effective, and has some excellent documentation and a very active google group.

Recently while working on a new project for a client of ours I encountered an annoying issue/problem/bug with Thinking Sphinx, more specifically, DateTime multi-value attributes. A multi-value attribute (MVA) is like specifying an array of values for a specific field, for example, employee names, the trouble is that MVAs can only be integers so strings need to be converted (CRC values, CRC32) and datetimes need to be converted to unix timestamps. In Thinking Sphinx MVAs are usually defined through the relationships a model might hold, eg. an office has many employees.

Thinking Sphinx had no problem with me specifying MVAs of integers, but datetimes where a bit of an issue due to two issues, one is the converting, and the other is the group concat to add them all together with commas (like an array). Now, to say that it was an issue makes it sound worse than it really was, because by breaking down the issue into the two parts, converting and concatenating, two improvements could be made at the same time.

So after a brief chat with Pat on the google group to confirm the issue, and on a perfect sunday to be inside (hungover and not great weather), I forked Thinking Sphinx on github (gotta love github for making forking projects so easy) and created a four commit patch which solved the MVA datetime issue for everyone. This patch has since been pulled into the master Thinking Sphinx project so others can use datetime columns as MVAs without having to think about it.

The whole experience of finding this bug, fixing it, and submitting a patch, has reenforced a very important lesson for all developers. If you run into a problem which you do not believe is excepted behavior, do not run away from it and try to find a workaround, get into the code and try to find out where and why the problem is occurring. Its very easy to use libraries and gems and plugins without really knowing how they work or what they do, but if you never get your hands dirty and investigate what is going on then you never learn.

If you can’t find the problem, ask for help on the message boards or by emailing the maintainer. On the other hand, if you do find the problem, check with the maintainer if he can confirm your findings.

Next, create a patch which fixes the issue and does not break or alter current functionality (unless it really does need altering).

And last but not least, share the patch with the community. (make the world a better place)

By the time you have finished with the patch you will not only know the code a lot better, but you would have learnt a truck load about the library, which in the long run is a huge advantage as you are not blindly trusting code written by a hoard of other developers but know first hand what is going on.

Real World Haskell – Exercises Chapter 8

My answers to the exercises of chapter 8.

Page 205:

2. While filesystems on Unix are usually sensitive to case (e.g. “G” vs. “g”) in file names, Windows filesystems are not. Add a parameter to the globToRegex and matchesGlob functions that allows control over case sensitive matching.

Note that the solution below ignores ranges in character classes.

module GlobRegex
  ( globToRegex
  , matchesGlob
  ) where

import Data.Char (isLower,isUpper,toLower,toUpper)
import Data.List (nub)
import Text.Regex.Posix ((=~))

globToRegex :: String -> Bool-> String
globToRegex cs matchCase
  = '^' : globToRegex' cs matchCase ++ "$"

globToRegex' :: String -> Bool -> String
globToRegex' "" _ = ""

globToRegex' ('*':cs) matchCase
  = ".*" ++ globToRegex' cs matchCase

globToRegex' ('?':cs) matchCase
  = '.' : globToRegex' cs matchCase

globToRegex' ('[':'!':c:cs) matchCase
  = let c' = if matchCase
             then [c]
             else nub (c:shiftCase c:[])
    in "[^" ++ c' ++ charClass cs matchCase

globToRegex' ('[':c:cs) matchCase
  = let c' = if matchCase
             then [c]
             else nub (c:shiftCase c:[])
    in '[' : c' ++ charClass cs matchCase

globToRegex' ('[':_) _ = error "unterminated character class"

globToRegex' (c:cs) matchCase
  = escape c matchCase ++ globToRegex' cs matchCase

escape :: Char -> Bool -> String
escape c matchCase
  | c `elem` regexChars = '\' : [c]
  | matchCase = [c]
  | otherwise = '[' : nub (c:shiftCase c:[]) ++ "]"
  where regexChars = "\\+()^$.{}]|"

charClass :: String -> Bool -> String
charClass (']':cs) matchCase = ']' : globToRegex' cs matchCase
charClass (c:cs) True  = c : charClass cs True
charClass (c:cs) False = c' ++ charClass cs False
  where c' = nub (c:shiftCase c:[])
charClass [] _ = error "unterminated character class"

shiftCase :: Char -> Char
shiftCase c | isLower c = toUpper c
            | isUpper c = toLower c
            | otherwise = c

matchesGlob :: FilePath -> String -> Bool -> Bool
matchesGlob name pat matchCase
  = name =~ globToRegex pat matchCase

Page 212:

1. Glob patterns are simple enough to interpret that it's easy to write a matcher directly in Haskell, rather than going through the regexp machinery. Give it a try.

Note: I forgot to implement negated character classes.

module MatchGlob (match) where

import Data.List (tails)

type Glob = String

match :: Glob -> String -> Bool

match ""  "" = True
match ""  _  = False
match "*" "" = True
match _   "" = False

match (x:xs) z@(y:ys) =
  case x of
    '?'       -> match xs ys
    '\'      -> matchEscape xs z
    '*'       -> matchStar xs z
    '['       -> matchClass xs z
    otherwise -> x == y && match xs ys

matchEscape :: Glob -> String -> Bool
matchEscape [] _ = error "unescaped backslash"
matchEscape _ "" = False
matchEscape (x:xs) (y:ys) = x == y && match xs ys

matchStar :: Glob -> String -> Bool
matchStar x y = or (map (match x) (tails y))

matchClass :: Glob -> String -> Bool
matchClass (x:xs) (y:ys) =
  let (xs',zs) = break (== ']') xs
  in if zs == ""
     then error "unterminated character class"
     else  y `elem` expand (x:xs') && match (tail zs) ys
matchClass _ _ = error "illegal character class"

expand :: String -> String
expand []           = ""
expand (x:'-':[])   = x:'-':""
expand (x:'-':y:zs) = [x..y] ++ expand zs
expand (x:xs)       = x:expand xs