Solr DataImportHandler issue: positive integers indexed for string ‘nested’ fields

Posted: June 23rd, 2010 | Author: Ward Bekker | Filed under: Uncategorized | No Comments »

A quick note about a Solr issue that took me some time to solve.

If this sounds familiar….

  • You are using the DataImportHandler for Solr
  • You have a entity with a field which values come from a related entity.
  • After an import it looks like Solr only indexed even postive integers if you look at the schema browser.

….You probably have a ‘nested’ field which name is similar to it’s entity name. See the code below: entity name = regio and field name = regio. Changing field name to something else (regions) solved the issue. When you think about it, it’s somewhat logical that you don’t allow field names to have the same name as the entity. An schema exception during indexing would have been nice though.

<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/nvb?zeroDateTimeBehavior=convertToNull" user="****" password="****"/>
    <document name="vacatures">
        <entity name="vacature" query="select * from vacature">
            <field column="id" name="id" />
            <entity name="regio" query="SELECT regio from foo where vacature='${vacature.id}">
                <field name="regio" column="regio" />
            </entity>

MongoDB first impressions

Posted: February 8th, 2010 | Author: Ward Bekker | Filed under: Uncategorized | 4 Comments »

For a customer we have developed log analytics software. It’s currently uses MYSQL as the database backend. The system reads in a hourly log file, and calculates all kinds of fancy statistics. I wanted to see how the system would work if I used MongoDB, a schema-less document DB, instead of MYSQL. My impressions in no particular order:

  • Importing log data is much easier than on MYSQL because MongoDB is schema-less. Just create a collection (=bucket) and insert every log line into it as a hash. For log files that don’t have a fixed amount of fields, it’s a great fit.
  • Like MYSQL, you do need to create indexes to make searching fast(er).
  • MongoDB supports map reduce operations. It made some of the calculations much more elegant and better readable than the code that was written for MYSQL.
  • Chaining of map reduce operations is supported, and works as you would expect.
  • Queries are written in javascript. I’m happy that they didn’t invent yet another ’scripting’ language. Javascript looks capable enough.
  • Map reduce operations are not particularly fast. They are upgrading their javascript engine to V8 to improve the execution speed.
  • MongoDB community is nowhere near the size of MYSQL. Don’t expect a lot of Google results for a specific mongoDB issue. The moderated Google group is a better place to go currently.
  • I liked the API. Calls are not verbose and their intented use is easy to understand.
  • Although quite capable, mongoDB is still a young project. I need to have more time with it before using it on a customer project.

Ruby Quiz, Haskell Solution: LCD Numbers

Posted: December 17th, 2009 | Author: Michel Rijnders | Filed under: Haskell, Ruby Quiz, Uncategorized | 2 Comments »

A solution to Ruby Quiz #14 in literate Haskell:

LCD Numbers
===========

Problem
-------

[original source](http://rubyquiz.com/quiz14.html)

This week's quiz is to write a program that displays LCD style numbers
at adjustable sizes.

The digits to be displayed will be passed as an argument to the
program. Size should be controlled with the command-line option -s
follow up by a positive integer. The default value for -s is 2.

For example, if your program is called with:

    $ lcd.rb 012345

The correct display is:

     --        --   --        --
    |  |    |    |    | |  | |
    |  |    |    |    | |  | |
               --   --   --   --
    |  |    | |       |    |    |
    |  |    | |       |    |    |
     --        --   --        -- 

And for:

    $ lcd.rb -s 1 6789

Your program should print:

     -   -   -   -
    |     | | | | |
     -       -   -
    | |   | | |   |
     -       -   - 

Note the single column of space between digits in both examples. For
other values of -s, simply lengthen the - and | bars.

Solution
--------

Module declaration and imports:

> module Main where
>
> import Data.Char (digitToInt)
> import Data.List (intersperse)
> import System.Console.GetOpt
> import System.Environment (getArgs)

First we define the numbers at size 1:

> n0 = [ " - "
>      , "| |"
>      , "   "
>      , "| |"
>      , " - "
>      ]
>
> n1 = [ "   "
>      , "  |"
>      , "   "
>      , "  |"
>      , "   "
>      ]
>
> n2 = [ " - "
>      , "  |"
>      , " - "
>      , "|  "
>      , " - "
>      ]
>
> n3 = [ " - "
>      , "  |"
>      , " - "
>      , "  |"
>      , " - "
>      ]
>
> n4 = [ "   "
>      , "| |"
>      , " - "
>      , "  |"
>      , "   "
>      ]
>
> n5 = [ " - "
>      , "|  "
>      , " - "
>      , "  |"
>      , " - "
>      ]
>
> n6 = [ " - "
>      , "|  "
>      , " - "
>      , "| |"
>      , " - "
>      ]
>
> n7 = [ " - "
>      , "  |"
>      , "   "
>      , "  |"
>      , "   "
>      ]
>
> n8 = [ " - "
>      , "| |"
>      , " - "
>      , "| |"
>      , " - "
>      ]
>
> n9 = [ " - "
>      , "| |"
>      , " - "
>      , "  |"
>      , " - "
>      ]
>

Put the numbers in  a list:

> numbers = [n0,n1,n2,n3,n4,n5,n6,n7,n8,n9]

Horizontal scaling function, given a string replicate the second
character n times:

> hscale n cs = head cs : replicate n (cs!!1) ++ [last cs]

Vertical scaling function, repeat the second and fourth row n times:

> vscale n css = head css : replicate n cs1 ++ [cs2] ++ replicate n cs3 ++ [cs4]
>   where cs1 = css !! 1
>         cs2 = css !! 2
>         cs3 = css !! 3
>         cs4 = last css

Scale function; note this function scales a single number:

> scale n = vscale n . map (hscale n)

Function that converts a list of numbers to a string of LCD numbers:

> lcd n = concat .
>         intersperse "\n" .
>         foldr1 (zipWith (++)) .
>         intersperse (replicate (3 + 2*n) " ") .
>         map (scale n . (numbers !!))

`main` function:

> main = do
>   args <- getArgs
>   let (n, digits) = parseArgs args
>   putStrLn $ lcd n $ map digitToInt digits

Command-line argument parsing:

> data Flag = Scale Int
>             deriving Eq
>
> options = [Option "s" [] (ReqArg (Scale . read) "") ""]
>
> parseArgs args =
>   case parse args of
>    (_, [], _)              -> error "Usage: lcd [-s n] digits"
>    ([], digits, [])        -> (2, head digits)
>    ([Scale n], digits, []) -> (n, head digits)
>    (_, _, _)               -> error "Usage: lcd [-s n] digits"
>   where
>     parse = getOpt RequireOrder options