<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>/dev/tty</title>
	<atom:link href="http://blog.tty.nl/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.tty.nl</link>
	<description>Notes on Web Development, Computer Programming, and Software Engineering</description>
	<lastBuildDate>Wed, 01 Sep 2010 13:20:33 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How to fix 404 on Wordpress Permalinks on IIS</title>
		<link>http://blog.tty.nl/2010/09/01/how-to-fix-404-on-wordpress-permalinks-on-iis/</link>
		<comments>http://blog.tty.nl/2010/09/01/how-to-fix-404-on-wordpress-permalinks-on-iis/#comments</comments>
		<pubDate>Wed, 01 Sep 2010 13:20:33 +0000</pubDate>
		<dc:creator>Almer Thie</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[PHP frameworks]]></category>
		<category><![CDATA[IIS]]></category>
		<category><![CDATA[Windows]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=399</guid>
		<description><![CDATA[I wrote an article on my personal blog about fixing 404&#8217;s on Wordpress Permalinks on an IIS 6.0 on a Windows webserver. There were two different sources I got the 404 from and which I had to fix; first from Wordpress and then (Murphy&#8217;s law working here ;)) also from IIS. I hope it can [...]]]></description>
			<content:encoded><![CDATA[<p>I wrote an article on my personal blog about <a href="http://code.almeros.com/404-wordpress-permalinks-iis-with-isapi-rewrite" target="_blank">fixing 404&#8217;s on Wordpress Permalinks on an IIS 6.0 on a Windows webserver</a>. There were two different sources I got the 404 from and which I had to fix; first from Wordpress and then (Murphy&#8217;s law working here ;)) also from IIS. I hope it can help you if you have the same or a similar Wordpress problem.</p>
<p>At TTY we prefer to set up systems with open source software, which usually means servers running a Linux flavour and an open source HTTP server, but there may be situations where a Windows and IIS system is already in use. This article shows we are capable of handling just that.</p>
<p><a href="http://code.almeros.com/404-wordpress-permalinks-iis-with-isapi-rewrite" target="_blank">Read the article here</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/09/01/how-to-fix-404-on-wordpress-permalinks-on-iis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Solr DataImportHandler issue: positive integers indexed for string &#8216;nested&#8217; fields</title>
		<link>http://blog.tty.nl/2010/06/23/solr-dataimporthandler-issue-positive-integers-indexed-for-string-nested-fields/</link>
		<comments>http://blog.tty.nl/2010/06/23/solr-dataimporthandler-issue-positive-integers-indexed-for-string-nested-fields/#comments</comments>
		<pubDate>Wed, 23 Jun 2010 12:43:53 +0000</pubDate>
		<dc:creator>Ward Bekker</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=392</guid>
		<description><![CDATA[A quick note about a Solr issue that took me some time to solve.
If this sounds familiar&#8230;.

You are using the DataImportHandler for Solr
You have an entity with a field whose values come from a related entity.
After an import it looks like Solr only indexed even positive integers if you look at the schema browser.

&#8230;.You probably [...]]]></description>
			<content:encoded><![CDATA[<p>A quick note about a <a href="http://lucene.apache.org/solr/">Solr</a> issue that took me some time to solve.</p>
<p>If this sounds familiar&#8230;.</p>
<ul>
<li>You are using the <a href="http://wiki.apache.org/solr/DataImportHandler">DataImportHandler for Solr</a></li>
<li>You have an entity with a field whose values come from a related entity.</li>
<li>After an import it looks like Solr only indexed even positive integers if you look at the schema browser.</li>
</ul>
<p>&#8230;.You probably have a &#8216;nested&#8217; field whose name is the same as its entity name. See the code below: entity name = regio and field name = regio. Changing the field name to something else (regions) solved the issue. When you think about it, it&#8217;s somewhat logical that a field is not allowed to have the same name as its entity. A schema exception during indexing would have been nice though.</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border: 1px solid #9F9F9F;width:600px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">&lt;dataConfig&gt;<br />
&lt;dataSource type=&quot;JdbcDataSource&quot; driver=&quot;com.mysql.jdbc.Driver&quot; url=&quot;jdbc:mysql://localhost/nvb?zeroDateTimeBehavior=convertToNull&quot; user=&quot;****&quot; password=&quot;****&quot;/&gt;<br />
&nbsp; &nbsp; &lt;document name=&quot;vacatures&quot;&gt;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &lt;entity name=&quot;vacature&quot; query=&quot;select * from vacature&quot;&gt;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;field column=&quot;id&quot; name=&quot;id&quot; /&gt;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;entity name=&quot;regio&quot; query=&quot;SELECT regio from foo where vacature='${vacature.id}'&quot;&gt;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;field name=&quot;regio&quot; column=&quot;regio&quot; /&gt;<br />
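&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &lt;/entity&gt;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &lt;/entity&gt;<br />
&nbsp; &nbsp; &lt;/document&gt;<br />
&lt;/dataConfig&gt;</div></div>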
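<p>Since Solr did not complain, a small check before importing can catch the clash. The following is a minimal sketch in Python, not part of Solr: the function names are made up for this example, and the demo config is a stripped-down stand-in for the one above (no queries or data source). For a real file you would pass <tt>xml.etree.ElementTree.parse(path).getroot()</tt> instead.</p>

```python
import xml.etree.ElementTree as ET


def find_name_clashes(root):
    """Return (entity, field) name pairs where a nested entity's field reuses the entity name."""
    clashes = []
    for parent in root.iter("entity"):
        # Only entities that sit directly inside another entity count as nested here.
        for child in parent.findall("entity"):
            for field in child.findall("field"):
                if field.get("name") == child.get("name"):
                    clashes.append((child.get("name"), field.get("name")))
    return clashes


def build_demo_config(field_name):
    """Build a tiny stand-in for the data-config above with the given nested field name."""
    root = ET.Element("dataConfig")
    document = ET.SubElement(root, "document", name="vacatures")
    vacature = ET.SubElement(document, "entity", name="vacature")
    ET.SubElement(vacature, "field", column="id", name="id")
    regio = ET.SubElement(vacature, "entity", name="regio")
    ET.SubElement(regio, "field", column="regio", name=field_name)
    return root
```

<p>With the field named <tt>regio</tt> the check reports one clash; after renaming it to <tt>regions</tt> it reports none.</p>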
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/06/23/solr-dataimporthandler-issue-positive-integers-indexed-for-string-nested-fields/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Zend SOAP Server Webservice quickstart</title>
		<link>http://blog.tty.nl/2010/06/21/zend-soap-server-webservice-quickstart/</link>
		<comments>http://blog.tty.nl/2010/06/21/zend-soap-server-webservice-quickstart/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 18:38:01 +0000</pubDate>
		<dc:creator>Ward Bekker</dc:creator>
				<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=372</guid>
		<description><![CDATA[Below a quick writeup of my first impression with building a basic Zend Soap Webservice. I invite you to add spelling and grammar corrections in the comments for my education.
Starting point

My team needs to implement a SOAP service for mass posting of vacancies to a job board system.
The SOAP service is based on a WSDL [...]]]></description>
			<content:encoded><![CDATA[<p>Below a quick writeup of my first impression with building a basic Zend Soap Webservice. I invite you to add spelling and grammar corrections in the comments for my education.</p>
<h2>Starting point</h2>
<ul>
<li>My team needs to implement a <a href="http://en.wikipedia.org/wiki/SOAP">SOAP service</a> for mass posting of vacancies to a job board system.</li>
<li>The SOAP service is based on a <a href="http://www.w3.org/TR/wsdl">WSDL</a> of an existing service. So we’ll use these specifications as a starting point for the proof of concept.</li>
<li>On a side note: I prefer <a href="http://en.wikipedia.org/wiki/Representational_State_Transfer">REST</a> over SOAP because of its elegant simplicity. But it wouldn’t make a lot of business sense in this case, because many paying consumers of the new service have working code for the old service. Adapting to a slightly changed SOAP service will be much easier than a switch to a brand new REST API.</li>
</ul>
<h2>Available SOAP Server extensions for PHP</h2>
<p>There are several frameworks / extensions / toolkits for creating a SOAP server for PHP:</p>
<ul>
<li><a title="pear soap" href="http://pear.php.net/package/SOAP">Pear SOAP package</a>. Probably an orphan package, as it has not been updated since 2008 and has a beta status. You probably want to look at the alternatives.</li>
<li><a title="nuSoap SOAP toolkit" href="http://sourceforge.net/projects/nusoap/">NuSoap SOAP toolkit</a>.  Started in 2002 and still under active development as the last release was just a few months ago at the time of writing.</li>
<li><a title="PHP 5 SOAP extensions" href="http://www.php.net/manual/en/class.soapserver.php">PHP 5 SOAP extensions</a>. The official SOAP extension for PHP since version 5.</li>
<li><a title="Zend SOAP Server" href="http://framework.zend.com/manual/en/zend.soap.server.html">Zend SOAP Server</a>. Part of the Zend Framework, so probably not very useful if that’s not your current PHP framework.</li>
</ul>
<p>As we use the <a href="http://framework.zend.com/">Zend framework</a> for this project, it was a natural choice to use its SOAP server implementation. We might opt for one of the alternatives if we slam into a brick wall later down the line.</p>
<h2>Testing the waters</h2>
<p>The steps I’ve taken to get a basic Zend SOAP Server based on the WSDL up and running:</p>
<ul>
<li>I copied the WSDL to the /public directory of the Zend framework application, making it publicly accessible under http://example.org/jobtool.wsdl</li>
<li>I created a new controller under application/controllers/soapController.php with a public indexAction function. <a href="http://gist.github.com/446932">Example code</a></li>
<li>The new SOAP service is now available under http://example.org/soap</li>
<li>Next step: actually handle SOAP requests. <a href="http://gist.github.com/446938">Example code</a>. Handling of the SOAP request is as expected: SOAP method arguments are passed as function arguments. Complex types are represented as <a href="http://stackoverflow.com/questions/931407/what-is-stdclass-in-php">stdClass objects</a>, which are basically associative arrays. Nested complex types are translated to nested stdClass instances. You don&#8217;t get any warnings or exceptions if your argument count differs from the one specified in the SOAP request. IMHO that&#8217;s undesirable: I&#8217;d rather have big fat ugly exceptions in that case than subtle bugs. The associative array you return is translated to the complex type specified in the WSDL and returned to the client.</li>
<li>To test the SOAP service without the need for a full-blown client I’ve used the free <a href="http://sourceforge.net/projects/soapui/">soapUI</a> tool. You point this tool at the WSDL and it automatically creates fake SOAP requests that you can use to test your brand new SOAP services. Make sure you have specified the correct URLs in the soapAction attributes in the WSDL.</li>
</ul>
<h2>Closing Thoughts</h2>
<p>I hope this post saved you some time when building your first SOAP Webservice using Zend Framework. I don’t know yet from experience if the Zend SOAP Server will handle more advanced scenarios. Only time will tell. Let me know how it works for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/06/21/zend-soap-server-webservice-quickstart/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A different approach to web form design</title>
		<link>http://blog.tty.nl/2010/03/02/different-approach-to-form-design/</link>
		<comments>http://blog.tty.nl/2010/03/02/different-approach-to-form-design/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 13:57:50 +0000</pubDate>
		<dc:creator>Thijs Oppermann</dc:creator>
				<category><![CDATA[Design]]></category>
		<category><![CDATA[Forms]]></category>
		<category><![CDATA[Usability]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=363</guid>
		<description><![CDATA[An interesting article about a different way to look at forms:&#160;&#8220;Mad Libs&#8221; Style Form Increases Conversion 25-40%.
Conclusion of the article seems to be that conversion increases a lot when presenting the form as a narrative. Don&#8217;t know if that really is supported by the small test they did, but I&#8217;m inclined to think they might [...]]]></description>
			<content:encoded><![CDATA[<p>An interesting article about a different way to look at forms:&nbsp;<a href="http://www.lukew.com/ff/entry.asp?1007">&#8220;Mad Libs&#8221; Style Form Increases Conversion 25-40%</a>.</p>
<p>Conclusion of the article seems to be that conversion increases a lot when presenting the form as a narrative. Don&#8217;t know if that really is supported by the small test they did, but I&#8217;m inclined to think they might be on to something.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/03/02/different-approach-to-form-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NLTK&#8217;s dispersion_plot on Mac OS X</title>
		<link>http://blog.tty.nl/2010/02/28/nltks-dispersion_plot-on-mac-os-x/</link>
		<comments>http://blog.tty.nl/2010/02/28/nltks-dispersion_plot-on-mac-os-x/#comments</comments>
		<pubDate>Sun, 28 Feb 2010 11:33:10 +0000</pubDate>
		<dc:creator>Michel Rijnders</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Mac]]></category>
		<category><![CDATA[NLTK]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=344</guid>
		<description><![CDATA[While reading &#8220;Natural Language Processing with Python&#8221; I ran into problems on my Mac with examples that were using the dispersion_plot function: calls to the function returned immediately without displaying anything.
Turns out matplotlib&#8217;s back-end wasn&#8217;t configured properly. To fix this I had to add an rc file (matplotlibrc) to my ~/.matplotlib directory. The rc file [...]]]></description>
			<content:encoded><![CDATA[<p>While reading <a href="http://www.nltk.org/book">&#8220;Natural Language Processing with Python&#8221;</a> I ran into problems on my Mac with examples that were using the <tt>dispersion_plot</tt> function: calls to the function returned immediately without displaying anything.</p>
<p>Turns out <a href="http://matplotlib.sourceforge.net/">matplotlib</a>&#8217;s back-end wasn&#8217;t configured properly. To fix this I had to add an rc file (<tt>matplotlibrc</tt>) to my <tt>~/.matplotlib</tt> directory. The rc file contains the following:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border: 1px solid #9F9F9F;width:600px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">backend: TkAgg</div></div>
<p>And, hey presto:<br />
<a href="http://blog.tty.nl/wp-content/uploads/Screen-shot-2010-02-28-at-12.26.21-PM.png"><img src="http://blog.tty.nl/wp-content/uploads/Screen-shot-2010-02-28-at-12.26.21-PM.png" alt="Screen shot 2010-02-28 at 12.26.21 PM" title="Screen shot 2010-02-28 at 12.26.21 PM" width="395" height="316" class="alignnone size-full wp-image-358" /></a></p>
<p>(disclaimer: &#8220;Works on my machine!&#8221;)</p>
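<p>The same rc file can also be written from a script. A minimal sketch using only the Python standard library; the helper name <tt>write_backend_rc</tt> is made up for this example, and note that it overwrites any existing <tt>matplotlibrc</tt> in the target directory:</p>

```python
from pathlib import Path


def write_backend_rc(config_dir, backend="TkAgg"):
    """Create config_dir if needed and write a one-line matplotlibrc selecting the backend."""
    config_dir = Path(config_dir).expanduser()
    config_dir.mkdir(parents=True, exist_ok=True)
    rc_file = config_dir / "matplotlibrc"
    # Overwrites an existing file; back it up first if it holds other settings.
    rc_file.write_text("backend: {}\n".format(backend))
    return rc_file
```

<p>Calling <tt>write_backend_rc("~/.matplotlib")</tt> would reproduce the fix described above.</p>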
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/02/28/nltks-dispersion_plot-on-mac-os-x/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A distributed index setup for Sphinx</title>
		<link>http://blog.tty.nl/2010/02/09/distributed-index-setup-sphinx/</link>
		<comments>http://blog.tty.nl/2010/02/09/distributed-index-setup-sphinx/#comments</comments>
		<pubDate>Tue, 09 Feb 2010 15:29:10 +0000</pubDate>
		<dc:creator>Thijs Oppermann</dc:creator>
				<category><![CDATA[Sphinx search]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[setup]]></category>
		<category><![CDATA[sphinx]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=281</guid>
		<description><![CDATA[Sphinx search is a powerful search engine. Recently we released it (version 0.9.9-rc2) as the backend for most of the searches on one of our high-volume websites. This site has about 360,000 visitors a day that generate about 4,500 search queries for the Sphinx backend per minute on average, peaking at nearly 9,000 per minute [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Sphinx search" href="http://www.sphinxsearch.com/">Sphinx search</a> is a powerful search engine. Recently we released it (version 0.9.9-rc2) as the backend for most of the searches on one of our high-volume websites. This site has about 360,000 visitors a day that generate about 4,500 search queries for the Sphinx backend per minute on average, peaking at nearly 9,000 per minute when it gets busy on the site. To be able to handle that many requests we currently run Sphinx on four dedicated servers.</p>
<p>A problem with having more than one Sphinx server is that you need to make sure the results from the different servers are close to the same. Two consecutive searches can land on different servers (and on the site in question a &#8220;search&#8221; can also be a browsing action, for example moving from one page of results to the next), so it could be very confusing if the search results were different.</p>
<p>With Sphinx there are a number of ways to solve this problem. The most commonly used solutions are:</p>
<ul>
<li>run the indexer on one server and make those indexing results available to all the other servers (through scp, rsync, or hosting on a shared filesystem)</li>
<li>use a distributed index setup</li>
</ul>
<p>The first should work, but is actually not recommended by the makers of Sphinx. We went for the second solution: a distributed index setup.<br />
<span id="more-281"></span></p>
<h4>Full and delta indices</h4>
<p>Our four servers do a full reindex every night (kicked off by cron). These run sequentially, so each server starts out with a different base index. Cron also runs the delta index on each server every three minutes. A delta index covers only the records added since that server&#8217;s full index the night before. It grows throughout the day: a run starts off taking less than 15 minutes and grows to take about 45 minutes just before the full index is re-run.</p>
<p>The setup for these indices, let&#8217;s call them <strong>base</strong> and <strong>delta</strong>, looks something like this (for the server with server_id = 2):</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border: 1px solid #9F9F9F;width:600px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">source base {<br />
&nbsp; type &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= mysql<br />
&nbsp; sql_query_range = SELECT MIN(nr),MAX(nr) FROM object_table<br />
&nbsp; sql_range_step &nbsp;= 10000<br />
&nbsp; sql_query_pre &nbsp; = REPLACE INTO sph_counter SELECT 2, MAX(nr) FROM object_table<br />
&nbsp; sql_query &nbsp; &nbsp; &nbsp; = SELECT nr AS id, uid, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rootnr, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; is_active, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; title, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; [ .. and all other fields we want in index .. ] \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM object_table \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE is_active = '1' \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND nr &gt;= $start \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND nr &lt;= $end \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND nr &lt;= ( SELECT max_doc_id &nbsp;\<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM sph_counter \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;WHERE server_id=2 ) <br />
&nbsp; sql_attr_uint &nbsp; = rootnr_attr<br />
&nbsp; sql_attr_uint &nbsp; = is_active_attr<br />
&nbsp; sql_attr_uint &nbsp; = uid_attr<br />
&nbsp; [ .. ]<br />
}</div></div>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border: 1px solid #9F9F9F;width:600px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">source delta : base {<br />
&nbsp; sql_query &nbsp; &nbsp; &nbsp; = SELECT nr AS id, uid, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rootnr, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; is_active, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; title, \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; [ .. and all other fields we want in index .. ] \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; FROM object_table \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE is_active = '1' \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND nr &gt;= $start \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND nr &lt;= $end \ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; AND nr &gt; ( SELECT max_doc_id &nbsp;\<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;FROM sph_counter \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; WHERE server_id=2 ) <br />
}</div></div>
<p>The table sph_counter is just a table that tracks the max_doc_id at the time of the last full index run for each server (set by the &#8216;REPLACE ..&#8217; sql_query_pre in base). </p>
<p>Each server has a similar setup as above, so that they all have a full index from the previous night, and a delta index that covers all the changes from that time up to the last delta index run.</p>
<h4>The distributed index</h4>
<p>Now we create a distributed index on all the servers that queries the local base and delta indices, and also all the delta indices on the other three servers. That looks something like this:</p>
<div class="codecolorer-container text default" style="overflow:auto;white-space:nowrap;border: 1px solid #9F9F9F;width:600px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">index main {<br />
&nbsp; &nbsp; type &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;= distributed<br />
&nbsp; &nbsp; local &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = base<br />
&nbsp; &nbsp; local &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = delta<br />
&nbsp; &nbsp; agent &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 192.168.33.57:3312:delta<br />
&nbsp; &nbsp; agent &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 192.168.33.56:3312:delta<br />
&nbsp; &nbsp; agent &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 192.168.33.13:3312:delta<br />
&nbsp; &nbsp; agent_connect_timeout &nbsp; = 200<br />
&nbsp; &nbsp; agent_query_timeout &nbsp; &nbsp; = 1000<br />
}</div></div>
<p>In our code we exclusively query the index named <strong>main</strong>. This index combines the base index from the particular random server the code happened to choose with the delta indices of all four servers. A distributed index combines the results from all of these, and discards the duplicates automatically. This seems to work pretty well.</p>
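<p>To see why querying <strong>main</strong> can return the same set of documents whichever server is chosen, here is a small pure-Python model of the setup. It is a sketch, not Sphinx code: the function names and counter values are invented for illustration, and it assumes for simplicity that every delta run has seen the same newest record.</p>

```python
def split_ids(all_ids, max_doc_id):
    """Partition ids around one server's counter, like the base and delta queries."""
    base = [i for i in all_ids if i <= max_doc_id]
    delta = [i for i in all_ids if i > max_doc_id]
    return base, delta


def merged_ids(local_base, deltas):
    """Combine the local base with every delta, keeping each id once."""
    seen = set()
    merged = []
    for source in [local_base] + list(deltas):
        for doc_id in source:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# Hypothetical data: ten records, and four servers whose nightly full index
# ran one after another, so each stored a different max_doc_id.
ALL_IDS = list(range(1, 11))
COUNTERS = {1: 6, 2: 7, 3: 8, 4: 9}
DELTAS = [split_ids(ALL_IDS, counter)[1] for counter in COUNTERS.values()]
```

<p>In this model, <tt>merged_ids</tt> called with any one server's base and the four deltas yields all ten ids, even though the four base indices differ.</p>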
<h4>Things to look at</h4>
<p>One downside to this setup is that each search kicks off a search on the machine it connects to, but also sends a search on the delta index to all the other servers. At the moment it seems to work, but at busy moments we see a bit of an increase in our response times for our searches. It might be better to concentrate the delta index on one server dedicated to the delta index and leave the other three to concentrate on the base index. This is something we plan on looking into soon. So, possibly an update soon&#8230; ?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/02/09/distributed-index-setup-sphinx/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Simple ranked text search for MongoDB</title>
		<link>http://blog.tty.nl/2010/02/08/simple-ranked-text-search-for-mongodb/</link>
		<comments>http://blog.tty.nl/2010/02/08/simple-ranked-text-search-for-mongodb/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 14:46:28 +0000</pubDate>
		<dc:creator>Ward Bekker</dc:creator>
				<category><![CDATA[Open Source Projects]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=329</guid>
		<description><![CDATA[In this code snippet you can see how to do a basic ranked text search for MongoDB. The code relies on two simple mapreduce operations. One to create an inverted index from some demo text, and a second one to score the matching documents based on query term hits.
]]></description>
			<content:encoded><![CDATA[<p>In this <a href="http://gist.github.com/298175">code snippet</a> you can see how to do a basic ranked text search for <a href="http://www.mongodb.org">MongoDB</a>. The code relies on two simple mapreduce operations. One to create an <a href="http://en.wikipedia.org/wiki/Inverted_index">inverted index</a> from some demo text, and a second one to score the matching documents based on query term hits.</p>
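<p>For readers who want the idea without opening the gist, the two steps can be sketched in plain Python. This is not the gist's code and does not use MongoDB's mapreduce; the function names are assumptions, and the score is simply the number of distinct query terms a document contains.</p>

```python
from collections import Counter, defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def ranked_search(index, query):
    """Score each document by how many distinct query terms it contains, best first."""
    scores = Counter()
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Sort by descending score, then by id so ties come out in a fixed order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

<p>The first function plays the role of the indexing step and the second that of the scoring step; a document matching two query terms ranks above one matching a single term.</p>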
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/02/08/simple-ranked-text-search-for-mongodb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MongoDB first impressions</title>
		<link>http://blog.tty.nl/2010/02/08/mongodb-first-impressions/</link>
		<comments>http://blog.tty.nl/2010/02/08/mongodb-first-impressions/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 11:40:24 +0000</pubDate>
		<dc:creator>Ward Bekker</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=323</guid>
		<description><![CDATA[For a customer we have developed log analytics software. It currently uses MySQL as the database backend. The system reads in an hourly log file, and calculates all kinds of fancy statistics. I wanted to see how the system would work if I used MongoDB, a schema-less document DB, instead of MySQL. My impressions in no [...]]]></description>
			<content:encoded><![CDATA[<p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">For a customer we have developed log analytics software. It currently uses MySQL as the database backend. The system reads in an hourly log file, and calculates all kinds of fancy statistics. I wanted to see how the system would work if I used <a href="http://www.mongodb.org">MongoDB</a>, a schema-less document DB, instead of MySQL. My impressions in no particular order:</span></p>
<ul>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">Importing log data is much easier than on MySQL because MongoDB is schema-less. Just create a collection (=bucket) and insert every log line into it as a hash. For log files that don&#8217;t have a fixed number of fields, it&#8217;s a great fit.</span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">As with MySQL, you do need to create indexes to make searching fast(er). </span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">MongoDB supports map reduce operations. It made some of the calculations much more elegant and more readable than the code that was written for MySQL.</span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">Chaining of map reduce operations is supported, and works as you would expect.</span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">Queries are written in JavaScript. I&#8217;m happy that they didn&#8217;t invent yet another &#8216;scripting&#8217; language. JavaScript looks capable enough. </span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">Map reduce operations are not particularly fast. They are upgrading their JavaScript engine to <a href="http://jira.mongodb.org/browse/SERVER-446">V8</a> to improve the execution speed. </span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">The MongoDB community is nowhere near the size of MySQL&#8217;s. Don’t expect a lot of Google results for a specific MongoDB issue. <a href="http://groups.google.com/group/mongodb-user">The moderated Google group</a> is a better place to go currently. </span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">I liked the API. Calls are not verbose and their intended use is easy to understand.</span></li>
<li style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><span style="letter-spacing: 0.0px;">Although quite capable, MongoDB is still a young project. I need to spend more time with it before using it on a customer project.</span></li>
</ul>
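<p>To illustrate the first point: because documents in a collection need not share fields, each log line can become its own hash. A minimal Python sketch, assuming a hypothetical <tt>key=value</tt> log format (the customer's real format is not shown here); the resulting dictionaries are the kind of thing you would hand to a driver's insert call.</p>

```python
def parse_log_line(line):
    """Turn 'key=value key=value' text into a dict; fields may differ per line."""
    record = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            record[key] = value
    return record
```

<p>Two lines with different fields simply produce two dictionaries with different keys, with no schema change needed in between.</p>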
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/02/08/mongodb-first-impressions/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>My Reading List for 2010</title>
		<link>http://blog.tty.nl/2010/01/09/my-reading-list-for-2010/</link>
		<comments>http://blog.tty.nl/2010/01/09/my-reading-list-for-2010/#comments</comments>
		<pubDate>Sat, 09 Jan 2010 14:59:07 +0000</pubDate>
		<dc:creator>Michel Rijnders</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Programming Language Theory]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=316</guid>
		<description><![CDATA[One of the suggestions of &#8220;The Pragmatic Programmer&#8221; is that you should learn at least one new programming language every year. This is a great suggestion, but after a couple of years its usefulness diminishes, e.g. if one already knows Perl and Python, then the payback on learning Ruby is rather small. Therefore I&#8217;m going [...]]]></description>
			<content:encoded><![CDATA[<p>One of the suggestions of &#8220;The Pragmatic Programmer&#8221; is that you should learn at least one new programming language every year. This is a great suggestion, but after a couple of years its usefulness diminishes, e.g. if one already knows Perl and Python, then the payback on learning Ruby is rather small. Therefore I&#8217;m going to concentrate on the foundations of programming languages this year. Here&#8217;s my tentative reading list:</p>
<ul>
<li><a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&#038;tid=11656">Design Concepts in Programming Languages</a></li>
<li><a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&#038;tid=10142">Concepts, Techniques, and Models of Computer Programming</a></li>
<li><a href="http://www.cis.upenn.edu/~bcpierce/tapl/">Types and Programming Languages</a></li>
</ul>
<p>Suggestions welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2010/01/09/my-reading-list-for-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ruby Quiz, Haskell Solution: LCD Numbers</title>
		<link>http://blog.tty.nl/2009/12/17/ruby-quiz-haskell-solution-lcd-numbers/</link>
		<comments>http://blog.tty.nl/2009/12/17/ruby-quiz-haskell-solution-lcd-numbers/#comments</comments>
		<pubDate>Thu, 17 Dec 2009 13:26:34 +0000</pubDate>
		<dc:creator>Michel Rijnders</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Ruby Quiz]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.tty.nl/?p=288</guid>
		<description><![CDATA[A solution to Ruby Quiz #14 in literate Haskell:

LCD Numbers
===========

Problem
-------

[original source](http://rubyquiz.com/quiz14.html)

This week's quiz is to write a program that displays LCD style numbers
at adjustable sizes.

The digits to be displayed will be passed as an argument to the
program. Size should be controlled with the command-line option -s
followed by a positive integer. The default value for [...]]]></description>
			<content:encoded><![CDATA[<p>A solution to Ruby Quiz #14 in literate Haskell:</p>
<pre>
LCD Numbers
===========

Problem
-------

[original source](http://rubyquiz.com/quiz14.html)

This week's quiz is to write a program that displays LCD style numbers
at adjustable sizes.

The digits to be displayed will be passed as an argument to the
program. Size should be controlled with the command-line option -s
followed by a positive integer. The default value for -s is 2.

For example, if your program is called with:

    $ lcd.rb 012345

The correct display is:

     --        --   --        --
    |  |    |    |    | |  | |
    |  |    |    |    | |  | |
               --   --   --   --
    |  |    | |       |    |    |
    |  |    | |       |    |    |
     --        --   --        -- 

And for:

    $ lcd.rb -s 1 6789

Your program should print:

     -   -   -   -
    |     | | | | |
     -       -   -
    | |   | | |   |
     -       -   - 

Note the single column of space between digits in both examples. For
other values of -s, simply lengthen the - and | bars.

Solution
--------

Module declaration and imports:

> module Main where
>
> import Data.Char (digitToInt)
> import Data.List (intersperse)
> import System.Console.GetOpt
> import System.Environment (getArgs)

First we define the numbers at size 1:

> n0 = [ " - "
>      , "| |"
>      , "   "
>      , "| |"
>      , " - "
>      ]
>
> n1 = [ "   "
>      , "  |"
>      , "   "
>      , "  |"
>      , "   "
>      ]
>
> n2 = [ " - "
>      , "  |"
>      , " - "
>      , "|  "
>      , " - "
>      ]
>
> n3 = [ " - "
>      , "  |"
>      , " - "
>      , "  |"
>      , " - "
>      ]
>
> n4 = [ "   "
>      , "| |"
>      , " - "
>      , "  |"
>      , "   "
>      ]
>
> n5 = [ " - "
>      , "|  "
>      , " - "
>      , "  |"
>      , " - "
>      ]
>
> n6 = [ " - "
>      , "|  "
>      , " - "
>      , "| |"
>      , " - "
>      ]
>
> n7 = [ " - "
>      , "  |"
>      , "   "
>      , "  |"
>      , "   "
>      ]
>
> n8 = [ " - "
>      , "| |"
>      , " - "
>      , "| |"
>      , " - "
>      ]
>
> n9 = [ " - "
>      , "| |"
>      , " - "
>      , "  |"
>      , " - "
>      ]
>

Put the numbers in a list, so that the digit d is found at index d:

> numbers = [n0,n1,n2,n3,n4,n5,n6,n7,n8,n9]

Horizontal scaling function: given a three-character row, keep the
first and last characters and replicate the second character n times:

> hscale n cs = head cs : replicate n (cs!!1) ++ [last cs]
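
To see what this does, here is a small sketch. The type signature and
the name `widths` are additions for illustration, not part of the
solution, and the example assumes rows that are exactly three
characters wide, as in the digit definitions above:

```haskell
-- Same definition as above, with a type signature added for clarity.
hscale :: Int -> String -> String
hscale n cs = head cs : replicate n (cs !! 1) ++ [last cs]

-- Hypothetical name: the top row and a side row of the digit 0,
-- scaled to sizes 1, 2 and 3.
widths :: [(String, String)]
widths = map (\n -> (hscale n " - ", hscale n "| |")) [1, 2, 3]
-- [(" - ","| |"),(" -- ","|  |"),(" --- ","|   |")]
```

Each scaled row is n + 2 characters wide: the two outer characters are
kept and the middle one is repeated n times.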

Vertical scaling function: repeat the second and fourth rows n times:

> vscale n css = head css : replicate n cs1 ++ [cs2] ++ replicate n cs3 ++ [cs4]
>   where cs1 = css !! 1
>         cs2 = css !! 2
>         cs3 = css !! 3
>         cs4 = last css
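
The indexing with `!!` assumes that every digit has five rows. As an
alternative sketch (not the version used in this solution; the name
`vscale'` and its fallback case are assumptions made for this
example), the same idea can be written with a pattern match, which
makes that assumption visible:

```haskell
-- Hypothetical variant of vscale using a pattern match instead of (!!).
vscale' :: Int -> [a] -> [a]
vscale' n [top, upper, middle, lower, bottom] =
  [top] ++ replicate n upper ++ [middle] ++ replicate n lower ++ [bottom]
-- Assumption: anything that is not a five-row glyph is returned unchanged.
vscale' _ rows = rows

-- The digit 4 at size 1, copied from the definitions above.
n4 :: [String]
n4 = ["   ", "| |", " - ", "  |", "   "]

tall4 :: [String]
tall4 = vscale' 2 n4
-- ["   ","| |","| |"," - ","  |","  |","   "]
```

A glyph scaled this way has 2*n + 3 rows: three fixed rows plus n
copies each of the upper and lower rows.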

Scale function; note that it scales a single digit, not a whole number:

> scale n = vscale n . map (hscale n)
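
As a worked example, the following sketch repeats the two scaling
functions (with type signatures added) and applies `scale` to the
digit 2. The name `big2` is hypothetical and only used here:

```haskell
hscale :: Int -> String -> String
hscale n cs = head cs : replicate n (cs !! 1) ++ [last cs]

vscale :: Int -> [a] -> [a]
vscale n css = head css : replicate n (css !! 1) ++ [css !! 2]
                        ++ replicate n (css !! 3) ++ [last css]

-- Widen every row first, then repeat the second and fourth rows.
scale :: Int -> [String] -> [String]
scale n = vscale n . map (hscale n)

-- The digit 2 at size 1, copied from the definitions above.
n2 :: [String]
n2 = [" - ", "  |", " - ", "|  ", " - "]

big2 :: [String]
big2 = scale 2 n2
-- [" -- ","   |","   |"," -- ","|   ","|   "," -- "]
```

In GHCi, `putStr (unlines big2)` shows the digit. At size n a glyph is
n + 2 characters wide and 2*n + 3 rows tall.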

Function that converts a list of digits to a string of LCD numbers:

> lcd n = concat .
>         intersperse "\n" .
>         foldr1 (zipWith (++)) .
>         intersperse (replicate (3 + 2*n) " ") .
>         map (scale n . (numbers !!))
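
The pipeline reads from bottom to top. The sketch below walks through
the joining steps for the digits 1 and 7 at size 1. The names `Glyph`,
`spacer`, `besides`, `rows17` and `rendered` are hypothetical and
introduced only for this example:

```haskell
import Data.List (intercalate, intersperse)

type Glyph = [String]

-- Two glyphs at size 1, copied from the definitions above.
n1, n7 :: Glyph
n1 = ["   ", "  |", "   ", "  |", "   "]
n7 = [" - ", "  |", "   ", "  |", "   "]

-- A one-column "glyph" of spaces with as many rows as a digit of size n.
spacer :: Int -> Glyph
spacer n = replicate (3 + 2 * n) " "

-- zipWith (++) glues two glyphs together row by row;
-- foldr1 repeats that across the whole list.
besides :: [Glyph] -> Glyph
besides = foldr1 (zipWith (++))

rows17 :: Glyph
rows17 = besides (intersperse (spacer 1) [n1, n7])

-- intercalate "\n" is the same as concat . intersperse "\n".
rendered :: String
rendered = intercalate "\n" rows17
```

Every row of `rows17` is seven characters wide: three for each digit
plus the single column of space that the quiz asks for. Note that
`foldr1` is partial, so an empty list of digits raises an error.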

`main` function:

> main = do
>   args <- getArgs
>   let (n, digits) = parseArgs args
>   putStrLn $ lcd n $ map digitToInt digits
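
One caveat: `digitToInt` also accepts the hexadecimal digits a-f, so
an argument such as `12ab` yields indices above 9 and `numbers !!`
fails. A possible guard, shown as a sketch only (the helper name
`toDigits` is hypothetical and not part of the solution):

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical helper: succeed only for a non-empty string of the
-- decimal digits '0'..'9'.
toDigits :: String -> Maybe [Int]
toDigits s
  | not (null s) && all isDigit s = Just (map digitToInt s)
  | otherwise                     = Nothing
```

`main` could then print the usage message on `Nothing` instead of
crashing while rendering.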

Command-line argument parsing:

> data Flag = Scale Int
>             deriving Eq
>
> options = [Option "s" [] (ReqArg (Scale . read) "") ""]
>
> parseArgs args =
>   case parse args of
>    (_, [], _)              -> error "Usage: lcd [-s n] digits"
>    ([], digits, [])        -> (2, head digits)
>    ([Scale n], digits, []) -> (n, head digits)
>    (_, _, _)               -> error "Usage: lcd [-s n] digits"
>   where
>     parse = getOpt RequireOrder options
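
`getOpt` returns a triple of recognised flags, remaining non-option
arguments and error messages, which is what the `case` expression
matches on. A small sketch of the two successful cases follows; the
type signatures, the `Show` instance and the names `withFlag` and
`withoutFlag` are additions for illustration:

```haskell
import System.Console.GetOpt

data Flag = Scale Int
            deriving (Eq, Show)

options :: [OptDescr Flag]
options = [Option "s" [] (ReqArg (Scale . read) "") ""]

parse :: [String] -> ([Flag], [String], [String])
parse = getOpt RequireOrder options

withFlag, withoutFlag :: ([Flag], [String], [String])
withFlag    = parse ["-s", "1", "6789"]
-- ([Scale 1],["6789"],[])
withoutFlag = parse ["012345"]
-- ([],["012345"],[])
```

Because `RequireOrder` stops option processing at the first
non-option argument, `-s` has to come before the digits.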
</pre>
]]></content:encoded>
			<wfw:commentRss>http://blog.tty.nl/2009/12/17/ruby-quiz-haskell-solution-lcd-numbers/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
