<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" 
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
    xmlns:admin="http://webns.net/mvcb/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
	<channel>
<title>davidavraamides.net blog</title><link>http://davidavraamides.net/index.html</link><description>davidavraamides.net blog</description><dc:language>en</dc:language><dc:creator>david.avraamides@mac.com</dc:creator><dc:rights>Copyright 2008 David Avraamides</dc:rights><dc:date>2008-05-13T22:24:56-04:00</dc:date><admin:generatorAgent rdf:resource="http://www.realmacsoftware.com/" />
<admin:errorReportsTo rdf:resource="mailto:david.avraamides@mac.com" /><sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
<lastBuildDate>Wed, 14 May 2008 00:00:23 -0400</lastBuildDate><item><title>Digging Into Spotlight</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2008-05-13T22:24:56-04:00</dc:date><link>http://davidavraamides.net/files/digging-into-spotlight.html#unique-entry-id-12</link><guid isPermaLink="true">http://davidavraamides.net/files/digging-into-spotlight.html#unique-entry-id-12</guid><content:encoded><![CDATA[<h2>Dropping to the Command Line</h2>
<p>My first thought was just to search for help on Spotlight, and that's not a bad starting place as the page covers a number of keywords, more correctly called <em>metadata attributes</em>. Spotlight's help covers <code>kind</code>, <code>author</code>, <code>date</code>, <code>created</code> and <code>by</code> as well as the boolean operators <code>AND</code>, <code>OR</code> and <code>NOT</code>. But I knew there were many other metadata fields that were commonly used in files such as images and audio files. I had come across the <code>mdls</code> shell command before which lists the metadata fields on a file. A quick check of a JPEG image revealed all kinds of interesting data:
</p>
<div class="codehilite"><pre>da-imac-01:stuff  david<span class="nv">$ </span>mdls IMG_3564.JPG 
<span class="nv">kMDItemAcquisitionMake</span>         <span class="o">=</span> <span class="s2">&quot;Canon&quot;</span>
<span class="nv">kMDItemAcquisitionModel</span>        <span class="o">=</span> <span class="s2">&quot;Canon EOS 10D&quot;</span>
<span class="nv">kMDItemAperture</span>                <span class="o">=</span> 0.970855712890625
<span class="nv">kMDItemBitsPerSample</span>           <span class="o">=</span> 32
<span class="nv">kMDItemColorSpace</span>              <span class="o">=</span> <span class="s2">&quot;RGB&quot;</span>
<span class="nv">kMDItemContentCreationDate</span>     <span class="o">=</span> 2008-04-22 18:40:25 -0400
<span class="nv">kMDItemContentModificationDate</span> <span class="o">=</span> 2008-04-22 18:40:25 -0400
...
<span class="nv">kMDItemFlashOnOff</span>              <span class="o">=</span> 0
<span class="nv">kMDItemFNumber</span>                 <span class="o">=</span> 1.399999976158142
<span class="nv">kMDItemFocalLength</span>             <span class="o">=</span> 50
...
</pre></div>
<p>This is a truncated list of the 50+ fields in one of my image files. Note some of the interesting ones like the aperture, flash setting and focal length.
</p>
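<p>If you want to work with that listing in a script, a minimal sketch along these lines may help. It assumes the simple <code>name = value</code> layout shown above; <code>parse_mdls</code> is a hypothetical helper of my own, and multi-value fields (which <code>mdls</code> prints across several lines) are not handled.
</p>

```python
import re

# Sample text copied from the mdls listing above (truncated, as in the post).
SAMPLE = '''\
kMDItemAcquisitionMake         = "Canon"
kMDItemAcquisitionModel        = "Canon EOS 10D"
kMDItemContentCreationDate     = 2008-04-22 18:40:25 -0400
...
kMDItemFlashOnOff              = 0
kMDItemFNumber                 = 1.399999976158142
kMDItemFocalLength             = 50
'''

LINE_RE = re.compile(r'^(kMDItem\w+)\s*=\s*(.*)$')

def parse_mdls(text):
    """Turn simple 'name = value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip '...' and anything that is not a simple pair
        name, raw = match.groups()
        if raw.startswith('"') and raw.endswith('"'):
            fields[name] = raw[1:-1]      # quoted string
        else:
            try:
                fields[name] = float(raw) if '.' in raw else int(raw)
            except ValueError:
                fields[name] = raw        # e.g. dates, left as text
    return fields

fields = parse_mdls(SAMPLE)
print(fields['kMDItemAcquisitionModel'], fields['kMDItemFocalLength'])
```

<p>On a Mac the same function could be fed the text returned by running <code>mdls</code> on a file instead of the pasted sample.
</p>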
<p>I played around with another of the Spotlight/metadata shell commands: <code>mdfind</code>. This lets you do the equivalent of a Spotlight search from the command line and after a bit of trial and error, guessing the keyword names and value formats was fairly easy:
</p>
<div class="codehilite"><pre>da-imac-01:stuff david<span class="nv">$ </span>mdfind make:canon focallength:50 flash:0 iso:125
/Users/david/Desktop/Turks, April 2008/IMG_3564.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Turks, April 2008/IMG_3564.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Museum Visit/IMG_3273.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Museum Visit/IMG_3276.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Mar 23, 2008/IMG_3288.JPG
...
</pre></div>

<h2>Peeling the Onion with DTrace</h2>
<p>Although these shell commands are very useful, the man pages for the commands do not list the valid search keywords. I knew there must be a list of the keywords used in the Spotlight search bar that mapped to these constant names so I thought <em>what a great time to learn <code>dtrace</code>!</em>
</p>
<p>For those of you who haven't heard of <code>dtrace</code> I encourage you to play around with it. It's a very powerful tool for doing live probing and tracing of low level activity in the operating system. After skimming <a href="http://www.solarisinternals.com/wiki/index.php/DTrace_Topics_Intro">this nice tutorial</a> I tried this command in one window:
</p>
<div class="codehilite"><pre>da-imac-01:bin david<span class="nv">$ </span>sudo dtrace -n <span class="s1">&#39;syscall::open*:entry /execname == &quot;mdfind&quot;/ \</span>
<span class="s1">    { printf(&quot;%s %s&quot;, execname, copyinstr(arg0)); }&#39;</span>
Password:
dtrace: description <span class="s1">&#39;syscall::open*:entry &#39;</span> matched 3 probes
</pre></div>
<p>and then ran my <code>mdfind</code> command again in another Terminal window. The <code>dtrace</code> &quot;script&quot; says to trace all system calls whose name begins with &quot;open&quot; when the system call is entered, but only if they were called from the <code>mdfind</code> process, and then print out the name of the process and the first argument (which in the case of <code>open</code> is the file or device name). The name of the system call itself appears in the <code>FUNCTION:NAME</code> column that <code>dtrace</code> prints by default. That resulted in lots of calls like this, showing <code>mdfind</code> opening all kinds of metadata importer files, which I assume are libraries that know how to manipulate certain types of metadata attributes:
</p>
<div class="codehilite"><pre>CPU     ID                    FUNCTION:NAME
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/<span class="se">\</span>
    Audio.mdimporter
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/<span class="se">\</span>
    Audio.mdimporter/Contents
  0  17604                       open:entry mdfind /dev/autofs_nowait
  0  17604                       open:entry mdfind /System/Library/Spotlight/<span class="se">\</span>
    Audio.mdimporter/Contents/Info.plist
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/<span class="se">\</span>
    Chat.mdimporter
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/<span class="se">\</span>
    Chat.mdimporter/Contents
...
</pre></div>
<p> But the part of the trace I was most interested in was near the very end:
</p>
<div class="codehilite"><pre>...
  1  17604                       open:entry mdfind /System/Library/Frameworks/<span class="se">\</span>
    CoreServices.framework/Versions/A/Frameworks/Metadata.framework/<span class="se">\</span>
    Resources/MDPredicate.plist
  1  17604                       open:entry mdfind /dev/autofs_nowait
  1  17604                       open:entry mdfind /System/Library/Frameworks/<span class="se">\</span>
    CoreServices.framework/Versions/A/Frameworks/Metadata.framework/<span class="se">\</span>
    Resources/English.lproj/MDPredicateKeywords.plist
  1  17604                       open:entry mdfind /dev/autofs_nowait
  1  17604                       open:entry mdfind /System/Library/Frameworks/<span class="se">\</span>
    CoreServices.framework/Versions/A/Frameworks/Metadata.framework/<span class="se">\</span>
    Resources/English.lproj/schema.strings
...
</pre></div>
<p>Note the files <code>MDPredicateKeywords.plist</code> and <code>schema.strings</code> in the <code>/System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Resources/English.lproj</code> folder. I tried looking at the <code>schema.strings</code> file but it was in a binary format. So I tried <code>open schema.strings</code> and sure enough, Xcode launched and loaded the file, which contains over 400 lines of mostly metadata keyword definitions. The ones we are interested in are near the end and of the form <code>kMDItemXXX.ShortName = yyy</code>:
</p>
<div class="codehilite"><pre><span class="na">&quot;kMDItemPixelHeight.ShortName&quot;</span>                <span class="o">=</span> <span class="s">&quot;pixelheight,height&quot;;</span>
<span class="na">&quot;kMDItemPixelWidth.ShortName&quot;</span>                 <span class="o">=</span> <span class="s">&quot;pixelwidth,width&quot;;</span>
<span class="na">&quot;kMDItemWhiteBalance.ShortName&quot;</span>               <span class="o">=</span> <span class="s">&quot;whitebalance&quot;;</span>
<span class="na">&quot;kMDItemAperture.ShortName&quot;</span>                   <span class="o">=</span> <span class="s">&quot;aperture,fstop&quot;;</span>
<span class="na">&quot;kMDItemAudioEncodingApplication.ShortName&quot;</span>   <span class="o">=</span> <span class="s">&quot;audioencodingapplication&quot;;</span>
<span class="na">&quot;kMDItemComposer.ShortName&quot;</span>                   <span class="o">=</span> <span class="s">&quot;composer,author,by&quot;;</span>
<span class="na">&quot;kMDItemLyricist.ShortName&quot;</span>                   <span class="o">=</span> <span class="s">&quot;lyricist,author,by&quot;;</span>
<span class="na">&quot;kMDItemStarRating.ShortName&quot;</span>                 <span class="o">=</span> <span class="s">&quot;starrating&quot;;</span>
</pre></div>
<p>These are just a few of the dozens of entries to whet your appetite. For the most part, I've found them to work as expected, with one exception: <code>starrating</code>. I never got any hits using it so I tried using <code>mdls</code> on an MP3 that I knew had a rating set in iTunes and there was no metadata attribute set on it for the iTunes rating. So I guess all you Mac developers out there should &quot;do as Apple says, not as Apple does.&quot;
</p>
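<p>To turn those <code>ShortName</code> entries into a lookup table from search keyword to metadata attribute, something like the following sketch could work. It assumes the file has already been converted to plain text in the form shown above; <code>keyword_map</code> is a hypothetical helper, not part of any Apple tool.
</p>

```python
import re

# Lines copied from the schema.strings excerpt above.
SAMPLE = '''\
"kMDItemPixelHeight.ShortName"                = "pixelheight,height";
"kMDItemAperture.ShortName"                   = "aperture,fstop";
"kMDItemComposer.ShortName"                   = "composer,author,by";
"kMDItemLyricist.ShortName"                   = "lyricist,author,by";
'''

ENTRY_RE = re.compile(r'"(kMDItem\w+)\.ShortName"\s*=\s*"([^"]*)";')

def keyword_map(text):
    """Map each search keyword to the attributes that list it (hypothetical helper)."""
    keywords = {}
    for attribute, names in ENTRY_RE.findall(text):
        for name in names.split(','):
            keywords.setdefault(name, []).append(attribute)
    return keywords

keywords = keyword_map(SAMPLE)
print(keywords['fstop'])
print(keywords['by'])
```

<p>Keeping a list per keyword matters because some keywords, such as <code>by</code> and <code>author</code>, are shared by several attributes.
</p>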

<h2>Satisfaction</h2>
<p>One of the things that I really like about OS X is the ability to work with the system at varying levels of depth. This diversion started with me playing around with the Spotlight search bar: a very advanced &quot;desktop search&quot; feature found only in the most modern operating systems. But when I wanted to learn more, I was able to easily muck around at the command line and experiment with the very same infrastructure that Spotlight is built on. Finally, I was able to leverage a very powerful, low-level system tool, <code>dtrace</code>, to probe the details of what was going on inside OS X, which led me to the answer I was looking for.
</p>]]></content:encoded></item><item><title>Parsing Things Database</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2008-03-23T21:33:31-04:00</dc:date><link>http://davidavraamides.net/files/parsing-things-database.html#unique-entry-id-11</link><guid isPermaLink="true">http://davidavraamides.net/files/parsing-things-database.html#unique-entry-id-11</guid><content:encoded><![CDATA[<p>If you haven't taken a look at Things, it's a very slick application for managing todo lists. While it's not specifically designed around the <a href="http://en.wikipedia.org/wiki/Getting_Things_Done">GTD</a> process, it is close enough that it's easy for people who practice GTD - or something similar to it - to use Things to manage this process.
   <img class="imageStyle" alt="Picture 3" src="http://davidavraamides.net/files//page2_blog_entry11_1.png" width="635" height="576"/>
   Recently, I've had two different cases come up where I needed to print or export data from Things. Unfortunately, that's a feature area that is not finished. So I decided to take a closer look at the XML file format and see how difficult it would be to parse the file and create my own report.
</p>

<h2>Objects and Relationships</h2>
<p>Things' data file is a fairly simple XML file that is primarily a collection of <code>object</code> elements. These elements contain <code>attribute</code> elements which are the properties of an <code>object</code> and <code>relationship</code> elements which can model a one-to-one or a one-to-many relationship to other <code>object</code>s in the file. Here is a snippet of a test file I used.
</p>
<div class="codehilite"><pre><span class="nt">&lt;object</span> <span class="na">type=</span><span class="s">&quot;TODO&quot;</span> <span class="na">id=</span><span class="s">&quot;z159&quot;</span><span class="nt">&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;focustype&quot;</span> <span class="na">type=</span><span class="s">&quot;int32&quot;</span><span class="nt">&gt;</span>131072<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;focuslevel&quot;</span> <span class="na">type=</span><span class="s">&quot;int16&quot;</span><span class="nt">&gt;</span>0<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;datemodified&quot;</span> <span class="na">type=</span><span class="s">&quot;date&quot;</span><span class="nt">&gt;</span>227707669.31836900115013122559<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;datecreated&quot;</span> <span class="na">type=</span><span class="s">&quot;date&quot;</span><span class="nt">&gt;</span>227707452.24475499987602233887<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;title&quot;</span> <span class="na">type=</span><span class="s">&quot;string&quot;</span><span class="nt">&gt;</span>Todo 1.1.1<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;index&quot;</span> <span class="na">type=</span><span class="s">&quot;int32&quot;</span><span class="nt">&gt;</span>0<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;identifier&quot;</span> <span class="na">type=</span><span class="s">&quot;string&quot;</span><span class="nt">&gt;</span>7F63B75E-11F4-4153-B222-7506882CAD79<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;attribute</span> <span class="na">name=</span><span class="s">&quot;compact&quot;</span> <span class="na">type=</span><span class="s">&quot;bool&quot;</span><span class="nt">&gt;</span>1<span class="nt">&lt;/attribute&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;parent&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;THING&quot;</span> <span class="na">idrefs=</span><span class="s">&quot;z169&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;author&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;COWORKER&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;delegate&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;COWORKER&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;focus&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;FOCUS&quot;</span> <span class="na">idrefs=</span><span class="s">&quot;z150&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;recurrenceinstance&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;TODO&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;recurrencetemplate&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;TODO&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;scheduler&quot;</span> <span class="na">type=</span><span class="s">&quot;1/1&quot;</span> <span class="na">destination=</span><span class="s">&quot;GLOBALS&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;children&quot;</span> <span class="na">type=</span><span class="s">&quot;0/0&quot;</span> <span class="na">destination=</span><span class="s">&quot;THING&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;tags&quot;</span> <span class="na">type=</span><span class="s">&quot;0/0&quot;</span> <span class="na">destination=</span><span class="s">&quot;TAG&quot;</span> <span class="na">idrefs=</span><span class="s">&quot;z130 z102&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
    <span class="nt">&lt;relationship</span> <span class="na">name=</span><span class="s">&quot;reminderdates&quot;</span> <span class="na">type=</span><span class="s">&quot;0/0&quot;</span> <span class="na">destination=</span><span class="s">&quot;REMINDER&quot;</span><span class="nt">&gt;&lt;/relationship&gt;</span>
<span class="nt">&lt;/object&gt;</span>
</pre></div>
<p>You can see the <code>relationship</code> with the name <code>parent</code> which has one ID in its <code>idrefs</code> list. Further down, you can see the <code>tags</code> relationship has two references. There are three primary types of <code>object</code> elements:
</p>
<ol>
 <li>
     TODO: an actual todo item
 </li>

 <li>
     TAG: a tag object
 </li>

 <li>
     FOCUS: a generalized object type used for Projects, Areas and groupings of these (like Today, Next).
 </li>
</ol>
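<p>A side note on the <code>date</code> attributes in the snippet above: the values look like seconds counted from 1 January 2001 UTC, the reference date Cocoa uses. That is an assumption on my part, not something the file states, and <code>things_date</code> below is a hypothetical helper for converting them.
</p>

```python
from datetime import datetime, timedelta

# Assumption: the values count seconds from 2001-01-01 00:00:00 UTC.
REFERENCE_DATE = datetime(2001, 1, 1)

def things_date(value):
    """Convert a Things date attribute string to a naive UTC datetime."""
    return REFERENCE_DATE + timedelta(seconds=float(value))

# The datecreated value from the snippet above.
created = things_date('227707452.24475499987602233887')
print(created.isoformat())
```

<p>Under that assumption the sample todo lands in March 2008, which fits when I created the test file.
</p>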
<p>All I do to parse the document is to create a <code>dict</code> for each object, using the <code>attribute</code> and <code>relationship</code> sub-elements as fields in the <code>dict</code>. Relationships are initially stored as lists of the string <code>idref</code> values. Then at the end of the parsing method, once I have all the objects loaded into an ID map, I resolve the references to the actual <code>dict</code> objects. This leaves me with a &quot;Pythonic&quot; graph of the data, which is easier to work with (IMHO) for querying and processing.
</p>
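<div class="codehilite"><pre><span class="kn">from</span> <span class="nn">xml.dom</span> <span class="kn">import</span> <span class="n">minidom</span>

<span class="k">def</span> <span class="nf">parse_things_xml</span><span class="p">(</span><span class="n">database</span><span class="p">):</span>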
    <span class="sd">&quot;&quot;&quot;Parse object nodes from XML object elements</span>
<span class="sd">    and save in dicts. Parses attribute elements for dict fields</span>
<span class="sd">    and follows parent/children relationship elements to link up</span>
<span class="sd">    related nodes. Returns all parsed nodes (not only the roots,</span>
<span class="sd">    i.e. those without parents).</span>
<span class="sd">    &quot;&quot;&quot;</span>

    <span class="k">def</span> <span class="nf">parse_relationship</span><span class="p">(</span><span class="n">relationships</span><span class="p">,</span> <span class="n">name</span><span class="p">):</span>
        <span class="sd">&quot;&quot;&quot;Parse the specified relationship out of the list of relationships.</span>
<span class="sd">        Assumes there is only one relationship of the specified name.</span>
<span class="sd">        &quot;&quot;&quot;</span>
        <span class="n">rels</span> <span class="o">=</span> <span class="p">[</span><span class="n">r</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">relationships</span>
                <span class="k">if</span> <span class="n">r</span><span class="o">.</span><span class="n">attributes</span><span class="p">[</span><span class="s">&#39;name&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">value</span> <span class="o">==</span> <span class="n">name</span><span class="p">]</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">rels</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">None</span>
        <span class="n">idrefs</span> <span class="o">=</span> <span class="n">rels</span><span class="p">[</span><span class="mf">0</span><span class="p">]</span><span class="o">.</span><span class="n">attributes</span><span class="o">.</span><span class="n">has_key</span><span class="p">(</span><span class="s">&#39;idrefs&#39;</span><span class="p">)</span> <span class="ow">and</span> \
                 <span class="n">rels</span><span class="p">[</span><span class="mf">0</span><span class="p">]</span><span class="o">.</span><span class="n">attributes</span><span class="p">[</span><span class="s">&#39;idrefs&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">value</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&#39; &#39;</span><span class="p">)</span> <span class="ow">or</span> <span class="p">[]</span>
        <span class="k">return</span> <span class="n">idrefs</span>

    <span class="n">xmldoc</span> <span class="o">=</span> <span class="n">minidom</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="n">database</span><span class="p">)</span>
    <span class="n">objects</span> <span class="o">=</span> <span class="n">xmldoc</span><span class="o">.</span><span class="n">getElementsByTagName</span><span class="p">(</span><span class="s">&#39;object&#39;</span><span class="p">)</span>

    <span class="c"># parse each object element into a dict storing attribute</span>
    <span class="c"># child elements as dict fields and child relationship</span>
    <span class="c"># elements as lists of idref values</span>
    <span class="n">node_map</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">obj</span> <span class="ow">in</span> <span class="n">objects</span><span class="p">:</span>
        <span class="n">node</span> <span class="o">=</span> <span class="p">{</span><span class="s">&#39;id&#39;</span><span class="p">:</span> <span class="n">obj</span><span class="o">.</span><span class="n">attributes</span><span class="p">[</span><span class="s">&#39;id&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">value</span><span class="p">,</span>
                <span class="s">&#39;type&#39;</span><span class="p">:</span> <span class="n">obj</span><span class="o">.</span><span class="n">attributes</span><span class="p">[</span><span class="s">&#39;type&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">value</span><span class="p">}</span>
        <span class="n">attrs</span> <span class="o">=</span> <span class="n">obj</span><span class="o">.</span><span class="n">getElementsByTagName</span><span class="p">(</span><span class="s">&#39;attribute&#39;</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">attr</span> <span class="ow">in</span> <span class="n">attrs</span><span class="p">:</span>
            <span class="n">val</span> <span class="o">=</span> <span class="n">attr</span><span class="o">.</span><span class="n">hasChildNodes</span><span class="p">()</span> <span class="ow">and</span> <span class="n">attr</span><span class="o">.</span><span class="n">firstChild</span><span class="o">.</span><span class="n">data</span> <span class="ow">or</span> <span class="bp">None</span>
            <span class="n">node</span><span class="p">[</span><span class="n">attr</span><span class="o">.</span><span class="n">attributes</span><span class="p">[</span><span class="s">&#39;name&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">value</span><span class="p">]</span> <span class="o">=</span> <span class="n">val</span>
        <span class="n">rels</span> <span class="o">=</span> <span class="n">obj</span><span class="o">.</span><span class="n">getElementsByTagName</span><span class="p">(</span><span class="s">&#39;relationship&#39;</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">rel</span> <span class="ow">in</span> <span class="n">rels</span><span class="p">:</span>
            <span class="n">relname</span> <span class="o">=</span> <span class="n">rel</span><span class="o">.</span><span class="n">attributes</span><span class="p">[</span><span class="s">&#39;name&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">value</span>
            <span class="n">node</span><span class="p">[</span><span class="n">relname</span><span class="p">]</span> <span class="o">=</span> <span class="n">parse_relationship</span><span class="p">(</span><span class="n">rels</span><span class="p">,</span> <span class="n">relname</span><span class="p">)</span>
        <span class="n">node_map</span><span class="p">[</span><span class="n">node</span><span class="p">[</span><span class="s">&#39;id&#39;</span><span class="p">]]</span> <span class="o">=</span> <span class="n">node</span>

    <span class="c"># resolve idrefs by replacing their value with a reference to the</span>
    <span class="c"># actual dict</span>
    <span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">node_map</span><span class="o">.</span><span class="n">values</span><span class="p">():</span>
        <span class="k">for</span> <span class="n">relname</span><span class="p">,</span> <span class="n">idrefs</span> <span class="ow">in</span> <span class="n">node</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
            <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">idrefs</span><span class="p">,</span> <span class="nb">list</span><span class="p">):</span>
                <span class="n">relnodes</span> <span class="o">=</span> <span class="p">[</span><span class="n">node_map</span><span class="p">[</span><span class="n">idref</span><span class="p">]</span> <span class="k">for</span> <span class="n">idref</span> <span class="ow">in</span> <span class="n">idrefs</span><span class="p">]</span>
                <span class="k">if</span> <span class="n">relnodes</span> <span class="ow">and</span> <span class="n">relnodes</span><span class="p">[</span><span class="mf">0</span><span class="p">]</span><span class="o">.</span><span class="n">has_key</span><span class="p">(</span><span class="s">&#39;index&#39;</span><span class="p">):</span>
                    <span class="n">relnodes</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">n</span><span class="p">:</span> <span class="n">n</span><span class="p">[</span><span class="s">&#39;index&#39;</span><span class="p">])</span>
                <span class="n">node</span><span class="p">[</span><span class="n">relname</span><span class="p">]</span> <span class="o">=</span> <span class="n">relnodes</span>

    <span class="k">return</span> <span class="n">node_map</span><span class="o">.</span><span class="n">values</span><span class="p">()</span>
</pre></div>
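<p>For anyone on Python 3, where <code>has_key</code> is gone, here is a rough sketch of the same two-pass idea using <code>xml.etree.ElementTree</code>. It is not the code from <code>thingslib.py</code>: the sample document, its <code>database</code> root element and the <code>parse_objects</code> name are all made up for illustration.
</p>

```python
import xml.etree.ElementTree as ET

# A made-up document in the same shape as the snippet above.
SAMPLE = """\
<database>
  <object type="FOCUS" id="z1">
    <attribute name="title" type="string">Project</attribute>
    <relationship name="children" type="0/0" destination="THING" idrefs="z3 z2"/>
  </object>
  <object type="TODO" id="z2">
    <attribute name="title" type="string">Todo A</attribute>
    <attribute name="index" type="int32">0</attribute>
    <relationship name="parent" type="1/1" destination="THING" idrefs="z1"/>
  </object>
  <object type="TODO" id="z3">
    <attribute name="title" type="string">Todo B</attribute>
    <attribute name="index" type="int32">1</attribute>
    <relationship name="parent" type="1/1" destination="THING" idrefs="z1"/>
  </object>
</database>
"""

def parse_objects(xml_text):
    """Two passes: build dicts keyed by id, then swap idrefs for dicts."""
    node_map = {}
    relation_names = {}
    # Pass 1: one dict per object; relationships hold lists of id strings.
    for obj in ET.fromstring(xml_text).iter('object'):
        node = {'id': obj.get('id'), 'type': obj.get('type')}
        for attr in obj.findall('attribute'):
            node[attr.get('name')] = attr.text
        names = []
        for rel in obj.findall('relationship'):
            node[rel.get('name')] = rel.get('idrefs', '').split()
            names.append(rel.get('name'))
        node_map[node['id']] = node
        relation_names[node['id']] = names
    # Pass 2: replace each id string with the dict it refers to.
    for node_id, node in node_map.items():
        for name in relation_names[node_id]:
            related = [node_map[ref] for ref in node[name]]
            # Sort numerically when every related node carries an index.
            if related and all(r.get('index') is not None for r in related):
                related.sort(key=lambda r: int(r['index']))
            node[name] = related
    return node_map

nodes = parse_objects(SAMPLE)
print([child['title'] for child in nodes['z1']['children']])
```

<p>One deliberate difference: the <code>index</code> values come out of the XML as strings, so the sketch converts them with <code>int</code> before sorting; otherwise &quot;10&quot; would sort ahead of &quot;2&quot;.
</p>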
<p>I also have a method to query the object graph based on some simple selection criteria and then a <em>very</em> simplistic printing routine to print out the results. The code is a little rough, but it's a start. I've organized it into two files: <code>thingslib.py</code> which does the parsing and querying, and <code>things.py</code> which is a script that you can call to select the items of interest and print them out. The code is available from my <a href="download/download.html" rel="self" title="download">download page</a>.
</p>]]></content:encoded></item><item><title>Amazon vs. iTunes</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2008-03-02T15:16:28-05:00</dc:date><link>http://davidavraamides.net/files/amazon-vs-itunes.html#unique-entry-id-10</link><guid isPermaLink="true">http://davidavraamides.net/files/amazon-vs-itunes.html#unique-entry-id-10</guid><content:encoded><![CDATA[<p>I should say that I'm a little old school in terms of buying music. Of my 4000+ track music collection only 90 songs have been purchased online. In the past I would typically buy CDs and rip them to MP3s, and now have over 400 CDs  in my &quot;physical&quot; collection (collecting dust in the basement, truth be told). But most of the 90 tracks I <em>have</em> purchased online have been singles bought in the last year.
</p>
<p>So when I was putting together a playlist for a party the other night, I was just about to click <img class="imageStyle" alt="buysong" src="http://davidavraamides.net/files//page2_blog_entry10_1.png" width="69" height="13"/> in iTunes when I thought &quot;why not try out Amazon.com?&quot; I'd heard they had a good collection, all DRM-free, with equal or better pricing than iTunes. And you can usually count on Amazon.com to make the buying experience simple and smooth. So I surfed over to Amazon.com, searched for the Psychedelic Furs  &quot;<a href="http://www.amazon.com/Love-My-Way-Album-Version/dp/B0013800GC/ref=sr_f2_1?ie=UTF8&s=dmusic&qid=1204510280&sr=102-1" rel="self">Love My Way</a>&quot; and bought my first MP3 on Amazon.com.
</p>
<p>The first time you purchase an MP3 on Amazon.com, it installs an application that manages the file download process and automatically adds the files to your iTunes library. The files are fully tagged and include artwork just like a song you would buy from the iTunes Music Store. Additional purchases have a very similar experience to iTMS: search for a song, play a sample of it if you like, one click to buy and in a few seconds it's in iTunes.
</p>
<p>Amazon's MP3s are also pretty high quality, encoded at or near 256 kbps, some using variable bit-rate and others using constant bit-rate. I also noticed that they use the LAME 3.x encoder, considered by many to be one of the best MP3 encoders out there.
</p>
<p>In all I bought 10 singles and one 13-track album, spending a total of $17.67. Of the 10 singles I purchased, 8 were $0.99, and 2 were $0.89. The album was $7.97. Had I purchased these on iTMS I would have paid an additional $2.22, or about 12.6% more (all of the songs were $0.99 on iTMS and the album was $9.99, and interestingly enough none of the songs I bought were available in iTunes Plus DRM-free format).
</p>
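<p>For anyone who wants to check the math, here is the comparison as a few lines of Python. It simply sums the prices listed above, working in whole cents to avoid floating-point rounding; the variable names are my own.
</p>

```python
# Prices in cents, taken from the paragraph above.
amazon_singles = [99] * 8 + [89] * 2
amazon_album = 797
itunes_singles = [99] * 10
itunes_album = 999

amazon_total = sum(amazon_singles) + amazon_album
itunes_total = sum(itunes_singles) + itunes_album
extra = itunes_total - amazon_total

print('Amazon: $%.2f' % (amazon_total / 100.0))
print('iTunes: $%.2f' % (itunes_total / 100.0))
print('Extra:  $%.2f (%.1f%% more)' % (extra / 100.0, 100.0 * extra / amazon_total))
```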
<p>So not only did I save a couple of bucks, but more importantly I purchased DRM-free music where I'm not limited on how and where I can play my music. And the purchasing experience was about as close to a fully-iTunes-integrated solution as you could get without the software being written by Apple. Apple may need to swallow a bit of its pride and go back to the record companies to work out DRM-free deals with everyone or they may be in for a tough fight with Amazon.com over digital music.
</p>]]></content:encoded></item><item><title>Blog Reboot</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2008-02-23T13:10:12-05:00</dc:date><link>http://davidavraamides.net/files/blog-reboot.html#unique-entry-id-8</link><guid isPermaLink="true">http://davidavraamides.net/files/blog-reboot.html#unique-entry-id-8</guid><content:encoded><![CDATA[<p>You see, about a year and a half ago - after many years as a Windows guy - I decided to give the Macintosh a try. I had many specific reasons but the overarching theme was that I wanted to spend more time using my home computer to get things done and less time tinkering, experimenting and researching how best to do things. It&rsquo;s a trap programmers often fall in: instead of doing something simple like writing a program to create a report, you spend time researching reporting languages or learning report writers or figuring out how to automate Excel to do the report (because it&rsquo;s ever so close to what you want), and on and on. And after wasting days or weeks on those distractions, you inevitably come to the same conclusion: it would be easier if you just wrote a custom reporting system yourself. In fact, what you really need is a <em>reporting grammar</em> because what problem today can&rsquo;t best be solved with a <a href="http://en.wikipedia.org/wiki/Domain-specific_programming_language" rel="self">DSL</a>?
</p>
<p>So then you&rsquo;re off looking at parsing tools and trying to decide which one best fits your problem, but since all the simple ones have really only been designed to parse math expressions and all the fully-featured ones were really designed to parse the language they were written in, you come to that same conclusion again: you could probably just design a simple parser generator for reporting grammars on your own. Really, how hard can <em>that</em> be?
</p>
<p>And then hopefully, you stop. And think. And try to wind your way backwards along the thread in your mind that led you down this path back to ... what was I trying to do? Oh yeah, create a simple report.
</p>
<p>So that&rsquo;s how I ended up writing a blog from scratch. And how I spent much more time building, tweaking and maintaining the site than I spent writing posts for the site. Which is precisely how I ended up with this new version of my blog, using RapidWeaver. It allows me to easily post blog entries, share photos and videos, but still gives me a lot of control of the site without turning this into a programming project.
</p>
<p>That doesn't mean I won't do <em>some</em> hacking of RW (which I've already done, and I'll talk about more later), but I'm hoping I won't get sucked in like I did before.
</p>]]></content:encoded></item><item><title>Minor Frustrations</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2007-01-04T23:43:30-05:00</dc:date><link>http://davidavraamides.net/files/minor-frustrations.html#unique-entry-id-4</link><guid isPermaLink="true">http://davidavraamides.net/files/minor-frustrations.html#unique-entry-id-4</guid><content:encoded><![CDATA[<h2>The World's Largest Console</h2>
<p>The first problem arose after I had transferred over all my applications, data and settings from my MacBook. Apple includes a Migration Utility to help you move data from another Mac when you first set up a new one, and I thought I'd give this a spin and see how it worked. I chose to transfer over <em>everything</em> just to see what would happen. At the same time this was going on, OSX was downloading a number of OS and firmware updates to bring my machine up to date. I let both of these run to completion and then was prompted to restart my computer to apply some of the OSX updates (yes, updating a Mac sometimes requires a restart, just like Windows).
</p>
<p>The machine restarted and I got the familiar login window, clicked on my name, entered my password and ... the screen went black. Huh? No, it wasn't completely black, there was a little text way up in the upper left corner:
</p>
<div class="codehilite"><pre>Welcome to Darwin!
Login:
</pre></div>
<p>Somehow the OSX window manager had exited and dumped me right to the Darwin console. I tried logging in and exiting, rebooting, even typing Ctrl-D at the Login prompt to exit out of the terminal mode. Everything eventually just took me back to the graphical login window and from there, the console login prompt. Yikes!
</p>
<p>Googling didn't reveal anything obvious so I did what any well-trained Windows user would do: I reinstalled OSX from scratch. But this time I skipped the Migration Utility, partly because I was suspicious that it may have been the culprit, and partly because I didn't really want to transfer everything over (and lastly because I'd never installed OSX from scratch and wanted to give it a try). That seemed to do the trick and the machine was now logging me into OSX correctly.
</p>

<h2>DVDead</h2>
<p>The second problem I ran into had to do with playing a DVD. I play DVDs on my Macbook fairly often, and it works great. I wanted to see how one would look on a 24-inch display so I popped in <a href="http://www.amazon.com/Skin-Bones-Foo-Fighters/dp/B000J6I0PM">Skin and Bones</a>, waited for the usual whirring and humming of the DVD drive, saw QuickTime/iDVD come up and then to my surprise saw the message: <strong>Supported disc not available.</strong> After trying a few things (reinserting the DVD, using a different DVD, rebooting), I started to get a little worried.
</p>
<p>This time, Google was my friend. I got quite a few hits about this problem and a pretty consistent suggestion as a solution: use the Disk Utility to fix the permissions. So I booted off the installation disk, ran Repair Disk Permissions (while wondering to myself, <em>why is this necessary?</em>) and restarted the iMac. I popped in the DVD and it worked.
</p>
<p>Yes, the Mac is a great machine and OSX is a very nice operating system, but it does have its occasional problems. What's interesting, though, is that when I run into these issues on the Mac, I'm generally <em>surprised</em>. When I run into them on Windows, however, I usually just take them in stride, as if reinstalling apps, rebooting, and Googling your way out of the <em>error tarpit</em> is just the status quo. Using a Mac is like driving down a smooth highway and hitting the occasional bump in the road. Using Windows feels more like you are driving down a dirt road: you expect a rough ride and look forward to the occasional clear patches.
</p>]]></content:encoded></item><item><title>Size Matters</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2007-01-04T21:59:55-05:00</dc:date><link>http://davidavraamides.net/files/size-matters.html#unique-entry-id-3</link><guid isPermaLink="true">http://davidavraamides.net/files/size-matters.html#unique-entry-id-3</guid><content:encoded><![CDATA[<h2>Out of the Box</h2>
<img class="imageStyle" alt="overview-box" src="http://davidavraamides.net/files//page2_blog_entry3_1.jpg" width="161" height="137"/>

<p>Opening any Apple product is an experience in itself. I sometimes think they put more thought into the packaging of their products than many companies put into the actual product itself. Then once you unwrap the product, there are those first few minutes where you marvel at the simple, elegant and clever design and engineering. It doesn't seem to matter if it's a monstrous 24-inch iMac, a <a href="http://www.apple.com/ipodshuffle/">sub-$100 iPod</a> or even a <a href="http://www.apple.com/airportexpress/">wireless access point</a>. Opening a box that says &quot;Designed by Apple in California&quot; puts a smile on my face and brings back the same feelings I had as a kid when opening a Christmas present. (But now my toys are a <em>lot</em> more expensive).
</p>

<h2>Hardware</h2>
<p>The iMac is very clean: it has a flush-mounted power cord that when inserted seems to emerge right out of the computer's case. There is a guide hole in the back of the stand to lead the cord through so it is mostly hidden from view. The various I/O ports are hidden behind the lower right of the display, leading cables back and away (rather than out the side of the display where they would be more noticeable). The large power button is in the back of the lower-left corner of the display, keeping it hidden, but easy to find by touch.
</p>
<p>The display is properly balanced on a hinge in the stand so even though it's large and heavy, it takes very little effort to tilt the display when adjusting it. This is easily accomplished with one hand pushing or pulling on the lower edge of the display. There is just the right amount of friction to keep it in place but allow for fine adjustments without jerking the display.
</p>
<p>And then there is the remote control: simple and small, but it does what you need it to when driving Front Row. When I opened the accessories box, I thought to myself &quot;I probably won't use the remote that much but I don't want to lose it. Maybe I should just leave it in the box.&quot; Yep, someone at Apple thought about that too. That's why there is a magnet inside the right edge of the display so you can just stick the remote right on the side of the display. It's out of the way, but easy to find when you need it. Nice.
</p>
<p>The display is beautiful. A 24-inch widescreen may not sound that big these days when 20-inch monitors are commonplace and many people have two or even three monitors, but when you are sitting a foot and a half away from this display it feels very large. I actually found myself initially having to move my head a little bit to see the different corners of the screen. I have 2 side-by-side 20-inch LCD displays at work but there's something about one complete workspace that feels a lot bigger. Two separate displays feel like, well, two separate displays. The borders of the monitors break up the display surface so you rarely want one window to span across them. It takes constant fiddling to keep them exactly the same height such that the two screens line up correctly, and frequently there will be slight color variations between one and the other that are both annoying and distracting.
</p>
<p>The colors are rich and the display is very bright. And watching a DVD felt like I was really in a theater (or at least in front of a very nice TV). This display covers so much of your peripheral vision that it's easy to get lost in the movie and forget you are watching it on a computer. I never had that feeling on the MacBook because everything around the smaller display was constantly in my attention zone and could easily become a distraction. This display really grabs your full attention.
</p>
<blockquote><p>Note that while I was very impressed with the out-of-the-box experience of my new iMac, I did run into a few snags with the new machine, which you can read about <a href="files/minor-frustrations.html" rel="self" title="blog:Minor Frustrations">here</a>.
</p>
</blockquote>
<h2>Where are the Beautiful PCs?</h2>
<p>As I was playing with my new iMac, I kept wondering why you can't buy such a machine from a PC manufacturer. Apple now has the price of their line of computers competitive with equivalent PCs, so it's not like they would be so expensive that people wouldn't buy them. And it's not like people don't appreciate good hardware. People pay a premium for all kinds of specialties: size with ultra-portables, performance with high-end gaming machines, quiet PCs, small form PCs, PCs with large displays. So wouldn't it make sense that in such a large market there would be a segment of people who would pay a premium for a beautiful, well-designed, all-in-one PC that is simple to set up, just works and is a pleasure to use? Of course, with Boot Camp, Parallels, and now VMware, there is. The iMac.
</p>]]></content:encoded></item><item><title>Why Python?</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Programming</category><dc:date>2006-12-03T21:43:30-05:00</dc:date><link>http://davidavraamides.net/files/why-python.html#unique-entry-id-0</link><guid isPermaLink="true">http://davidavraamides.net/files/why-python.html#unique-entry-id-0</guid><content:encoded><![CDATA[<h2>Background</h2>
<p>There is no <em>best</em> programming language, contrary to what some people will say. And to understand why I picked Python as my <code>goto</code> language, you need to understand a little bit about the context of my decision.
</p>
<p>Most of my career has been spent developing proprietary software for small groups of users (10-50) in the financial services industry, either for small departments in a larger firm or for all employees within a small company. There are a few important characteristics of this environment that are worth noting:
</p>
<ol>
 <li>
     The financial services domain changes rapidly. New types of products are created every few months, new trading strategies are constantly developed, and the landscape for third-party products and financial market data evolves and changes at a very fast pace.
 </li>

 <li>
     The number of users is relatively small (from a handful to dozens). 
 </li>

 <li>
     The users are not technical. They do not understand the complexities of software development and it can be very difficult to get the users to spend the time to define a problem with any level of clarity or detail that would provide a useful specification.
 </li>

 <li>
     These are not technology companies. You will usually not find a lot of internal support to build a complete development team (e.g. with roles covering project management, quality assurance, tools developer, support, etc). If your boss doesn't see everyone on your team writing code all day then they will think something is wrong.
 </li>
</ol>
<p>We can identify some requirements for a &quot;good&quot; programming language from the above points:
</p>
<ul>
 <li>
     The first point speaks to flexibility and the need to design for change. You want to develop solutions that can be easily modified as business needs change, and you want to choose tools that lend themselves nicely to this goal.
 </li>

 <li>
     The second point simply reminds us that we don't need to get carried away on complex architectures or get preoccupied with performance - simpler solutions will usually suffice.
 </li>

 <li>
     From point 3, I've learned that it really helps to know the financial business and to try and hire people that are conversant in the problem domain. But that's not easy so you can't rely on growing a team to meet the demands of the business. You may need to get by with a smaller group and need to choose tools that really help your productivity.
 </li>

 <li>
     The last point has always been one of the harder things about working in finance - I usually don't have much in the way of technical peers to bounce ideas off of or to get a &quot;second pair of eyes&quot; when debugging. So it's good to pick a language that has a rich community so you can fill that void outside of the office.
 </li>
</ul>
<p>There are also a couple of important lessons I learned from previous jobs in this environment:
</p>
<ol>
 <li><p><em>One language is better than many.</em> I know people will often say you should use the right tool for the job, but in environments I've worked at where using multiple programming languages was the norm, it's always been a disaster. There has always been way too much time spent on integration between the languages, and you will often end up with multiple implementations of the same components. It's also rare that every team member will be proficient in every language, so you don't have the ability to reassign people to different projects as easily.
</p>

 </li>

 <li><p><em>Web applications work well for most problems.</em> Web applications have a lot of benefits even for <em>intranet</em> applications. Deployment is dead simple and updates can be rolled out intraday. Users are familiar with the model and often don't require much training and support. Web applications perform well even for remote users connecting over a VPN. And modern development techniques have demonstrated that the web works nicely even for more dynamic, interactive types of applications.
</p>

 </li>
</ol>

<h2>The Language</h2>

<h3>Lessons Learned</h3>
<p>I've used a number of programming languages over the years but have never really been passionate about any one. I started in C, enjoyed it a lot, and used it in the first few years of my career, and like many others, moved &quot;up&quot; to C++ for the next few years. But I never really <em>liked</em> C++. It always seemed that the benefits from the better OO support were outweighed by the added heft and complexity of the language. Add to that templates and MFC, and you sometimes feel like you are using a <a href="http://en.wikipedia.org/wiki/Rube_Goldberg_machine">Rube Goldberg</a> programming language.
</p>
<p>From there I moved onto the more modern compiled languages, first Java and then later C#/.NET. While I found both those languages a breath of fresh air from the C++ world, I also found the simplicity of the languages lost in the sheer scale of the ecosystem they lived in. The class libraries were very large, but oddly void of some useful libraries (where is the FTP library in C#, for example?).  And the index page of the Java docs for J2EE should borrow a line from <a href="http://en.wikipedia.org/wiki/The_Divine_Comedy">Dante</a> and warn the naive programmer <em>&quot;Abandon all hope, ye who enter here&quot;</em>.
</p>
<p>In the financial services area, there are also lots of little data processing tasks that need to be performed (parsing broker files, web scraping, exporting data to vendors). I found C# too verbose for simple things like parsing a flat file and transforming the lines.
</p>
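<p>To give a feel for the kind of chore I mean, here is a minimal sketch of parsing and transforming a flat file in Python (jumping ahead a little to where this story ends up). The pipe-delimited layout, the field names and the sample rows are all made up for illustration; real broker files vary widely.
</p>

```python
import csv
import io

# Hypothetical broker file: the layout and field names are invented.
SAMPLE = "ticker|side|qty|price\nIBM|B|100|82.50\nMSFT|S|250|27.10\n"

def parse_trades(stream):
    """Yield (ticker, signed quantity, price) for each data line."""
    for row in csv.DictReader(stream, delimiter="|"):
        # Buys become positive quantities, sells negative.
        sign = 1 if row["side"] == "B" else -1
        yield row["ticker"], sign * int(row["qty"]), float(row["price"])

trades = list(parse_trades(io.StringIO(SAMPLE)))
for trade in trades:
    print(trade)
```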
<p>But there were a couple of things I did like about my experience using C#/.NET - it was the <em>only</em> language we used at that firm and the advantages of one language were real and notable:
</p>
<ul>
 <li>
     great reuse due to one common library
 </li>

 <li>
     good collaboration on the team as we all were &quot;thinking in C#&quot; 
 </li>

 <li>
     a sort of network effect of shared knowledge that occurs when everyone is gaining expertise in the same set of tools
 </li>
</ul>

<h3>Interpreting the Options</h3>
<p>With those experiences behind me, I started looking at interpreted programming languages. I had used Perl in the past and quickly ruled it out. I just think that it promotes <a href="http://www.tbray.org/ongoing/When/200x/2003/07/31/PerlAngst">&quot;write only code&quot;</a> and that the language (and community) seem to value tricky one-liners more than explicit, understandable (and maintainable) code.
</p>
<p>So I quickly whittled my choices down to Python and Ruby. After reading a lot of articles, primers and sample code, I found Python a better fit for my needs. I liked the focus on simplicity, explicitness, readability, practicality and &quot;batteries included&quot;. It has great documentation and a very rich and helpful community. Ruby had a lot of the same things going for it but I found the libraries less comprehensive and developed and the community at least seemed smaller when I surfed around for various resources. (And frankly I don't find Ruby code <em>that</em> beautiful like you hear so often from many of its fans.)
</p>

<h2>The Framework</h2>
<p>Once I settled on the language, I wanted to learn the &quot;right&quot; way to build web applications in Python. I had done a fair amount of ASP.NET in C# in my past job and found it lacking. If you stayed within the basics - using standard web controls and following the post-back model - it worked well. But for any real application, you quickly needed to go &quot;outside the lines&quot; where ASP.NET quickly became both complex and confining. Also, I just don't think that trying to make web programming like VB GUI programming is the right conceptual model for web apps.
</p>
<p>In researching Ruby, I of course had run across Ruby on Rails. In some ways it's probably more popular than Ruby itself. Python had a number of frameworks and tools to build web applications, but three stood out: <a href="http://www.turbogears.org/">Turbo Gears</a>, <a href="http://www.djangoproject.com/">Django</a> and <a href="http://webpy.org/">web.py</a>. Turbo Gears and Django are higher level, full featured systems similar to Rails, while web.py is a much simpler and smaller toolkit, but quite compelling, nonetheless.
</p>
<p>Since I was really looking for big productivity dividends, I ruled out web.py. There were too many things that the other frameworks had that I would need and I would end up having to build (or locate and integrate) on my own. While I liked both Turbo Gears and Django, I liked the consistency of Django that comes from a single complete project rather than a &quot;best of breed&quot; approach that Turbo Gears takes.
</p>

<h2>Results</h2>
<p>I've now been using Python and Django for about 10 months and am <em>very</em> happy with my decision. I've written <a href="http://davidavraamides.net/blog/2006/05/11/yet-another-django-blog/">this blog</a> and a <a href="http://davidavraamides.net/blog/2006/07/27/getting-things-done-django-style-part-1/">simple todo app</a> for the fun of it, and have built up about 12k lines of code on my work projects.  At work I built up a nice collection of tools to help my dev process mostly by using standard Python packages. These include 
</p>
<ul>
 <li>
     a web app for monitoring and managing scheduled tasks
 </li>

 <li>
     a unit testing framework that produces a complete web site of coverage charts, statistics and decorated source listings 
 </li>

 <li>
     automated code documentation site
 </li>
</ul>
<p>The more I learn about both Python and Django, the better I feel about the direction I've taken. Python seems to be a language that is simple when you are learning, but rich and powerful when you need it to be. Django shares these same qualities, and while still in its infancy, has already proven itself in many large scale web sites.
</p>
<p>But probably the best thing I can say about Python is that I really enjoy using it. And <em>that</em> is &quot;Why Python.&quot;
</p><br />]]></content:encoded></item><item><title>Jumping Out of Windows</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2006-08-25T23:14:48-04:00</dc:date><link>http://davidavraamides.net/files/jumping-out-windows.html#unique-entry-id-2</link><guid isPermaLink="true">http://davidavraamides.net/files/jumping-out-windows.html#unique-entry-id-2</guid><content:encoded><![CDATA[<h2>Sold!</h2>
<p>So now, a short two months later, I can safely say I'm done <em>evaluating</em> the Mac and am now working on <em>migrating</em> to it completely. My MacBook has become my day-to-day personal machine and I really only use my existing Windows workstations at home to surf when I happen to be right in front of the machine.
</p>
<p>I've been doing a test run with different types of data and applications to find the best way (for me) to use the Mac and to decide if there's any type of conversion I need to go through as I migrate my data. I ripped a handful of new music CDs and imported about 400 photos so I could play around with some multimedia content. I made a nice iPhoto book of my recent vacation in Vermont (way cool!), which has been a big hit with the grandparents. I checked out my subversion repository to try programming on the Mac, and outside of a few browser rendering quirks between IE and Safari, everything worked fine.
</p>
<p>I had been using Microsoft Money for a long time to track my personal finances but was not really thrilled with the app (and the underwhelming annual upgrades). I did a lot of reading and reviewing of other products and finally settled on <a href="http://moneydance.com/">Moneydance</a>. Migrating from my Money files wasn't that difficult - lots of people had done the same thing and posted notes to the discussion group - but I also wanted to clean up my data along the way and simplify my categorizations so there was a fair amount of work in getting it set up.
</p>
<p>There was also one Windows-specific app that I'd become quite used to: <a href="http://keepass.sourceforge.net/">KeePass</a>. It's an application for managing the many passwords we all accumulate these days. It is an open-source application, but was developed for Windows so I wasn't quite sure how I was going to migrate the password database, plus I use it at both home and work so I wanted to find something that would work on OS X and Windows (so Keychain was out). It turns out I wasn't the only one with this problem as there is now a project for a Linux/OS X port of the application called <a href="http://keepassx.sourceforge.net/">KeePassX</a>. Although it's lacking some of the cooler features of the Windows version (like auto-type), it supports the same database format and that's what was important to me.
</p>

<h2>Backups</h2>
<p>I've always been pretty good about backing up my data regularly and over the years had settled on scheduled tasks on all my machines that use <a href="http://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&amp;displaylang=en">Robocopy</a> to mirror my data to my server, and then to mirror the server to a USB hard drive. I pretty quickly learned that <code>rsync</code> was the analogous technique for a *nix-based OS. A little googling led me to <a href="http://rsyncbackup.erlang.no/">rsyncbackup</a> which uses a Perl script to manage a set of <em>sources</em>, <em>destinations</em> and <em>backupsets</em> and invoke <code>rsync</code> to simplify the backup process. I configured it for a USB thumb drive, my home NAS and my externally-hosted Linux server.
</p>
<p>Now that I have a feel for how I want to back up my data, I'll probably just write a Python script to drive the process. While <code>rsyncbackup</code> is pretty nice, it's a little more complicated than necessary (about half of the script is dedicated to parsing the config files, which I would just hard-wire in as Python <code>dict</code>s). And it's written in Perl: been there, done that, won't repeat.
</p>
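<p>Roughly, I have something like the following in mind. This is only a sketch: the backup set names, paths and host are placeholders, and it defaults to a dry run that prints each command instead of executing it. The only <code>rsync</code> options assumed are the standard <code>-a</code> (archive) and <code>--delete</code>.
</p>

```python
import subprocess

# Hypothetical backup sets: the names, paths and host are placeholders.
BACKUP_SETS = {
    "documents": {"source": "/Users/me/Documents/",
                  "dest": "/Volumes/Backup/Documents/"},
    "photos": {"source": "/Users/me/Pictures/",
               "dest": "backup@example.com:photos/"},
}

def rsync_command(backup_set):
    """Build the rsync argument list that mirrors one backup set."""
    # -a preserves permissions and timestamps; --delete removes files
    # from the destination that no longer exist in the source.
    return ["rsync", "-a", "--delete",
            backup_set["source"], backup_set["dest"]]

def run_backups(backup_sets, dry_run=True):
    """Run (or, by default, just print) the command for every set."""
    for name in sorted(backup_sets):
        command = rsync_command(backup_sets[name])
        if dry_run:
            print(name, " ".join(command))
        else:
            subprocess.check_call(command)

run_backups(BACKUP_SETS)
```

<p>Hard-wiring the configuration as a <code>dict</code> trades flexibility for brevity: there is no config parser to maintain, but changing a path means editing the script.
</p>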
<p>With the recent <a href="http://events.apple.com.edgesuite.net/aug_2006/event/index.html">demonstration</a> of Leopard (OS X 10.5) at the WWDC, I'm very curious to play around with <a href="http://www.apple.com/macosx/leopard/timemachine.html">Time Machine</a>. I think that will augment my backup strategy, but not replace it. I think Time Machine will mostly eliminate the need to recover an accidentally deleted file from an external backup, but you'll still want to back up your current data regularly to some external devices - even <a href="http://en.wikipedia.org/wiki/Timeline_%28novel%29">time machines break</a> once in a while.
</p>

<h2>Open Data Formats</h2>
<p>As I've put more and more effort into building, organizing and cleaning my personal data library (music, photos, etc.), I've become more and more concerned about choosing an approach that gives me control over <em>my data</em>. Mark Pilgrim has <a href="http://diveintomark.org/archives/2006/06/02/when-the-bough-breaks">written about this</a> and for him the solution was to move <em>away</em> from the Mac. While I understand his line of reasoning, I'm not driven as much by the principles between open and proprietary file formats, but more by the pragmatism of being able to do what I want. <em>(Or maybe I'm just one OS behind Mark?)</em>
</p>
<p>So I did a little research into Moneydance, iTunes and iPhoto (the most relevant applications for me right now, with respect to data ownership issues) and have found ways to preserve my data and metadata in ways that allow me to: 1) restore the information if data is lost or corrupted, and 2) export the data to other applications if I ever feel like I've outgrown any of the applications. Of course, this isn't as easy as it <em>could</em> be (i.e. simple import/export features built into each application). I'll have to resort to a little coding to make it work, but I'm fine with that.
</p>

<h2>Diving In</h2>
<p>I'm now planning out how to migrate my other PCs to Macs. The only complication is that my wife is pretty set on Windows so I don't think I can cut her over (not without <em>lots</em> of tech support calls to my office). I'm going to first replace my other two desktop PCs at home with a 17&quot; iMac and either another 20&quot; iMac or a Mac Pro. To support my wife's need for Windows, I've worked out how to set up her Mac login such that it will automatically run Parallels in full screen when she logs in. To her it will look just like a Windows machine, but over time she'll have the opportunity to get more familiar with OS X without being forced to switch overnight.
</p>
<p>The last bit of the puzzle is to get a Mac mini and use it as a home theater PC for viewing photos, listening to music and watching videos. It works pretty well out of the box with the Mac remote and Front Row, but there are other HTPC apps out there that are worth researching. There are also some cabling issues that I need to look into, but I've learned enough to know that it will work nicely. And it will look cool. <em>(Why are all the Windows-based HTPCs big, slow and expensive?)</em>
</p>
<p>So after dipping my toes in the Apple <del>Kool-aid</del> water, I'm ready to dive right in...
</p><br />]]></content:encoded></item><item><title>Page Stats Middleware</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Programming</category><dc:date>2006-07-03T21:38:09-04:00</dc:date><link>http://davidavraamides.net/files/page-stats-middleware.html#unique-entry-id-6</link><guid isPermaLink="true">http://davidavraamides.net/files/page-stats-middleware.html#unique-entry-id-6</guid><content:encoded><![CDATA[<img class="imageStyle" alt="stats" src="http://davidavraamides.net/files//page2_blog_entry6_1.png" width="357" height="54"/><br /><p>I wanted to know the time spent generating the page in total as well as the breakdown between Python (Django) and the database. Additionally, I wanted to know how many times the database was queried as too many queries can be a hint that you aren't accessing the database efficiently.<br />

</p>
<p>Since I wanted the feature to work across any Django app, a custom middleware class seemed like the right approach. Writing a custom middleware is quite easy and, like so many other things in Django, <a href="http://www.djangoproject.com/documentation/middleware/">well documented</a>.
</p>

<h2>Calculating the Metrics</h2>
<p>Luckily, the Django developers had already thought of the value of collecting some basic database statistics, so I simply had to figure out how to access them. When <code>DEBUG</code> mode is on (through your settings file), a database backend class wraps a <code>cursor</code> in a <code>CursorDebugWrapper</code> object which keeps a list of each SQL query executed along with its execution time. I just had to make sure <code>DEBUG</code> was enabled, if not already, and then add up the time of any calls incurred during this page's invocation. 
</p>
<p>The total time to process the view is measured in the middleware's <code>process_view</code> routine by timing the entire view call. The time spent in Python (i.e. Django) is then easy to back out by subtracting the database time from that total.
</p>
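<p>The arithmetic itself is simple. Here is a standalone sketch of it, assuming each recorded query is a dict with <code>sql</code> and <code>time</code> keys (the time being a string of seconds), which is the shape Django's debug cursor stores in <code>connection.queries</code>. The function name and the sample numbers are made up for illustration.
</p>

```python
def split_timings(queries, total_seconds):
    """Return (query count, database seconds, Python seconds)."""
    # Each entry records its own elapsed time as a string of seconds.
    db_seconds = sum(float(query["time"]) for query in queries)
    # Whatever is left of the total was spent outside the database.
    return len(queries), db_seconds, total_seconds - db_seconds

# Stand-in for the queries recorded while one view was executing.
sample_queries = [
    {"sql": "SELECT ...", "time": "0.010"},
    {"sql": "SELECT ...", "time": "0.005"},
]

count, db_time, python_time = split_timings(sample_queries, 0.050)
print("%d queries, db %.3fs, python %.3fs" % (count, db_time, python_time))
```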

<h2>Viewing the Stats</h2>
<p>The last part of the puzzle was how to put this information in the page itself. Normally in Django you would use a custom template tag for such a chore but I couldn't since I wouldn't have completed timing the view call until the template had already been rendered.
</p>
<p>I decided to follow the spirit of template substitution by creating a special HTML comment that described the format of the output and then replacing the placeholder with the formatted output as the middleware returned the view to its caller.
</p>

<h2>The <code>StatsMiddleware</code> Class</h2>
<p>The code is pretty straightforward. First, I save the state of the debug setting and then enable debugging. This triggers the use of the debug wrapper class in the database backend, which keeps the stats for the database part of the execution. Then I save the length of the connection's <code>queries</code> list, where this debug info is stored, so I know which queries were added during this view's invocation. Then I call the view, keeping track of its time.
</p>
<div class="codehilite"><pre><span class="k">import</span> <span class="nn">re</span>
<span class="k">from</span> <span class="nn">operator</span> <span class="k">import</span> <span class="n">add</span>
<span class="k">from</span> <span class="nn">time</span> <span class="k">import</span> <span class="n">time</span>
<span class="k">from</span> <span class="nn">django.db</span> <span class="k">import</span> <span class="n">connection</span>

<span class="k">class</span> <span class="nc">StatsMiddleware</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>

    <span class="k">def</span> <span class="nf">process_view</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">request</span><span class="p">,</span> <span class="n">view_func</span><span class="p">,</span> <span class="n">view_args</span><span class="p">,</span> <span class="n">view_kwargs</span><span class="p">):</span>

        <span class="c"># turn on debugging in db backend to capture time</span>
        <span class="k">from</span> <span class="nn">django.conf</span> <span class="k">import</span> <span class="n">settings</span>
        <span class="n">debug</span> <span class="o">=</span> <span class="n">settings</span><span class="o">.</span><span class="n">DEBUG</span>
        <span class="n">settings</span><span class="o">.</span><span class="n">DEBUG</span> <span class="o">=</span> <span class="bp">True</span>

        <span class="c"># get number of db queries before we do anything</span>
        <span class="n">n</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">connection</span><span class="o">.</span><span class="n">queries</span><span class="p">)</span>

        <span class="c"># time the view</span>
        <span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="p">()</span>
        <span class="n">response</span> <span class="o">=</span> <span class="n">view_func</span><span class="p">(</span><span class="n">request</span><span class="p">,</span> <span class="o">*</span><span class="n">view_args</span><span class="p">,</span> <span class="o">**</span><span class="n">view_kwargs</span><span class="p">)</span>
        <span class="n">totTime</span> <span class="o">=</span> <span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">start</span>

        <span class="c"># compute the db time for the queries just run</span>
        <span class="n">queries</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">connection</span><span class="o">.</span><span class="n">queries</span><span class="p">)</span> <span class="o">-</span> <span class="n">n</span>
        <span class="k">if</span> <span class="n">queries</span><span class="p">:</span>
            <span class="n">dbTime</span> <span class="o">=</span> <span class="nb">reduce</span><span class="p">(</span><span class="n">add</span><span class="p">,</span> <span class="p">[</span><span class="nb">float</span><span class="p">(</span><span class="n">q</span><span class="p">[</span><span class="s">&#39;time&#39;</span><span class="p">])</span> 
                                  <span class="k">for</span> <span class="n">q</span> <span class="ow">in</span> <span class="n">connection</span><span class="o">.</span><span class="n">queries</span><span class="p">[</span><span class="n">n</span><span class="p">:]])</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">dbTime</span> <span class="o">=</span> <span class="mf">0.0</span>

        <span class="c"># and back out the python time</span>
        <span class="n">pyTime</span> <span class="o">=</span> <span class="n">totTime</span> <span class="o">-</span> <span class="n">dbTime</span>

        <span class="c"># restore debugging setting</span>
        <span class="n">settings</span><span class="o">.</span><span class="n">DEBUG</span> <span class="o">=</span> <span class="n">debug</span>

        <span class="n">stats</span> <span class="o">=</span> <span class="p">{</span>
            <span class="s">&#39;totTime&#39;</span><span class="p">:</span> <span class="n">totTime</span><span class="p">,</span>
            <span class="s">&#39;pyTime&#39;</span><span class="p">:</span> <span class="n">pyTime</span><span class="p">,</span>
            <span class="s">&#39;dbTime&#39;</span><span class="p">:</span> <span class="n">dbTime</span><span class="p">,</span>
            <span class="s">&#39;queries&#39;</span><span class="p">:</span> <span class="n">queries</span><span class="p">,</span>
            <span class="p">}</span>

        <span class="c"># replace the comment if found            </span>
        <span class="k">if</span> <span class="n">response</span> <span class="ow">and</span> <span class="n">response</span><span class="o">.</span><span class="n">content</span><span class="p">:</span>
            <span class="n">s</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">content</span>
            <span class="n">regexp</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span><span class="s">r&#39;(?P&lt;cmt&gt;&lt;!--\s*STATS:(?P&lt;fmt&gt;.*?)--&gt;)&#39;</span><span class="p">)</span>
            <span class="n">match</span> <span class="o">=</span> <span class="n">regexp</span><span class="o">.</span><span class="n">search</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
            <span class="k">if</span> <span class="n">match</span><span class="p">:</span>
                <span class="n">s</span> <span class="o">=</span> <span class="n">s</span><span class="p">[:</span><span class="n">match</span><span class="o">.</span><span class="n">start</span><span class="p">(</span><span class="s">&#39;cmt&#39;</span><span class="p">)]</span> <span class="o">+</span> \
                    <span class="n">match</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="s">&#39;fmt&#39;</span><span class="p">)</span> <span class="o">%</span> <span class="n">stats</span> <span class="o">+</span> \
                    <span class="n">s</span><span class="p">[</span><span class="n">match</span><span class="o">.</span><span class="n">end</span><span class="p">(</span><span class="s">&#39;cmt&#39;</span><span class="p">):]</span>
                <span class="n">response</span><span class="o">.</span><span class="n">content</span> <span class="o">=</span> <span class="n">s</span>

        <span class="k">return</span> <span class="n">response</span>
</pre></div>
<p>After the view is called, I restore the debug setting and save the stats in a <code>dict</code> which will be used for argument replacement during the output formatting step. As I said earlier, I use a sort of <em>poor-man's templating</em> to render the stats into the page on output. An HTML comment of the form
</p>
<div class="codehilite"><pre><span class="c">&lt;!-- STATS: format_string --&gt;</span>
</pre></div>
<p>will be replaced with the output of <code>format_string</code> after formatting against the <code>stats</code> dict. This blog uses the string:
</p>
<div class="codehilite"><pre><span class="c">&lt;!-- STATS: Total: %(totTime).2f Python: %(pyTime).2f</span>
<span class="c">     DB: %(dbTime).2f Queries: %(queries)d --&gt;</span>
</pre></div>
<p>I put this inside the footer <code>&lt;div&gt;</code> and format through the CSS style sheet.
</p>
<p>The replacement block in the code deserves a couple of comments. I wanted to make sure the string formatting was not applied against the entire content of the page so that random <code>%</code> symbols wouldn't cause problems. So I search for the special comment tag and then apply the formatting only to the subsection where it was found. The resulting content is then assembled from the untouched bounding parts of the original content and the modified subsection.
</p>
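<p>Here is a minimal standalone sketch of that replacement step. It is not the middleware's exact code: it uses an unnamed group, trims whitespace around the format string, and adds <code>re.DOTALL</code> (my assumption) so a format string that wraps across lines still matches. <code>fill_stats</code> is a hypothetical helper name.</p>

```python
import re

# Group 1 captures the format string between "STATS:" and the end of
# the comment, without the surrounding whitespace.
STATS_RE = re.compile(r'<!--\s*STATS:\s*(.*?)\s*-->', re.DOTALL)


def fill_stats(html, stats):
    """Replace the first STATS comment with its format string applied to stats."""
    match = STATS_RE.search(html)
    if not match:
        return html
    # Format only the captured text. The rest of the page never goes
    # through the % operator, so a stray % elsewhere cannot raise.
    rendered = match.group(1) % stats
    return html[:match.start()] + rendered + html[match.end():]


page = 'Sale: 50% off <!-- STATS: %(queries)d queries in %(totTime).2fs -->'
print(fill_stats(page, {'queries': 4, 'totTime': 1.234}))
```

<p>Running <code>page % stats</code> on the whole string instead would fail on the <code>50%</code>, which is exactly the problem the slicing avoids.</p>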

<h2>Immediately Helpful</h2>
<p>Soon after I got this working, it helped me identify a logic bug in my code. I had created a view that showed the position of an investment fund in a fairly large table (about 700 rows). The page took a few seconds to build and render. The statistics at the bottom of the page showed a whopping 2813 queries! Yikes!
</p>
<p>The reason became immediately clear: each row in the table displayed a <code>Position</code> object as well as fields from the four other models it had <code>ForeignKey</code> relations with. Thus, each time a row was evaluated in the template, there were four individual database queries to look up the related data. I simply added <code>select_related()</code> to the end of my main query and the query count dropped to 4 and the execution time was cut to less than half the original 2.5 seconds. Nice.
</p>]]></content:encoded></item><item><title>Macbook Impressions</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Technology</category><dc:date>2006-06-28T17:52:13-04:00</dc:date><link>http://davidavraamides.net/files/macbook-impressions.html#unique-entry-id-5</link><guid isPermaLink="true">http://davidavraamides.net/files/macbook-impressions.html#unique-entry-id-5</guid><content:encoded><![CDATA[<p>I had been using a Sony Vaio PCG-TR3P for the last two years and was very happy with it: it's small, light and fairly full-featured. But two years in laptop-time is pretty old and I was ready to get a new portable based on the Intel Duo. I quickly narrowed my choice down to the Lenovo X60 and the Macbook.
</p>
<p>Even though I'm mostly a Windows guy, I was intrigued by the Mac and impressed by what I'd seen from other Mac owners. The two factors that pushed me to side with the Mac were: 1) Windows compatibility through the Intel chip, and 2) changes in how I use a PC over time.
</p>
<p>The Wintel compatibility with tools like Bootcamp and Parallels seemed like a nice insurance policy, if I really <em>needed</em> to do something in Windows. But the second factor was really more important. I typically use my home PC for four things which all work fine on my Mac:
</p>
<ol>
 <li>
     Surfing: most websites look and work fine in Safari, Firefox or IE. No big surprise there.
 </li>

 <li>
     Email: we use Exchange at work but there are a number of ways to access it from the Mac: the native Mail app can work with Exchange, you can use Microsoft's Entourage (essentially the Mac version of Outlook) or you can use Exchange's web interface - Outlook Web Access. I've tried them all so far and Entourage seems to be the nicest way to go.
 </li>

 <li>
     Development: my development &quot;stack&quot; is now all open-source tools that are available on Windows and OS X as well as many other platforms like Linux. I use Apache, Python, MySQL and Django and have found no issues in running my apps on Windows, OS X or Linux for that matter (case in point: <a href="/blog/2006/05/15/about-site/">this site</a>).
 </li>

 <li>
     Media: or more specifically, organizing and working with photos, scans, music and video files. Since the data formats themselves are portable and well-supported, it comes down to which applications you prefer. This is a case where the Mac probably has more to offer and certainly is a strong selling point for the machine.
 </li>
</ol>

<h2>OS X</h2>
<p>I <em>really</em> like OS X. It's intuitive, responsive, consistent and slick. Most people rave about the user interface - which is great - but I was really impressed with the integration between the modern GUI and the Unix layer beneath the covers. Other attempts I'd seen at this on Linux feel like a messy collection of glorified &quot;etc&quot; file editors, with no commonality or integration between them. In Linux I feel like the GUI just gets in the way and I often don't even bother installing X. But under OS X, the Terminal is really there as an extension when needed, which I'm finding I need less and less.
</p>
<img class="imageStyle" alt="osx" src="http://davidavraamides.net/files//page2_blog_entry5_1.png" width="507" height="413"/>

<p>I haven't had my machine long enough to comment on stability. I assume it's very good but to be honest, I've had very good luck with Windows XP in that respect, too, so it's more the type of thing I've come to expect in a mature, modern operating system, rather than be surprised by.
</p>

<h2>Hooked?</h2>
<p>It's too early to say if these are the first few steps towards a shift away from Windows in my home, or if I'll be content to just have the one-off Mac that I use only for myself. (<em>Or</em> if you'll be seeing my Macbook on eBay once the honeymoon is over! But that seems doubtful at this point.)
</p>
<p>I have to admit that I simply like using my Mac - something I haven't really experienced since Windows 95 shipped. Sure, I've been happy when I've bought new PCs in the past, but even though the PC might be new, Windows is getting fairly old ... and just a little bit stale. At this point I'm not missing my old laptop or Windows for my home and casual PC use.
</p>
<p>It's impressive how Apple consistently seems to outshine its competitors in the areas of design and usability. The iPod is a great example, but you can find it in large, complex products like the Macbook and OS X as well as simpler products like their AC adapter or the iPod's lanyard headphones.
</p>
<p>Apple must have a culture that values great design as it's so prevalent in their entire product line. Whatever it is, I'm glad they have it. And I'm glad I bought a Mac.
</p><br />]]></content:encoded></item><item><title>Yet Another Django Blog</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Programming</category><dc:date>2006-05-11T22:17:47-04:00</dc:date><link>http://davidavraamides.net/files/yet-another-django-blog.html#unique-entry-id-1</link><guid isPermaLink="true">http://davidavraamides.net/files/yet-another-django-blog.html#unique-entry-id-1</guid><content:encoded><![CDATA[<blockquote><p><strong>Update 1/14/2008:</strong> While this site is no longer built using Django, I thought I'd
   keep my Django-related posts up here as I've gotten a decent amount of feedback from
   people doing similar projects. 
</p>
</blockquote><p>I've recently cut over from WordPress to a new blog which I developed using
   <a href="http://www.djangoproject.com">Django</a>. I've been using Django at work for
   the past month and as I've gotten more familiar with it, I decided building
   a completely new blog would actually be fun and fairly simple. Besides, I've
   read about a <a href="http://www.rossp.org/blog/2006/jan/23/building-blog-django-1/">few</a>
   <a href="http://www2.jeffcroft.com/2006/may/02/django-non-programmers/">other</a>
   <a href="http://www.socialistsoftware.com/post/socialist-software-now-powered-django/">people</a>
   who have done the same thing and they say the hardest part was converting over
   their old blog entries. That should be cake for me as I only have a handful of posts
   to worry about!
</p>
<p>If you haven't heard of it yet, Django is a framework for building web applications
   written in Python. It's an open-source project that spun out of an online news
   organization and thus has its roots in CMS applications. But it has now grown
   to support many other kinds of web applications. I stumbled onto Django when I
   was looking for a better way to build software after using C#/.NET and ASP.NET
   for the last two years. That led me to Python and I quickly found myself leaving
   C# behind with no regrets. Once I was sold on Python, I started looking at the &quot;right&quot;
   way to build web apps in Python. While there really isn't an answer to that question,
   Django comes pretty darn close.
</p>

<h2>Building the Blog</h2>
<p>After reading the posts above about other Django-based blogs, I sat down to start
   my own. It took me about an hour to get a first cut of a working blog together,
   which I think is great, but some people seem fixated on the 15- or 20-minute
   project. I was basing some of my design on other people's older code base which
   changed considerably with the &quot;magic-removal&quot; release so I had to convert things
   over (using the sophisticated engineering technique of: <em>refresh the browser, read
the error message, fix the code, rinse, repeat</em>).
</p>
<p>I started with these fairly basic models for a <code>Tag</code> and a <code>Post</code> which give
   you useful enough models for a surprisingly workable blog.
</p>
<div class="codehilite"><pre><span class="k">class</span> <span class="nc">Tag</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
    <span class="n">slug</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">SlugField</span><span class="p">(</span><span class="n">prepopulate_from</span><span class="o">=</span><span class="p">(</span><span class="s">&#39;title&#39;</span><span class="p">,),</span> <span class="n">primary_key</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">title</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">maxlength</span><span class="o">=</span><span class="mf">30</span><span class="p">)</span>
    <span class="n">description</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">TextField</span><span class="p">(</span><span class="n">help_text</span><span class="o">=</span><span class="s">&#39;Short summary of this tag&#39;</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">title</span>

    <span class="k">def</span> <span class="nf">get_absolute_url</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="s">&quot;/blog/tag/</span><span class="si">%s</span><span class="s">/&quot;</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">slug</span>

    <span class="k">class</span> <span class="nc">Admin</span><span class="p">:</span>
        <span class="n">list_display</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;slug&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,)</span>
        <span class="n">search_fields</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;description&#39;</span><span class="p">,)</span>

    <span class="k">class</span> <span class="nc">Meta</span><span class="p">:</span>
        <span class="n">ordering</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;title&#39;</span><span class="p">,)</span>
</pre></div>
<p>A <code>Tag</code> is really just a unique string used to categorize posts. In
   hindsight, I could have made this even simpler as I'm really just using
   a <code>Tag</code>'s <code>slug</code> field. A <code>Post</code> is the model that really makes it a blog.
   It's also pretty simple, with the obvious fields: <code>title</code>, <code>date</code> and <code>body</code>,
   plus the many-to-many <code>tags</code> relationship. It also has a <code>slug</code> field
   which is like a human-readable key to make it easier to refer to an
   object by a friendly name rather than a number.
</p>
<div class="codehilite"><pre><span class="k">class</span> <span class="nc">Post</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
    <span class="n">slug</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">SlugField</span><span class="p">(</span><span class="n">prepopulate_from</span><span class="o">=</span><span class="p">(</span><span class="s">&#39;title&#39;</span><span class="p">,),</span> <span class="n">primary_key</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">tags</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">ManyToManyField</span><span class="p">(</span><span class="n">Tag</span><span class="p">)</span>
    <span class="n">title</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">maxlength</span><span class="o">=</span><span class="mf">80</span><span class="p">)</span>
    <span class="n">date</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">DateTimeField</span><span class="p">(</span><span class="n">auto_now_add</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">body</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">TextField</span><span class="p">()</span>

    <span class="k">def</span> <span class="nf">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">title</span>

    <span class="k">def</span> <span class="nf">get_absolute_url</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="s">&quot;/blog/</span><span class="si">%s</span><span class="s">/</span><span class="si">%s</span><span class="s">/&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">date</span><span class="o">.</span><span class="n">strftime</span><span class="p">(</span><span class="s">&quot;%Y/%m/</span><span class="si">%d</span><span class="s">&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">lower</span><span class="p">(),</span>
                                 <span class="bp">self</span><span class="o">.</span><span class="n">slug</span><span class="p">)</span>

    <span class="k">class</span> <span class="nc">Admin</span><span class="p">:</span>
        <span class="n">list_display</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;slug&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;date&#39;</span><span class="p">,)</span>
        <span class="n">search_fields</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;body&#39;</span><span class="p">,)</span>
        <span class="n">date_hierarchy</span> <span class="o">=</span> <span class="s">&#39;date&#39;</span>

    <span class="k">class</span> <span class="nc">Meta</span><span class="p">:</span>
        <span class="n">get_latest_by</span> <span class="o">=</span> <span class="s">&#39;date&#39;</span>
        <span class="n">ordering</span> <span class="o">=</span> <span class="p">(</span><span class="s">&#39;-date&#39;</span><span class="p">,)</span>
</pre></div>
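<p>To make the slug idea concrete, here is a small sketch that runs without Django. It derives a slug from a title and builds a URL in the same shape as <code>Post.get_absolute_url</code>. Both <code>make_slug</code> and <code>post_url</code> are hypothetical helpers, and the slug rule is a simplification of my own rather than Django's.</p>

```python
import re
from datetime import datetime


def make_slug(title):
    """Lower-case the title and join runs of letters and digits with hyphens."""
    words = re.findall(r'[a-z0-9]+', title.lower())
    return '-'.join(words)


def post_url(date, slug):
    """Mirror the Post.get_absolute_url pattern: /blog/YYYY/MM/DD/slug/."""
    return "/blog/%s/%s/" % (date.strftime("%Y/%m/%d"), slug)


slug = make_slug("Yet Another Django Blog")
print(slug)
print(post_url(datetime(2006, 5, 11), slug))
```

<p>Because the slug is the primary key in these models, two posts with the same title would collide, which is one reason the date is also part of the URL.</p>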
<p>Now that I'm a little more familiar with the problem, I could probably start
   from scratch and do it again in under 30 minutes, but <em>why?</em> In fact, I think
   someone should make a video of a <em>real</em> development session with all the typical
   mistakes and error messages, along with a voice over of how the user figured
   out the problem from the errors. Now <em>that</em> would be useful.
</p>
<p>After getting the basic features done, I started working on the little tweaks that
   would make it more polished and easy to use. Some of these things came very easy but
   others were a little trickier. I'll post my comments on some of these features in
   the near future, as the information might be useful to others.
</p>]]></content:encoded></item><item><title>How Do I Copy Thee? Let Me Count the Ways</title><dc:creator>david.avraamides@mac.com</dc:creator><category>Programming</category><dc:date>2005-05-07T10:44:50-04:00</dc:date><link>http://davidavraamides.net/files/strcpy.html#unique-entry-id-7</link><guid isPermaLink="true">http://davidavraamides.net/files/strcpy.html#unique-entry-id-7</guid><content:encoded><![CDATA[<h2>Why <code>strcpy</code>?</h2>
<p>First I should start off with a bit of a disclaimer: this isn't the only question I ask as part of an interview process. On the contrary, I put candidates through a phone screen, have them take a written quiz (of which this is one of about 40 questions), and follow all that up with a take-home programming problem. No, the strcpy question is just one data point I use when evaluating programmers ... but it is probably one of the best.
</p>
<p>To review, <code>strcpy</code> is the function for copying one string to another that is part of the standard library of the C programming language. Sounds simple, but asking someone to code it up is a bit of a loaded question, and that's what makes it so great for interviews. To implement the function, you have to know what it does and how strings are represented in C, which gets to one of the important design choices of C's creators: there is no native string type in C. Instead, a string is represented as a sequence of bytes (<code>char</code> values) in memory terminated with a byte containing zero. A pointer to the first byte in the sequence represents the string, but it's really more of a convention than a first-class data type (although the compiler helps you a little by automatically appending a zero to string constants in memory).
</p>
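<p>If you want to see that layout without writing any C, Python's <code>ctypes</code> module exposes it. This is only an illustrative sketch: <code>c_strlen</code> is a hypothetical helper that mimics what a C string's "length" really is, a scan for the first zero byte.</p>

```python
import ctypes

# create_string_buffer copies the bytes and appends the terminating zero,
# much as a C compiler does for a string constant.
buf = ctypes.create_string_buffer(b"hello")

print(ctypes.sizeof(buf))   # six bytes: five characters plus the terminator
print(buf.raw)              # the full buffer, including the zero byte
print(buf.value)            # the bytes up to, not including, the first zero


def c_strlen(raw):
    """Count the bytes before the first zero byte."""
    n = 0
    while raw[n] != 0:
        n += 1
    return n


print(c_strlen(buf.raw))
```

<p>Nothing in the buffer records its length; the zero byte is the only thing that marks where the string ends.</p>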
<p>Then there is the design of <code>strcpy</code> itself. It was designed to have the feel of an assignment statement for strings so where you might say 
</p>
<div class="codehilite"><pre><span class="kt">int</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">b</span><span class="p">;</span>
</pre></div>
<p>to copy the value from <code>b</code> into <code>a</code>, you would likewise use
</p>
<div class="codehilite"><pre><span class="kt">char</span> <span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="o">*</span><span class="n">b</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">strcpy</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">);</span>
</pre></div>
<p>to copy the string value from <code>b</code> to <code>a</code>, that is, the destination is the first parameter (left-hand side) and the source is the second parameter (right-hand side), just like an assignment.
</p>
<p>Another feature of C is that assignment statements themselves have a well-defined value and this is often used in shortcuts like
</p>
<div class="codehilite"><pre><span class="kt">int</span> <span class="n">val</span><span class="p">;</span>
<span class="k">if</span> <span class="p">((</span><span class="n">val</span> <span class="o">=</span> <span class="n">getval</span><span class="p">())</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
    <span class="c">// do something with val</span>
<span class="p">}</span>
</pre></div>
<p>where we can call a function that returns a value, assign it to a variable and test its value in one expression. A (little-known) feature of <code>strcpy</code> is that it returns the value of the destination pointer as a convenience to the caller so you can copy a string, and use the value in one expression as in
</p>
<div class="codehilite"><pre><span class="kt">char</span> <span class="o">*</span><span class="n">a</span><span class="p">,</span> <span class="o">*</span><span class="n">b</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">printf</span><span class="p">(</span><span class="s">&quot;%s</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">strcpy</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">));</span>
</pre></div>
<p>Admittedly, not a feature I've used much, but experienced C programmers are usually aware it's there.
</p>
<p>Another interesting thing about implementing <code>strcpy</code> is that the solution is introduced, discussed and refined in the C &quot;bible&quot;: <em>The C Programming Language</em> by Brian Kernighan and Dennis Ritchie, more commonly referred to as the &quot;K&amp;R book.&quot;  Anyone who really wanted to learn C would have at least heard of this book and hopefully been curious enough to read it. If they had (and they paid attention), then they would already know the answer. Experience has shown me, though, that it doesn't appear to be as popular as I would have thought - at least not with the candidates that have crossed my path.
</p>

<h2>Phrasing the Question</h2>
<p>The <code>strcpy</code> question was one of a number of problems I put on a written quiz for interviewing programmers. I liked having candidates write it on paper because it avoided the problem of working on a computer where different programmers use different tools - everyone should be equally familiar with paper and pencil. It also avoided the distractions of a development environment where the person might be messing around with build settings and key bindings rather than focusing on the problem at hand.
</p>
<p>As for how to phrase the question, initially I was wary of giving away too much with the signature of the function, and part of the reason for asking this specific question was to see if candidates were familiar with such a ubiquitous function, so my first version was simply
</p>
<blockquote><p>Implement the C standard library function: strcpy
</p>
</blockquote><p>I figured most people would know what it was and what it did so there wasn't really a need to be more explicit. And if they didn't know the function, then that told me a lot - maybe more than whether or not they could code it. You have to understand, the interviewee had already done well enough on a phone screen to be invited in for a round of interviews. I figured wrong. Asking the question in this way led to very few correct answers. Some people had not heard of the function, many people who knew what it did got the source and destination arguments mixed up and practically no one got the return value right. So for the second version I gave up on people knowing the correct signature from memory and rephrased the question to be more explicit (and more in line with ANSI C):
</p>
<blockquote><p>Implement the following C library function:
</p>
</blockquote><div class="codehilite"><pre><span class="kt">char</span><span class="o">*</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">s1</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">s2</span><span class="p">)</span> <span class="p">{</span>
    <span class="p">...</span>
<span class="p">}</span>
</pre></div>
<p>Note that I didn't give away the source and destination arguments by name but I thought most people would see the <code>const</code> modifier hint. I also expected the return type of the function to jog a few people's memories. It turns out that the <code>const</code> hint was not strong enough, however, as many people continued to confuse the source and destination arguments. A few people understood that they should be returning something, but even with the prototype, very few people actually got that part right.
</p>
<p>I tweaked the question again to arrive at the third version. I simply gave up on the return type in order to simplify the problem so they could focus on the basic algorithm. I also named the arguments hoping to clear up the confusion:
</p>
<blockquote><p>Implement the following C library function:
</p>
</blockquote><div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span> <span class="p">{</span>
    <span class="p">...</span>
<span class="p">}</span>
</pre></div>
<p>So now I was leading them to the water, and was expecting many correct, or nearly correct solutions. Did I ever say I was a bit of an optimist? How foolish I was to think my new version of the question was clear enough. Granted, most people started off on the right track copying chars from the <code>const src</code> pointer to the <code>dst</code> pointer. (No one ever asked why its return type was not <code>char *</code> so I felt validated on that omission.) But wait, you can do it without calling <code>strlen</code> - think about it! And why are you calling <code>malloc</code>? Or <code>memcpy</code>? Okay, one more refinement and I think I've got it:
</p>
<blockquote><p>Implement the following C library function, without calling any other routines:
</p>
</blockquote><div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span> <span class="p">{</span>
    <span class="p">...</span>
<span class="p">}</span>
</pre></div>
<p>Whew!
</p>

<h2>Evaluating the Answer</h2>
<p>There were five key things I looked for when evaluating an answer.
</p>
<ol>
 <li>
     Did they know what the function was supposed to do? That is, was their algorithm an attempt at an actual copy of a string? 
 </li>

 <li>
     Was the algorithm correct? Did they copy all the characters? Did they copy the terminating null? (Later, when we went through the quiz, I'd ask the candidate to walk through their code for the string &quot;Hi&quot; to test/break it.) 
 </li>

 <li>
     Did they follow my instructions - no system calls or other routines?
 </li>

 <li>
     Did they get the <code>src</code> and <code>dst</code> arguments right? 
 </li>

 <li>
     Did they use arrays or pointers? This was more to see how comfortable they were with pointers. 
 </li>
</ol>
<p>I also considered a few other points for &quot;extra credit&quot;: Did they know about the return value? Did they write defensively (check for null arguments)? As you can see, there is a lot to be learned from such a basic question. I also believe that people are creatures of habit: they tend to do things the same way in this setting as they might on the job. If they were sloppy and careless on the interview ... well <em>caveat emptor</em>.
</p>
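<p>The first two criteria can be checked mechanically. The following is only a sketch of how one might do that, not something from the original quiz: the names <code>my_strcpy</code> and <code>copies_correctly</code> are hypothetical, and the candidate is renamed so it doesn't clash with the real <code>strcpy</code> in <code>string.h</code>. The destination buffer is pre-filled with a sentinel character so that a missing terminator, or a write past it, shows up.
</p>

```c
#include <assert.h>
#include <string.h>

/* Candidate under test, using the void signature from the quiz.
   The name my_strcpy is hypothetical. */
void my_strcpy(char* dst, const char* src)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* Returns 1 if the candidate copied every char of src plus the
   terminating null, and wrote nothing after it; 0 otherwise. */
int copies_correctly(const char* src)
{
    char buf[16];
    size_t len = strlen(src);

    assert(len + 2 <= sizeof buf);    /* this sketch only handles short strings */

    memset(buf, '#', sizeof buf);     /* sentinel fill */
    my_strcpy(buf, src);

    return memcmp(buf, src, len + 1) == 0   /* chars and '\0' match */
        && buf[len + 1] == '#';             /* byte after '\0' untouched */
}
```

<p>Calling <code>copies_correctly(&quot;Hi&quot;)</code> and <code>copies_correctly(&quot;&quot;)</code> covers the walk-through string and the empty-string edge case.
</p>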
<p>The other thing I was always thinking about was the context they were in. Sure, interviewing can be stressful, and there were many other problems on the quiz so they couldn't spend all their time on this one, but this was an interview for a job! I expected people to be going a little over the top in order to give a good impression, so details were important. And while I wouldn't say that any one thing in an interview should be the sole determinant of whether someone is hired, it was hard for me to get excited about hiring someone who couldn't get this right.
</p>

<h2>The Solution</h2>
<p>Although there are many ways to solve this problem, we can look to the K&amp;R book for the classic solution.
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">while</span> <span class="p">(</span><span class="o">*</span><span class="n">dst</span><span class="o">++</span> <span class="o">=</span> <span class="o">*</span><span class="n">src</span><span class="o">++</span><span class="p">)</span>
       <span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Solution 1: The &quot;classic&quot; K&amp;R answer</h6><p>It's simple, concise and correct. We can quibble about where to put the semicolon or if we should use braces, but that is all just style - the statement that does all the work is in the test of the <code>while</code>. This solution uses a number of features of the C language to get a lot done in essentially one line of code:
</p>
<ul>
 <li>
     A char is copied from one pointer to another using the dereferencing (&quot;*&quot;) operator 
 </li>

 <li>
     Post-incrementing is used to march the pointers forward in memory after the char is copied 
 </li>

 <li>
     It takes advantage of the fact that an assignment is itself an expression whose value is the value that was assigned 
 </li>

 <li>
     The terminating zero of a C-string is used as a boolean test for the while loop 
 </li>
</ul>
<p>That's a lot packed into one simple expression! These are all features of C that its creators designed into the language for a purpose. This expression, and slight variations of it, are so common that they are idioms in C programming and any C programmer will likely come across code with statements like these time after time. They had better fully understand what's going on here if they are going to master the language (or at least read and understand other people's code).
</p>
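<p>To make those four points concrete, here is a rough sketch of the same loop with each step written out on its own line. The names <code>strcpy_expanded</code> and <code>check_expanded</code> are made up for illustration; this is not how K&amp;R present it, just the one-liner unpacked.
</p>

```c
#include <assert.h>

/* The K&R one-liner unpacked into separate steps.
   The function name is hypothetical. */
void strcpy_expanded(char* dst, const char* src)
{
    char c;

    do {
        c = *src;          /* dereference: read the current source char */
        *dst = c;          /* dereference: store it in the destination */
        src++;             /* advance both pointers after the copy */
        dst++;
    } while (c != '\0');   /* the value just copied is the loop test */
}

/* Usage: walk the string "Hi" through the expanded loop. */
void check_expanded(void)
{
    char buf[3];

    strcpy_expanded(buf, "Hi");
    assert(buf[0] == 'H');
    assert(buf[1] == 'i');
    assert(buf[2] == '\0');   /* the terminator was copied too */
}
```

<p>Because the test happens after the copy, the terminating zero is written before the loop exits, which is exactly what the compact version gets for free.
</p>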
<p>I mentioned earlier that the &quot;real&quot; version of <code>strcpy</code> returns a copy of the destination pointer as a convenience. Thus, a more correct version of the function would be
</p>
<div class="codehilite"><pre><span class="kt">char</span><span class="o">*</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">char</span><span class="o">*</span> <span class="n">ret</span> <span class="o">=</span> <span class="n">dst</span><span class="p">;</span>

    <span class="k">while</span> <span class="p">(</span><span class="o">*</span><span class="n">dst</span><span class="o">++</span> <span class="o">=</span> <span class="o">*</span><span class="n">src</span><span class="o">++</span><span class="p">)</span>
        <span class="p">;</span>

    <span class="k">return</span> <span class="n">ret</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Solution 2: Handle the return value correctly</h6><p>That is, we save a copy of the destination pointer and return it after we've copied the string. While that is more correct, it doesn't demonstrate any real coding skills - it's really more of a trivia question about <code>strcpy</code>. That's why I eventually rephrased my interview question to use a <code>void</code> return type (and of course because no one was getting that detail correct, anyway).
</p>
<p>There are a couple of common variations on this solution: using a different looping construct, and using arrays rather than pointers. The following solution shows one variant using a <code>for</code> loop.
</p>
<div class="codehilite"><pre><span class="kt">char</span><span class="o">*</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
    <span class="k">for</span><span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="p">}</span>
    <span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

    <span class="k">return</span> <span class="n">dst</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Solution 3: Using array notation</h6><p>Since we never modify <code>dst</code> - we use <code>i</code> as an index to offset from the starting address - there is no reason to make a temporary copy for the return value. But since we aren't copying and testing for null in the same statement here, the terminating zero is not copied in the loop. Thus we need to terminate the destination string outside of the loop.
</p>
<p>These are all correct solutions in that they copy the string including the terminating null. And although I personally prefer a version using pointers as it demonstrates a degree of understanding and confidence in using them, the array version is perfectly fine.
</p>
<p>There are, of course, dozens of other variations that could be written. We could use a for with pointers, use a while loop with arrays, use a do loop - we could even use recursion if we wanted to be silly (and inefficient). But most often I've found good solutions will look a lot like one of the snippets above.
</p>
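<p>For the curious, here is roughly what a few of those other variations might look like. These are my own sketches with hypothetical names (<code>copy_for_ptr</code>, <code>copy_while_idx</code>, <code>copy_do</code>, <code>copy_recursive</code>), not answers taken from the quiz.
</p>

```c
#include <assert.h>
#include <string.h>

/* A for loop with pointers: copy in the test, advance in the increment. */
void copy_for_ptr(char* dst, const char* src)
{
    for (; (*dst = *src) != '\0'; dst++, src++)
        ;
}

/* A while loop with array indexing. */
void copy_while_idx(char* dst, const char* src)
{
    size_t i = 0;

    while ((dst[i] = src[i]) != '\0')
        i++;
}

/* A do loop: copy first, then test the char that was just copied. */
void copy_do(char* dst, const char* src)
{
    do {
        *dst++ = *src;
    } while (*src++ != '\0');
}

/* Recursion: silly and inefficient, but it does work. */
void copy_recursive(char* dst, const char* src)
{
    if ((*dst = *src) != '\0')
        copy_recursive(dst + 1, src + 1);
}

/* Usage: each variant should copy "Hi" including the terminator. */
void check_variants(void)
{
    char buf[8];

    memset(buf, '#', sizeof buf);
    copy_for_ptr(buf, "Hi");
    assert(strcmp(buf, "Hi") == 0);

    memset(buf, '#', sizeof buf);
    copy_while_idx(buf, "Hi");
    assert(strcmp(buf, "Hi") == 0);

    memset(buf, '#', sizeof buf);
    copy_do(buf, "Hi");
    assert(strcmp(buf, "Hi") == 0);

    memset(buf, '#', sizeof buf);
    copy_recursive(buf, "Hi");
    assert(strcmp(buf, "Hi") == 0);
}
```

<p>All four do the copy and the test in one expression, so none of them needs a separate statement to write the terminating zero.
</p>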

<h2>Answers</h2>
<p>I've taken a not-so-random sample of some past solutions to show some of the classic mistakes people make. For dramatic effect, I'll start with the good ones and work my way down through the common mistakes people make, all the way to the really awful answers that I'm not even sure what they were trying to do. Remember, these are actual answers from actual programmers. Additionally, these were people who were often gainfully employed at the time of the interview, had been recommended by recruiters or other people, and had already been screened through a phone interview. All of these snippets are exactly as answered including syntax errors, comments, creative operators, etc.
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">for</span> <span class="p">(;</span> <span class="n">dst</span> <span class="o">!=</span> <span class="n">Null</span> <span class="o">&amp;&amp;</span> <span class="n">src</span> <span class="o">!=</span> <span class="n">Null</span><span class="p">;)</span>
        <span class="o">*</span><span class="n">dst</span><span class="o">++</span> <span class="o">=</span> <span class="o">*</span><span class="n">src</span><span class="o">++</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 1</h6><p>So close! If they had just replaced the test clause of the for loop with its body, they would have nailed it. But they got confused.
</p>
<ul>
 <li>
     They assumed &quot;Null&quot; is some pre-defined constant for zero
 </li>

 <li>
     They compared addresses and not the value pointed to by the address so this would not terminate 
 </li>

 <li>
     Even if they got the loop test correct, they wouldn't have copied the terminating null character 
 </li>
</ul>
<p>But all in all not a bad answer and the use of pointers for copy-and-increment is promising. You can tell that they've either read up on C or used it in the past, but probably aren't using it day to day.
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">src</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">dst</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>

    <span class="k">while</span> <span class="p">(</span><span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="o">++</span><span class="p">]</span> <span class="o">!=</span> <span class="sc">&#39;\0&#39;</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="p">}</span>
    <span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 2</h6><p>Hey, defensive programming! I like that, and I see him copying the null at the end, could we have a winner? Oh, I'm sorry, you got the last char but forgot the first one (and you forgot to define <code>i</code>).
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">while</span><span class="p">(</span><span class="o">*</span><span class="n">src</span><span class="p">)</span>
        <span class="o">*</span><span class="n">dst</span><span class="o">++</span> <span class="o">=</span> <span class="o">*</span><span class="n">src</span><span class="o">++</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 3</h6><p>Another close one! They seem comfortable with pointers and their use in the copy-and-increment idiom. But the while test is going to kick you out of the loop before you copy the terminating 0 (which you would have noticed if you tested it). Sorry.
</p>
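<p>The missing terminator is easy to see if the destination starts out filled with a sentinel character. Below is a sketch under that assumption: the first function is Answer 3 renamed, the second adds the one line it needed, and all three function names are hypothetical.
</p>

```c
#include <assert.h>
#include <string.h>

/* Answer 3 as written, renamed so it doesn't clash with strcpy. */
void copy_no_terminator(char* dst, const char* src)
{
    while (*src)
        *dst++ = *src++;
}

/* The same loop plus one statement to terminate the destination. */
void copy_with_terminator(char* dst, const char* src)
{
    while (*src)
        *dst++ = *src++;
    *dst = '\0';
}

/* Usage: copy "Hi" into a sentinel-filled buffer and inspect buf[2]. */
void check_terminator(void)
{
    char buf[4];

    memset(buf, '#', sizeof buf);
    copy_no_terminator(buf, "Hi");
    assert(buf[0] == 'H' && buf[1] == 'i');
    assert(buf[2] == '#');       /* still the sentinel: no '\0' written */

    memset(buf, '#', sizeof buf);
    copy_with_terminator(buf, "Hi");
    assert(buf[2] == '\0');      /* now a proper C string */
}
```

<p>This is the same kind of check as walking the code through the string &quot;Hi&quot; by hand.
</p>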
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="o">*</span><span class="n">dst</span> <span class="o">=</span> <span class="o">*</span><span class="n">src</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 4</h6><p>Well, it does work for the special case of <code>strcpy(buf, &quot;&quot;)</code>, and I have to admit, it is fast, but I was looking for something a little more ... correct. 
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">int</span> <span class="n">counter</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">while</span> <span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="n">src</span> <span class="o">+</span> <span class="n">counter</span><span class="p">)</span> <span class="o">&lt;&gt;</span> <span class="err">&#39;</span><span class="o">/</span><span class="mi">0</span><span class="err">&#39;</span><span class="p">)</span> <span class="p">{</span>
        <span class="o">*</span><span class="p">(</span><span class="n">dst</span> <span class="o">+</span> <span class="n">counter</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">src</span> <span class="o">+</span> <span class="n">counter</span><span class="p">);</span>
        <span class="n">counter</span><span class="o">++</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 5</h6><p>Well the syntax for &quot;not equals&quot; smells a little like VB and there seems to be confusion on the syntax for an escape character. Overlooking those points, there is still the problem with the terminating null, and while the pointer expressions are valid, I find them less clear than the examples that increment the pointers.
</p>
<p>At least in answers 1-5 you can see some familiarity with C and the use of pointers, plus it's clear that people knew what <code>strcpy</code> did, if not exactly how it did it. Now let's move on to some more &quot;creative&quot; answers.
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">char</span> <span class="n">i</span> <span class="o">=</span> <span class="n">null</span><span class="p">;</span>
    <span class="k">do</span> <span class="k">while</span> <span class="n">i</span> <span class="o">&lt;&gt;</span> <span class="sc">&#39;\n&#39;</span>
    <span class="p">{</span>
        <span class="n">i</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">src</span><span class="p">;</span>
        <span class="o">&amp;</span><span class="n">dst</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>

        <span class="n">dst</span><span class="o">++</span><span class="p">;</span>
        <span class="n">src</span><span class="o">++</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 6</h6><p>There are a number of issues with this answer:
</p>
<ul>
 <li>
     Confusion between the dereferencing and address operators (<code>*/&amp;</code>)
 </li>

 <li>
     Wrong syntax for a <code>do/while</code> loop
 </li>

 <li>
     Wrong syntax for &quot;not equals&quot;
 </li>

 <li>
     They are looking for a terminating end of line character, not a null
 </li>

 <li>
     It doesn't copy the terminating null (or <code>\n</code> for that matter)
 </li>
</ul>
<!-- markdown wants to make this part of the list above -->

<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">int</span> <span class="n">idst</span><span class="p">,</span> <span class="n">isrc</span><span class="p">;</span>
    <span class="n">idst</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">isrc</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">while</span> <span class="p">(</span><span class="n">src</span><span class="p">[</span><span class="n">isrc</span><span class="p">]</span> <span class="o">!=</span> <span class="s">&quot;</span><span class="se">\0</span><span class="s">&quot;</span><span class="p">)</span>
        <span class="n">dst</span><span class="p">[</span><span class="n">idst</span><span class="o">++</span><span class="p">]</span> <span class="o">=</span> <span class="n">src</span><span class="p">[</span><span class="n">isrc</span><span class="o">++</span><span class="p">];</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 7</h6><p>This time they confused the syntax of a char constant with a string constant. This is actually a very nasty bug (if the compiler didn't warn on type mismatches) as <code>&quot;\0&quot;</code> evaluates to an address that stores a string constant so the <code>while</code> would never terminate. Oh, and they also forgot the null (are you beginning to see a theme?).
</p>
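<p>The difference between the two kinds of constant can be shown directly. This is just an illustrative sketch with hypothetical names (<code>is_terminator</code>, <code>check_constants</code>): the single-quoted form is the integer value zero, while the double-quoted form is an array that lives at some non-null address.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when c is the terminating null character. */
int is_terminator(char c)
{
    return c == '\0';   /* character constant: the integer value 0 */
}

/* Usage: contrast the character constant with the string literal. */
void check_constants(void)
{
    const char* literal = "\0";    /* string literal: an array in memory */

    assert('\0' == 0);             /* the character constant is zero */
    assert(literal != NULL);       /* the literal decays to a real address */
    assert(sizeof "\0" == 2);      /* the explicit '\0' plus the implicit one */
    assert(is_terminator(literal[0]));
    assert(!is_terminator('H'));
}
```

<p>Comparing a <code>char</code> against that address, as Answer 7 does, compares a small integer with a pointer, which is why the test is not doing what the candidate intended.
</p>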
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">int</span> <span class="n">ilen</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="kt">char</span> <span class="o">*</span><span class="n">trav</span><span class="p">,</span> <span class="o">*</span><span class="n">trav2</span><span class="p">;</span>

    <span class="k">for</span><span class="p">(</span><span class="n">trav</span> <span class="o">=</span> <span class="n">src</span><span class="p">;</span> <span class="o">*</span><span class="n">trav</span><span class="o">++</span><span class="p">;</span> <span class="n">ilen</span><span class="o">++</span><span class="p">);</span>
    <span class="n">free</span> <span class="n">dst</span><span class="p">;</span>
    <span class="n">dst</span> <span class="o">=</span> <span class="n">malloc</span><span class="p">(</span><span class="n">ilen</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
    <span class="n">trav2</span> <span class="o">=</span> <span class="n">dst</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">trav</span> <span class="o">=</span> <span class="n">src</span><span class="p">;</span> <span class="o">*</span><span class="n">trav2</span> <span class="o">=</span> <span class="o">*</span><span class="n">trav</span><span class="o">++</span><span class="p">;</span> <span class="n">trav2</span><span class="o">++</span><span class="p">);</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 8</h6><p>Where do I start?
</p>
<ul>
 <li>
     The first <code>for</code> is counting the length of <code>src</code> (we'll see why in a couple of lines) 
 </li>

 <li>
     Then they try to free <code>dst</code> (with a syntax error on the <code>free</code> call). That's going to surprise the caller! 
 </li>

 <li>
     Now they get (leak) a new chunk of memory that the caller will never know about
 </li>

 <li>
     The second <code>for</code> then actually copies the string correctly! Yeah!
 </li>
</ul>
<p>What's funny is if they had deleted the first three lines (the <code>for, free</code> and <code>malloc</code>), this would be correct (although a bit overly confusing with inconsistent initialization and incrementing statements). Because those three lines are there, however, it copies the string to our newly allocated chunk of memory that no one will know about as soon as the function exits. But I'm being pedantic - it may never get that far since the <code>free</code> call would likely cause a crash. There's also the issue of not following instructions with the use of <code>malloc</code> (at least they didn't call <code>strlen</code>, though).
</p>
<p>The use of <code>malloc</code> and <code>free</code> is particularly troubling. This demonstrates that the candidate misunderstands a couple of the basic design decisions that went into the C language: that it has no built-in memory operators (i.e. no new or delete), and that the developer is completely responsible for memory management. Allocating memory inside a function that advertises only to copy a string would violate this rule. More importantly, <code>strcpy</code> requires that the caller pass in the destination address, thus it should be clear that the caller is responsible for making sure that address points to a valid destination with enough space to hold the string.
</p>
<p>Note: The <code>strdup</code> function, on the other hand, does violate this rule but clearly states this in its documentation. You can also infer this from its design. It takes only one argument - the source string - and returns the address of a new duplicate of the string. Since the user cannot pass the destination address, the function must allocate it for them - and return it. It is essentially a memory allocator that fills the new chunk of memory for the user, and I think would have been more appropriately called salloc and put in the <code>malloc.h</code> include file rather than <code>string.h</code>. Although <code>strdup</code> may be convenient in certain cases, I still prefer not to use it. I prefer controlling how the memory for the copy is obtained - I might have a static char array that I reuse, or an array on the stack. I think it is much more clear to take responsibility for obtaining the block of memory and calling <code>strcpy</code> with a pointer to it, rather than calling <code>strdup</code> and having to remember to match up a <code>free</code> with each call to it - it's difficult enough matching <code>malloc</code> and <code>free</code> calls.
</p>
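<p>Here is a small sketch of the two ownership styles side by side. Since <code>strdup</code> is not available under every compiler setting, the example uses a hypothetical stand-in called <code>dup_string</code> that behaves the way I've described <code>strdup</code> above; <code>check_ownership</code> is likewise a made-up name.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: allocates the copy itself,
   so the caller takes on the job of freeing it. */
char* dup_string(const char* src)
{
    char* copy = malloc(strlen(src) + 1);   /* +1 for the terminator */

    if (copy != NULL)
        strcpy(copy, src);
    return copy;                            /* caller must free() this */
}

/* Usage: three places a copy of "Hi" can live. */
void check_ownership(void)
{
    char on_stack[16];            /* caller-owned, gone at function exit */
    static char reusable[16];     /* caller-owned, reused across calls */
    char* on_heap;

    strcpy(on_stack, "Hi");       /* the caller chose the storage */
    strcpy(reusable, "Hi");
    assert(strcmp(on_stack, "Hi") == 0);
    assert(strcmp(reusable, "Hi") == 0);

    on_heap = dup_string("Hi");   /* the function chose the storage */
    assert(on_heap != NULL);
    assert(strcmp(on_heap, "Hi") == 0);
    free(on_heap);                /* easy to forget */
}
```

<p>With <code>strcpy</code> there is nothing to clean up in the first two cases; with the allocating version every call needs a matching <code>free</code>.
</p>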
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">int</span> <span class="n">size</span> <span class="o">=</span> <span class="n">strlen</span><span class="p">(</span><span class="n">src</span><span class="p">);</span>
    <span class="n">malloc</span><span class="p">(</span><span class="n">dst</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>  <span class="c">// syntax?</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">size</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">dst</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 9</h6><p>I'm disappointed from the first line. They didn't follow instructions and made a call to <code>strlen</code>. And things only get worse with the call to <code>malloc</code>. But they returned <code>dst</code>! Nice! (Except of course it's pointing to a non-terminated string and a different address than was passed in - assuming they would have eventually fixed the <code>malloc</code> call to: <code>dst = malloc(size);</code>.)
</p>
<div class="codehilite"><pre><span class="kt">void</span> <span class="nf">strcpy</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span> <span class="n">dst</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">src</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">free</span><span class="p">(</span><span class="n">dst</span><span class="p">);</span>
    <span class="n">dst</span> <span class="o">=</span> <span class="n">malloc</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">src</span><span class="p">));</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">dst</span><span class="p">)</span> <span class="p">{</span>

    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span> <span class="c">// error; }</span>
<span class="p">}</span>
</pre></div>
<h6>Answer 10</h6><p>Let's see - library calls, freeing <code>dst</code>, and hopefully the string is 3 bytes (4 with the terminating zero) since, assuming 4-byte pointers, they are allocating a chunk of memory for the size of an address, and not the length of the string it points to. But then the solution fizzles out. Come on, you weren't shy about using library calls. Why not round it out with a call to <code>memcpy</code> to at least finish it!
</p>
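<p>The <code>sizeof</code> mistake is worth spelling out. In this sketch (function names are hypothetical), <code>sizeof</code> applied to the parameter gives the size of the pointer, which is the same for every string, whereas the space a copy actually needs is the string length plus one for the terminator.
</p>

```c
#include <assert.h>
#include <string.h>

/* What Answer 10 computed: the size of the pointer itself. */
size_t size_of_pointer(const char* src)
{
    return sizeof(src);       /* sizeof(const char*), whatever the string */
}

/* What a copy actually needs: every char plus the terminating null. */
size_t bytes_needed(const char* src)
{
    return strlen(src) + 1;
}

/* Usage: the pointer size never changes; the required size does. */
void check_sizes(void)
{
    assert(size_of_pointer("Hi") == size_of_pointer("a much longer string"));
    assert(size_of_pointer("Hi") == sizeof(char*));
    assert(bytes_needed("Hi") == 3);
    assert(bytes_needed("a much longer string") == 21);
}
```

<p>So the allocation in Answer 10 is only big enough by coincidence, when the string happens to fit in the size of a pointer on that machine.
</p>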

<h2>Looking Back</h2>
<p>There are a number of things I learned about programmers from this one simple question. You can argue whether the question is a good one, whether my criteria for evaluating answers were fair, and whether this question is even relevant anymore. But one thing that you can't argue with is the value of asking people to demonstrate their knowledge by writing code, plus the comparative value in asking the same question over and over. You learn a lot about people by asking them to solve something real that they should be familiar with and should understand.
</p>
<p>First of all you learn how wildly one programmer's knowledge about a language can vary from another's. When someone says they &quot;know&quot; a language, what does that really mean? Does that mean they know the syntax? That they are comfortable with the standard library? That they understand the internal representation of data structures? That they are comfortable with pointers?
</p>
<p>You also find out if people are curious or not. When they were learning C, did they find out about the K&amp;R book and read it on their own? Sometimes if a candidate does well I'll have them come back for a second round of interviews and I might ask them, how would you do it now? I love it when they have run home, read up on C and learned how to do it better. Or when they have proactively emailed me an improved version because they couldn't let go of it.
</p>
<p>What about their style? Is the code legible? Organized? Were they careful? Did they bother to test with the most trivial example to see if it works? Granted, <code>strcpy</code> is too small to answer some of these questions but it wasn't the only programming problem they were given. The point is that looking at someone's code tells a lot about them - it is what you are hiring them to produce and you want to know how they go about creating it. Would you hire a photographer without looking at their portfolio?
</p>
<p>Some people may find it disagreeable that I give someone a written quiz as it creates a stressful situation. I agree. I've always worked in a high pressure environment and want to make sure the candidates can handle that pressure (most of my career I worked on trading desks of Wall Street firms).
</p>
<p>I've done enough of this to know that this isn't a statistical anomaly. The truth is that there are a lot of bad programmers out there. That's not to say that this simple question is the one true determinant of a good or bad programmer - it's just one of many data points I've used in interviewing programmers. But I'd certainly make the argument that it's positively correlated with a person's skills and knowledge and all other factors being equal, I'd much prefer hiring someone who nailed it than someone who took twenty lines to get it wrong.
</p>]]></content:encoded></item></channel>
</rss>