<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Scott&#039;s Repository Brew</title>
	<atom:link href="http://www.scottphillips.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.scottphillips.com</link>
	<description></description>
	<lastBuildDate>Sat, 19 May 2012 19:23:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Introducing Piper</title>
		<link>http://www.scottphillips.com/2012/05/introducing-piper/</link>
		<comments>http://www.scottphillips.com/2012/05/introducing-piper/#comments</comments>
		<pubDate>Sat, 19 May 2012 19:22:23 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[Play Framework]]></category>
		<category><![CDATA[SWORD]]></category>
		<category><![CDATA[Web Sockets]]></category>

		<guid isPermaLink="false">http://www.scottphillips.com/?p=1055</guid>
		<description><![CDATA[Piper is an internal project we have been working on at Texas A&#38;M University Libraries. The project is just in its initial stages at this point with the first kernel of an idea. I expect that we will expand its capabilities in the future. Piper is basically a repository batch import tool right now, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.scottphillips.com/files/2012/05/sandpiper.jpg"><img class="alignright size-full wp-image-1081" src="http://www.scottphillips.com/files/2012/05/sandpiper.jpg" alt="" width="200" /></a>Piper is an internal project we have been working on at <a href="http://library.tamu.edu/">Texas A&amp;M University Libraries</a>. The project is just in its initial stages at this point with the first kernel of an idea. I expect that we will expand its capabilities in the future. Piper is basically a repository batch import tool right now, and in the future it could grow into an internal repository workflow tool.</p>
<p>How does Piper fit into the repository ecosystem? It is a behind-the-scenes tool for repository administrators to curate collections. For our initial phase we focused on the sole task of making it simple to ingest content into <a href="http://www.dspace.org/">DSpace</a>. However, in the future, it may bring in workflow capabilities to ensure quality control, integration with other workflow tools like Vireo, and of course additional repository support.</p>
<p><span id="more-1055"></span></p>
<div class="wp-caption alignleft" style="width: 510px">
<p><a href="http://www.scottphillips.com/files/2012/05/PiperNow.png"><img class="size-medium wp-image-1058" style="float: left;margin-left: 7px" src="http://www.scottphillips.com/files/2012/05/PiperNow-300x214.png" alt="" width="245" /></a><a href="http://www.scottphillips.com/files/2012/05/PiperFuture.png"><img class="size-medium wp-image-1057" src="http://www.scottphillips.com/files/2012/05/PiperFuture-300x214.png" alt="" width="245" /></a></p>
<p class="wp-caption-text">Two figures depicting how Piper fits into the repository ecosystem right now, and ways it could possibly fit in in the future.</p>
</div>
<h2>How does Piper work?</h2>
<p>First, a content owner works with a librarian to gather their content together. The metadata is captured in a spreadsheet, and all the files are collected into a folder. The folder is copied to shared network storage that both the librarian&#8217;s desktop and the server running Piper can access. We chose network file sharing because batches can be bigger than the 4GB limit that many browsers and web servers impose on HTTP uploads. In the future, a direct upload mechanism for small batches may be added, primarily for convenience. Next, the librarian logs into Piper through their web browser and uploads the spreadsheet with the metadata for the batch. Currently, the only metadata format supported is a Dublin Core spreadsheet where the first column is a reference to the file, and every other column is a Dublin Core field specified in dot notation. Piper is designed to be easily extensible, so each &#8220;Ingester&#8221; is a simple <a href="http://www.springsource.org/spring-framework">Spring bean</a>. When uploading a new spreadsheet, the user selects which format to use.</p>
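<p>To make the spreadsheet layout concrete, here is a minimal sketch in Python of how such a file could be read. This is not Piper&#8217;s actual ingester (those are Java Spring beans); the <code>parse_batch</code> function, the <code>filename</code> column label, and the sample rows are all assumptions made up for illustration.</p>

```python
import csv
import io


def parse_batch(csv_text):
    """Turn a Dublin Core spreadsheet (as CSV text) into a list of items."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Assumed layout: column 0 references the file, every other column
    # is a Dublin Core field in dot notation (e.g. dc.date.issued).
    fields = header[1:]
    items = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        metadata = {}
        for field, value in zip(fields, row[1:]):
            if value.strip():
                metadata[field] = value.strip()
        items.append({"file": row[0], "metadata": metadata})
    return items


# Hypothetical two-item batch; the second item has no author.
sample = (
    "filename,dc.title,dc.contributor.author,dc.date.issued\n"
    'thesis1.pdf,A Study of Sandpipers,"Doe, Jane",2011\n'
    "thesis2.pdf,Shorebird Migration,,2012\n"
)
items = parse_batch(sample)
```

<p>Empty cells are simply left out of an item&#8217;s metadata, which is one possible design choice; it keeps &#8220;field is missing&#8221; easy to detect later.</p>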
<p>Once a batch has been ingested into Piper, it is ready to be verified. The current version is read-only, i.e. to make any changes you need to go back to your spreadsheet, update it, and re-upload it. However, the necessary UI adjustments to edit the content directly within Piper will likely be added in a future version. Each repository is pre-configured with a set of verifiers (each is a simple Spring bean for easy extensibility). Each verifier will scan the batch and identify any errors &#8211; things that we know will not work, or warnings &#8211; things that might not work. Some things they check are whether the column label is valid Dublin Core, or whether 90% of the items contain a particular field but there is an item with that field missing. Right now there are seven different verifiers, and almost all of them are specific to DSpace and Dublin Core. As other formats and repositories are supported, the list of verifiers will expand.</p>
<p>Once a batch has been verified and contains no errors, it is ready to be deposited. The librarian simply clicks the deposit button and each item is ingested into the repository one at a time via SWORD into the destination collection. Right now Piper only supports SWORD version 1, so it&#8217;s a one-shot deal. Support for SWORD 2 in a future version will allow us to deposit the batch into a repository and then, if we find an error some time later, re-deposit the item, updating the existing item in the repository without assigning it a new identifier.</p>
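<p>As an illustration of the &#8220;90% of the items have it, but one does not&#8221; check, here is a rough Python sketch. It is not one of Piper&#8217;s real verifiers; the function name, the item structure, and the sample batch are hypothetical.</p>

```python
def find_missing_field_warnings(items, threshold=0.9):
    """Warn about fields that most items have but a few are missing."""
    warnings = []
    if not items:
        return warnings
    # Collect every field name used anywhere in the batch.
    all_fields = set()
    for item in items:
        all_fields.update(item["metadata"])
    for field in sorted(all_fields):
        missing = [i["file"] for i in items if field not in i["metadata"]]
        present_ratio = (len(items) - len(missing)) / len(items)
        # Only warn when the field is common yet absent somewhere.
        if missing and present_ratio >= threshold:
            warnings.append((field, missing))
    return warnings


# Hypothetical batch of ten items where one lacks an issue date.
batch = [
    {"file": "f%d.pdf" % n,
     "metadata": {"dc.title": "Title %d" % n, "dc.date.issued": "2012"}}
    for n in range(10)
]
del batch[3]["metadata"]["dc.date.issued"]
warnings = find_missing_field_warnings(batch)
```

<p>In this sketch the result is a warning rather than an error, since a missing optional field might be intentional.</p>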
<h2>Technical notes</h2>
<p>The application is developed with the <a href="http://www.playframework.org/">Play Framework</a> using Java, <a href="http://www.springsource.org/">Spring</a>, <a href="http://en.wikipedia.org/wiki/Java_Persistence_API">JPA</a>, etc. This collection of tools proved to be a very productive environment for Piper. One of the cool new HTML 5 technologies that we used in Piper is <a href="http://en.wikipedia.org/wiki/WebSocket">Web Sockets</a>. Web sockets allow the browser to open a persistent TCP connection back to the server for two-way communication. We use this extensively in Piper for the spreadsheet view. When the page initially loads, the content is blank. The browser opens a web socket back to the web server, and Piper starts feeding all the batch content over the web socket back to the browser for display. This allows us to display a page with 20k items on it without running into timeout or blocking issues. The page is constantly working in the background to display the items while the user is able to interact with their web browser. This approach also allows us to push updates to the user&#8217;s browser. This is important because some of the tasks, such as verifiers, ingesters, and depositors, are long running and may take several minutes to hours to complete. While these are running on the server, the user&#8217;s browser is able to show the changes in real time without reloading the page. Very cool stuff.</p>
<h2>What is next?</h2>
<p>Piper is an internal project for Texas A&amp;M University Libraries and has not gained approval for release as an open source project. I hope that we will be able to make it public so that others can check it out and see if it&#8217;s useful for them. Right now it only supports a very limited set of formats, which needs to be expanded. Most of our development focus for the rest of the year will be on Vireo, so it may be some time before we jump back to further development on Piper.</p>
<h2>Screenshots</h2>

<a href='http://www.scottphillips.com/2012/05/introducing-piper/pipercreatebatch/' title='PiperCreateBatch'><img width="150" height="150" src="http://www.scottphillips.com/files/2012/05/PiperCreateBatch-150x150.png" class="attachment-thumbnail" alt="PiperCreateBatch" title="PiperCreateBatch" /></a>
<a href='http://www.scottphillips.com/2012/05/introducing-piper/piperbatchlist/' title='PiperBatchList'><img width="150" height="150" src="http://www.scottphillips.com/files/2012/05/PiperBatchList-150x150.png" class="attachment-thumbnail" alt="PiperBatchList" title="PiperBatchList" /></a>
<a href='http://www.scottphillips.com/2012/05/introducing-piper/piperingestbatch/' title='PiperIngestBatch'><img width="150" height="150" src="http://www.scottphillips.com/files/2012/05/PiperIngestBatch-150x150.png" class="attachment-thumbnail" alt="PiperIngestBatch" title="PiperIngestBatch" /></a>
<a href='http://www.scottphillips.com/2012/05/introducing-piper/piperviewbatch/' title='PiperViewBatch'><img width="150" height="150" src="http://www.scottphillips.com/files/2012/05/PiperViewBatch-150x150.png" class="attachment-thumbnail" alt="PiperViewBatch" title="PiperViewBatch" /></a>
<a href='http://www.scottphillips.com/2012/05/introducing-piper/piperverify/' title='PiperVerify'><img width="150" height="150" src="http://www.scottphillips.com/files/2012/05/PiperVerify-150x150.png" class="attachment-thumbnail" alt="PiperVerify" title="PiperVerify" /></a>

]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2012/05/introducing-piper/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Merging two DSpace Solr based data sets together</title>
		<link>http://www.scottphillips.com/2011/10/merging-two-dspace-solr-based-data-sets-together/</link>
		<comments>http://www.scottphillips.com/2011/10/merging-two-dspace-solr-based-data-sets-together/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 16:51:39 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[Solr]]></category>

		<guid isPermaLink="false">http://www.scottphillips.com/?p=1021</guid>
		<description><![CDATA[Have you ever messed up a DSpace upgrade and somehow ended up resetting your DSpace statistics? I did that. When we upgrade DSpace at A&#38;M we perform a fresh install each time and then restore the data from the old instance into the new instance. This involves connecting the database, linking the asset store, and [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-1037" src="http://www.scottphillips.com/files/2011/10/zipper.png" alt="" width="200" height="150" />Have you ever messed up a <a href="http://www.dspace.org/">DSpace</a> upgrade and somehow ended up resetting your <a href="https://wiki.duraspace.org/display/DSDOC/DSpace+Statistics">DSpace statistics</a>? <em>I did that</em>. When we <a href="http://repository.tamu.edu/">upgrade DSpace at A&amp;M</a> we perform a fresh install each time and then restore the data from the old instance into the new instance. This involves connecting the database, linking the asset store, and copying the DSpace log directory. We like to do it this way so that our configs are fresh each time. Our documented installation procedure lists the exact settings (about 5) that need to be touched for each production install. All other parameters in the <tt>dspace.cfg</tt> are maintained in our local SVN copy. This avoids the problem of never knowing exactly how your DSpace is configured, which is what happens if you follow the recommended upgrade procedure of modifying the <tt>dspace.cfg</tt> with new parameters at each upgrade.</p>
<p><span id="more-1021"></span></p>
<h2>What happened?</h2>
<p>Our documented upgrade procedure calls for us to copy the Solr directory from the old instance into the fresh instance. However, I mistyped the command so that the old copy of the Solr directory ended up <em>inside</em> the fresh copy of the Solr directory. Then we didn&#8217;t catch the error for a few weeks. So on the repository statistics page we were only showing stats from when the upgrade occurred; all the other months were zeroed out.</p>
<p><img class="alignleft size-full wp-image-1040" src="http://www.scottphillips.com/files/2011/10/wrongway.png" alt="" width="200" height="154" /></p>
<p style="margin-top: 100px">Oops!</p>
<h2 class="clear">How to fix it?</h2>
<p>If you find yourself in a similar predicament, you can recover. The important thing is that you have both your old statistics data and your new statistics data. You just need them combined into one data set. DSpace uses <a href="http://lucene.apache.org/solr/">Solr</a> (which is built upon <a href="http://lucene.apache.org/">Lucene</a>) for storing statistics information. Because of this you have two basic approaches: one at the Solr level and the other at the Lucene level. The basic concept is that you need to merge the two indexes together. <a href="http://wiki.apache.org/solr/MergingSolrIndexes">The Solr wiki describes these two methods</a>. At first I went down the Solr path, but I ran into a roadblock early when I was unable to issue the Solr command to create a new core where the two merged indexes would reside. Then I tried the other option, and the Lucene merge tool worked well. Here are the steps I followed to restore statistics.</p>
<h3>Step by Step Instructions</h3>
<p><strong>1) Identify DSpace&#8217;s copy of the Lucene libraries: <tt>lucene-core</tt> and <tt>lucene-misc</tt>.</strong></p>
<p>You will find these inside DSpace&#8217;s solr webapp: <tt>&lt;solr webapp&gt;/WEB-INF/lib/lucene-*.jar</tt>. For DSpace 1.7.x the version of these libraries was 2.9.3. It is important that you use the same Lucene version that wrote the original indexes. We&#8217;ll use both of these paths below in step 4 as <tt>&lt;path to lucene-core jar&gt;</tt> and <tt>&lt;path to lucene-misc jar&gt;</tt>.</p>
<p><strong>2) Identify both Lucene indexes.</strong></p>
<p>Typically the Solr-based statistics index is stored inside the DSpace install directory: <tt>&lt;dspace directory&gt;/solr/statistics/data/index</tt>. Inside the directory you should see at least one &#8220;<tt>.cfs</tt>&#8221; file along with a &#8220;<tt>segments.gen</tt>&#8221; file. The other copy will likely come from a backup copy of DSpace you have stashed away somewhere. We&#8217;ll use both of these paths below as <tt>/path/to/oldindex1</tt> and <tt>/path/to/oldindex2</tt>.</p>
<p><strong>3) Shut down your DSpace instance.</strong></p>
<p>The merge tool requires that all the indexes it is reading be closed, so while the merge is running you cannot record any new statistics.</p>
<p><strong>4) Run the Lucene merge tool.</strong></p>
<pre>java -cp &lt;path to lucene-core jar&gt;:&lt;path to lucene-misc jar&gt; \
    org.apache.lucene.misc.IndexMergeTool /path/to/newindex \
    /path/to/oldindex1 /path/to/oldindex2</pre>
<p>The command above will merge both old indexes into the new index. The command takes about the same amount of time as it does to copy both indexes.</p>
<p><strong>5) Restore the combined index.</strong></p>
<pre>mv &lt;dspace directory&gt;/solr/statistics/data/index /path/to/someplace/safe
cp -r /path/to/newindex   &lt;dspace directory&gt;/solr/statistics/data/index</pre>
<p><strong>6) Restart DSpace and check the statistics.</strong></p>
<p>Hopefully everything works and you have a full set of statistics. Let others know that it worked in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/10/merging-two-dspace-solr-based-data-sets-together/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Find All Restricted Items Within DSpace</title>
		<link>http://www.scottphillips.com/2011/08/find-all-restricted-items-within-dspace/</link>
		<comments>http://www.scottphillips.com/2011/08/find-all-restricted-items-within-dspace/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 21:35:38 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://www.scottphillips.com/?p=1005</guid>
		<description><![CDATA[Here is an SQL query you can copy-and-paste into DSpace to find all items which have restricted access or contain bundles / bitstreams which are restricted. Restricted means that the object does not have an authorization policy enabling anonymous read. It&#8217;s actually quite hard to find the absence of something with SQL. After trying various [...]]]></description>
			<content:encoded><![CDATA[<p>Here is an SQL query you can copy-and-paste into <a href="http://www.dspace.org/">DSpace</a> to find all items which have restricted access or contain bundles / bitstreams which are restricted. Restricted means that the object does not have an authorization policy enabling anonymous read. </p>
<p>It&#8217;s actually quite hard to find the absence of something with SQL. After trying various methods, the way I came up with to solve this problem is a sub select that counts how many anonymous access policies exist for each object and, if there are none, reports that object. The query is broken down into three distinct parts, one for each object type. Then all the objects are combined via <a href="http://www.postgresql.org/docs/9.0/static/functions-subquery.html#AEN16668">PostgreSQL set operators</a> and sub selects (again!). This means that if you have a huge number of restricted items in your repository the query <a href="http://stackoverflow.com/questions/1009706/postgresql-max-number-of-parameters-in-in-clause">might</a> fail or take an obscene amount of time/memory to run. I tried using a <a href="http://en.wikipedia.org/wiki/Join_%28SQL%29">left outer join</a> but couldn&#8217;t get it to handle both the case where no access policies exist and the case where only non-anonymous access policies exist.</p>
<p>The approach used here is inelegant and has some serious performance problems. However, it served my immediate purpose. We had no idea how many or which items were restricted in our repository (answer: just under 300). This task is a good candidate for a <a href="http://www.dspace.org/1_7_1Documentation/Curation%20System.html">DSpace curation task</a>, to find all items in a collection which have restricted access. Or the opposite, find all items which are NOT restricted.</p>
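<p>For comparison, the &#8220;absence&#8221; test can also be phrased with <code>NOT EXISTS</code> instead of counting. The Python sketch below demonstrates the idea against an in-memory SQLite database with a deliberately simplified, made-up two-table schema and sample rows; it is an assumption-laden illustration of the technique, not something I have run against a real DSpace PostgreSQL database.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in tables; the real DSpace schema has more columns.
conn.executescript("""
CREATE TABLE item (item_id INTEGER PRIMARY KEY);
CREATE TABLE resourcepolicy (
    resource_type_id INTEGER,
    resource_id INTEGER,
    action_id INTEGER,
    epersongroup_id INTEGER
);
INSERT INTO item VALUES (1), (2), (3);
-- item 1: anonymous read; item 2: read for group 5 only; item 3: no policy
INSERT INTO resourcepolicy VALUES (2, 1, 0, 0), (2, 2, 0, 5);
""")

# An item is restricted when no anonymous-read policy exists for it.
restricted = [row[0] for row in conn.execute("""
    SELECT item.item_id
    FROM item
    WHERE NOT EXISTS (
        SELECT 1
        FROM resourcepolicy AS rp
        WHERE rp.resource_id = item.item_id
          AND rp.resource_type_id = 2   -- Type   = Item
          AND rp.action_id = 0          -- Action = Read
          AND rp.epersongroup_id = 0    -- Group  = Anonymous
    )
    ORDER BY item.item_id
""")]
```

<p>Here both the item with no policies at all and the item with only a non-anonymous policy are reported, which are the two cases the outer join attempt struggled with.</p>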
<p><span id="more-1005"></span><br />
<strong>Here is the code for you to cut and paste:</strong></p>
<pre>
SELECT DISTINCT ON (handle) handle
FROM handle
WHERE 

handle IN
(
<span style="color: #008022;font-weight: bold">-- ---------------------------------------
-- Select Items which are restricted
-- ---------------------------------------</span>
SELECT DISTINCT ON (handle) handle
FROM item
INNER JOIN handle
ON item.item_id = handle.resource_id AND handle.resource_type_id = 2
WHERE
(
<span style="color: #008022;font-weight: bold">-- Count how many anonymous access policies exist for each item</span>
SELECT count(*)
FROM item AS item2anonymous, resourcepolicy AS rp
WHERE
item2anonymous.item_id = item.item_id AND
item2anonymous.item_id = rp.resource_id AND
rp.resource_type_id = 2 AND <span style="color: #008022;font-weight: bold">-- Type   = Item</span>
rp.action_id = 0 AND        <span style="color: #008022;font-weight: bold">-- Action = Read</span>
rp.epersongroup_id = 0      <span style="color: #008022;font-weight: bold">-- Group  = Anonymous</span>
)
&lt; 1
) 

OR handle IN
(
<span style="color: #008022;font-weight: bold">-- ---------------------------------------
-- Select Bundles which are restricted
-- ---------------------------------------</span>
SELECT DISTINCT ON (handle) handle
FROM item2bundle
INNER JOIN bundle AS bun
ON bun.bundle_id = item2bundle.bundle_id
INNER JOIN handle
ON item2bundle.item_id = handle.resource_id AND handle.resource_type_id = 2
WHERE
(
<span style="color: #008022;font-weight: bold">-- Count how many anonymous access policies exist for each bundle</span>
SELECT count(*)
FROM bundle AS bun2anonymous, resourcepolicy AS rp
WHERE
bun2anonymous.bundle_id = bun.bundle_id AND
bun2anonymous.bundle_id = rp.resource_id AND
rp.resource_type_id = 1 AND <span style="color: #008022;font-weight: bold">-- Type   = Bundle</span>
rp.action_id = 0 AND        <span style="color: #008022;font-weight: bold">-- Action = Read</span>
rp.epersongroup_id = 0      <span style="color: #008022;font-weight: bold">-- Group  = Anonymous</span>
)
&lt; 1
) 

OR handle IN
(
<span style="color: #008022;font-weight: bold">-- ---------------------------------------
-- Select Bitstreams which are restricted
-- ---------------------------------------</span>
SELECT DISTINCT ON (handle) handle
FROM bitstream AS bit
INNER JOIN bundle2bitstream
ON bit.bitstream_id = bundle2bitstream.bitstream_id
INNER JOIN item2bundle
ON bundle2bitstream.bundle_id = item2bundle.bundle_id
INNER JOIN handle
ON item2bundle.item_id = handle.resource_id AND handle.resource_type_id = 2
WHERE
(
<span style="color: #008022;font-weight: bold">-- Count how many anonymous access policies exist for each bitstream</span>
SELECT count(*)
FROM bitstream AS bit2anonymous, resourcepolicy AS rp
WHERE
bit2anonymous.bitstream_id = bit.bitstream_id AND
bit2anonymous.bitstream_id = rp.resource_id AND
rp.resource_type_id = 0 AND <span style="color: #008022;font-weight: bold">-- Type   = Bitstream</span>
rp.action_id = 0 AND        <span style="color: #008022;font-weight: bold">-- Action = Read</span>
rp.epersongroup_id = 0      <span style="color: #008022;font-weight: bold">-- Group  = Anonymous</span>
)
&lt; 1

)
</pre>
<p style="height: 50px"></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/08/find-all-restricted-items-within-dspace/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preserving Character Encodings of a DSpace Metadata Export using MS Excel 2011 on OS X</title>
		<link>http://www.scottphillips.com/2011/07/character-encodings-dspace-excel-os-x/</link>
		<comments>http://www.scottphillips.com/2011/07/character-encodings-dspace-excel-os-x/#comments</comments>
		<pubDate>Wed, 20 Jul 2011 14:41:38 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[OAI-ORE]]></category>
		<category><![CDATA[OS X]]></category>
		<category><![CDATA[TDL]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://www.scottphillips.com/?p=977</guid>
		<description><![CDATA[The problem I recently ran into was updating the metadata for a particular collection that was being moved from TDL&#8217;s repository into A&#38;M&#8217;s repository. I was able to quickly move the collection into the new repository using OAI-PMH harvesting with ORE support. However, the metadata needed a bit of cleaning up for its new repository home, [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-996" src="http://www.scottphillips.com/files/2011/07/encodings.png" alt="Stencil Alphabet" width="200" height="266" /> The problem I recently ran into was updating the metadata for a particular collection that was being moved from TDL&#8217;s repository into <a href="http://repository.tamu.edu/handle/1969.1/94990">A&amp;M&#8217;s repository</a>. I was able to quickly move the collection into the new repository using <a href="http://repository.tamu.edu/handle/1969.1/86479">OAI-PMH harvesting with ORE support</a>. However, the metadata needed a bit of cleaning up for its new repository home, such as changing <code>dc.contributor.author</code> to <code>dc.author</code> and fixing inconsistent formats used in other fields. This is a perfect task for <a href="http://blog.stuartlewis.com/2010/02/10/dspace-1-6-what-will-be-in-it-for-me/">Stuart&#8217;s Bulk Metadata Export</a> tool. This <a href="http://www.dspace.org/">DSpace</a> feature allows an administrator to download a <a href="http://en.wikipedia.org/wiki/Comma-separated_values">Comma-Separated Values</a> (CSV) file of all the metadata in a particular collection, then open it up in MS Excel and edit the metadata naturally. Finally, once the metadata is ready to go, you can upload it back to the repository and all the fields will be updated correctly. It is a very nice feature that can save a ton of time.</p>
<h2>The Problem</h2>
<p>When I opened the file in Excel some of the characters were not showing up correctly, specifically characters in titles and names which used non-English marks; in this case they were all from the extended Latin character set. If you ignore these problems, then later when you upload the CSV file DSpace will pick up on these changes and the garbled characters will be introduced into the repository.</p>
<p><a href="http://www.scottphillips.com/files/2011/07/garbled.png"><img class="size-full wp-image-987" src="http://www.scottphillips.com/files/2011/07/garbled.png" alt="" width="500" /></a><br />
<span id="more-977"></span></p>
<p>DSpace uses <a href="http://en.wikipedia.org/wiki/UTF-8"><code>UTF-8</code></a> file encoding for everything, as do almost all well-behaved applications out there. Somewhere between downloading the file and opening it with Excel the encoding is being misinterpreted. Excel has a CSV import tool where you can specify the character encoding, but that tool does not work with DSpace&#8217;s export because the columns become misaligned if there are any new lines or carriage returns in the metadata.</p>
<h2>The Solution</h2>
<p>The solution to this problem turned out to be a command line tool I was not familiar with: &#8220;<a href="http://en.wikipedia.org/wiki/Iconv"><code>iconv</code></a>&#8221;. It&#8217;s apparently a standard Unix command that is available with the default install of OS X. After some experimentation I found that when you &#8220;<em>double click</em>&#8221; open a CSV file, Excel wants the file to be encoded using the default encoding found on Windows machines: <a href="http://en.wikipedia.org/wiki/Windows-1252"><code>Windows-1252</code></a>. This encoding is very similar to <a href="http://en.wikipedia.org/wiki/ISO/IEC_8859-1"><code>ISO-8859-1</code></a> but has some key differences. Use the <code>iconv</code> command to convert the file into this encoding and you should be able to open the file with Excel on OS X with just a double click. All the characters should be preserved.</p>
<pre>iconv -f UTF-8 -t WINDOWS-1252 original.csv &gt; excel.csv</pre>
<p>However, you can&#8217;t upload this encoding back to DSpace because it won&#8217;t know what to do with it. DSpace expects it to be in the standard <code>UTF-8</code> encoding. Once you are finished editing the file to your satisfaction in Excel you will need to convert it to <code>UTF-8</code>. When you save the file in Excel, two of the formats in the list shown below are valid options: &#8220;<code>Comma Separated Values (.csv)</code>&#8221; or &#8220;<code>Windows Comma Separated (.csv)</code>&#8221;. The key difference between these two formats is what character encoding Excel will use when saving the file.</p>
<p><a href="http://www.scottphillips.com/files/2011/07/excel.png"><img class="size-medium wp-image-978 aligncenter" src="http://www.scottphillips.com/files/2011/07/excel-300x242.png" alt="" width="300" height="242" /></a></p>
<p>If you went with the default Comma Separated Values format then you will need to run the following command to convert the file back to <code>UTF-8</code>.</p>
<pre>iconv -f MAC -t UTF-8 excel.csv &gt; upload.csv</pre>
<p>However if you chose the other valid option and saved the file as a &#8220;<code>Windows Comma Separated (.csv)</code>&#8221; then you will need to use the alternate command to convert the file back to <code>UTF-8</code>:</p>
<pre>iconv -f WINDOWS-1252 -t UTF-8 excel.csv &gt; upload.csv</pre>
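<p>If you would rather see what is happening to the bytes, here is a small Python sketch of the same round trip. It only mirrors the <code>iconv</code> conversions above on an in-memory string; the sample text and variable names are made up for illustration, and <code>cp1252</code> is Python&#8217;s name for Windows-1252.</p>

```python
# Sample metadata value with extended Latin characters (made up).
text = "Se\u00f1or M\u00fcller, caf\u00e9"

# DSpace exports UTF-8 bytes.
exported = text.encode("utf-8")

# Reading those bytes as Windows-1252 produces the garbled characters.
garbled = exported.decode("cp1252")

# Equivalent of: iconv -f UTF-8 -t WINDOWS-1252
for_excel = exported.decode("utf-8").encode("cp1252")

# Equivalent of: iconv -f WINDOWS-1252 -t UTF-8
for_upload = for_excel.decode("cp1252").encode("utf-8")
```

<p>After the round trip <code>for_upload</code> holds the same bytes as the original export, which is the property that matters before uploading back to DSpace.</p>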
<p>You are now ready to upload the resulting &#8220;<code>upload.csv</code>&#8221; file to DSpace or use the command line tool for bulk editing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/07/character-encodings-dspace-excel-os-x/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SvnBot 1.1 Released</title>
		<link>http://www.scottphillips.com/2011/02/svnbot-1-1/</link>
		<comments>http://www.scottphillips.com/2011/02/svnbot-1-1/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 22:29:31 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[IRC]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[SVN]]></category>
		<category><![CDATA[TDL]]></category>

		<guid isPermaLink="false">http://www.scottphillips.com/?p=691</guid>
		<description><![CDATA[The SvnBot is a simple single purpose IRC robot that monitors one or more SVN repositories. When changes are committed to a source repository the robot makes an announcement in an IRC channel. The purpose of the tool is to allow a team of developers to keep up to date on changes that other team [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.scottphillips.com/files/2011/01/Robot.jpg" alt="" width="200" height="240" class="alignright size-full wp-image-828" />The SvnBot is a simple single-purpose IRC robot that monitors one or more SVN repositories. When changes are committed to a source repository the robot makes an announcement in an IRC channel. The purpose of the tool is to allow a team of developers to keep up to date on changes that other team members are making. Here at <a href="http://www.tdl.org/">TDL</a> we have a geographically distributed team of software developers, some in Austin and Lubbock, along with myself in College Station. This is one tool that helps the team keep in sync with each other.</p>
<p>There are already many tools available that do this task [<a href="#ref1">1</a>][<a href="#ref2">2</a>][<a href="#ref3">3</a>], however they all require the use of SVN commit-hooks. <a href="http://svnbook.red-bean.com/en/1.1/ch05s02.html">Commit-hooks</a> are run on the repository’s server, allowing external tools to be notified when specific events occur. Using commit-hooks can work reasonably well <em>if</em> you have access to the server’s configuration, but that is not always the case. Instead of relying on commit-hooks, the SvnBot runs independently, periodically polling the repository for any updates. When an update is found a message will be announced in IRC.</p>
<p>Below is an example IRC message. The message includes the author and their commit message along with some brief statistics about the number of files affected and a URL. If multiple files were affected by a single commit then the URL reported is to the common path, i.e. the closest directory that contains all the affected files.</p>
<p style="margin-left: 30px"><span style="color: #b1009f">Scott</span>: &#8220;<strong>SAND-30, reviewed the pom file: removing unneeded dependencies, declaring output to be UTF-8 and added a few comments.</strong>&#8221; <span style="color: #7b7b7b">(Rev 666: 1 file modified)</span> <a href="https://texasdl.jira.com/svn/SAND/svnbot/trunk/pom.xml">https://texasdl.jira.com/svn/SAND/svnbot/trunk/pom.xml</a></p>
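<p>The &#8220;common path&#8221; rule can be sketched in a few lines of Python. This is only an illustration of the idea, not SvnBot&#8217;s actual Java code; the <code>common_directory</code> function and the second file path are hypothetical.</p>

```python
import posixpath


def common_directory(paths):
    """Return the path to report for a commit touching these files."""
    if len(paths) == 1:
        # A single affected file is reported directly.
        return paths[0]
    # Otherwise report the closest directory containing every file.
    return posixpath.commonpath(paths)


single = common_directory(["/svnbot/trunk/pom.xml"])
several = common_directory([
    "/svnbot/trunk/pom.xml",
    "/svnbot/trunk/src/Main.java",  # hypothetical second file
])
```

<p>Comparing whole path components rather than raw string prefixes avoids reporting a partial directory name when two sibling paths merely start with the same letters.</p>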
<p>Special thanks to the <a href="http://www.tdl.org/">Texas Digital Library</a>, my employer, for allowing me to release this as an open source project.<br />
<span id="more-691"></span></p>
<h2>Download</h2>
<ul>
<li><a href="http://sourceforge.net/projects/svnbot/files/svnbot-1.1-bin.zip/download">SvnBot 1.1 &#8211; Java Binary Distribution (ZIP)</a></li>
<li><a href="http://sourceforge.net/projects/svnbot/files/svnbot-1.1-src.zip/download">SvnBot 1.1 &#8211; Java Source Distribution (ZIP)</a></li>
<li><a href="https://svnbot.svn.sourceforge.net/svnroot/svnbot/">SVN Repository</a></li>
</ul>
<h2>How to use</h2>
<p>From the binary distribution, simply run the Java jar file with the &#8220;<code>-c</code>&#8221; parameter, passing the path to a configuration file. An example configuration file is provided with both the binary and source distributions.</p>
<pre style="margin-left: 30px">java -jar svnbot-1.1-jar-with-dependencies.jar -c path/to/config</pre>
<p>Since many will want to deploy SvnBot on a server and forget about it, you&#8217;ll probably want to run SvnBot under <a href="http://en.wikipedia.org/wiki/Nohup"><code>nohup</code></a>. The <code>nohup</code> command will allow the Java process to continue running even after your shell has disconnected (i.e. it ignores the &#8220;hang up&#8221; signal). Here&#8217;s an example for quick reference:</p>
<pre style="margin-left: 30px">nohup java -jar svnbot-1.1-jar-with-dependencies.jar -c path/to/config \
      &lt; /dev/null &gt;&gt; /dev/null 2&gt;&gt; /dev/null &amp;</pre>
<h2>Configuration</h2>
<p>The configuration file format is a mix of XML and standard Java properties similar to <a href="http://httpd.apache.org/">Apache’s httpd</a> configuration format. There are three types of configuration sections: <code><span style="color: #0b0b9b;font-weight: bold"> &lt;irc&gt;</span></code>, <code><span style="color: #0b0b9b;font-weight: bold"> &lt;log4j&gt;</span></code>, and multiple <code><span style="color: #0b0b9b;font-weight: bold"> &lt;repository&gt; </span></code> tags.</p>
<p style="padding-left: 30px"><code><span style="color: #0b0b9b;font-weight: bold"> &lt;irc&gt; </span></code>This section configures the IRC server and the channel where changes are announced.</p>
<p style="padding-left: 30px"><code><span style="color: #0b0b9b;font-weight: bold"> &lt;log4j&gt; </span></code>This section configures how logging messages from SvnBot will be handled. This uses standard log4j syntax.</p>
<p style="padding-left: 30px"><code><span style="color: #0b0b9b;font-weight: bold"> &lt;repository&gt; </span></code>This section configures an SVN repository to monitor: specify the URL, authentication, polling frequency, etc. The SVN bot is capable of monitoring multiple repositories; just add a <code><span style="color: #0b0b9b;font-weight: bold"> &lt;repository&gt; </span></code> definition for each repository.</p>
<p>Below is the example configuration file. Note that only the IRC <code>server</code> and <code>channel</code> and the repository <code>url</code> are required parameters. All other parameters are optional: they either cover special circumstances or have reasonable defaults.</p>
<pre style="color: #157015"># SvnBot Configuration
#
# java -jar svnbot-1.1-jar-with-dependencies.jar -c path/to/this/config
#
# Note you must use XML escaping such as: &amp;amp; &amp;lt; &amp;gt; &amp;quot; &amp;apos;

<span style="color: #0b0b9b;font-weight: bold">&lt;irc&gt;</span>
    # The hostname, port, and channel of the IRC server
    <span style="color: #993300"><strong>server</strong> = irc.freenode.net</span>
    <span style="color: #993300"><strong>channel</strong> = #svnbot</span>
    #<span style="color: #993300">port</span> = 6667

    # The nickname to use when communicating on IRC
    #<span style="color: #993300">nick</span> = SvnBot

    # Authentication credentials for the IRC Server
    #<span style="color: #993300">username</span> = your-irc-username
    #<span style="color: #993300">password</span> = your-irc-password

    # The notice message is sent to everyone in the channel when the bot
    # first starts, letting others know that the bot has joined the
    # channel.
    #<span style="color: #993300">notice</span> = I am an SVN Bot. I report commits to the ??? repository.
<span style="color: #0b0b9b;font-weight: bold">&lt;/irc&gt;</span>

# Optional logging configuration.
<span style="color: #0b0b9b;font-weight: bold">&lt;log4j&gt;</span>
    # This section uses Log4j configuration syntax. The example below
    # will log messages to /var/log/svnbot.log. However, if no log4j
    # section exists then all messages INFO and higher will be sent to
    # stdout.
    #
    # See <a href="http://logging.apache.org/log4j/1.2/manual.html">http://logging.apache.org/log4j/1.2/manual.html</a> for more
    # information

    # Log everything INFO and greater to a File
    #<span style="color: #993300">log4j.rootLogger</span> = INFO, FILE

    # Define the FILE's location
    #<span style="color: #993300">log4j.appender.FILE</span> = org.apache.log4j.FileAppender
    #<span style="color: #993300">log4j.appender.FILE.File</span> = /var/log/svnbot.log

    # Define the FILE's layout
    #<span style="color: #993300">log4j.appender.FILE.layout</span>=org.apache.log4j.PatternLayout
    #<span style="color: #993300">log4j.appender.FILE.layout.ConversionPattern</span> = \
              %d{ISO8601} [%t] %-5p %c %x - %m%n
<span style="color: #0b0b9b;font-weight: bold">&lt;/log4j&gt;</span>

# Repeatable repository configuration.
<span style="color: #0b0b9b;font-weight: bold">&lt;repository <span style="color: #993300">type="svn"</span>&gt;</span>
    # The URL of your SVN repository
    <span style="color: #993300"><strong>url</strong> = https://yourepository/</span>

    # Authentication credentials for the SVN Server
    #<span style="color: #993300">username</span> = your-svn-username
    #<span style="color: #993300">password</span> = your-svn-password

    # Targeted paths restrict the SvnBot's focus to only monitor changes
    # within the specified paths. There can be multiple paths separated
    # by commas, and each path must be anchored to the root of the
    # repository. Only commits that affect a file under one of these
    # paths will be reported in IRC.
    #<span style="color: #993300">targetPaths</span> = project1/, project2/subproject, ...

    # The amount of time between polling the repository. Periodically
    # SvnBot will check the repository to see if there have been any
    # new commits. The default interval is 30 seconds. However, if
    # your repository is under heavy load you should use a higher
    # interval such as 60 or 150 seconds.
    #<span style="color: #993300">pollinterval</span> = 30

    # Nicknames are used to translate the repository's usernames into
    # something more friendly. The username on the left side will be
    # replaced with the nickname from the right side when the commit
    # is announced in IRC.
    <span style="color: #0b0b9b;font-weight: bold">&lt;nicknames&gt;</span>
        <span style="color: #993300">william = Bill</span>
        <span style="color: #993300">charles = Chuck</span>
        <span style="color: #993300">biglong@emailaddress.com = Short &amp;amp; Sweet</span>
    <span style="color: #0b0b9b;font-weight: bold">&lt;/nicknames&gt;</span>
<span style="color: #0b0b9b;font-weight: bold">&lt;/repository&gt;</span>

# Repeat the repository section for each repository.
<span style="color: #0b0b9b;font-weight: bold">&lt;repository&gt;</span>
    ...
<span style="color: #0b0b9b;font-weight: bold">&lt;/repository&gt;</span></pre>
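<p>For readers wondering how the <code>targetPaths</code> and <code>&lt;nicknames&gt;</code> settings might be applied, here is a rough sketch. It is an assumption about the behavior described in the comments above, not SvnBot's actual code; the <code>CommitFilter</code> class and its methods are hypothetical.</p>

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the filtering described in the config comments;
// it is not SvnBot's actual implementation.
class CommitFilter {
    private final String[] targetPaths;
    private final Map<String, String> nicknames;

    CommitFilter(String targetPathsValue, Map<String, String> nicknames) {
        // "project1/, project2/subproject" becomes ["project1/", "project2/subproject"]
        String trimmed = targetPathsValue.trim();
        this.targetPaths = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s*,\\s*");
        this.nicknames = nicknames;
    }

    // A commit is reported if no targets are set, or if any changed file is under a target.
    boolean shouldReport(List<String> changedPaths) {
        if (targetPaths.length == 0) {
            return true;
        }
        for (String changed : changedPaths) {
            for (String target : targetPaths) {
                if (isUnder(changed, target)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Paths are compared relative to the repository root, ignoring leading and trailing slashes.
    private static boolean isUnder(String path, String target) {
        String p = strip(path);
        String t = strip(target);
        return p.equals(t) || p.startsWith(t + "/");
    }

    private static String strip(String s) {
        return s.replaceAll("^/+|/+$", "");
    }

    // Falls back to the repository username when no nickname is configured.
    String displayName(String username) {
        return nicknames.getOrDefault(username, username);
    }
}
```

<p>Comparing whole path segments (rather than raw string prefixes) keeps a target such as <code>project1/</code> from accidentally matching <code>project10/</code>.</p>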
<h2>How to compile</h2>
<p><strong>Prerequisites</strong>:</p>
<ul>
<li><a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Java 1.6</a> or higher</li>
<li><a href="http://maven.apache.org/">Apache Maven</a> 2.2 or higher</li>
</ul>
<p>This project uses Apache Maven as its build and distribution management system. To compile the project use the following maven command:</p>
<pre style="margin-left: 30px">mvn package</pre>
<p>Two binary files will be compiled into the “<code>/target</code>” directory. You will find “<code>svnbot-&lt;version&gt;-jar-with-dependencies.jar</code>” which is a runnable jar file. The other jar file included is “<code>svnbot-&lt;version&gt;.jar</code>” which just includes the svnbot class files without other dependencies for SVN or logging. This jar file is suitable for inclusion in other projects.</p>
<p>This project uses the <a href="http://www.jibble.org/pircbot.php">PircBot</a> library to handle communication with the IRC Server. Another important dependency is the <a href="http://svnkit.com/">SVNKit</a> library, a pure Java implementation of the SVN protocol.</p>
<h2>References</h2>
<p>[<a name="ref1">1</a>] &#8220;<a href="http://projects.bleah.co.uk/misc/wiki/SvnBot">SvnBot &#8211; Miscbits</a>&#8220;, A Python SVN Notification IRC bot, using commit-hooks.</p>
<p>[<a name="ref2">2</a>] &#8220;<a href="http://www.javalinux.it/wordpress/2009/10/15/writing-an-irc-bot-for-svn-commit-notification/">Writing an irc bot for svn commit notification</a>&#8220;, A simple Perl script to send notification of SVN changes to IRC, also using commit-hooks.</p>
<p>[<a name="ref3">3</a>] &#8220;<a href="https://github.com/RJ/irccat">IRCcat</a>&#8220;, A Java-based IRC bot enabling shell scripts to send messages to IRC through a local port. This can be combined with a commit-hook to announce changes in IRC.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/02/svnbot-1-1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Mixed XML and Property files?</title>
		<link>http://www.scottphillips.com/2011/02/mixed-xml-and-property-files/</link>
		<comments>http://www.scottphillips.com/2011/02/mixed-xml-and-property-files/#comments</comments>
		<pubDate>Tue, 01 Feb 2011 18:11:51 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Tips]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://scottphillips.com/?p=594</guid>
		<description><![CDATA[Have you ever wanted the simplicity of a plain old Java properties file but with just a little bit of grouping provided by XML? I’ve been working on a small side-project recently and it requires a simple configuration file of a dozen items or so. The project needed a repeatable set of configuration parameters, so [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://scottphillips.com/files/2011/01/Configuration.png" alt="" width="200" height="200" class="alignright size-full wp-image-651 no-style" style="margin: 0" />Have you ever wanted the simplicity of a plain old Java <a href="http://download.oracle.com/javase/6/docs/api/java/util/Properties.html">properties</a> file but with just a little bit of grouping provided by XML? I’ve been working on a small side-project recently and it requires a simple configuration file of a dozen items or so. The project needed a repeatable set of configuration parameters, so that it could connect to several SVN servers. Each connection needed a URL, username, password, and a few other ancillary properties. This is a pain to do in a plain old properties file. You have to do something with the naming of the properties to relate them together, such as:</p>
<pre>
property.1.url = http://...
property.1.username = Bob
property.1.password = Bob’s secret

property.2.url = http://...
property.2.username = Joe
property.2.password = Joe’s secret
</pre>
<p>This way works, but it’s sort of annoying and can be confusing for someone else trying to understand what’s going on. They would likely need to read the documentation, especially if it’s more complex with multiple types of repeating parameters. There are several alternatives. You could try encoding all the parameters into one property, but that’s even harder for a user to figure out. A slightly better alternative is to use something hierarchical like XML, thus:</p>
<pre>
&lt;properties&gt;
	&lt;repeatable&gt;
		&lt;url&gt;http://...&lt;/url&gt;
		&lt;username&gt;Bob&lt;/username&gt;
		&lt;password&gt;Bob’s secret&lt;/password&gt;
	&lt;/repeatable&gt;
	&lt;repeatable&gt;
		&lt;url&gt;http://...&lt;/url&gt;
		&lt;username&gt;Joe&lt;/username&gt;
		&lt;password&gt;Joe’s secret&lt;/password&gt;
	&lt;/repeatable&gt;
&lt;/properties&gt;
</pre>
<p>This is easier to understand, but it’s very verbose. Each property is labeled twice, once to open the tag and again to close the tag.  XML is good for complex things like HTML or specific file formats with a dedicated reader. However, XML is not great for humans to read, let alone edit quickly.</p>
<h2>A better solution, combine both!</h2>
<p>Instead of choosing either XML or a properties file, we can munge the two together to create something that is easier for users to manage.</p>
<pre>
property.one = value1
property.two = value2

&lt;repeatable&gt;
	url = http://...
	username = Bob
	password = Bob’s secret
&lt;/repeatable&gt;

&lt;repeatable&gt;
	url = http://...
	username = Joe
	password = Joe’s secret
&lt;/repeatable&gt;
</pre>
<p>The combined format is similar to Apache’s <a href="http://httpd.apache.org/docs/2.2/configuring.html">httpd configuration</a> format where name/value pairs are also mixed with nestable elements. It’s very close to the simplicity of a plain old properties file, but has just enough expressivity to handle grouping of elements. It’s a win-win.<br />
<span id="more-594"></span></p>
<h2>How to parse the mixed format?</h2>
<p>It’s easy, thanks to Apache’s <a href="http://commons.apache.org/configuration/">Commons Configuration</a> project! Here’s a simple class to demonstrate parsing the combined file format. The full class can be downloaded <a href="http://scottphillips.com/files/2011/01/ConfigurationExample.java">here</a>. Below I will walk through the important bits that make processing the config file easy:</p>
<pre>
<span style="color: #8D1D68;font-weight:bold">private final static</span> String <span style="color: #0000C0">PREPEND</span> = <span style="color: #0000C0">"&lt;?xml version=\"1.0\"?&gt;\n&lt;xml&gt;\n"</span>;
<span style="color: #8D1D68;font-weight:bold">private final static</span> String <span style="color: #0000C0">APPEND</span> = <span style="color: #0000C0">"\n&lt;/xml&gt;"</span>;
<div style="border-bottom: 1px dashed grey;width: 100%;margin: 2em 0 0 0;padding: 0"></div>

<a href="http://download.oracle.com/javase/6/docs/api/java/util/List.html">List</a>&lt;<a href="http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html">InputStream</a>&gt; inputStreams = <span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/util/ArrayList.html">ArrayList</a>&lt;<a href="http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html">InputStream</a>&gt;();
inputStreams.add(<span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayInputStream.html">ByteArrayInputStream</a>(<span style="color: #0000C0">PREPEND</span>.getBytes()));
inputStreams.add(<span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/io/FileInputStream.html">FileInputStream</a>(configFile));
inputStreams.add(<span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayInputStream.html">ByteArrayInputStream</a>(<span style="color: #0000C0">APPEND</span>.getBytes()));

<a href="http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html">InputStream</a> combinedInputStreams = <span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/io/SequenceInputStream.html">SequenceInputStream</a>(
	<a href="http://download.oracle.com/javase/6/docs/api/java/util/Collections.html">Collections</a>.<em>enumeration</em>(inputStreams));
</pre>
<p>This is a little trick that makes the configuration file a well-formed XML document. It prepends an XML declaration and a root-level element to the beginning of the document and closes the element at the end. All these parts are combined using a <code><a href="http://download.oracle.com/javase/6/docs/api/java/io/SequenceInputStream.html">SequenceInputStream</a></code>, which reads from the first input stream before moving on to the next in the list until all the input streams have been exhausted. Next we move on to parsing the configuration file.</p>
<pre>
<a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/XMLConfiguration.html">XMLConfiguration</a> xmlConfiguration = <span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/XMLConfiguration.html">XMLConfiguration</a>();
xmlConfiguration.setDelimiterParsingDisabled(<span style="color: #8D1D68;font-weight:bold">true</span>);
xmlConfiguration.load(combinedInputStreams);
</pre>
<p>Next, we simply read the configuration file as a normal XML-based configuration file using the standard Apache Commons library. One thing to note is that before reading in the configuration file it is a good idea to disable <code><a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/AbstractConfiguration.html#setDelimiterParsingDisabled(boolean)">DelimiterParsing</a></code>, which is enabled by default. Normally the configuration parser will split comma-separated values into a list, but since at this level each value is really a sub-properties file, that gets in the way if a comma appears anywhere in the document. Next we move on to reading the root-level properties.</p>
<pre>
String rootConfig = xmlConfiguration.getString(<span style="color: #0000C0">""</span>);
<a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/PropertiesConfiguration.html">PropertiesConfiguration</a> rootProperties = <span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/PropertiesConfiguration.html">PropertiesConfiguration</a>();
rootProperties.load(<span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayInputStream.html">ByteArrayInputStream</a>(rootConfig.getBytes()));
</pre>
<p>Here we take all the text that is at the root of the configuration file (<em>i.e., everything that is not inside a tag</em>) and parse that as a properties file. Similar to how we read in the XML portion, this time we just pass all the text into Apache’s <code><a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/PropertiesConfiguration.html">PropertiesConfiguration</a></code> object. From that object you can get all the properties, parsed as strings, numbers, lists, etc., by their names. After this point it’s just as simple as processing a properties file. Lastly, we move on to show how to process the repeatable sub-sections.</p>
<pre>
<a href="http://download.oracle.com/javase/6/docs/api/java/util/List.html">List</a>&lt;String&gt; repeatableConfigs = xmlConfiguration.getList(<span style="color: #0000C0">"repeatable"</span>);
<span style="color: #8D1D68;font-weight:bold">for</span> (String repeatableConfig : repeatableConfigs) {
	<a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/PropertiesConfiguration.html">PropertiesConfiguration</a> repeatableProperties =
		<span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/PropertiesConfiguration.html">PropertiesConfiguration</a>();
	repeatableProperties.load(
		<span style="color: #8D1D68;font-weight:bold">new</span> <a href="http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayInputStream.html">ByteArrayInputStream</a>(repeatableConfig.getBytes()));

	<span style="color: #3F7F5F">// Do something...</span>
}
</pre>
<p>Here we get all the nested elements named “<code>repeatable</code>” as a single list. From there we do the same thing as before: hand each one to the <code><a href="http://commons.apache.org/configuration/apidocs/org/apache/commons/configuration/PropertiesConfiguration.html">PropertiesConfiguration</a></code> object to parse the properties. It&#8217;s all pretty simple.</p>
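<p>If you would rather not pull in Commons Configuration, the same wrap-then-split idea can be sketched with only the JDK. Note that this swaps in the JDK's DOM parser and <code>java.util.Properties</code> for the Apache classes used above, so it is an alternative illustration rather than the code from the downloadable class; the <code>MixedConfig</code> name is made up for this example.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical JDK-only variant of the technique; not the original ConfigurationExample.
class MixedConfig {
    final Properties root = new Properties();
    final List<Properties> repeatables = new ArrayList<Properties>();

    static MixedConfig parse(InputStream config) {
        try {
            // Same trick as before: wrap the file in a root element so it is well-formed XML.
            List<InputStream> parts = new ArrayList<InputStream>();
            parts.add(new ByteArrayInputStream(
                    "<?xml version=\"1.0\"?>\n<xml>\n".getBytes(StandardCharsets.UTF_8)));
            parts.add(config);
            parts.add(new ByteArrayInputStream("\n</xml>".getBytes(StandardCharsets.UTF_8)));
            InputStream wrapped = new SequenceInputStream(Collections.enumeration(parts));

            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(wrapped);
            MixedConfig result = new MixedConfig();
            StringBuilder rootText = new StringBuilder();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                short type = child.getNodeType();
                if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                    // Text outside any tag belongs to the root-level properties.
                    rootText.append(child.getNodeValue());
                } else if (type == Node.ELEMENT_NODE && "repeatable".equals(child.getNodeName())) {
                    // Each <repeatable> body is parsed as its own small properties file.
                    Properties section = new Properties();
                    section.load(new StringReader(child.getTextContent()));
                    result.repeatables.add(section);
                }
            }
            result.root.load(new StringReader(rootText.toString()));
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse mixed configuration", e);
        }
    }
}
```

<p>One difference worth knowing: <code>java.util.Properties</code> has no list or numeric parsing of its own, so values come back as plain strings.</p>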
<h2>Thoughts</h2>
<p>This is an elegant solution that forges a compromise between the expressiveness of XML and the simplicity of a plain old Java properties file. This method is easy to implement and feels natural to any system administrator who is used to editing configuration files on a regular basis.</p>
<p>There is one gotcha though: since we process the document as if it were an XML document, all XML special characters must be escaped. So if you need to include &amp;, &lt;, or &gt; then you’ll need to escape them with “<code>&amp;amp;</code>”, “<code>&amp;lt;</code>”, or “<code>&amp;gt;</code>”, or else you&#8217;ll get parsing exceptions.</p>
<p><strong>Download the full class</strong>: <a href="http://scottphillips.com/files/2011/01/ConfigurationExample.java">ConfigurationExample.java</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/02/mixed-xml-and-property-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Buying an Expiring Domain Name</title>
		<link>http://www.scottphillips.com/2011/01/buying-an-expiring-domain-name/</link>
		<comments>http://www.scottphillips.com/2011/01/buying-an-expiring-domain-name/#comments</comments>
		<pubDate>Sun, 23 Jan 2011 20:52:22 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[Domains]]></category>

		<guid isPermaLink="false">http://www.scottphillips.com/?p=793</guid>
		<description><![CDATA[I recently bought a domain name after it expired and learned a lot about the process. I had been watching the name for many years, periodically doing whois lookups on it to check its status. Early last November I did a lookup on it and noticed that it had passed its expiration date back in [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.scottphillips.com/files/2011/01/TimeExpired.jpg" alt="" width="200" height="200" class="alignright size-full wp-image-825" />I recently <a href="http://www.scottphillips.com/2011/01/moving-domains-scottphillips-com/">bought a domain name</a> after it expired and learned a lot about the process. I had been watching the name for many years, periodically doing <a href="http://en.wikipedia.org/wiki/Whois">whois lookups</a> on it to check its status. Early last November I did a lookup and noticed that it had passed its expiration date back in October. I thought, awesome, I might get a chance to grab it! I started researching the domain expiration process [<a href="#ref1">1</a>][<a href="#ref2">2</a>][<a href="#ref3">3</a>] so that I wouldn&#8217;t make any mistakes. Ultimately I was able to buy the domain name, <a href="http://www.scottphillips.com/">www.scottphillips.com</a>, after several months and an auction. This post is about what I learned from the process; hopefully it will help someone else who&#8217;s looking into an expired domain name for themselves.</p>
<p>The domain expiration process is not always predictable and may take up to 80 days past the actual expiration date listed for the domain. While the domain is going through this process it will pass through several states before actually being deleted.  Then, once a domain is deleted from the system, it becomes available for anyone to register during the &#8220;drop period&#8221;. There are a handful of companies that specialize in catching these freshly deleted domain names during this period. For this post I will start by describing the life cycle of a .com domain name. Next I will discuss the services that specialize in catching dropped domain names. Finally, I will conclude with a few thoughts of my own about the process.<br />
<span id="more-793"></span></p>
<h2>The Life-Cycle of a .com Domain Name</h2>
<p>The protocol used by the <code>.com</code> top level domain is the Registry Registrar Protocol (RRP) [<a href="#ref4">4</a>][<a href="#ref5">5</a>] which is also used by the <code>.net</code> extension along with the country codes. The RRP describes the possible states and some policy for handling expiring domain names. Other top level extensions (such as <code>.biz</code>, <code>.info</code>, <code>.name</code>, etc.) use a newer protocol called Extensible Provisioning Protocol (EPP) [<a href="#ref6">6</a>]. Just a few days ago (January 2011) the owner of the <code>.com</code> extension, VeriSign, agreed to switch to the EPP protocol [<a href="#ref7">7</a>], but as far as I can tell there is no time frame for the migration. If you&#8217;re looking at a domain name with an extension that uses EPP, then this information may be incorrect.</p>
<p>The first question you probably have is when will the domain name be available? The process is not speedy and current domain owners are usually given a long grace period to renew their domain name. Even in the case where the current owner has no intention of renewing the domain name there typically is no way for them to tell their registrar that, it still has to go through the process.</p>
<p>After a domain name has reached its expiration date the current registrar is in the driver&#8217;s seat. Technically once the domain has expired it is their property. Luckily for the owner, registrars often allow the owner to renew the name during the grace periods. For all the reputable registrars the process takes 80 days: 45 days for Registrar Hold + 30 days for the Redemption Period + 5 days while Pending Deletion. However, some of the less scrupulous registrars will change the time periods to their advantage.</p>
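<p>The 45 + 30 + 5 day arithmetic can be worked through with a small sketch. The <code>DomainTimeline</code> class and the expiration date in <code>main</code> are made up for illustration, and the result is only the latest likely drop date, since a registrar may shorten the Registrar Hold period.</p>

```java
import java.time.LocalDate;

// Illustrative arithmetic only; actual timing depends on the registrar's policies.
class DomainTimeline {
    static final int REGISTRAR_HOLD_DAYS = 45;  // up to 45 days, at the registrar's discretion
    static final int REDEMPTION_DAYS = 30;
    static final int PENDING_DELETE_DAYS = 5;

    static LocalDate redemptionStart(LocalDate expiration) {
        return expiration.plusDays(REGISTRAR_HOLD_DAYS);
    }

    static LocalDate pendingDeleteStart(LocalDate expiration) {
        return redemptionStart(expiration).plusDays(REDEMPTION_DAYS);
    }

    // Expiration date plus the full 80 days.
    static LocalDate latestDropDate(LocalDate expiration) {
        return pendingDeleteStart(expiration).plusDays(PENDING_DELETE_DAYS);
    }

    public static void main(String[] args) {
        LocalDate expiration = LocalDate.of(2010, 10, 1); // made-up example date
        // Prints 2010-12-20
        System.out.println(latestDropDate(expiration));
    }
}
```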
<p><img src="http://www.scottphillips.com/files/2011/01/domain_chart.png" alt="" width="550" height="289" class="alignnone size-full wp-image-832 no-style" style="margin: 23px 0" /></p>
<p style="padding-left: 30px;text-indent: -30px"><span style="font-variant: small-caps;font-size: 1.25em;font-weight: bold">Active:</span><br />
This is the default status for a domain once it has been registered. The domain may be modified and will actively support a website, mail, or other services.  The owner holds the rights to the name and may renew the registration indefinitely or transfer the name to another registrar.</p>
<p style="padding-left: 30px;text-indent: -30px"><span style="font-variant: small-caps;font-size: 1.25em;font-weight: bold">Registrar Hold:</span><br />
After a domain has reached its expiration date it typically transitions into the “Registrar Hold” state at the discretion of the registrar. The registrar may keep the domain in this state for anywhere from 0 to 45 days. During this time the registrar is free to implement any number of policies as laid out in their individual terms of service [<a href="#ref8">8</a>][<a href="#ref9">9</a>][<a href="#ref10">10</a>][<a href="#ref11">11</a>][<a href="#ref12">12</a>]. Registrars have several options at this point, including keeping the domain for themselves, selling or auctioning it off to another person, or letting it expire. The only thing that ICANN requires is that, if they are going to let it expire, it must go into the Redemption Period. However, most reputable registrars will allow the original owner to renew the registration with no extra fees during this period.</p>
<p style="padding-left: 30px;text-indent: -30px"><span style="font-variant: small-caps;font-size: 1.25em;font-weight: bold">Redemption Period:</span><br />
After a domain has expired, and before it can be deleted, ICANN mandates that it must go through a 30-day redemption period. During this period the domain name will no longer be in the zone file, meaning the website, email, or other services will no longer work for the domain. The original owner may still renew the registration during this period; however, there is a penalty fee.</p>
<p style="padding-left: 30px;text-indent: -30px"><span style="font-variant: small-caps;font-size: 1.25em;font-weight: bold">Pending Deletion:</span><br />
After a domain has gone through the redemption period it is moved to the pending delete status. This status lasts 5 days, during which the domain may not be modified, nor can the original owner renew its registration. At the end of the 5 days the domain will be deleted between 11am and 2pm Pacific Time, a window typically called the “drop period”. Once it&#8217;s deleted it is available for anyone to register.</p>
<p style="padding-left: 30px">Once a domain has entered the pending delete status the original owner has lost all chances of renewing the registration. If they want to renew the domain name they will be on equal footing with everyone else attempting to register it during the drop period.</p>
<p>There are a few other states a domain may be in but they are all for abnormal situations such as legal action associated with the domain name. They are: <span style="font-variant: small-caps;font-weight: bold">Registry Lock</span>, <span style="font-variant: small-caps;font-weight: bold">Registrar Lock</span>, and <span style="font-variant: small-caps;font-weight: bold">Registry Hold</span>.</p>
<h2>Catching a Dropped Domain Name</h2>
<p>A fleet of companies have specialized in acquiring deleted domains during the drop period. These aftermarket companies have researched the precise timing and little tricks to get any advantage they can during the drop period. They’ve invested in their infrastructure to have the available bandwidth to repeatedly query the DNS system to see when a domain is available, but not so often that they get banned. Most of these companies work on a closed auction model. Potential buyers place minimum bids on the domain prior to its deletion. Then, if the company acquires the domain, it goes to auction, and only those who placed bids beforehand are able to participate.</p>
<p>Many of the individual registrars have formed exclusive partnerships to handle aftermarket auctions for their expired domain names. Instead of dropping the domain for anyone to grab, the registrar hands it over to the partner company, who will auction it off to the highest bidder.  Unfortunately it is hard to find out which registrars have exclusive partnerships [<a href="#ref13">13</a>][<a href="#ref14">14</a>], and they seem to change often [<a href="#ref15">15</a>][<a href="#ref16">16</a>].  I am not aware of any good rule you can use to find out which drop catching service you should use. Your best bet is to search the current registrar’s website to see if they explicitly state whether they have an exclusive arrangement with an aftermarket company or not. Sometimes it’s buried in their terms of service; other times they have made a press release announcing the partnership. However, if you can’t find this, then your best bet is to place initial bids with them all. That way you&#8217;ll still be in the game no matter which service catches the domain.</p>
<p><strong><a href="http://www.pool.com/">Pool.com</a></strong></p>
<p style="padding-left: 30px">Pool.com pioneered the current model where buyers only pay if they acquire the drop, without any upfront fees. Nowadays this is pretty much the standard across the board. The minimum bid price for pool.com is $60. Multiple people may place bids on the same name. The system will not tell you if anyone else is bidding against you until Pool.com acquires the drop. Then, if multiple people placed bids, the domain name goes to a closed eBay-style auction.</p>
<p><strong><a href="https://www.snapnames.com/">SnapNames</a></strong></p>
<p style="padding-left: 30px">SnapNames used to be the exclusive auction site for Network Solutions domains. The site is still going strong after that partnership ended. However, there has been a bit of scandal at the company where an employee was caught placing shill bids on domain auctions [<a href="#ref17">17</a>]. It looks like the company took appropriate action when it found out about the problem, refunding buyers&#8217; money. However, it is still a big black eye for the outfit, especially one in a business where trust and fairness are important.</p>
<p style="padding-left: 30px">The minimum bid with SnapNames is $79, and you do not need to pay unless you win the auction. In fact, you don’t even need to provide SnapNames with payment information until after you win. If SnapNames catches the drop then it goes into a 3-day auction between only those who placed a bid on the domain name prior to the drop. The auction is run as a typical eBay style auction where you can place multiple bids, set a maximum bid, etc. If no one else is interested in the domain then you’ll automatically win the auction at the minimum bid.</p>
<p style="padding-left: 30px">SnapNames caught the drop for my domain. The site will not tell you if anyone else placed bids on the domain name until after they acquire the drop. When that happened I found out that 9 other people also placed minimum bids on my domain! However the vast majority of those didn’t place a bid higher than their initial minimum bid. The bidding for my auction was obviously just between other Scott Phillips’s. I don’t believe the normal “<a href="http://en.wikipedia.org/wiki/Domain_name_speculation">domainers</a>” were involved in the auction.</p>
<p><strong><a href="http://www.namejet.com/">NameJet</a></strong></p>
<p style="padding-left: 30px">NameJet is the newest company in the market, formed when the partnership between Network Solutions and SnapNames ended. NameJet is the exclusive aftermarket auction site for Network Solutions, eNom, and Bulk Register [<a href="#ref18">18</a>]. If the domain you are interested in is currently registered with one of these registrars then NameJet is the only site you need to deal with.</p>
<p style="padding-left: 30px">The minimum bid price starts at $69 and, as usual, you do not have to pay unless you win the auction. They offer both public auctions, where anyone can place a bid, and private auctions, where only those who placed a bid before the drop can participate. I am not exactly sure how they decide which auction system is used, but I expect that the public auction is only used for the really popular domain names. Of course, if no one else is interested then you’ll automatically win the auction at the minimum bid.</p>
<p><strong><a href="http://www.godaddy.com/">GoDaddy</a></strong></p>
<p style="padding-left: 30px">GoDaddy’s website is truly amazing in how hard it is to navigate. I can’t possibly imagine creating a more confusing experience. With their system you first purchase a “DomainAlert Pro Backordering” slot for $20.99. This slot can be used again and again until a back order is successful. Once you’ve purchased that slot you can select which domain name you want to place a back order on. One thing to note is that they only allow one back order per domain name. So if, after purchasing the slot, you get a confusingly generic error when placing the back order, it may be that someone else already holds the one back order allowed. As your consolation prize you can re-use the slot on another domain name.</p>
<p style="padding-left: 30px">If GoDaddy catches the drop then you’ll receive the domain right away. Since they only allow one back order per domain there is no post-drop auction like the other services. However, it seems like GoDaddy’s system rarely gets the drop. Unless the domain name you’re interested in is currently registered with GoDaddy then I would pass on using them.</p>
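<p>To put the “bid with them all” advice in numbers, here is a rough sketch in Python of the worst case. It uses only the prices quoted above and assumes that nobody bids against you, that only one service can catch the drop, and that the $20.99 slot is the only GoDaddy charge. The function names are made up for illustration.</p>
<pre>
# Hypothetical worked example: the figures are the ones quoted in this post,
# stored as integer cents to avoid floating-point rounding.
MINIMUM_BIDS = {"Pool.com": 6000, "SnapNames": 7900, "NameJet": 6900}
GODADDY_SLOT_FEE = 2099  # paid up front whether or not the drop is caught


def worst_case_cost(minimum_bids, upfront_fee=0):
    """Return the most you could owe if no one else bids against you.

    Only one service can catch the drop, so only one minimum bid is ever
    charged; the up-front fee is owed regardless.
    """
    return upfront_fee + max(minimum_bids.values())


def as_dollars(cents):
    """Format integer cents as a dollar string."""
    return "${}.{:02d}".format(cents // 100, cents % 100)


print(as_dollars(worst_case_cost(MINIMUM_BIDS)))                    # $79.00
print(as_dollars(worst_case_cost(MINIMUM_BIDS, GODADDY_SLOT_FEE)))  # $99.99
</pre>
<p>Under those assumptions, covering every service costs no more than the single highest minimum bid plus the GoDaddy slot, because the pay-only-if-you-win services never charge for a drop they did not catch.</p>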
<h2>Final Thoughts</h2>
<p>The domain expiration process is still a bit like the Wild West. I think the basic problem is a conflict between the registrars and ICANN. The current system is tilted towards each individual registrar because they can create these exclusive partnerships with auction services. Undoubtedly they get a portion of the final sale price for each domain sent their way.</p>
<p>If ICANN were an effective organization it could set up a single waiting list or auction-style system that would be better for everyone involved. Until they fix the current anything-goes system, individual buyers are left open to potential fraud, lots of confusion, and an amazingly enormous waste of resources. Really, the best system that ICANN could figure out is one where people build big systems to continually ping the DNS system for the drop? Or where companies forge hidden back-room deals deciding where the domain goes? Why can&#8217;t the whole process be transparent and above-board, so that we as a society can better utilize the limited number of domains?</p>
<p>There have been some proposals for fixing the current system [<a href="#ref19">19</a>]. However they seem to be tilted towards benefiting “domainers” instead of the common consumer. I wish ICANN would stand up and create a system that would be fair for end-users to understand and participate in. Until that time we’re stuck with this crazy system.</p>
<p>If you’re looking to try and catch an expiring domain name, good luck. I hope this article helped you navigate the waters and if it did please drop me a note letting me know.</p>
<h2>References</h2>
<p>[<a name="ref1">1</a>] &#8220;<a href="http://www.mikeindustries.com/blog/archive/2005/03/how-to-snatch-an-expiring-domain">How to snatch an expiring domain</a>&#8221;, A great blog post by Mike Davidson back in 2005. This post was the inspiration for my post with updated information 5 years later.</p>
<p>[<a name="ref2">2</a>] &#8220;<a href="http://blog.auinteractive.com/how-to-lose-a-snap-names-domain-auction">How to Lose a Snap Names Domain Auction</a>&#8221;, A funny blog post about the auction experience.</p>
<p>[<a name="ref3">3</a>] &#8220;<a href="http://www.webmasterworld.com/forum25/3485.htm">So best way to buy expiring domain?</a>&#8221;, A forum thread about purchasing expired domain names.</p>
<p>[<a name="ref4">4</a>] &#8220;<a href="http://tools.ietf.org/html/rfc2832">NSI Registry Registrar Protocol (RRP) Version 1.1.0</a>&#8221;, The initial version of the RRP protocol.</p>
<p>[<a name="ref5">5</a>] &#8220;<a href="http://tools.ietf.org/html/rfc3632">VeriSign Registry Registrar Protocol (RRP) Version 2.0.0</a>&#8221;, A revised version of the RRP protocol.</p>
<p>[<a name="ref6">6</a>] &#8220;<a href="http://tools.ietf.org/html/rfc5730">Extensible Provisioning Protocol (EPP)</a>&#8221;, The current version of the EPP protocol.</p>
<p>[<a name="ref7">7</a>] &#8220;<a href="http://www.icann.org/en/tlds/agreements/verisign/appendix-07-06jan10.htm">.COM Agreement Appendix 7</a>&#8221;, The agreement between VeriSign and ICANN regarding the .com registry where VeriSign agrees to migrate to the EPP protocol.</p>
<p>[<a name="ref8">8</a>] &#8220;<a href="http://www.networksolutions.com/support/domain-deletion-policy/">Network Solutions: Domain Deletion Policy</a>&#8221;, The agreement says that they will probably give you a grace period but by no means have to.</p>
<p>[<a name="ref9">9</a>] &#8220;<a href="http://help.godaddy.com/article/608">GoDaddy: What is your process for handling expired domain names?</a>&#8221;, An FAQ about GoDaddy&#8217;s expiration process. Note that the agreement states all domains will be auctioned off by GoDaddy, skipping the ICANN redemption period.</p>
<p>[<a name="ref10">10</a>] &#8220;<a href="http://www.namecheap.com/legal/domains/registration-agreement.aspx">Name Cheap: Registration Agreement</a>&#8221;, The agreement says that they will give you at least 27 days to renew the domain after expiration, but after that it&#8217;s up to them what they want to do.</p>
<p>[<a name="ref11">11</a>] &#8220;<a href="http://faq.1and1.com/domains/other_issues_regarding_domains/2.html">1&amp;1: What is a redemption period?</a>&#8221;, As I read the FAQ from 1&amp;1 they state that they will skip the registrar-hold status and immediately go to the redemption period (which imposes a penalty fee).</p>
<p>[<a name="ref12">12</a>] &#8220;<a href="http://www.tucowsdomains.com/topic/renewal-and-expiration/">Tucows: What happens to domain names when they expire?</a>&#8221;, When a domain expires Tucows will give you the full 45 day grace period to renew, after which they may take the domain, auction it, or let it go into the redemption period before deleting.</p>
<p>[<a name="ref13">13</a>] &#8220;<a href="http://www.buyexpiringdomains.com/Partner-Domains.html">Partnered Domain Names Understanding them</a>&#8221;, a web page describing the partnerships between registrars and aftermarket auction services.</p>
<p>[<a name="ref14">14</a>] &#8220;<a href="http://www.cvul.com/domain-industry/which-drop-catch-service-to-use-for-expiring-domain-names/">Which Drop Catch Service to Use for Expiring Domain Names</a>&#8221;, An extensive list of registrar and aftermarket auction partnerships. I am not aware of how accurate this information is or when it was last updated.</p>
<p>[<a name="ref15">15</a>] &#8220;<a href="http://www.thewhir.com/web-hosting-news/060508_SnapNames_Register_com_Renew_Deal">SnapNames, Register.com Renew Deal</a>&#8221;, Press release announcing the renewal of the exclusive partnership between SnapNames and Register.com.</p>
<p>[<a name="ref16">16</a>] &#8220;<a href="http://domainnamewire.com/2009/02/09/tucows-inks-expired-domains-deal-with-namejet/">Tucows Inks Expired Domains Deal with NameJet</a>&#8221;, an article about the new partnership between Tucows and NameJet.</p>
<p>[<a name="ref17">17</a>] &#8220;<a href="http://www.oregonlive.com/business/index.ssf/2010/05/snapnames_sues_former_vice_pre.html">SnapNames sues former vice president for $33 million, alleging he rigged website auctions</a>&#8221;, an article about the scandal surrounding SnapNames&#8217;s employee placing shill bids.</p>
<p>[<a name="ref18">18</a>] &#8220;<a href="http://www.namejet.com/Pages/About.aspx">About NameJet</a>&#8221;, this page lists the exclusive partnerships for NameJet. It&#8217;s the only one that lists them by name.</p>
<p>[<a name="ref19">19</a>] &#8220;<a href="http://www.icann.org/en/meetings/lisbon/presentation-tutorial-expiring-25mar07.pdf">ICANN Tutorial: Changes in the expiry process</a>&#8221;, Slides from a presentation given by the CEO of pool.com at a 2007 ICANN meeting on potential changes to the expiration process.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/01/buying-an-expiring-domain-name/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Moving to a new domain: scottphillips.com</title>
		<link>http://www.scottphillips.com/2011/01/moving-domains-scottphillips-com/</link>
		<comments>http://www.scottphillips.com/2011/01/moving-domains-scottphillips-com/#comments</comments>
		<pubDate>Mon, 10 Jan 2011 01:17:08 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[Domains]]></category>

		<guid isPermaLink="false">http://scottphillips.com/?p=660</guid>
		<description><![CDATA[The blog is moving to a new domain: scottphillips.com! The previous owner let the domain expire and I had a chance to snap it up. I learned a lot about how the domain expiration process works, I’ll write up more about that on my next post. For now I wanted to let everyone know about [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://scottphillips.com/files/2011/01/dotCom.png" alt="" title="dotCom" width="200" height="150" class="alignright size-full wp-image-661" />The blog is moving to a new domain: <a href="http://scottphillips.com/">scottphillips.com</a>! The previous owner let the domain expire and I had a chance to snap it up. I learned a lot about how the domain expiration process works; I’ll write up more about that in <a href="http://www.scottphillips.com/2011/01/buying-an-expiring-domain-name/">my next post</a>. For now I wanted to let everyone know about the new domain. The current <a href="http://scott.phillips.name/">scott.phillips.name</a> domain will last for the foreseeable future, at least a few years. But if you have this blog in your RSS reader you should switch the feed’s URL now.</p>
<p>Way back in 1996 when I was in high school I first had the idea to register my name as a domain name. I looked it up at the time with a whois search and it was available. But at that time domains were much more expensive, on the order of $30 a year. I’m not sure if there were cheaper alternatives available at the time or not, but I wasn’t aware of any. That’s a fairly high price for a high school student with no job to register a domain name.<img src="http://scottphillips.com/files/2011/01/sadFace.png" alt="" title="Sad Face" width="75" height="75" class="alignleft size-full wp-image-669 no-style" style="margin: 2px;" /> I thought about it for a while and figured it would be worth it, even at the price. But by that time, just a few months later, someone else in Australia had picked up the domain. Presumably it was another Scott Phillips… </p>
<p><span id="more-660"></span>Since then I&#8217;ve checked on the domain every so often. It’s never had a website hosted on it; the guy was just sitting on it. A few years later, when I was out of high school and in college with a job, I sent an email to the guy offering to buy it for $300, and he declined. I’ve always imagined him laughing at me while writing the reply. I’ll never know.</p>
<p>When I got around to putting up my blog I needed a domain name, and most of the Scott Phillips related names were already taken, like the ones with dashes or alternative extensions. However, there was one available on the relatively new “<a href="http://en.wikipedia.org/wiki/.name"><code>.name</code></a>” registry. This registry is different because it sells third level names, so the registry retains ownership of “<code>phillips.name</code>” and sells “<code>scott.phillips.name</code>”. This actually seemed like a cool idea: it allows a lot more people to share a domain name, separated by something that seemed natural. </p>
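<p>To make the third level idea concrete, here is a small Python sketch that splits such a name into the shared part and the part an individual registers. The function name and the returned keys are invented for this example, and it only handles the simple three-label form described above.</p>
<pre>
def split_name_domain(domain):
    """Split a third-level .name domain into its shared and personal parts."""
    labels = domain.lower().split(".")
    if len(labels) != 3 or labels[-1] != "name":
        raise ValueError("expected a name of the form first.last.name")
    personal, family, tld = labels
    return {
        "shared_second_level": family + "." + tld,  # retained, not sold to individuals
        "registered_third_level": domain.lower(),   # what the individual registers
        "personal_label": personal,
    }


print(split_name_domain("scott.phillips.name"))
</pre>
<p>Every other Phillips gets the same shared second level and differs only in the personal label, which is what lets many people share one family name.</p>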
<p>The third level concept hasn’t caught on (yet?). You will have to search high and low for a respected registrar that will even let you register one; the big ones (<a href="http://www.networksolutions.com/">Network Solutions</a>, <a href="http://www.namecheap.com/">Namecheap</a>, <a href="http://www.godaddy.com/">Go Daddy</a>, etc.) don’t support it. Also, I’ve been wary of using the domain for my primary email address because I fear that someday in the future I won’t be able to renew the registration on the domain. That’s why I jumped at the chance to get the <code>.com</code> version!<br />
<br/><br />
<br/><br />
<em>P.S.</em> If you are another Scott Phillips reading this and wanted the domain name as well, unfortunately there’s no good way to share domain names. Sorry.<br />
<br/><br />
<br/></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2011/01/moving-domains-scottphillips-com/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OS X Terminal slow to bring up a prompt?</title>
		<link>http://www.scottphillips.com/2010/12/os-x-terminal-slow-to-bring-up-a-prompt/</link>
		<comments>http://www.scottphillips.com/2010/12/os-x-terminal-slow-to-bring-up-a-prompt/#comments</comments>
		<pubDate>Wed, 22 Dec 2010 21:18:13 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[OS X]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=558</guid>
		<description><![CDATA[Today I was in the middle of some tasks and needed a bash shell. Naturally, I fired up the Terminal.app from OS X and waited&#8230;.. and waited&#8230;.  The window popped up but the bash prompt didn&#8217;t come up, it just stayed there with one character blinking at me. It took forever for the prompt to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://scott.phillips.name/files/2010/12/TerminalApp-Icon.png"><img class="no-style alignright size-full wp-image-565" src="http://scott.phillips.name/files/2010/12/TerminalApp-Icon.png" alt="" width="200" height="178" /></a>Today I was in the middle of some tasks and needed a bash shell. Naturally, I fired up the Terminal.app from OS X and waited&#8230;.. and waited&#8230;.  The window popped up but the bash prompt didn&#8217;t come up; it just stayed there with one character blinking at me. It took forever for the prompt to appear. It seemed like a few minutes but was probably more like 20 seconds. You know how time scales depending on how busy you are. Fed up with that, I googled [<a href="#ref1">1</a>][<a href="#ref2">2</a>][<a href="#ref3">3</a>] around to see if anyone else has had similar problems.</p>
<p>It turns out there&#8217;s a well known solution to the problem:</p>
<pre>
sudo rm -f /private/var/log/asl/*.asl
</pre>
<p>The speculation is that a buildup of log files causes the Terminal app to get slower and slower when opening a new prompt. After deleting the log files and opening a new terminal window the prompt appears immediately. I hadn&#8217;t noticed this behavior before Snow Leopard, so it looks like Apple forgot to clear something out on a regular basis or added a new check. In either case, it&#8217;s annoying.</p>
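<p>If you want to see how much has piled up before deleting anything, a short Python sketch like the one below can count the log files and add up their size. The directory is the one from the command above; the function name is made up for this example, and you may need to run it with <code>sudo</code> for the directory to be readable.</p>
<pre>
from pathlib import Path


def summarize_logs(directory, pattern="*.asl"):
    """Return (file_count, total_bytes) for files matching pattern."""
    files = [p for p in Path(directory).glob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # Same directory and file pattern as the rm command in this post.
    count, total = summarize_logs("/private/var/log/asl")
    print("{} log files, {:.1f} MB".format(count, total / 1e6))
</pre>
<p>Running it before and after the cleanup is a quick way to check whether the log buildup really lines up with the slow prompt on your machine.</p>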
<p><span id="more-558"></span></p>
<h3>References</h3>
<p>[<a name="ref1">1</a>] &#8220;OSX Terminal Slow To Launch&#8221;, <a href="http://www.proposedsolution.com/solutions/osx-terminal-slow-launch/">http://www.proposedsolution.com/solutions/osx-terminal-slow-launch/</a></p>
<p>[<a name="ref2">2</a>] &#8220;Speed up a slow Terminal by clearing log files&#8221;, <a href="http://osxdaily.com/2010/05/06/speed-up-a-slow-terminal-by-clearing-log-files/">http://osxdaily.com/2010/05/06/speed-up-a-slow-terminal-by-clearing-log-files/</a></p>
<p>[<a name="ref3">3</a>] Apple Support Discussion on the problem, <a href="http://discussions.apple.com/thread.jspa?threadID=2178316">http://discussions.apple.com/thread.jspa?threadID=2178316</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/12/os-x-terminal-slow-to-bring-up-a-prompt/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>WordCamp Austin, 2010</title>
		<link>http://www.scottphillips.com/2010/12/wordcamp-austin-2010/</link>
		<comments>http://www.scottphillips.com/2010/12/wordcamp-austin-2010/#comments</comments>
		<pubDate>Fri, 10 Dec 2010 02:12:19 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Reports]]></category>
		<category><![CDATA[Austin]]></category>
		<category><![CDATA[Blogging]]></category>
		<category><![CDATA[Mood Cards]]></category>
		<category><![CDATA[Smush It]]></category>
		<category><![CDATA[WordCamp]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=528</guid>
		<description><![CDATA[I was able to attend WordCamp Austin this past weekend. It was the first time I’d been to a WordPress conference so I wasn&#8217;t exactly sure what to expect. It turned out to be small but very well run conference with several interesting topics.  For this post I am going to review my notes from [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-547 no-style" src="http://scott.phillips.name/files/2010/12/WordCampAustinLogo.png" alt="" width="170" height="170" />I was able to attend <a href="http://austinwordcamp.org/">WordCamp Austin</a> this past weekend. It was the first time I’d been to a <a href="http://wordpress.org/">WordPress</a> conference so I wasn&#8217;t exactly sure what to expect. It turned out to be small but very well run conference with several interesting topics.  For this post I am going to review my notes from the conference and highlight the items or topics that I thought were interesting.</p>
<p><a href="http://central.wordcamp.org/">WordCamps</a> are typically small 1-day conferences held all over the world, focusing on WordPress related topics. On any given weekend there are likely several conferences being held. The big problem is that the camps almost always sell out before the first speaker is even scheduled. So you either have to be &#8216;in the know&#8217; about which WordCamps are the good ones or take the luck of the draw as to whether the camp will be right for you. I was afraid that the conference would be too fluffy (i.e. topics like how to build a community, or just using WordPress) instead of technical. It turned out that most of the presentations were at least mildly technical, with very few covering just how to use WordPress. It seemed to be a direct match with the audience. I&#8217;d say that if you feel you&#8217;re a programmer you&#8217;re probably not going to get the most out of these conferences. Technical concepts are not covered in as much depth as you would expect from other communities.</p>
<p>With that said, I&#8217;m a <em>newb</em> to the WordPress community. I&#8217;ve built a few themes for myself (i.e. this blog) and a few others. So take what&#8217;s said here with a grain of salt from an outsider&#8217;s perspective.</p>
<p><span id="more-528"></span></p>
<h3>Mood Cards</h3>
<p><strong>Presentation</strong>: <a href="http://www.byjohnchandler.com/2010/12/04/before-the-famous-five-minute-install/">Before the Famous Five Minute Install</a></p>
<p><strong>Author</strong>: <a href="http://www.byjohnchandler.com/">John Chandler</a></p>
<p>John&#8217;s presentation covered what to do when building a new website &#8212; the stuff you should do before you start coding or designing. He focused on techniques to elicit what the customer is looking for, along with typical work flows. The basics of his topic were the straightforward waterfall design model. However, one of the items he talked about was really interesting: Mood Cards. These are simply a set of different web page styles to get the customer thinking about what they are looking for. The idea is that you either print these out and have your customer look through them in person, or send them a website with them all listed. With these the customer will have a starting point to say things like, a little bit of &#8220;<em>this</em>&#8221; and &#8220;<em>that</em>&#8221; &#8211; but I don&#8217;t like &#8220;<em>this other one</em>&#8221;. In short it helps start the conversation about designing a website.</p>
<p>Mood Cards online: <a href="http://moodcards.lyricalmedia.com/">http://moodcards.lyricalmedia.com/</a></p>
<h3>Multiple Column Posts</h3>
<p><strong>Presentation</strong>: <a href="http://www.billerickson.net/wordcamp-austin-wordpress-beyond-blogging/">WordPress Beyond Blogging</a></p>
<p><strong>Author</strong>: <a href="http://www.billerickson.net/">Bill Erickson</a></p>
<p>Bill&#8217;s presentation covered several topic areas focusing on how to use WordPress as a plain old CMS &#8211; more than just blogging. That seems to be the mantra going around the community recently. While he covered many topics, the one that I thought was most useful was his tip to use the <code>&lt;h5&gt;</code> tag to separate out multiple columns. The idea is that, instead of using short codes or embedding HTML in the post, you use the <code>&lt;h5&gt;</code> header tag to mark where multiple columns should be displayed. A filter would look for these headings and add the appropriate HTML to separate out the content into multiple columns. The reason why this approach is better than short codes is that it works seamlessly with the <code>WYSIWYG</code> editor. It&#8217;s also hard for the end-user editing their content to mess it up.</p>
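<p>Here is a minimal sketch of the grouping step such a filter would perform, written in Python rather than as a real WordPress filter. It assumes the post has already been parsed into (tag, text) pairs and that each <code>h5</code> heading opens a new column and is dropped from the output; both of those are my assumptions, not details from the talk, and the function name is made up.</p>
<pre>
def split_into_columns(blocks, marker="h5"):
    """Group (tag, text) blocks into an intro plus one column per marker heading.

    Blocks before the first marker heading stay in the intro; each marker
    heading starts a new column and is itself dropped from the output.
    """
    intro, columns = [], []
    for tag, text in blocks:
        if tag == marker:
            columns.append([])          # a heading opens a new column
        elif columns:
            columns[-1].append((tag, text))
        else:
            intro.append((tag, text))
    return intro, columns


post = [
    ("p", "Intro paragraph"),
    ("h5", "Column one"),
    ("p", "Left text"),
    ("h5", "Column two"),
    ("p", "Right text"),
    ("p", "More right text"),
]
intro, columns = split_into_columns(post)
print(len(columns))  # 2
</pre>
<p>A real filter would then wrap each group in whatever column markup the theme uses. The appeal is that the author only ever types an ordinary heading in the editor.</p>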
<h3>PODS is Crazy!</h3>
<p><strong>Presentation</strong>: <a href="http://austinwordcamp.org/2010/wordcamp-atx-nick-batik/"><em>The Sky’s the Limit – Migrating Static Sites using PODS</em></a></p>
<p><strong>Author</strong>: <a href="https://pleiadesservices.com/">Nick Batik</a></p>
<p>The next presentation was about a system called <a href="http://podscms.org/">PODS</a>. Nick presented this system as a very powerful plugin to build WordPress websites from sets of structurally defined data. The basic concept is that you&#8217;d import your data into a table, then configure WordPress to pull that information out to create pages/posts/comments/whatever using the data. He demonstrated a website he was in the process of building that pulled in static content from a book to publish it online.</p>
<p>The problem with PODS is that it has all the negatives of dynamic content (slow, complex, etc.) without any of the benefits of static content (fast, simple, easy to understand). The PODS plugin provides an administrative interface so horribly complex that any end-user looking at it would give up after a few minutes. The user interface expects the end-user to provide embedded PHP code, know concepts about <a href="http://codex.wordpress.org/The_Loop">the loop</a>, and other <a href="http://codex.wordpress.org/Template_Hierarchy">WordPress-isms</a>. In short it&#8217;s basically unusable. WordPress isn&#8217;t an end-all tool for every task. It&#8217;s a great platform to build your typical website with a great (probably the best) interface to allow end users to edit the content on their website. But PODS tries to extend WordPress beyond this use case and utterly fails. Perhaps there are some very small use-cases where PODS would be useful, such as a WordPress site solely administered by a PHP developer. But really, at that point you should step back and ask yourself why. Why not just have that PHP developer create some one-time scripts to convert the static content into static HTML? That would end up being much simpler to maintain over time, and probably easier than spending any amount of time figuring out the crazily complex PODS system.</p>
<h3>Permalinks <span style="text-decoration: line-through">should</span> must start with numbers</h3>
<p><strong>Presentation</strong>: <a href="http://www.slideshare.net/jaredatch/common-wordpress-mistakes-and-more-wordcamp-austin-2010"><em>Common WordPress Mistakes</em></a></p>
<p><strong>Author</strong>: <a href="http://jaredatchison.com/">Jared Atchison</a></p>
<p>Jared&#8217;s presentation about common mistakes people make with WordPress included one that I hadn’t known about. Apparently, when trying to find the correct page, post, or whatever for a particular URL, WordPress tries to optimize the search. If the first part of the path, the stuff before the first slash, is numeric then it will narrow down the search to just those posts that start with the number. If it&#8217;s something else then WordPress can&#8217;t do this optimization and has to check every type of object to see if it fits. This seems like an easy gotcha that could quite easily slow down your WordPress site. It also explains why I can&#8217;t create a page name that consists of nothing but a number.</p>
<p>If you stick with the permalink options provided you&#8217;ll be fine, they all start with a number.</p>
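<p>As I understood the talk, the lookup behaves roughly like the Python sketch below. This only models the behavior as it was described; it is not WordPress&#8217;s actual code, and the list of object types and the function name are illustrative guesses.</p>
<pre>
# Illustrative list only; the real set of object types is an assumption.
ALL_OBJECT_TYPES = ["post", "page", "category", "tag", "attachment"]


def candidate_types(path):
    """Return the object types a lookup would need to check for a URL path."""
    first_segment = path.strip("/").split("/")[0]
    if first_segment.isdigit():
        return ["post"]            # numeric prefix: narrow the search
    return list(ALL_OBJECT_TYPES)  # otherwise every type must be checked


print(candidate_types("/2010/12/wordcamp-austin-2010/"))  # ['post']
print(candidate_types("/about/"))
</pre>
<p>The point is just that a numeric first segment lets the lookup rule out most of the possibilities up front, while anything else falls through to the slow path.</p>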
<h3>Custom Post Types</h3>
<p><strong>Presentation</strong>: <a href="http://bit.ly/cptpresentation">Custom Post Types</a></p>
<p><strong>Author</strong>: <a href="http://wptheming.com/">Devin Price</a></p>
<p>Devin covered the new features from WordPress 3.0: <a href="http://codex.wordpress.org/Custom_Post_Types">Custom Post Types</a>. These are a great step forward for the WordPress community and bring the platform up to par with <a href="http://drupal.org/">Drupal</a>, or <a href="http://www.joomla.org/">Joomla</a>. They have been needed for a while. Devin went into the concepts and explained several use-cases. I haven&#8217;t had a chance to use them but I&#8217;d like to add them to my other personal blog that I run.</p>
<h3>Stephanie&#8217;s Static HTML Importer</h3>
<p><strong>Presentation</strong>: <a href="http://www.slideshare.net/stephanieleary/importing-migrating"><em>Content Importing</em></a></p>
<p><strong>Author</strong>: <a href="http://sillybean.net/">Stephanie Leary</a></p>
<p>Stephanie&#8217;s presentation about importing content into WordPress was light on details. But it’s nice to know that there are lots of tools to aid in migrating content into WordPress. The one that she highlighted more than any other was her very own <a href="http://wordpress.org/extend/plugins/import-html-pages/">Static HTML Importer plugin</a>. It definitely looked like a great plugin to use if you have a static HTML site that needs to be imported into WordPress. One interesting thing to note is that Stephanie works at <a href="http://www.tamu.edu/">A&amp;M</a>, in the <a href="http://writingcenter.tamu.edu/">University Writing Center</a>. It is a bit weird to drive to Austin to hear a presentation by someone so close. In fact somewhere around half of the presenters at WordCamp Austin are living, or within the last year were living, in the Bryan/College Station area. It seems to me that it really should have been WordCamp AggieLand.</p>
<h3>Optimizing WordPress</h3>
<p><strong>Presentation</strong>: <a href="http://austinwordcamp.org/2010/wordcamp-atx-jason-cohen/">Optimizing WordPress</a></p>
<p><strong>Author</strong>: <a href="http://wpengine.com/">Jason Cohen</a></p>
<p>Jason&#8217;s presentation was by far the best presentation that day. I wish some of the other presentations were a bit shorter so that Jason could have had more time. He covered caching and optimization topics; I suggest checking out this presentation when the video is posted. (I&#8217;ll post an update when that happens) Specifically he went through the options of the <a href="http://wordpress.org/extend/plugins/w3-total-cache/">W3 Total Cache</a> plugin, when to use them and when not to use them. One thing I learned is that the object cache is almost always a loss; you&#8217;re better off caching at the database and HTML levels. It just doesn&#8217;t take PHP much time to recreate an object from the database, so all you&#8217;re doing is wasting memory. He also cautioned about using disk caches if your disk speed is slow, like if you&#8217;re in a hosted cloud environment like <a href="http://www.rackspacecloud.com/index.php">RackSpace</a> or <a href="http://aws.amazon.com/ec2/">Amazon&#8217;s EC2</a>. In some circumstances you&#8217;re better off skipping the disk-based cache.</p>
<p>Another topic that I will highlight here is &#8220;<a href="http://www.smushit.com/ysmush.it/">Smush It</a>&#8221; from Yahoo. The tool works as a plugin to Firefox and when run will take a webpage and compress the images to the smallest possible size without affecting how they look. Jason did a demo of using the tool on <a href="http://www.cnn.com/">CNN</a>&#8217;s webpage and it was able to save 25% of the image bandwidth. That&#8217;s a pretty huge effect. I ran it on one of my websites and saw around a 10% savings. In either case, it&#8217;s a pretty nifty and easy to use tool.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/12/wordcamp-austin-2010/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Buying a domain from HugeDomains.com</title>
		<link>http://www.scottphillips.com/2010/11/buying-a-domain-from-hugedomains-com/</link>
		<comments>http://www.scottphillips.com/2010/11/buying-a-domain-from-hugedomains-com/#comments</comments>
		<pubDate>Sun, 28 Nov 2010 18:34:23 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[Domains]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=509</guid>
		<description><![CDATA[I recently found my self in a position of negotiating the purchase of a domain from a domain reseller (or “domainers” as they are commonly referred to as [1]). It is one of those outfits that buys up domains in bulk to resell them at a huge markup. Before I contacted HugeDomains I googled around [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-521" src="http://scott.phillips.name/files/2010/11/sale.png" alt="Sale sign" width="200" height="200" />I recently found myself negotiating the purchase of a domain from a domain reseller (or “domainers”, as they are commonly called [<a href="#ref1">1</a>]). It is one of those outfits that buys up domains in bulk to resell them at a huge markup. Before I contacted HugeDomains I googled around [<a href="#ref2">2</a>] [<a href="#ref3">3</a>] and couldn’t find much in the way of specifics on how to negotiate with them. I wasn&#8217;t even sure if they negotiate at all for their domains. I am writing this article about my experience buying a domain from HugeDomains.com in the hope that it will help someone else buying a domain from them. My purpose is to try and document as much information as I can about the transaction.</p>
<h2>Background</h2>
<p>First a bit of background about <a href="http://hugedomains.com/">HugeDomains.com</a>. They are a relatively new outfit which started getting into the domain reselling business around 2006. The company behind HugeDomains.com is TurnCommerce.com operated by Andrew Reberry[<a href="#ref4">4</a>]. Andrew&#8217;s company is also affiliated with the domain registrar NameBright.com[<a href="#ref5">5</a>]. All of these companies are located in Denver, Colorado. The company TurnCommerce has an <a href="http://www.bbb.org/denver/business-reviews/internet-shopping/turncommerce-in-denver-co-90006713">A+ rating with the BBB</a> (as of November 2010). While you can disagree with their business model, they seem to be a reputable business. I can state that my interactions with the company were entirely professional, and once a deal was reached the transaction was processed quickly. We had the domain in our possession, transferred to my preferred registrar within 3 days.</p>
<p><span id="more-509"></span></p>
<h2>The Value of a Name</h2>
<p>Figuring out the value of domain names is tricky. There are lots of factors which can influence the value of any particular domain name. Here are a few general ones:</p>
<p style="margin-left: 60px; text-indent: -30px;"><strong>1) </strong><strong>How short is the name? </strong>People like shorter names because they are easier to remember and quicker to type.</p>
<p style="margin-left: 60px; text-indent: -30px;"><strong>2) <em>What is the potential business value</em></strong><em>?</em> If the name is obviously linked to a product or service that people are looking for then its value is higher.</p>
<p style="margin-left: 60px; text-indent: -30px;"><strong>3) <em>What extension is it</em>?</strong> For whatever reason everyone wants a &#8220;<code>.com</code>&#8221; domain name. All other extensions are worth a lot less.</p>
<p style="margin-left: 60px; text-indent: -30px;"><strong>4) <em>Is the name brandable</em>?</strong> Short or long, some names are easier for people to connect with a concept or business.</p>
<p style="margin-left: 60px; text-indent: -30px;"><strong>5) </strong><strong><em>Does it contain weird characters?</em></strong> Things like dashes or numbers in strange places detract from the value of a domain.</p>
<p style="margin-left: 60px; text-indent: -30px;"><strong>6) <em>Are there other alternatives</em>?</strong> If people can choose something else for cheaper that also works, then the value is lower.</p>
<p>At the end of the day value is completely relative. How much a domain name is worth to me is different than what it is worth to someone else. HugeDomains.com and other resellers have to walk a fine line trying to figure out the market price for a particular domain. They try to figure out how much someone who is interested in using the domain to its maximum potential (i.e. building a business on it) would be willing to pay for it. It&#8217;s a hard thing to do and the success of their business depends upon it.</p>
<h2>Our Story</h2>
<p>We were in the process of setting up a new website and needed a domain name. Unfortunately, it was already taken! HugeDomains.com already owned it and wanted $1,495 for it. In my opinion that’s an outrageously high sum of money for this particular domain name. Their website offers a one click buy button for the domain at the full asking price. Since we weren&#8217;t willing to do that, I sent an email inquiring about the domain to see if they even negotiate. Around a month later we received a reply from Christian Bosse on behalf of HugeDomains.com, inviting us to submit an offer for the domain.</p>
<p>At this point we needed to figure out how much we valued the domain. We were certainly not willing to spend the full asking price, and after a long conversation we settled on a maximum of $700, with an initial offer of $400. We received their reply to our initial offer very quickly, just a few hours later. The reply stated that HugeDomains won&#8217;t accept less than $500 for any domain that they sell. It then went on to say that for domains in the $1500 range they typically accept offers in the $800 &#8211; $1000 range depending on &#8220;certain&#8221; factors. What those factors are, I have no idea. At the very end Christian added one more wrinkle, stating that our next offer would be &#8220;final&#8221;. What does that mean?</p>
<p>After receiving the reply we were unsure of what our next step should be:</p>
<div style="margin-left: 60px; text-indent: -30px;"><strong>First</strong>, I had assumed that the negotiation was going to be like buying a car, where you say a number, they reply with another number, etc. This continues until you reach a mutually agreeable number or you decide to part ways. His email indicated that we were making a “final offer”, possibly meaning we could not make future offers for some amount of time. We didn’t know.</div>
<div style="margin-left: 60px; text-indent: -30px;"><strong>Second</strong>, he stated that they would not accept offers for less than $500. I don’t know if this is something he just added to get us to increase our offer or if it is an actual policy they follow.</div>
<div style="margin-left: 60px; text-indent: -30px;"><strong>Third</strong>, he presented a new price range ($800 &#8211; $1000). If he’s voluntarily offering this information I am pretty sure it must be higher than what he actually thinks the value of the domain is. We immediately felt comfortable that we would eventually get the domain for at most our maximum value of $700. But the question was how much lower would he go?</div>
<p>As I saw it there were three options: 1) stick with the current $400 offer, 2) increase it to $500, the possibly faux minimum, or 3) go straight to our $700 maximum. At this point we consulted with several co-workers and had an engaging discussion on the best negotiation strategies to maximize returns. We had fun running through several scenarios and trying to predict what their response would be. However it boils down to the simple fact that we didn’t know enough to predict their response or find the optimal strategy… it’s just a guess. In the end we decided that we would increase our bid to $500. Our plan was that they would either accept it right away or we would wait a few more months and try again at a slightly higher amount.</p>
<p>Our $500 offer was accepted right away. The domain transfer proceeded smoothly and everything worked just as expected. Happy ending, and we’ll have our website up soon.</p>
<h2>Thoughts</h2>
<p>I’ll never know if I should have stuck to my original offer of $400. I don’t know if they really have a $500 minimum policy or not. I am inclined to think I made a mistake and they would have taken the $400 offer. However I am glad that I didn’t jump immediately to my maximum bid!</p>
<p>Lastly, HugeDomains.com likely made a lot of money, even at the lower offer of $500. The price that registrars pay for a domain is fixed by ICANN at $6 for a 1 year registration [<a href="#ref6">6</a>]. However, we don&#8217;t know how much it costs them to acquire the domain initially, nor do we know the overhead of employing staff to handle negotiations, etc. It seems reasonable to assume that they picked up the registration relatively cheaply from someone else dropping their registration, and I doubt their overhead costs for a small operation are that high. The business also operates on high volume with multiple hundreds of thousands of domains in their inventory. Factoring all these things in, it seems to me that their business model is highly profitable with extremely high markups.</p>
<h2>References</h2>
<p>[<a name="ref1">1</a>] &#8220;<em>Domain name speculation</em>&#8221; on Wikipedia, <a href="http://en.wikipedia.org/wiki/Domain_name_speculation">http://en.wikipedia.org/wiki/Domain_name_speculation</a></p>
<p>[<a name="ref2">2</a>] &#8220;<em>HugeDomains.com = the next BuyDomains.com</em>&#8220;,  <a href="http://www.domainstryker.com/huge-domains-andrew-reberry/">http://www.domainstryker.com/huge-domains-andrew-reberry/</a></p>
<p>[<a name="ref3">3</a>] &#8220;<em>Dear HugeDomains.com: $995 is Too Much for this Blog Name – Ya Pirates.</em>&#8220;, <a href="http://signalwriter.blogspot.com/2009/11/dear-hugedomainscom-995-is-too-much-for.html">http://signalwriter.blogspot.com/2009/11/dear-hugedomainscom-995-is-too-much-for.html</a></p>
<p>[<a name="ref4">4</a>] Andrew Reberry&#8217;s LinkedIn profile, <a href="http://www.linkedin.com/in/andrewreberry">http://www.linkedin.com/in/andrewreberry</a></p>
<p>[<a name="ref5">5</a>] &#8220;<em>NameBright&#8217;s Terms and Conditions</em>&#8220;, <a href="http://www.namebright.com/Terms.aspx">http://www.namebright.com/Terms.aspx</a></p>
<p>[<a name="ref6">6</a>] &#8220;<em>Revised VeriSign .com Registry Agreement: Appendix G</em>&#8220;, <a href="http://www.icann.org/en/tlds/agreements/verisign/registry-agmt-appg-com-16apr01.htm">http://www.icann.org/en/tlds/agreements/verisign/registry-agmt-appg-com-16apr01.htm</a></p>
<h3>Updates</h3>
<p><strong>January 21st, 2011:</strong> The article was revised at the request of HugeDomains.com. I also took the opportunity to correct several grammatical problems and clean up the article a bit.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/11/buying-a-domain-from-hugedomains-com/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Make Tellurium / Selenium work with Firefox and Snow Leopard</title>
		<link>http://www.scottphillips.com/2010/09/make-tellurium-selenium-work-with-firefox-and-snow-leopard/</link>
		<comments>http://www.scottphillips.com/2010/09/make-tellurium-selenium-work-with-firefox-and-snow-leopard/#comments</comments>
		<pubDate>Tue, 28 Sep 2010 21:48:56 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[Selenium]]></category>
		<category><![CDATA[Tellurium]]></category>
		<category><![CDATA[Testing]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=486</guid>
		<description><![CDATA[TDL&#8217;s been playing around with Tellerium / Selenium for functional web-based tests for a little while now. Unfortunately everyone who&#8217;s been messing around with it has been running them on Ubuntu and the others of us on the team have OS X. When you run the Tellerium tests on a Mac with Snow Leopard they [...]]]></description>
			<content:encoded><![CDATA[<p>TDL&#8217;s been playing around with <a href="http://code.google.com/p/aost/">Tellurium</a> / <a href="http://seleniumhq.org/">Selenium</a> for functional web-based tests for a little while now. Unfortunately everyone who&#8217;s been messing around with it has been running the tests on Ubuntu, and the rest of us on the team have OS X. When you run the Tellurium tests on a Mac with Snow Leopard they fail to start Firefox with the following output from the Selenium server:</p>
<pre>Preparing Firefox profile...
dyld: Library not loaded: /usr/lib/libsqlite3.dylib
Referenced from: /System/Library/Frameworks/Security.framework/Versions/A/
   Security
Reason: Incompatible library version: Security requires version 9.0.0 or
   later, but libsqlite3.dylib provides version 1.0.0</pre>
<h2>What is Wrong?</h2>
<p>The problem Firefox is complaining about is the version of <code>libsqlite3.dylib</code> that was found. Snow Leopard ships with a version of <code>libsqlite3</code> in <code>/usr/lib</code>, and Firefox also provides its own version of <code>libsqlite3</code>. Unfortunately there is a bug in how Selenium 1.0.1 calls Firefox. It was patched in the next version, 1.0.2, but because of other bugs Tellurium is staying with 1.0.1 for now. You can read more about that decision in the mailing list thread:</p>
<p><a href="http://www.mail-archive.com/tellurium-users@googlegroups.com/msg02149.html">http://www.mail-archive.com/tellurium-users@googlegroups.com/msg02149.html</a></p>
<p>The actual problem, at least from an outside view point, is dirt simple. When Selenium calls Firefox it sets up a set of environmental variables, one of which is:</p>
<pre>DYLD_LIBRARY_PATH="null:/Applications/Firefox.app/Contents/MacOS"</pre>
<p>As you can probably see, there&#8217;s a problem with the path: a &#8220;<code>null</code>&#8221; snuck in there. So somewhere in Selenium there&#8217;s a place where they forgot to check for a <code>null</code> value. When Firefox starts up it&#8217;s not able to find its local copy of <code>libsqlite3</code> because of the invalid path arguments. But Firefox will happily guess a good set of paths if <code>DYLD_LIBRARY_PATH</code> is not set at all, so a simple solution is to place a small bash script in between Selenium and Firefox that simply removes the corrupted path.</p>
<p><span id="more-486"></span></p>
<h2>A Simple Solution</h2>
<p><strong>1) </strong> Create a simple Selenium-to-Firefox shell script (I named mine <code>selenium2firefox.sh</code>) which removes the corrupted path variable. Here&#8217;s mine:</p>
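<pre>#!/bin/bash

# Drop the corrupted library path so Firefox falls back to its own defaults.
unset DYLD_LIBRARY_PATH
# Pass every argument through to Firefox; quoting keeps arguments with spaces intact.
/Applications/Firefox.app/Contents/MacOS/firefox-bin "$@"</pre>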
<p><strong>2)</strong> Change Selenium to call your bash script directly instead of Firefox. Inside the connector configuration for your project you&#8217;ll have a line configuring your scripts to use Firefox, after the designation &#8220;<code>*firefox</code>&#8221; you can provide a path to where to find the browser. Change that path to the  shell script that you created in step 1.</p>
<pre>connector {
     /** snip... **/
     browser = "*firefox /full/path/to/your/selenium2firefox.sh"</pre>
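<p>If you would rather keep whatever valid entries Selenium put in the path instead of unsetting the variable entirely, the script from step 1 could filter out just the bogus entry. This is only a sketch: the <code>clean_path</code> helper is a hypothetical name of mine, not something Selenium or Tellurium provides, and I am assuming the literal string <code>null</code> is the only junk that ends up in the path.</p>
<pre>#!/bin/bash

# Hypothetical helper: print a colon-separated path with any empty
# or literal "null" entries removed.
clean_path() {
    local entry cleaned=""
    local IFS=":"
    # Unquoted on purpose so the path is split on the colons.
    for entry in $1; do
        case "$entry" in
            ""|null) ;;                                   # skip junk entries
            *) cleaned="${cleaned:+$cleaned:}$entry" ;;   # keep the rest
        esac
    done
    printf '%s\n' "$cleaned"
}

# prints /Applications/Firefox.app/Contents/MacOS
clean_path "null:/Applications/Firefox.app/Contents/MacOS"</pre>
<p>In the wrapper script you would then replace the <code>unset</code> line with <code>export DYLD_LIBRARY_PATH="$(clean_path "$DYLD_LIBRARY_PATH")"</code> before calling Firefox. I have not checked whether Firefox behaves any differently with the cleaned path than with no path at all, so treat the plain <code>unset</code> as the tested option.</p>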
<h2>Conclusion</h2>
<p>Beyond the basic hackish nature of the workaround, the big downside to this approach is that Tellurium does not close the old Firefox instances once it&#8217;s finished with them. That is still a big improvement over the previous state of not being able to run at all, so you can at least hobble along for a little while.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/09/make-tellurium-selenium-work-with-firefox-and-snow-leopard/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>In-place SVN Import</title>
		<link>http://www.scottphillips.com/2010/09/in-place-svn-import/</link>
		<comments>http://www.scottphillips.com/2010/09/in-place-svn-import/#comments</comments>
		<pubDate>Mon, 06 Sep 2010 12:00:00 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[SVN]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=396</guid>
		<description><![CDATA[I discovered an SVN trick today: how to do an in-place import into SVN. Normally when you run “svn import” it will leave the file system alone creating a copy on the repository. Then you have to do an &#8220;svn checkout&#8221; to pull the files back down under version control. The import/checkout process normally this [...]]]></description>
			<content:encoded><![CDATA[<p>I discovered an <a href="http://subversion.tigris.org/faq.html#in-place-import">SVN trick today</a>: how to do an in-place import into SVN. Normally when you run “<code>svn import</code>” it will leave the file system alone creating a copy on the repository. Then you  have to do an &#8220;<code>svn checkout</code>&#8221; to pull the files back down under version control.</p>
<p>The import/checkout process is normally a pain. However, there are a few instances where it&#8217;s a really big pain, such as Unix&#8217;s <code>etc/</code> directory. You can’t just delete <code>etc/</code> and re-check it out from version control or lots of stuff will break.  The other place I&#8217;ve found this useful is in Xcode when starting new projects. Use the in-place import instead of <a href="http://developer.apple.com/tools/subversionxcode.html">Apple&#8217;s suggestion</a> of creating two projects.</p>
<p>The process is quite simple. First create an empty directory in the repository, then check out the empty directory into your existing location. Finally, add the new files and commit them into the repository.</p>
<pre style="margin: 30px;font-weight: bolder">svn mkdir https://your-svn-repo.com/new/directory/

svn checkout https://your-svn-repo.com/new/directory/  .

svn add *

svn commit -m "Initial in-place import of directory"
</pre>
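<p>If you do this often, the four steps can be wrapped in a small function. This is a rough sketch rather than a polished tool: the <code>inplace_import</code> name and the <code>SVN</code> override variable are my own inventions. It uses <code>svn add --force</code> on the directory instead of <code>svn add *</code> so that dotfiles are picked up too, and passes <code>-m</code> to <code>svn mkdir</code> so it does not open an editor for the log message.</p>
<pre>#!/bin/bash

# Hypothetical wrapper around the in-place import steps.
# Usage: inplace_import REPO_URL EXISTING_DIR
# Set SVN=echo to print the commands instead of running them.
inplace_import() {
    local svn="${SVN:-svn}" url="$1" dir="$2"

    # 1. Create the empty target directory in the repository.
    $svn mkdir -m "Create empty directory for in-place import" "$url" || return 1
    # 2. Check the empty directory out on top of the existing files.
    $svn checkout "$url" "$dir" || return 1
    # 3. Schedule everything for addition; --force recurses into the
    #    already-versioned top directory and also picks up dotfiles.
    $svn add --force "$dir" || return 1
    # 4. Commit the newly added files.
    $svn commit -m "Initial in-place import of directory" "$dir"
}</pre>
<p>For example, <code>inplace_import https://your-svn-repo.com/new/directory/ .</code> imports the current directory. Running it once with <code>SVN=echo</code> first is a cheap way to see exactly what would be executed.</p>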
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/09/in-place-svn-import/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Maven vs Grails</title>
		<link>http://www.scottphillips.com/2010/05/maven-vs-grails/</link>
		<comments>http://www.scottphillips.com/2010/05/maven-vs-grails/#comments</comments>
		<pubDate>Mon, 31 May 2010 12:00:09 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Reports]]></category>
		<category><![CDATA[Build systems]]></category>
		<category><![CDATA[Grails]]></category>
		<category><![CDATA[Groovy]]></category>
		<category><![CDATA[Ivy]]></category>
		<category><![CDATA[Maven]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=431</guid>
		<description><![CDATA[We have a Grails 1.1 which uses ant + Ivy as its build mechanism. The Ivy configuration has always been wonky, I wasn’t the person who set it up so I’m not exactly sure why it is that way. You would have to run this special ant target, “copy-jars”, which movies the libraries downloaded with [...]]]></description>
			<content:encoded><![CDATA[<p>We have a <a href="http://www.grails.org/">Grails</a> 1.1 project which uses <a href="http://ant.apache.org/ivy/">ant + Ivy</a> as its build mechanism. The Ivy configuration has always been wonky. I wasn’t the person who set it up, so I’m not exactly sure why it is that way. You would have to run a special ant target, “<code>copy-jars</code>”, which moves the libraries downloaded with Ivy around because some grails command would delete them.  When it came time to upgrade to the latest version, Grails 1.2.2, we wanted to address this problem. <a href="http://www.grails.org/Maven+Integration">One of the new features touted for this release is its maven integration</a> and since we use <a href="http://maven.apache.org/">Maven</a> for several of our other projects we figured this would be a good choice.</p>
<p>Our experiment upgrading to Grails 1.2.2 with Maven integration took a few days to work our way through the errors and gotchas.  In the end we were able to “successfully” upgrade our project – it compiled and passed all the tests. However it didn’t work that well. I’ve included my notes from the upgrade at the end of the post for anyone else who’s trying to do something similar. However in the end <a href="http://www.grails.org/Maven+Integration">Grails + Maven integration</a> sucked for a few reasons.</p>
<ul>
<li>Tests will fail if you do not run a <code>mvn clean</code> in-between. So you can’t run <code>mvn test</code>, fix something then, re-run <code>mvn test</code> again and expect it to work.</li>
<li>Running the <code>mvn package</code> command required a larger amount of memory. I had to increase memory available via<code> JAVA_TOOL_OPTIONS=-Xmx512m</code>. This wasn’t required for our old Ivy-based build system nor is it required when using a pure grails build.</li>
<li>The build is fragile, minor changes would break in unexpected ways. But the worst part is any errors encountered were always hidden behind <a href="http://java.sun.com/j2se/1.5.0/docs/guide/jpda/jdi/com/sun/jdi/InvocationException.html">java invocation errors</a> masking the real error. This makes debugging the build much harder.</li>
<li>In general the integration is very immature. Maybe this will improve in the future, but in its current state I can’t recommend it to anyone.</li>
</ul>
<p>In the end we decided that <a href="http://www.grails.org/Maven+Integration">Maven integration</a> just wasn’t worth it, and we needed something better. The next option we looked at was a pure grails-based build system. This option worked well; there were no errors or gotchas like in the previous experiment. The best part is that it is able to pull dependencies from Maven repositories without dealing with maven! It’s pretty simple, <a href="http://www.grails.org/doc/latest/guide/3.%20Configuration.html#3.7%20Dependency%20Resolution">as described in the manual</a>: just set your dependencies in <code>grails-app/conf/BuildConfig.groovy</code>. That’s it, you’re done. Run “<code>grails compile</code>” and your dependencies will be downloaded and the application compiled. It took just an hour to figure it all out and show it working, compared to the days it took to get the Maven integration barely working. In the end I highly recommend using a pure grails-based build; it is a thousand times better than the traditional Maven integration.<br />
<span id="more-431"></span></p>
<h2>Grails + Maven Notes</h2>
<p>Here are the notes I took during the upgrade to Grails + Maven</p>
<ol>
<li>Since no one has released Maven <a href="http://mvnrepository.com/artifact/org.grails/grails-maven-archetype">archetypes for Grails 1.2.2</a> we have to first upgrade using Maven to 1.2.0, then later upgrade to 1.2.2. Really how much different can the archetype be between these minor point versions?</li>
<li>Update your <code>~/.m2/settings.xml</code> to include a reference to the Maven plugins for Grails.
<pre>&lt;pluginGroups&gt;
      &lt;pluginGroup&gt;org.grails&lt;/pluginGroup&gt;
&lt;/pluginGroups&gt;</pre>
</li>
<li>Create a new <code>pom.xml</code> file for the project.
<pre>mvn grails:create-pom -DgroupId=<em>[groupId]</em> -DartifactId=<em>[artifactId]</em></pre>
<p>You’ll probably want to open the <code>pom.xml</code> file and add a few additional entries such as <code>&lt;name&gt;</code>, <code>&lt;description&gt;</code>, <code>&lt;url&gt;</code>, etc.</li>
<li>Next add your project’s dependencies into the <code>pom.xml</code>. This is obviously going to be very different for each project. But here are a few errors that I ran into for our specific project.
<p>Our project uses <a href="http://mvnrepository.com/artifact/xerces/xerces">Xerces</a> and <a href="http://mvnrepository.com/artifact/xalan/xalan">Xalan</a> for some XML processing, however these APIs conflicted with the Grails-based commands and their dependency on these libraries. Below is the error that I received (hopefully to help anyone searching for this error on Google). To fix the problem I needed to exclude our project&#8217;s version of Xerces and Xalan and use the Grails version of these libraries.</p>
<pre style="font-size: 9px">
Embedded error: java.lang.reflect.InvocationTargetException
loader constraint violation: when resolving overridden method "org.apache.
xerces.jaxp.SAXParserImpl.getXMLReader()Lorg/xml/sax/XMLReader;" the
class loader (instance of org/codehaus/groovy/grails/cli/support/
GrailsRootLoader) of the current class, org/apache/xerces/jaxp/
SAXParserImpl, and its superclass loader (instance of &lt;bootloader&gt;),
have different Class objects for the type org/xml/sax/XMLReader
used in the signature
</pre>
<p>The second gotcha was our dependency on <a href="http://mvnrepository.com/artifact/dumbster/dumbster/1.6">dumbster:dumbster:1.6</a>, which records dependencies on both <a href="http://mvnrepository.com/artifact/javax.activation/activation/1.0.2">javax.activation:activation:1.0.2</a> and <a href="http://mvnrepository.com/artifact/javax.mail/mail/1.3.2">javax.mail:mail:1.3.2</a>. Both have entries in the Maven central repository, but no artifacts exist for those entries. It’s weird, you can get the <code>pom.xml</code> files but not the actual jar files. I fixed this by adding in dependencies for the new versions of both libraries.</li>
<li>Run the maven upgrade command. In theory you should just run “<code>mvn grails:exec -Dcommand="upgrade"</code>”, but theory doesn’t always work. There’s some bug with running the upgrade command, it will fail with the following error:
<pre style="font-size: 9px">
Embedded error: java.lang.reflect.InvocationTargetException
/Users/scott/Development/Eclipse/Workspaces3.5/DSpace/trunk/null/src/war not found.
</pre>
<p>To get around this you will need to install Grails, and run the upgrade command directly.</p>
<pre>
export GRAILS_HOME=/path/to/temporary/grails/home
cd &lt;project/dir&gt;
$GRAILS_HOME/bin/grails upgrade
unset GRAILS_HOME
</pre>
</li>
<li>Check your plugins. You can use &#8220;<code>mvn grails:list-plugins</code>&#8221; to get your current list. Then use &#8220;<code>mvn grails:uninstall-plugin -DpluginName=&lt;name&gt;</code>&#8221; and &#8220;<code>mvn grails:install-plugin -DpluginName="&lt;name&gt; &lt;version&gt;"</code>&#8221; to remove and reinstall plugins as needed.</li>
<li>Remove the old build system&#8217;s files, such as any leftover <code>build.xml</code> files.</li>
<li>Change the <code>pom.xml</code> files to reference the latest version of Grails. You should also use the same steps from step 5 to run the Grails upgrade command.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/05/maven-vs-grails/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Adding OAI­-ORE Support to Repository Platforms</title>
		<link>http://www.scottphillips.com/2010/04/adding-oai%c2%ad-ore-support-to-repository-platforms/</link>
		<comments>http://www.scottphillips.com/2010/04/adding-oai%c2%ad-ore-support-to-repository-platforms/#comments</comments>
		<pubDate>Tue, 20 Apr 2010 12:00:06 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[OAI-ORE]]></category>
		<category><![CDATA[OR09]]></category>
		<category><![CDATA[Repositories]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=473</guid>
		<description><![CDATA[Our presentation at Open Repositories 2009 has been published in the Journal of Digital Information. Congratulations Alexey. Alexey Maslov, James Creel, Adam Mikeal, Scott Phillips, John Leggett, Mark McFarland. Adding OAI­-ORE Support to Repository Platforms. Journal of Digital Information, Volume 11, Number 1. Available at http://journals.tdl.org/jodi/article/view/749]]></description>
			<content:encoded><![CDATA[<p>Our presentation at <a href="https://or09.library.gatech.edu/">Open Repositories 2009</a> has been published in the Journal of Digital Information. Congratulations Alexey.</p>
<p style="padding-left: 30px"><em>Alexey Maslov, James Creel, Adam Mikeal, Scott Phillips, John  Leggett, Mark McFarland. </em>Adding OAI­-ORE Support to Repository Platforms. Journal of Digital Information, Volume 11, Number 1. Available at <a href="http://journals.tdl.org/jodi/article/view/749">http://journals.tdl.org/jodi/article/view/749</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/04/adding-oai%c2%ad-ore-support-to-repository-platforms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DSpace Functional Tests?</title>
		<link>http://www.scottphillips.com/2010/04/dspace-functional-tests/</link>
		<comments>http://www.scottphillips.com/2010/04/dspace-functional-tests/#comments</comments>
		<pubDate>Sun, 04 Apr 2010 17:33:24 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Reports]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[Manakin]]></category>
		<category><![CDATA[TDL]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[Vireo]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=417</guid>
		<description><![CDATA[The Texas Digital Library has been focusing on testability for our projects. Since DSpace is related too or part of most of our projects we’ve been looking for a way to increase DSpace’s testability. Traditionally this would mean adding unit tests and integration tests. However as DSpace currently stands is hard to break it up [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.tdl.org/">Texas Digital Library</a> has been focusing on testability for our projects. Since DSpace is related to or part of most of our projects, we’ve been looking for a way to increase DSpace’s testability. Traditionally this would mean adding unit tests and integration tests. However, as DSpace currently stands it is hard to break it up into individual components that can be tested in isolation.  You’ll quickly find that writing tests for DSpace pulls in the entire system, plus databases and a file system. To address this problem we’ve created a simple framework for adding both integration tests and functional tests which improve the reliability of our projects. I’m interested to see if this is something the greater DSpace community would be interested in.</p>
<p><strong>The goals of our project were</strong> to create a mechanism where we could run complete functional tests. Functional tests evaluate the entire system as the end user would use it, so think of it as opening a web browser and evaluating the output, but completely automated. They test everything all together. Ideally it would be better to test each component individually, but this is impractical for DSpace for two reasons: 1) DSpace is highly integrated and nearly impossible to separate from the database and file systems, and 2) creating unit tests for all of DSpace is very time consuming, so it is simpler to write a few functional tests that cover a wide set of features over the whole application. It gets you to a point where you can reliably verify the software quicker. If you’re working on unit tests for DSpace please do not let this stand in your way.</p>
<p><span id="more-417"></span></p>
<p><strong>The main concept is</strong> to script the install of a test DSpace, with a full configuration and setup. Then we start DSpace in an <a href="http://winstone.sourceforge.net/">embedded webserver</a> and then run through several scenarios just as a normal user would. This tests the whole application, using a database, a file system, and a full build. The ant script where you normally run “<code>ant fresh_install</code>” has a new target “<code>ant test</code>”. You pass it a few parameters such as what database to use. The script will then run through a fresh install of DSpace into a local <code>/test</code> directory, setup some communities and collections, and import some basic items. Then <a href="http://www.junit.org/">JUnit</a>-based tests are run against the embedded webserver using <a href="http://htmlunit.sourceforge.net/">HtmlUnit</a> to simplify verifying the HTML output.</p>
<p><strong>Here is how to run it.</strong> After compiling using a “<code>mvn package</code>”, <code>cd</code> into the <code>target/dspace-*-build.dir/</code> directory. Then run “<code>ant test</code>”. You may need to pass it some of the parameters listed below. Each parameter has a default, so if you configure your database connections the same way then it can be as simple as running “<code>ant test</code>” without any parameters.</p>
<pre> -Dtest.db.driver="org.postgresql.Driver"
 -Dtest.db.url="jdbc:postgresql://localhost:5432/dspacetest"
 -Dtest.db.username="dspacetest"
 -Dtest.db.password="dspacetest"
 -Dtest.dspace.dir="./test/"
 -Dtest.config="./test/config/dspace.cfg"</pre>
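<p>Put together, a full invocation might look like the sketch below. The <code>run_dspace_tests</code> function, the <code>ANT</code> override, and the <code>TEST_*</code> environment variable names are hypothetical conveniences of mine and not part of the patch. The fallback values are simply the example values listed above, which may or may not match the defaults built into the ant script.</p>
<pre>#!/bin/bash

# Hypothetical wrapper: run "ant test" with every parameter spelled out.
# Run it from inside target/dspace-*-build.dir/ after "mvn package".
# Each value can be overridden through a TEST_* environment variable;
# set ANT=echo to print the command instead of running it.
run_dspace_tests() {
    local ant="${ANT:-ant}"
    $ant test \
        -Dtest.db.driver="${TEST_DB_DRIVER:-org.postgresql.Driver}" \
        -Dtest.db.url="${TEST_DB_URL:-jdbc:postgresql://localhost:5432/dspacetest}" \
        -Dtest.db.username="${TEST_DB_USERNAME:-dspacetest}" \
        -Dtest.db.password="${TEST_DB_PASSWORD:-dspacetest}" \
        -Dtest.dspace.dir="${TEST_DSPACE_DIR:-./test/}" \
        -Dtest.config="${TEST_CONFIG:-./test/config/dspace.cfg}"
}</pre>
<p>For instance, setting <code>TEST_DB_URL</code> to a different JDBC URL before calling <code>run_dspace_tests</code> points the tests at another database without retyping the rest of the parameters.</p>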
<p>We’ve used this approach rather successfully for two of our DSpace-based projects here at TDL: <a href="http://www.tdl.org/etds/">an ETD submission system called vireo</a>, and a <a href="https://wikis.tdl.org/tdl/Learning_Objects_Repository">learning object repository</a>. These projects haven’t moved to 1.6 yet, but I do have a patch available for DSpace 1.5.2. Most of the test cases we’ve created so far are specific to the project we’re working on. However the patch includes 4 manakin tests, which are really just an example of how tests work within this framework.</p>
<p>Download the patch: <a href="http://scott.phillips.name/files/2010/04/Dspace-1.5.2-FunctionalTest-V2.patch.txt">DSpace-1.5.2-FunctionalTest-V2.patch</a></p>
<p>The question is, is this something that the DSpace community would like? You can follow the discussion of this topic on the <a href="http://www.mail-archive.com/dspace-devel@lists.sourceforge.net/msg03063.html">dspace-devel mailing list</a>.</p>
<p><em>Update 4/8/2010:</em> The original patch was missing a class; that has been corrected.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/04/dspace-functional-tests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to setup Eclipse, Tomcat, and DSpace for Development</title>
		<link>http://www.scottphillips.com/2010/02/how-to-setup-eclipse-tomcat-and-dspace-for-developmen/</link>
		<comments>http://www.scottphillips.com/2010/02/how-to-setup-eclipse-tomcat-and-dspace-for-developmen/#comments</comments>
		<pubDate>Tue, 02 Feb 2010 12:00:00 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[Eclipse]]></category>
		<category><![CDATA[Tomcat]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=295</guid>
		<description><![CDATA[This is an updated guide describing how I configure Eclipse, Tomcat, and DSpace for my development. A previous version of this guide was written for Eclipse 3.4 and this version has been updated for the latest versions of both Eclipse and DSpace. One of my motivations forcing my move to the newer version of Eclipse [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://scott.phillips.name/files/2010/02/Eclipse_Tomcat_DSpace.jpg"><img class="size-full wp-image-304 alignright" src="http://scott.phillips.name/files/2010/02/Eclipse_Tomcat_DSpace.jpg" alt="" width="200" height="200" /></a>This is an updated guide describing how I configure <a href="http://www.eclipse.org/">Eclipse</a>, <a href="http://tomcat.apache.org/">Tomcat</a>, and <a href="http://www.dspace.org/">DSpace</a> for my development. A <a href="http://scott.phillips.name/2009/05/howto-dspace-eclipse-tomcat/">previous version</a> of this guide was written for Eclipse 3.4 and this version has been updated for the latest versions of both Eclipse and DSpace. One of my motivations for moving to the newer version of Eclipse is the ability to handle Java 1.6. <a href="http://developer.apple.com/java/">Apple has decided</a> to stop shipping both 32bit and 64bit versions; instead they are only providing 64bit Java binaries. Eclipse 3.5 is the first version to support 64bit on OS X.</p>
<h4>Versions:</h4>
<p style="padding-left: 30px">Eclipse: <a href="http://www.eclipse.org/downloads/">3.5</a><br />
Tomcat: <a href="http://tomcat.apache.org/download-60.cgi">6.x</a><br />
DSpace: <a href="http://scm.dspace.org/svn/repo/dspace/branches/dspace-1_5_x/">1.5.x</a> or <a href="http://scm.dspace.org/svn/repo/dspace/trunk/">1.6.x</a></p>
<p><span id="more-295"></span></p>
<h2>1) Download Eclipse</h2>
<p>Obtain a fresh version of Eclipse from the <a href="http://www.eclipse.org/downloads/">official download site</a>, which offers several variants. You can choose either &#8220;Eclipse for Java EE Developers&#8221; or &#8220;Eclipse for Java Developers&#8221;. Once you&#8217;ve selected the variant, choose the distribution that is appropriate for you (Mac, Windows, Linux, and 32bit vs 64bit). Once downloaded, unpack the software where you would like it to be installed; however, it&#8217;s best not to start Eclipse until after step #2.</p>
<div id="attachment_301" class="wp-caption aligncenter" style="width: 410px"><a href="http://www.eclipse.org/downloads/"><img class="size-full wp-image-301" src="http://scott.phillips.name/files/2010/01/Eclipse_Download_Packages.png" alt="Eclipse Download Packages screenshot" width="400" /></a><p class="wp-caption-text">Download Eclipse, either Java for EE Developers or just Java for Developers.</p></div>
<h2>2) Install the Sysdeo Tomcat plugin</h2>
<p><a href="http://www.eclipsetotale.com/tomcatPlugin.html"><img class="alignright size-full wp-image-309" src="http://scott.phillips.name/files/2010/01/Download_Sysdeo_Tomcat_Plugin.png" alt="" width="300" /></a>The Sysdeo Tomcat plugin allows you to start, stop, and restart Tomcat from within Eclipse. The plugin also allows you to debug a running web application by setting breakpoints, stepping through the code line by line, and examining memory locations through the standard Eclipse debugger. Download the latest version of the <a href="http://www.eclipsetotale.com/tomcatPlugin.html">Sysdeo Tomcat plugin</a> and un-compress the software into Eclipse&#8217;s plugin directory. Next, start Eclipse for the first time and the plugin will be recognized. If you have previously started Eclipse then it will not immediately recognize the new plugin. In this case you need to start Eclipse with the &#8220;-clean&#8221; option from the command line.</p>
<p><a href="http://www.eclipsetotale.com/tomcatPlugin.html">http://www.eclipsetotale.com/tomcatPlugin.html</a></p>
<h2>3) Install the SVN &amp; Maven plugins</h2>
<div id="attachment_316" class="wp-caption alignright" style="width: 160px"><a href="http://scott.phillips.name/files/2010/01/Plugins-Install_New_Software.png"><img class="size-full wp-image-316" src="http://scott.phillips.name/files/2010/01/Plugins-Install_New_Software.png" alt="" width="150" /></a><p class="wp-caption-text">Help -&gt; Install New Software</p></div>
<p>Fortunately, installing the other plugins is easier than installing the Tomcat plugin because you can use Eclipse&#8217;s built-in updating mechanism. There are two more plugins that you need: <a href="http://www.eclipse.org/subversive/">Subversive</a> and <a href="http://m2eclipse.sonatype.org/">M2Eclipse</a>. Subversive enables you to check out the DSpace source code and keep up to date with the latest changes to the platform from <a href="http://scm.dspace.org/svn/repo/">DSpace&#8217;s SVN Repository</a>. <a href="http://maven.apache.org/">Maven</a>&#8217;s M2Eclipse plugin is used to support DSpace&#8217;s new Maven-based build system. Begin by selecting &#8220;Help -&gt; Install New Software&#8230;&#8221; from the main menu.</p>
<p>Near the top of the Install New Software dialog box there is a button labeled &#8220;Add&#8230;&#8221;; this lets you add a new site to the list of places where Eclipse looks for software. We need to add a new site for M2Eclipse&#8217;s plugin. Click the &#8220;Add&#8230;&#8221; button and enter the information below into the Add Site dialog box:</p>
<pre style="margin: 0px 60px;font-size: 1.2em"><strong>Name</strong>: <code>M2Eclipse</code>

<strong>Location</strong>: <a href="http://m2eclipse.sonatype.org/update"><code>http://m2eclipse.sonatype.org/update</code></a>
</pre>
<div id="attachment_322" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Plugins-Add_Maven_Site.png"><img class="size-full wp-image-322 " src="http://scott.phillips.name/files/2010/01/Plugins-Add_Maven_Site.png" alt="" width="400" /></a><p class="wp-caption-text">Add new M2Eclipse site.</p></div>
<p>Click &#8220;OK&#8221; to add the new site. You will be returned to the Install New Software dialog box. Change the select field labeled &#8220;Work With&#8221; to &#8220;&#8211;All Available Sites&#8211;&#8221;. This will bring up a big list of all the available software you can install from all known sources, including the one you just added.</p>
<div id="attachment_325" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Plugins-All_Available_Sites.png"><img class="size-full wp-image-325 " src="http://scott.phillips.name/files/2010/01/Plugins-All_Available_Sites.png" alt="" width="400" /></a><p class="wp-caption-text">Select the &quot;--All Available Sites--&quot; option.</p></div>
<p>From the large list of Eclipse plugins and other components select the three components listed below for SVN and Maven:</p>
<ul>
<li>Collaboration
<ul>
<li>Subversive SVN Team Provider (Incubation)</li>
</ul>
</li>
<li>Maven Integration
<ul>
<li>Maven Embedder</li>
<li>Maven Integration for Eclipse (Required)</li>
</ul>
</li>
</ul>
<p>Once the plugins are selected, click the &#8220;Next &gt;&#8221; button and follow the dialogs to install the plugins. After they are installed you will be prompted to restart Eclipse, which you should agree to. When Eclipse restarts there should be a dialog box waiting for you to select which SVN implementation to use. You can choose either JavaHL, which uses your local system implementation, or SVNKit, which is a pure Java implementation. If you clicked the window away, or it did not appear, then switch to the &#8220;SVN Repository Exploring&#8221; perspective and the dialog box will appear.</p>
<div id="attachment_341" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Plugins-Select_SVN_Connector.png"><img class="size-full wp-image-341" src="http://scott.phillips.name/files/2010/01/Plugins-Select_SVN_Connector.png" alt="" width="400" /></a><p class="wp-caption-text">Select which type of SVN implementation to use, either JavaHL to use your OS&#039;s local binaries or SVNKit for a pure Java implementation.</p></div>
<h2>4) Check out the DSpace source code</h2>
<p>The next step is to check out the DSpace source code from the DSpace SVN Repository. If the &#8220;Welcome&#8221; tab is displayed when you start Eclipse, close it to return to the workbench. In the Project Explorer panel on the left hand side, select &#8220;New -&gt; Project&#8230;&#8221; from the context menu.</p>
<div id="attachment_345" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/DSpace-New_Project.png"><img class="size-full wp-image-345" src="http://scott.phillips.name/files/2010/01/DSpace-New_Project.png" alt="" width="400" /></a><p class="wp-caption-text">Project Explorer -&gt; New -&gt; Project...</p></div>
<p>This will bring up a &#8220;Select a wizard&#8221; dialog box. Select the &#8220;Project from SVN&#8221; wizard and click the next button. You will then be asked to configure an SVN Repository to check out code from. In the URL field put DSpace&#8217;s repository URL and leave all other fields at their defaults. Then click next.</p>
<pre style="margin: 0px 60px;font-size: 1.2em"><strong>URL:</strong> <a href="http://scm.dspace.org/svn/repo/"><code>http://scm.dspace.org/svn/repo/</code></a></pre>
<div id="attachment_346" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/DSpace-Select_Repository.png"><img class="size-full wp-image-346" src="http://scott.phillips.name/files/2010/01/DSpace-Select_Repository.png" alt="" width="400" /></a><p class="wp-caption-text">Enter the Root DSpace SVN Repository</p></div>
<p>You may be asked to accept a certificate from the repository, if so, accept it.</p>
<p>The next step will ask you to select a specific version of DSpace to check out. You can choose a specific tagged version such as 1.5.1 or 1.5.2. These tags will never change because they have already been released. Alternatively, you can check out a branch or the mainline trunk; these versions will continue to change over time as new features and bug fixes are added. Click the &#8220;Browse&#8230;&#8221; button to select the version of DSpace you want to work with. When you are done, click the &#8220;OK&#8221; button followed by the &#8220;Finish&#8221; button.</p>
<div id="attachment_347" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/DSpace-Select_DSpace_Version.png"><img class="size-full wp-image-347" src="http://scott.phillips.name/files/2010/01/DSpace-Select_DSpace_Version.png" alt="" width="400" /></a><p class="wp-caption-text">Select which version of DSpace to check out, either a tag, branch or the main line trunk.</p></div>
<p>The final step will ask you where to place DSpace within your Eclipse workspace. Select the last option &#8220;Check out as a project with the name specified:&#8221; and enter any name you choose.</p>
<h2>5) Configure Maven</h2>
<p>After checking out DSpace&#8217;s source code the next step is to compile. DSpace uses Maven as its build system, but before you can start to compile you&#8217;ll need to configure Maven. Locate the project you just checked out in the Project Explorer on the left hand side of your Eclipse workbench. Right clicking on the project will bring up a context menu. There will be a Maven submenu with several options; you need to enable dependency management and nested modules. After ensuring those two options are turned on, use Maven to update the project&#8217;s configuration. Between each of these steps Eclipse may need to rebuild its workspace; once it has finished you should not see any compilation errors.</p>
<ul>
<li>Maven -&gt; Enable Dependency Management</li>
<li>Maven -&gt; Enable Nested Modules</li>
<li>Maven -&gt; Update Project Configuration</li>
</ul>
<div id="attachment_350" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Maven-Context_Menu.png"><img class="size-full wp-image-350" src="http://scott.phillips.name/files/2010/01/Maven-Context_Menu.png" alt="" width="400" /></a><p class="wp-caption-text">Configure maven and update the project&#039;s configuration.</p></div>
<p>Next you need to create a &#8220;Run Configuration&#8221; to compile DSpace. From Eclipse&#8217;s top tool bar select &#8220;the green arrow -&gt; Run Configurations&#8230;&#8221; as shown below:</p>
<div id="attachment_351" class="wp-caption aligncenter" style="width: 234px"><a href="http://scott.phillips.name/files/2010/01/Maven-Run_Configuration_Menu.png"><img class="size-full wp-image-351" src="http://scott.phillips.name/files/2010/01/Maven-Run_Configuration_Menu.png" alt="" width="224" height="124" /></a><p class="wp-caption-text">The Green Arrow -&gt; Run Configurations...</p></div>
<p>Next the &#8220;Run Configuration&#8221; dialog box will appear. Locate &#8220;Maven&#8221; on the list, and right click to select the &#8220;New&#8221; option from the context menu.</p>
<div id="attachment_352" class="wp-caption aligncenter" style="width: 195px"><a href="http://scott.phillips.name/files/2010/01/Maven-New_Run_Configuration.png"><img class="size-full wp-image-352" src="http://scott.phillips.name/files/2010/01/Maven-New_Run_Configuration.png" alt="" width="185" height="126" /></a><p class="wp-caption-text">Create a new Maven-based run configuration.</p></div>
<p>This will change the right hand side of the dialog box displaying the details for a maven command. Fill out the following parameters as shown below in the picture.</p>
<pre style="margin: 0px 60px;font-size: 1.2em"><strong>Name</strong>: <code>MVN Package</code>
<strong>Base Directory</strong>: <code>${project_loc}</code>
<strong>Goals</strong>: <code>package</code>
</pre>
<p>Next, add a new parameter &#8220;dspace.config&#8221; that holds the full path to your dspace.cfg.</p>
<div id="attachment_353" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Maven-Detailed_Run_Configuration.png"><img class="size-full wp-image-353" src="http://scott.phillips.name/files/2010/01/Maven-Detailed_Run_Configuration.png" alt="" width="400" /></a><p class="wp-caption-text">Detailed view of the completed run configuration</p></div>
<p>When you are finished click the &#8220;Run&#8221; button to compile DSpace. I suggest you also create another Run Configuration for the Maven &#8220;clean&#8221; goal.</p>
<h2>6) Configure Tomcat</h2>
<p>The last component needing configuration is the Sysdeo Tomcat plugin. This plugin enables you to start and stop Tomcat from within Eclipse and run web applications in the debugger. However, the plugin needs to know where Tomcat is installed and which web application to run. First, start with the general Tomcat configuration. Open the main Eclipse preferences by selecting &#8220;Eclipse -&gt; Preferences&#8221; from the main menu (under Windows the preferences are typically found in the Window menu).</p>
<div id="attachment_366" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Tomcat-Global_Preferences_1.png"><img class="size-full wp-image-366" src="http://scott.phillips.name/files/2010/01/Tomcat-Global_Preferences_1.png" alt="" width="400" /></a><p class="wp-caption-text">Select the appropriate Tomcat version and path.</p></div>
<p>Inside the &#8220;Preferences&#8221; dialog box expand the &#8220;Tomcat&#8221; option from the left hand hierarchy. At this screen select the correct version of Tomcat that you are using, and enter the full path to the Tomcat directory. Next select the &#8220;Tomcat Manager App&#8221; from the left hand hierarchy and enter your Tomcat username and password. If you have not set up a Tomcat manager account then use the &#8220;Add user to tomcat-users.xml&#8221; button to create one. When you are finished click &#8220;OK&#8221;.</p>
<div id="attachment_367" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Tomcat-Global_Preferences_2.png"><img class="size-full wp-image-367" src="http://scott.phillips.name/files/2010/01/Tomcat-Global_Preferences_2.png" alt="" width="400" /></a><p class="wp-caption-text">Provide a username and password to authenticate with Tomcat&#039;s manager application.</p></div>
<p>The next step is to configure the DSpace project to use Tomcat. In the main menu select the project&#8217;s properties by &#8220;Project -&gt; properties&#8221;, and select &#8220;Tomcat&#8221; from the left hand hierarchy. Enter the following fields and click &#8220;OK&#8221;.</p>
<pre style="margin: 0px 60px;font-size: 1.2em"><strong>Is a Tomcat Project</strong>: <code>Check</code>

<strong>Context Name</strong>: <code>dspace</code>

<strong>Subdirectory</strong>: <code>dspace-xmlui/dspace-xmlui-webapp/
              target/dspace-xmlui-webapp-1.5.2/</code>
</pre>
<p>You need to change the sub-directory to point to a complete <span style="text-decoration: underline">W</span>eb application <span style="text-decoration: underline">AR</span>chive (WAR) file or directory. This directory depends on whether you want to use the JSPUI or the XMLUI, and it also changes slightly for each version of DSpace. Below are some examples you can use to work out which directory to use.</p>
<p><strong>DSpace JSPUI 1.5.2</strong>: <code>dspace-jspui/dspace-jspui-webapp/target/dspace-jspui-webapp-1.5.2/</code></p>
<p><strong>DSpace XMLUI 1.5 branch</strong>: <code>dspace-xmlui/dspace-xmlui-webapp/target/dspace-xmlui-webapp-1.5.2-SNAPSHOT/</code></p>
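<p>The examples above all follow one naming pattern, which can be written down as a small helper. This is only a sketch: the function name <code>webapp_subdirectory</code> is made up for illustration, and the pattern is inferred from the examples in this guide, so check the <code>target/</code> directory of your own build before relying on it.</p>

```python
def webapp_subdirectory(ui, version):
    """Build the Tomcat "Subdirectory" value for a DSpace web application.

    ui      -- "xmlui" or "jspui"
    version -- the version Maven puts in the directory name,
               e.g. "1.5.2" or "1.5.2-SNAPSHOT"

    The pattern is inferred from the examples in this guide and is an
    assumption for any other DSpace version.
    """
    if ui not in ("xmlui", "jspui"):
        raise ValueError("expected 'xmlui' or 'jspui', got %r" % (ui,))
    return "dspace-{ui}/dspace-{ui}-webapp/target/dspace-{ui}-webapp-{version}/".format(
        ui=ui, version=version
    )


if __name__ == "__main__":
    print(webapp_subdirectory("jspui", "1.5.2"))
    print(webapp_subdirectory("xmlui", "1.5.2-SNAPSHOT"))
```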
<div id="attachment_368" class="wp-caption aligncenter" style="width: 410px"><a href="http://scott.phillips.name/files/2010/01/Tomcat_Project_Preferences.png"><img class="size-full wp-image-368" src="http://scott.phillips.name/files/2010/01/Tomcat_Project_Preferences.png" alt="" width="400" /></a><p class="wp-caption-text">Declare the project as a Tomcat Project, set a context name, and provide the path to the web-application.</p></div>
<h2>7) Deploy the web application</h2>
<p>Congratulations, you&#8217;re done! To test that everything works, start Tomcat, then use the Update and Reload commands to publish DSpace&#8217;s web application into Tomcat. Use your web browser to check that DSpace is running.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2010/02/how-to-setup-eclipse-tomcat-and-dspace-for-developmen/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Grails HTTP-based test plugins</title>
		<link>http://www.scottphillips.com/2009/11/grails-http-test-plugins/</link>
		<comments>http://www.scottphillips.com/2009/11/grails-http-test-plugins/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 17:51:23 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Reports]]></category>
		<category><![CDATA[Grails]]></category>
		<category><![CDATA[TDL]]></category>
		<category><![CDATA[Testing]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=257</guid>
		<description><![CDATA[There are two leading frameworks for doing http-based integration tests within the Grails framework: functional-test and webtest. Both are packaged as grails plugins and accomplish the same task in a similar manner. The main library behind the scene is HtmlUnit. This is a well thought out library for abstracting web conversations, i.e get this url, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://scott.phillips.name/files/2009/11/grails-logo.png"><img class="alignright size-full wp-image-260" src="http://scott.phillips.name/files/2009/11/grails-logo.png" alt="grails-logo" width="249" height="222" /></a>There are two leading frameworks for doing HTTP-based integration tests within the <a href="http://www.grails.org/">Grails framework</a>: functional-test and webtest. Both are packaged as Grails plugins and accomplish the same task in a similar manner. The main library behind the scenes is <a href="http://htmlunit.sourceforge.net/">HtmlUnit</a>. This is a well thought out library for abstracting web conversations, i.e., get this URL, check that it has a form, click the submit button. It’s basically a headless browser which even supports JavaScript. <a href="http://htmlunit.sourceforge.net/">HtmlUnit</a> appears to have taken the mantle from <a href="http://httpunit.sourceforge.net/">HttpUnit</a> as the premier library in this small arena. There has only been one release since 2006 for HttpUnit while HtmlUnit has been very active with 11 releases in the same period.</p>
<p style="padding-left: 30px"><a href="http://www.grails.org/plugin/webtest"><strong>Webtest</strong></a> is the more established project with its first release in 2007. Webtest is provided by <a href="http://webtest.canoo.com/">Canoo</a> as a free open source plugin to Grails. The tests are run via an <a href="http://ant.apache.org/">Ant script</a> that packages together the test cases and runs through each one.</p>
<p style="padding-left: 30px"><a href="http://grails.org/plugin/functional-test"><strong>Functional-test</strong></a> is a relatively new project with its first release in early 2009. Functional-test is basically the same as Webtest; however, instead of using Ant-based scripts, everything is based on <a href="http://www.junit.org/">JUnit</a> and implemented in plain old Java or Groovy.</p>
<p>We’ve chosen to proceed with the JUnit-based plugin for our HTTP integration tests. Our primary reason is that it integrates well with our other unit tests, which also use JUnit, so we avoid disparate frameworks for the various types of tests. While this plugin may be newer, its traffic on the mailing lists is substantial and growing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/11/grails-http-test-plugins/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HOWTO: Layout Django Apps</title>
		<link>http://www.scottphillips.com/2009/07/howto-package-django-apps/</link>
		<comments>http://www.scottphillips.com/2009/07/howto-package-django-apps/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 12:15:57 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[SVN]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=152</guid>
		<description><![CDATA[The Texas Digital Library has been using the Django framework for a growing number of our smaller projects. Typically, if there&#8217;s not already a well established open source solution for the task at hand then the default answer is to write it in Django. Our faculty directory and request systems are already implemented in Django [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.tdl.org/">Texas Digital Library</a> has been using the <a href="http://www.djangoproject.com/">Django framework</a> for a growing number of our smaller projects. Typically, if there&#8217;s not already a well established open source solution for the task at hand then the default answer is to write it in Django. Our <a href="http://directory.tdl.org/">faculty directory</a> and <a href="http://services.tdl.org/">request systems</a> are already implemented in Django and we are currently moving our account management system into the framework. As our use of Django has grown, our practice of storing these projects in the version control repository as one unit has shown its weakness. Within the version control repository, we have passwords, database connections, and file locations stored; in general, settings that differ between production and development. This makes it hard to keep track of the settings for the production, pre-production, and development instances. This post is my attempt to describe a set of best practices for how to store Django projects in a code repository and deploy them between production and development environments.</p>
<h3>Django Applications vs Projects</h3>
<p>Django provides tools and concepts to break up your websites into very small components that can be re-used between sites. The first and most basic distinction that needs to be understood is Django&#8217;s difference between applications and projects.</p>
<p style="padding-left: 30px">A <em>Django Application</em> is a single self-contained set of related features that depends upon the Django framework. By convention, a Django application typically will contain: <code>admin.py</code>, <code>models.py</code>, <code>urls.py</code>, <code>views.py</code>, and other files. These applications should contain no configuration that needs to be changed between installations.</p>
<p style="padding-left: 30px">A <em>Django Project</em> is a collection of applications that share a single database configuration, work together in a URL space, and share a common set of application configurations. Projects may contain multiple applications or just one. Django requires that projects have at least three files: <code>manage.py</code>, <code>settings.py</code>, and <code>urls.py</code>. I believe it is best to consider Django Projects like Apache&#8217;s <code>httpd.conf</code>.</p>
<p><span id="more-152"></span></p>
<h3>Application Layout</h3>
<pre style="margin: 0 30px">django_application/
    |
    |---------- module_name/
    |              |
    |              |---------- models.py
    |              |
    |              |---------- templates/
    |              |
    |              |---------- urls.py
    |              |
    |              |---------- views.py
    |
    |---------- site_media/
    |              |
    |              |---------- module_name/
    |
    |---------- docs/</pre>
<p>The application is where all the code that accomplishes whatever tasks the application requires will reside. It is best to split applications into the smallest possible units to increase the possibility that your application can be reused in another project. As outlined above, there should be three key directories:</p>
<ol>
<li>a python module,</li>
<li>site_media, and</li>
<li>a directory for any relevant documentation.</li>
</ol>
<p>The python module will contain all the python source files for your application; it can be easily created using the <code>django-admin.py</code> command &#8216;<code>startapp</code>&#8217;. Each application should have a set of default templates that reside in a special directory named &#8216;<code>templates/</code>&#8217; inside the python module. The default template loaders know to search inside this directory for each application. The next major directory, &#8216;<code>site_media/</code>&#8217;, should contain all the static content that is to be served directly from the web server. To keep deployment options flexible, our local convention is that all the files reside in a sub-directory named the same as the python module. This makes it easier for the web server to keep the site media of multiple applications separate. When building your application make sure that all references to static content take the form of &#8216;<code>/site_media/module_name/*</code>&#8217;; this way multiple applications&#8217; static content can co-exist in the same URL name space.</p>
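<p>The layout described above can also be created with a few lines of Python. This is a rough sketch rather than a replacement for <code>startapp</code>: the function name <code>create_application_layout</code> is made up for illustration, and it only creates empty placeholder files.</p>

```python
import os


def create_application_layout(root, module_name):
    """Create the directory skeleton described above under ``root``.

    create_application_layout is a hypothetical helper for illustration.
    """
    directories = [
        os.path.join(root, module_name, "templates"),
        os.path.join(root, "site_media", module_name),
        os.path.join(root, "docs"),
    ]
    for directory in directories:
        if not os.path.isdir(directory):
            os.makedirs(directory)
    # Empty placeholders for the conventional source files; __init__.py
    # is what makes the directory an importable Python module.
    for filename in ("__init__.py", "models.py", "urls.py", "views.py"):
        path = os.path.join(root, module_name, filename)
        if not os.path.exists(path):
            open(path, "a").close()
    return directories


if __name__ == "__main__":
    for directory in create_application_layout("django_application", "module_name"):
        print(directory)
```

<p>Because the site media sub-directory is named after the module, two applications created this way never write their static files into the same place.</p>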
<h3>Project Layout</h3>
<pre style="margin: 0 30px">django_project/
    |
    |---------- manage.py
    |
    |---------- settings.py
    |
    |---------- urls.py</pre>
<p>A Django project should contain only a minimal amount of source code, instead focusing on configuration for the applications. The default layout of projects can be created quickly using the <code>django-admin.py</code> command. These files should be in some sort of version control, but they contain passwords and settings that change between environments, which makes them difficult to share. This is the same problem any system administrator will face with other configuration files. Locally, we set aside a separate, limited-access part of our version control repository just for these types of configuration files.</p>
<h3>How to Deploy</h3>
<p>Following the patterns outlined here ultimately provides a set of flexible deployment options of your applications and site, avoiding situations where passwords and database connection parameters reside in your source code repository. Each application may have specialized installation instructions. However, the steps outlined below will work for most applications.  This assumes that python, django, your database, and apache are already installed.</p>
<h4>1) Install the Django Applications</h4>
<p style="padding-left: 30px">First, check out the desired applications from the code repository and install them into your local python installation. There are three methods you can use:</p>
<p style="padding-left: 30px"><strong>A)</strong> Symlink the python <code>module_name/</code> directory into python&#8217;s site-packages.<br />
<em>(Probably best when actively developing)</em></p>
<p style="padding-left: 30px"><strong>B)</strong> Run the install script to compile a python egg, and install in site-packages.<br />
<em>(Probably best in a production environment)</em></p>
<p style="padding-left: 30px"><strong>C)</strong> Modify the <code>PYTHONPATH</code> environment variable to include the directory that contains the python module.</p>
<h4>2) Create a Django project</h4>
<p style="padding-left: 30px">Use the <code>django-admin.py</code> program to quickly create a new Django project for your site. Using the command will create a default project with all the basic configuration files awaiting your editing.</p>
<pre style="margin: 0 30px"><strong>$</strong> django-admin.py startproject <strong><em>PROJECT-NAME</em></strong></pre>
<h4>3) Customize <code>settings.py</code></h4>
<p style="padding-left: 30px">The settings file controls a lot of how your website will work. Below are the four major areas of settings that need to be configured for a particular installation.</p>
<p style="padding-left: 30px"><strong>A)</strong> Database &amp; URL settings</p>
<p style="padding-left: 30px">Configure your project to use a particular database. This includes all the <code>DATABASE_<em>*</em></code> settings for your particular environment. You also need to configure the URL parameters for media.</p>
<p style="padding-left: 30px"><strong>B)</strong> Installed applications</p>
<p style="padding-left: 30px">Configure the <code>INSTALLED_APPS</code> setting to include the applications desired for your Django project. Because these applications are installed directly as site-wide Python packages, all you need to do is include the <code>module_name</code> of each application in the list.</p>
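<p style="padding-left: 30px">As a rough sketch, the resulting setting might look like the following. The <code>django.contrib</code> entries are applications that ship with Django; <code>app1</code> and <code>app2</code> are hypothetical stand-ins for whatever you installed in step 1:</p>

```python
INSTALLED_APPS = (
    # Applications bundled with Django.
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    # Site-wide packages installed in step 1 (hypothetical names).
    'app1',
    'app2',
)
```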
<p style="padding-left: 30px"><strong>C)</strong> Application specific settings</p>
<p style="padding-left: 30px">Each application will typically contain a set of parameters that need to be configured before it will operate properly. Consult each application&#8217;s documentation for specific instructions.</p>
<h4>4) Customize <code>urls.py</code></h4>
<p style="padding-left: 30px">Django&#8217;s <code>urls.py</code> file controls which URLs are mapped to which view functions. At the site level you need to map your installed applications into the website&#8217;s URL namespace. Some applications may require a particular namespace, and others may require multiple mappings. For a simple basic configuration, add a line for each application as shown below, where <code>URL-NAMESPACE</code> is replaced with the URL under which this application should be installed and <code>MODULE-NAME</code> is the application to install.</p>
<pre style="margin: 0 30px">(r'^<strong>URL-NAMESPACE</strong>/', include('<strong>MODULE-NAME.urls</strong>')),</pre>
<p style="padding-left: 30px">For example, if you want the application &#8216;<code>myapp</code>&#8217; to be used for all URLs that begin with &#8216;<code>myapp</code>&#8217;, then use the following line:</p>
<pre style="margin: 0 30px">(r'^myapp/', include('myapp.urls')),</pre>
<p style="padding-left: 30px">As another example, if you want one application (ex: <code>myapp</code>) to be installed at the root of the URL namespace, use the following line:</p>
<pre style="margin: 0 30px">(r'^', include('myapp.urls')),</pre>
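<p style="padding-left: 30px">To see why the prefix pattern matters, it helps to picture what happens when a site-level pattern hands a request to an included application: the part of the path that matched the prefix is removed, and the application&#8217;s own <code>urls.py</code> sees only the remainder. The function below is a simplified model of that behaviour using the standard <code>re</code> module; it is not Django&#8217;s code, and <code>strip_prefix</code> is a name invented for this sketch:</p>

```python
import re


def strip_prefix(pattern, path):
    """Return what is left of path after the prefix pattern matches, else None."""
    match = re.search(pattern, path)
    if match is None:
        return None
    # Everything after the matched prefix is what the included urls.py sees.
    return path[match.end():]


print(strip_prefix(r'^myapp/', 'myapp/entries/2009/'))  # entries/2009/
print(strip_prefix(r'^myapp/', 'other/entries/'))       # None
print(strip_prefix(r'^', 'entries/2009/'))              # entries/2009/
```

<p style="padding-left: 30px">A prefix that matches nothing but the start of the path leaves the whole path for the application, which is what installing it at the root of the namespace means.</p>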
<h4>5) Setup Apache directives</h4>
<p style="padding-left: 30px">There is a basic Apache configuration needed for Django to operate properly. Beyond the basic configuration listed below, an additional step is required to enable access to each application&#8217;s static content. If your site is using the default templates provided by the application(s), then you need to map each application&#8217;s <code>site_media</code> directory so that it is accessible over the web.</p>
<p style="margin: 0 30px">Just for reference, this is the basic Django configuration:</p>
<pre style="margin: 0 30px">
SetHandler python-program
PythonHandler django.core.handlers.modpython
SetEnv DJANGO_SETTINGS_MODULE mysite.settings
PythonPath "['/opt/auth/sites'] + sys.path"
</pre>
<p style="padding-left: 30px">Then for each application, map the corresponding location using Apache&#8217;s aliases and locations. One possible configuration is:</p>
<pre style="margin: 0 30px">Alias /site_media/app1 /full/path/to/app1/site_media/app1

Alias /site_media/app2 /full/path/to/app2/site_media/app2

&lt;Location "/site_media"&gt;
    SetHandler None
&lt;/Location&gt;
</pre>
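<p style="padding-left: 30px">When a site has more than a couple of applications, writing the aliases by hand invites typos. A small helper can generate them instead. This is only a sketch: <code>alias_directives</code> is a hypothetical function, and it assumes every application keeps its static files under <code>site_media/&lt;name&gt;</code> inside its checkout, as in the configuration above:</p>

```python
def alias_directives(apps, url_prefix="/site_media"):
    """Build one Apache Alias line per application.

    apps maps an application name to the full path of its checkout.
    """
    lines = []
    for name in sorted(apps):
        target = "%s/site_media/%s" % (apps[name].rstrip("/"), name)
        lines.append("Alias %s/%s %s" % (url_prefix, name, target))
    return lines


# Hypothetical applications and paths, matching the placeholders used above.
apps = {"app1": "/full/path/to/app1", "app2": "/full/path/to/app2"}
print("\n".join(alias_directives(apps)))
```

<p style="padding-left: 30px">The output can be pasted into the Apache configuration or written to a file that the main configuration includes.</p>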
<h6>Resources:</h6>
<p>These ideas are not all my own. Here are the resources I used while creating this guide:</p>
<ul>
<li>James Bennett. <span style="text-decoration: underline">Practical Django Projects</span>. Apress (June 23, 2008). ISBN 978-1590599969.<br />
Available from <a href="http://www.amazon.com/Practical-Django-Projects-Pratical/dp/1590599969">Amazon</a>, <a href="http://www.apress.com/book/view/1590599969">Apress</a>, or see a preview on <a href="http://books.google.com/books?id=xX5nkG8v7m0C&amp;lpg=PP1&amp;dq=Practical%20Django%20Projects&amp;pg=PT1">Google Books</a>.</li>
</ul>
<ul>
<li>&#8220;Django tips: laying out an application&#8221;,<br />
<a href="http://www.b-list.org/weblog/2006/sep/10/django-tips-laying-out-application/">http://www.b-list.org/weblog/2006/sep/10/django-tips-laying-out-application/</a></li>
</ul>
<ul>
<li>A forum thread where someone asks the basic question: how should I lay out my Django application?<br />
<a href="http://stackoverflow.com/questions/44135/project-design-fs-layout-for-large-django-projects">http://stackoverflow.com/questions/44135/project-design-fs-layout-for-large-django-projects</a></li>
</ul>
<ul>
<li>The semi-official Django Do and Don&#8217;t list.<br />
<a href="http://code.djangoproject.com/wiki/DosAndDontsForApplicationWriters">http://code.djangoproject.com/wiki/DosAndDontsForApplicationWriters</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/07/howto-package-django-apps/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Vireo @ JCDL 2009</title>
		<link>http://www.scottphillips.com/2009/06/vireo-jcdl-2009/</link>
		<comments>http://www.scottphillips.com/2009/06/vireo-jcdl-2009/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 18:22:39 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[ETD]]></category>
		<category><![CDATA[JCDL09]]></category>
		<category><![CDATA[Manakin]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[Vireo]]></category>

		<guid isPermaLink="false">http://scott.phillips.name/?p=180</guid>
		<description><![CDATA[My colleague, Adam Mikeal, presented our paper titled &#8220;Large-scale ETD repositories: a case study of a digital library application&#8221; at JCDL 2009, where it was nominated for best paper! The paper describes at a high level the Texas Digital Library&#8217;s implementation of a state-wide electronic thesis and dissertation (ETD) system, delving into the theoretical, technical, [...]]]></description>
			<content:encoded><![CDATA[<p>My colleague, <a href="http://adammikeal.org/">Adam Mikeal</a>, presented our paper titled &#8220;<a href="http://portal.acm.org/citation.cfm?doid=1555400.1555423">Large-scale ETD repositories: a case study of a digital library application</a>&#8221; at <a href="http://www.jcdl2009.org/">JCDL 2009</a>, where it was nominated for best paper! The paper describes at a high level the <a href="http://www.tdl.org/about-tdl/projects/#vireo">Texas Digital Library&#8217;s</a> implementation of a state-wide <span style="text-decoration: underline">e</span>lectronic <span style="text-decoration: underline">t</span>hesis and <span style="text-decoration: underline">d</span>issertation (ETD) system, delving into the theoretical, technical, and political issues that were encountered. Vireo is the main component of the system, a <a href="http://www.dlib.org/dlib/november07/phillips/11phillips.html">Manakin</a>-based add-on to <a href="http://www.dspace.org/">DSpace</a> that handles the ETD workflow, from a student&#8217;s initial submission through an iterative staff review and cataloging by the library to final publication.</p>
<p>Citation:</p>
<p style="padding-left: 30px">Mikeal, A., Creel, J., Maslov, A., Phillips, S., Leggett, J., and McFarland, M. 2009. Large-scale ETD repositories: a case study of a digital library application. <em>In Proceedings of the 2009 Joint international Conference on Digital Libraries </em>(Austin, TX, USA, June 15 &#8211; 19, 2009). JCDL &#8217;09. ACM, New York, NY, 135-144. DOI= <a href="http://doi.acm.org/10.1145/1555400.1555423">http://doi.acm.org/10.1145/1555400.1555423</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/06/vireo-jcdl-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Where to host a blog?</title>
		<link>http://www.scottphillips.com/2009/06/where-to-host-a-blog/</link>
		<comments>http://www.scottphillips.com/2009/06/where-to-host-a-blog/#comments</comments>
		<pubDate>Sun, 07 Jun 2009 14:58:48 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[Blogging]]></category>
		<category><![CDATA[TDL]]></category>

		<guid isPermaLink="false">http://www.aggiescott.com/?p=17</guid>
		<description><![CDATA[One of the questions I faced when starting this blog is where should I host it? There are lots of options from several commercial blogging services or from the many free blogging services such as Blogger, SquareSpace, ExpressionEngine, or WordPress.com. Because of my employment there is also the option to use the Texas Digital Library&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>One of the questions I faced when starting this blog is where should I host it? There are lots of options from several commercial blogging services or from the many free blogging services such as <a href="http://www.blogger.com/">Blogger</a>, <a href="http://www.squarespace.com">SquareSpace</a>, <a href="http://expressionengine.com/">ExpressionEngine</a>, or <a href="http://wordpress.com">WordPress.com</a>. Because of my employment there is also the option to use the <a href="http://blogs.tdl.org/">Texas Digital Library&#8217;s blogging service</a> based upon WordPress. Lastly, because I have the technical skills and available hosting, I can self-publish my blog. I ultimately decided to self-publish this blog using my own means instead of using a blogging service. Here are the factors that affected my decision:</p>
<p><span id="more-17"></span></p>
<p><strong>Identity</strong>: The URL a blog is hosted at is extremely important. First, it is one of the first things new readers will notice about your blog, and second, it must persist. I have chosen to go with the relatively new top-level domain: &#8220;<code>.name</code>&#8221;. I find the domain interesting. It is a new breed of domain where registrars are able to sell third-level domains, i.e. I own <code>scott.phillips.name</code>, but I do not own <code>phillips.name</code> as the generic surnames are reserved. The URL is a clean representation of what the site is: this is the blog of Scott Phillips &#8211; a person. As for the second property, persistence, because I own the domain it will persist for as long as this blog is relevant. I will have the option to migrate the blog between technologies and across services. Most of the free or near-free blogging services support independently owned domains, as does self-hosting; however, TDL&#8217;s service does not support these types of URLs.</p>
<p><strong>Customization</strong>: Writing a blog is a significant investment of time that should not be wasted on a poorly built site. I want to be able to have a unique look-and-feel that represents my blog, and to have control over which plugins and plugin versions are available. This level of customization is impossible for most blogging services to provide because of the many support issues it can create. I want the ability to do custom programming or, more likely, to tweak code to my taste. The only option that supports this is self-publishing; no shared service can really provide it.</p>
<p><strong>Credibility</strong>: The TDL blogging service hosted at the <code>blogs.tdl.org</code> domain, along with all the other services offered by TDL, gives a blog instant credibility as scholarly in purpose and raises it above the many other blogs out there. For others this may be a more important factor, but I feel it is not enough of an incentive to overcome the other two factors.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/06/where-to-host-a-blog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Microsoft tools for repositories workshop</title>
		<link>http://www.scottphillips.com/2009/05/microsoft-ecosystem/</link>
		<comments>http://www.scottphillips.com/2009/05/microsoft-ecosystem/#comments</comments>
		<pubDate>Thu, 28 May 2009 20:14:04 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Reports]]></category>
		<category><![CDATA[OR09]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[SWORD]]></category>
		<category><![CDATA[Zentity]]></category>

		<guid isPermaLink="false">http://www.aggiescott.com/?p=3</guid>
		<description><![CDATA[During the Open Repositories 2009 conference I attended Microsoft&#8217;s &#8220;Tools for Repositories&#8221; workshop. At the workshop Microsoft was able to get down into the details of their new products and how they are intended to work together. I was impressed by the breadth of work that Microsoft is engaging in to support scholarly publishing [...]]]></description>
			<content:encoded><![CDATA[<p>During the <a href="https://or09.library.gatech.edu/">Open Repositories 2009</a> conference I attended <a href="https://or09.library.gatech.edu/workshops.php">Microsoft&#8217;s &#8220;Tools for Repositories&#8221; workshop</a>. At the workshop Microsoft was able to get down into the details of their new products and how they are intended to work together. I was impressed by the breadth of work that Microsoft is engaging in to support scholarly publishing use-cases within their tool set. The workshop lasted about 4 hours, longer than most of the other workshops at OR09. The bulk of the time was given to the two developers, <a href="http://savas.me/">Savas Parastatidis</a> and <a href="http://research.microsoft.com/en-us/people/pablofe/">Pablo Fernicola</a>, along with the team leader, <a href="http://research.microsoft.com/en-us/people/awade/">Alex Wade</a>.</p>
<p>Microsoft announced their intention to engage with the repository community at last year&#8217;s <a href="http://or08.ecs.soton.ac.uk/">OR08</a> conference in Southampton. The OR09 workshop featured the tools that Microsoft has developed in the intervening year, starting with new authoring tools in Word, publishing via SWORD, a new peer-reviewed journaling service, and the new .NET based repository. It&#8217;s clear that Microsoft will have a huge impact on scholarly publishing in the future. We, as the repository community, will need to adjust our services to ensure they work smoothly within the Microsoft ecosystem. After the jump, read about the three big features that were presented.</p>
<p><span id="more-55"></span></p>
<h2>Authoring Addin:</h2>
<p>Let&#8217;s start with Microsoft&#8217;s Word suite. There are several &#8220;addins&#8221; Microsoft is in the process of creating for Word. &#8220;<a href="http://office.microsoft.com/en-us/downloads/CD102141051033.aspx">Addins</a>&#8221; are plugins that can be downloaded from Microsoft to provide extra functionality. The most interesting of these new addins is the &#8220;<a href="http://research.microsoft.com/en-us/projects/authoring/">Authoring Addin</a>&#8221; for Word 2007 (Windows only), which is still in beta. It pushes metadata creation down into the very first steps the author takes when creating their document.</p>
<p>The tool is based upon Word templates, so you as a journal publisher would create a Word-based template for your journal that gives the visual style of your journal, the required sections, and what metadata fields are required or optional. The creator downloads this template and begins their work. When they put the title into the document&#8217;s body it is automatically extracted because they are using the template. Next, after the creator is finished and ready to submit their article to a journal, they will be stopped until they fill in the required metadata fields. Within Word, a pane slides out querying the user for additional metadata. Once the required metadata fields are present, the user can continue their submission via SWORD. The user doesn&#8217;t need to know URLs or metadata schemas; all that technology is hidden behind the template. From the creator&#8217;s point of view, if they start with the correct template then they just use Word as normal and click the publish button at the end &#8211; very simple. However, all this hinges on starting with the correct template.</p>
<h2>Electronic Journal Service:</h2>
<p>Next, let&#8217;s continue with <a href="http://research.microsoft.com/en-us/projects/ejournal/">Microsoft&#8217;s new journaling service</a>. The first thing to note is that this will be a service and not software. It will be hosted by Microsoft. The service handles the entire peer review process for an on-line academic journal up to, but not including, publication. The service, in alpha testing at the moment, appears to be geared for both non-profit and commercial publishers. The service does not place any constraints on the publishing of the journal. Material is to be exported to whatever system the publisher wants to use. This integrates nicely with the author tools and SWORD compatibility described earlier.</p>
<p>The best method to submit content into the journaling service is via SWORD from within Word, although a standard web-based form submission is also available. The service offers a workflow similar to the <a href="http://pkp.sfu.ca/ojs">Open Journal Systems</a> model. However, it doesn&#8217;t bewilder the user with a massive array of options. One of the interesting points is that if articles are submitted as Word documents then all of the workflow can be accomplished within Word&#8217;s native GUI. As an editor you review articles by downloading them, marking them up, and filling in or correcting metadata, all within Word, before uploading your updated version. Once an article has passed through the peer review process and is ready to be published, it will be pushed to the publisher via SWORD or, if that is not available, via FTP.</p>
<p>It is interesting to see this use of &#8216;double SWORD.&#8217; It is something we have been thinking about for a workflow system we&#8217;re at the very beginning stages of here in Texas: accept SWORD as input and then, when finished with the review workflow, automatically push the content into the repository via SWORD.</p>
<h2>Zentity, .NET repository platform</h2>
<p>Finally, let&#8217;s cover Microsoft&#8217;s new repository platform: <a href="http://research.microsoft.com/en-us/projects/zentity/">Zentity</a>. This has received lots of hype within the repository community. Zentity is a typical web-based application built upon the Microsoft stack. Its data model underneath is a classic triple store with some basic scholarly object relationships defined. Zentity implements a wide array of interoperability protocols: RSS, OAI-PMH, OAI-ORE, SWORD (and Atom Pub) just to start. Microsoft obviously is not using Zentity to trap customers into a Microsoft platform.</p>
<p>After looking at the software, the only feature that distinguishes it from other repository platforms is that it&#8217;s on the Microsoft stack. In fact, it feels like not much development has gone into what is currently implemented; the current features seem to be just the low-hanging fruit that is easy to implement because it is on the Microsoft stack. I&#8217;m hard pressed to see any reason why someone would switch to Zentity or, if they don&#8217;t have a repository, choose Zentity over another platform.</p>
<h2>Commentary, Where is the IR use-case?</h2>
<p>The big thing that I noticed to be absent from Microsoft&#8217;s plan is the use-case for the Institutional Repository in their ecosystem. Microsoft Word integrates into journal publishing via SWORD and the electronic journaling service, but where does Zentity, or any other repository, fit into that? If a faculty member downloads a journal&#8217;s Word template, submits the article into the journal&#8217;s submission system, and it is then published on the journal&#8217;s website, how does it get captured into the institutional repository?</p>
<p>The template-based design lacks a critical feature in my opinion: a &#8216;me too&#8217; feature, something that streamlines a faculty member&#8217;s submission to both the journal and the institutional repository with just one click. Without something that fills this need, faculty members will be forced to create one document and submit it to the journal of their choice via SWORD, then use another template from the IR to create a second document to submit to the institutional repository. This is anything but one-click submission!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/05/microsoft-ecosystem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HOWTO: DSpace + Eclipse + Tomcat</title>
		<link>http://www.scottphillips.com/2009/05/howto-dspace-eclipse-tomcat/</link>
		<comments>http://www.scottphillips.com/2009/05/howto-dspace-eclipse-tomcat/#comments</comments>
		<pubDate>Wed, 13 May 2009 16:14:05 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[Eclipse]]></category>
		<category><![CDATA[Tomcat]]></category>

		<guid isPermaLink="false">http://www.aggiescott.com/?p=4</guid>
		<description><![CDATA[Update: There is an updated version of these instructions covering the next version of Eclipse, 3.5 (Galileo). A common question for someone just starting to develop with DSpace is how do others set up their development environment. Often times this isn&#8217;t documented anywhere, but a lot of time goes into researching the best way to [...]]]></description>
			<content:encoded><![CDATA[<div style="background-color: #f3e5c0;border: 2px solid #582103;padding: 20px;margin: 10px auto;width: 400px"><span style="color: #660033"><span style="font-weight: bold">Update</span></span>: There is an <a href="http://scott.phillips.name/2010/02/how-to-setup-eclipse-tomcat-and-dspace-for-developmen/">updated version of these instructions</a> covering the next version of Eclipse, 3.5 (Galileo).</div>
<p>A common question for someone just starting to develop with DSpace is how others set up their development environment. Oftentimes this isn&#8217;t documented anywhere, but a lot of time goes into researching the best way to set things up. Today I co-taught a class on customizing DSpace for TDL. One of the handouts I created for the class is a simple how-to for setting up DSpace, Eclipse, and Tomcat together for easier development. This is certainly not the only way to set up these tools, but it is the method most developers in TDL choose.</p>
<p>One thing to note is the use of <a href="http://www.eclipsetotale.com/tomcatPlugin.html">Sysdeo&#8217;s Tomcat plugin</a> vs. <a href="http://www.eclipse.org/webtools/">Eclipse&#8217;s WTP plugin</a> for integration with Tomcat. I played around with the WTP tools that come standard with Eclipse and found them to be too buggy to rely on. I hope that future versions of Eclipse will iron out the kinks in WTP tools, but for now I&#8217;m going to stick with Sysdeo.</p>
<h3>Versions:</h3>
<ul>
<li>Eclipse 3.4.x</li>
<li>Tomcat 5.x or 6.x</li>
<li>DSpace 1.5.x</li>
</ul>
<p><a href="http://scott.phillips.name/files/2009/05/setupeclipse.pdf">Howto: setup Eclipse, Tomcat, and DSpace</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/05/howto-dspace-eclipse-tomcat/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Customizing DSpace</title>
		<link>http://www.scottphillips.com/2009/05/customizing-dspace/</link>
		<comments>http://www.scottphillips.com/2009/05/customizing-dspace/#comments</comments>
		<pubDate>Wed, 13 May 2009 15:13:55 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[Presentations]]></category>
		<category><![CDATA[DSpace]]></category>
		<category><![CDATA[TDL]]></category>
		<category><![CDATA[Training]]></category>

		<guid isPermaLink="false">http://www.aggiescott.com/?p=37</guid>
		<description><![CDATA[TDL offers several training classes for DSpace and other software/services we offer. They are a good way to get in depth information on a particular topic; the sessions typically last for a day &#8211; in some cases half a day. You&#8217;ll be able to get all your questions answered about a particular topic. This last [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.tdl.org/about-tdl/training/">TDL offers several training classes</a> for DSpace and other software/services we offer. They are a good way to get in-depth information on a particular topic; the sessions typically last for a day &#8211; in some cases half a day. You&#8217;ll be able to get all your questions answered about a particular topic.</p>
<p>This last Wednesday, after getting back from vacation, I co-taught the &#8220;DSpace Customization&#8221; class with Steve Williams from UT. I think the class was a great success. Here are the slides we used for the class.</p>
<p><a href="http://scott.phillips.name/files/2009/06/tdl-manakin-training.pdf">DSpace Customization Slides</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/05/customizing-dspace/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HOWTO: DSpace Batch Import</title>
		<link>http://www.scottphillips.com/2009/05/howto-dspace-batch-ingest/</link>
		<comments>http://www.scottphillips.com/2009/05/howto-dspace-batch-ingest/#comments</comments>
		<pubDate>Tue, 12 May 2009 18:49:50 +0000</pubDate>
		<dc:creator>Scott</dc:creator>
				<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[DSpace]]></category>

		<guid isPermaLink="false">http://www.aggiescott.com/?p=52</guid>
		<description><![CDATA[About a year ago we ran into the problem where a department wanted to ingest content into our repository using a batch ingest format from their internal database system. My initial thought was to pull the page out of the DSpace manual that covers the batch import and  hand that over to them so they [...]]]></description>
			<content:encoded><![CDATA[<p>About a year ago we ran into the problem where a department wanted to ingest content into our repository using a batch ingest format from their internal database system. My initial thought was to pull the page out of the DSpace manual that covers the batch import and hand that over to them so they could build their import. Turns out, that page doesn&#8217;t exist. All the DSpace manual tells you is the syntax of the ./import command that runs the batch import. If you want to find out how to build a DSpace import, the only place you can look is the Java source, to piece it together.</p>
<p>Obviously this isn&#8217;t going to work for another department, so I created a simple one-page handout (print in duplex) to give someone who needs to create an import. I think this is still a good resource, so even though I wrote it over a year ago I wanted to publish it on my blog.</p>
<p><a href="http://scott.phillips.name/files/2009/06/dspacebatchimport.pdf">DSpace Batch Import Format</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottphillips.com/2009/05/howto-dspace-batch-ingest/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

