Posts Tagged ‘TDL’

Preserving Character Encodings of a DSpace Metadata Export using MS Excel 2011 on OS X

Wednesday, July 20th, 2011

The problem I recently ran into was updating the metadata for a particular collection that was being moved from TDL’s repository into A&M’s repository. I was able to quickly move the collection into the new repository using OAI-PMH harvesting with ORE support. However, the metadata needed a bit of cleaning up for its new repository home, such as changing dc.contributor.author to dc.author and fixing inconsistent formats used in other fields. This is a perfect task for Stuart’s Bulk Metadata Export tool. This DSpace feature allows an administrator to download a Comma Separated Values (CSV) file of all the metadata in a particular collection, then open it in MS Excel and edit the metadata naturally. Finally, once the metadata is ready to go, you can upload it back to the repository and all the fields will be updated correctly. It is a very nice feature that can save a ton of time.

The Problem

When I opened the file in Excel, some of the characters were not showing up correctly, specifically characters in titles and names that used non-English marks; in this case they were all from the extended Latin character set. If you ignore these problems, then when you later upload the CSV file DSpace will pick up on these changes and introduce the garbled characters into the repository.
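To make the failure mode concrete, here is a minimal Python sketch of how this kind of garbling can arise. It assumes the export is UTF-8 and that the spreadsheet reads it as Mac OS Roman; that pairing is my assumption for illustration, not something established above, and the sample name and function names are made up.

```python
def garble(text, wrong_encoding="mac_roman"):
    """Simulate opening UTF-8 bytes as if they were in another encoding.

    The choice of mac_roman is an assumption made for this illustration.
    """
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled, wrong_encoding="mac_roman"):
    """Undo the mistake by reversing the wrong decode step."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    original = "José Martínez"  # hypothetical author name
    broken = garble(original)
    print(broken)          # accented letters appear as two odd symbols each
    print(repair(broken))  # round-trips back to the original text
```

The point of the sketch is that the bytes in the file are fine; only the interpretation is wrong. Once a garbled value is saved back out and uploaded, though, the wrong characters become the real data.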


(more…)

SvnBot 1.1 Released

Friday, February 25th, 2011

The SvnBot is a simple, single-purpose IRC robot that monitors one or more SVN repositories. When changes are committed to a source repository, the robot makes an announcement in an IRC channel. The purpose of the tool is to allow a team of developers to keep up to date on changes that other team members are making. Here at TDL we have a geographically distributed team of software developers, some in Austin and Lubbock, along with myself in College Station. This is one tool that helps the team keep in sync with each other.

There are already many tools available that do this task [1][2][3]; however, they all require the use of SVN commit-hooks. Commit-hooks are run on the repository’s server, allowing external tools to be notified when specific events occur. Using commit-hooks can work reasonably well if you have access to the server’s configuration, but that is not always the case. Instead of relying on commit-hooks, the SvnBot runs independently, periodically polling the repository for any updates. When an update is found, a message is announced in IRC.
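The polling approach can be sketched in a few lines of Python. This is only an illustration of the idea, not the SvnBot's actual code: fetch_latest_revision and announce are hypothetical callables standing in for "ask the repository for its newest revision number" and "post a message to IRC", and the 60 second interval is an arbitrary default.

```python
import time


def poll_for_commits(fetch_latest_revision, announce, last_seen,
                     interval_seconds=60, max_polls=None, sleep=time.sleep):
    """Poll for new revisions and announce each one exactly once.

    fetch_latest_revision and announce are hypothetical stand-ins for the
    repository query and the IRC message; neither is a real SvnBot API.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        latest = fetch_latest_revision()
        # Announce every revision committed since the previous poll.
        for revision in range(last_seen + 1, latest + 1):
            announce(revision)
        last_seen = max(last_seen, latest)
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)
    return last_seen


if __name__ == "__main__":
    fake_history = iter([665, 666, 668])  # made-up revision numbers
    poll_for_commits(lambda: next(fake_history),
                     lambda rev: print("new commit: r%d" % rev),
                     last_seen=665, max_polls=3, sleep=lambda s: None)
```

The trade-off compared with commit-hooks is latency: an announcement can lag a commit by up to one polling interval, but nothing has to be installed on the repository server.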

Below is an example IRC message. The message includes the author and their commit message, along with some brief statistics about the number of files affected and a URL. If multiple files were affected by a single commit, then the URL reported is to the common path, i.e. the closest directory that contains all the affected files.

Scott: “SAND-30, reviewed the pom file: removing unneeded dependencies, declaring output to be UTF-8 and added a few comments.” (Rev 666: 1 file modified) https://texasdl.jira.com/svn/SAND/svnbot/trunk/pom.xml
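The common path rule can be illustrated with Python's standard library. This is a sketch of the idea rather than the SvnBot's own implementation; the function name and the second file path are invented for the example.

```python
import posixpath


def common_announce_path(changed_paths):
    """Return the deepest path shared by every changed file.

    With one file this is the file itself; with several it is the closest
    directory containing all of them. Raises ValueError for an empty list.
    """
    return posixpath.commonpath(changed_paths)


if __name__ == "__main__":
    # One file changed: the announcement points at the file itself.
    print(common_announce_path(["/SAND/svnbot/trunk/pom.xml"]))
    # Two files changed (the second path is hypothetical): the
    # announcement points at the directory that contains both.
    print(common_announce_path([
        "/SAND/svnbot/trunk/pom.xml",
        "/SAND/svnbot/trunk/src/Main.java",
    ]))
```

posixpath.commonpath compares whole path components rather than raw characters, so two sibling files whose names share a prefix still resolve to their parent directory.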

Special thanks to the Texas Digital Library, my employer, for allowing me to release this as an open source project.
(more…)

DSpace Functional Tests?

Sunday, April 4th, 2010

The Texas Digital Library has been focusing on testability for our projects. Since DSpace is related to or part of most of our projects, we’ve been looking for a way to increase DSpace’s testability. Traditionally this would mean adding unit tests and integration tests. However, as DSpace currently stands it is hard to break it up into individual components that can be tested in isolation. You’ll quickly find that writing tests for DSpace pulls in the entire system, plus databases and a file system. To address this problem we’ve created a simple framework for adding both integration tests and functional tests, which improve the reliability of our projects. I’m interested to see if this is something the greater DSpace community would be interested in.

The goal of our project was to create a mechanism where we could run complete functional tests. Functional tests evaluate the entire system as the end user would use it, so think of it as opening a web browser and evaluating the output, but completely automated. They test everything all together. Ideally it would be better to test each component individually, but this is impractical for DSpace for two reasons: 1) DSpace is highly integrated and nearly impossible to separate from the database and file systems, and 2) creating unit tests for all of DSpace is very time consuming, so it is simpler to write a few functional tests that cover a wide set of features over the whole application. It gets you to a point where you can reliably verify the software more quickly. If you’re working on unit tests for DSpace, please do not let this stand in your way.
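To show what "a web browser, but automated" means in practice, here is a self-contained Python sketch of a functional-style check. It is not our framework and does not touch DSpace: a tiny stand-in web server plays the role of the application, and the page content, class names, and /search form are all invented for the example.

```python
import threading
import urllib.request
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in page; a real functional test would target the running application.
PAGE = (b"<html><body><h1>Demo repository</h1>"
        b"<form action='/search'><input name='q'></form></body></html>")


class DemoHandler(BaseHTTPRequestHandler):
    """Serves the same stand-in page for every GET request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


class FormFinder(HTMLParser):
    """Collects the action attribute of every form on the page."""

    def __init__(self):
        super().__init__()
        self.form_actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_actions.append(dict(attrs).get("action"))


def run_functional_check():
    """Start the server, request the page over HTTP, inspect the result."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        # Bypass any configured proxy so the request stays on this machine.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            status = response.status
            body = response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
    finder = FormFinder()
    finder.feed(body)
    return status, finder.form_actions


if __name__ == "__main__":
    print(run_functional_check())
```

The check goes through the same path an end user's request would (HTTP request, response, rendered HTML), which is why one such test exercises many components at once.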

(more…)

Grails HTTP-based test plugins

Wednesday, November 18th, 2009

There are two leading frameworks for doing HTTP-based integration tests within the Grails framework: functional-test and webtest. Both are packaged as Grails plugins and accomplish the same task in a similar manner. The main library behind the scenes is HtmlUnit. This is a well thought out library for abstracting web conversations, i.e. get this URL, check that it has a form, click the submit button. It’s basically a headless browser which even supports JavaScript. HtmlUnit appears to have taken the mantle from HttpUnit as the premier library in this small arena. There has only been one release since 2006 for HttpUnit, while HtmlUnit has been very active with 11 releases in the same period.

Webtest is the more established project with its first release in 2007. Webtest is provided by Canoo as a free open source plugin to Grails. The tests are run via an Ant script that packages together the test cases and runs through each one.

Functional-test is a relatively new project with its first release in early 2009. Functional-test is basically the same as webtest; however, instead of using Ant-based scripts, everything is based on JUnit and implemented in plain old Java or Groovy.

We’ve chosen to proceed with the JUnit-based plugin for our HTTP integration tests. Our primary reason for choosing JUnit is that it integrates well with our other unit-based tests, which also use JUnit, so we avoid using disparate frameworks for the various types of tests. While this plugin may be newer, its traffic on the mailing lists is substantial and growing.

Where to host a blog?

Sunday, June 7th, 2009

One of the questions I faced when starting this blog was where to host it. There are lots of options, from several commercial blogging services to the many free blogging services such as Blogger, SquareSpace, ExpressionEngine, or WordPress.com. Because of my employment there is also the option to use the Texas Digital Library’s blogging service, which is based upon WordPress. Lastly, because I have the technical skills and available hosting, I can self-publish my blog. I ultimately decided to self-publish this blog using my own means instead of using a blogging service. Here are the factors that affected my decision:

(more…)