Exporting from Vireo into DSpace
The first version of the Vireo Electronic Thesis and Submission system was built as an add-on to DSpace. It used the same technology stack, reused the underlying database and file storage, and operated within the same UI. The original idea was that Vireo would integrate deeply with the repository. Because of these decisions there was no separation between how Vireo stored its metadata and its Dublin Core encoding of that metadata. There was only one way, the Vireo way. If you wanted something else, you could not do it with DSpace. For example, if you wanted to store the author’s information in contributor.author, you could not; Vireo demanded that you use the creator field instead.
Vireo 2.0 removed this requirement, bringing more flexibility. The project is no longer deeply integrated with DSpace, or with any other repository. Internally, Vireo stores its metadata in relational tables in whatever format is easiest for it to work with, and that format does not conform to any particular metadata encoding. When a submission is ready, Vireo uses the SWORD protocol to deposit the content into the destination repository. During the SWORD deposit Vireo generates a metadata package in a particular encoding format. These “export formats” are designed to be flexible so that different repositories can use different encodings. I’ve previously written a blog post on the technology behind these export formats if you are interested in customizing them.
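To make the deposit step concrete, here is a minimal sketch of what a SWORD version 1 deposit request looks like on the wire. It is not Vireo's own code (Vireo has its own deposit machinery); the URL, file name, and function name are hypothetical, and the headers are the ones I understand the SWORD 1.3 profile and DSpace's METS packaging to use. The request is only built, never sent.

```python
import urllib.request

# Hypothetical collection deposit URL; a real one comes from the
# repository's SWORD service document.
DEPOSIT_URL = "https://repository.example.edu/sword/deposit/123456789/1"


def build_sword_deposit(deposit_url, package_bytes, filename):
    """Build (but do not send) a SWORD v1 style deposit request."""
    headers = {
        # The export package is a zip file containing the metadata and files.
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        # Assumed packaging identifier for a DSpace METS-based package.
        "X-Packaging": "http://purl.org/net/sword-types/METSDSpaceSIP",
    }
    return urllib.request.Request(
        deposit_url, data=package_bytes, headers=headers, method="POST"
    )


# Placeholder bytes stand in for a real zip package.
request = build_sword_deposit(DEPOSIT_URL, b"zip bytes go here", "submission.zip")
print(request.get_method(), request.full_url)
```

A real deposit would also need credentials (SWORD v1 deposits normally use HTTP Basic authentication) and would pass the request to `urllib.request.urlopen`.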
Vireo ships with several built-in export formats. The two formats for DSpace are the METS-based package and the simple archive format. Both of these formats use the same metadata profile that was hard-coded into prior versions of Vireo. Here is a recap of that profile:
Qualified Dublin Core Fields:
dc.creator: The student’s name in last, first format.
dc.title: The document’s title.
dc.description.abstract: The document’s abstract as plain text.
dc.subject: The document’s keywords, each as a separate element.
dc.contributor.advisor: The name of the student’s Committee Chair, Co-Chair, Supervisor, Co-Supervisor, or Advisor, in last, first format.
dc.contributor.committeeMember: The names of the student’s committee members who have a role other than one associated with the advisor field, also in last, first format.
dc.date.created: The student’s graduation date in the format “YYYY-MM”.
dc.date.submitted: The date the student first completed their submission for review.
dc.date.issued: The date when the submission transitioned into the Approved state.
dc.format.mimetype: The mimetype of the primary document. This is always “application/pdf” because primary documents are required to be PDFs.
dc.language.iso: The document’s language in ISO 639-2 format.
dc.type.material: This field is statically defined as “text”.
dc.type: This field is statically defined as “Thesis”.
dc.identifier.uri: The handle or other identifier assigned by the repository. It is not assigned until the item has been submitted into a repository, so it is absent on the first submission and defined on subsequent ones. However, version 1 of the SWORD protocol does not support re-submission of the same item; a second deposit would create a duplicate copy in the repository.
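As an illustration of how the Dublin Core fields above end up in a package, here is a minimal sketch that writes them in the `dublin_core.xml` layout used by the DSpace simple archive format. It is not Vireo's actual export template. The `sample` dictionary, its keys, and both function names are made up for the example, and the two workflow dates and dc.identifier.uri are left out to keep it short.

```python
import xml.etree.ElementTree as ET
from datetime import date


def profile_values(sub):
    """Map a submission (hypothetical dict layout) to (element, qualifier, value)."""
    values = [
        ("creator", None, sub["author"]),
        ("title", None, sub["title"]),
        ("description", "abstract", sub["abstract"]),
    ]
    # Keywords and committee names each become a separate element.
    values += [("subject", None, kw) for kw in sub["keywords"]]
    values += [("contributor", "advisor", n) for n in sub["advisors"]]
    values += [("contributor", "committeeMember", n) for n in sub["committee_members"]]
    values += [
        ("date", "created", sub["graduation"].strftime("%Y-%m")),
        ("format", "mimetype", "application/pdf"),
        ("language", "iso", sub["language"]),
        ("type", "material", "text"),
        ("type", None, "Thesis"),
    ]
    return values


def dublin_core_xml(values):
    """Serialize the values as a simple archive format dublin_core.xml document."""
    root = ET.Element("dublin_core")
    for element, qualifier, text in values:
        # Unqualified fields are written with qualifier="none".
        node = ET.SubElement(root, "dcvalue", element=element,
                             qualifier=qualifier or "none")
        node.text = text
    return ET.tostring(root, encoding="unicode")


sample = {
    "author": "Doe, Jane",
    "title": "An Example Thesis",
    "abstract": "A short abstract.",
    "keywords": ["metadata", "repositories"],
    "advisors": ["Smith, John"],
    "committee_members": ["Lee, Ann"],
    "graduation": date(2013, 5, 1),
    "language": "eng",
}
print(dublin_core_xml(profile_values(sample)))
```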
ETD-MS Fields:
thesis.degree.name: The full degree name selected by the student.
thesis.degree.level: The degree level (Doctoral, Masters, Undergraduate) selected by the student.
thesis.degree.discipline: The major selected by the student.
thesis.degree.grantor: The degree-granting institution. This field is set statically for all submissions under the Application’s Settings tab.
thesis.degree.department: The department selected by the student.
DSpace Embargo Fields:
local.embargo.terms: The type of embargo term, in this case a specific date calculated from the graduation date.
local.embargo.lift: The date when this item should be released from embargo. This is calculated from the graduation date, and yes, it is the same date as local.embargo.terms. That’s just the way DSpace was made to work, even if it doesn’t make sense from this perspective.
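The two embargo fields can be sketched like this. The description above only says the date is calculated from the graduation date, so the rest is my assumption: an embargo length in whole months, counted from the first day of the graduation month, and written as an ISO date. The function name is hypothetical.

```python
from datetime import date


def embargo_fields(graduation_year, graduation_month, embargo_months):
    """Return both embargo fields for a graduation month and embargo length."""
    # Count months from zero so that rolling over into the next year
    # is plain integer division.
    total = graduation_year * 12 + (graduation_month - 1) + embargo_months
    lift = date(total // 12, total % 12 + 1, 1)
    # Both fields carry the same date, as described above.
    return {
        "local.embargo.terms": lift.isoformat(),
        "local.embargo.lift": lift.isoformat(),
    }


print(embargo_fields(2013, 5, 12))
```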
Dublin Core Provenance Fields:
dc.description.provenance: An English description of the embargo selected.
dc.description.provenance: An English description of when the student agreed to the license.
dc.description.provenance: An English description of the submission date and document type.
dc.description.provenance: An English description of when the committee approved the submission.
dc.description.provenance: An English description of when the submission was approved.
dc.description.provenance: An English description of when the submission was deposited.
If you are trying to get Vireo to deposit items into DSpace, the number one problem is that DSpace does not have the above fields defined in its metadata registry. When you deposit an item that uses a field unknown to DSpace, DSpace writes a message to its log, but the error returned to Vireo via SWORD is only a generic error message. One of the best debugging techniques I have found is to tail the DSpace logs while doing a submission to identify the error. Most of the time the fix is as simple as adding the missing field to the metadata registry.
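Before attempting a deposit, it can help to compare the profile against what the registry already has. Here is a minimal sketch, assuming you have copied the registered field names out of the DSpace metadata registry into a list by hand. The `registered` list below is made up, and nothing in the sketch talks to DSpace itself.

```python
# Every distinct field used by the profile described above.
REQUIRED_FIELDS = [
    "dc.creator", "dc.title", "dc.description.abstract", "dc.subject",
    "dc.contributor.advisor", "dc.contributor.committeeMember",
    "dc.date.created", "dc.date.submitted", "dc.date.issued",
    "dc.format.mimetype", "dc.language.iso", "dc.type.material", "dc.type",
    "dc.identifier.uri",
    "thesis.degree.name", "thesis.degree.level", "thesis.degree.discipline",
    "thesis.degree.grantor", "thesis.degree.department",
    "local.embargo.terms", "local.embargo.lift",
    "dc.description.provenance",
]


def missing_fields(registered):
    """Return the profile fields that are absent from the registry, in profile order."""
    known = set(registered)
    return [field for field in REQUIRED_FIELDS if field not in known]


# Hypothetical registry contents, for illustration only.
registered = ["dc.creator", "dc.title", "dc.subject", "dc.type"]
for field in missing_fields(registered):
    print("add to metadata registry:", field)
```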