Glossary
Definitions of the terms and acronyms used in this documentation.
PGCN: Plateforme de Gestion de Contenu Numérisé (Digitized Content Management Platform). Original name of NumaHOP.
OCR: Optical Character Recognition. Process by which text is recognized in images.
Metadata standards and file formats.
METS: Metadata Encoding and Transmission Standard. An XML standard.
EAD: Encoded Archival Description.
DC: Dublin Core. A small set of core metadata elements for bibliographic units (see the Dublin Core User Guide).
SIP: Submission Information Package. A file describing a package to be archived.
AIP: Archival Information Package. A file representing an archived package.
XSD: XML Schema Definition. An XML file describing an XML format, which allows an XML parser to verify that a file is valid according to a standard.
MARC: Machine-Readable Cataloging. A format for bibliographic data (see the MARC specification).
ALTO: Analyzed Layout and Text Object.
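To make the Dublin Core entry above more concrete, here is a minimal sketch that builds a record with three Dublin Core elements using Python's standard library. The `record` wrapper element, the `build_dc_record` helper, and the sample values are assumptions made for illustration; they are not taken from NumaHOP.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(title: str, creator: str, date: str) -> ET.Element:
    """Build a small XML record holding three Dublin Core elements.

    The <record> wrapper is a hypothetical container chosen for this sketch.
    """
    record = ET.Element("record")
    for name, value in (("title", title), ("creator", creator), ("date", date)):
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


record = build_dc_record("Sample title", "Sample author", "1900")
print(ET.tostring(record, encoding="unicode"))
```

Each element is qualified with the Dublin Core namespace, so the serialized output uses the `dc:` prefix registered above.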
Protocols.
OAI-PMH: Open Archives Initiative Protocol for Metadata Harvesting. A protocol for harvesting metadata records from repositories (see the OAI-PMH specification).
Z39.50: A protocol for searching and retrieving data from databases on remote servers over TCP/IP.
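As an illustration of the OAI-PMH entry above, the following sketch builds the URL of a `ListRecords` request, one of the six verbs defined by the protocol. The base URL `https://example.org/oai` and the helper name `build_list_records_url` are placeholders assumed for this example, not an actual NumaHOP endpoint or function.

```python
from typing import Optional
from urllib.parse import urlencode


def build_list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Return the URL of an OAI-PMH ListRecords request.

    `metadataPrefix` selects the record format (oai_dc is Dublin Core);
    `set` optionally restricts harvesting to one set of the repository.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, used only for illustration.
url = build_list_records_url("https://example.org/oai")
print(url)
```

A harvester would send an HTTP GET to this URL and parse the XML response; that part is left out to keep the sketch free of network access.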
NumaHOP Vocabulary.
Digitalization service provider: The company (for external digitalization) or service (for internal digitalization) carrying out the digitalization process.
Document Unit: Also referred to as Doc Unit or DU. A document, or part of a document, from a collection to be processed by NumaHOP.
Workflow: Steps the Document Unit takes through NumaHOP. Can be defined at the project level or at the bundle level.
Notice: Bibliographic metadata attached to a DU.
Condition Report: A check of the state of the document after it was manipulated during the digitalization process.
Delivery: The step where the digitalized documents are deposited into NumaHOP and attached to the correct Document Unit.
Import: The step where we create the Document Unit.
Grouping of Document Units.
Project: A project is usually a set of documents to be digitalized by the same provider.
Bundle: A physical set of Document Units, typically around 20 to 50 documents.
Train: As in digitalization train. A smaller set of documents, grouped for insurance reasons.
Not every grouping level is useful for every use case. For projects with a small volume of documents, the train is less useful.
Design pattern jargon.
DTO: Data Transfer Object. An object traveling between the view and the controller.
Service: Class containing business logic.
Repository: Class abstracting a storage method (database, Elasticsearch or file system).
Mapper: Class performing mapping from a domain object to another.
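The four terms above usually work together. The sketch below shows one possible arrangement, written in Python for brevity. All class names, fields, and the in-memory storage are hypothetical and chosen for illustration; they do not reproduce NumaHOP's actual classes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DocUnit:
    """Domain object (hypothetical fields)."""
    identifier: str
    label: str
    internal_path: str  # storage detail that should not reach the view


@dataclass
class DocUnitDTO:
    """DTO: only the fields the view needs."""
    identifier: str
    label: str


class DocUnitMapper:
    """Mapper: converts a domain object into its DTO."""

    @staticmethod
    def to_dto(doc_unit: DocUnit) -> DocUnitDTO:
        return DocUnitDTO(identifier=doc_unit.identifier, label=doc_unit.label)


class InMemoryDocUnitRepository:
    """Repository: hides the storage method (a dict stands in for a database)."""

    def __init__(self) -> None:
        self._store: Dict[str, DocUnit] = {}

    def save(self, doc_unit: DocUnit) -> None:
        self._store[doc_unit.identifier] = doc_unit

    def find_by_id(self, identifier: str) -> Optional[DocUnit]:
        return self._store.get(identifier)


class DocUnitService:
    """Service: business logic, here a lookup that fails on unknown identifiers."""

    def __init__(self, repository: InMemoryDocUnitRepository) -> None:
        self._repository = repository

    def get_doc_unit(self, identifier: str) -> DocUnitDTO:
        doc_unit = self._repository.find_by_id(identifier)
        if doc_unit is None:
            raise KeyError(f"Unknown Document Unit: {identifier}")
        return DocUnitMapper.to_dto(doc_unit)


repository = InMemoryDocUnitRepository()
repository.save(DocUnit("du-1", "Sample volume", "/storage/du-1"))
service = DocUnitService(repository)
dto = service.get_doc_unit("du-1")
print(dto)
```

In this arrangement a controller would call the service and hand the returned DTO to the view, so storage details such as `internal_path` stay inside the domain layer.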