premisrw

PREMIS-Reader-Writer: a small PREMIS library designed to work as a plugin for METS-reader-writer. Public functions and classes:

  • data_to_premis
  • premis_to_data
  • PREMISObject
  • PREMISEvent
  • PREMISAgent
  • PREMISRights
class metsrw.plugins.premisrw.premis.PREMISAgent(**kwargs)

Bases: metsrw.plugins.premisrw.premis.PREMISElement

defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

class metsrw.plugins.premisrw.premis.PREMISElement(**kwargs)

Bases: object

Abstract base class for PREMIS object, event and agent classes. Subclasses must implement the schema and defaults properties. An instance can then be initialized either by passing a data kwarg to the class or by passing keyword arguments derived from the element tag names in self.schema, e.g.:

>>> premis_obj = PREMISObject(data=('object', {...}, (...)))
>>> premis_obj = PREMISObject(
    identifier_type='UUID',
    identifier_value='9bf6bcf8-4d77-4623-a9fb-b703365d0ffe',
    ...)

Under the first construction approach, the tuple passed as data becomes the source of truth for the PREMIS element. Under the second construction approach, the kwargs are used to construct a data tuple that becomes the source of truth. This tuple can be accessed via the .data property.

attributes

Return a dict that maps normalized XML attribute names to their values. For example, ‘xsi_schema_location’ and ‘schema_location’ are both keys for the value of the xsi:schemaLocation PREMIS XML attribute.
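The normalization described above can be sketched in plain Python. This is an illustration only, not the metsrw implementation: the helper names snake_case and normalized_keys are hypothetical, and the sketch assumes attribute names are written as prefix:localName.

```python
import re


def snake_case(name):
    """Convert camelCase to snake_case, e.g. 'schemaLocation' -> 'schema_location'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def normalized_keys(attr_name):
    """Return the keys under which one XML attribute name could be exposed.

    Hypothetical helper: a prefixed name yields both a prefixed key and
    an unprefixed key; an unprefixed name yields a single key.
    """
    prefix, _, local = attr_name.rpartition(":")
    keys = [snake_case(local)]
    if prefix:
        keys.insert(0, "{}_{}".format(prefix, snake_case(local)))
    return keys


print(normalized_keys("xsi:schemaLocation"))
print(normalized_keys("version"))
```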

attrs_to_paths

Return a dict that maps valid getter attributes to the simplified XPaths needed to get the corresponding values from self.data.

This property analyzes self.schema and sets self._attrs_to_paths to a dict that maps implicit getters like ‘agent_identifier_value’ and ‘identifier_value’ to the XPaths implicit in self.schema. In the case of PREMISAgent, the above two getters would map to the XPath ‘agent/agent_identifier/agent_identifier_value’. PREMISAgent.schema also implies the getters ‘agent_identifier’ and ‘identifier’, which both map to the XPath ‘agent/agent_identifier’ and which should return a tuple (or list thereof) instead of a string.
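The mapping described above can be illustrated with a short recursive walk over a schema tuple. This is a sketch under assumptions, not the library's code: it assumes a schema is a nested tuple whose first item is a tag name and whose remaining items are child schemas, and AGENT_SCHEMA and build_attrs_to_paths are hypothetical names.

```python
# Assumed schema layout: (tag, child_schema, child_schema, ...)
AGENT_SCHEMA = (
    "agent",
    (
        "agent_identifier",
        ("agent_identifier_type",),
        ("agent_identifier_value",),
    ),
    ("agent_name",),
)


def build_attrs_to_paths(schema, root=None, path=None, mapping=None):
    """Map getter names (long and short forms) to slash-separated paths."""
    mapping = {} if mapping is None else mapping
    tag, children = schema[0], schema[1:]
    root = tag if root is None else root
    path = tag if path is None else "{}/{}".format(path, tag)
    if tag != root:
        # Long form, e.g. 'agent_identifier_value'.
        mapping[tag] = path
        # Short form with the root prefix removed, e.g. 'identifier_value'.
        prefix = root + "_"
        if tag.startswith(prefix):
            mapping.setdefault(tag[len(prefix):], path)
    for child in children:
        build_attrs_to_paths(child, root, path, mapping)
    return mapping


paths = build_attrs_to_paths(AGENT_SCHEMA)
print(paths["agent_identifier_value"])
print(paths["identifier"])
```

Both the long and the short getter name resolve to the same path, which is the behaviour the paragraph above describes for PREMISAgent.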

data
defaults()

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

find(path)
find_text_or_all(path)
findall(path)
findtext(path)
classmethod fromtree(tree)

Create a PREMIS element instance from an lxml _Element.

generate_data()

Generate and return a tuple to assign to self._data, which is the source of truth of the PREMIS XML element. Expects self._xml_element_values and self._xml_attribute_values to be dicts populated with XML element text values and XML attribute values, respectively.

schema()

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

serialize()
tostring(pretty_print=True)
class metsrw.plugins.premisrw.premis.PREMISEvent(**kwargs)

Bases: metsrw.plugins.premisrw.premis.PREMISElement

compression_details

Return a 3-tuple containing the program, version, and algorithm that this PREMIS compression event used to perform the compression.

defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

encryption_details

Return a 3-tuple containing the program, version, and key that this PREMIS encryption event used to perform the encryption.

get_decompression_transform_files(offset=0)

Return a list of dicts representing <mets:transformFile> elements with TRANSFORMTYPE="decompression". The list is derived from compression_algorithm, a comma-separated string of algorithms that must be applied in the order given to decompress the package, e.g., ‘bzip2,tar’ or ‘lzma’.
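A rough sketch of how such a list could be built from the comma-separated string. The function name, the dict keys ('type', 'order' and 'algorithm', loosely mirroring the METS TRANSFORMTYPE, TRANSFORMORDER and TRANSFORMALGORITHM attributes) and the reading of offset as a shift in the order numbering are all assumptions, not the documented return format.

```python
def decompression_transform_files(compression_algorithm, offset=0):
    """Build one dict per algorithm, keeping the order given in the string.

    Assumption: ``offset`` shifts the 1-based order, e.g. so that
    decompression steps can follow an earlier decryption step.
    """
    return [
        {
            "type": "decompression",
            "order": str(index + offset),
            "algorithm": algorithm.strip(),
        }
        for index, algorithm in enumerate(compression_algorithm.split(","), start=1)
    ]


for transform in decompression_transform_files("bzip2,tar", offset=1):
    print(transform)
```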

get_decryption_transform_file()

Return a dict representing a <mets:transformFile> element with TRANSFORMTYPE="decryption".

parsed_event_detail

Parse this event's PREMIS eventDetail string value, e.g.:

'program="7z"; version="9.20"; algorithm="bzip2"'

and return it as a dict, e.g.:

{'algorithm': 'bzip2', 'version': '9.20', 'program': '7z'}
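
The parsing step can be sketched with plain string operations. This is a minimal illustration, not the library's parser: the function name parse_event_detail is hypothetical, and the sketch assumes that values never contain a semicolon.

```python
def parse_event_detail(event_detail):
    """Parse 'key="value"; key="value"' pairs into a dict."""
    parsed = {}
    for pair in event_detail.split(";"):
        key, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that have no '='
            parsed[key.strip()] = value.strip().strip('"')
    return parsed


detail = 'program="7z"; version="9.20"; algorithm="bzip2"'
print(parse_event_detail(detail))
```
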
schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

class metsrw.plugins.premisrw.premis.PREMISObject(**kwargs)

Bases: metsrw.plugins.premisrw.premis.PREMISElement

defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

class metsrw.plugins.premisrw.premis.PREMISRights(**kwargs)

Bases: metsrw.plugins.premisrw.premis.PREMISElement

defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

metsrw.plugins.premisrw.premis.data_find(data, path)

Find and return the first element-as-tuple in tuple data using simplified XPath path.

metsrw.plugins.premisrw.premis.data_find_all(data, path, dyn_cls=False)

Find and return all element-as-tuples in tuple data using simplified XPath path.

metsrw.plugins.premisrw.premis.data_find_text(data, path)

Return the text value of the element-as-tuple in tuple data using simplified XPath path.
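The three lookups above can be illustrated with a small stand-alone sketch. It is not the library's code: it assumes an element-as-tuple has the layout (tag, optional attribute dict, then child tuples or a text string), that a simplified XPath is a slash-separated list of tag names resolved against the children of data, and the names EVENT, find_all, find and find_text are hypothetical.

```python
EVENT = (
    "event",
    {"version": "2.2"},
    (
        "event_identifier",
        ("event_identifier_type", "UUID"),
        ("event_identifier_value", "f4b7758f-e7b2-4155-9b56-d76965849fc1"),
    ),
    ("event_type", "compression"),
)


def child_elements(element):
    """Return the child tuples, skipping the attribute dict and any text."""
    return [part for part in element[1:] if isinstance(part, tuple)]


def find_all(data, path):
    """Return every element-as-tuple reachable via the slash-separated path."""
    head, _, rest = path.partition("/")
    matches = [child for child in child_elements(data) if child[0] == head]
    if not rest:
        return matches
    found = []
    for match in matches:
        found.extend(find_all(match, rest))
    return found


def find(data, path):
    """Return the first match, or None if nothing matches."""
    matches = find_all(data, path)
    return matches[0] if matches else None


def find_text(data, path):
    """Return the text of the first match, or None."""
    element = find(data, path)
    if element is None:
        return None
    texts = [part for part in element[1:] if isinstance(part, str)]
    return texts[0] if texts else None


print(find(EVENT, "event_type"))
print(find_text(EVENT, "event_identifier/event_identifier_value"))
```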

metsrw.plugins.premisrw.premis.data_find_text_or_all(data, path, dyn_cls=False)
metsrw.plugins.premisrw.premis.data_to_premis(data, premis_version='2.2')

Given tuple data representing a PREMIS entity (object, event or agent), return an lxml.etree._Element instance. For example:

>>> p = data_to_premis((
    'event',
    utils.PREMIS_META,
    (
        'event_identifier',
        ('event_identifier_type', 'UUID'),
        ('event_identifier_value', str(uuid4()))
    )
))
>>> etree.tostring(p, pretty_print=True).decode('utf8')
'''<premis:event
    xmlns:premis="info:lc/xmlns/premis-v2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    version="2.2"
    xsi:schemaLocation="info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd">
    <premis:eventIdentifier>
        <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
        <premis:eventIdentifierValue>f4b7758f-e7b2-4155-9b56-d76965849fc1</premis:eventIdentifierValue>
    </premis:eventIdentifier>
</premis:event>'''
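
The general idea of the conversion, turning snake_case tuple tags into namespaced camelCase XML elements, can be sketched with the standard library. Note that this swaps in xml.etree.ElementTree for lxml so that it runs without extra dependencies, and that the helper names camel_case and tuple_to_element and the simplified attribute dict are assumptions, not the library's implementation.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def camel_case(name):
    """Convert snake_case to camelCase, e.g. 'event_identifier' -> 'eventIdentifier'."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def tuple_to_element(data):
    """Recursively build an Element from (tag, optional dict, children or text)."""
    element = ET.Element("{%s}%s" % (PREMIS_NS, camel_case(data[0])))
    for part in data[1:]:
        if isinstance(part, dict):
            # Attribute dict: set each attribute on the element.
            for key, value in part.items():
                element.set(key, value)
        elif isinstance(part, tuple):
            element.append(tuple_to_element(part))
        else:
            element.text = str(part)
    return element


event = tuple_to_element((
    "event",
    {"version": "2.2"},  # simplified stand-in for the real attribute dict
    (
        "event_identifier",
        ("event_identifier_type", "UUID"),
        ("event_identifier_value", "f4b7758f-e7b2-4155-9b56-d76965849fc1"),
    ),
))
print(ET.tostring(event, encoding="unicode"))
```
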
metsrw.plugins.premisrw.premis.el_is_empty(el)

Return True if tuple el represents an empty XML element.

metsrw.plugins.premisrw.premis.generate_element_class(tuple_instance)

Dynamically create a sub-class of PREMISElement given tuple_instance, which is a tuple representing an XML data structure.

metsrw.plugins.premisrw.premis.get_attrs_to_paths(schema, attrs_to_paths=None, path=None)

Analyze PREMIS-element-as-tuple schema and return a dict that maps attribute names to the simplified XPaths needed to retrieve them, e.g.:

>>> {'object_identifier_type':
        'object_identifier/object_identifier_type',
     'object_identifier_value':
        'object_identifier/object_identifier_value'}
metsrw.plugins.premisrw.premis.get_event_type(data)
metsrw.plugins.premisrw.premis.now()
metsrw.plugins.premisrw.premis.premis_to_data(premis_lxml_el)

Transform a PREMIS lxml.etree._Element instance into a Python tuple.
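The reverse direction can be sketched in the same spirit. This example uses the standard library's xml.etree.ElementTree rather than lxml, and the helper names, the sample agent values and the exact tuple layout (tag, optional attribute dict, then children or text) are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample document; the identifier values are placeholders.
XML = """<premis:agent xmlns:premis="info:lc/xmlns/premis-v2" version="2.2">
  <premis:agentIdentifier>
    <premis:agentIdentifierType>local</premis:agentIdentifierType>
    <premis:agentIdentifierValue>example-agent</premis:agentIdentifierValue>
  </premis:agentIdentifier>
  <premis:agentName>Example</premis:agentName>
</premis:agent>"""


def snake_case(name):
    """Convert camelCase to snake_case, e.g. 'agentName' -> 'agent_name'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def element_to_tuple(element):
    """Recursively convert an Element into (tag, optional dict, children or text)."""
    local_name = element.tag.rpartition("}")[2]  # drop the '{namespace}' part
    parts = [snake_case(local_name)]
    if element.attrib:
        parts.append(dict(element.attrib))
    children = list(element)
    if children:
        parts.extend(element_to_tuple(child) for child in children)
    elif element.text and element.text.strip():
        parts.append(element.text.strip())
    return tuple(parts)


agent = element_to_tuple(ET.fromstring(XML))
print(agent)
```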

metsrw.plugins.premisrw.premis.tuple_to_schema(tuple_)

Convert a tuple representing an XML data structure into a schema tuple that can be used in the .schema property of a sub-class of PREMISElement.
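One way to picture this conversion is to keep only the tag names and the nesting, discarding text values and attribute dicts. The sketch below does exactly that; the function name data_to_schema and the assumed schema layout (nested tuples of tag names) are illustrative guesses rather than the library's actual behaviour, and repeated sibling elements are not de-duplicated.

```python
def data_to_schema(data):
    """Keep the tag and child structure; drop text values and attribute dicts."""
    children = tuple(
        data_to_schema(part) for part in data[1:] if isinstance(part, tuple)
    )
    return (data[0],) + children


EVENT = (
    "event",
    {"version": "2.2"},
    (
        "event_identifier",
        ("event_identifier_type", "UUID"),
        ("event_identifier_value", "f4b7758f-e7b2-4155-9b56-d76965849fc1"),
    ),
    ("event_type", "compression"),
)

print(data_to_schema(EVENT))
```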

metsrw.plugins.premisrw.premis.uuid()