premisrw API Documentation

premisrw is a library to help with parsing and creating PREMIS elements.

Conversion functions

metsrw.plugins.premisrw.premis.data_to_premis(data, premis_version='2.2')[source]

Given a tuple representing a PREMIS entity (object, event, or agent), return an lxml.etree._Element instance. For example:

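>>> from uuid import uuid4
>>> from lxml import etree
>>> from metsrw.plugins.premisrw import utils
>>> from metsrw.plugins.premisrw.premis import data_to_premis
>>> p = data_to_premis((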
    'event',
    utils.PREMIS_META,
    (
        'event_identifier',
        ('event_identifier_type', 'UUID'),
        ('event_identifier_value', str(uuid4()))
    )
))
>>> etree.tostring(p, pretty_print=True).decode('utf8')
'''<premis:event
    xmlns:premis="info:lc/xmlns/premis-v2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    version="2.2"
    xsi:schemaLocation="info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd">
    <premis:eventIdentifier>
        <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
        <premis:eventIdentifierValue>f4b7758f-e7b2-4155-9b56-d76965849fc1</premis:eventIdentifierValue>
    </premis:eventIdentifier>
</premis:event>'''
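
The tuple-to-XML direction can be illustrated with a minimal standard-library sketch. It is not the library's implementation: tuple_to_element, camel_case, and the handling of a dict as XML attributes are assumptions made for illustration, and xml.etree.ElementTree stands in for lxml.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"


def camel_case(name):
    """Turn 'event_identifier_type' into 'eventIdentifierType'."""
    first, *rest = name.split("_")
    return first + "".join(part.capitalize() for part in rest)


def tuple_to_element(data):
    """Hypothetical helper: build an element from (tag, *items).

    A dict item is treated as XML attributes, a tuple item as a child
    element, and anything else as the element's text.
    """
    tag, *items = data
    element = ET.Element(f"{{{PREMIS_NS}}}{camel_case(tag)}")
    for item in items:
        if isinstance(item, dict):
            element.attrib.update(item)
        elif isinstance(item, tuple):
            element.append(tuple_to_element(item))
        else:
            element.text = str(item)
    return element


event = tuple_to_element((
    "event",
    {"version": "2.2"},
    (
        "event_identifier",
        ("event_identifier_type", "UUID"),
        ("event_identifier_value", "f4b7758f-e7b2-4155-9b56-d76965849fc1"),
    ),
))
```
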
metsrw.plugins.premisrw.premis.premis_to_data(premis_lxml_el)[source]

Transform a PREMIS lxml.etree._Element instance into a Python tuple.
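
As a rough illustration of the XML-to-tuple direction, the following standard-library sketch converts an element tree into nested tuples with snake_case tag names. The names element_to_tuple and snake_case are hypothetical, XML attributes are ignored for brevity, and the exact tuple shape produced by premis_to_data may differ.

```python
import re
import xml.etree.ElementTree as ET


def snake_case(name):
    """Turn 'eventIdentifierType' into 'event_identifier_type'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def element_to_tuple(element):
    """Hypothetical helper: return (tag, *children) or (tag, text)."""
    # Drop the '{namespace}' prefix that ElementTree puts on tag names.
    tag = snake_case(element.tag.split("}")[-1])
    children = [element_to_tuple(child) for child in element]
    if children:
        return (tag, *children)
    return (tag, (element.text or "").strip())


xml_source = """<premis:event xmlns:premis="info:lc/xmlns/premis-v2">
  <premis:eventIdentifier>
    <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
    <premis:eventIdentifierValue>f4b7758f-e7b2-4155-9b56-d76965849fc1</premis:eventIdentifierValue>
  </premis:eventIdentifier>
</premis:event>"""

event_data = element_to_tuple(ET.fromstring(xml_source))
```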

PREMIS Element Types

class metsrw.plugins.premisrw.premis.PREMISElement(**kwargs)[source]

Bases: object

Abstract base class for the PREMIS object, event, and agent classes. Subclasses must implement the schema and defaults properties. Initialization can then proceed either by passing a data kwarg to the class or by passing keyword arguments derived from the element tag names in self.schema, for example:

>>> premis_obj = PREMISObject(data=('object', {...}, (...)))
>>> premis_obj = PREMISObject(
    identifier_type='UUID',
    identifier_value='9bf6bcf8-4d77-4623-a9fb-b703365d0ffe',
    ...)

Under the first construction approach, the tuple passed as data becomes the source of truth for the PREMIS element. Under the second construction approach, the kwargs are used to construct a data tuple that becomes the source of truth. This tuple can be accessed via the .data property.

property attributes

Return a dict that maps normalized XML attribute names to their values. For example, ‘xsi_schema_location’ and ‘schema_location’ are both keys for the value of the xsi:schemaLocation PREMIS XML attribute.
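
The normalization described here can be sketched as follows. This is an assumption about the general idea, not the library's code: normalized_keys and normalize_attributes are hypothetical names, and the input is assumed to use prefixed attribute names such as 'xsi:schemaLocation'.

```python
import re


def normalized_keys(attribute_name):
    """Return the prefixed and unprefixed snake_case keys for one attribute."""
    prefix, _, local_name = attribute_name.rpartition(":")
    snake_local = re.sub(r"(?<!^)(?=[A-Z])", "_", local_name).lower()
    keys = [snake_local]
    if prefix:
        keys.insert(0, f"{prefix}_{snake_local}")
    return keys


def normalize_attributes(attributes):
    """Map every normalized key to the attribute's value."""
    normalized = {}
    for name, value in attributes.items():
        for key in normalized_keys(name):
            normalized[key] = value
    return normalized


schema_location = (
    "info:lc/xmlns/premis-v2 "
    "http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd"
)
attrs = normalize_attributes({
    "xsi:schemaLocation": schema_location,
    "version": "2.2",
})
```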

property attrs_to_paths

Return a dict that maps valid getter attributes to the simplified XPaths needed to get the corresponding values from self.data.

This property analyzes self.schema and sets self._attrs_to_paths to a dict that maps implicit getters like ‘agent_identifier_value’ and ‘identifier_value’ to the XPaths implicit in self.schema. In the case of PREMISAgent, the above two getters would map to the XPath ‘agent/agent_identifier/agent_identifier_value’. PREMISAgent.schema also implies the getters ‘agent_identifier’ and ‘identifier’, which both map to the XPath ‘agent/agent_identifier’ and which should return a tuple (or list thereof) instead of a string.
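
One way to picture this mapping is the sketch below, which walks a schema tuple and records both the full and the shortened getter name for each path. The schema shown is a simplified, hypothetical stand-in for PREMISAgent.schema, and build_attrs_to_paths is not a function of the library.

```python
# Hypothetical, simplified schema: (tag, *child_schemas).
AGENT_SCHEMA = (
    "agent",
    (
        "agent_identifier",
        ("agent_identifier_type",),
        ("agent_identifier_value",),
    ),
    ("agent_name",),
    ("agent_type",),
)


def build_attrs_to_paths(schema):
    """Map getter names to simplified XPaths implied by the schema."""
    root = schema[0]
    prefix = root + "_"
    mapping = {}

    def walk(node, parent_path):
        tag = node[0]
        path = f"{parent_path}/{tag}"
        mapping[tag] = path
        # Also register the short getter, e.g. 'identifier_value'.
        if tag.startswith(prefix):
            mapping[tag[len(prefix):]] = path
        for child in node[1:]:
            walk(child, path)

    for child in schema[1:]:
        walk(child, root)
    return mapping


paths = build_attrs_to_paths(AGENT_SCHEMA)
```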

property data
abstract defaults()[source]

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

find(path)[source]
find_text_or_all(path)[source]
findall(path)[source]
findtext(path)[source]
classmethod fromtree(tree)[source]

Create a PREMIS element instance from an lxml.etree._Element.

generate_data()[source]

Generate and return a tuple to assign to self._data, which is the source of truth of the PREMIS XML element. Expects self._xml_element_values and self._xml_attribute_values to be dicts populated with XML element text values and XML attribute values, respectively.
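
The general idea of deriving a data tuple from a schema plus a dict of element values can be sketched as below. The schema shape, the build_data name, and the rule that elements without a value are omitted are all assumptions for illustration; the library's generate_data may behave differently.

```python
# Hypothetical, simplified schema: (tag, *child_schemas).
OBJECT_SCHEMA = (
    "object",
    (
        "object_identifier",
        ("object_identifier_type",),
        ("object_identifier_value",),
    ),
    ("original_name",),
)


def build_data(schema, element_values):
    """Return a nested data tuple, skipping elements that have no value."""
    tag, child_schemas = schema[0], schema[1:]
    if not child_schemas:
        # Leaf element: emit (tag, text) only when a value was supplied.
        if tag in element_values:
            return (tag, element_values[tag])
        return None
    children = [build_data(child, element_values) for child in child_schemas]
    children = [child for child in children if child is not None]
    return (tag, *children) if children else None


object_data = build_data(OBJECT_SCHEMA, {
    "object_identifier_type": "UUID",
    "object_identifier_value": "9bf6bcf8-4d77-4623-a9fb-b703365d0ffe",
})
```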

abstract schema()[source]

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

serialize()[source]
tostring(pretty_print=True, encoding='UTF-8')[source]
class metsrw.plugins.premisrw.premis.PREMISObject(**kwargs)[source]

Bases: PREMISElement

property defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

property schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

class metsrw.plugins.premisrw.premis.PREMISEvent(**kwargs)[source]

Bases: PREMISElement

property compression_details

Return, as a 3-tuple, the program, version, and algorithm that this PREMIS compression event used to perform the compression.

property defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

property encryption_details

Return, as a 3-tuple, the program, version, and key that this PREMIS encryption event used to perform the encryption.

get_decompression_transform_files(offset=0)[source]

Returns a list of dicts representing <mets:transformFile> elements with TRANSFORMTYPE="decompression". The list is built from compression_algorithm, a comma-separated string of algorithms that must be applied in the order given to decompress the package, e.g., ‘bzip2,tar’ or ‘lzma’.
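
A minimal sketch of how an ordered, comma-separated algorithm string could be expanded into one dict per transform step follows. The dict keys ('type', 'order', 'algorithm'), the 1-based ordering, and the way offset shifts the order are assumptions, not the library's documented output.

```python
def decompression_transform_files(compression_algorithm, offset=0):
    """Hypothetical helper: one dict per algorithm, in decompression order."""
    algorithms = [
        name.strip() for name in compression_algorithm.split(",") if name.strip()
    ]
    return [
        {
            "type": "decompression",
            # Assumed: order is 1-based and shifted by any earlier transforms.
            "order": str(position + offset),
            "algorithm": algorithm,
        }
        for position, algorithm in enumerate(algorithms, start=1)
    ]


transform_files = decompression_transform_files("bzip2,tar", offset=1)
```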

get_decryption_transform_file()[source]

Returns a dict representing a <mets:transformFile> element with TRANSFORMTYPE="decryption".

property parsed_event_detail

Parse this event's PREMIS eventDetail string value, e.g.,

'program="7z"; version="9.20"; algorithm="bzip2"'

and return it as a dict, e.g.,

{'algorithm': 'bzip2', 'version': '9.20', 'program': '7z'}
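
The parsing step can be approximated with a short regular-expression sketch. It assumes the eventDetail value is a sequence of key="value" pairs separated by semicolons, as in the example above; parse_event_detail is a hypothetical name, not the library's function.

```python
import re


def parse_event_detail(event_detail):
    """Collect every key="value" pair in the string into a dict."""
    return dict(re.findall(r'(\w+)="([^"]*)"', event_detail))


detail = parse_event_detail('program="7z"; version="9.20"; algorithm="bzip2"')
```
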
property schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

class metsrw.plugins.premisrw.premis.PREMISAgent(**kwargs)[source]

Bases: PREMISElement

property defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

property schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.

class metsrw.plugins.premisrw.premis.PREMISRights(**kwargs)[source]

Bases: PREMISElement

property defaults

Return a dict that maps implicit getter attributes (implicit in self.schema) to default values or to callables that return default values. For example, see PREMISObject.defaults.

property schema

Return a tuple representing the schema of the PREMIS element. This tuple schema determines the available getters and setters (during initialization) of the subclass.