Reading METS files ~~~~~~~~~~~~~~~~~~ metsrw supports reading METS files from disk, from strings, or from lxml_ `_Element` or `_ElementTree` objects. .. testcode:: # From a file on disk mets = metsrw.METSDocument.fromfile("../fixtures/complete_mets.xml") # From bytes mets_str = b""" """ mets = metsrw.METSDocument.fromstring(mets_str) # From an lxml object tree = lxml.etree.parse("../fixtures/complete_mets.xml") mets = metsrw.METSDocument.fromtree(tree) Accessing METS Data ------------------- To retrieve an :class:`metsrw.FSEntry`, use the :meth:`~metsrw.METSDocument.get_file` or :meth:`~metsrw.METSDocument.all_files` methods. .. doctest:: >>> mets = metsrw.METSDocument() >>> file_uuid = str(uuid.uuid4()) >>> file_1 = metsrw.FSEntry( ... label="hello.pdf", path="test/hello.pdf", type="Item", ... file_uuid=file_uuid) >>> mets.append_file(file_1) >>> mets.get_file(file_uuid=file_uuid) FSEntry(type='Item', path='test/hello.pdf', use='original', ...) >>> mets.all_files() {FSEntry(type='Item', path='test/hello.pdf', use='original', ...)} # Currently, filtering files can only be done via iteration >>> [entry for entry in mets.all_files() if entry.use == "original"] [FSEntry(type='Item', path='test/hello.pdf', use='original', ...)] `amdSec` and `dmdSec` data is accessible via the :attr:`~metsrw.FSEntry.amdsecs` and :attr:`~metsrw.FSEntry.dmdsecs` properties. .. doctest:: >>> mets = metsrw.METSDocument.fromfile('../fixtures/complete_mets.xml') >>> fsentry = mets.get_file(file_uuid="ab5c67fc-8f80-4e46-9f20-8d5ae29c43f2") >>> amdsec1 = fsentry.amdsecs[0] >>> [section for section in amdsec1.subsections if section.subsection == 'techMD'] [] >>> fsentry.dmdsecs[0] .. note:: In most cases, you'll want to access PREMIS data via the `get_premis` series of methods, rather than accessing the `amdSec` or `dmdSec` data directly. See `Accessing PREMIS Data`_ for more info. Accessing PREMIS Data --------------------- To access PREMIS_ metadata associated with a file, use the following methods: * :meth:`~metsrw.FSEntry.get_premis_objects` * :meth:`~metsrw.FSEntry.get_premis_events` * :meth:`~metsrw.FSEntry.get_premis_agents` * :meth:`~metsrw.FSEntry.get_premis_rights` .. doctest:: # Currently, filtering PREMIS objects can only be done via iteration >>> ingestion_events = [] >>> mets = metsrw.METSDocument.fromfile('../fixtures/complete_mets.xml') >>> for fsentry in mets.all_files(): ... for event in fsentry.get_premis_events(): ... if event.type == "ingestion": ... ingestion_events.append(event) >>> ingestion_events[0] ('event', ...) .. _lxml: https://lxml.de/index.html .. _PREMIS: https://www.loc.gov/standards/premis/v3/index.html