geometamaker package

Submodules

geometamaker.geometamaker module

Module contents

class geometamaker.MetadataControl(source_dataset_path=None)[source]

Bases: object

Encapsulates the Metadata Control File and methods for populating it.

A Metadata Control File (MCF) is a YAML file that complies with the MCF specification defined by pygeometa. https://github.com/geopython/pygeometa

datasource

path to dataset to which the metadata applies

Type:

string

mcf

dict representation of the Metadata Control File

Type:

dict

Create an MCF instance, populated with properties of the dataset.

The MCF will be valid according to the pygeometa schema. It has all required properties. Properties of the dataset are used to populate as many MCF properties as possible. Default/placeholder values are used for properties that require user input.

Instantiating without a source_dataset_path creates an MCF template.

Parameters:

source_dataset_path (string) – path or URL to dataset to which the metadata applies
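As a rough illustration of how the setters documented below might be combined, the following sketch applies a title, abstract, contact, and license in one call. The helper describe_dataset and its parameters are hypothetical, not part of geometamaker; only the set_* method names and keyword arguments come from this reference. A unittest.mock.Mock stands in for a MetadataControl instance so the sketch runs without the library installed.

```python
from unittest.mock import Mock


def describe_dataset(mc, title, abstract, contact=None,
                     license_name=None, license_url=None):
    """Apply basic descriptive metadata through the documented setters.

    `mc` is expected to behave like geometamaker.MetadataControl.
    This helper and its parameter names are hypothetical.
    """
    mc.set_title(title)
    mc.set_abstract(abstract)
    if contact:
        # `contact` holds keyword arguments for set_contact,
        # e.g. organization, individualname, positionname, email.
        mc.set_contact(**contact)
    if license_name or license_url:
        # set_license expects at least one of name and url.
        mc.set_license(name=license_name, url=license_url)
    return mc


# Stand-in object so the sketch runs without geometamaker installed.
# With the library available, one would presumably pass an instance
# created as geometamaker.MetadataControl('path/to/data.tif').
mc = Mock()
describe_dataset(
    mc,
    title='Example elevation raster',
    abstract='Placeholder abstract for illustration.',
    contact={'organization': 'Example Org', 'email': 'data@example.org'},
    license_name='CC-BY-4.0')
```

Keeping the helper duck-typed means it depends only on the method names listed in this reference, not on how MetadataControl stores the values internally.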

get_abstract()[source]

Get the abstract for the dataset.

get_band_description(band_number)[source]

Get the attribute metadata for a band.

Parameters:

band_number (int) – a raster band index, starting at 1

Returns:

dict

get_citation()[source]

Get the citation for the dataset.

get_contact(section='default')[source]

Get metadata from a contact section.

Parameters:

section (str) – the header of the contact section to retrieve, since there can be more than one.

Returns:

A dict or None if section does not exist.
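Because get_contact may return None, callers probably want to guard against a missing section before reading values. The sketch below shows one way to do that. The helper contact_email is hypothetical, and it assumes the returned dict uses an 'email' key matching the set_contact argument of the same name, which this reference does not state explicitly. A Mock stands in for a MetadataControl instance.

```python
from unittest.mock import Mock


def contact_email(mc, section='default'):
    """Return the email stored in a contact section, or None.

    Assumes the dict returned by get_contact has an 'email' key
    (an assumption based on the set_contact argument name).
    """
    contact = mc.get_contact(section)
    if contact is None:
        # The section does not exist.
        return None
    return contact.get('email')


# Stand-in for a MetadataControl instance with one contact section.
stored = {'default': {'organization': 'Example Org',
                      'email': 'data@example.org'}}
mc = Mock()
mc.get_contact.side_effect = stored.get  # returns None if section missing

default_email = contact_email(mc)
missing_email = contact_email(mc, section='distributor')
```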

get_doi()[source]

Get the DOI for the dataset.

get_edition()[source]

Get the edition of the dataset.

Returns:

str or None if edition does not exist.

get_field_description(name)[source]

Get the attribute metadata for a field.

Parameters:

name (str) – name and unique identifier of the field

Returns:

dict

get_keywords(section='default')[source]

get_license()[source]

Get license for the dataset.

Returns:

dict or None if license does not exist.

get_lineage()[source]

Get the lineage statement of the dataset.

Returns:

str or None if lineage does not exist.

get_purpose()[source]

Get purpose for the dataset.

Returns:

str or None if purpose does not exist.

get_title()[source]

Get the title for the dataset.

get_url()[source]

Get the URL for the dataset.

set_abstract(abstract)[source]

Add an abstract for the dataset.

Parameters:

abstract (str) –

set_band_description(band_number, name=None, title=None, abstract=None, units=None, type=None)[source]

Define metadata for a raster band.

Parameters:
  • band_number (int) – a raster band index, starting at 1

  • name (str) – name for the raster band

  • title (str) – title for the raster band

  • abstract (str) – description of the raster band

  • units (str) – unit of measurement for the band’s pixel values

  • type (str) – data type of the band’s values, either ‘integer’ or ‘number’
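Since band indices start at 1 rather than 0, a loop over per-band metadata needs an offset. The sketch below shows one way to handle that with enumerate. The helper describe_bands and the sample band values are hypothetical; only set_band_description and its keyword arguments come from this reference. A Mock stands in for a MetadataControl instance.

```python
from unittest.mock import Mock


def describe_bands(mc, band_info):
    """Call set_band_description once per band, numbering from 1.

    `band_info` is a sequence of dicts whose keys are keyword
    arguments of set_band_description (name, title, abstract,
    units, type).
    """
    for band_number, info in enumerate(band_info, start=1):
        mc.set_band_description(band_number, **info)


# Illustrative values only.
bands = [
    {'name': 'red', 'title': 'Red band', 'type': 'number'},
    {'name': 'nir', 'title': 'Near-infrared band', 'type': 'number'},
]

mc = Mock()  # stand-in for geometamaker.MetadataControl(...)
describe_bands(mc, bands)
```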

set_citation(citation)[source]

Add a citation string for the dataset.

Parameters:

citation (str) –

set_contact(organization=None, individualname=None, positionname=None, email=None, section='default', **kwargs)[source]

Add a contact section.

Parameters:
  • organization (str) – name of the responsible organization

  • individualname (str) – name of the responsible person

  • positionname (str) – role or position of the responsible person

  • email (str) – email address of the responsible organization or individual

  • section (str) – a header for the contact section under which to apply the other args, since there can be more than one.

  • kwargs (dict) – key-value pairs for any other properties listed in the contact section of the core MCF schema.

set_doi(doi)[source]

Add a DOI string for the dataset.

Parameters:

doi (str) –

set_edition(edition)[source]

Set the edition for the dataset.

Parameters:

edition (str) – version of the cited resource

set_field_description(name, title=None, abstract=None, units=None, type=None)[source]

Define metadata for a tabular field.

Parameters:
  • name (str) – name and unique identifier of the field

  • title (str) – title for the field

  • abstract (str) – description of the field

  • units (str) – unit of measurement for the field’s values

set_keywords(keywords, section='default', keywords_type='theme', vocabulary=None)[source]

Describe a dataset with a list of keywords.

Keywords are grouped into sections for the purpose of complying with pre-existing keyword schemas. A section will be overwritten if it already exists.

Parameters:
  • keywords (list) – sequence of strings

  • section (string) – the name of a keywords section

  • keywords_type (string) – subject matter used to group similar keywords. Must be one of ‘discipline’, ‘place’, ‘stratum’, ‘temporal’, or ‘theme’

  • vocabulary (dict) – a dictionary with ‘name’ and ‘url’ (optional) keys. Used to describe the source (thesaurus) of keywords

Raises:

ValidationError
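The rules above (a fixed set of keyword types, an optional vocabulary dict, and overwrite-on-reuse of a section name) can be sketched with the standard library alone. This is not geometamaker's implementation: build_keywords_section, VALID_KEYWORD_TYPES, and the dict layout are assumptions made for illustration, and ValueError stands in for the documented ValidationError.

```python
VALID_KEYWORD_TYPES = ('discipline', 'place', 'stratum', 'temporal', 'theme')


def build_keywords_section(keywords, keywords_type='theme', vocabulary=None):
    """Return a dict describing one keywords section (hypothetical layout)."""
    if keywords_type not in VALID_KEYWORD_TYPES:
        # The real method is documented to raise ValidationError;
        # ValueError is used here as a stdlib stand-in.
        raise ValueError(
            f'keywords_type must be one of {VALID_KEYWORD_TYPES}, '
            f'got {keywords_type!r}')
    section = {'keywords': list(keywords), 'keywords_type': keywords_type}
    if vocabulary is not None:
        # 'name' is expected; 'url' is optional.
        if 'name' not in vocabulary:
            raise ValueError("vocabulary requires a 'name' key")
        section['vocabulary'] = dict(vocabulary)
    return section


sections = {}
sections['default'] = build_keywords_section(['hydrology', 'runoff'])
# Reusing the same section name replaces the earlier keywords.
sections['default'] = build_keywords_section(
    ['elevation'], vocabulary={'name': 'Example Thesaurus'})
```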

set_license(name=None, url=None)[source]

Add a license for the dataset.

At least one of name and url is required if there is a license. Call with no arguments to remove access constraints and license info.

Parameters:
  • name (str) – name of the license of the source dataset

  • url (str) – url for the license

set_lineage(statement)[source]

Set the lineage statement for the dataset.

Parameters:

statement (str) – general explanation describing the lineage or provenance of the dataset

set_purpose(purpose)[source]

Add a purpose for the dataset.

Parameters:

purpose (str) – description of the purpose of the source dataset

set_title(title)[source]

Add a title for the dataset.

Parameters:

title (str) –

set_url(url)[source]

Add a URL for the dataset.

Parameters:

url (str) –

to_string()[source]

validate()[source]

Validate MCF against a jsonschema object.

write(workspace=None)[source]

Write MCF and ISO-19139 XML to disk.

This creates sidecar files with ‘.yml’ and ‘.xml’ extensions appended to the full filename of the data source. For example,

  • ‘myraster.tif’

  • ‘myraster.tif.yml’

  • ‘myraster.tif.xml’

Parameters:

workspace (str) – if None, files are written to the same location as the source data. Otherwise, a path to a local directory in which to write the files; they are still named to match the source filename. Use this option if the source data is not on the local filesystem.
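The naming rule described above can be sketched with os.path. This is only an illustration of the documented behaviour, not the library's code: the function sidecar_paths is hypothetical, and the handling of URL sources (taking the last path component as the filename) is an assumption.

```python
import os


def sidecar_paths(source_path, workspace=None):
    """Return (yml_path, xml_path) for a data source.

    The extensions are appended to the full source filename. If
    `workspace` is given, the files are placed in that directory
    but keep the source filename.
    """
    filename = os.path.basename(source_path)
    if workspace is None:
        directory = os.path.dirname(source_path)
    else:
        directory = workspace
    base = os.path.join(directory, filename)
    return base + '.yml', base + '.xml'


# Sidecars written next to the source data.
local = sidecar_paths('myraster.tif')

# Sidecars written to a local workspace for a remote source
# (hypothetical URL).
remote = sidecar_paths('https://example.org/data/myraster.tif',
                       workspace='metadata')
```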