Archive
- class pyroSAR.archive.Archive(dbfile, custom_fields=None, postgres=False, user='postgres', password='1234', host='localhost', port=5432, cleanup=True, legacy=False)[source]
Bases: SceneArchive
Utility for storing SAR image metadata in a database
- Parameters:
dbfile (str) – the filename for the SpatiaLite database. This might either point to an existing database or will be created otherwise. If postgres is set to True, this will be the name for the PostgreSQL database.
custom_fields (dict[str, Any] | None) – a dictionary containing additional non-standard database column names and data types; the names must be attributes of the SAR scenes to be inserted (i.e. id.attr) or keys in their meta attribute (i.e. id.meta['attr'])
postgres (bool) – enable postgres driver for the database. Default: False
user (str) – required for postgres driver: username to access the database. Default: 'postgres'
password (str) – required for postgres driver: password to access the database. Default: '1234'
host (str) – required for postgres driver: host where the database is hosted. Default: 'localhost'
port (int) – required for postgres driver: port number to the database. Default: 5432
cleanup (bool) – check whether all registered scenes exist and remove missing entries?
legacy (bool) – open an outdated database in legacy mode to import into a new database. Opening an outdated database without legacy mode will throw a RuntimeError.
Examples
Ingest all Sentinel-1 scenes in a directory and its subdirectories into the database:
>>> from pyroSAR import Archive, identify
>>> from spatialist.ancillary import finder
>>> dbfile = '/.../scenelist.db'
>>> archive_s1 = '/.../sentinel1/GRD'
>>> scenes_s1 = finder(archive_s1, [r'^S1.*.zip'], regex=True, recursive=True)
>>> with Archive(dbfile) as archive:
...     archive.insert(scenes_s1)
Select all Sentinel-1 A/B scenes stored in the database, which
  - overlap with a test site
  - were acquired in Ground-Range-Detected (GRD) Interferometric Wide Swath (IW) mode before 2018
  - contain a VV polarization image
  - have not been processed to directory outdir before
>>> from pyroSAR import Archive
>>> from spatialist import Vector
>>> archive = Archive('/.../scenelist.db')
>>> site = Vector('/path/to/site.shp')
>>> outdir = '/path/to/processed/results'
>>> maxdate = '20171231T235959'
>>> selection_proc = archive.select(vectorobject=site, processdir=outdir,
...                                 maxdate=maxdate, sensor=['S1A', 'S1B'],
...                                 product='GRD', acquisition_mode='IW', vv=1)
>>> archive.close()
Alternatively, the with statement can be used, here just to check whether one particular scene is already registered in the database:
>>> from pyroSAR import identify, Archive
>>> scene = identify('S1A_IW_SLC__1SDV_20150330T170734_20150330T170801_005264_006A6C_DA69.zip')
>>> with Archive('/.../scenelist.db') as archive:
...     print(archive.is_registered(scene.scene))
When setting postgres=True, a PostgreSQL database will be created at the given host. Additional arguments are required:
>>> from pyroSAR import Archive, identify
>>> from spatialist.ancillary import finder
>>> dbfile = 'scenelist_db'
>>> archive_s1 = '/.../sentinel1/GRD'
>>> scenes_s1 = finder(archive_s1, [r'^S1.*.zip'], regex=True, recursive=True)
>>> with Archive(dbfile, postgres=True, user='user', password='password', host='host', port=5432) as archive:
...     archive.insert(scenes_s1)
Importing an old database:
>>> from pyroSAR import Archive
>>> db_new = 'scenes.db'
>>> db_old = 'scenes_old.db'
>>> with Archive(db_new) as db:
...     with Archive(db_old, legacy=True) as db_old:
...         db.import_outdated(db_old)
- add_tables(tables)[source]
Add tables to the database per sqlalchemy.schema.Table. Tables provided here will be added to the database.
Note
Columns using Geometry must have setting management=True for SQLite, for example:
geometry = Column(Geometry('POLYGON', management=True, srid=4326))
- cleanup()[source]
Remove all scenes from the database that are no longer stored in their registered location
- Return type:
- drop_element(scene, with_duplicates=False)[source]
Drop a scene from the data table. If the duplicates table contains a matching entry, it will be moved to the data table.
- export2shp(path, table='data')[source]
export the database to a shapefile
- Parameters:
path (str) – the path of the shapefile to be written. This will overwrite other files with the same name. If a folder is given in path it is created if not existing. If the file extension is missing '.shp' is added.
table (str) – the table to write to the shapefile; either 'data' (default) or 'duplicates'
- Return type:
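The handling of path described above (appending '.shp' when no extension is given and creating a missing folder) can be pictured with a small standalone sketch. This is not pyroSAR's implementation; the helper name normalize_shp_path is hypothetical and only the two documented rules are assumed.

```python
import os
import tempfile


def normalize_shp_path(path):
    """Hypothetical helper mirroring the documented handling of `path`."""
    root, ext = os.path.splitext(path)
    if ext == '':
        # no file extension given: append '.shp'
        path = root + '.shp'
    folder = os.path.dirname(path)
    if folder:
        # create the target folder if it does not exist yet
        os.makedirs(folder, exist_ok=True)
    return path


# try it in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    target = normalize_shp_path(os.path.join(tmp, 'export', 'scenes'))
    folder_exists = os.path.isdir(os.path.dirname(target))
```

Here target ends in 'scenes.shp' and the intermediate 'export' folder exists while the temporary directory is alive.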
- filter_scenelist(scenelist)[source]
Filter a list of scenes by file names already registered in the database.
- move(scenelist, directory, pbar=False)[source]
Move a list of files while keeping the database entries up to date. If a scene is registered in the database (in either the data or duplicates table), the scene entry is directly changed to the new location.
- select(sensor=None, product=None, acquisition_mode=None, mindate=None, maxdate=None, vectorobject=None, date_strict=True, processdir=None, recursive=False, polarizations=None, return_value='scene', **kwargs)[source]
select scenes from the database
- Parameters:
acquisition_mode (str | list[str] | None) – the sensor's acquisition mode(s)
mindate (str | datetime | None) – the minimum acquisition date; strings must be in format YYYYmmddTHHMMSS; default: None
maxdate (str | datetime | None) – the maximum acquisition date; strings must be in format YYYYmmddTHHMMSS; default: None
vectorobject (Vector | None) – a geometry with which the scenes need to overlap. The object may only contain one feature.
date_strict (bool) – treat dates as strict limits or also allow flexible limits to incorporate scenes whose acquisition period overlaps with the defined limit?
  - strict: start >= mindate & stop <= maxdate
  - not strict: stop >= mindate & start <= maxdate
processdir (str | None) – A directory to be scanned for already processed scenes; the selected scenes will be filtered to those that have not yet been processed. Default: None
recursive (bool) – (only if processdir is not None) should also the subdirectories of the processdir be scanned?
polarizations (list[str] | None) – a list of polarization strings, e.g. ['HH', 'VV']
return_value (str | list[str]) – the query return value(s). Options:
  - geometry_wkb: the scene's footprint geometry formatted as WKB
  - geometry_wkt: the scene's footprint geometry formatted as WKT
  - mindate: the acquisition start datetime in UTC formatted as YYYYmmddTHHMMSS
  - maxdate: the acquisition end datetime in UTC formatted as YYYYmmddTHHMMSS
  - all further database column names (see get_colnames())
**kwargs (Any) – any further arguments (columns), which are registered in the database. See get_colnames()
- Return type:
- Returns:
If a single return_value is specified: list of values for that attribute. If multiple return_values are specified: list of tuples containing the requested attributes. The return value type is bytes for geometry_wkb and str for all others.
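The two date_strict modes can be illustrated without a database. The following is a minimal sketch, not pyroSAR's implementation: the helper names parse_date and in_range are hypothetical, and the code only assumes the YYYYmmddTHHMMSS string format and the two comparison rules listed for date_strict.

```python
from datetime import datetime

DATE_FORMAT = '%Y%m%dT%H%M%S'  # YYYYmmddTHHMMSS, as required for string dates


def parse_date(value):
    """Hypothetical helper: accept a datetime or a YYYYmmddTHHMMSS string."""
    if isinstance(value, datetime):
        return value
    return datetime.strptime(value, DATE_FORMAT)


def in_range(start, stop, mindate=None, maxdate=None, date_strict=True):
    """Hypothetical helper mirroring the documented date_strict rules."""
    start, stop = parse_date(start), parse_date(stop)
    if mindate is not None:
        lower = parse_date(mindate)
        # strict: start >= mindate; not strict: stop >= mindate
        if (start if date_strict else stop) < lower:
            return False
    if maxdate is not None:
        upper = parse_date(maxdate)
        # strict: stop <= maxdate; not strict: start <= maxdate
        if (stop if date_strict else start) > upper:
            return False
    return True


# a scene whose acquisition window straddles the end of 2017
start, stop = '20171231T235950', '20180101T000015'
strict = in_range(start, stop, maxdate='20171231T235959', date_strict=True)
flexible = in_range(start, stop, maxdate='20171231T235959', date_strict=False)
```

In this sketch the scene is rejected in strict mode because its stop time lies after maxdate, and accepted in non-strict mode because its start time lies before maxdate.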
- select_duplicates(outname_base=None, scene=None, value='id')[source]
Select scenes from the duplicates table. In case both outname_base and scene are set to None all scenes in the table are returned, otherwise only those that match the attributes outname_base and scene if they are not None.
- class pyroSAR.archive.SceneArchive(*args, **kwargs)[source]
Bases: Protocol
Common interface for scene archive backends.
Implementations may represent local databases, STAC catalogs, remote APIs, or other scene repositories, but should expose a consistent select method and support context-manager usage.
- __exit__(exc_type, exc_val, exc_tb)[source]
Exit the archive context and release resources if necessary.
- Return type:
- close()[source]
Release open resources.
Implementations that do not hold resources may implement this as a no-op.
- Return type:
- static select(sensor=None, product=None, acquisition_mode=None, mindate=None, maxdate=None, vectorobject=None, date_strict=True, return_value='scene')[source]
Select scenes matching the query parameters.
- Parameters:
sensor (str | list[str] | None) – One sensor or a list of sensors.
product (str | list[str] | None) – One product type or a list of product types.
acquisition_mode (str | list[str] | None) – One acquisition mode or a list of acquisition modes.
mindate (str | datetime | None) – Minimum acquisition date/time.
maxdate (str | datetime | None) – Maximum acquisition date/time.
date_strict (bool) – Whether date filtering should be strict.
return_value (str | list[str]) – One return field or a list of return fields.
**kwargs – Backend-specific optional query arguments.
- Return type:
- Returns:
The query result. Implementations may return a list of scalar values or tuples depending on return_value.
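As an illustration of this interface, the sketch below implements an in-memory backend with a matching select signature, a close method and context-manager support. It does not import pyroSAR. The class InMemoryArchive, the record dictionaries and their keys are hypothetical, and only filtering by sensor, product and acquisition_mode is shown.

```python
class InMemoryArchive:
    """Hypothetical backend following the SceneArchive interface."""

    def __init__(self, records):
        # each record is a dict, e.g. {'scene': ..., 'sensor': ..., ...}
        self.records = list(records)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

    def close(self):
        # nothing to release for an in-memory list, so this is a no-op
        pass

    @staticmethod
    def _matches(value, allowed):
        # None means "no restriction"; a str is treated as a one-element list
        if allowed is None:
            return True
        if isinstance(allowed, str):
            allowed = [allowed]
        return value in allowed

    def select(self, sensor=None, product=None, acquisition_mode=None,
               mindate=None, maxdate=None, vectorobject=None,
               date_strict=True, return_value='scene'):
        # date and geometry filters are omitted to keep the sketch short
        hits = [r for r in self.records
                if self._matches(r['sensor'], sensor)
                and self._matches(r['product'], product)
                and self._matches(r['acquisition_mode'], acquisition_mode)]
        if isinstance(return_value, str):
            # single field: list of scalar values
            return [r[return_value] for r in hits]
        # several fields: list of tuples
        return [tuple(r[key] for key in return_value) for r in hits]


records = [
    {'scene': 'a.zip', 'sensor': 'S1A', 'product': 'GRD', 'acquisition_mode': 'IW'},
    {'scene': 'b.zip', 'sensor': 'S1B', 'product': 'SLC', 'acquisition_mode': 'IW'},
]
with InMemoryArchive(records) as archive:
    scenes = archive.select(sensor=['S1A', 'S1B'], product='GRD')
    pairs = archive.select(return_value=['scene', 'sensor'])
```

With a single return_value the sketch returns a list of scalar values, with several it returns a list of tuples, which matches the return behaviour described above.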
- pyroSAR.archive.drop_archive(archive)[source]
drop (delete) a scene database
- Parameters:
archive (
Archive) – the database to be deleted
See also
sqlalchemy_utils.functions.drop_database()
- Return type:
Examples
>>> import os
>>> from pyroSAR.archive import Archive, drop_archive
>>> pguser = os.environ.get('PGUSER')
>>> pgpassword = os.environ.get('PGPASSWORD')
>>> db = Archive('test', postgres=True, port=5432, user=pguser, password=pgpassword)
>>> drop_archive(db)