Digital Geochemical Data Infrastructure (DIGIS)
Data ingestion into DIGIS relies heavily on the manual interpretation and categorization of papers for the GEOROC database. During this process, methodological descriptions and chemical and geographical information are extracted and mapped onto the GEOROC metadata format. Expert knowledge is needed to interpret the content of publications, categorize the data types according to the data systematics of the GEOROC database, and identify critical metadata. The main objective of this project is to support this process with a prototype infrastructure for information extraction. To this end, different approaches from natural language processing (NLP) and language models are analysed and compared.
Role | Agency | Description |
---|---|---|
Principal Investigator | DFG | Novel approach for data extraction and text & data mining in DIGIS |