Methodology

How the current public release is structured.

The database is designed around retained literature, structured site entities, normalized contaminant names, and record-level monitoring values that stay traceable back to their source references.

Scope

What is kept in the public database

Site-resolved monitoring evidence connected to retained references.
Normalized contaminant entities used across records and studies.
Record-level concentration values with matrix and sampling context.
Public-facing summaries intended for exploration rather than raw ingestion logs.

Data model

Core entities used by the platform

References represent retained studies and evidence context.
Sites represent monitoring locations or groupings of locations treated as a single site.
Pollutants represent normalized contaminants and abbreviations.
Monitoring records connect site, contaminant, reference, matrix, and value fields.
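
The four entities above can be sketched as plain Python dataclasses. This is only an illustration of how a monitoring record might link back to its site, contaminant, and source reference. Every class name, field name, and the helper `resolve_reference` is a hypothetical choice made for this sketch, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Reference:
    """A retained study (field names are assumptions)."""
    reference_id: str
    title: str


@dataclass(frozen=True)
class Site:
    """A monitoring location or a grouping of locations."""
    site_id: str
    name: str


@dataclass(frozen=True)
class Pollutant:
    """A normalized contaminant with an optional abbreviation."""
    pollutant_id: str
    normalized_name: str
    abbreviation: Optional[str] = None


@dataclass(frozen=True)
class MonitoringRecord:
    """Connects site, contaminant, reference, matrix, and value."""
    site_id: str
    pollutant_id: str
    reference_id: str
    matrix: str   # sampled medium, e.g. a water type
    value: float  # measured concentration
    unit: str


def resolve_reference(record: MonitoringRecord,
                      references: Dict[str, Reference]) -> Reference:
    """Trace a record back to the reference it was extracted from."""
    return references[record.reference_id]
```

Keeping the reference identifier on every record is one simple way to get the record-level traceability described above: any value shown in the explorer can be followed back to a single retained study.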

Daily update logic

Rules used for the scheduled ingestion pipeline

The search term is centered on emerging contaminants.
The public daily scope keeps water-environment evidence only.
Only measured concentration data are eligible for retention.
Up to five papers per day are targeted, with quality prioritized over volume.
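
The rules above can be read as a filter followed by a ranked cut-off. The sketch below assumes each screened paper has already been tagged with two boolean flags and a numeric quality score. The `Candidate` class, its fields, the scoring scale, and `select_daily_papers` are all hypothetical names introduced for illustration; the real pipeline may make these decisions differently.

```python
from dataclasses import dataclass
from typing import List

# From the rule "up to five papers per day are targeted".
DAILY_PAPER_TARGET = 5


@dataclass
class Candidate:
    """A screened paper (all fields are assumptions for this sketch)."""
    title: str
    is_water_environment: bool         # public daily scope: water only
    has_measured_concentrations: bool  # measured data, not modeled values
    quality_score: float               # hypothetical score, higher is better


def select_daily_papers(candidates: List[Candidate],
                        limit: int = DAILY_PAPER_TARGET) -> List[Candidate]:
    """Keep eligible candidates, then take the best ones up to the limit."""
    eligible = [
        c for c in candidates
        if c.is_water_environment and c.has_measured_concentrations
    ]
    # Quality is prioritized over volume: rank first, then cut.
    ranked = sorted(eligible, key=lambda c: c.quality_score, reverse=True)
    return ranked[:limit]
```

Because the limit is an upper bound, a day with fewer than five eligible papers simply yields a shorter list rather than admitting lower-quality or out-of-scope candidates.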

Interpretation

How to read the current release

The homepage is presentation-focused and not a substitute for the portal.
The database should be used for browsing structured evidence and linked records.
Latest additions summarize recently retained records, not every candidate that was screened.
Reference-linked visibility is preferred over opaque aggregate statistics.

Next step

Move from methodology into the data portal.

Use the database route when you want the actual explorer rather than the product overview.
