Methodology
How the current public release is structured.
The database is designed around retained literature, structured site entities, normalized contaminant names,
and record-level monitoring values that stay traceable back to their source references.
Scope
What is kept in the public database
- Site-resolved monitoring evidence connected to retained references.
- Normalized contaminant entities used across records and studies.
- Record-level concentration values with matrix and sampling context.
- Public-facing summaries intended for exploration rather than raw ingestion logs.
Data model
Core entities used by the platform
- References represent retained studies and evidence context.
- Sites represent monitoring locations or site-level location groupings.
- Pollutants represent normalized contaminants and abbreviations.
- Monitoring records connect site, contaminant, reference, matrix, and value fields.
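The relationships between these entities can be sketched in a few lines of Python. This is an illustrative sketch only, not the platform's actual schema: every class name, field name, the `source_of` helper, and the sample values are assumptions introduced here to show how a monitoring record might stay traceable to its reference.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# All names below are hypothetical; the real schema is not published here.
@dataclass(frozen=True)
class Reference:
    reference_id: str
    title: str


@dataclass(frozen=True)
class Site:
    site_id: str
    name: str


@dataclass(frozen=True)
class Pollutant:
    pollutant_id: str
    name: str                            # normalized contaminant name
    abbreviation: Optional[str] = None


@dataclass(frozen=True)
class MonitoringRecord:
    site_id: str
    pollutant_id: str
    reference_id: str
    matrix: str                          # sampling matrix, e.g. "surface water"
    value: float
    unit: str


def source_of(record: MonitoringRecord,
              references: Dict[str, Reference]) -> Reference:
    """Resolve a record back to the reference it was retained from."""
    return references[record.reference_id]


# Placeholder values for illustration; not real database content.
references = {"ref-1": Reference("ref-1", "Placeholder study title")}
record = MonitoringRecord("site-1", "pol-1", "ref-1", "surface water", 12.5, "ng/L")
print(source_of(record, references).title)
```

Keeping only identifiers on the record, rather than copies of the site or reference, is one way to keep every value linked to a single source entry.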
Daily update logic
Rules used for the scheduled ingestion pipeline
- The search query targets emerging contaminants.
- The public daily scope retains water-environment evidence only.
- Only measured concentration data are eligible for retention.
- Up to five papers per day are targeted, with quality prioritized over volume.
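The retention rules above can be sketched as a small filter. This is a minimal sketch under stated assumptions, not the pipeline's actual code: the `Candidate` fields, the `quality_score` ranking, and the `select_daily` function are hypothetical, and the search step itself is left out. Ranking by a single score is only one possible reading of "quality prioritized over volume".

```python
from dataclasses import dataclass
from typing import List

MAX_PAPERS_PER_DAY = 5  # "up to five papers per day"


# Hypothetical candidate shape; the real screening fields are not documented.
@dataclass
class Candidate:
    title: str
    environment: str                   # assumed label, e.g. "water" or "soil"
    has_measured_concentrations: bool  # True only for measured data
    quality_score: float               # assumed ranking signal


def select_daily(candidates: List[Candidate],
                 limit: int = MAX_PAPERS_PER_DAY) -> List[Candidate]:
    """Keep water-environment papers with measured concentrations,
    then return at most `limit` of them, highest quality first."""
    eligible = [
        c for c in candidates
        if c.environment == "water" and c.has_measured_concentrations
    ]
    eligible.sort(key=lambda c: c.quality_score, reverse=True)
    return eligible[:limit]
```

Because the limit is an upper bound, a day with fewer eligible candidates simply yields fewer retained papers.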
Interpretation
How to read the current release
- The homepage is a presentation layer, not a substitute for the data portal.
- The database should be used for browsing structured evidence and linked records.
- Latest additions summarize recent retained changes, not every candidate screened.
- Reference-linked visibility is preferred over opaque aggregate statistics.
Next step
Move from methodology into the data portal.
Use the database route when you want the actual explorer rather than the product overview.