Open Source Reference Data Integration with Go.Data (i.e. Country administrative units, health facilities)
Importing administrative locations (e.g., Admin-Level-2
province data) and different Location
reference data are typically the first steps in setting up a new Outbreak
in Go.Data. Importing locations from shared reference sources are important to strengthening interoperability and ensuring information can be easily synced and mapped in other systems.
- Administrative units are captured as
Location
resources in the Go.Data pplication. All other location data (e.g., Health Facilities) should be captured asReference Data
in Go.Data. - Location data can be extracted from external shared references (see HDX and other external data source examples) and uploaded via the standard Go.Data file import feature OR via direct integration to the Go.Data API endpoint
/locations
. - In Go.Data
Location
andReference Data
should be imported before data collection can begin.
Resources for importing Location data
- See p. 33 of the Implementation Guide for more information and step-by-step guidance for
Location
data imports. - See external data source examples for links to location data & reference sources.
- See the
dhis2godata
Org Unit Converter for example script that will export admin units from a DHIS2 instance for import to Go.Data. - See the API adaptor language-godata example or check out the API explorer.
Other types of location reference data (i.e., health facilities)
In this reference implementation (below docs), we demonstrate how Go.Data Reference Data
may be synced via APIs between Go.Data and an external web-based data source. While this implementation integrates HealthSites
data, a similar integration approach may be applied to integrate other data sources with an available web API.**
Use Case: #5. As a Go.Data analyst, I would like to import health facility lists from standard registries and external data sources so that I can more easily exchange information with the MOH and other partners.
Solution Overview
In this reference implementation, we integrated a health facility list extracted from the http://healthsites.io/ API with the Go.Data reference-data
category Health Centres
. See below for the data flow diagram.
- In Go.Data
reference-data
must be imported before data collection can begin so that any these can be selected in any data collection or questionnaire form variables. - HealthSites.io is an open source repository of health facility data built in partnership with Open Street Map. It contains several facility lists and geolocation details and provides a REST API that supports data extraction in JSON format.
- Here we leverage the free-tier OpenFn integration platform to automate the data integration flow & quickly map Kobo data elements to Go.Data. See Explore OpenFn to learn more and explore the live project.
Implementation Steps
–>Watch to learn about the solution setup
- We first identified the external data source to be integrated, data availability, and available APIs - see HealthSites country data.
- Once we’ve identified the specific facility we’d like to integrate, we then determine how to export the data from the source. See here for an example OpenFn job script where we send a
GET
HTTP request to/api/v2/facilities
to list a health facilities forBangladesh
in HealthSites.io.GET '/api/v2/facilities/?api-key=NNNNN&page=1&country=Bangladesh'
We refer to the data source’s API docs to determine how to make this HTTP request and apply relevant filters like
country
. - Analyze the response to #2 to determine the appropriate unique identifiers to matching with Go.Data
reference-data
and to use as an external identifier for future duplicate prevention. For this implementation, we chose to usename
to map to existingHealth Centres
, but you might also consider using a unique location id or geodata codes (see the Unique Identifiers section for more on this design topic).
See here for the full JSON response to the GET
request made to the API in step 2.
{
"0": {
"attributes": {
"amenity": "hospital",
"changeset_id": 22058971,
"changeset_timestamp": "2014-05-01T07:56:20",
"changeset_user": "Md Alamgir",
"changeset_version": 1,
"name": "Manikchari Upazila Health Complex",
"uuid": "4598ac8e8c6d47a4a3b95d0806ca4a5d"
},
"centroid": {
"coordinates": [
91.84117813480628,
22.8501898435141
],
"type": "Point"
},
"completeness": 9,
"osm_id": 2828406228,
"osm_type": "node"
},
-
We then mapped relevant data elements from the HealthSites response to Go.Data
reference-data
. See example mapping specification. -
We then drafted another OpenFn integration script (or “job” - see here) to automate the data integration mapping of data points between HealthSites and Go.Data (see below snippet).
const data = { //mapping attributes id: `LNG_REFERENCE_DATA_CATEGORY_CENTRE_NAME_${name}`, //godataVariable: sourceValue, categoryId: 'LNG_REFERENCE_DATA_CATEGORY_CENTRE_NAME', //godata reference-data Id value: attributes.name, //map from HealthSites.io ... code: attributes.uuid, active: true, readOnly: false, outbreakId: '3b5554d7-2c19-41d0-b9af-475ad25a382b', description: 'hospital', name: attributes.name, };
In this second job, we perform an “upsert” pattern via the Go.Data API where we (1) check for existing facilities by searching Go.Data
reference-data
categoryHealth Centre
records using HealthSitename
to create a matchingid
(e.g.,LNG_REFERENCE_DATA_CATEGORY_CENTRE_NAME_HOSPITAL_1
) as an external identifier, and then (2) create/update thereference-data
records (sendPOST
/PUT
request) depending on whether a match was found.upsertReferenceData('id', { //where id is reference-data-catefory unique identifier data, //object where we'll specify healthsite-to-godata mappings })
See upsertReferenceData(...)
function in the Go.Data API adaptor
External Data Sources
See below and p.33 of the Implementation Guide for other data sources you might consider integrating with to automatically register shared Location
records in Go.Data.
HDX
– a clearinghouse of humanitarian open-source data. Included for many countries is the administrative unit boundaries which typically has a unique ID (Pcode) and in some cases this will also include additional data such as “population” that can be used in post analysis. https://data.humdata.org/
2.WHO
– Global coverage of administrative unit boundaries, has centroid and Pcode for unique ID – so could be joined back to GIS data afterwards but only goes to Adm2 and in some countries only ADM 1. https://polioboundaries-who.hub.arcgis.com/
-
Geonames
– has ID, lat/lon and names – may or may not be able to match up to any GIS outside of Go.Data. http://download.geonames.org/export/dump/ -
HealthSites.io
- open source repository of health facility data built in partnership with Open Street Map http://healthsites.io/
Explore the Implementation
- See the Explore OpenFn page to explore the jobs on the live reference project.
- Watch the video overview
- Learn about the solution setup
- And see here for other interoperability videos.
-
HealthSites.io: See here for the API docs and instructions for creating your own OpenStreetMap account to access the data source via the API.
-
Job scripts: See the Github
interoperability-jobs
to explore the source code used to automate these flows. These leverage an open-source Go.Data API wrapper - the OpenFn adaptorlanguage-godata
. - Solution Design Documentation: See this folder] for the data flow diagram & data element mapping specifications mentioend above and used to write the integration jobs.