Using algorithms and Tomnod to create the most complete and accurate population maps

03 September 2015 by DigitalGlobe Image Mining Team

Many parts of the world are still affected by infectious but curable diseases, and nongovernmental organizations (NGOs) are fighting them with large-scale vaccination campaigns. To plan these campaigns efficiently over large areas (often spanning 200,000 km², roughly half the size of California) and to maximize the success of vaccine distribution, accurate population density maps are needed. Sending people out on foot to map these areas can be expensive, slow and potentially dangerous. Instead, NGOs are taking advantage of a combination of high-resolution satellite imagery, automated algorithms, crowdsourcing and human insight into population estimation.

DigitalGlobe and its “High-res Urban Globe” team of scientists (codename: HUG) have developed a unique combination of machine learning and image processing techniques for analyzing satellite imagery and characterizing human settlements at large scale. Thanks to efficient algorithms and cloud-based technologies, these automatic techniques are able to analyze terabytes of image data in a couple of hours.

Tomnod users have wondered about the automatic technology under the hood, so the purpose of this post is to provide more insight into the HUG technology and how it interacts with Tomnod crowdsourcing. The technology acts in three successive stages during the analysis:

Fig.1. The villages are first extracted automatically from the imagery. Some errors persist because of the algorithms' limitations.
Fig.2. All potential buildings falling within the villages are automatically extracted. The extraction is not perfect, but it is accurate enough for the population estimation that follows.
Fig.3. Buildings are used to derive population densities in cells of 50 m x 50 m by attributing a number of people to each building and aggregating spatially.
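
The third stage can be sketched in a few lines. This is only an illustration of the idea, not the HUG implementation: the function name, the use of building centroids in a metric coordinate system, and the figure of five people per building are all assumptions made for the example.

```python
from collections import defaultdict

CELL_SIZE_M = 50.0          # cell edge length, as in Fig.3
PEOPLE_PER_BUILDING = 5.0   # hypothetical value, for illustration only


def population_grid(building_centroids,
                    people_per_building=PEOPLE_PER_BUILDING,
                    cell_size=CELL_SIZE_M):
    """Sum the estimated residents of all buildings falling in each grid cell.

    building_centroids: iterable of (x, y) positions in metres
    (assumed to be in a projected, metric coordinate system).
    Returns a dict mapping (column, row) cell indices to estimated people.
    """
    grid = defaultdict(float)
    for x, y in building_centroids:
        # Integer cell index of the 50 m x 50 m cell containing the centroid.
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell] += people_per_building
    return dict(grid)


# Four hypothetical buildings; the first two fall in the same cell.
buildings = [(10.0, 10.0), (40.0, 45.0), (60.0, 10.0), (120.0, 130.0)]
grid = population_grid(buildings)
```

A real pipeline would likely vary the number of people per building with building size or region; a single constant keeps the aggregation step easy to follow.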

A major challenge with this automatic technology is the precision of the detected villages. The algorithm detects potential villages that generally cover about 5% of the full area of interest, but between one quarter and one half of that 5% is erroneous and does not cover human settlements. Thanks to the Tomnod crowd, these errors can be removed efficiently by harnessing human intelligence, which remains unmatched for this task, bringing population estimation to an unequalled level of accuracy.
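
To give a sense of scale, here is the arithmetic behind those percentages, applied to an area of 200,000 km² like the ones mentioned in this post. The function name is made up for the example, and the result is a rough area-based estimate rather than a measured figure.

```python
def candidate_areas(total_km2, candidate_fraction, error_fraction):
    """Split the automatically detected area into valid and erroneous parts."""
    candidate_km2 = total_km2 * candidate_fraction
    false_km2 = candidate_km2 * error_fraction
    return candidate_km2, false_km2, candidate_km2 - false_km2


TOTAL_KM2 = 200_000        # area of interest
CANDIDATE_FRACTION = 0.05  # about 5% is flagged as potential villages

# Error share of the flagged area: between one quarter and one half.
for error_fraction in (0.25, 0.5):
    flagged, false_area, true_area = candidate_areas(
        TOTAL_KM2, CANDIDATE_FRACTION, error_fraction)
    print(f"errors {error_fraction:.0%}: {flagged:.0f} km2 flagged, "
          f"{false_area:.0f} km2 to discard, {true_area:.0f} km2 to keep")
```

In other words, the crowd only has to review about 10,000 km² instead of 200,000 km², and its review removes somewhere between 2,500 and 5,000 km² of false detections.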

Fig.4. Polygons are validated by the crowd in Tomnod. Invalidated villages are removed (red), while valid ones are retained (green). This step eliminates errors and improves the subsequent building detection and population estimation.
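
One simple way to turn crowd answers into a keep-or-remove decision is a majority vote per polygon. The sketch below is an assumption made for illustration: this post does not describe how Tomnod actually aggregates votes, and the function name, data layout and threshold are hypothetical.

```python
def split_by_crowd_vote(votes_by_polygon, min_agreement=0.5):
    """Separate candidate polygons into retained and removed lists.

    votes_by_polygon: dict mapping a polygon id to a list of booleans,
    where True means a user confirmed that the polygon covers a settlement.
    A polygon is retained when the share of True votes exceeds min_agreement;
    polygons without any vote are treated as removed.
    """
    retained, removed = [], []
    for polygon_id, votes in votes_by_polygon.items():
        yes_share = sum(votes) / len(votes) if votes else 0.0
        if yes_share > min_agreement:
            retained.append(polygon_id)
        else:
            removed.append(polygon_id)
    return retained, removed


# Hypothetical votes for three candidate villages.
votes = {
    "village_a": [True, True, False],
    "village_b": [False, False, True],
    "village_c": [True, True, True],
}
retained, removed = split_by_crowd_vote(votes)
```

Here village_a and village_c would be kept (green) and village_b discarded (red). Only the retained polygons are passed on to building detection.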

Automation brings scalability by analyzing large areas in a very short period of time. It significantly reduces the area to focus on (i.e., areas that likely have buildings). Tomnod provides the precision we need by further refining the area from “likely having buildings” to a high-confidence area where buildings are indeed present. This unique combination of crowd analysis and automation is paving the way for extracting geospatial information at large scale with a high level of completeness and precision.

Recently, Tomnod users enabled us to produce population density maps for two regions of Pakistan and Afghanistan (total area: 200,000 km²), which greatly improved the current understanding of population distribution in those areas (see Fig.6). During this project, 300 image strips were analyzed, producing millions of potential villages that were validated through continuous Tomnod crowdsourcing campaigns. All the retained polygons were then fed back into the automatic process, yielding village boundaries and spatially precise population density maps.

Fig.5. The region of Sindh, Pakistan was analyzed with Tomnod and HUG technology. For display, population densities were aggregated within the detected villages, and the villages were color-coded by population count.
Fig.6. Left: part of Karachi, Pakistan. Middle: the best data previously available, represented as population density at kilometer resolution. Whiter means more people. The data is an extract of the global dataset LandScan, produced by ORNL. Right: an improved population density map in cells of 50 meters, which is part of the density map produced over Sindh. The map is overlaid on the kilometer-resolution population map, showing the spatial enhancement achieved by the combination of HUG technologies and crowdsourcing.

The world is facing new problems as the global population grows, with migration flows and disease outbreaks. The need for accurate population maps is growing steadily among NGOs and governmental organizations that have to make informed and smart decisions.

“The contribution of Tomnod users is essential for refining the population maps that we extract from DigitalGlobe’s satellite imagery and for providing actionable maps for NGOs,” said Lionel Gueguen, scientist at DigitalGlobe Inc.

The HUG team would like to thank the crowd for its incredible work and counts on Tomnod users to help make the world a better place in the future.

This work was done by the DigitalGlobe Image Mining team, which pushes technology every day to transform pixels into information.