Using the GFW API [update]
grabbing forest loss information, more efficiently
We recently posted an article about collecting forest cover loss information through the GFW API. There, I said that the GFW API could not support multipolygons. That was true at the time, but it no longer is: we have added multipolygon support. This greatly simplifies the post, which spent an inordinate amount of time ripping multipolygons apart into their polygon components and then stitching the results back together.
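To make the change concrete, here is a quick sketch of the two geometry types in GeoJSON. The coordinates below are toy values, not real boundaries; the point is that a MultiPolygon is just a list of polygon coordinate arrays, and the API now accepts it directly.

```python
import json

# A toy GeoJSON Polygon: a single outer ring (coordinates are illustrative).
polygon = {
    "type": "Polygon",
    "coordinates": [[[102.0, -1.0], [103.0, -1.0], [103.0, -2.0], [102.0, -1.0]]],
}

# A toy MultiPolygon: a list of polygon coordinate arrays. Previously each
# component had to be submitted separately; now the whole thing goes at once.
multipolygon = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[102.0, -1.0], [103.0, -1.0], [103.0, -2.0], [102.0, -1.0]]],
        [[[104.0, -3.0], [105.0, -3.0], [105.0, -4.0], [104.0, -3.0]]],
    ],
}

print(json.dumps(multipolygon)[:40])
```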
The carry-over code didn't change much, specifically reading in the data and filtering it. Suppose we have a map of administrative boundaries of Indonesia from GADM, which we will call map.geojson and store in the data subdirectory. We read in the data and set up a function to filter out only the subprovinces we want, supplying a province name and, optionally, a subprovince name.
import itertools
import requests
import json
import pandas as pd

def _read_data(data='data/map.geojson'):
    with open(data) as json_file:
        x = json.load(json_file)
    return x['features']

def _filter_admin(prov, sub=None, data='data/map.geojson'):
    polys = _read_data(data)
    def _spec_filter(xx):
        x = xx['properties']
        if sub is None:
            return x['NAME_1'] == prov
        return x['NAME_1'] == prov and x['NAME_2'] == sub
    return filter(_spec_filter, polys)
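As a standalone sketch of the filtering step, here is the same predicate applied to a toy feature list. The feature dicts and names below are illustrative stand-ins for the GADM file, not real data.

```python
# Toy feature list mimicking the GADM properties structure.
features = [
    {"properties": {"NAME_1": "Jambi", "NAME_2": "Batang Hari"}},
    {"properties": {"NAME_1": "Jambi", "NAME_2": "Bungo"}},
    {"properties": {"NAME_1": "Riau", "NAME_2": "Kampar"}},
]

def spec_filter(feature, prov="Jambi", sub="Batang Hari"):
    # Match on the province name and, when given, the subprovince name.
    props = feature["properties"]
    return props["NAME_1"] == prov and props["NAME_2"] == sub

matches = list(filter(spec_filter, features))
print(len(matches))  # 1
```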
Once we have the features we want, we just build the request from a parameter dictionary. Specifically, we are collecting information for a particular multipolygon (which subsumes the standard, single-ring polygon) and the forest loss between the previous year and the given year.
def _params(geom, year):
    x = json.dumps(geom['geometry'])
    return {"begin": year - 1, "end": year, "geom": x}
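To see what the request parameters look like, here is the same construction applied to a toy feature. The geometry coordinates are illustrative; the `geom` value is the JSON-serialized geometry the API expects.

```python
import json

# A toy feature with a Polygon geometry (coordinates are illustrative).
geom = {"geometry": {"type": "Polygon",
                     "coordinates": [[[102.0, -1.0], [103.0, -1.0],
                                      [103.0, -2.0], [102.0, -1.0]]]}}

def params(geom, year):
    # begin/end bound the loss period; geom is serialized for the POST body.
    return {"begin": year - 1, "end": year, "geom": json.dumps(geom["geometry"])}

p = params(geom, 2005)
print(p["begin"], p["end"])  # 2004 2005
```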
The final function just builds a dictionary with the keys as variable names and the values as the associated attributes, making it exceedingly easy to combine the entries into a Pandas data frame.
def _grab_loss(geom, year):
    endpoint = 'http://gfw-apis.appspot.com/datasets/umd'
    res = requests.post(endpoint, data=_params(geom, year))
    return res.json()['loss']

def _process_entry(entry):
    n1 = entry['properties']['NAME_1']
    n2 = entry['properties']['NAME_2']
    def _res_dict(e, y):
        loss = _grab_loss(e, y)
        return {'prov': n1, 'subprov': n2, 'year': y, 'loss': loss}
    return [_res_dict(entry, yr) for yr in range(2001, 2013)]
Now we just put all the component functions together to generate a results dictionary for each year and each subprovince of interest.
def process_prov(prov_name):
    x = map(_process_entry, _filter_admin(prov_name))
    # flatten the list of per-subprovince lists of dictionaries
    data = list(itertools.chain(*x))
    return pd.DataFrame(data)
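The flattening step is worth a second look: each subprovince yields its own list of per-year dictionaries, and itertools.chain splices those lists into one flat list. A minimal sketch with toy data (the keys and values below are illustrative):

```python
import itertools

# Two subprovinces, each producing a list of per-year result dicts.
per_entry = [
    [{"year": 2001, "loss": 1.0}, {"year": 2002, "loss": 2.0}],
    [{"year": 2001, "loss": 3.0}],
]

# chain(*lists) concatenates the inner lists into a single iterable.
flat = list(itertools.chain(*per_entry))
print(len(flat))  # 3
```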
So what does the output look like? Check it:
>>> xx = process_prov("Jambi")
>>> print(xx[0:10])
        loss   prov      subprov  year
 2168.218476  Jambi  Batang Hari  2001
 6929.433465  Jambi  Batang Hari  2002
 6954.091027  Jambi  Batang Hari  2003
13053.900478  Jambi  Batang Hari  2004
24139.994766  Jambi  Batang Hari  2005
34024.525262  Jambi  Batang Hari  2006
51258.153696  Jambi  Batang Hari  2007
51904.049130  Jambi  Batang Hari  2008
39439.839620  Jambi  Batang Hari  2009
19894.246405  Jambi  Batang Hari  2010
Boom.