MODIS HDF data extraction in R

The Moderate Resolution Imaging Spectroradiometer (MODIS) sensors on both the Aqua and Terra platforms provide a wealth of environmental data. However, manipulating the raw files can be challenging. Google Earth Engine hosts most MODIS products and offers an easier way to access them, but sometimes direct manipulation of the files is still necessary. Here I quickly outline how to extract data from raw MODIS files using R.


To extract data from a MODIS HDF file in R you need a few packages, mainly:
– raster
– sp

After installing these packages, make sure you load them before continuing.
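A minimal sketch of that step, assuming both packages are already installed from CRAN. Reading HDF files additionally depends on the GDAL build behind these packages including the HDF4 driver, which is worth checking if a file refuses to open.

```r
# Load the packages used throughout this post.
# If needed, install them first with install.packages(c("raster", "sp")).
library(raster)  # raster data handling and value extraction
library(sp)      # spatial points and coordinate transformations
```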

Finding your way on the globe

First you have to determine where your location falls within the tiled format in which MODIS data are distributed. Each tile is denoted by a horizontal (h) and a vertical (v) index. For the land products there are roughly 350 tiles with varying degrees of actual land coverage. The tiled data are distributed in a custom sinusoidal projection, hence the rather odd-looking shape of the tile map.


In order to find the right tile we load a coordinate into R. In this case it’s the location of Harvard Yard, at latitude 42.375028 and longitude -71.116493.
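One way to get the tile indices without any extra data is to compute them from the definition of the MODIS sinusoidal grid. The sketch below is an illustration rather than the only route (overlaying the point on a tile outline works as well), and the function name modis_tile is made up for this example. It assumes the standard grid: a sphere with a radius of 6371007.181 m, divided into 36 horizontal by 18 vertical tiles that are counted from the upper-left corner.

```r
# Compute the MODIS tile (h, v) that contains a latitude/longitude pair.
# Assumes the standard MODIS sinusoidal grid: sphere radius 6371007.181 m,
# 36 x 18 tiles, h counted from the west edge and v from the north edge.
# modis_tile() is a hypothetical helper, not part of any package.
modis_tile <- function(lat, lon) {
  radius <- 6371007.181              # radius of the MODIS sphere (m)
  tile_size <- 2 * pi * radius / 36  # width and height of one tile (m)

  # forward sinusoidal projection
  lat_rad <- lat * pi / 180
  lon_rad <- lon * pi / 180
  x <- radius * lon_rad * cos(lat_rad)
  y <- radius * lat_rad

  # offset from the upper-left corner of the global grid, in whole tiles
  h <- floor((x + pi * radius) / tile_size)
  v <- floor((pi * radius / 2 - y) / tile_size)

  list(h = h, v = v, name = sprintf("h%02dv%02d", h, v))
}

# Harvard Yard
tile <- modis_tile(lat = 42.375028, lon = -71.116493)
print(tile$name)  # "h12v04"
```

The function is vectorised, so passing vectors of latitudes and longitudes returns one tile per location.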

Once you have the horizontal and vertical tile indices you can download only the matching tiles instead of the entire archive.

Extracting data

As previously mentioned, the data are stored in HDF files. These files are readily readable by the GDAL library. However, each file bundles several layers, so you have to know which one you are looking for. These layers are called scientific data sets (SDS), and you need to specify one by name to read it individually.
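As a rough sketch of what specifying an SDS looks like: GDAL addresses each layer of an HDF4-EOS grid file through a subdataset string that combines the file name, the grid name and the layer name. The file, grid and layer names below are placeholders for a land cover product and are assumptions; running gdalinfo on your own file lists the real subdataset names. The helper sds_name is made up for this example, and reading only works if your GDAL build includes the HDF4 driver.

```r
# Build the GDAL subdataset string for one layer of an HDF4-EOS grid file.
# sds_name() is a hypothetical helper, not part of any package.
sds_name <- function(hdf_file, grid, layer) {
  sprintf('HDF4_EOS:EOS_GRID:"%s":%s:%s', hdf_file, grid, layer)
}

# Placeholder file, grid and layer names; replace them with your own.
hdf_file <- "MCD12Q1.A2013001.h12v04.hdf"
sds <- sds_name(hdf_file, grid = "MCD12Q1", layer = "LC_Type1")
print(sds)

# Read the layer only if the file is actually present.
if (file.exists(hdf_file) && requireNamespace("raster", quietly = TRUE)) {
  r <- raster::raster(sds)
  print(r)
}
```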

Now you have successfully read in the data. However, these data use the sinusoidal projection, not geographic latitude and longitude. In order to extract data at given locations we have to translate the coordinates of these extraction points to the MODIS sinusoidal projection.
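A minimal sketch of this transformation using sp, assuming the usual definition of the MODIS sinusoidal projection (a sphere with a radius of 6371007.181 m, centred on the prime meridian). The projection string is an assumption about what matches your file; if you have the raster loaded, taking the projection from the raster itself is the safer choice. Depending on the sp version, spTransform() relies on an additional backend package (rgdal in older versions, sf in recent ones).

```r
library(sp)

# Harvard Yard in geographic coordinates.
# Note the order: x = longitude, y = latitude.
pts <- SpatialPoints(
  coords = cbind(lon = -71.116493, lat = 42.375028),
  proj4string = CRS("+proj=longlat +datum=WGS84")
)

# MODIS sinusoidal projection (assumed definition, see text)
sinu <- CRS("+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +R=6371007.181 +units=m +no_defs")

# Reproject the points to sinusoidal coordinates (metres).
pts_sinu <- spTransform(pts, sinu)
print(coordinates(pts_sinu))
```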

Finally, extract the data at the new sinusoidal coordinates.
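The extraction itself is a single call to extract(). Since no HDF file accompanies this text, the sketch below swaps in a small synthetic raster that covers the extent of tile h12v04 in sinusoidal coordinates. The cell values and the point coordinates (the approximate sinusoidal position of Harvard Yard) are made up for illustration. With real data you would pass the raster read from the SDS and your transformed points instead.

```r
library(raster)

# Assumed MODIS sinusoidal projection, sphere radius and tile size
sinu <- "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +R=6371007.181 +units=m +no_defs"
radius <- 6371007.181
tile_size <- 2 * pi * radius / 36

# Synthetic stand-in for a MODIS layer: a 10 x 10 grid over tile h12v04
r <- raster(
  nrows = 10, ncols = 10,
  xmn = -pi * radius + 12 * tile_size,
  xmx = -pi * radius + 13 * tile_size,
  ymn = pi * radius / 2 - 5 * tile_size,
  ymx = pi * radius / 2 - 4 * tile_size,
  crs = sinu
)
values(r) <- 1:ncell(r)  # made-up cell values, numbered row by row

# Approximate sinusoidal coordinates (x, y) of Harvard Yard in metres
xy <- cbind(x = -5841900, y = 4711900)

# Value of the cell that contains the point
value <- extract(r, xy)
print(value)
```

extract() also accepts a SpatialPoints object, so the output of a coordinate transformation can be passed in directly.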