Small changes to this blog’s URL and hosting

I’ve long wanted to inject more life into my web presence and this blog in particular. It’s a work in progress – one of my main sites is currently not even live! Wherever we may end up, though, informal writing about science and the changing northern environment is something I want to do more of. So I’ve made two changes to give this site more oomph:

  • I’ve added a custom domain to this (currently) WordPress-hosted site. All URLs will work fine, whether you access them via the old address or the new domain. This will remain true unless and until I decide in the future to move the site off WordPress, but that’s not in the stars right now.
  • I’ve moved to the smallest paid account to remove ads from the site. Only when I happened to access it from a browser that was not logged in did I see how intrusive the ads were. It’s not much money, and I hope it will make for more comfortable reading.

The Nugget Creek fire, and how fires are managed in Alaska

The numerous wildfires currently active in interior Alaska have made it a pretty miserable last day of the holiday weekend (Independence Day was Thursday in the US). The Fairbanks area is blanketed in smoke so badly that there’s a dense smoke advisory in effect and the local hospital has set up a smoke respite room for people with respiratory issues.

Most eyes are on the Shovel Creek fire NW of Fairbanks, which is burning very close to residential neighborhoods in mature black spruce forest – a forest type that naturally burns every ~100 years or so (plus/minus a large interval). It is managed by a Type 1 team – these teams are nationally managed in the US and have the largest amount of experience and resources. There have been regular public information meetings, and the fire even has its own YouTube channel.

I live about 30 miles east of town, and over here, even though the smoke these days comes from any of multiple fires in the wider area, the one we’re most concerned with is the Nugget Creek fire. It’s roughly the same order of magnitude as the Shovel Creek fire (a little smaller – 7,000 acres / 2,800 ha as of today, and growing), but unlike its companion, it has not received any suppression activity at all, even though it’s only about 10 miles (16 km) from a residential area.

Why? This has to do with how decisions about fire management are made in Alaska. Every part of Alaska is assigned one of four fire management options. If you follow the link, you’ll see that most of the remote areas are in the “limited” option. Here, fire is by default allowed to fulfill its natural ecological function, though known cabins and infrastructure are protected by firefighters on the ground and from planes. Also, reconnaissance flights are carried out, and fires are monitored. On the other extreme is the “critical” management option, where all fires are immediately suppressed. Towns and villages are covered by this option. In between, there are two intermediate options, “full” (covering the major road corridors, mining areas etc.) and “modified”. (These options are used for prioritization – there is no mechanical rule that dictates the response to any given fire.)

The Nugget Creek fire started in the Chena River State Recreation Area, in a location that is in the “limited” management option. Here’s a map of the fire and how it is just nestled into the green, “limited” area. The fire is in red on the right, with the spots indicating where active fire was detected by satellite imagery. The fire has been pretty active along nearly its entire perimeter.

Source: Alaska Fire Service GIS (click for larger image)

But it is still nearly entirely in the area that belongs to the limited management option. To the north, the full option starts at the Chena River and/or Chena Hot Springs Road, approximately. To the west, the modified option starts at the south fork of the Chena River (the confluence is where green, brown and blue areas meet). There are some fire service staff assigned to the fire, who have been continuously monitoring where the fire is relative to river/road in the north, and also have set up sprinklers to protect a public use cabin on the south fork of the Chena River (Nugget Creek cabin).

I’m pretty sure that we’ll see some drops of water or retardant if the fire service thinks it may jump over the river or road in the north. There are more reasons to think it won’t spread too far west or east: not only is the western side quite swampy and full of sloughs (and of course a small river), but there are also recent fire scars in the area, which would slow down a fire. The blue area is already within the scar left by the 2013 Stuart Creek 2 fire, and the area east of the fire burned in 2009 or 2000:

Nugget Creek fire as of July 6, with active fire detections around its perimeter. Old fire scars surround the area. The stars indicate where we live, and the location my spouse and I are in the process of moving to. (click for larger image)

So what do I expect will happen? My guess is that the fire will burn out a good part of the triangle between rivers and old fire scars. It’ll be several days before we can expect rain, so smoke will stay terrible for a while.

In any event, Nugget Creek will be an interesting case study for testing models of how fire spreads: it’s one of the few fires that will be relatively easily accessible on foot (once it is completely out, of course!), as it has reached the Mastodon Creek Trail in the east and is visible from two bridges over the Chena River. And it will have been essentially unsuppressed, i.e. left to its natural progression, at least until now.

Fire season in interior Alaska, fire terminology: What is “spotting”?

It is fire season in Alaska and a very smoky Thursday in the Fairbanks area.

Smoke forecast for June 27, 2 pm local time, from the UAF Geophysical Institute.

The vegetation in the Alaskan Interior (roughly the area between two mountain ranges, the Brooks Range in the north and the Alaska Range in the south) mostly consists of boreal forest – black and white spruce, larch, birch and quaking aspen – with easily burnable undergrowth and a thick organic soil layer in the coniferous parts. This type of forest has long been adapted to a cycle of renewal driven by wildfire. But with climate change, the fire intervals – the time between successive burns, usually over 100 years – have become shorter, and the areas burned have crept upwards.

Right now, there are many relatively small fires surrounding Fairbanks that are responsible for the smoke. But they are only kept relatively small by the excellent work of our wildfire management services (a complex effort involving both federal and Alaska state agencies, called Alaska Interagency Coordination Center, or AICC). And one of the fires to the north-west of Fairbanks, the Shovel Creek fire, now exceeds 1000 acres (400 ha), is very active, and has put several residential neighborhoods under an evacuation warning. My partner and I live on the east side of town, and the situation there looks about like this now:

Map of fires as of June 26 (evening) in the area east of Fairbanks. Chena Hot Springs Road, along which most people here live, runs along the bottom of the map, partly paralleling the Chena River. Fire perimeters (last known) are the red outlines. Recent fire detections are dots in red (last 12/24 hours) or blue (24-48 hours).

Mapping services and data sources available to the general public have become much more widespread, and that’s great. And private citizens, fire management agencies and researchers publish a wealth of photos. But with the flood of information comes a challenge: learning how to interpret it. Because if we don’t, more information can mean more anxiety. Let’s see how we do for the fires in the area where I live.

Last week, we were mostly worried about the Caribou Creek fire (north edge of the map), which is about 7 miles (11 km) due north of our current home. (My spouse and I are in the process of buying and hopefully soon moving to another home very close by, and the fires are clearly making it harder to get the insurance providers to issue the policies we need!) That fire was immediately and strongly attacked, and was already in the process of being controlled after receiving a good dousing with rain last weekend.

As you can see in the map, there is a fire perimeter as well as one red, black-edged dot that indicates the latitude and longitude where the fire was originally thought to have started. But that’s all. However, for the fire east of us (Nugget Creek fire — many fires here are named after the nearest creek, river or sometimes mountain or street), you can also see white-rimmed red dots as well as blue and darker red dots. These are fire detections from satellite observations, and they are more recent than the fire perimeter drawn by the fire services. So by looking at where those dots are relative to the known fire perimeter, you can get an impression of where the fire is moving. In this case, mostly east and south, plus a little bit north. (The fire is boxed in by rivers west and further north, luckily.)

The reason this fire is more active than Caribou Creek is that it is in an uninhabited area (a state recreation park used by hikers etc.) and was NOT initially fought with vigor. And that’s a good thing in principle: it is normal for this forest to burn occasionally. If we never allow it, dead debris is just going to accumulate, and future fires might be that much worse. However, once the nuisance from the fire becomes too great, or the fire approaches areas where it could do damage, the fire agencies will decide to fight it – which is exactly what happened.

Here are some pictures of the Nugget Creek fire from yesterday that illustrate why concern has risen:

This photo was taken yesterday (Wednesday, June 26) evening by UAF’s Dr. Scott Rupp, from his home north of Chena Hot Springs Road and west of the fire. The Nugget Creek fire is on the left. On the right, the Beaver Fire is a new fire burning further south. Used with permission (original on Twitter).
Photo of the Nugget Creek fire from the same day, Wed. June 26, by the Alaska Department of Natural Resources/Division of Forestry. (Author: Brandon Kobayashi). Looking south as well.

The second photo in particular shows that the fire is active along its perimeter, with the smoke being driven backwards, into the already-burned zone (“fire scar”). This is called “backing”.

Backing is one of the terms for fire behavior that it is useful to know. Even more useful, here in the boreal forest of North America, are “torching” and “spotting”, both considered extreme behaviors. Torching is a behavior that happens at high intensities in our spruce forests: individual trees, often one after the next, light up in combustion like a torch, with flames rising high and usually a very large amount of smoke. Spotting is another extreme behavior. It means that the fire is driven forward by wind, jumping ahead of the existing fire front via new ignitions from falling embers. The fire in the last picture is highly active, but as of the picture it is neither torching nor spotting. Especially for spotting, you’d have to have wind that blows in the other direction, driving embers ahead of the front.

Why am I harping on words like “spotting”? Because I’ve been seeing a misconception on social media, created by the excellent new fire maps! Let’s zoom in on the map:

Those dots that signify the locations of new satellite-based fire detections? I’ve seen people on social media call this situation “spotting”. But it is not. In fact, when I make fire maps, I prefer depicting the detections as whole footprints: rectangles (or nearly so) that indicate what a satellite image pixel looks like on the ground. The satellite images we use for this are not high resolution: each dot actually covers an area of (at least) about a quarter mile by a quarter mile.
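To make the scale concrete, here is a little sketch of how one might turn a detection’s center coordinates into such a footprint rectangle. This is my own illustration, not code from any fire-mapping service; the 375 m pixel size is the nominal resolution of the VIIRS active-fire product, and the flat-earth approximation is only good for small boxes:

```python
import math

def pixel_footprint(lat, lon, size_m=375.0):
    """Corner (lat, lon) pairs of a square satellite pixel centered
    at the given point, counter-clockwise from the south-west corner."""
    half = size_m / 2.0
    dlat = half / 111320.0                       # metres per degree of latitude
    dlon = half / (111320.0 * math.cos(math.radians(lat)))
    return [(lat - dlat, lon - dlon), (lat - dlat, lon + dlon),
            (lat + dlat, lon + dlon), (lat + dlat, lon - dlon)]
```

Near Fairbanks (~65 °N) the footprint’s east-west extent in degrees is more than twice its north-south extent, which is why the rectangles don’t plot as squares on a plain longitude/latitude map.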

The hobby linguist in me wonders if, given that the public is now used to these near real-time fire detections being available as dots on a map, the term “spotting” will be used more and more in this way. Because it is relevant whether there are new fire detections or not (though to be honest, that’s a lot more complicated than it looks at first). “OMG, there are spots of new fire out to the east… the fire is spotting!” To be sure, if the fire is spotting via flying embers, then it is likely that new detections will show up as dots on the next available map. But I do need to point out that spotting, the extreme fire behavior, is not the same as new spots on a map. Rather, any movement of the fire, even medium-intensity propagation of the combustion with no spotting at all, will appear as new dots. “Spotting” means a particular way of a high-intensity fire moving ahead, and new fire detections on a map are just new fire detections on a map.

Sources of fire information for Alaska:

  • AK Fire Info. Blog-style, up to date, excellent information about what’s going on, which fires receive what type of attention, planned community meetings, pictures and more.
  • Official BLM/AICC map site. Tip: Use the icons to select the background you like. I prefer “US Topo Map” or “Imagery with labels”.
  • “Old” BLM/AICC map site – harder to navigate, but has also some information the new one hasn’t.
  • UAF Smoke forecasts, run daily based on satellite fire detections and AICC data. Also has a simple “current fires” map.
  • NOAA smoke forecasts for Alaska. May be interesting for people in the Yukon Territory in Canada. Click on the check mark between “near surface smoke” and “loop”.

Inversion! (And a little bit about indoors air quality)

After an incredibly warm (and low-snow) start of winter in 2018, temperatures plunged to proper sub-Arctic winter values. A few days ago, we were worrying about thawing (with temperatures in the 30s °F / around 0 °C), and then down the curve went. Last night we reached -40 (choose your unit: the scales meet at -40) as per our consumer-grade Davis weather station. That’s the official definition of “cold snap”, and it’s the first of the season.
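(That the two scales meet at -40 follows directly from the conversion formula:)

```python
def c_to_f(celsius):
    """Celsius to Fahrenheit: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32

# The fixed point of the conversion: solving C = C * 9/5 + 32 gives C = -40.
print(c_to_f(-40))  # -40.0
```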

The Fairbanks area is well known for its extreme atmospheric temperature inversion patterns. And on a map provided by the commercial service Wunderground it is very apparent today (click on the image for a bigger version with readable temperature labels):

Higher elevations, such as Cleary Summit, a mountain pass on the Steese Highway north of Fairbanks, are currently about 30 °F / 17 °C warmer than the valley bottoms, where we happen to live (which is why “a house in the hills” is something we keep looking out for).

Life at -40 slows down. You can go out just fine, and you’ll be amazed by how well sled dogs are adapted to this extremely harsh weather, but it’s not really fun to exercise outside, plastic is brittle, batteries go flat … and you don’t have time for recreation because you need to keep an eye on the house systems. Especially to make sure that water stays liquid, be it in drains or incoming pipes.

The house we live in is made from three-sided logs, which retain heat quite well. We mostly heat with a very efficient Toyo oil stove. But we have a wood stove to fall back on, in case the electricity or the Toyo has an outage. We decided to fire it up today, to make sure it works well and because it creates a nicer warmth in the corners too far from the Toyo. Wood burning is a huge and complicated political hot potato in interior Alaska, and a topic too complicated to go into in this post. The short version is: it’s traditional, economical and doesn’t require electricity; but it also contributes to air pollution, occasionally rendering the winter air in Fairbanks some of the worst on the planet, so at a minimum, eliminating the most polluting types of burners is imperative for the health of the population.

Our stove is an EPA-certified Blaze King with a catalytic converter, but even so: however pleasant wood heat is, you’ll get an impact on the air you breathe. Air quality indices only talk about outdoors air (and usually averaged over 24h). Another topic, though, is indoors air. I took the opportunity to pull out a little PM2.5 sensor that I bought online a while ago. (Disclaimer: I haven’t checked the calibration. At least during the summer wildfires it seemed to work reasonably well and deliver believable values.) Today, before starting the wood stove and with a blazing oil stove, it measured around 15 µg/m3, which is an ok value. With the wood stove, it was at 66 µg/m3 after an hour, went up to just above 100 µg/m3 and then settled in the 80s, which you will find labeled as “unhealthy for vulnerable populations”. My setup is here, with the sensor in the left foreground. It talks to the computer via a USB-to-serial interface. (I wrote the app to teach myself Dash for interactive real-time applications.)

This, of course, is measured indoors, not outdoors, and in the 24h average we will quite likely end up in the “moderate” range for the day. We’re fine.
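A back-of-the-envelope check with my readings supports that. The 6 h / 18 h split below is an assumption for illustration; the “moderate” band quoted is the EPA 24-hour PM2.5 breakpoint range:

```python
def daily_average(samples_ug_m3):
    """Mean PM2.5 over a day's worth of equally spaced samples, in µg/m³."""
    return sum(samples_ug_m3) / len(samples_ug_m3)

# ~6 hours in the 80s during and after start-up, the remaining 18 hours
# near the 15 µg/m³ baseline (hourly values, assumed for illustration):
hourly = [85] * 6 + [15] * 18
print(daily_average(hourly))  # 32.5 — within the EPA "moderate" band (12.1-35.4)
```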

On a related note, no one should retain too rosy a picture of the air pre-historic and pre-industrial populations used to breathe. Notwithstanding, of course, their insights about the relationship to nature: if a culture practiced indoor wood or coal burning for cooking and warmth, chances are that respiratory illnesses were prevalent.

Addendum: Approximately 5-6 h after we first started the wood stove, the picture has changed a little bit. PM2.5 values came progressively down to 15-25 µg/m3, pretty much to where they were before we started. Clearly, the first fire-up phase, during which collected crud from inside and outside the stove burns off, is what generates the most pollution. In contrast, low-to-medium hot, steady operation deteriorates the air much less. Also, I believe it takes a few hours for the catalytic converter to reach its operating temperature. Apparently, letting the stove cool down completely and sit cold for extended periods is a recipe for air pollution. The lesson from this: if you want to heat with firewood, make sure you have a device with a certified catalytic converter, use it continuously, and burn dry, well-seasoned wood.

The great temperature sensor tryout (part 1)

I called it a shootout first, but hey, it wasn’t anything violent.

Sensing temperature with Python and the pyboard

Because I have the best spouse in the world, my recent birthday brought me a whole pile of little boxes with electronic tinker toys, useful ones, straight from Hackerboxes. I’ve since been renewing my acquaintance with the soldering iron and dusted off an old project about building environmental sensor units. 

One of the development boards I was keen to try out is the pyboard, which is the original or official board for MicroPython. MicroPython is a new implementation of a remarkably rich subset of Python 3.4 to run on microprocessors. I already had a good deal of experience with CircuitPython, the education-friendly derivative of MicroPython published and maintained by Adafruit. Now it was time for the real thing.

The pyboard is very compact. For comparison, the grid squares on my work surface are 1×1 cm.

So the plan I came up with is: Gather the temperature sensors from my kit, hook them all up to the pyboard at the same time, log the temperature for a while and compare the outcome. 

The sensors

After rummaging through my component boxes, I fished out these three sensors (pictures below):

  • The TMP36 is a popular low-cost silicon band gap temperature sensor. It is small, with three output pins, and comes in a black plastic housing. T range: -40 to 150 °C, T accuracy ±2 °C. Price: $1.50
  • The DHT11 is a very common temperature and relative humidity sensor with not the best reputation for accuracy. It comes in a blue housing with electronic components included, and our version (from a Hackerbox) is mounted onto a small board, which adds a pull-up resistor and a power LED. The DHT11 documentation is a little spotty. The sensor apparently uses a negative temperature coefficient (NTC) thermistor (i.e., temperature goes up → resistance goes down) and some kind of resistive humidity measurement. The sensor can only take T measurements every 2 sec. T range: 0 to 50 °C, T accuracy ±2 °C, RH accuracy ±5%. Price: $5
  • The most expensive item in this set, the Sensirion SHT31-D, uses a CMOS chip to measure temperature and relative humidity with a capacitive method. I used the lovely little Adafruit breakout board. T range: -40 to 125 °C, T accuracy ±0.3 °C (between 0 and 60 °C) and up to ±1.3 °C at the edges, RH accuracy ±2%. Price: $14.

The prices are 2018 prices from US vendors. By ordering directly from Chinese outlets, you can usually reduce them to ~40%, except for the much rarer SHT31-D, for which only one ready-to-use alternative (which is not much cheaper) appears to exist.  

 The three temperature sensors: TMP36 (left), DHT11 (middle), SHT31-D (right).
The whole setup.

Communication protocols, libraries, and code

All code, both on the computer side and the pyboard side, is available in a git repository on GitHub.

Two related considerations at this stage: what communication protocol do the sensors use, and what MicroPython libraries are available to drive them from the pyboard?

While the documentation for MicroPython is superb, I found the pyboard itself to be a lot less well documented. MicroPython comes with a library specific to the pyboard (pyb) as well as generic libraries that are supposed to work with any supported board (e.g. machine), and their functions sometimes overlap. I had to consult the MicroPython forum a few times to figure out the best approach.

  • The TMP36 is a single-channel analog sensor: the output voltage is linear in the temperature (it boasts a linearity of better than 0.5%). So we need an ADC pin to measure the output voltage at the pyboard (using an ADC object from the pyb module). According to a forum post, the pin reading is a 12-bit integer (0-4095) that is proportional to the voltage (0-3.3 V). So we get the voltage in mV by multiplying the pin value by 3300 and dividing by 4095. Then, as per the TMP36 datasheet, we subtract 500 and divide by 10 to convert that value to °C. Easy.
  • The DHT11 also uses a single-channel protocol, but a digital one, which looks a little idiosyncratic. (I was a little surprised to find that none of my sensors uses the popular 1-Wire bus.) Luckily, MicroPython ships with a library (dht) that takes care of everything. It outputs temperature in °C and relative humidity in %.
  • The SHT31-D communicates via an I2C bus. I used MicroPython’s generic machine module and the sht31 module provided by kfricke, which builds on it.
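The TMP36 conversion chain described above fits into a few lines. This is plain Python for illustration (the function name is mine); on the pyboard, raw would come from reading the ADC pin:

```python
def tmp36_celsius(raw, vref_mv=3300, full_scale=4095):
    """Convert a 12-bit ADC reading (0-4095) to °C.

    raw * 3300 / 4095 is the pin voltage in mV; per the TMP36 datasheet,
    500 mV corresponds to 0 °C and every 10 mV is one degree.
    """
    millivolts = raw * vref_mv / full_scale
    return (millivolts - 500) / 10

print(tmp36_celsius(0))  # -50.0, the bottom of the ADC range
```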

I was particularly happy with how easy it is to scan the I2C bus for the device address with MicroPython. From the MicroPython REPL (which you enter as soon as you connect to a pyboard via a terminal emulator like screen), it’s great to be able to create and probe objects interactively, for example like so:

>>> import machine
>>> SCLpin = 'X9'
>>> SDApin = 'X10'
>>> i2c = machine.I2C(sda=machine.Pin(SDApin), scl=machine.Pin(SCLpin))
>>> i2c.scan()

This said, the sht31 library doesn’t even need the user to provide the address. I’m just noting this because this task can be a little frustrating on the Arduino platform, and can take more tries to get done.

Our basic workflow was this:

  1. Using the MicroPython REPL, develop the MicroPython code incrementally. The goal is to take a measurement every 10 sec and send the data back to the computer over the USB serial port.
  2. Upload finished MicroPython code to the pyboard.
  3. Use the USB connection now to receive serial data from the pyboard (using Python 3 on the computer) and also to power the board.
  4. Let the assembly collect data for a few hours and write it to a file.
  5. Visualize the data afterwards. 
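On the computer side, step 3 essentially boils down to reading lines from the serial port (with pyserial) and parsing them. A minimal sketch of the parsing part — the 'tmp36,dht11,sht31' line format and the function names are my assumptions, not the exact code in the repository:

```python
import time

def parse_line(line):
    """Parse one 'tmp36,dht11,sht31' CSV line (temperatures in °C)."""
    names = ("tmp36", "dht11", "sht31")
    return dict(zip(names, (float(f) for f in line.strip().split(","))))

def csv_record(reading):
    """Format a parsed reading as a timestamped CSV row for the log file."""
    return "%.0f,%s,%s,%s" % (time.time(), reading["tmp36"],
                              reading["dht11"], reading["sht31"])
```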

I let the experiment run for a few overnight hours in my somewhat overheated home office corner, where the air is very dry. 

The measurement results

Even while monitoring the data flow, some observations stand out: the DHT11 only provides temperatures in whole degrees, and the temperature measured with the TMP36 is a good bit (about 2-3 °C) higher than the output from the other two sensors. The data looks somewhat like this:

Let’s look at plots of the three temperature curves. At first, I was not impressed: the DHT11 measurements seemed to jump around a whole lot, the SHT31 has a very weird periodic signal, and the TMP36 data is incredibly noisy (apart from being unrealistically high – the room wasn’t that badly overheated!).

But things become clearer once we plot everything onto a single figure: the upticks of the DHT11 measurement correspond exactly to the peaks of the SHT31 data, and so do the minima! Also, the two sensors agree, within their precision, quite well in absolute value. Once I took the SHT31 signal seriously, I realized what it was due to: the thermostat cycle of our Toyo oil stove! I had no idea that it was on this 30 min cycle. How interesting. And now that I know, I really like the SHT31. (Apparently you get what you pay for in this case.)
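Beyond eyeballing overlaid plots, a simple autocorrelation would confirm a periodic signal like the thermostat cycle. This is my own sketch, not part of the original analysis:

```python
def dominant_period(samples, min_lag=1, max_lag=None):
    """Lag (in number of samples) with the highest autocorrelation.

    With one measurement every 10 s, a ~30 min thermostat cycle should
    show up as a peak near lag 180.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]          # remove the mean first
    if max_lag is None:
        max_lag = n // 3
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / (n - lag)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag
```

Fed a synthetic sine with a 180-sample period, it recovers a lag at (or very close to) 180; on real, noisy data one would want to smooth or detrend the series first.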

What about the TMP36? Well, it’s noisy and operating way outside its nominal accuracy, but at least some of the signal is clearly due to the real temperature variation. 

The fall-off of the data during the first hour is, by the way, probably because I removed my body from the desk and went to bed. The air cooled afterwards.

Since I had the relative humidity data, I looked at it, too. Here the SHT31 and the DHT11 agree less well in absolute value, and I think the latter result is in line with its reputation as not very accurate. I find the SHT31 values a lot more convincing. One thing that the graphs don’t show is that if you blow on the sensors, the SHT31 adjusts immediately and goes back to normal pretty fast, while the DHT11 takes a few seconds for its measurement to jump. It also saturates quickly (showing unrealistic nearly-100% RH) and takes a long time to return from an extreme measurement.

What did we learn? What else could we find out?

So this was interesting! I didn’t expect to get so clear a feeling for the different sensors’ strengths and weaknesses. At the same time, I have more questions…

  • The DHT11 confirmed its reputation as a very basic sensor. Its temperature measurement was in better agreement with what I think is reality than expected, but the relative humidity is not very trustworthy. Its limited range and whole-degree step size make it a mediocre choice for a personal weather station, and its slowness makes it unsuitable for, say, monitoring temperature-sensitive circuitry. But could we get a higher resolution with a different software library? Are there settings that escaped my notice? As-is, I see its application mostly in education and maybe indoors monitoring when you don’t need much precision.
  • The TMP36 was disappointing, but maybe my expectations of this $ 1.50 sensor were excessively high. As an analog sensor, it is faster than the DHT11 (though I don’t know how fast … just yet) and can be used for monitoring temperature extremes, but not with high precision because of the noise! Maybe a good approach would be to reduce the noise in some fashion. Also, did I have a bum unit, or do they all need calibrating? Maybe the pyboard-specific conversion formula has a flaw, and a different board or method would produce a different result? 
  • The SHT31-D looks like a great sensor all around. Also congratulations to Adafruit for producing an outstanding breakout board. This would be my choice for a weather station, hands down. 

Other questions I have are: How different would the performance be in a different temperature range? I live in Alaska, and temps went down to -30 °C today. (My medium-term plan is to put a sensor under the snow.) Also, how fast can we actually retrieve successive measurements from our  two faster sensors? Last, I think I found at least two more temperature sensors in my kit. How do they compare?

I think there might be a part 2 in the works… 

A thought on serial ports

Some time during the last year or two, there has been a change in my attitude towards serial connections: I used to think of them as a relic of a bygone time, when personal computers came with RS232/DE-9 interfaces and you could find them on all sorts of peripherals. A relic used by scientists and instrument makers for reasons, probably, of expedience. And if your instrument comes with a serial connector, you sigh, go look for a serial-to-USB converter and hope that it’s not one with an unmaintained and buggy driver that will forever make you doubt your data.

It is true that my deep negativity stemmed from the time I was receiving GPS signals in an airplane and worried a lot about USB buffer lag impacting the time stamps. I still think my worries were well-founded in that particular application. It is also true that for full-blown precision instruments, providing an ethernet interface (or maybe something else?) out of the box would be highly preferable to the unholy rat’s nest of serial hubs, USB converters and ethernet hubs (not to mention proprietary software to run them) that you find in too many scientific installations.

But since I’ve been playing with electronics and development boards, dealing with UART feels completely normal and appropriate. I own a bunch of converters (USB-to-UART, USB-to-DE-9) with FTDI chips, which work very well. And while it’s true that I’m far from understanding everything about serial communications, and am still leery of the reliability of timings below the ms level, it’s a lot more convenient and manageable than other options. 

To be continued… 

Plotting polygon Shapefiles on a Matplotlib Basemap with GeoPandas, Shapely and Descartes

I often use Python to plot data on a map and like to use the Matplotlib Basemap Toolkit. In practice, I use a lot of different libraries to access various data formats (raster, vector, serialized…), select and analyse them, generate, save and visualize outputs, and it’s not always obvious to string one’s favourite tools together into efficient processing chains.

For example, I’ve also become fond of Pandas DataFrames, which offer a great interface to statistical analysis (e.g. the statsmodels module), and their geo-enabled version, GeoPandas. Both Basemap and GeoPandas can deal with the popular (alas!) ESRI Shapefile format, which is what many (vector) GIS datasets are published in. But they aren’t made for working together. This post is about how I currently go about processing Shapefile data with GeoPandas first and then plotting it on a map using Basemap. I’m using an extremely simple example: a polygon Shapefile of the earth’s glaciated areas from the handy, and free, Natural Earth Data site. The data is already in geographic coordinates (latitudes/longitudes), with a WGS 84 datum. We therefore don’t have to worry about preprocessing the input with suitable coordinate transforms. (Often, you do have to think about this sort of thing before you get going…) All my code is available in an IPython (or Jupyter) Notebook, which should work with both Python 2 and 3.

So let’s say we have our glacier data in a file called ne_10m_glaciated_areas.shp. GeoPandas can read this file directly:

import geopandas as gp
glaciers = gp.GeoDataFrame.from_file('ne_10m_glaciated_areas.shp')

The output looks something like this:


The geometry column (a GeoSeries) contains Shapely geometries, which is very convenient for further processing. These are either of type Polygon, or MultiPolygon for glaciers with multiple disjoint parts. GeoPandas GeoDataFrames or GeoSeries can be visualized extremely easily. However, for large global datasets, the result may be disappointing:



If we want to focus on a small area of the earth, we have a number of options: we can use Matplotlib to set the x- and y-limits of the plot. Or we can filter the dataset geographically and only plot, say, those glaciers that intersect a rectangular area in the vicinity of Juneau, AK – that is, the Alaskan Panhandle and the adjacent western British Columbia. Filtering the dataset first also speeds up plotting, by a lot:

import shapely.geometry
import matplotlib.pyplot as plt
# bounding box as (minx, miny, maxx, maxy); the western bound is reconstructed
studyarea = shapely.geometry.box(-136., 56., -130., 60.)
ax1 = glaciers[glaciers.geometry.intersects(studyarea)].plot()
fig = plt.gcf()
fig.set_size_inches(10, 10)


This is remarkable for so few lines of code, but it’s also as far as we can get with GeoPandas alone. For more sophisticated maps, enter Basemap. The Basemap module offers two major tools:

  • a Basemap class that represents a map in a pretty good selection of coordinate systems and is able to transform arbitrary (longitude, latitude) coordinate pairs into the map’s coordinates
  • a rich database of country and state borders, water bodies, coast lines, all in multiple spatial resolutions

Features that add on to these include plotting parallels and meridians, scale bars, and reading Shapefiles. But we don’t want to use Basemap to read our Shapefile — we want to plot the selections we’ve already made from the Shapefile on top of it.

The basic syntax is to instantiate a Basemap with whatever options one finds suitable:

mm = Basemap(projection=..., width=..., height=...)

… and then to add whatever other features we want. To transform a (longitude, latitude) coordinate pair, we use mm(lon, lat). The resulting transformed coordinates can then be plotted on the map the usual Matplotlib way, for example via mm.scatter(x, y, size, ...). The code to plot our study area and the city of Juneau, in the Albers Equal Area conical projection (good for high- and low-latitude regions), at intermediate resolution, and including water, ocean, coastlines, country borders etc. is:

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
water = 'lightskyblue'
earth = 'cornsilk'
juneau_lon, juneau_lat = -134.4167, 58.3

fig, ax1 = plt.subplots(figsize=(12, 10))
mm = Basemap(
    projection='aea', resolution='i',
    width=600000, height=400000,
    lat_1=55., lat_2=65.,
    lat_0=58., lon_0=-134)
coast = mm.drawcoastlines()
rivers = mm.drawrivers(color=water, linewidth=1.5)
continents = mm.fillcontinents(
    color=earth, lake_color=water)
bound = mm.drawmapboundary(fill_color=water)
countries = mm.drawcountries()
merid = mm.drawmeridians(
    np.arange(-180, 180, 2), 
    labels=[False, False, False, True])
parall = mm.drawparallels(
    np.arange(0, 80), 
    labels=[True, True, False, False])
x, y = mm(juneau_lon, juneau_lat)
juneau = mm.scatter(x, y, 80, 
    label="Juneau", color='red', zorder=10)
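For intuition about what the mm(juneau_lon, juneau_lat) call does, here is the forward spherical Albers equal-area mapping sketched in pure Python, with the same standard parallels and origin as the map above. This is only an illustration: Basemap delegates the real work to PROJ (via pyproj) and offsets its coordinates to the map’s lower-left corner, so the numbers won’t match mm(lon, lat) exactly.

```python
from math import radians, sin, cos, sqrt

def albers(lon, lat, lon_0=-134., lat_0=58., lat_1=55., lat_2=65., R=6370997.):
    """Forward spherical Albers equal-area projection, in metres
    from the projection origin (lon_0, lat_0)."""
    lam, phi = radians(lon), radians(lat)
    lam0, phi0 = radians(lon_0), radians(lat_0)
    phi1, phi2 = radians(lat_1), radians(lat_2)
    n = (sin(phi1) + sin(phi2)) / 2.            # cone constant
    C = cos(phi1)**2 + 2. * n * sin(phi1)
    rho = R / n * sqrt(C - 2. * n * sin(phi))
    rho0 = R / n * sqrt(C - 2. * n * sin(phi0))
    theta = n * (lam - lam0)
    return rho * sin(theta), rho0 - rho * cos(theta)

x, y = albers(-134.4167, 58.3)  # Juneau lands slightly west (x < 0) and north (y > 0) of the origin
```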


This result may even be quite suitable for publication-quality maps. To add our polygons, we need two more ingredients:

  • shapely.ops.transform is a function that applies a coordinate transformation (that is, a function that operates on coordinate pairs) to whole Shapely geometries
  • The Descartes library provides a PolygonPatch object suitable to be added to a Matplotlib plot

To put it together, we need to take into account the difference between Polygon and MultiPolygon types:

import shapely.ops
from descartes import PolygonPatch
from matplotlib.collections import PatchCollection

patches = []
selection = glaciers[glaciers.geometry.intersects(studyarea)]
for poly in selection.geometry:
    if poly.geom_type == 'Polygon':
        mpoly = shapely.ops.transform(mm, poly)
        patches.append(PolygonPatch(mpoly))
    elif poly.geom_type == 'MultiPolygon':
        for subpoly in poly.geoms:
            mpoly = shapely.ops.transform(mm, subpoly)
            patches.append(PolygonPatch(mpoly))
    else:
        print(poly, "is neither a polygon nor a multi-polygon. Skipping it.")
glacierpatches = ax1.add_collection(
    PatchCollection(patches, match_original=True))

The final result, now in high resolution, looks like this:

We could do a lot more — add labels, plot glaciers in different colors, for example. Feel free to play with the code.

Tip: Earth View from Google

As a Google Chrome user (alternating with Firefox depending on usage scenario: I’m a multi-browser person) and satellite imagery person, I’ve been enjoying a browser plug-in called Earth View, which customises every new tab with an image out of Google’s collection of particularly spectacular satellite imagery. These images are drawn from the DigitalGlobe and Astrium/CNES imagery that Google acquired for use in Google Earth and Google Maps. DigitalGlobe launched and runs the IKONOS, GeoEye and WorldView series of commercial satellites, and Astrium (or their parent company) operates the SPOT, Pléiades and other satellite platforms for the French and European space agencies. This is high-resolution optical imagery on a scale of 50 cm to 2.5 m.

But even if you don’t use Chrome, you can still enjoy the images! Google recently launched an improved web site where you can browse, gallery-style, and share images with a direct link or on some social media platforms. There is also one-click access to the location on Google Maps and the option to download. Obviously, you just get a processed RGB JPEG at less than full resolution (not the geo-referenced multi-band reflectance dataset, which is available from the original sources for purchase), but still, the images are a lovely way to explore landscapes and urban structure. I’m learning a lot about serendipitous places of the world. Here’s an agricultural area in Vietnam, for example. Click on the map and zoom in for a view of houses and roads along canals:


Or click on the hidden arrow to the right to find other places, such as in the US, Egypt, or Spain.

A scientist goes to PyCon 2015

It’s already been two months, and I still haven’t posted about going to PyCon in Montreal. I had a wonderful experience! Many thanks to the PSF and PyLadies, whose travel grant brought the cost down into the realm of the feasible for me.

PyCon is an extremely well-run conference, organized by a community that emphasizes a welcoming attitude. There’s a visible science presence (much more general than the topics you’d see at SciPy, of course), and an impressive 30% of speakers were women. I came away from it with many new ideas and got to talk with countless Python people. I met many members of the geospatial community, including Sean Gillies, the author of such useful libraries as Shapely, Fiona and Rasterio, who turned out to be lovely, and, serendipitously, two very nice gentlemen from the National Snow and Ice Data Center (my pleasure!), as I used some NSIDC data in my presentation. Right, I gave a talk (on using satellite data to make maps, understandable without a remote sensing background), which was well received. I’ve embedded it below, and you can get the slides on speakerdeck here:

Indeed, all the talks are available in a YouTube channel and on

I’ve learnt tons by watching talks from past PyCons; it’s one of my favourite evening pastimes. So I thought I’d put together a quick “PyCon highlights for the pythonic scientist”, with links to the relevant videos. A few notes of caution:

  • These are not my best-of PyCon talks. Some talks that were excellent I left aside in favour of some that have a clearer utility for someone working in scientific research.
  • Most of these are 30 min talks. Some are 45 min. The ones that are marked as “3h” were tutorials, and may be somewhat tedious to watch — except if you really want to learn about a topic in-depth, in which case you’ll be happy they exist. Otherwise, skip!
  • I organized them roughly by topic area and added annotations. If you only have time for a few, my suggestion is to start with the ones with the asterisk. (Again, not because they’re necessarily the best, but because I think you get a lot of reward for your time investment).

Science topics

(In no particular order.)


Becoming a better Python programmer

(The hard ones are at the end.)


Understanding Python internals


Philosophy, ethics and community

A map of the Mount Polley Mine tailings pond breach

Like many, I’ve been following the developing story of the large spill of mine tailings and water following the failure of a tailings pond dam at Imperial Metals’ Mount Polley mine near Likely, British Columbia, Canada. There has been much impressive video footage, but I haven’t seen a good map of the lay of the land, so I made a quick one from Landsat imagery.


The before/after comparison shows the same location on the Tuesday before the spill (which happened on Monday, Aug. 4, 2014), and a week later. Debris, which after a day had not yet reached the town of Likely towards which it was headed, is visible in Quesnel Lake (and Polley Lake). Hazeltine Creek, which must have been a small stream passing close to the pond before the breach, is widened and filled with muddy water along a length of several miles (recognizable by the lighter colour). From the numbers I’ve seen, the ratio of water to sediment in the spill was about 2:1, so we’re talking about liquid mud. I put in the 1 mi scale by eyeballing it — it’s not precise, but Polley Lake appears to be about 3 miles long.
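For what it’s worth, an eyeballed scale bar can be sanity-checked against Landsat’s known pixel size. The 30 m figure below is Landsat 8’s standard multispectral resolution; the pixel count in the downsampled LandsatLook JPG would of course be smaller:

```python
# At Landsat 8's native 30 m multispectral pixel size,
# a 1-mile scale bar spans roughly 54 pixels.
MILE_M = 1609.344   # metres per statute mile
PIXEL_M = 30.0      # Landsat 8 OLI multispectral pixel size
print(MILE_M / PIXEL_M)  # about 53.6 pixels per mile at full resolution
```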

These images are made from Landsat 8 scenes, which are freely available (simple registration required) from USGS. I did not process them myself, but took a shortcut and downloaded the pre-processed “LandsatLook” images, which USGS provides for illustration purposes (rather than science and image processing). These are JPG files of about 10 MB, not at full resolution. If I processed the file from the original scene, it would look better, but I didn’t want to download 2 GB of full-scene data and spend about an hour processing it myself last night.