University of Edinburgh, British Antarctic Survey, National Oceanographic Centre, University of Leeds


From Crops to Glaciers: A Deep Learning Framework for Earth Observation

Motivation Advances in machine learning, in particular deep learning [1], have led to tremendous progress in
computer vision. At the heart of these successes are deep neural networks. Deep neural networks are able to excel
at challenging tasks such as image classification, object detection, and semantic segmentation of complex scenes.
However, they are notoriously cumbersome; they rely on vast quantities of manually annotated data and computing
power to train. Because of this, a practitioner, given a new task, will usually take an off-the-shelf pre-trained network
from the internet, and fine-tune it using their own task-specific data.

These networks are usually trained on massive datasets of everyday photos (e.g. dogs, cars, food, people) that
have been manually labeled at great expense [2]. The idea is that these pre-trained networks will be able to generate
useful feature representations that can then be readily adjusted for a new task. This relies on the assumption that the
domain shift (i.e. the difference in the underlying distribution of images from the two datasets) is minimal. However,
this is not the case for tasks using satellite imagery. Satellite images differ substantially from everyday photos: they
are acquired vertically, contain objects that span many scales, are captured at different resolutions, and are
highly prone to weather effects. There are no off-the-shelf neural networks suitable for earth observation experts
working on this imagery, yet there is a growing need for automated feature recognition as our natural environment
changes rapidly in response to both climatic and anthropogenic forcings.

Objective The goal of this project is to satisfy this need and create a deep learning framework in the form of a
neural network that excels at working with satellite imagery. This can be used downstream by earth observation
practitioners working with satellite images, for a host of disparate tasks. We will avoid the need for expensive manual
annotation of satellite images by employing self-supervised learning [3], a paradigm where we can create pretext tasks
to train networks to produce useful features. For everyday photos these pretext tasks include predicting rotations [4],
and solving jigsaw puzzles [5]. This project will involve the careful design of pretext tasks suitable for satellite
images (this could be e.g. a mixture of temporal or spatial infilling) as well as identifying the most salient data for
training these networks. We will also use neural architecture search algorithms [6, 7] so that the underlying network
architecture is optimised for use with these images, rather than everyday photos. To demonstrate the effectiveness
of our framework we will apply our network downstream to tackle two important environmental problems, employing
data from the PlanetScope and Sentinel-1/2 satellites. The problems we will consider are very different in nature;
this is an important demonstrator for the robustness of our approach.
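
As a rough illustration of how pretext tasks generate training labels without manual annotation, the sketch below builds examples for two of the tasks mentioned above: rotation prediction [4] and a simple form of spatial infilling. It is a minimal sketch and not the project's method. The function names (make_rotation_batch, mask_random_patch), the square tile shape, and the zero-fill masking are all assumptions made for illustration, and random arrays stand in for real satellite tiles.

```python
import numpy as np


def make_rotation_batch(tiles, rng):
    """Rotation-prediction pretext task (hypothetical helper).

    tiles: array of shape (N, H, W, C) with H == W, so rotation keeps the shape.
    Returns the rotated tiles and integer labels k in {0, 1, 2, 3}, where each
    tile was rotated by k * 90 degrees. A network would be trained to predict k.
    """
    labels = rng.integers(0, 4, size=len(tiles))
    rotated = np.stack(
        [np.rot90(tile, k=int(k), axes=(0, 1)) for tile, k in zip(tiles, labels)]
    )
    return rotated, labels


def mask_random_patch(tile, patch, rng):
    """Spatial-infilling pretext task (hypothetical helper).

    Zeroes out one randomly placed square patch of side `patch` pixels.
    Returns the masked tile, the original pixels of the patch (the target a
    network would be trained to reconstruct), and the patch's top-left corner.
    """
    height, width = tile.shape[:2]
    top = int(rng.integers(0, height - patch + 1))
    left = int(rng.integers(0, width - patch + 1))
    target = tile[top:top + patch, left:left + patch].copy()
    masked = tile.copy()
    masked[top:top + patch, left:left + patch] = 0
    return masked, target, (top, left)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random values stand in for a batch of 4 multi-band 32x32 image tiles.
    tiles = rng.random((4, 32, 32, 3))
    rotated, labels = make_rotation_batch(tiles, rng)
    masked, target, corner = mask_random_patch(tiles[0], patch=8, rng=rng)
    print(rotated.shape, labels.shape, masked.shape, target.shape, corner)
```

In both cases the supervision signal (the rotation index, or the hidden pixels) comes from the image itself. A temporal variant of infilling could, under the same assumptions, hide one acquisition date from a time series of tiles instead of a spatial patch.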

Problem 1: Crop Monitoring Food production can be assessed and predicted using process-based crop simulation
models. Challenges in the use of these models include identifying the areas under cultivation for each crop,
and adequately simulating the Leaf Area Index (LAI) of the crop. Whilst remote sensing can help
with both of these challenges [8], accuracy can be a barrier to use. Addressing both of these challenges together
provides an excellent way of increasing skill in estimating LAI, and ultimately crop yield, since incorrect crop identification has a
knock-on impact on the assessment of LAI. Accordingly, in this part of the project, we will use our framework to identify
areas where maize is grown in Kenya [9], and to estimate the LAI of the crop. This will contribute directly to the
EU Horizon 2020 programme CONFER [10].

Problem 2: Glacier Surge Events Glacier surge events are characterised by long periods (decades) of inactivity
punctuated by short periods (months to years) of extreme ice discharge. They are confined to specific regions of
the world (e.g. the Karakoram, Pakistan; Alaska; Svalbard) and can impact on mountain communities by triggering
avalanche and flood events as well as damming rivers as the ice advances down-valley. The rapidity of the ice
movement during surge activity reorganises surface features and produces characteristic surface elevation changes,
such that surging glaciers can be visually differentiated from non-surging glaciers with relative ease [11]. However,
this is a time-consuming process, and quickly becomes infeasible as satellite data becomes more abundant. In this
project we will deploy our framework to automate the detection of surge events using fine and medium resolution
imagery with the aim of deriving a global inventory of historical activity, and then use this to train a new network, or
series of networks, each tailored for predicting future surge events at a different study location.

By using our framework to solve these problems, we will be able to demonstrate that we have developed a powerful
tool that can be widely employed by earth observation practitioners in future.

If you would like to discuss the project further, please get in touch with the lead supervisor, Dr Elliot Crowley.

[1] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436, 2015.
[2] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. Int. Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
[3] Jürgen Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: the meta-meta-… hook. PhD thesis, Technische Universität München, 1987.
[4] Spyros Gidaris, Praveer Singh, and Nikos Komodakis. Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728, 2018.
[5] Mehdi Noroozi and Paolo Favaro. Unsupervised learning of visual representations by solving jigsaw puzzles. In European conference on computer vision, pages 69–84. Springer, 2016.
[6] Barret Zoph and Quoc V. Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representations, 2017.
[7] Joseph Mellor, Jack Turner, Amos Storkey, and Elliot J Crowley. Neural architecture search without training. In International Conference on Machine Learning, 2021.
[8] Dimitrios A. Kasampalis, Thomas K. Alexandridis, Chetan Deva, Andrew Challinor, Dimitrios Moshou, and Georgios Zalidis. Contribution of remote sensing on crop models: A review. Journal of Imaging, 4(4), 2018. ISSN 2313-433X. doi: 10.3390/jimaging4040052.
[9] Annalyse Kehs, Peter C Mccloskey, David Hughes, and Derek Morr. Busia County, Kenya agricultural fields 2019, 2021.
[10] CONFER project.
[11] Rakesh Bhambri, K Hewitt, P Kawishwar, and B Pratap. Surge-type and surge-modified glaciers in the Karakoram. Scientific Reports, 7(1):1–14, 2017.