Created by Thomas Ostersen

In 2020, during a COVID-19 lockdown, I developed a tin-tungsten prospectivity map for northeastern Tasmania as a side project. This involved fitting a supervised classification model to a binary problem in which pixels of a multiband raster containing geoscientific information in ‘evidence layers’ were labelled either proximal or distal to known tin-tungsten mineral occurrences. Running inference with such a model on every pixel in the raster then generated a prospectivity map depicting the spatial distribution of class membership probability for the proximal class (see figure 1).

Figure 1: Conceptual illustration of the prospectivity mapping workflow.

One potential limitation of the above workflow was its pixel-wise nature. Pixels were treated as isolated data points unrelated to their immediate neighbourhood, an approach akin to classifying the dog-ness of a pixel in a photo of a dog based purely on that pixel’s colour. Humans analyse imagery by interpreting its textural content, and this is true for photos of dogs just as it is for, say, maps of aeromagnetic geophysical data. This blog post presents a simple example of one of the textural analysis workflows Datarock has applied to geophysical data, in this case using data covering northern and western Tasmania.

Computer Vision & Transfer Learning

Computer vision is a field of computer science concerned with algorithms that analyse and understand imagery. We at Datarock specialise in applying such algorithms to geological problems. Whether it’s extracting automated geotechnical logs from photos of drill core on our platform or developing continent-scale prospectivity models for clients of our consulting business, we find that treating geoscientific information as imagery and assessing it with computer vision algorithms often yields excellent results.

State-of-the-art computer vision models, typically convolutional neural networks (CNNs), require large volumes of labelled data and significant computing resources to train from scratch. In geoscientific applications, where high-quality labelled training data is often difficult to come by, this can pose a problem. Thankfully, there is a plethora of freely available models pre-trained on generic imagery data sets that we can adapt to our purposes. This approach is an example of transfer learning: applying knowledge gained from one machine learning problem to another, potentially unrelated, problem. In this blog post we’ll leverage transfer learning by using a generic pre-trained computer vision model to extract numerical information describing the textural characteristics of geophysical imagery.

Geophysical Data Sets

The data sets we’ll use for this demonstration include regional-scale isostatic residual Bouguer anomaly gravity data, as well as fairly recent compilations of airborne total magnetic intensity (TMI) and total count radiometric images covering northern and western Tasmania. The gravity data (figure 2) is sensitive to lateral variations in the density of the shallow crust and is an important data set for tin-tungsten exploration, as these deposits are genetically associated with low-density granite batholiths that manifest as prominent gravity lows in the image. Areas where these granites outcrop give rise to strong total count radiometric responses owing to their elevated radioactive K, U and Th content (figure 3). The TMI image (figure 4) is sensitive to the abundance of magnetic iron oxide minerals, which are often depleted in evolved granite bodies prospective for tin-tungsten, giving rise to low intensity, ‘quiet’ magnetic textures.

To assess the textures in these geophysical images with a generic computer vision model, we first need to combine them into a composite red-green-blue (RGB) image for ingestion into the model. In this instance we used a fairly simple and subjective process in which the images were overlaid with varying degrees of transparency in QGIS, arriving at the composite RGB image shown in figure 5. For extra texture, a vertical hillshade effect computed from the isostatic Bouguer anomaly was added to highlight strong gradients in the gravity image associated with the flanks of low-density granite bodies.
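The composite described above was built interactively in QGIS, but the same idea can be sketched programmatically. Below is a minimal numpy sketch, with random arrays standing in for the actual Tasmanian grids, that percentile-clips three co-registered grids and stacks them into an 8-bit RGB image.

```python
import numpy as np

def to_byte(band, lo_pct=2, hi_pct=98):
    """Percentile-clip a grid and rescale it to the 0-255 byte range."""
    lo, hi = np.nanpercentile(band, [lo_pct, hi_pct])
    scaled = np.clip((band - lo) / (hi - lo), 0, 1)
    return (scaled * 255).astype(np.uint8)

# Stand-ins for co-registered gravity, radiometric and magnetic grids.
rng = np.random.default_rng(0)
gravity = rng.normal(size=(256, 256))
radiometrics = rng.normal(size=(256, 256))
magnetics = rng.normal(size=(256, 256))

# Stack the rescaled bands into a single (height, width, 3) RGB image.
rgb = np.dstack([to_byte(gravity), to_byte(radiometrics), to_byte(magnetics)])
print(rgb.shape, rgb.dtype)  # (256, 256, 3) uint8
```

Percentile clipping is one simple choice of stretch; the real composite also blended layer transparencies and a hillshade, which this sketch omits.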

Figure 2: Isostatic residual Bouguer Anomaly from Geoscience Australia’s Gravmap2016 national geophysical compilation extracted via the Geophysical Archive Data Delivery System (GADDS).

Figure 3: Total count radiometrics image compiled from multiple generations of airborne surveys (pers. comm. Duffet 2022).

Figure 4: Compilation of total magnetic intensity from multiple generations of airborne surveys (pers. comm. Duffet 2022).

Figure 5: Composite RGB geophysical image resulting from the combination of gravity, radiometric and magnetic data. Note the tendency for outcropping granites with strong radiometric responses (green) to occur within low density regions (blues) with dark edges.

Figure 6: Composite RGB geophysical image with in-situ tin-tungsten mineral occurrence points (dots) and major tin-tungsten mines (stars) overlain. Points sourced from Mineral Resources Tasmania’s compilation of mineral deposits.

Image Tiling & Feature Extraction

Assessing geophysical textures within the composite RGB image requires choosing a scale of investigation. This depends on two things: the resolution of the available data and the scale of the geological structures being investigated. In this example we are trying to assess textures in the geophysics at a mine camp scale, and have opted to assess textures at a 4 km scale, which is about five times the resolution of the gravity data and 100 times that of the magnetic and radiometric data.

To extract texture information we employ a pre-trained EfficientNetV2 model from Ross Wightman’s timm library of image models implemented in PyTorch. Here, 4 km square tiles from the RGB composite image are passed through the model and a 512-element feature vector is extracted for each tile (figure 7). Feature vectors, or embeddings, are numerical representations of abstract textural information in the image learnt by the neural network. Since the pre-trained model we are using was trained on generic imagery including, say, cats and dogs, it may be that some components of the feature vector represent the dog-ness of the image, or the cat whisker-ness. These details are irrelevant, as we are mainly concerned with extracting numerical information that consistently describes the textural content of the image, and since all we are doing is running inference with a pre-trained model, the computational cost of doing so is trivial.

Figure 7: Illustration of the feature extraction workflow.

Once feature extraction is complete, we have a table of points representing the centroids of the 4 km image tiles along with their corresponding feature vectors. With this information we can start comparing the textural content of different regions of the composite RGB image numerically. Of particular interest to us is the degree of textural similarity between different regions of the map and image tiles containing known tin-tungsten mineralisation. Textural similarity can be quantified using a similarity metric, in this case cosine similarity, computed between image tiles in feature vector space. An illustration of this metric in an arbitrary 3-dimensional space for points P1 and P2 is shown in figure 8.

Figure 8: 3-dimensional example of the Euclidean distance and cosine angle metrics used to define closeness of points P1 and P2.
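With one embedding per tile, cosine similarity and a top-k ranking of tiles against a query tile reduce to a few lines of numpy. The array names and sizes here are illustrative.

```python
import numpy as np

def cosine_similarity(query, embeddings):
    """Cosine of the angle between a query vector and each row of an
    (n_tiles, dim) embedding matrix."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return e @ q

rng = np.random.default_rng(1)
embeddings = rng.normal(size=(500, 512))  # one 512-d embedding per tile
query = embeddings[42]                    # e.g. the tile over a known deposit

sims = cosine_similarity(query, embeddings)
top = np.argsort(sims)[::-1][:36]         # query tile plus 35 most similar
print(top[0])  # 42: the query tile is perfectly similar to itself
```

Because cosine similarity depends only on the angle between vectors, it ignores differences in embedding magnitude, which is one reason it is a common choice for comparing neural network embeddings.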

Similarity to Royal George Tin-Tungsten Deposit

The Royal George tin-tungsten deposit is a mined-out granite greisen deposit in the east of the study area (figure 6). Figure 9 below presents the image tile surrounding the Royal George deposit along with the 35 most similar-looking tiles based on cosine similarity in feature space. We can surmise from the textural consistency across this plot that EfficientNetV2 has done a reasonable job of extracting meaningful textural information for our similarity calculations.

Figure 9: Grid of 35 of the most similar tiles from the RGB composite image to the tile centred on Royal George (top left). A mottled green texture representing strong total count radiometrics from outcropping evolved granites seems to be key here.

Since we have the centroids for all image tiles we can visualise the spatial distribution of the textural similarity to the Royal George tile across the entire study area, as shown in figures 10 and 11. This seems to pick out areas with moderate to strong total count radiometrics (green) on the flanks of deep gravity lows representing the edges of granite batholiths.

Figure 10. Spatial distribution of textural similarity to the Royal George Sn-W deposit.

Figure 11. Zoomed in views from Western Tasmania of the RGB composite image (left) and the Royal George textural similarity map (right). Note that areas around the edges of outcropping granites identified by high total count radiometrics tend to be texturally similar to the Royal George tile.

Summary

This post presents a simple example of transfer learning applied to the assessment of complex textural information contained within geophysical imagery. We may not have found the next Tasmanian tin-tungsten deposit yet, but we have demonstrated a scalable workflow that is particularly useful for similarity searching within large, multiband image data sets. Though not demonstrated here, variations on this workflow can be used to identify anomalous textures, or even to run unsupervised textural domaining exercises that arrive at data-driven geophysical domains.

Computer vision is a rapidly advancing field with direct applications to the geosciences. Whether it’s interpreting textures in geophysical images, photos of drill core or thin sections under the microscope, computer vision algorithms provide a powerful, scalable and consistent way of augmenting our analysis, giving geoscientists a new tool with which to interrogate their data sets.