Created by Yasin Dagasan.
In geology, mapping and effective identification of lithologies is a basic requirement for almost any geological problem. But sometimes this can be complicated by too many samples, not enough resources, a need for consistency or a requirement for rapid turnaround of information. This is where machine learning algorithms provide the invaluable advantage of automation, especially for work that involves a substantial amount of repetitive manual effort. At Datarock we specialise in the use of machine learning and computer vision to deliver rapid, data-driven insights to geoscientists. In this article, our goal is to illustrate a recent example of using a supervised machine learning approach to accurately detect and characterise mineral grains within rocks, for the purpose of automated rock identification.
This project was the result of a collaborative endeavour between Datarock and Dr. Martin Jutzeler, from the Centre for Ore Deposit and Earth Sciences (CODES) at the University of Tasmania. The primary goal of the project was to identify distinct, coherent textures which effectively serve as ‘fingerprints’ to classify different rock types. Among these fingerprints, the main focus was on phenocrysts, which possess the potential to serve as characteristic identifiers in coherent and coarse clastic or igneous facies. Thus, quantifying the phenocryst content in volcanic rocks became a pivotal part of our project.
Given the labour-intensive nature of manually delineating the grains to obtain size information, the researchers wanted to develop a more automated, efficient solution. The graphic below offers a clear visualisation of two different basalt samples, each displaying unique grain size data. Our objective was to gauge the differences between these samples by utilising a crystal size quantification method. Working with the researchers, we developed a machine learning workflow to solve this particular problem. The study also effectively showcases the practical applications of machine learning in geological studies.
Approach
The developed workflow is composed of three distinct deep learning models, each serving a specific function. The first model is designed to extract scale or resolution information. The second model’s task is to identify rock boundaries and subsequently crop the image. The final model is employed to delineate crystal boundaries, enabling the extraction of various statistical data. An overview of the workflow can be seen in the figure below.
Automating scale measurements
To facilitate our study, we carefully selected appropriate rock and core samples. These were photographed alongside a scale bar, allowing us to translate the pixel-based crystal statistics into metric measurements. This strategy effectively circumvents issues arising from variations in camera types and the distances from which the photos are taken, factors which could potentially distort the scale.
To simplify the automated assignment of scale in each photo, expressed in millimetres per pixel (mm/px), we crafted a custom scale bar as depicted in the figure below. This bar features two distinctly coloured circles, one green and one yellow, designed to assist in the automatic extraction of the scale.
We employed a deep learning model, using Mask R-CNN, to identify these two circles. This segmentation model distinguishes between two classes, specifically the green and yellow circles. The output of this model is essentially a polygon that traces the periphery of each circle. For each circle, we pinpointed the centre and calculated the pixel distance between the two centres. Given that we know the physical distance between the circles to be 5 cm (50 mm), we were then able to determine the image resolution by dividing the distance in millimetres by the distance in pixels. This is illustrated in the figure below. The distance between the circle centres was 1539 pixels, resulting in a resolution of 50/1539 ≈ 0.0325 mm/px. Therefore, each pixel spans about 0.0325 mm, or 32.5 micrometres. This information is critical because all the size and shape information is extracted automatically in pixels using computer vision tools and needs to be converted into metric units.
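The scale calculation described above can be sketched in a few lines. This is a minimal illustration, assuming each circle prediction is available as an array of polygon vertices in pixel coordinates; the function names (`polygon_centroid`, `mm_per_pixel`, `circle_polygon`) and the synthetic circles are hypothetical and are not taken from the project code.

```python
import numpy as np

def polygon_centroid(points):
    """Area-weighted centroid of a simple polygon given as (N, 2) vertices."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    cross = x * y_next - x_next * y
    area = cross.sum() / 2.0
    cx = ((x + x_next) * cross).sum() / (6.0 * area)
    cy = ((y + y_next) * cross).sum() / (6.0 * area)
    return np.array([cx, cy])

def mm_per_pixel(green_polygon, yellow_polygon, known_distance_mm=50.0):
    """Resolution = known physical distance / pixel distance between centres."""
    centre_green = polygon_centroid(green_polygon)
    centre_yellow = polygon_centroid(yellow_polygon)
    distance_px = np.linalg.norm(centre_green - centre_yellow)
    return known_distance_mm / distance_px

def circle_polygon(cx, cy, radius, n=64):
    """Synthetic stand-in for a predicted circle outline (assumption)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([cx + radius * np.cos(t), cy + radius * np.sin(t)])

# Two circles whose centres are 1539 px apart, as in the worked example.
green = circle_polygon(200.0, 300.0, 40.0)
yellow = circle_polygon(1739.0, 300.0, 40.0)
scale = mm_per_pixel(green, yellow)
print(round(scale, 4))  # 0.0325
```

Using the polygon centroid rather than, say, the bounding-box centre keeps the estimate stable when the predicted outline is slightly irregular.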
Finding the rock boundaries
Identifying rock boundaries is critical for two reasons. Firstly, some grain statistics require the relative proportion of the grains to the whole rock sample. As such, determining the rock area is essential. Secondly, the image’s background elements, which are not part of the rock, can sometimes be incorrectly identified as crystals by the segmentation model. By cropping the rock along its boundaries and rendering non-rock regions a specific colour, we can reduce the chances of false positives created by the crystal segmentation model.
For this task, we again employed a model created using Mask R-CNN. Training the model involved labelling various images containing rock samples. Ultimately, the model successfully identified the rock boundaries within the given images. The image below demonstrates the input and output of the rock boundary model. Additionally, the rock sample’s image area was calculated during this process, which is then used to derive some of the grain statistics.
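As a rough sketch of the cropping step, assume the boundary model's output has already been rasterised into a boolean mask with the same height and width as the photo. The function name `isolate_rock`, the black fill colour and the toy image are illustrative assumptions, not details from the project.

```python
import numpy as np

def isolate_rock(image, rock_mask, mm_per_px, fill_colour=(0, 0, 0)):
    """Paint non-rock pixels a fixed colour, crop to the rock, return its area.

    Assumes rock_mask contains at least one True pixel.
    """
    mask = np.asarray(rock_mask, dtype=bool)
    out = image.copy()
    out[~mask] = fill_colour  # background can no longer mimic crystals

    # Tight bounding box around the rock.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    cropped = out[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Pixel count converted to square millimetres.
    area_mm2 = mask.sum() * mm_per_px ** 2
    return cropped, area_mm2

# Toy example: a 60 x 60 px "rock" inside a 100 x 120 px photo.
image = np.full((100, 120, 3), 200, dtype=np.uint8)
mask = np.zeros((100, 120), dtype=bool)
mask[20:80, 30:90] = True
cropped, area_mm2 = isolate_rock(image, mask, mm_per_px=0.5)
print(cropped.shape, area_mm2)
```

The returned area is what a statistic such as "crystal area as a fraction of rock area" would be normalised by.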
Mineral Crystal Detection and Delineation
After the rock-only image was acquired from the cropping model, this output served as the input for the crystal segmentation model. This model was trained specifically to detect and delineate the boundaries of select crystals, namely feldspar, hornblende, and pyroxene. The training data consisted of a diverse collection of rock samples, which included samples with different alteration overprints.
Each training image was meticulously labelled by the researchers at CODES, and the resulting label dataset was then used to train the crystal segmentation model. The images below show some examples of the assortment of rock types used during the training process.
Once the model was trained, it was able to detect and segment the trained crystals. Below we can see an example of the model predictions, where it has correctly segmented and classified the various pyroxene and feldspar phenocrysts.
Below are the predictions for a range of volcanic rocks found in mineralised terrains in Australia, specifically feldspar-rich basaltic andesites to dacites. This demonstrates how well the model performed on various rock types, textures and different resolution images.
Extracting Statistical Information from the Segmentation Model
Once we had determined individual crystal boundaries, we conducted an in-depth analysis of each phenocryst to derive an array of macro and micro measurements. Macro measurements essentially describe the external shape of the crystal, incorporating descriptors like perimeter, area, axis lengths, diameter, thickness, and so forth. We also calculated statistical lengths, such as Feret diameter. In total, each particle was defined by over 50 descriptors. We used a versatile Python package called imea (Kroell, 2021) to extract these statistics post-segmentation.
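To make a few of these descriptors concrete, the sketch below computes area, perimeter, maximum Feret diameter and area-equivalent diameter directly from one crystal polygon. It deliberately uses plain NumPy rather than imea, so it is not the project's implementation; the function name `crystal_descriptors` and the rectangular test crystal are assumptions for illustration.

```python
import numpy as np

def crystal_descriptors(polygon_px, mm_per_px):
    """A handful of macro shape descriptors for one crystal outline, in mm."""
    pts = np.asarray(polygon_px, dtype=float) * mm_per_px
    x, y = pts[:, 0], pts[:, 1]
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)

    # Shoelace formula for the enclosed area.
    area = 0.5 * abs(np.sum(x * y_next - x_next * y))
    # Sum of edge lengths.
    perimeter = np.sum(np.hypot(x_next - x, y_next - y))
    # Maximum Feret diameter: largest distance between any two vertices.
    diffs = pts[:, None, :] - pts[None, :, :]
    max_feret = np.sqrt((diffs ** 2).sum(axis=-1)).max()
    # Diameter of the circle with the same area.
    equivalent_diameter = np.sqrt(4.0 * area / np.pi)

    return {
        "area_mm2": area,
        "perimeter_mm": perimeter,
        "max_feret_mm": max_feret,
        "equivalent_diameter_mm": equivalent_diameter,
    }

# A 400 x 300 px rectangular crystal at 0.01 mm/px, i.e. 4 mm x 3 mm.
rectangle = [(0, 0), (400, 0), (400, 300), (0, 300)]
desc = crystal_descriptors(rectangle, mm_per_px=0.01)
print(desc)
```

For the rectangle above the maximum Feret diameter is its diagonal, 5 mm, which is a quick sanity check on the pixel-to-millimetre conversion.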
Using these statistics, the researchers could calculate specific discriminators, such as a crystal size distribution (CSD) for each sample, as a way to fingerprint and compare different samples. For example, in the figure below we can see the different CSDs (measured in log length in mm) for two intersected mafic bodies about 1 km apart. There is a clear difference between the CSDs of the two drill holes, suggesting that these two similar-looking units are actually independent coherent bodies.
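A simple way to picture such a comparison is a normalised histogram of log crystal length per sample. The sketch below assumes base-10 logarithms and made-up lengths and bin edges; the original study's exact CSD formulation is not described here, so treat this as an illustration only.

```python
import numpy as np

def crystal_size_distribution(lengths_mm, bin_edges_log10):
    """Fraction of crystals falling in each log10(length in mm) bin."""
    log_lengths = np.log10(np.asarray(lengths_mm, dtype=float))
    counts, _ = np.histogram(log_lengths, bins=bin_edges_log10)
    total = counts.sum()
    return counts / total if total else counts.astype(float)

# Hypothetical crystal lengths (mm) from two drill holes.
hole_a = [0.2, 0.3, 0.5, 2.0]
hole_b = [0.5, 1.5, 2.0, 4.0]

# Two bins: 0.1 to 1 mm and 1 to 10 mm.
edges = [-1.0, 0.0, 1.0]
csd_a = crystal_size_distribution(hole_a, edges)
csd_b = crystal_size_distribution(hole_b, edges)
print(csd_a, csd_b)  # [0.75 0.25] [0.25 0.75]
```

Sharing the same bin edges across samples is what makes the two distributions directly comparable.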
Bringing It All Together
With all components of the modelling process constructed, we were able to process each image with a single trigger of the model pipeline. This pipeline sequentially generates and consumes artefacts in conjunction with each model, ensuring a streamlined and efficient analysis. Using this automated approach, Jutzeler et al. (2021, 2022) were able to rapidly define and reconstruct the lithostratigraphy in mineralised volcanic systems. This approach, combined with additional information like geochemistry, can allow for rapid identification of units of interest in an exploration program and faster geological model building.
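The chaining of the three models can be pictured as follows. This is a structural sketch only: the `Pipeline` class, its field names and the stand-in lambdas are hypothetical, and the real models are replaced by trivial callables so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Pipeline:
    """Runs the three stages in order, passing each artefact to the next."""
    scale_model: Callable[[Any], float]        # photo -> mm per pixel
    boundary_model: Callable[[Any], Any]       # photo -> rock-only image
    crystal_model: Callable[[Any], List[Any]]  # rock-only image -> crystal polygons

    def run(self, image: Any) -> Dict[str, Any]:
        mm_per_px = self.scale_model(image)
        rock_image = self.boundary_model(image)
        crystals = self.crystal_model(rock_image)
        return {
            "mm_per_px": mm_per_px,
            "rock_image": rock_image,
            "crystals": crystals,
        }

# Stand-in callables (assumptions) in place of the trained models.
pipeline = Pipeline(
    scale_model=lambda img: 50.0 / 1539.0,
    boundary_model=lambda img: img,
    crystal_model=lambda rock: [[(0, 0), (10, 0), (10, 10), (0, 10)]],
)
result = pipeline.run("photo.jpg")
print(sorted(result))
```

Keeping each stage behind a plain callable means a single model can be retrained or swapped without touching the rest of the chain.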
Acknowledgements
Dr Martin Jutzeler is gratefully acknowledged for allowing us to release this blog post. All images reproduced in this blog were originally created by him as part of this study. For the full implications of the work, the reader is referred to the Jutzeler papers listed below.
References
Kroell, N. (2021). imea: A Python package for extracting 2D and 3D shape measurements from images. Journal of Open Source Software, 6(60), 3091. https://doi.org/10.21105/joss.03091
He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2961-2969).
Jutzeler, M., Dagasan, Y., Carey, R. (2022). Machine-learning image analysis on phenocrysts to reconstruct lithostratigraphy in mineralised terrains: An example with dacites in the Mt Read Volcanics. AusIMM annual meeting, Tullah, Tasmania.
Jutzeler, M., Dagasan, Y., Carey, R.J., Ila’ava, M. (2021). Quantification of crystal size distribution in volcanic rocks with machine-learning image analysis. AGU 2021 Fall Meeting, New Orleans, USA.