In a previous post I introduced the georeferencing work I was doing, with a video of me georeferencing Porter and Moss’ map of the Cemetery F mastaba field at Abu Rawash. Here I delve into the process a little more, with help from a further video, which shows me georeferencing a more difficult map of Abu Rawash using ArcGIS basemap World Imagery.
The map I am georeferencing in those videos is map I in Porter and Moss’ (1932) Topographical Bibliography volume IIIi. It shows the entire area of Abu Rawash, from the pyramid of Djedefre to the north-west cemetery in Wadi Qaren and the village of Abu Rawash on the edge of the cultivation.
Accuracy and precision
When working through the georeferencing process, it’s important to carefully consider the precision and accuracy you are aiming for, taking into account likely distortions in the image to be georeferenced; the resolution and accuracy of the data (the satellite imagery) you are georeferencing with; and the ultimate function of the georeferenced image.
All data, even GCP collected on site with differential GPS, have some level of error in them. What matters is that they are sufficiently accurate and precise for the task you have in mind. Accuracy and precision are also different. Accuracy refers to whether something is correct. In other words, is the map you are georeferencing in the right place? Precision is easiest to think of as resolution or the level of detail. Something can be very precise but very inaccurate, or something can be very accurate and very imprecise. It is accurate to say I live in the United Kingdom, but it is not very precise. It is precise to say I live in N7 6RY (a postcode in Finsbury Park, London), but that is not accurate: I do not live there any more.
The accuracy of the georeferenced image can be affected by drafting or composing errors in the original map, or by distortions introduced during printing or digitising. Such inaccuracies and distortions affect how well the map can be overlaid on the satellite imagery. The satellite imagery can also contain distortions. Both gross and obvious distortions, like the kinks in the Deir el-Bahri temples I published in a previous post, and distortions due to the angle of the satellite to the ground affect the georeferencing process.
As with accuracy, the precision of the georeferenced image is affected by the precision of the map we are georeferencing, and the resolution of the satellite imagery. If the map is sketchily drawn or details are missing this may make it more difficult to locate precisely. The resolution of the satellite imagery (or any other ground control data) also affects the precision of the georeferenced image. Georeferencing requires that we line up the map with the same features in the satellite image. The more precisely a feature appears in the satellite image, the more precisely we can locate the map we are georeferencing. The pixel size (resolution) of the satellite imagery therefore places considerable limits upon the precision of the georeferenced image.
Locating the archaeological features at Abu Rawash
It feels like it ought to be easy to identify the pyramid of Abu Rawash and associated cemeteries. After all, pyramids are not known for their discretion, particularly once they’ve been excavated. While the pyramid of Abu Rawash is pretty clear in the satellite imagery, the huge amount of quarrying and agricultural and housing development in the Abu Rawash area made it difficult to identify the geographical and archaeological features in Porter and Moss’ 1932 map. The Survey of Egypt 1:25,000 scale map of 1942 shows how little development had taken place 80 years ago.
In contrast the modern satellite image shows the pyramid and cemeteries as islands of archaeological landscape in a highly developed area.
This made it more difficult to relate Porter and Moss’ map to the satellite image, although the preservation of the pyramid and the cemeteries did help (see from 0.35 minutes in the video).
Scaling the map for georeferencing
Once we identify the area, we need to scale the map to fit that area in the satellite imagery. You can see me undertaking this task from 1.10 minutes in the video. During the process, I discovered an inaccuracy in Porter and Moss’ map: the pyramid of Djedefre had clearly been drawn at a different scale to the rest of the image. Drafting and composing inaccuracies of this type are frustrating but do occur with historic imagery. Other sources of inaccuracies include distortions introduced during the scaling of maps for publication and when publications are scanned or photographed to generate digital images. Photography is particularly problematic as the camera lens needs to be parallel to the image to avoid distortion, but even scanning can produce minor inaccuracies. Dealing with these inaccuracies often means adjusting the georeferencing, splitting an image or ignoring part of the map during the georeferencing process. In this case I chose to georeference the map while ignoring the pyramid, and subsequently cropped out the pyramid and georeferenced it separately.
Adding control points (GCP)
Once Porter and Moss’ map had been scaled to approximately the right scale, it was aligned more precisely to the satellite imagery using ground control points (GCP). These appear in ArcGIS as ‘Links’, and operate essentially as pins. You select a point in the map and then select the same point in the satellite image and ‘pin’ them together. You can see me undertake this process from 7.20 minutes in the video. Ideally, it would be possible to locate these links very precisely in each set of data, but in this example, we are constrained by the contents of Porter and Moss’ map, which does not include many clear points that can be related precisely to points in the satellite imagery. The resolution of the satellite imagery is also a factor. The ArcGIS Basemap World Imagery uses a variety of satellite imagery sources, but the highest resolution of any commercial satellite imagery is currently c. 30cm and much of the imagery is likely to have a resolution of c. 40-50cm or more. This means that the pixels of the satellite image represent 40-50cm on the ground. Any feature smaller than that is invisible, and features that are only slightly larger are difficult to identify. Another effect of the satellite imagery resolution is that when we zoom in close the satellite imagery appears blurry, and a point becomes more difficult to locate than when zoomed out (you can see the effect of this from 8.45 in the video).
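The effect of pixel size on what can be identified can be put into rough numbers. The snippet below is only an illustrative sketch: the function name and the feature sizes are hypothetical, and the 0.5m pixel size is simply the upper end of the range quoted above.

```python
def pixels_across(feature_size_m, pixel_size_m):
    """Approximate number of image pixels spanning a feature on the ground."""
    if feature_size_m < 0 or pixel_size_m <= 0:
        raise ValueError("feature size must be >= 0 and pixel size must be > 0")
    return feature_size_m / pixel_size_m


# Hypothetical feature widths in metres, checked against 0.5m pixels.
for width_m in (0.3, 1.0, 10.0):
    print(width_m, "m feature spans about", pixels_across(width_m, 0.5), "pixels")
```

A feature that spans less than one pixel cannot be resolved at all, and one that spans only two or three pixels gives very little to pin a link to.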
Root Mean Square Error (RMSE)
Once we have added four links in ArcGIS, we can open the link table and see an RMSE for the entire map in the top box and the residuals for each point in the right column of the table. Turning off links or adding new links will alter the position of the map and the RMSE accordingly (from 10.30 in the video). The RMSE represents ArcGIS’ calculation of the fit between the actual and desired link positions (Conolly and Lake 2006, 82-83). In simple terms ArcGIS uses the first three links to estimate where it thinks any further links should be. It then calculates the residual for each point as the difference between where you placed a link and where the map ended up based on the other links that have already been placed. The RMSE combines all the residuals into a single figure: it is the square root of the mean of the squared residuals. Although RMSE is useful, it’s important to recognise that it is reliant upon the accuracy and precision of the map and the satellite imagery. If there are inaccuracies in either, they will increase the RMSE. It is also reliant upon the locations and positioning of the points you choose. The old adage of ‘junk in, junk out’ definitely applies and it is entirely possible to have a low RMSE and a very inaccurate and imprecisely positioned map. So while you can reduce your RMSE by removing links with high residuals and adding new links, it is sometimes better to accept a higher RMSE and keep an important link, recognising that the higher RMSE is due to inaccuracies in the map. Alternatively, it may be necessary to choose which ground control points you believe are more accurate and only link to them.
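To make the relationship between links, residuals and RMSE concrete, here is a minimal Python sketch. It is not ArcGIS’ own code: it assumes a first-order (affine) transformation fitted to all the links at once by least squares, the function names are hypothetical, and the link coordinates are invented for illustration.

```python
import math


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("need at least three links that are not in a straight line")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        rest = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - rest) / a[r][r]
    return x


def fit_affine(links):
    """Least-squares affine fit. links is a list of ((map_x, map_y), (ground_x, ground_y))."""
    rows = [(mx, my, 1.0) for (mx, my), _ in links]
    # Normal equations: (A^T A) coefficients = A^T target, solved once for x and once for y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * g[0] for r, (_, g) in zip(rows, links)) for i in range(3)]
    aty = [sum(r[i] * g[1] for r, (_, g) in zip(rows, links)) for i in range(3)]
    return solve3(ata, atx), solve3(ata, aty)


def residuals_and_rmse(links, coef_x, coef_y):
    """Residual = distance between where the fit puts a link and where you placed it."""
    residuals = []
    for (mx, my), (gx, gy) in links:
        px = coef_x[0] * mx + coef_x[1] * my + coef_x[2]
        py = coef_y[0] * mx + coef_y[1] * my + coef_y[2]
        residuals.append(math.hypot(px - gx, py - gy))
    # RMSE: square root of the mean of the squared residuals.
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return residuals, rmse


# Invented links: map coordinates in pixels, ground coordinates in metres.
links = [
    ((0, 0), (1000, 2000)),
    ((100, 0), (1900, 2000)),
    ((0, 100), (1000, 1100)),
    ((100, 100), (1904, 1100)),  # deliberately a few metres off
]
coef_x, coef_y = fit_affine(links)
residuals, rmse = residuals_and_rmse(links, coef_x, coef_y)
print("residuals (m):", residuals)
print("RMSE (m):", rmse)
```

With exactly three links an affine fit passes through every point, so every residual is zero; residuals only become informative from the fourth link onwards, which is why the RMSE is worth reading only once four links are in place.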
We aim for an error of less than 1:3000, so for an original image at a scale of 1:15000 an RMSE of under 5m (i.e. 15000/3000) is ideal (Conolly and Lake 2006, 82-83). Ideally we would use the scale given in the original image, but Porter and Moss do not include scale information so we have to work with the scale we established during georeferencing. When I scaled this map I settled on a scale of 1:9000, so any RMSE under 3m would be very acceptable. Here our RMSE is slightly above 3m, which is not unreasonable given the inaccuracies in the map and the difficulty of locating very precise control points due to the resolution of the satellite imagery and changes to the landscape. I subsequently repeated the georeferencing and obtained an RMSE of 2.88m, but reducing the RMSE by a large amount is not always possible, depending on the scale and accuracy of the map and the resolution of the satellite imagery. The map of Cemetery F, for example, was at a scale of just over 1:500, meaning its RMSE should be 0.16m or under, but I was only able to get it to 0.3m. Nevertheless, under the circumstances that is acceptable because of the resolution of the satellite imagery, which makes it impossible to place a point more precisely than within 0.3m. This is compounded by the imprecise edges of certain archaeological features in the satellite imagery, such as the mastabas of Cemetery F or the satellite pyramid of Djedefre, and any inaccuracies or distortions in the maps. In such cases it is important to be aware of known inaccuracies and distortions in the map and satellite imagery, or you can be driven to distraction trying to get inaccurately positioned features to line up.
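The rule of thumb above is simple arithmetic, and it can be useful to check it against the pixel size of the imagery at the same time. The following is a small sketch under stated assumptions: the function names are hypothetical, the RMSE is taken to be in metres, and the pixel sizes passed in are illustrative rather than measured.

```python
def target_rmse_m(scale_denominator, ratio=3000):
    """Target RMSE in metres for a map at 1:scale_denominator, using the 1:3000 rule."""
    if scale_denominator <= 0:
        raise ValueError("scale denominator must be positive")
    return scale_denominator / ratio


def assess(rmse_m, scale_denominator, pixel_size_m):
    """Compare an achieved RMSE with the target and flag targets finer than the imagery."""
    target = target_rmse_m(scale_denominator)
    return {
        "target_m": target,
        "meets_target": rmse_m <= target,
        # If the target is smaller than a pixel, the imagery cannot support it.
        "target_below_pixel_size": target < pixel_size_m,
    }


print(assess(2.88, 9000, 0.5))  # the repeated Abu Rawash georeferencing
print(assess(0.3, 500, 0.3))    # a map at roughly 1:500, like Cemetery F
```

When the target is smaller than the pixel size, as with a map at around 1:500, missing the target says more about the imagery than about the quality of the georeferencing.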
Ideally, if the RMSE is too high and cannot be reduced, we would seek an alternative source of data, but such data does not exist for some of these sites. In those cases it is much better to have a slightly less than ideally georeferenced map than none at all. It is also important to be aware of the purpose of your georeferenced map. In this case the relatively modest aim was to locate archaeological features to within 10m, which is achievable with the accuracy of the maps and the resolution of the satellite imagery.
Overall I was satisfied with the georeferencing of the Abu Rawash map. It was a very difficult map to georeference: hard to locate due to the changes to the landscape, difficult to scale due to the inaccuracy in the pyramid, and short of features precise enough to use as GCP links. Nevertheless, the final georeferenced version gives useful insight into the archaeological landscape. With careful thought and reference to the underlying satellite image, it will be possible to locate any relevant archaeological features during the rest of the project.
Acknowledgements and References
Conolly, J. and Lake, M. 2006. Geographical Information Systems in Archaeology. Cambridge.
Porter, B. and Moss, R. 1932. Topographical Bibliography of Ancient Egyptian Hieroglyphic Texts, Reliefs, and Paintings III: Memphis 1. Abu Rawash to Abusir. Oxford.
Maps and images throughout this blog post were created using ArcGIS® software by Esri. ArcGIS® and ArcMap™ are the intellectual property of Esri and are used herein under license. Copyright © Esri. All rights reserved. For more information about Esri® software, please visit http://www.esri.com.
All the satellite imagery used is ArcGIS World Imagery. Sources: Esri, DigitalGlobe, GeoEye, i-cubed, USDA FSA, USGS, AEX, Getmapping, Aerogrid, IGN, IGP, swisstopo, and the GIS User Community.