Processing done on Geograph Images
Published: 8 August 2017
Processing done by Geograph
Note: although Geograph itself runs the images through each algorithm, the algorithms themselves are generally created by a third party!
Term Extraction
We use various online APIs (such as the Yahoo Term Extraction API) to extract 'key' terms from the contributor-supplied textual description. Each term acts as a sort of tag, so could allow faceted browsing.
We've run the vast majority of Geograph images through one API or another (the API used has changed as availability has changed).
Data is available via: data.geograph.org.uk/dumps/ and is used in a few places on the Geograph website.
Example: Link
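As a rough illustration of what term extraction does, the sketch below picks the most frequent non-trivial words out of a description. It is not the Yahoo API or any other service Geograph has used; the function name `extract_terms`, the tiny stopword list and the sample description are all made up for this example, and real services use much more sophisticated language processing.

```python
import re
from collections import Counter

# A deliberately tiny stopword list (an assumption for this sketch;
# real term extractors use far larger lists and phrase detection).
STOPWORDS = {
    "a", "an", "and", "the", "of", "in", "on", "at", "to", "from",
    "is", "was", "with", "by", "for", "this", "that", "it", "as", "near",
}


def extract_terms(description: str, max_terms: int = 5) -> list[str]:
    """Return the most frequent non-stopword words in a description."""
    # Lower-case the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", description.lower())
    # Count every word that is not a stopword and is longer than two letters.
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [term for term, _ in counts.most_common(max_terms)]


# Hypothetical contributor description.
description = (
    "The old stone bridge over the river. "
    "The bridge carries the lane across the river near the mill."
)
terms = extract_terms(description, max_terms=2)
print(terms)
```

The extracted terms can then be stored against the image and used like tags, which is what makes faceted browsing possible.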
Cluster Labels
Using the Carrot2 clustering engine, cluster labels have been assigned to a majority of images. This can in some cases pick up associations not actually in the original metadata.
Data is available via: data.geograph.org.uk/dumps/ and is used in a few places on the Geograph website.
Example: Link
And see the 'Automatic Clusters' sidebar in Link
Computer Vision API
A pre-trained Artificial Intelligence model is used to predict labels from the image itself (not from any text linked with the image).
We've run most Geograph images through the system, enough to experiment with the data. See:
Link
Land Cover
Land cover describes the physical material on the surface of the country, for example grassland, woodland, rivers and lakes, and artificial materials such as roads and buildings.
A computer algorithm predicts the land cover by looking at satellite data (imagery etc.); we then classify images by land cover.
For the moment see: Link
Link Checking and Web-Archiving
We have extracted all links from all image descriptions, and check whether they are still valid. At the same time we check whether the pages referred to are available in any online web archive, so that if a page ever disappears we can link to the archived version instead. If a page is not found in an archive, we attempt to ask for it to be archived!
See: Checking External Links
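The steps above (pull links out of a description, then look for an archived copy) can be sketched roughly as follows. This is only an illustration, not Geograph's actual code: the function names are invented, the URL pattern is a simplification, and the choice of the Internet Archive's Wayback Machine availability endpoint is an assumption, since the text does not say which archives are consulted. The sketch only builds the query URL and reads a response; it does not make any network request itself.

```python
import re
from urllib.parse import urlencode

# Simplified pattern for http(s) links in free text (an assumption;
# real descriptions may need more careful handling).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_links(description):
    """Return the distinct links found in a description, in order."""
    links = []
    for match in URL_PATTERN.findall(description):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in links:
            links.append(url)
    return links


def availability_query(url):
    """Build a Wayback Machine availability query for one link."""
    return "https://archive.org/wayback/available?" + urlencode({"url": url})


def archived_copy(response):
    """Return the archived URL from a decoded JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hypothetical description containing two distinct links.
text = ("See http://example.com/page. Also (https://example.org/a?b=1), "
        "and http://example.com/page again")
for link in extract_links(text):
    print(link, "->", availability_query(link))
```

When `archived_copy` returns `None`, that is the point at which a request to archive the page would be made.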
Nearest Placename
We look up the nearest settlement to the image by consulting various gazetteers (generally the best we have available!). This allows us to show a 'near placename' on the photo page. Most gazetteers also list a county/country for the place, so this gives an approximate county/country for the image too (but it is not exact).
Processing done by Third Parties
Scenic predictions
A sample of some 200,000 images around London has been processed to predict scenicness. The model was trained with data from Link, which rated a different selection of images via an online game.
Academic paper:
Seresinhe CI, Preis T, Moat HS (2017) Using deep learning to quantify the beauty of outdoor places. Royal Society Open Science 4(7): 170170. Link
Additionally, please cite the Dryad data package:
Output data:
Seresinhe CI, Preis T, Moat HS (2017) Data from: Using deep learning to quantify the beauty of outdoor places. Dryad Digital Repository. Link