Are duplicates a problem?
This is one of the commonest fallacies that I come across. Duplicates are actually not a problem at all – having more data is better than having less, and duplicates are inherent to all biological data.
Consider the following scenario:
30 birdwatchers all stand watching a rare Bluethroat and all enter their records into their preferred database.
That immediately creates 30 records in the system, but only 1 bird was seen. Any report on how many Bluethroats there were would therefore count just 1 sighting for that species, at that location, on that day, deduplicating by ignoring the recorder’s name. On the flipside, a data analyst back at the records centre might be interested to know how many people took the trouble to travel to see the Bluethroat each day. They would want all of those records, and they’d count how many recorders saw the Bluethroat that year at roughly that location.
Some duplication happens by accident – consider this example:
There are 3 records in iRecord for Doros profuges, seen on the 18th June 2017, by 2 people in roughly the same location. I happen to know (because I was there!) that there was only 1 fly and 2 of us recorded it. My record was probably entered into my iRecord app on the day but another seems to have been entered by accident later using an estimated location. It was a simple mistake but it was better to have all records and not to risk losing that sighting. By the way, please don’t try to go to that site to refind this species because it’s inside the Main Impact Area on Salisbury Plain military training area and the ground is littered with unexploded mortars! Marc & I were granted access for a few years to survey the site but nobody is allowed on there now.
So, duplication is actually common and preferable to missing information and any deduplication (if necessary) is done by the person who is using the data. Records are hardly ever listed in their raw format and all analysts & researchers worth their salt will know to filter and deduplicate them in whichever way suits their purpose.
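To make the Bluethroat and Doros profuges examples concrete, here is a minimal sketch of how the same raw records can answer both questions. The site names, recorder names and the Bluethroat date are made up for illustration, and I’m assuming that "roughly the same location" has already been rounded to a single site name. Real systems will have their own ways of doing this.

```python
from collections import defaultdict

# Hypothetical raw records as (species, site, date, recorder) tuples.
# Site names, recorder names and the Bluethroat date are invented for this sketch.
bluethroat = [
    ("Bluethroat", "Site A", "2024-05-01", f"Recorder {i}") for i in range(1, 31)
]
doros = [
    ("Doros profuges", "Site B", "2017-06-18", "Recorder X"),
    ("Doros profuges", "Site B", "2017-06-18", "Recorder Y"),
    ("Doros profuges", "Site B", "2017-06-18", "Recorder Y"),  # accidental repeat entry
]
records = bluethroat + doros


def distinct_sightings(rows):
    """Deduplicate by ignoring the recorder: one entry per species, site and date."""
    return {(species, site, date) for species, site, date, _recorder in rows}


def recorders_per_sighting(rows):
    """Keep every record, but count the different recorders behind each sighting."""
    seen = defaultdict(set)
    for species, site, date, recorder in rows:
        seen[(species, site, date)].add(recorder)
    return {key: len(names) for key, names in seen.items()}
```

Nothing is deleted from `records` in either case. Each user of the data applies the filter that suits their question, which is why keeping all 33 raw records does no harm.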
Which platform should I use to log & save my records?
With biological recording gaining in popularity, there have been a lot of different apps and websites springing up to give everyone more choice. Each has its pros & cons but there are a few things to take into account when you choose what’s best for you:
- Data is most powerful and effective when lots of it is brought together and the biggest network for bringing data together is the National Biodiversity Network (NBN), which holds hundreds of millions of records dating back decades. Their NBN Atlas website allows anyone to map and look at the data to do their own research and make their own decisions. It makes sense to use a platform that feeds data seamlessly into this network and doesn’t need extra work to make the data compatible.
- It’s useful to have an app that you can use from your phone to record while you are on the move – something that will automatically know your position using the phone’s GPS receiver and can fill in most of the data for you – you just have to upload a photo and choose what you think the species is.
- On the flipside it’s also useful to have a good website where you can go to look at your records and maybe compare them with other people’s.
- It’s also useful if you have a comprehensive database of all the species you are likely to come across with names that are kept in sync with all the partner organisations, so that you know that data on each platform is equivalent and easily compared.
The main recording platforms that cover all types of UK wildlife are:
- iRecord: Run by the Biological Records Centre (BRC) in the UK, this platform is the main feeder network into the NBN Atlas. It has a network of expert data verifiers who can check whether your identification is correct and query any information that seems out of the ordinary. They have an app and it has recently incorporated some image recognition so you can just upload an image and it will tell you what it thinks it is. It uses the UK Species Inventory database of wildlife names, which is completely compatible with the NBN network. I would recommend this to the slightly more technical/expert user.
- iNaturalist: A slightly newer platform from the US which covers the whole world. Its original selling point was its world class image recognition. Nowadays this is getting commoner but their system is still probably the best because it was trained on a vast database of human-verified photos. It now feeds into the iRecord network and the data can then flow into the NBN. This has been developed with novices in mind and is super easy to use – I’d recommend this app to anyone who likes to take a photo of everything they see, but if you are using raw data without photos then iRecord is an easier/better route.
- BirdTrack: Produced by the British Trust for Ornithology (BTO), this is probably the biggest of a raft of specialist recording apps aimed at one group – in this case, birds. As long as their network feeds into the NBN directly or via iRecord then they are all good to use. This one has a particular community feel to it as you can quickly see what other people in your area are seeing. Data flows into the NBN and the naming is regularly synchronised with the UKSI.
- LivingRecord: This platform is primarily used by local records centres along the south coast. It isn’t compatible with the NBN network or with the UKSI, so data stored here has to be converted before it can be uploaded to iRecord or imported into the NBN network.
- MapMate: This was one of the first systems written to share data between contributors and was originally taken up by the moth-recording community. It gained a lot of support but has largely been superseded by more modern systems like iRecord. It isn’t UKSI compatible and the data needs to be worked on before it can be imported into the NBN network.
In the field I tend to use iNaturalist for anything I can’t identify – I take a photo of it and get the app to ID it for me and submit a record. If I know what I’m recording then I use the iRecord app to put records directly into the system. If I am at home and want to upload batches of raw data without photos, from the specimens I identify under the microscope, I put them into my collection database and then export a spreadsheet for uploading to iRecord.
Should I correct mistakes in other people’s data?
No, never interpret another person’s records or try to add value by using your own interpretation of what they saw – unless you make it absolutely clear that you have done so and you preserve the original record verbatim. The only exception might be where there is a clear typo but even then you should take advice because you might be unaware that the original data is correct.
This problem usually occurs when someone is entering documentary / paper records into a database. They see a species name like “greenbottle” and try to “improve” it using their own knowledge, putting something like “Lucilia sericata”, not knowing that the term can apply to a dozen similar species. They might see a location name and think that they can add a “better”, more accurate map reference, but the safest thing to do is to enter the data as-found and then contact the original recorder to clarify, if that’s possible. There have been cases where a person sees “Braemar” or “Braemore” and confuses the 2 very different localities – one is in Scotland while the other is in Hampshire! If the original recorder gives you additional information then that’s fine to add to the record.
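If you do need to note down your own interpretation, one way to keep it clearly separate from what the recorder actually wrote is sketched below. This is only an illustration. The class and field names are my own invention and don’t correspond to how iRecord or any other platform stores its data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interpretation:
    """A suggestion made by someone other than the original recorder (hypothetical structure)."""
    field_name: str
    suggested_value: str
    suggested_by: str
    confirmed_by_recorder: bool = False


@dataclass
class TranscribedRecord:
    """Holds the record exactly as found, with any interpretations kept alongside it."""
    verbatim_species: str
    verbatim_location: str
    interpretations: list = field(default_factory=list)

    def suggest(self, field_name, suggested_value, suggested_by, confirmed_by_recorder=False):
        # Append the suggestion; the verbatim fields are never overwritten.
        self.interpretations.append(
            Interpretation(field_name, suggested_value, suggested_by, confirmed_by_recorder)
        )


# Enter the paper record as-found, then log the guess separately.
record = TranscribedRecord(verbatim_species="greenbottle", verbatim_location="Braemore")
record.suggest("species", "Lucilia sericata", suggested_by="transcriber")
```

The original wording stays intact in `verbatim_species` and `verbatim_location`, and anyone using the data can see that the species suggestion came from the transcriber and has not been confirmed by the recorder.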