1
00:00:02,920 --> 00:00:07,080
This project is based on some work by a PhD student of mine, Pete Tuckett,

2
00:00:07,080 --> 00:00:09,480
and the rationale was that

3
00:00:09,480 --> 00:00:12,880
we've got two problems with mapping surface water on Antarctica,

4
00:00:12,880 --> 00:00:16,280
one being the scale of Antarctica and another being clouds.

5
00:00:16,280 --> 00:00:20,320
So, we wanted to develop a method that could solve both those problems.

6
00:00:20,320 --> 00:00:22,960
And so we used some open source software,

7
00:00:22,960 --> 00:00:26,000
Google Earth Engine, which allows us to process

8
00:00:26,000 --> 00:00:29,440
huge amounts of satellite data using Google's powerful servers.

9
00:00:29,440 --> 00:00:33,920
I think a key aim of the study was to produce a method that could be used and developed by others.

10
00:00:33,920 --> 00:00:37,800
So, we developed and applied it to one particular area of Antarctica

11
00:00:37,800 --> 00:00:41,280
with the overall aim of upscaling it to the whole of Antarctica.

12
00:00:41,280 --> 00:00:44,280
We really want it to be an approach that can be used by others

13
00:00:44,280 --> 00:00:45,240
and not just Antarctica.

14
00:00:45,240 --> 00:00:48,240
You could apply it to other ice masses like Greenland, for instance.

15
00:00:48,240 --> 00:00:53,600
We wanted it to be in the FAIR principles.

16
00:00:53,600 --> 00:00:57,160
Google Earth Engine is freely available

17
00:00:57,160 --> 00:00:59,440
so the code was written in Google Earth Engine.

18
00:00:59,440 --> 00:01:03,160
We made sure that we've got clear metadata and it was clearly commented

19
00:01:03,160 --> 00:01:04,600
for other people to use,

20
00:01:04,600 --> 00:01:06,960
take that data and use it for themselves.

21
00:01:06,960 --> 00:01:09,040
I don't think we suffered too much in the way of barriers.

22
00:01:09,040 --> 00:01:13,000
I think actually, in physical geography and in general glaciology

23
00:01:13,000 --> 00:01:17,000
there has been a big move to making data open and accessible.

24
00:01:17,000 --> 00:01:20,560
We published in The Cryosphere which is an open access journal.

25
00:01:20,560 --> 00:01:24,800
They have their own policy around making sure data is accessible.

26
00:01:24,800 --> 00:01:27,280
I would say that actually, we were almost encouraged to do so

27
00:01:27,280 --> 00:01:29,400
as part of the publishing process.

28
00:01:29,400 --> 00:01:33,800
Really, that’s what we wanted: to be able to make it so that others can make use of it.

29
00:01:33,800 --> 00:01:37,480
So in retrospect, yeah, I think we could have done

30
00:01:37,480 --> 00:01:39,440
some of the processing in other freely available software.

31
00:01:39,440 --> 00:01:43,360
So Google Earth Engine is where we produce the code to map surface water,

32
00:01:43,360 --> 00:01:47,320
then we applied post-processing to try and clean it up

33
00:01:47,320 --> 00:01:49,800
and extract various metrics of what we wanted.

34
00:01:49,800 --> 00:01:51,440
That was done in MATLAB.

35
00:01:51,440 --> 00:01:55,040
MATLAB does require a license, so it's not accessible to everyone.

36
00:01:55,040 --> 00:01:59,040
So I guess looking back, we could have also tried to do that

37
00:01:59,040 --> 00:02:02,040
in a language that could be used by all.

38
00:02:02,040 --> 00:02:06,800
I think FAIR principles are going to be fundamental to my research going forward.

39
00:02:06,800 --> 00:02:10,040
The FAIR principles are key to being able to produce

40
00:02:10,040 --> 00:02:14,680
or make our data and code available to as wide a range of people as possible.

41
00:02:14,680 --> 00:02:19,560
We want to build on what other people have done and tap into the big data that now exists.

42
00:02:19,560 --> 00:02:23,920
I think FAIR principles is something that should be relatively straightforward

43
00:02:23,920 --> 00:02:27,560
to do now, or is becoming increasingly accessible in itself

44
00:02:27,560 --> 00:02:30,600
just because of the push from journals to do it.

45
00:02:30,600 --> 00:02:34,720
You get help when you're publishing in terms of having publishable statements.

46
00:02:34,720 --> 00:02:39,720
There are also repositories that allow us to store data and generate DOIs associated with it,

47
00:02:39,720 --> 00:02:44,080
like the Sheffield repository that we've put our data in.

48
00:02:44,080 --> 00:02:47,920
My advice to other researchers interested in making their data FAIR

49
00:02:47,920 --> 00:02:50,440
would firstly be that there's lots of support out there.

50
00:02:50,440 --> 00:02:54,680
I think the second point is that I think there's been

51
00:02:54,680 --> 00:02:58,920
a clear step towards seeing the importance of FAIR data

52
00:02:58,920 --> 00:03:02,040
to that collaborative approach of generating knowledge.

53
00:03:02,040 --> 00:03:05,800
So, when I think of science, I think of it as little building blocks,

54
00:03:05,800 --> 00:03:09,960
building towards making these little advances which build towards other advances

55
00:03:09,960 --> 00:03:14,640
and being able to share your data allows other people to use it

56
00:03:14,640 --> 00:03:18,160
and build on it and take your code and apply it to different areas.

57
00:03:18,160 --> 00:03:21,640
So I think it's a fundamental part of that progress in science

58
00:03:21,640 --> 00:03:24,520
and allowing us to make progress in lots of different areas.