Facilitate Open Science Training for European Research

Focus on data management planning – helping researchers develop their plans
Martin Donnelly, Digital Curation Centre, University of Edinburgh

OVERVIEW
1. Data Management Plans and Planning
2. DMP in Horizon 2020 (pilot overview and specifics)
3. Support and resources

1. Data management plans and planning

1a. DMP overview
• Research funders often ask for a data management plan (DMP) to be submitted alongside grant applications and/or developed over the course of the research project. (Higher education institutions (HEIs) are increasingly asking their researchers to do this too…)
• A DMP is a short document detailing specifics of the data that will be created during a research project, together with information on how it can be accessed and utilised
• Data management planning is the process of planning, describing and communicating the activities carried out during the research lifecycle in order to…
  • Keep sensitive data safe
  • Maximise data's reuse potential
  • Support longer-term preservation
• Data management planning underpins and pulls together different strands of data management activities, often across multiple project partners

1b. Why plan?
• Planned activities stand a better chance of meeting their goals than unplanned ones
• Research is increasingly collaborative, interdisciplinary and international. Academia and industry may have different norms, and data management planning can enhance internal communication/coordination
• The human challenges in data management are often more difficult to meet than the technological ones, so communication is vital
• The process of planning can be more important than the plans created, although the plans themselves play a crucial role in communicating with future data re-users
• But the map is not the land, and you shouldn't mistake planning something for actually doing that thing…

1c. What does a data management plan look like?
Usually a brief statement defining:
• how data will be captured/created
• how it will be documented
• who will be able to access it
• where it will be stored
• how it will be backed up, and
• whether (and how) it will be shared and preserved long-term
• etc.

DMPs are often submitted as part of funding applications, but will be useful whenever researchers are creating (or reusing) data, especially where the research involves multiple partners, countries, etc…

1d. Roles and responsibilities
Who's involved? Remember that research data management (RDM) is a hybrid activity, involving multiple stakeholder groups…
• The principal investigator (usually ultimately responsible for data)
• Research assistants (may be more involved in day-to-day data management)
• The institution's funding office (may have a compliance role)
• Library/IT/Legal (the library may issue persistent identifiers (PIDs), or liaise with an external service that does this, e.g. DataCite)
• Partners based in other institutions
• Commercial partners
• etc.

2. DMP in Horizon 2020

2a. DMP in Europe: overview
• Horizon 2020 includes a data management (planning) pilot
  • http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• Proposals covered
  • "Innovation actions" and "Research and innovation actions"
• DMP contents
  • Data types; Standards used; Sharing/making available; Curation and preservation
• Multi-phase approach
  • Initial DMP due within first 6 months
  • Mid-term DMP
  • Final review stage DMP
• There are opt-out conditions. A detailed description and scope of the Open Research Data Pilot requirements is provided on the Participants' Portal

2b. The Horizon 2020 DMP pilot

DMP v1: Within Six Months
For each data set specify the following:
• Data set reference and name
• Data set description
• Standards and metadata
• Data sharing
• Archiving and preservation (including storage and backup)
[Image: .docx output from DMPonline]

DMP v2 and v3: Mid-Term and Final Reviews
Scientific research data should be easily:
1. Discoverable
   • Are the data and associated software produced and/or used in the project discoverable (and readily located), identifiable by means of a standard identification mechanism (e.g. Digital Object Identifier)?
2. Accessible
   • Are the data and associated software produced and/or used in the project accessible and in what modalities, scope, licenses?
3. Assessable and intelligible
   • Are the data and associated software produced and/or used in the project assessable for and intelligible to third parties in contexts such as scientific scrutiny and peer review?
4. Usable beyond the original purpose for which it was collected
   • Are the data and associated software produced and/or used in the project usable by third parties even a long time after the collection of the data?
5. Interoperable to specific quality standards
   • Are the data and associated software produced and/or used in the project interoperable, allowing data exchange between researchers, institutions, organisations, countries, etc.?

3. Support and resources

3a. Useful resources
• DCC Guidance, e.g. "How-To Develop a Data Management and Sharing Plan"
• DCC Checklist for a Data Management Plan: http://www.dcc.ac.uk/resources/data-management-plans/checklist
• Links to all DCC DMP resources via http://www.dcc.ac.uk/resources/data-management-plans
• Non-DCC guidance, esp. UKDA, NERC, other universities (Oxford, Edinburgh, Stanford are all very good)
• DMPonline (and US DMPTool)

3b. DMPonline
• Helps researchers write DMPs
• Provides funder questions and guidance
• Includes a template DMP for Horizon 2020
• Provides help from universities
• Examples and suggested answers
• Free to use
• Mature (v1 launched April 2010)
• Code is Open Source (on GitHub)
https://dmponline.dcc.ac.uk

Institutions can customise the tool by…
• Adding templates
• Adding custom guidance
• Providing example or suggested answers
• Monitoring usage within their organisation
• Offering non-English language versions

More information
Customising DMPonline
• www.dcc.ac.uk/news/customising-dmponline-admin-interface-launches
• http://www.screenr.com/PJHN
Get the code, amend it, run a local instance, flag issues, request features...
• https://github.com/DigitalCurationCentre/DMPonline_v4

3c. Helping researchers
• There are lots of data management plans available on the Web. The DCC provides links to a number of DMPs via http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
• Be wary of leaning too heavily on these. There is no magic bullet, and no one-size-fits-all solution
• Much of the benefit of data management planning lies in the process of planning, above and beyond the plans produced at the end of the process
• "Benchmarking" with successful plans from other projects/disciplines is not a substitute for thinking things through and communicating them well
• The communication involved is twofold: within the contemporary project team (including project partners and funders/institutions), and with future data re-users…

THANK YOU
Martin Donnelly
Digital Curation Centre
University of Edinburgh
martin.donnelly@ed.ac.uk
Twitter: @mkdDCC
www.dcc.ac.uk
www.fosteropenscience.eu

Image credits: slides 3 and 6, https://www.flickr.com/photos/jayd/; slide 13, https://twitter.com/AFC_Fisher/