Questions about Research Data Management – and some answers
Marjan Grootveld, DANS
@MarjanGrootveld @openaire_eu

Where do we go?
• What’s OpenAIRE?
• EC’s Open Research Data Policy
• Questions and answers about RDM and DM Planning
• And apologies for the unbalanced number of Dutch examples

WHAT IS OPENAIRE?

A “dual core” eInfrastructure for Open Scholarship
• Human Network
• Digital Network
Fosters the social and technical links that enable Open Science in Europe and beyond

OpenAIRE
• 50 partners from every EU country, and beyond
• In 24/7 operation since 2010
• 4 project phases to date
• Outlook: OpenAIRE Advance starts in 2018

OpenAIRE support materials
• Briefing papers, factsheets, webinars, workshops, FAQs
• Information on:
  • Open Research Data Pilot
  • Creating a data management plan
  • Selecting a data repository
  • Personal data
https://www.openaire.eu/what-is-the-open-research-data-pilot
https://www.openaire.eu/support (factsheets)
https://www.openaire.eu/webinars/
https://www.openaire.eu/briefpaper-rdm-infonoads

www.openaire.eu

EC’S OPEN RESEARCH DATA POLICY

Voluntary opt-ins: 81 + 953 projects
Source: Daniel Spichtinger, European Commission DG RTD, Unit A.6. – October 11, 2017

EC FAIR data
EC in the Guidelines: “This template is not intended as a strict technical implementation of the FAIR principles, it is rather inspired by FAIR as a general concept (…) without suggesting any specific technology, standard, or implementation solution”
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf

The EC Open Research Data policy: key sources of information
• Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020
  http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• Guidelines on FAIR Data Management in Horizon 2020
  http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• Infographic summarising key policy points
  http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf
• Open Access and Data Management
  http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-dissemination_en.htm
• Version 5.0 of the Multi-Beneficiary General Model Grant Agreement (18-10-2017)
  http://ec.europa.eu/research/participants/data/ref/h2020/mga/gga/h2020-mga-gga-multi_en.pdf

SUBMITTED QUESTIONS

Many thanks for submitting your questions!
• “Funder processes”
  • What is required when we opt out (on a H2020 project starting in 2017)?
  • How qualified are reviewers?
  • Do funders check that action points in plans are carried out? (DMP compliance)
• Support for DM Planning
  • How are librarians involved in producing/collaborating on DMPs? And in getting good funding?
  • How can we calculate the impact of RDM on institutional budgets?
• Storing and sharing data
  • How can we be sure that data we collect is real, and how to process data without altering it?
  • How can we manage research data in repositories?
• Legal and ethical questions
  • How can we deal with IP issues, especially in the case of project partners from the private sector?
  • GDPR: should agreement on long-term preservation and sharing of data be built into the consent process?
  • How is anonymisation implemented in text analysis and in large surveys in social sciences? (Education researcher dealing with datasets containing students’ opinions and personal information.)
• Which online course will teach my team to manage the project’s data coherently, reproducible and open access?

1. Funder processes – opting out
• Opting out is possible at all stages: grant application, project start, during the project.
• Consider whether a partial opt-out is possible: it is preferable to a full opt-out.
• “During the lifetime of a project, a total opt-out is possible for any of the reasons highlighted above. In this case, Article 29.3 is removed from the Grant Agreement via an amendment.” (1)
• Justify opting out in the grant proposal or the DMP.
• Inform your project officer.
• Adapt your DMP.
• “A DMP is required for all projects participating in the extended ORD pilot, unless they opt out of the ORD pilot. However, projects that opt out are still encouraged to submit a DMP on a voluntary basis.” (2)
(1) http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/open-access_en.htm
(2) http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/data-management_en.htm

Reasons for opting out
• Participation is incompatible with the Horizon 2020 obligation to protect results that can reasonably be expected to be commercially or industrially exploited;
• Participation is incompatible with the need for confidentiality in connection with security issues;
• Participation is incompatible with rules on protecting personal data;
• The project will not generate / collect any research data; or
• There are other legitimate reasons not to take part in the Pilot.
Note that partial opt-out is possible – and preferable to full opt-out!

Funder processes – reviewers
“How qualified are reviewers?”
Maybe not formally qualified, but
• The EC contracts data management experts to review DMPs and provide recommendations
• Project officers receive training
• The Commission has set up some Expert Groups, e.g. on the FAIR data principles

What makes a good DMP?
• Has the researcher taken time to reflect on what to do?
  • The reviewer wants to be reassured that due consideration has been given to data and that the approach seems reasonable.
  • Focus on the data – don’t describe how you’ll deliver your publications.
  • Make it plausible that all project partners and work packages commit to the DMP.
• Is the plan appropriate?
  • Adopt relevant standards and practices that are in line with norms for your field.
  • Show that you know about support services, e.g. university storage, subject repositories…
• Does the plan show proper engagement with the issues?
  • Be specific.
  • Justify decisions and any restrictions.
  • Plan when you will evaluate and update your DMP.

Funder processes – DMP Compliance
“Do funders check that action points in plans are carried out?”
• When you announce an update, make sure to deliver it.
• Machine-actionable DMPs will support compliance checking.
https://www.zonmw.nl/en/research-and-results/access-to-data/

2. Support for Research Data Management
“How are librarians involved in producing/collaborating on DMPs?”
Van Berchum & Grootveld (2017)
https://dans.knaw.nl/en/about/organisation-and-policy/information-material/Whitepaper_ResearchdatamanagementAnoverview_DEF.pdf

Support for RDM – budgeting
“How can we calculate the impact of RDM on institutional budgets?”
https://www.uu.nl/en/research/research-data-management/what-we-do/data-management-plan

3. (Storing and) sharing data
• With collaborators while research is active: data are mutable
• (Open) data sharing: data are stable, searchable, citable, clearly licensed

Storing and sharing data – data integrity
“How can we be sure that data we collect is real, and how to process data without altering it?”
Hm. “Real” is hard to guarantee, but transparency helps:
• Be explicit about your method for collecting/generating and processing data.
• Always store your raw data and document your changes.
• Commit to a good data organisation: folder and file naming, versioning agreements etc.
• Have a clear procedure for authorising people to access the data.
• Find a trustworthy repository for long-term access: they have processes for data integrity and authenticity and access regimes.
Certification schemes for trustworthy repositories: CoreTrustSeal, Data Seal of Approval, ICSU World Data System, nestor seal, ISO 16363

Storing or archiving? Both!
• Storing and backing up files while research is active
  • Likely to be on a networked filestore or hard drive
  • Easy to change or delete
• Archiving or preserving data in the long term
  • Likely to be deposited in a digital repository
  • Safeguarded and preserved

Storing and sharing data
“How can we manage research data in repositories?”
Hm.
• Rather, you outsource data management (stewardship) to the repository.
• Trustworthy Data Repositories (TDRs) may seem picky, but that’s necessary for preservation.
• Or join the OpenAIRE network when you want your content to be visible.
https://zenodo.org/
https://www.openaire.eu/openaire-factsheet-for-repository-managers

4. Legal and ethical – IPR
“How can we deal with IPR issues, especially in the case of project partners from private sector?”
Good practice: H2020 project i-Media-Cities addresses IPR in its – public! – DMP.
https://www.iprhelpdesk.eu/
http://ec.europa.eu/research/participants/data/ref/h2020/other/gm/h2020-guide-cons-a_en.pdf
https://www.iprhelpdesk.eu/Fact-Sheet-Copyright-Essentials
https://imediacities.eu/results/

Legal and ethical – GDPR
“Once GDPR – General Data Protection Regulation – comes into effect next year, what issues are likely to emerge for RDM where human subjects are involved (e.g. social science datasets) - should agreement on the long-term preservation and sharing of those data be built into the consent process?”
• GDPR makes explicit exemptions for the processing of personal data for research as well as for archiving personal data “in the public interest” (including long-term preservation).
• No clear picture yet on complementary national laws.
• It is highly recommended to state in informed consent forms that data might or will be stored for the long term, and how these datasets can be accessed:
  • available to everyone, or with access restricted to academic research.
• Without informed consent forms, personal data may only be stored/accessed for research under very strict conditions, dependent on national legislation.
• In all cases:
  • principle of data minimisation: collect & store as little personal data as possible;
  • processing personal data will be subject to appropriate technical and organisational safeguards for the rights and freedoms of the data subject (respondent, test subject…).
http://www.eugdpr.org/ (EU general GDPR portal)
Based on information provided by Heiko Tjalsma, DANS legal officer
The OpenAIRE webinar tomorrow morning is about Legal aspects of Open Science
Disclaimer: OpenAIRE is no legal advisor

Legal and ethical – anonymisation
“How is anonymisation implemented in text analysis, and in large surveys in social sciences?”
In general:
• Don’t collect identifiable data – data minimisation principle!
• Anonymise data: remove, generalise, aggregate or distort data identifiers. Plan anonymisation at the time of transcription or initial write-up.
• Use pseudonyms or replacements that are consistent within the research team and throughout the project.
• Generalise the meaning of a detailed text variable by replacing potentially disclosive free-text responses with more general text.
• Get informed consent for sharing personal data (tricky but might be worth trying).
In short: preventing identification should be part of the study design.
https://www.slideshare.net/OpenAIRE_eu/amnesia-data-anonymization-made-easy-8th-openaire-workshop

5. RDM training
“Which online course will teach my team to manage the project’s data coherently, reproducible, open access?”
• RDNL’s Essentials 4 Data Support (all content available for blended learning under CC-BY-SA)
  http://datasupport.researchdata.nl/en/
• Summary of training materials from various organisations
  http://www.dcc.ac.uk/training/data-management-courses-and-training
• Many resources on Open Science and RDM
  https://www.fosteropenscience.eu/resources
• MOOC by University of North Carolina at Chapel Hill & University of Edinburgh
  https://www.coursera.org/learn/data-management

Bonus: updated GA
Version 5.0, 18 October 2017, shows changes in
• Article 29.3 “Open access to research data”: to provide for third party access to research data in health actions in cases of public health emergencies. Access may also be required of health actions not participating in the Open Data pilot.
• Article 34 “Ethics and research integrity”: to align the provisions on ethical and research integrity principles to the new (2017) European Code for Research Integrity adopted by ALLEA - All European Academies.
http://ec.europa.eu/research/participants/data/ref/h2020/mga/gga/h2020-mga-gga-multi_en.pdf
http://www.allea.org/wp-content/uploads/2017/03/ALLEA-European-Code-of-Conduct-for-Research-Integrity-2017-1.pdf

GA Article 34.1
Grant Agreement article 34.1 “Obligation to comply with ethical and research integrity principles”

OpenAIRE support materials
https://www.openaire.eu/opendatapilot
https://www.openaire.eu/support
https://www.slideshare.net/openaire_eu
• Briefing papers, factsheets, webinars, workshops, FAQs
• Information on:
  • Open Research Data Policy
  • Creating a data management plan
  • Selecting a data repository
  • Personal data

Related webinars
Introductory RDM webinar, Tony Ross-Hellauer & Marjan Grootveld, 30 May 2017:
• Reasons to manage data
• How to manage and share data (+ how to respond to concerns about sharing)
• OpenAIRE services
• Slides: https://www.slideshare.net/OpenAIRE_eu/20170530open-research-data-in-horizon-2020
Open Research Data in H2020 and Zenodo, Marjan Grootveld & Krzysztof Nowak, 26 October 2016:
• Sustainable file formats differ across domains and repositories
• Funders embrace the FAIR data principles – implications for Data Management Planning?
• Slides: www.slideshare.net/OpenAIRE_eu/openaire-webinar-on-open-research-data-in-h2020-oaw2016
• Q&A document: https://www.openaire.eu/public-documents?id=843&task=document.viewdoc
How to write a DMP, Sarah Jones & Marjan Grootveld, 7/14 July 2016:
• What is a Data Management Plan and why write one?
• Example DMPs in different domains, with lots of links!
• Guidance, e.g. storing =/= archiving; how to find a repository; file-naming conventions
FAIR data in Trustworthy Data Repositories, Peter Doorn & Ingrid Dillo, 12/13 December 2016:
• Proposal for scoring datasets on Findability, Accessibility and Interoperability (= Reusability) levels
• Inspired by the Data Seal of Approval criteria for Trustworthy Data Repositories
• Slides: http://www.slideshare.net/EUDAT/fair-data-in-trustworthy-data-repositories-webinar-1213-december-2016
https://www.eudat.eu (Research Data Services, Expertise & Technology)

www.openaire.eu
@openaire_eu
facebook.com/groups/openaire
linkedin.com/groups/OpenAIRE-3893548
marjan.grootveld@dans.knaw.nl

Any more questions?
Thanks to all who submitted questions when registering for the webinar. And enjoy tomorrow’s webinars!
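
Addendum to the data-integrity question ("how to process data without altering it?"): one common way to back up the advice to always store your raw data and document your changes is to record a checksum for every raw file and re-check it later. The sketch below is only an illustration under stated assumptions: it uses Python 3.9+ and the standard library, and the function names (sha256_of, build_manifest, changed_files) and the example file name survey_v1.csv are hypothetical, not tools named in the slides.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (by relative path) to its checksum."""
    return {
        p.relative_to(raw_dir).as_posix(): sha256_of(p)
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(raw_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files from the manifest that are missing or no longer match."""
    current = build_manifest(raw_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )


if __name__ == "__main__":
    # Hypothetical raw-data folder, created in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp)
        (raw / "survey_v1.csv").write_text("id,answer\n1,yes\n")
        manifest = build_manifest(raw)       # record checksums at collection time
        print(changed_files(raw, manifest))  # raw data untouched so far
        (raw / "survey_v1.csv").write_text("id,answer\n1,no\n")
        print(changed_files(raw, manifest))  # the edited file is reported
```

Storing such a manifest next to the raw data, and processing only copies, gives collaborators a simple way to confirm that the raw files are still the ones originally collected. Trustworthy repositories typically run comparable integrity checks as part of their own preservation processes.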