HeFDI Research Data Day on 18 June 2021
Digitally supported research and research data management - challenges and opportunities
Goals
The HeFDI Research Data Day aims to give researchers, teachers and all other interested parties the opportunity to learn about and try out data management topics and services in sessions. In addition, various infrastructure facilities and centres from Hessen will present their services and offerings in short lightning talks. At the same time, we offer interested parties live advice on data management in concrete projects.
Programme
Time | Agenda | Presenter |
8h30-9h00 | Digital welcome coffee and technical hiccups | |
9h00-9h15 | Welcome, programme introduction, transition to the sessions, lightning talks and live consultation | Ortrun Brand, HeFDI |
9h15-10h45 | Parallel Events I | |
| Sessions 1-5 | |
| Session 1: Introduction to research data management | Birte Cordes, Philipps-Universität Marburg |
| Session 2: Learning to program with Python and Jupyter Notebooks | Andreas Schieberle, Hochschule Darmstadt |
| Session 3: Version control in research projects | Tamara Cook, Philipps-Universität Marburg |
| Session 4: Data wrangling with OpenRefine | Andre Pfeifer, Technische Universität Darmstadt; Jens Freund, Technische Universität Darmstadt |
| Session 5: Data protection concepts for research projects | Nina Raschke, Philipps-Universität Marburg |
| Live advice on data management in your project | Esther Krähwinkel, Philipps-Universität Marburg; Nina Dworschak, Uni Frankfurt |
| Lightning Talks | |
10h45-11h15 | Coffee break | Opportunity for break talks and interaction on wonder.me |
11h15-12h45 | Parallel Events II | |
| Sessions 6-10 | |
| Session 6: Workshop on data management plans | Judith Dähne, Hochschule Rhein-Main; Patrick Langner, Hochschule Fulda |
| Session 7: XML-Workshop | Andre Pietsch, Justus-Liebig-Universität Gießen |
| Session 8: Deep Learning | Marcel Giar, Technische Universität Darmstadt; Tim Jammer, Justus-Liebig-Universität Gießen |
| Session 9: Citizen Science: Citizens doing science | Andrea Rapp, Technische Universität Darmstadt; Eva L. Wyss, Universität Koblenz-Landau (Campus Koblenz) |
| Session 10: Software-based quantitative social research | Robert Lipp, Frankfurt University of Applied Sciences |
| Live advice on data management in your project | Esther Krähwinkel, Philipps-Universität Marburg; Nina Dworschak, Uni Frankfurt |
| Lightning Talks | |
12h45-14h00 | Lunch - socialising, discussion, interaction | Opportunity to continue discussion and deepen questions on wonder.me |
More information on Sessions, Lightning Talks and Live Consultation
Session 1: Introduction to research data management
This introduction provides an overview of the main topics of research data management, including the data life cycle, data management plans, requirements of third-party funding bodies and universities, organisation and documentation, data publication and legal issues.
Session 2: Learning to program with Python and Jupyter Notebooks
An introduction for absolute beginners. When dealing with research data, you often have to develop your own solutions for the preparation, analysis and structured storage of data. In this session you will get a first overview of the programming language Python and the possibilities of interactive data analysis with Jupyter Notebooks as an alternative to classical spreadsheets.
Session 3: Version control in research projects
Version control as a method, and the tool Git in particular, has long been an established quality standard in professional software development. Git can also be applied usefully outside of pure software projects. The more software is used in research, the more it is our responsibility to use the available tools and methods for quality assurance as well. In this context, I would like to introduce you to the use of Git and illustrate how version control in longer-term projects can support the traceability of the project history, collaboration and FAIR data management.
Session 4: Data wrangling with OpenRefine
In this workshop you will learn the basics of wrangling and transforming tabular data with the open source software OpenRefine. OpenRefine provides functions to identify and correct inconsistencies in large amounts of data through a graphical user interface that resembles spreadsheet software. For example, slightly different spellings of a name in different entries (e.g. TU Darmstadt and TU_Darmstadt) can be grouped by clustering and then given a uniform label. Such data preparation often makes later analysis of the data much easier.
Using an example data set, the workshop presents the operation and key functions of the software, which participants can follow along on their own PC. This includes, among other things:
creating a project and importing data,
the use of facet, filter and cluster functions,
the transformation of data (e.g. splitting cell contents) and
the export of data.
Instructions on how to install the software on one's own computer and on the sample data set will be provided in advance of the workshop. No special prior knowledge is required for participation in the workshop.
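The clustering step mentioned above can be illustrated outside OpenRefine as well. The following is a minimal Python sketch of the general key-collision idea: each value is reduced to a normalised key, and values sharing a key are grouped. It is a simplified illustration, not OpenRefine's own implementation; the function names `fingerprint` and `cluster` and the sample values are assumptions made for this example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalised key (simplified key-collision idea)."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lower-case and treat every character outside a-z and 0-9 as a separator.
    tokens = re.split(r"[^a-z0-9]+", stripped.lower())
    # Sort and de-duplicate the tokens so that word order does not matter.
    return " ".join(sorted({token for token in tokens if token}))


def cluster(values):
    """Group distinct spellings that share a key; keep groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        key = fingerprint(value)
        if value not in groups[key]:
            groups[key].append(value)
    return {key: spellings for key, spellings in groups.items() if len(spellings) > 1}


# Hypothetical sample data, invented for illustration.
names = ["TU Darmstadt", "TU_Darmstadt", "tu darmstadt ", "Hochschule Fulda"]

for key, spellings in cluster(names).items():
    print(key, "<-", spellings)
```

In this sketch the three Darmstadt spellings end up in one group, which could then be relabelled uniformly, while the single Fulda entry is left alone.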
Session 5: Data protection concepts for research projects
On the way to your research goal, you have to overcome many hurdles, one of which is data protection. A data protection concept allows a structured check of compliance with data protection regulations and can save you valuable time. Creating such a concept guides researchers through a data protection audit and serves to document it, for example vis-à-vis data subjects, cooperation partners or supervisory authorities. The lecture presents the purposes and necessary contents of such a concept, explains in which cases a concept is worthwhile or even required by law, and addresses research-specific issues.
Session 6: Workshop on data management plans
Well planned is half the battle! Whether you want to meet the requirements of research funders or to better structure your own working methods, a well-developed data management plan is an integral part of any research project. It helps you think about data types, data processing, storage, data publication, legal issues and other central questions of research organisation at an early stage. In this session you will get an insight into what exactly a data management plan is, which elements it should contain and which tools and checklists are available to help you. Short exercises, practical examples and digital tools will liven up the lecture and introduce the topic step by step.
Session 7: XML-Workshop
A practical introduction to XML and metadata schemas using TEI as an example. In the field of research data management, one hears more and more about XML and about cross-disciplinary and discipline-specific metadata schemas. Unfortunately, applying XML and using such metadata schemas are not trivial.
This workshop is aimed at people who have had little or no previous exposure to this topic and is intended to provide a brief, practically oriented insight into XML and metadata schemas.
The following questions will be addressed in this workshop:
1) What is XML?
2) When and why should I use XML?
3) How do I use XML?
4) What is a metadata schema?
5) How is a metadata schema created?
6) How do I apply a metadata schema?
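To give a first impression of questions 1) and 6), the sketch below shows a small XML document with a TEI-style header and reads two metadata fields from it using Python's standard library. The document content is invented for illustration and has not been validated against the TEI schema; the helper name `read_metadata` is likewise an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents; the prefix "tei" is only a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document: the header carries metadata, <text> the content.
SAMPLE = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Example letter (hypothetical)</title>
        <author>Jane Doe</author>
      </titleStmt>
      <publicationStmt><p>Unpublished example</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Dear reader, this is the content.</p></body></text>
</TEI>
"""


def read_metadata(xml_text):
    """Extract title and author from the header of a TEI-style document."""
    root = ET.fromstring(xml_text)
    base = "tei:teiHeader/tei:fileDesc/tei:titleStmt/"
    return {
        "title": root.findtext(base + "tei:title", namespaces=TEI_NS),
        "author": root.findtext(base + "tei:author", namespaces=TEI_NS),
    }


print(read_metadata(SAMPLE))
```

The point of the schema is visible here: because title and author always sit at the same, agreed-upon place in the header, software can find them without knowing anything else about the document.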
Session 8: Deep Learning
An introduction. Deep Learning is a method from the field of Machine Learning. Deep Learning applications can be used to comprehensively analyse data for which formally based models can only be used to a limited extent. This is often the case with data related to human behaviour. Other important application areas are medicine (e.g. tumour detection), linguistics (e.g. speech recognition) or sociology (e.g. social dynamics and processes). In addition to a brief introduction to the topic, this session will provide an initial overview of how Deep Learning systems are structured and the challenges involved in training neural networks. This will be demonstrated using a concrete example of image recognition.
Session 10: Software-based quantitative social research
Practical input and exchange of experience. In quantitative social research, a steadily growing number of software tools are available. These can facilitate the collection and processing of survey data and improve their usability.
In this discussion-oriented workshop, we want to discuss together which tools you use in your research, what experiences you have had with them and what challenges you have had to overcome in this regard. The topics are divided into (1) data collection, (2) data preparation and (3) data publication. Experiences from the nationwide panel study GUS, in which around 10,000 pupils were surveyed annually between 2014 and 2020 using a standardised tablet questionnaire, serve as an introduction.
Lightning Talk: Power Point is dead!
Or: Versioning of presentations incl. data sets via GitLab. With the help of GitLab CI/CD as well as tools such as Pandoc, versionable presentations can be created that also invite collaboration and are easy to update.
Lightning Talk: The Marburg Center for Digital Culture and Infrastructure (MCDCI)
The session presents the interdisciplinary Marburg Center for Digital Culture and Infrastructure (MCDCI), which is operated as a scientific institution by Philipps-Universität Marburg together with other partner institutions. The MCDCI is dedicated to the application of digital methods in the humanities and social sciences and researches the effects of digitalisation on society and culture. Part of the presentation will be dedicated to the new Master's programme "Cultural Data Studies" offered by the MCDCI.
Lightning Talk: The HeFDI repositories using the example of TUdatalib
Presentation of TUdatalib, the institutional research data repository of TU Darmstadt and the operating model for other universities within HeFDI.
Lightning Talk: Data quality management with LIDO
The use of the data harvesting format LIDO for cultural data is presented, using the practical example of prints and drawings relevant to art history.
Lightning Talk: Digitising collections: the Corvey project at Philipps-Universität Marburg
The project "Medieval Manuscripts of the Corvey Monastery Library digital" focuses on the more than 150 surviving manuscripts and fragments of Corvey provenance, which are currently distributed among 51 institutions worldwide. The aim is to digitise the manuscripts of Corvey provenance located in Marburg and Paderborn at Marburg University Library according to DFG specifications. In addition, the digitisation of scattered holdings at German institutions will be commissioned locally. Finally, all digitised manuscripts are to be recorded with the existing description data in a web portal and thus presented in context again. In the process, the remaining worldwide stray holdings of manuscripts will also be made accessible via the portal as far as possible.
Lightning Talk: Central HPC resources at the Philipps-Universität Marburg
MaRC2, MaRC3a, MaSC and Ma*.
* When is it worth using a High Performance Computing (HPC) cluster?
* Which central HPC clusters exist at Marburg University?
* How do I get access to HPC resources?
The Lightning Talk is aimed at researchers with an increased need for compute resources.
Live consultations on data management in your project I and II
Are you currently planning a research project and wondering how to implement your data management? Would you like advice on specific questions? Would you like to know about current developments in the field of research data management? Then come and visit us. We will be happy to discuss your questions with you in our live consultation.