
HeFDI Research Data Day on 18 June 2021

Photo: Colourbox.de/Andrei Shupilo

Digitally supported research and research data management - challenges and opportunities

Goals
Programme
More information on sessions, lightning talks and live consultation

Goals

The HeFDI Research Data Day aims to give researchers, teachers and all other interested parties the opportunity to learn about and try out data management topics and services in sessions. In addition, various infrastructure facilities and centres from Hessen will present their services and offerings in short lightning talks. At the same time, we offer interested parties live advice on data management in concrete projects.

Programme

Time Agenda  Presenter
8h30-9h00 Digital welcome coffee and technical hiccups
9h00-9h15 Welcome, programme introduction, transition to the sessions, lightning talks and live consultation Ortrun Brand, HeFDI
9h15-10h45

Parallel Events I

Sessions 1-5
Session 1: Introduction to research data management Birte Cordes, Philipps-Universität Marburg
Session 2: Learning to program with Python and Jupyter Notebooks Andreas Schieberle, Hochschule Darmstadt
Session 3: Version control in research projects Tamara Cook, Philipps-Universität Marburg
Session 4: Data wrangling with OpenRefine Andre Pfeifer, Technische Universität Darmstadt; Jens Freund, Technische Universität Darmstadt
Session 5: Data protection concepts for research projects Nina Raschke, Philipps-Universität Marburg
Live advice on data management in your project Esther Krähwinkel, Philipps-Universität Marburg; Nina Dworschak, Uni Frankfurt
 
Lightning Talks
  • 9h15 PowerPoint is dead! Versioning of lectures incl. data sets via GitLab (Christian Krippes, Justus Liebig University Giessen)
  • 9h45 The Marburg Center for Digital Culture and Infrastructure (MCDCI) (Stefan Schulte, UMR)
  • 10h15 The HeFDI repositories using the example of TUdatalib (Gerald Jagusch, Technische Universität Darmstadt)
10h45-11h15 Coffee break Opportunity for break talks and interaction on wonder.me
11h15-12h45

Parallel Events II

Sessions 6-10
Session 6: Workshop on data management plans Judith Dähne, Hochschule Rhein-Main; Patrick Langner, Hochschule Fulda
Session 7: XML-Workshop Andre Pietsch, Justus-Liebig-Universität Gießen
Session 8: Deep Learning Marcel Giar, Technische Universität Darmstadt; Tim Jammer, Justus-Liebig-Universität Gießen
Session 9: Citizen Science: Citizens doing science Andrea Rapp, Technische Universität Darmstadt; Eva L. Wyss, Universität Koblenz-Landau (Campus Koblenz)
Session 10: Software-based quantitative social research Robert Lipp, Frankfurt University of Applied Sciences
Live advice on data management in your project Esther Krähwinkel, Philipps-Universität Marburg; Nina Dworschak, Uni Frankfurt
 
 
Lightning Talks 
  • 11h15 Data Quality Management with LIDO (Christian Bracht, German Documentation Centre for Art History - Bildarchiv Foto Marburg)
  • 11h30 Research Data with GeoEngine (Bernhard Seeger, Philipps-Universität Marburg)
  • 11h45 Digitisation of collections: the Corvey project at Philipps-Universität Marburg (Alexander Vielhauer, Philipps-Universität Marburg; Alexander Maul, Philipps-Universität Marburg)
  • 12h00 Central HPC Resources at the University of Marburg (Thomas Gebhardt, Philipps-Universität Marburg)
12h45-14h00 Lunch - socialising, discussion, interaction Opportunity to continue discussion and deepen questions on wonder.me

More information on Sessions, Lightning Talks and Live Consultation

Session 1: Introduction to research data management

This introduction provides an overview of the key topics of research data management, including the data life cycle, data management plans, requirements of third-party funding bodies and universities, organisation and documentation, data publication and legal issues.

Session 2: Learning to program with Python and Jupyter Notebooks

An introduction for absolute beginners. When dealing with research data, you often have to develop your own solutions for the preparation, analysis and structured storage of data. In this session you will get a first overview of the programming language Python and the possibilities of interactive data analysis with Jupyter Notebooks as an alternative to classical spreadsheets.

Session 3: Version control in research projects

Version control as a method, especially the tool Git, has long since established itself as a self-evident quality standard in professional software development. Moreover, Git can also be usefully applied outside of pure software projects. The more software is used in research, the more it is our responsibility to also use the available tools and methods for quality assurance. In this context, I would like to introduce you to the use of Git and illustrate how version control in longer-term projects can support the traceability of the project history, collaboration and FAIR data management.

Session 4: Data wrangling with OpenRefine

In this workshop you will learn the basics of wrangling and transforming tabular data with the open source software OpenRefine. OpenRefine provides functions to identify and correct inconsistencies in large amounts of data through a graphical user interface that resembles spreadsheet software. For example, slightly different spellings of a name in different entries (e.g. TU Darmstadt and TU_Darmstadt) can be grouped by clustering and then labelled uniformly. Such data preparation often makes later analysis of the data much easier.

Using an example data set, the operation and important functions of the software are presented and can be practically reproduced on one's own PC. This includes, among other things:

  • creating a project and importing data,
  • the use of facet, filter and cluster functions,
  • the transformation of data (e.g. splitting cell contents) and
  • the export of data.

Instructions on how to install the software on one's own computer and on the sample data set will be provided in advance of the workshop. No special prior knowledge is required for participation in the workshop.
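The clustering idea mentioned in the description of Session 4 can be illustrated with a short Python sketch. It is not OpenRefine's own implementation: the functions `normalise_key` and `cluster` and the sample names are hypothetical and only show the general principle of grouping values that collapse to the same simplified key.

```python
import re
from collections import defaultdict


def normalise_key(value: str) -> str:
    """Hypothetical key: lowercase, treat punctuation/underscores as spaces,
    then sort the unique tokens so word order does not matter."""
    tokens = re.sub(r"[\W_]+", " ", value.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group values sharing a key; keep only groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[normalise_key(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Invented sample data for illustration only.
names = ["TU Darmstadt", "TU_Darmstadt", "tu darmstadt ", "Uni Frankfurt"]
clusters = cluster(names)

# Label every member of a cluster with the first spelling found.
relabelled = {spelling: group[0] for group in clusters for spelling in group}
print(clusters)
print(relabelled)
```

In OpenRefine itself this step is carried out interactively in the graphical interface; the sketch only shows why entries such as "TU Darmstadt" and "TU_Darmstadt" can end up in the same group.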

Session 5: Data protection concepts for research projects

On the way to your research goal, you have to overcome many hurdles, one of which is data protection. A structured check of compliance with data protection regulations can be carried out on the basis of a data protection concept and can save you valuable time. Creating a data protection concept guides researchers through a data protection audit and serves to document it, for example vis-à-vis data subjects, cooperation partners or supervisory authorities. The lecture presents the purposes and necessary contents of such a concept. It explains in which cases a concept is worthwhile or even required by law and addresses research-specific issues.

Session 6: Workshop on data management plans

Well planned is half the battle! Whether to meet the requirements of research funders or to better structure your own working methods, a well-developed data management plan is an integral part of any research project. It helps you to think about data types, data processing, storage, data publication, legal issues and other central questions of research organisation at an early stage. In this session you will get an insight into what exactly a data management plan is, which elements it should contain and which tools and checklists are available to help you. Short exercises, practical examples and digital tools will liven up the lecture and introduce the topic step by step.

Session 7: XML-Workshop

A practical introduction to XML and metadata schemas using TEI as an example. In the field of research data management, one hears more and more about XML and about cross-disciplinary and discipline-specific metadata schemas. Unfortunately, applying XML and using such metadata schemas is not trivial.

This workshop is aimed at people who have had little or no previous exposure to this topic and is intended to provide a brief, practically oriented insight into XML and metadata schemas.

The following questions will be addressed in this workshop:
1) What is XML?
2) When and why should I use XML?
3) How do I use XML?
4) What is a metadata schema?
5) How is a metadata schema created?
6) How do I apply a metadata schema?
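As a rough illustration of questions 1) and 3), the following Python sketch reads a tiny TEI-style document with the standard library. The sample content (title, author, text) is invented for illustration and is not workshop material. The snippet only checks that the XML is well-formed and reads two metadata fields; it does not validate against the TEI schema.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the prefix "tei" is chosen freely for the queries below.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample document for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Example letter</title>
        <author>Jane Doe</author>
      </titleStmt>
      <publicationStmt><p>Unpublished sample</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Hello, world.</p></body></text>
</TEI>"""

# Parsing fails with an error if the document is not well-formed XML.
root = ET.fromstring(SAMPLE)

# Metadata live in the header; the path mirrors the nesting of the elements.
title = root.findtext(
    "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=TEI_NS
)
author = root.findtext(".//tei:author", namespaces=TEI_NS)
print(title, "by", author)
```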

Session 8: Deep Learning

An introduction. Deep Learning is a method from the field of Machine Learning. Deep Learning applications can be used to comprehensively analyse data for which formally based models can only be used to a limited extent. This is often the case with data related to human behaviour. Other important application areas are medicine (e.g. tumour detection), linguistics (e.g. speech recognition) or sociology (e.g. social dynamics and processes). In addition to a brief introduction to the topic, this session will provide an initial overview of how Deep Learning systems are structured and the challenges involved in training neural networks. This will be demonstrated using a concrete example of image recognition.

Session 10: Software-based quantitative social research

Practical input and exchange of experience. In quantitative social research, a steadily growing number of software tools are available. These can facilitate the collection and processing of survey data and improve their usability.
In this discussion-oriented workshop, we want to discuss together which tools you use in your research, what experiences you have had with them and what challenges you have had to overcome in this regard. The topics are divided into (1) data collection, (2) data preparation and (3) data publication. Experiences from the nationwide panel study GUS, in which around 10,000 pupils were surveyed annually between 2014 and 2020 using a standardised tablet questionnaire, serve as an introduction.

Lightning Talk: PowerPoint is dead!

Or: Versioning of presentations incl. data sets via GitLab. With the help of GitLab CI/CD as well as tools such as Pandoc, versionable presentations can be created that also invite collaboration and are easy to update.

Lightning Talk: The Marburg Center for Digital Culture and Infrastructure (MCDCI)

The session presents the interdisciplinary Marburg Center for Digital Culture and Infrastructure (MCDCI), which is operated as a scientific institution by Philipps-Universität Marburg together with other partner institutions. The MCDCI is dedicated to the application of digital methods in the humanities and social sciences and researches the effects of digitalisation on society and culture. Part of the presentation will be dedicated to the new Master's programme "Cultural Data Studies" offered by the MCDCI.

Lightning Talk: The HeFDI repositories using the example of TUdatalib

Presentation of TUdatalib, the institutional research data repository of TU Darmstadt, and of the operating model for other universities within HeFDI.

Lightning Talk: Data quality management with LIDO

The talk presents the use of the data harvesting format LIDO for cultural data, using prints and drawings relevant to art history as a practical example.

Lightning Talk: Digitising collections: the Corvey project at Philipps-Universität Marburg

The project "Medieval Manuscripts of the Corvey Monastery Library digital" focuses on the more than 150 surviving manuscripts and fragments of Corvey provenance, which are currently distributed among 51 institutions worldwide. The aim is to digitise the manuscripts of Corvey provenance located in Marburg and Paderborn at Marburg University Library according to DFG specifications. In addition, the digitisation of scattered holdings at German institutions will be commissioned locally. Finally, all digitised manuscripts are to be recorded with the existing description data in a web portal and thus presented in context again. In the process, the remaining manuscripts scattered around the world will also be made accessible via the portal as far as possible.

Lightning Talk: Central HPC resources at the Philipps-Universität Marburg

MaRC2, MaRC3a, MaSC and Ma*

  • When is it worth using a High Performance Computing (HPC) cluster?
  • Which central HPC clusters exist at Marburg University?
  • How do I get access to HPC resources?

The Lightning Talk is aimed at researchers with an increased need for compute resources.

Live consultations on data management in your project I and II

Are you currently planning a research project and wondering how to implement your data management? Would you like advice on specific questions? Would you like to know about current developments in the field of research data management? Then come and visit us. We will be happy to discuss your questions with you in our live consultation.
