Content from IPCC FAIR Background
Last updated on 2024-10-25
Overview
Questions
- What are the fundamentals of producing a FAIR IPCC Assessment Report?
Objectives
- Learn the FAIR principles and motivations
- Learn about general Research Data and Software Management practices
- Learn about storytelling and visualisation
FAIR data principles at IPCC
Motivation for archiving code, figures, input data and metadata (FAIR guidance)
The experience in AR6 and its shortcomings, including some examples.
Fundamentals of Sustainable Research Software
Software Management Plans eScience
Fundamentals of Research Data Management
Models of interactive products:
Storytelling, layered storytelling, static vs dynamic, GIS-based models.
Content from Research Data and Software Management
Last updated on 2024-11-15
Overview
Questions
- What is research data?
- What is research software?
- Why is it important to properly describe, protect and share research data and software?
Objectives
- Understand the importance of disseminating research data and the code used for its generation
- Understand the benefits of a Research Data Management plan (via the Turing Way)
- Understand the difference between research code and research software, and the benefits of a Software Management Plan
Research Data Management
Climate science has significant public interest, as it affects people’s lives, economies, and ecosystems. Effective Research Data Management supports open science initiatives by making data accessible to the public, policymakers, and other stakeholders, increasing transparency, and encouraging public engagement. This openness builds trust and fosters greater awareness and informed decision-making regarding climate action.
Research Data Management underpins the accuracy, reproducibility, and impact of research findings. It supports collaborative and transparent science. In IPCC it helps ensure that investments in the realisation of the assessments continue to benefit scientific inquiry and public policy.
Callout
“The Turing Way”, an open science and community-driven project focused on making data science more accessible, understandable, and effective, offers a general overview of the purposes and practices that motivate Research Data Management (RDM), with guidelines and useful approaches for putting them into practice.
Reproducible Research according to the Turing Way
For instance, some IPCC Working Groups may propose a Data Management Plan.
Software Management Plans
Before diving into Software Management Plans, it is important to highlight the distinction between research code and research software.
Research Code is the individual, often experimental, coding work that solves specific problems in the research process. It is often a custom solution developed for a specific research question or experiment, for instance the script used to generate one of the figures in the IPCC reports.
Research Software is a broader, often more stable tool or platform that assists in conducting research across various stages of the workflow, for example ESMValTool. Both are critical components of modern research, with research code often contributing to the development of research software.
Key characteristics of Research Code
- Custom and Domain-Specific: It is typically tailored to address the unique needs of a particular research task or domain (e.g., bioinformatics, physics simulations, social sciences).
- Prototyping and Experimentation: Often experimental, evolving during the research process as the researcher tests and refines ideas. This could be in the form of scripts for data collection, analysis, or visualization.
- Reproducible: In many cases, research code is shared openly to promote reproducibility and transparency. Open-source platforms like GitHub, GitLab, and Bitbucket are commonly used for sharing and collaborating on research code. Scripts may be expressed as Jupyter Notebooks and re-executed on Jupyter platforms like JupyterLab, JupyterHub or Binder.
Key characteristics of Research Software
- Comprehensive and integrated: supporting tasks like data management, analysis, and visualization, often with a user-friendly interface. For instance, tools like SPSS, MS Excel, or Tableau.
- Production-ready: stable, and maintainable, featuring error handling and documentation. It can be domain-specific (e.g., statistical tools, simulation platforms) or general-purpose (e.g., text editors, database systems) and is widely used in research.
Callout
Some working groups may consider proposing a Software Management Plan (SMP). This is usually a document that addresses questions such as:
- What does it do?
- Who is it for?
- What resources does it need?
- Who is responsible?
- What licence does it need?
Having such clarity early on avoids problems later and helps the IPCC deliver FAIR code and software. Examples of SMPs exist in many organisations. A detailed list of elements relevant to the definition of an SMP for Research Code and Software is provided by the Dutch institute for eScience.
In the IPCC, an SMP can define the key aspects that authors should take into account, depending on whether they will develop and release simple scripts for data analysis and visualisation or more complex Research Software, such as a new IPCC Atlas.
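The five questions above can also be kept in a machine-readable form next to the code. The following is a minimal Python sketch, assuming a hypothetical `SoftwareManagementPlan` record whose field names simply mirror the questions; it is an illustration, not an IPCC or eScience template.

```python
from dataclasses import dataclass, fields


@dataclass
class SoftwareManagementPlan:
    """Hypothetical record with one field per question in the callout."""
    purpose: str = ""       # What does it do?
    audience: str = ""      # Who is it for?
    resources: str = ""     # What resources does it need?
    responsible: str = ""   # Who is responsible?
    licence: str = ""       # What licence does it need?


def unanswered(plan: SoftwareManagementPlan) -> list[str]:
    """Return the names of the questions that are still blank."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Illustrative values only.
plan = SoftwareManagementPlan(
    purpose="Script that draws one chapter figure",
    audience="Chapter authors and reviewers",
    licence="Apache-2.0",
)
print(unanswered(plan))
```

In this example the check reports `resources` and `responsible` as still open, which is the kind of gap an SMP is meant to surface early.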
Challenge 1: Can you classify the following software types?
Key Points
Content from AR7 Tutorial
Last updated on 2024-10-25
Overview
Questions
- How do I generate digital outputs?
- How do I describe and curate digital outputs?
- How do I transfer the digital outputs to the TSU?
Objectives
- Produce data and figures (tooling/programming)
- Create and manage a software repository
- Generate Metadata
- Obtain a DOI for data
- Obtain a DOI for software
The AR6 experience and lessons learnt
Author profiles
Spreadsheet Sam, Notebook Nancy, Script Sandy
Categories and governance of digital IPCC products
From figures to interactive applications
Content from Editing Tutorial - Markdown
Last updated on 2024-10-25
Overview
Questions
- How do you write a lesson using Markdown and sandpaper?
Objectives
- Explain how to use markdown with The Carpentries Workbench
- Demonstrate how to include pieces of code, figures, and nested challenge blocks
Introduction
This is a lesson created via The Carpentries Workbench. It is written in Pandoc-flavored Markdown for static files and R Markdown for dynamic files that can render code into output. Please refer to the Introduction to The Carpentries Workbench for full documentation.
What you need to know is that there are three sections required for a valid Carpentries lesson:
- questions are displayed at the beginning of the episode to prime the learner for the content.
- objectives are the learning objectives for an episode displayed with the questions.
- keypoints are displayed at the end of the episode to reinforce the objectives.
Challenge 1: Can you do it?
What is the output of this command?
R
paste("This", "new", "lesson", "looks", "good")
OUTPUT
[1] "This new lesson looks good"
Challenge 2: How do you nest solutions within challenge blocks?
You can add a line with at least three colons and a solution tag.
Figures
You can use standard markdown for static figures with the following syntax:
![optional caption that appears below the figure](figure url){alt='alt text for accessibility purposes'}
Callout
Callout sections can highlight information.
They are sometimes used to emphasise particularly important points but are also used in some lessons to present “asides”: content that is not central to the narrative of the lesson, e.g. by providing the answer to a commonly-asked question.
Math
One of our episodes contains \(\LaTeX\) equations when describing how to create dynamic reports with {knitr}, so we now use MathJax to render them:
$\alpha = \dfrac{1}{(1 - \beta)^2}$
becomes: \(\alpha = \dfrac{1}{(1 - \beta)^2}\)
Cool, right?
Key Points
- Use .md files for episodes when you want static content
- Use .Rmd files for episodes when you need to generate output
- Run sandpaper::check_lesson() to identify any issues with your lesson
- Run sandpaper::build_lesson() to preview your lesson locally
Content from Lifecycle of figures in the IPCC Reports
Last updated on 2024-10-25
Overview
Questions
- How do figures and figure submission requirements evolve throughout the cycle?
Objectives
- Provide an overview of the figures life-cycle
- Describe evolving requirements for figure submission at the different draft versions
Figures evolve throughout the cycle. At each draft, new figures are created, some are discarded, or combined. At the end of the process, for the publication of the final draft, we wish to collect information on who created the figure, how, and using what data. Storing this information allows figure authors to get credit for their work, and allows other researchers to build on the work of the IPCC, in line with the best practices of open science.
The following lays out instructions for authors on how to organize figure information for submission to the TSU. Requirements are basic for the zero order draft, and increase in comprehensiveness as we move toward the Final Government Draft.
Zero Order Draft
At this point, figures are mostly placeholders. Authors will for example suggest that “here should be a figure showing x,y,z”. There are no expectations of an actual figure being submitted at this stage.
First Order Draft
TODO
Second Order Draft
TODO
Final Government Draft
Here we expect authors to submit:
- The figure itself
- The data used to create the figure, and a reference for each dataset. Note that this data should be as close as possible to what is shown in the figure. If any analysis is required to translate original input data into figure-ready data, then authors should publish this data, get a DOI for it, and reference it in the figure metadata. See TODO.
- The code used to create the figure based on the data provided.
- Information on the author(s) of the figure
- The proposed caption for the figure (???)
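The items requested for the Final Government Draft can be checked for completeness before hand-over. The sketch below is illustrative Python only: the `FigureSubmission` class, its field names and the example values are assumptions made for this example, not a TSU submission format.

```python
from dataclasses import dataclass, field


@dataclass
class FigureSubmission:
    """Hypothetical bundle mirroring the items requested for the final draft."""
    figure_file: str = ""
    # One reference (for example a DOI) per dataset shown in the figure.
    data_references: list[str] = field(default_factory=list)
    code_location: str = ""
    authors: list[str] = field(default_factory=list)
    caption: str = ""


def missing_items(submission: FigureSubmission) -> list[str]:
    """List the requested items that are still empty."""
    return [name for name, value in vars(submission).items() if not value]


# Placeholder values for illustration only.
draft = FigureSubmission(
    figure_file="figure.png",
    authors=["A. Author"],
    caption="Proposed caption",
)
print(missing_items(draft))
```

For this placeholder draft the check flags `data_references` and `code_location`, the two items most closely tied to reproducibility.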
Figures adapted to different audiences
Some key figures prepared by chapters are highlighted in the Technical Summary (TS), and later in the Summary for Policymakers (SPM). The intended audiences for chapters, TS and SPM are of course different, and as a result, figures need to be adapted. This process will be facilitated by the data collection described above.
For example, the figures below show how chapter figure 6.3 and its underlying data was reused to create new figures for the technical summary (TS9), and later the SPM (SPM3).
Content from Licensing Tutorial
Last updated on 2024-11-14
Overview
Questions
- What licenses are required for datasets or data products that are used?
- What licenses should we apply to created datasets?
Objectives
- Understand how to classify source code and data (input, intermediate assessment, final assessment)
- Understand recommended licenses for each data type and code
Introduction
The content of this lesson is taken from the recommendations of the IPCC Task Group on Data Support for Climate Change Assessments (Huard et al. 2022).
Licensing of IPCC material, with clear and consistent meaning in all legal jurisdictions, is essential to facilitate its appropriate use to address pressing climate change challenges, while protecting the rights of data providers.
Callout
The IPCC reports and data are licensed separately!
IPCC reports are published under a copyright license that prohibits commercial use and the creation of derivative products, unless discussed first and then given permission by the IPCC Secretariat. This license is applied to protect IPCC reports from distortion since these are accepted by member governments, or approved in the case of the Summary for Policymakers, and adopted in the case of the Synthesis Report. If the same license was applied to data products, it would severely limit their usefulness and value. A different IPCC data license is required to allow the creation of derivatives for the pursuit of research and the re-use of IPCC data-based products for national assessments, adaptation and mitigation policies.
Classifying Data Types
TG-Data distinguishes three categories of data: input data, intermediate assessment data, and final assessment data.
Input data denotes the source data that underpins information in the assessment reports. It is typically authored by credible, authoritative, trusted sources, who decide under which license it is published.
Intermediate assessment data is the outcome of data processing and analysis performed as part of the assessment as an intermediate step in the generation of final assessment data. Data is only defined as intermediate if it has gone through non-trivial processing to be considered an original product, distinct from the input data.
Final assessment data refers to data that is directly presented in data tables or graphically displayed (e.g. as a line graph or a spatial map) in the report.
Source code refers to scripts, online code repositories, and software libraries written to create intermediate and final assessed data, as well as the figures included in the reports.
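As a rough illustration of the three data categories, the decision can be written as two yes/no questions. This is a simplified Python sketch: the function name and its boolean inputs are assumptions, and real cases may need judgement on what counts as non-trivial processing.

```python
from enum import Enum


class DataCategory(Enum):
    INPUT = "input"
    INTERMEDIATE = "intermediate assessment"
    FINAL = "final assessment"


def classify(shown_in_report: bool, nontrivially_processed: bool) -> DataCategory:
    """Apply the TG-Data definitions as a simplified decision rule."""
    if shown_in_report:
        # Presented directly in a table or figure of the report.
        return DataCategory.FINAL
    if nontrivially_processed:
        # An original product of the assessment, distinct from its inputs.
        return DataCategory.INTERMEDIATE
    # Source data that has not been processed into a distinct product.
    return DataCategory.INPUT


# Example: source data used as provided, not displayed directly in the report.
print(classify(shown_in_report=False, nontrivially_processed=False).value)
```

Checking `shown_in_report` first reflects the definition of final assessment data as whatever is directly presented in the report, regardless of how much processing preceded it.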
Licenses For Different Data Types
Input data shall be licensed under the same license terms and conditions imposed by the data providers. Input data copyright holders are encouraged to adopt well-known licenses enabling broad usage, including commercial use, and avoid “ShareAlike” licenses.
Intermediate and final assessment data should be licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, where this does not infringe the interests of relevant license holders. The Creative Commons family of licenses are designed to provide legal interoperability across virtually all jurisdictions.
When input datasets are published under restrictive licenses, waivers or exemptions can be sought for the IPCC assessment reports. These waivers should be negotiated with copyright holders by Working Group co-chairs, with guidance from TG-Data representatives.
These waivers would ensure that derivative products can be licensed by the IPCC under CC BY 4.0, and that the version used by the assessment report is curated in a long-term archive, either by IPCC DDC or another trusted data repository. If exemptions cannot be obtained from the copyright owners, the applicable licenses of input data will apply.
To ensure maximal reusability of source code, similarly to data, code should be published under permissive (non-copyleft) open source licenses that do not restrict commercial use.
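The licensing recommendations in this section can be summarised as a small lookup. The Python sketch below is an approximation under stated assumptions: the function name, the category strings and the two flags (`restrictive_input`, `waiver_granted`) are hypothetical, and the rule is a simplification of the guidance rather than legal advice.

```python
CC_BY = "CC BY 4.0"


def recommended_license(category, provider_license=None,
                        restrictive_input=False, waiver_granted=False):
    """Return the license suggested by the guidance for one product.

    category: "input", "intermediate", "final" or "code" (hypothetical labels).
    """
    if category == "code":
        # Permissive, non-copyleft, no restriction on commercial use.
        return "permissive open source license"
    if category == "input":
        if provider_license is None:
            raise ValueError("input data keeps the provider's license; please supply it")
        return provider_license
    if category in ("intermediate", "final"):
        if restrictive_input and not waiver_granted:
            # Without a waiver, the terms of the input data continue to apply.
            if provider_license is None:
                raise ValueError("restrictive input needs the provider's license")
            return provider_license
        return CC_BY
    raise ValueError(f"unknown category: {category!r}")


print(recommended_license("final"))
print(recommended_license("intermediate", "provider terms",
                          restrictive_input=True, waiver_granted=True))
```

Both calls return CC BY 4.0: the first because no restrictive input is involved, the second because a waiver was obtained from the copyright holder.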
Challenge 1: Can you classify the following data types?
- A map used in the report
- Output from a CMIP6 model
- Model agreement on changes in temperature in a warming scenario
- A map used in the report: Final
- Output from a CMIP6 model: Input
- Model agreement on changes in temperature in a warming scenario: Intermediate
Key Points
- Data can be classified into input, intermediate assessment, or final assessment data.
- Input data shall be licensed under the same license terms and conditions imposed by the data providers.
- Data produced as part of the IPCC assessment, be it intermediate or final assessment data, shall be published, wherever possible, under the CC BY 4.0 license.