Working with Digital Data

in Religious Studies

4. Working with Text: Mark it down and Mark it up (HTML and XML)

Summer Semester 2024
Prof. Dr. Nathan Gibson

Outline

  1. Hands-on Review: Git
  2. Group Work/Q&A: Your Datasets
    Break
  3. Markdown & Markup

📈 Review

Last objective: Be able to use basic features of git to track changes to your data.

📈 Review: Why Git?

  • Git keeps track of every change you or others make to your data.
  • You can create different versions (branches) of your project where you test new things, without risking breaking the whole project.
  • You have full control over when and how these changes become part of the rest of the project.
  • You can identify each release of your project with a version number and timestamp.
  • Git is the standard for software development. Github alone is used by over 100 million developers.

📈 Review: Git Assignment

Tutorial at https://24data.pages.gwdg.de/4.

  1. Getting started (installing Github Desktop and cloning the “Recipes” project)
  2. Syncing and editing files (creating a new branch, editing a file, and publishing your branch with changes)
  3. Setting up an account in Gitlab GWDG
  4. Creating a merge request

Group Work/Q&A: Your Datasets

  • In groups: What dataset(s) are you thinking about using? What format(s) are the data in (text, tables, web pages, images, video, etc.)?
  • Q&A: What questions do you have for me about deciding on a dataset?

Break

🧭 Today’s Learning Objective

Be able to add formatting or structure to your text with markdown and markup (HTML and XML).

What is markdown?

  • A simple way of adding formatting to plain text.
  • Formatting might be bold, italics, bullet points, numbered lists, even tables or images.

How does markdown work?

  • You use special characters such as *, _, #, - to add things to the text.
  • E.g., you can make a word bold like this **word**.
  • You can write or edit markdown in any text editor, but to see the formatting you have to export it (e.g., as a web page, Word, or PDF) or use editors that support it (like VS Code or Zettlr).

Markdown vs. other types of formatting

  • Word (.doc or .docx)
  • PDF
  • RTF (Rich Text Formatting)

Markdown vs. other types of formatting

  • open (doesn’t require a specific program)
  • easily readable/editable (uses simple structure and characters)
  • can be handled as plain text

Where can you use markdown?

  • blogging, web apps, even WhatsApp!

What is markup?

  • Markup adds formatting or structured information to text.
  • The most common types are HTML and XML, which have essentially the same structure, but different purposes.
  • Markup has a “hierarchical” structure, meaning some things are inside other things.

What is HTML?

  • HTML is used mainly for designating the types of text in web pages: titles, headers, lists, tables, where images or videos should be inserted, etc.
  • Any web page you view is HTML (plus other things). Your browser (Chrome, Safari) “interprets” this HTML and decides how to display it. Try saving a web page by right-clicking > Save page as … > HTML. Then open it in Visual Studio Code.
  • A simple HTML web page:
<!DOCTYPE html>
<html>
    <head>
        <title>
            A title for this page
        </title>
    </head>
    <body>
        Some text
    </body>
</html>

What is XML?

  • XML has the same structure as HTML, but is used as a database.

Preview

  1. Accessing & Structuring Datasets: Structure Your Data with Tables and Databases