Working with Digital Data in Religious Studies

5. Accessing & Structuring Datasets: Structure Your Data with Tables and Databases

Summer Semester 2024
Prof. Dr. Nathan Gibson

Outline

  1. Text Formats: Review & Final Thoughts
    1. Hands-on Review: Markdown
    2. Review of Text: Plain Text, Markdown, Markup
    3. XML as Markup
      Break
  2. Tables & Databases
    1. Entities & Relationships: Activity 1
    2. Entities & Relationships: Activity 2
    3. Database Formats

1.1. 📈 Hands-on Review: Objective

Previous objective: Be able to add formatting or structure to your text with Markdown and markup (HTML and XML).

1.1. 📈 Hands-on Review: What is Markdown?

  • Plain text + special characters such as *, _, #, - to add simple formatting
  • Open & exportable to HTML, Word, PDF
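To make the idea of "plain text + special characters" concrete, here is a toy sketch of how a few Markdown characters could be turned into HTML. It is an illustration only, not how real Markdown converters work: the function names `inline` and `to_html` are made up for this example, and it handles just headings, list items, bold, and emphasis, one line at a time.

```python
import re

def inline(text):
    """Convert **bold** and *emphasis* / _emphasis_ inside one line."""
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"(\*|_)(.+?)\1", r"<em>\2</em>", text)
    return text

def to_html(line):
    """Convert a single Markdown line to HTML (toy version, not full Markdown)."""
    heading = re.match(r"(#{1,6}) (.+)", line)
    if heading:
        level = len(heading.group(1))  # number of # characters = heading level
        return f"<h{level}>{inline(heading.group(2))}</h{level}>"
    if line.startswith("- "):
        return f"<li>{inline(line[2:])}</li>"
    return f"<p>{inline(line)}</p>"

for line in ["# Sacred Texts", "- a **bold** and _emphasised_ point"]:
    print(to_html(line))
```

For real work you would use an existing converter rather than writing your own; the point here is only that the special characters are ordinary text until a program interprets them.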

1.1. 📈 Hands-on Review: What is markup?

  • Formatting or structured information added to text.
  • Usually HTML or XML (format <tag attribute="value">some text</tag>)
  • hierarchical
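The three points above (tags, attributes, hierarchy) can be seen by reading a small piece of markup with Python's built-in XML parser. This is a minimal sketch; the element and attribute names (`verse`, `book`, `n`, `name`, `type`) are invented for illustration and do not come from any particular standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet: tag and attribute names are made up for this example.
snippet = (
    '<verse book="Genesis" n="1">'
    '<name type="deity">God</name> created the heavens'
    '</verse>'
)

root = ET.fromstring(snippet)

print(root.tag)             # the outer tag
print(root.attrib["book"])  # an attribute value on that tag

# Elements nested inside the outer element are its children: the hierarchy.
for child in root:
    print(child.tag, child.attrib, child.text)
```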

1.2. 📈 Review of Text: Plain Text, Markdown, Markup

                        | Plain Text | Markdown        | HTML                   | XML
formats text            | X no       | ✔️ yes (simple) | ✔️ yes (sophisticated) | X not directly
structures data in text | X no       | X no            | ✔️ somewhat            | ✔️ yes

1.3. 📈 HTML vs. XML as Markup: HTML

Semantic (“meaningful”) information in HTML: metadata (title, keywords, date) & sometimes more

  • HTML is a standard for how content should be displayed
  • But it doesn’t tell you what that information is
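A short sketch of what "semantic information in HTML" looks like in practice: the metadata in the page head can be read out by a program, while the body markup only says how things should look. The page content and the class name `MetadataParser` are invented for this example; the parser itself is Python's built-in `html.parser`.

```python
from html.parser import HTMLParser

class MetadataParser(HTMLParser):
    """Collect the <title> text and any <meta name="..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.metadata[attrs["name"]] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = data.strip()

# Hypothetical page, written for this example.
page = """<html><head>
<title>Pilgrimage Routes</title>
<meta name="keywords" content="pilgrimage, ritual">
<meta name="date" content="2024-05-13">
</head><body><h1>Pilgrimage Routes</h1>
<p><b>Santiago</b> is a destination.</p></body></html>"""

parser = MetadataParser()
parser.feed(page)
print(parser.metadata)
```

Note that `<b>Santiago</b>` in the body only says "display this in bold"; nothing in the HTML says that Santiago is a place.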

1.3. 📈 HTML vs. XML as Markup: XML

XML is semantic.

  • It shows the structure or meaning of the information
  • But it doesn’t tell you how it should be displayed
  • XML has many standards or rule sets (“schemas”) to choose from. You can even invent your own!

1.3. 📈 HTML vs. XML as Markup: TEI

TEI (Text Encoding Initiative) provides one of these schemas.

  • invented by humanist scholars for working on texts (history, philology, literary studies, etc.)
  • a way to “highlight” certain things you want to study in a text (e.g., words, grammatical forms, textual variations, historical persons or places) and make it possible to “grab” them from the text for analysis
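As a sketch of what "grabbing" highlighted things from a text can look like, the example below pulls personal names and place names out of a small TEI-style fragment. `persName` and `placeName` are TEI elements and the namespace is the TEI one, but the sample sentence is written for this example and the fragment is not a complete, valid TEI document (it has no header).

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace, so searches must mention it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical fragment, written for this example.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName>Augustine</persName> travelled from
       <placeName>Carthage</placeName> to <placeName>Rome</placeName>.</p>
  </body></text>
</TEI>"""

root = ET.fromstring(sample)

# ".//" means "anywhere below this element".
persons = [el.text for el in root.iterfind(".//tei:persName", TEI_NS)]
places = [el.text for el in root.iterfind(".//tei:placeName", TEI_NS)]

print("Persons:", persons)
print("Places:", places)
```

Once names and places are lists like these, they can be counted, mapped, or compared across texts.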

1.3. 📈 HTML vs. XML as Markup: TEI Examples

Break

🧭 Today’s Learning Objective

Structure your data as “entities” and “relationships” using tables.

2.1. Entities & Relationships: Cat in the Hat Game

Thing 1 > does x to > Thing 2

2.2. Entities & Relationships: What are they?

  • entity: a thing in a database “with an identity independent of change to its attributes” (Wikipedia)
  • attribute: information about an entity
  • relationship: a statement linking two entities
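The three terms above can be sketched in a few lines of Python. All names and values here are placeholders invented for this example (`person1`, `film1`, "Example Film", and so on), and the layout is just one simple way to write entities and relationships down, not a fixed standard.

```python
# Entities: each has an id that stays the same even if its attributes change.
entities = {
    "person1": {"type": "person", "name": "Example Director"},
    "person2": {"type": "person", "name": "Example Actor"},
    "film1": {"type": "film", "title": "Example Film", "year": 2000},
}

# Relationships: statements of the form Thing 1 > does x to > Thing 2.
relationships = [
    ("person1", "directed", "film1"),
    ("person2", "acted in", "film1"),
]

def label(entity_id):
    """Return a human-readable label (name or title) for an entity."""
    entity = entities[entity_id]
    return entity.get("name") or entity.get("title")

def describe(relationship):
    """Spell out one relationship using the entities' current attributes."""
    subject, predicate, obj = relationship
    return f"{label(subject)} > {predicate} > {label(obj)}"

statements = [describe(r) for r in relationships]
print(statements)

# Changing an attribute does not change the entity's identity,
# so the relationship still points at the same film.
entities["film1"]["title"] = "Example Film (Director's Cut)"
after_change = describe(relationships[0])
print(after_change)
```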

2.2. Entities & Relationships: Religion in Film

IMDB: What entities, attributes, and relationships do you see there?

https://docs.google.com/spreadsheets/d/1j13FMOnXBvEqQnU9s1MuhSn1wcMf92c7O2Zc9wcUOYQ/edit?usp=sharing

2.3. Databases: What are they?

Images: Internet Archive Book Images, https://www.flickr.com/photos/internetarchivebookimages/14781001892/, no restrictions; ArnoldReinhold, own work, CC BY 2.5

2.3. Databases: What are they?

Tables of relationships or a model of the universe

2.3. Database Formats

Spreadsheet           | Relational Database                | XML                     | JSON
easy to create & edit | best for high-performance web apps | durable & transportable | great for connecting web apps
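To show how two of these formats relate, the sketch below stores the same kind of information first in a small relational database (separate tables linked by ids) and then exports the result as JSON. It uses Python's built-in `sqlite3` and `json` modules; the table names, column names, and rows are placeholders invented for this example.

```python
import json
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film   (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
-- The relationship table links people to films by their ids.
CREATE TABLE credit (person_id INTEGER REFERENCES person(id),
                     film_id   INTEGER REFERENCES film(id),
                     role      TEXT);
INSERT INTO film   VALUES (1, 'Example Film');
INSERT INTO person VALUES (1, 'Example Director'), (2, 'Example Actor');
INSERT INTO credit VALUES (1, 1, 'director'), (2, 1, 'actor');
""")

# A join follows the ids to bring the linked rows back together.
rows = conn.execute("""
SELECT person.name, credit.role, film.title
FROM credit
JOIN person ON person.id = credit.person_id
JOIN film   ON film.id   = credit.film_id
ORDER BY person.id
""").fetchall()
conn.close()

# The same information as JSON, ready to hand to another program.
records = [{"name": n, "role": r, "film": t} for n, r, t in rows]
as_json = json.dumps(records, indent=2)
print(as_json)
```

In a spreadsheet, the three tables would typically be three sheets, with the id columns doing the same linking work by hand.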

Preview

  6. Accessing & Structuring Datasets: Clean and Augment Your Data with OpenRefine