Working with Digital Data in Religious Studies
5. Accessing & Structuring Datasets: Structure Your Data with Tables and Databases
Summer Semester 2024
Prof. Dr. Nathan Gibson
Outline
- Text Formats: Review & Final Thoughts
- Hands-on Review: Markdown
- Review of Text: Plain Text, Markdown, Markup
- XML as Markup
–Break–
- Tables & Databases
- Entities & Relationships: Activity 1
- Entities & Relationships: Activity 2
- Database Formats
1.1. 📈 Hands-on Review: Objective
Last objective: Be able to add formatting or structure to your text with Markdown and markup (HTML and XML).
1.1. 📈 Hands-on Review: What is Markdown?
- Plain text + special characters such as *, _, #, and - to add simple formatting
- Open & exportable to HTML, Word, PDF
1.1. 📈 Hands-on Review: What is markup?
- Formatting or structured information added to text.
- Usually HTML or XML (format: <tag attribute="value">some text</tag>)
- Hierarchical: tags can be nested inside other tags
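To make the tag format concrete, here is a minimal sketch using Python's standard library module xml.etree.ElementTree. The first snippet is the one from the slide; the nested book/chapter snippet and its names are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# The generic pattern from the slide: one tag, one attribute, some text.
snippet = '<tag attribute="value">some text</tag>'
element = ET.fromstring(snippet)

print(element.tag)     # the tag name
print(element.attrib)  # the attributes, as a dictionary
print(element.text)    # the text between the opening and closing tag

# Hierarchy: an element can contain other elements.
# The element and attribute names here are hypothetical examples.
nested = ET.fromstring(
    '<book title="Example">'
    '<chapter n="1">First</chapter>'
    '<chapter n="2">Second</chapter>'
    '</book>'
)

# Iterating over an element visits its direct children.
chapter_numbers = [child.get("n") for child in nested]
print(chapter_numbers)
```

Because the markup is hierarchical, a program can walk from a parent element to its children instead of searching through raw text.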
1.2. 📈 Review of Text: Plain Text, Markdown, Markup
| | Plain Text | Markdown | HTML | XML |
|---|---|---|---|---|
| formats text | X no | ✔️ yes (simple) | ✔️ yes (sophisticated) | X not directly |
| structures data in text | X no | X no | ✔️ somewhat | ✔️ yes |
1.3. 📈 HTML vs. XML as Markup: HTML
Semantic (“meaningful”) information in HTML: metadata (title, keywords, date) & sometimes more
- HTML is mainly a standard for how content should be displayed
- But it tells you little about what that content is
1.3. 📈 HTML vs. XML as Markup: XML
XML is semantic.
- It shows the structure or meaning of the information
- But it doesn’t tell you how it should be displayed
- XML has many standards or rule sets (“schemas”) to choose from. You can even invent your own!
1.3. 📈 HTML vs. XML as Markup: TEI
TEI (Text Encoding Initiative) provides one of these schemas.
- Developed by humanities scholars for working with texts (history, philology, literary studies, etc.)
- a way to “highlight” certain things you want to study in a text (e.g., words, grammatical forms, textual variations, historical persons or places) and make it possible to “grab” them from the text for analysis
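As a rough illustration of "highlighting" and then "grabbing" things from a text, the sketch below tags a person and two places with the TEI elements persName and placeName and collects them with Python's xml.etree.ElementTree. The sample sentence is invented for this example, and the snippet is deliberately incomplete as a TEI document (a real file would also need a teiHeader).

```python
import xml.etree.ElementTree as ET

# TEI elements live in this XML namespace; the "tei" prefix is our own label.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample text, not taken from a real edition.
sample = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p><persName>Augustine</persName> taught in
      <placeName>Carthage</placeName> before moving to
      <placeName>Milan</placeName>.</p>
    </body>
  </text>
</TEI>
""".strip()

root = ET.fromstring(sample)

# ".//" searches the whole tree below the root for matching elements.
persons = [el.text for el in root.findall(".//tei:persName", TEI_NS)]
places = [el.text for el in root.findall(".//tei:placeName", TEI_NS)]

print(persons)
print(places)
```

Once names and places are marked up this way, they can be counted, mapped, or linked to other records without rereading the whole text.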
1.3. 📈 HTML vs. XML as Markup: TEI Examples
🧭 Today’s Learning Objective
Structure your data as “entities” and “relationships” using tables.
2.1. Entities & Relationships: Cat in the Hat Game
Thing 1 > does x to > Thing 2
2.2. Entities & Relationships: What are they?
- entity: a thing in a database “with an identity independent of change to its attributes” (Wikipedia)
- attribute: information about an entity
- relationship: a statement linking entities
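One possible way to picture these three terms in code, assuming Python dataclasses: each entity has a fixed id, its attributes sit in a dictionary that may change, and a relationship links two ids in the "Thing 1 > does x to > Thing 2" pattern. The class names, the attribute values, and the verb "chases" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    id: int           # the identity: stays the same even if attributes change
    attributes: dict  # information about the entity


@dataclass(frozen=True)
class Relationship:
    subject_id: int   # the entity doing something
    predicate: str    # what it does
    object_id: int    # the entity it is done to


thing1 = Entity(id=1, attributes={"name": "Thing 1", "hair": "blue"})
thing2 = Entity(id=2, attributes={"name": "Thing 2", "hair": "blue"})
entities = {e.id: e for e in (thing1, thing2)}

relationships = [Relationship(subject_id=1, predicate="chases", object_id=2)]

# An attribute changes, but the entity keeps its id,
# so the relationship still points to the same thing.
thing1.attributes["hair"] = "green"


def describe(rel: Relationship) -> str:
    """Render a relationship as 'subject > predicate > object'."""
    subject = entities[rel.subject_id].attributes["name"]
    obj = entities[rel.object_id].attributes["name"]
    return f"{subject} > {rel.predicate} > {obj}"


sentences = [describe(rel) for rel in relationships]
print(sentences)
```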
2.3. Databases: What are they?
Tables of relationships or a model of the universe
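A small sketch of "tables of relationships", assuming an in-memory SQLite database via Python's built-in sqlite3 module. The table names, column names, and sample rows are invented for illustration; a real project would design its own.

```python
import sqlite3

# ":memory:" creates a temporary database that disappears when closed.
conn = sqlite3.connect(":memory:")

# One table for entities, one table for the relationships between them.
conn.executescript("""
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL
);
CREATE TABLE relationship (
    subject_id INTEGER NOT NULL REFERENCES entity(id),
    predicate  TEXT NOT NULL,
    object_id  INTEGER NOT NULL REFERENCES entity(id)
);
""")

conn.executemany(
    "INSERT INTO entity (id, name, kind) VALUES (?, ?, ?)",
    [(1, "Thing 1", "character"), (2, "Thing 2", "character"), (3, "kite", "object")],
)
conn.executemany(
    "INSERT INTO relationship (subject_id, predicate, object_id) VALUES (?, ?, ?)",
    [(1, "flies", 3), (2, "chases", 1)],
)

# Join the relationship table to the entity table twice:
# once for the subject and once for the object.
rows = conn.execute("""
SELECT s.name, r.predicate, o.name
FROM relationship AS r
JOIN entity AS s ON s.id = r.subject_id
JOIN entity AS o ON o.id = r.object_id
ORDER BY r.rowid
""").fetchall()
conn.close()

for subject, predicate, obj in rows:
    print(f"{subject} > {predicate} > {obj}")
```

The relationship table stores only ids, so renaming an entity in one place updates every statement that mentions it.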
Preview
- Accessing & Structuring Datasets: Clean and Augment Your Data with OpenRefine