The Future of Humanities Research (Part II)
In last week’s post, I presented a list of advanced language tools that humanities scholars might find useful. That list was an attempt to initiate a conversation about the value, or lack thereof, of AI-enabled research tools in the humanities. This week I continue that exploration with a brief description of each opportunity. This topic, though somewhat prosaic, is important because little has been written about the practical utility of AI in the humanities. An exploration and a conversation need to happen, and I hope my small contribution will support that.
Search
Search has long been a feature of research systems. Think EBSCO, ProQuest, and others. These older systems typically provide this vital service through indexes. In recent years, that approach has been updated. Retrieval-augmented generation (RAG) systems, for instance, now support direct semantic search over a model’s word embeddings (see the section on Word Embeddings). The advantage of this approach lies in the technology’s ability to support less exact, fuzzier searches.
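To make the idea concrete, here is a toy sketch of embedding-based search in plain Python. The three-number vectors and document titles are invented for illustration; real systems use learned embeddings with hundreds of dimensions.

```python
import math

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Invented 3-dimensional "embeddings" for three imaginary documents.
corpus = {
    "letter on the harvest": [0.9, 0.1, 0.0],
    "diary of a sea voyage": [0.1, 0.8, 0.3],
    "sermon on charity": [0.2, 0.1, 0.9],
}

def search(query_vec, corpus):
    # Rank documents by similarity to the query vector, best match first.
    return sorted(corpus, key=lambda doc: cosine(query_vec, corpus[doc]), reverse=True)

results = search([0.85, 0.15, 0.05], corpus)  # a query vector "near" the harvest letter
print(results[0])  # -> letter on the harvest
```

Because ranking works on vector distance rather than exact keywords, a query about “crops” could surface the harvest letter even though the word never appears in its title.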
Sentiment Analysis
AI models can now assess the emotional tone of a text and assign a label or numerical score to classify it. Thus, sentiment analysis allows scholars to explore the emotional dimensions of history and the feelings people expressed as they wrote about significant events, social issues, or cultural movements. By examining sentiment patterns, one can identify similarities, differences, or shifts in emotional expression over time, providing insights into cultural or historical change. In a nutshell, sentiment analysis is a way of mapping the emotional terrain of a specific text or body of documents from a given period.
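A drastically simplified sketch of the lexicon-based approach, with an invented six-word lexicon; production tools rely on far larger lexicons or trained models:

```python
# An invented six-word lexicon; real tools use thousands of entries
# or a trained neural model.
LEXICON = {"joy": 1, "hope": 1, "triumph": 2, "grief": -2, "fear": -1, "loss": -1}

def sentiment(text):
    # Sum the scores of any lexicon words found; the sign gives the label.
    score = sum(LEXICON.get(w.strip(".,;!?"), 0) for w in text.lower().split())
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return score, label

print(sentiment("The war brought grief and loss."))  # -> (-3, 'negative')
```

Run over a dated corpus, scores like these are what make it possible to chart emotional shifts across decades.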
Translation
Translation involves all the tasks a scholar performs to convert a document into another language. Tools like Google Translate allow scholars to explore a wider range of materials and incorporate diverse (multilingual) perspectives into their research. Additionally, these tools facilitate communication among multinational research teams, enabling the exchange of ideas. AI-enabled translators are not yet as good as human translators. Even so, scholars who have not yet achieved fluency in a target language can now generate “good enough” translations on the fly.
Question Answering
Google has been able to answer simple questions for some time now. But only recently has that ability been augmented by large language models. Question answering (QA) technology is helpful when conducting literature reviews as it can search through a single document or multiple documents to answer questions. This is especially helpful when a scholar needs to identify relevant literature for a research project. QA technology can also extract specific information such as dates, names, locations, and events from unstructured texts.
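As a rough illustration of extractive question answering, the sketch below simply returns the document sentence that shares the most content words with the question. The treaty example and stop-word list are invented; real QA models reason far beyond word overlap.

```python
import re

STOP = {"the", "a", "an", "of", "in", "on", "was", "is", "when", "what", "who"}

def answer(question, document):
    # Return the document sentence sharing the most content words
    # with the question (a naive extractive approach).
    q_words = set(re.findall(r"\w+", question.lower())) - STOP
    sentences = re.split(r"(?<=[.!?])\s+", document)
    return max(sentences,
               key=lambda s: len(q_words & set(re.findall(r"\w+", s.lower()))))

doc = ("The treaty was signed in 1648. It ended decades of war. "
       "Negotiations took place in Münster and Osnabrück.")
print(answer("When was the treaty signed?", doc))  # -> The treaty was signed in 1648.
```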
Text Generation
Much of the success of large language models (LLMs) is due to their ability to generate human-like text in response to prompts. This ability is useful to the humanities in many ways. The most obvious is creative writing and composition. With well-designed prompts, scholars can explore writing styles, experiment with language, or simulate the literary voices of historical figures or authors.
Generative AI tools like Grammarly are helpful when one is word-smithing a document to achieve maximum clarity and precision. However, there is no substitute for the art of writing and thinking through the evidence. As William Zinsser (2013) puts it, writing is the best way to learn and master a topic. This most fundamental work should never be outsourced to a machine.
Parts of Speech (POS) Analysis
Parts of speech (POS) analysis seeks to reveal the underlying grammatical structure of a text. POS techniques are beneficial when a researcher seeks insight into an author’s writing style. Style is a function of sentence structure and of the verbs, adjectives, and adverbs an author favors. Hence, POS analysis is typically the first step in author attribution and genre classification.
POS analysis can also shed light on language change over time. By analyzing the frequency and usage of different parts of speech, scholars can track linguistic shifts, identify new syntactic patterns, and explore the evolution of word meanings.
Finally, POS analysis plays an essential role in comparative projects. Here, the scholar analyzes the distribution of nouns, pronouns, verbs, adverbs, and adjectives in multilingual corpora to identify similarities and differences between languages.
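The core move in these studies is comparing POS distributions. A minimal sketch, with the tags supplied by hand (a tagger such as NLTK’s or spaCy’s would normally produce them):

```python
from collections import Counter

def pos_profile(tagged_tokens):
    # Proportion of each part of speech in a tagged text.
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: round(n / total, 2) for tag, n in counts.items()}

# Hand-tagged tokens; a real tagger would supply these automatically.
author_a = [("the", "DET"), ("old", "ADJ"), ("sea", "NOUN"), ("roared", "VERB"),
            ("loudly", "ADV"), ("and", "CONJ"), ("endlessly", "ADV")]
profile = pos_profile(author_a)
print(profile["ADV"])  # -> 0.29 (two adverbs out of seven tokens)
```

Comparing such profiles across authors, periods, or languages is the quantitative backbone of attribution and comparative studies.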
Named Entity Recognition (NER)
Named Entity Recognition models can now identify personal names, organizations, and places in a document. NER supports indexing and annotation tasks as well as authorship analysis. In the case of authorship, researchers can study the influence, style, and evolution of specific authors with NER, including their cultural impact.
A similar kind of analysis can be done at the geographic level. NER can assist in geospatial analysis by identifying place names, such as cities, landmarks, or regions. This metadata, in turn, allows scholars to analyze spatial patterns, including the influence of specific locations on cultural or historical events.
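A crude stand-in for NER can be written in a few lines by collecting runs of capitalized words; real models go much further, classifying each entity as a person, place, or organization. The example sentence is illustrative.

```python
import re

def naive_entities(text):
    # Collect runs of capitalized words, skipping the word that opens
    # each sentence. Real NER models also label each entity by type.
    entities, run = [], []
    sentence_start = True
    for tok in re.findall(r"[A-Za-z]+|[.!?]", text):
        if tok in ".!?":
            if run:
                entities.append(" ".join(run))
                run = []
            sentence_start = True
            continue
        if tok[0].isupper() and not sentence_start:
            run.append(tok)
        elif run:
            entities.append(" ".join(run))
            run = []
        sentence_start = False
    if run:
        entities.append(" ".join(run))
    return entities

print(naive_entities("In 1832 Charles Darwin sailed from Plymouth. The voyage began."))
# -> ['Charles Darwin', 'Plymouth']
```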
Network Analysis
Network analysis tools represent text data as networks or graphs in which nodes represent entities (characters, authors, ideas) and edges represent the relationships between them. Network analysis comes in a variety of flavors. Here are three:
Social network analysis maps relationships among organizations, groups, or authors. It’s useful for exploring historical networks, literary circles, intellectual collaborations, and social interactions between individuals. When a network is visualized, scholars can uncover influence patterns, identify key figures, examine information flow, and understand the dynamics of intellectual communities.
Bibliometric analysis is the application of network analysis to bibliometric data (citation and co-authorship networks) to uncover scholarly communication patterns. A typical bibliometric study examines not just the networks but also the diffusion and uptake of ideas.
Spatial network analysis is like social network analysis, except its focus is geographic – exploring trade networks, migration patterns, or cultural exchanges.
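The arithmetic behind “identify key figures” is often simple degree counting. A toy sketch over an invented correspondence network (the names are borrowed from the Bloomsbury circle purely for flavor):

```python
from collections import defaultdict

# An invented correspondence network: an edge means two writers
# exchanged letters (names borrowed from the Bloomsbury circle).
edges = [("Woolf", "Strachey"), ("Woolf", "Keynes"), ("Woolf", "Bell"),
         ("Strachey", "Keynes"), ("Bell", "Fry")]

def degree_centrality(edges):
    # Count each node's connections; well-connected nodes are
    # candidate "key figures" in the network.
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)

centrality = degree_centrality(edges)
print(max(centrality, key=centrality.get))  # -> Woolf
```

Dedicated libraries such as NetworkX add richer measures (betweenness, clustering, community detection) on top of this same graph representation.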
Summarization
ChatGPT and other generative AI systems do a pretty good job of summarizing documents and articles, making this a valuable tool when conducting literature reviews. With it, scholars can quickly grasp the key arguments, findings, and methods used in a body of research literature without having to read each document in its entirety. This enables them to efficiently identify key references, achieve a broad overview of existing scholarship, and discover gaps in the literature.
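Long before LLMs, extractive summarizers did this work by scoring sentences. A minimal frequency-based sketch, with an invented three-sentence text:

```python
import re
from collections import Counter

def summarize(text, n=1):
    # Extractive summary: score each sentence by the frequency of its
    # words across the whole text and keep the top n, in original order.
    freq = Counter(re.findall(r"\w+", text.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", text)
    ranked = sorted(sentences, reverse=True,
                    key=lambda s: sum(freq[w] for w in re.findall(r"\w+", s.lower())))
    keep = set(ranked[:n])
    return " ".join(s for s in sentences if s in keep)

text = ("The archive preserves letters. "
        "The archive preserves letters and diaries from the war. "
        "Visitors arrive daily.")
print(summarize(text))  # -> The archive preserves letters and diaries from the war.
```

LLM summarizers go well beyond this, paraphrasing and condensing rather than merely selecting sentences, but the goal is the same: surface the gist without reading everything.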
Topic Modeling (TM)
Topic Modeling (TM) is a statistical technique that extracts underlying themes from a collection of documents. TM facilitates historical research by enabling scholars to observe the evolution of ideas, social movements, cultural shifts, and literary styles during a specific period. It’s a comparative analysis tool, helpful when one wants to explore thought patterns, connections, and influences between literary works and cultures. Literature reviews, classification, and author attribution tasks can all benefit from topic modeling.
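True topic models such as LDA infer latent themes probabilistically, but the underlying intuition (themes surface as distinctive vocabulary) can be sketched with plain TF-IDF. The three miniature documents below are invented:

```python
import math
import re
from collections import Counter

def top_terms(docs, k=2):
    # TF-IDF scoring: words frequent in one document but rare across
    # the corpus score highest, so each document's "theme words" surface.
    tokenized = [re.findall(r"\w+", d.lower()) for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))
    top = []
    for toks in tokenized:
        tf = Counter(toks)
        scores = {w: tf[w] * math.log(len(docs) / df[w]) for w in tf}
        top.append([w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]])
    return top

docs = ["the harvest failed and the famine spread",
        "the fleet sailed and the harbor emptied",
        "famine spread across the land"]
print(top_terms(docs))
```

Libraries such as gensim or scikit-learn provide genuine LDA implementations that scale this intuition to thousands of documents.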
Text Similarity Analysis (TSA)
Text similarity analysis (TSA) algorithms calculate how similar two documents are to each other. TSA has a wide variety of applications. Here are just a few:
Discover patterns, motifs, or recurring themes in a body of literature.
Cluster like documents, do topic modeling, and automatically classify texts.
Explore variation in texts, trace influences, or examine intertextuality.
Identify discontinuities in a document series.
In brief, TSA allows researchers to explore literary connections and influences by identifying similarities in writing style, character development, and narrative structure.
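One of the simplest TSA measures is Jaccard similarity over each text’s vocabulary. A sketch with invented sentences:

```python
import re

def jaccard(a, b):
    # Jaccard similarity: shared vocabulary over combined vocabulary.
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    return len(wa & wb) / len(wa | wb)

t1 = "The king rode out at dawn."
t2 = "At dawn the king rode forth."
t3 = "Grain prices rose sharply that winter."
print(round(jaccard(t1, t2), 2))  # -> 0.71 (high overlap)
print(jaccard(t1, t3))            # -> 0.0 (no shared words)
```

Production systems typically use embedding-based cosine similarity instead, which catches paraphrase and synonymy that word-overlap measures miss.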
Word Embeddings
A word embedding is a numerical representation of a word drawn from a corpus of documents. Embeddings occupy a high-dimensional space because each word can have hundreds, even thousands, of numbers linked to it; each number is a single dimension. Word embeddings are valuable research tools because words used in similar contexts end up close together in that space, which helps clarify the meaning of an ambiguous word from the company it keeps. This ability allows researchers to explore word meanings and their changing usage patterns over time.
Word embeddings can also group and categorize documents in a large corpus. This is done by calculating the average of all the word embeddings for a given document and then comparing that to other documents. Lastly, word embeddings enable researchers to achieve a much finer level of analysis in various research settings, including opinion mining, genre/period analysis, or cross-lingual investigations.
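A toy sketch of the averaging step, with hand-made two-dimensional vectors standing in for real learned embeddings:

```python
# Hand-made 2-dimensional word vectors; real embeddings are learned
# from data and have hundreds of dimensions.
VECTORS = {
    "war": [0.9, 0.1], "battle": [0.8, 0.2], "army": [0.85, 0.1],
    "love": [0.1, 0.9], "heart": [0.2, 0.8], "sonnet": [0.1, 0.85],
}

def doc_vector(words):
    # Average the word vectors to get one vector for the whole document.
    vecs = [VECTORS[w] for w in words if w in VECTORS]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

chronicle = doc_vector(["war", "battle", "army"])
poems = doc_vector(["love", "heart", "sonnet"])
print(chronicle, poems)  # the two documents point in different directions
```

Once every document is reduced to one vector, clustering and classification become straightforward distance calculations.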
Logical Flow
Logical flow analysis is a way of discovering a document’s underlying logic, be it a screenplay, story, poem, or any other kind of text. A key technology in this space is dependency parsing (DP), a technique that analyzes the structure of sentences by identifying grammatical relationships between words. Specifically, DP represents these relationships as links (dependencies) in a dependency parse tree. A parse tree, in turn, permits scholars to see the logical flow of a sentence or document. A similar kind of analysis can be done at the discourse level, using parsing tools to discover relationships between text sections.
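A dependency parse is, at bottom, a set of (child, relation, head) triples. The sketch below hard-codes a parse that a real parser (e.g. spaCy or Stanza) would produce automatically:

```python
# A hand-written dependency parse of "The scribe copied the manuscript."
# A real parser would produce these (child, relation, head) triples.
parse = [
    ("The", "det", "scribe"),
    ("scribe", "nsubj", "copied"),
    ("copied", "root", None),
    ("the", "det", "manuscript"),
    ("manuscript", "obj", "copied"),
]

def children(head, parse):
    # Walk one level down the tree from a head word.
    return [(child, rel) for child, rel, h in parse if h == head]

root = next(word for word, rel, _ in parse if rel == "root")
print(root, children(root, parse))
# -> copied [('scribe', 'nsubj'), ('manuscript', 'obj')]
```

Reading the tree from the root down shows who did what to whom, which is precisely the “logical flow” the section describes.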
Speech to Text / Text to Speech
Speech-to-text technology enables scholars to quickly create initial drafts of articles, research proposals, and other academic documents. With text-to-speech technology, teachers can quickly generate sound files in target languages without having to do any recording themselves. Audio resources are helpful to students who want to improve their speaking skills. Additionally, text-to-speech is a critical technology in virtual learning worlds where interactive digital humans are needed, specifically digital beings who can talk.
Transcription
Transcription is the process whereby a scholar converts a document, typically a handwritten one, into machine-readable text. Before the invention of the printing press, scribes penned books in a variety of hands. Often, they wrote in cursive, in a style specific to a particular region. Modern scholars must, therefore, learn these writing systems in paleography workshops and courses. Recently, AI models from Transkribus and others have proven helpful in transcription work. Once trained on a corpus of documents in a unique writing style or from a specific region, a model can “read” images of the originals and generate transcriptions automatically.
Generative AI
AI-enabled systems like DALL·E and DreamStudio can generate photo-realistic images. For example, a scholar might ask a generative AI system to create an image of a scene or a character from a historical text, using the author’s written description as a prompt. Although this is not a language application per se, it could prove helpful in the classroom, allowing students to visualize what they’re reading.