Craig Messner, JHU (various)
Penn Libraries, April 9th 2026
LLMs are successful due to their multitask capability.
"Language Models are Unsupervised Multitask Learners"
Radford et al., 2019
LLMs outperform traditional NLP pipelines, and they can be tuned in natural language
via in-context learning.
What if I want to extract Italian city names from text?
["Maria", "traveled", "from", "Rome", "to", "Florence", "last", "summer"]
[NNP, VBD, IN, NNP, TO, NNP, JJ, NN]
Maria: PERSON, Rome: GPE, Florence: GPE
This requires expertise, time, and sometimes custom models.
Extract all Italian city names from the following text.
Text: "Maria traveled from Rome to Florence last summer."
Italian cities:
["Rome", "Florence"]
Previously, each NLP task required supervised training on its own distinctly labeled dataset.
Summarization
Sentiment Analysis
Machine Translation
LLMs can even adapt to new tasks not seen during training.
By prompting in a zero-shot or many-shot fashion, we recruit the model's in-context learning ability.
No expensive gradient training required.
The most common way users interact with LLMs is through a web-based chat interface.
The dialogic model offers few options for automatable, testable results.
Task: Extract US cities founded before 1830 and analyze sentiment.
Extract US cities founded before 1830 and the sentiment of their context.
Text: "Boston was magnificent, but Denver felt dreary."
Output:
Extract US cities founded before 1830 and the sentiment of their context.
Text: "New York was thrilling but Seattle seemed dull."
Output: [{"city": "New York", "sentiment": "positive"}]
Text: "Baltimore seemed grim."
Output: [{"city": "Baltimore", "sentiment": "negative"}]
Text: "Boston was magnificent, but Denver felt dreary."
Output:
[{"city": "Boston", "sentiment": "positive"}]
Denver (founded 1858) is correctly excluded.
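The few-shot prompt above can be assembled programmatically rather than typed by hand. A minimal sketch (the `build_prompt` helper and `EXAMPLES` list are illustrative names, not from any particular library):

```python
# Assemble a few-shot prompt: instruction, worked examples, then the target text.
INSTRUCTION = "Extract US cities founded before 1830 and the sentiment of their context."

EXAMPLES = [
    ('"New York was thrilling but Seattle seemed dull."',
     '[{"city": "New York", "sentiment": "positive"}]'),
    ('"Baltimore seemed grim."',
     '[{"city": "Baltimore", "sentiment": "negative"}]'),
]

def build_prompt(target_text: str) -> str:
    parts = [INSTRUCTION]
    for text, output in EXAMPLES:
        parts.append(f"Text: {text}\nOutput: {output}")
    # The trailing "Output:" leaves the completion slot for the model to fill.
    parts.append(f'Text: "{target_text}"\nOutput:')
    return "\n".join(parts)

print(build_prompt("Boston was magnificent, but Denver felt dreary."))
```

Keeping the examples in a list makes it easy to vary the number of shots and test their effect on output quality.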
This is powerful, but also poses a unique challenge.
Output can be convincing but misleading.
Let's discuss:
A major advantage of LLMs is performing operations at scale.
Ask yourself:
Use LLM inference to zero-shot identify the first publication date of historical printed works.
Dataset: Work-author combinations from WikiData.
| Author | Title | Pub |
|---|---|---|
| Amelia Opie | Adeline Mowbray; or, The Mother and Daughter | |
| Barbara Hofland | The Barbadoes Girl: A Tale for Young People | |
Each datapoint: a work-title pair with a blank for the publication date.
Identify a dataset tied to your field.
Data often represented in a semi-structured form.
Author,Title,Pub
Amelia Opie,"Adeline Mowbray; or, The Mother and Daughter",
Barbara Hofland,"The Barbadoes Girl: A Tale for Young People",
[
{
"author": "Amelia Opie",
"title": "Adeline Mowbray; or, The Mother and Daughter",
"pub": ""
},
{
"author": "Barbara Hofland",
"title": "The Barbadoes Girl: A Tale for Young People",
"pub": ""
}
]
These formats allow us to programmatically feed data alongside prompts into the LLM.
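Both formats are easy to parse with standard tooling. A minimal sketch using Python's standard library (the CSV is embedded as a string here for illustration; in practice you would read it from a file):

```python
import csv
import io
import json

CSV_DATA = '''Author,Title,Pub
Amelia Opie,"Adeline Mowbray; or, The Mother and Daughter",
Barbara Hofland,"The Barbadoes Girl: A Tale for Young People",
'''

# csv.DictReader yields one dict per datapoint, keyed by the header row.
# Quoting protects the comma inside "Adeline Mowbray; or, The Mother and Daughter".
rows = list(csv.DictReader(io.StringIO(CSV_DATA)))

# The same records in the JSON shape shown above.
records = [{"author": r["Author"], "title": r["Title"], "pub": r["Pub"]} for r in rows]
print(json.dumps(records, indent=2))
```

Either representation gives you a list of records to iterate over when sending prompts.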
Design a prompt with:
Determine the first date of
publication of the following work by the given author.
Return the date as a valid JSON object with the single field "date".
Author: {author}
Work: {title}
The placeholders {author} and {title} are filled programmatically for each datapoint.
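Filling the placeholders is a one-liner with Python's `str.format`. A minimal sketch over the two example datapoints:

```python
# Prompt template with {author} and {title} placeholders.
PROMPT_TEMPLATE = (
    "Determine the first date of publication of the following work "
    "by the given author. Return the date as a valid JSON object "
    'with the single field "date".\n'
    "Author: {author}\n"
    "Work: {title}"
)

datapoints = [
    {"author": "Amelia Opie", "title": "Adeline Mowbray; or, The Mother and Daughter"},
    {"author": "Barbara Hofland", "title": "The Barbadoes Girl: A Tale for Young People"},
]

# One filled prompt per datapoint, ready to send to the model.
prompts = [PROMPT_TEMPLATE.format(**d) for d in datapoints]
print(prompts[0])
```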
Design a prompt over the datapoints from your dataset.
Tips:
Three options for getting inference from an LLM:
Consider generation hyperparameters for replicability.
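The hyperparameters to pin down can be collected in one place. A sketch of commonly exposed settings; exact parameter names and support vary by provider and library, so check your API's documentation:

```python
# Generation settings aimed at replicability (names are typical, not universal).
GENERATION_CONFIG = {
    "temperature": 0.0,  # near-greedy decoding: the most reproducible choice
    "top_p": 1.0,        # disable nucleus-sampling truncation
    "max_tokens": 32,    # a date is short; cap the response length
    "seed": 42,          # some APIs accept a seed for best-effort determinism
}
```

Recording this config alongside the model name and version lets others rerun your extraction and compare results.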
How do we make sure the model has correctly performed our task?
We need to quantify the error rate.
| Author | Title | Ground Truth | Model Output |
|---|---|---|---|
| Amelia Opie | Adeline Mowbray | 1804 | 1804 |
| Barbara Hofland | The Barbadoes Girl | 1816 | 1832 |
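Given a table like the one above, the error rate is a simple comparison of model output against ground truth. A minimal sketch over the two example rows:

```python
# Each record pairs the ground-truth date with the model's answer.
results = [
    {"title": "Adeline Mowbray", "truth": "1804", "model": "1804"},
    {"title": "The Barbadoes Girl", "truth": "1816", "model": "1832"},
]

errors = [r for r in results if r["model"] != r["truth"]]
error_rate = len(errors) / len(results)
print(f"error rate: {error_rate:.0%}")  # → error rate: 50%
```

In practice you would compute this over a labeled evaluation sample large enough to estimate the rate reliably, not the whole (unlabeled) dataset.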
Issues for the researcher to keep in mind: