The Data

To download: Fill out the registration form and then visit the data server.

To learn more about the data, see the linked pages below. Also be sure to check out:

EduceLab-Scrolls (2019): technical paper describing the original data.
EduceLab Data Sheet (2023): technical paper describing more recent scans added to the dataset.
Tutorials: what to do with the data.
Libraries for accessing the data in one or two lines of code: available in Python (with an intro notebook) and in C.

Scrolls

Micro-CT scans of intact Herculaneum scrolls. The mission is to virtually unwrap the scrolls from the CT scans, revealing the text hidden within. The 2023 Grand Prize was won using Scroll 1, but 95% of that scroll remains unread!

More information

Scroll 1 (PHerc. Paris. 4)

Scroll 2 (PHerc. Paris. 3)

Scroll 3 (PHerc. 332)

Scroll 4 (PHerc. 1667)

Scroll 5 (PHerc. 172)

Fragments

Micro-CT scans of detached scroll fragments. Since the fragments have exposed text on their surfaces, they can be used as ground truth for machine-learning-based ink detection approaches (see Tutorial 5: Ink Detection).
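
To make the ground-truth idea concrete, here is a minimal sketch of how a fragment's visible ink could be paired with its CT data to build training examples. It is an illustration only, not the method used in the tutorials: the function name `sample_labeled_patches`, the array layout, the patch size, and the synthetic stand-in data are all assumptions, and a real pipeline would load an actual fragment volume and its ink label image instead.

```python
import numpy as np


def sample_labeled_patches(volume, ink_mask, patch_size=8, n_samples=16, seed=0):
    """Draw (subvolume, label) pairs from a fragment scan.

    volume:   3D array (depth, height, width) of CT intensities near the surface.
    ink_mask: 2D boolean array (height, width), True where ink is visible.
    Both the layout and this helper are hypothetical, for illustration only.
    """
    rng = np.random.default_rng(seed)
    _, height, width = volume.shape
    half = patch_size // 2
    patches, labels = [], []
    for _ in range(n_samples):
        # Pick a centre far enough from the border to cut a full patch.
        y = rng.integers(half, height - half)
        x = rng.integers(half, width - half)
        patches.append(volume[:, y - half:y + half, x - half:x + half])
        # The label says whether the centre pixel is inked on the surface.
        labels.append(bool(ink_mask[y, x]))
    return np.stack(patches), np.array(labels)


# Synthetic stand-in data: random "CT" values and a square "ink" region.
rng = np.random.default_rng(42)
volume = rng.random((4, 64, 64), dtype=np.float32)
ink_mask = np.zeros((64, 64), dtype=bool)
ink_mask[20:40, 20:40] = True

patches, labels = sample_labeled_patches(volume, ink_mask)
print(patches.shape, labels.shape)
```

Each patch keeps the full depth of the volume, so a classifier trained on these pairs would see the CT signal through the papyrus surface alongside a label taken from the exposed text.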

More information