Efforts can be grouped into two categories:
- Segmentation: the mapping of sheets of papyrus (“segments”) in a 3D X-ray volume. See Tutorial 3 and Data > Segments.
- Ink Detection: detecting ink in segments. See Tutorial 4.
Segmentation
Segmentation is the mapping of sheets of papyrus (“segments”) in a 3D X-ray volume (see Tutorial 3). The community has built various tools to do this.
We have also set up a small team of contractors producing segments. These are available for anyone to download. See Data > Segments for more information.
What are people working on?
- Segmenting the new scans
- Better tooling for dealing with the higher resolutions and multiple volumes
- Current state of the art: Philip’s Volume Cartographer fork
- Better tools for high-accuracy segmentation or correcting
- Current state of the art: Khartes
- Autosegmentation (minimal human input)
- Merging of segments
- Visualizing segments both in 3D and in flattened form
- Current state of the art: Segment Viewer
The main tool that the segmentation team (contractors and volunteers) currently uses is Volume Cartographer.
This tool was originally created by EduceLab (in particular Seth Parker) and is now being improved by both EduceLab and the Vesuvius Challenge community, gaining more accurate and faster segmentation algorithms as well as various UI improvements.
- Tutorial 3. We have an in-depth tutorial on how to use Volume Cartographer.
- Original Volume Cartographer repo. The original version by Seth Parker and others.
- Julian’s Volume Cartographer fork. The segmentation team previously used this version by Julian Schilliger (@RICHI on Discord).
- Philip’s Volume Cartographer fork. The segmentation team currently uses this version by Philip Allgaier (@spacegaier on Discord).
- The Segmenter’s Guide to Volume Cartographer (for contractors). For more technical details about how the segmentation team operates, check out this doc.
- Volume Cartographer’s Apps and Utilities page. Also be sure to check out the various other docs in this directory.
- Data Processing Workflow doc. Lives in the ink-id repo, but is mostly about Volume Cartographer and segmentation. It also covers how to do alignment (“registration”) of infrared photos of the fragments, and how to create binary ink labels.
- Ben’s Segmentation Tutorial. Ben (@Hari_Seldon on Discord) goes into great detail on how to do segmenting.
- JP’s Segmentation Party. JP does a bunch of segmentation.
- .vcps parser. Useful code if you’re working with the custom .vcps data format.
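For illustration, here is a minimal reader for that format. This is a hedged sketch based on the commonly described .vcps layout (an ASCII `key: value` header terminated by a `<>` line, followed by little-endian binary doubles); defer to the parser linked above for authoritative handling:

```python
import struct

def parse_vcps(path):
    """Parse a Volume Cartographer .vcps point-set file.

    Assumed layout (sketch, not authoritative): ASCII header lines of
    the form 'key: value' terminated by a line containing '<>', then
    width * height * dim little-endian doubles of point data.
    """
    header = {}
    with open(path, "rb") as f:
        # Read header lines until the '<>' terminator.
        while True:
            line = f.readline().decode("ascii").strip()
            if line == "<>":
                break
            key, _, value = line.partition(":")
            header[key.strip()] = value.strip()
        width = int(header["width"])
        height = int(header["height"])
        dim = int(header["dim"])
        count = width * height * dim
        values = struct.unpack("<" + "d" * count, f.read(count * 8))
    # Group the flat value list into points of length `dim` (x, y, z).
    points = [values[i:i + dim] for i in range(0, count, dim)]
    return header, points
```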
Khartes
A new tool created by Chuck (@khartes_chuck on Discord) which allows for very precise segmentation, using real-time texturing UIs that show the segment from different angles. It is used on occasion by the segmentation team.
VC Whiteboard and Segment Viewer
These are used by the segmentation team primarily to see which segments they have worked on already. We host them respectively here and here. The datasets that these live versions pull from are updated about once per month.
A web-based tool to browse the layers and open-source ink-detection results of all released segments.
A reimplementation of Volume Cartographer in Python by Moshe Levy (@moshelevy on Discord). Not quite feature-complete yet (in particular it’s missing Julian’s improvements), but long term this might be easier to maintain than the C++ codebase of Volume Cartographer.
QuickSegment
Created by EduceLab for annotating a large air gap in Scroll 1, and then projecting from that gap to either side to create two large segments, colloquially referred to as the “Monster Segment”. It hasn’t been used for more segmentation, since that was the only large air gap we could find.
QuickSegment has a built-in tutorial, which you can find under “Help” (in the menu bar) => “Tutorial”. Do note that QuickSegment only works on thumbnail volumes (e.g. /volumes_small/ renamed to /volumes/), with the resulting segments then scaled up.
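The scale-up step amounts to multiplying segment point coordinates by the thumbnail’s downscale factor. A minimal sketch (the function name and factor below are illustrative, not the actual tooling):

```python
def scale_mesh(vertices, factor):
    """Scale points from thumbnail-volume voxel coordinates back to
    full-resolution coordinates. `factor` is the downscale factor of
    the thumbnail volume (hypothetical helper, for illustration only).
    """
    return [(x * factor, y * factor, z * factor) for (x, y, z) in vertices]

# e.g. a segment traced in a 10x-downscaled thumbnail volume
full_res = scale_mesh([(10.0, 20.0, 30.0)], 10)
```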
A couple of folks have written data transformations that allow tools to more efficiently load data from the server.
- Masked Slices, by James Darby (@thatGuy on Discord). Versions of the scroll slices with irrelevant data masked out, which leads to about 2x smaller files when applying compression. The tradeoff is that loading the files is a bit slower because of the decompression. Used for /volumes_masked/ on the data server.
- Vesuvius-Build, by Santiago Pelufo (@spelufo on Discord). Used for /volume_grids/ on the data server.
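The idea behind masking is easy to demonstrate: setting irrelevant voxels to a constant makes slices compress much better. A toy sketch with synthetic data (zlib stands in for whatever compression the server files actually use):

```python
import zlib

import numpy as np

# Toy illustration: a noisy uint16 "scan" slice is nearly incompressible,
# but masking the region outside the scroll to zeros shrinks it a lot.
rng = np.random.default_rng(0)
slice_ = rng.integers(0, 65535, size=(512, 512), dtype=np.uint16)

mask = np.zeros((512, 512), dtype=bool)
mask[128:384, 128:384] = True        # pretend only the center holds papyrus
masked = np.where(mask, slice_, 0)   # zero out the irrelevant voxels

raw_size = len(zlib.compress(slice_.tobytes(), level=6))
masked_size = len(zlib.compress(masked.tobytes(), level=6))
```

The masked slice decompresses to the same shape and dtype; the cost, as noted above, is the extra decompression work at load time.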
Various standard tools exist for viewing 3D data. Some have been augmented to support the file formats that we’re dealing with.
- Meshlab. Most useful for viewing segments. We introduce how to use this with our data in Tutorial 2 and subsequent tutorials.
- ImageJ/Fiji. Useful for viewing surface volumes. We introduce how to use this with our data in Tutorial 2 and subsequent tutorials.
- Blender. Generic 3D program. Adapted for segment viewing by Santiago Pelufo (@spelufo on Discord). Tutorial can be found here and here.
- ilastik. Generic segmentation toolkit. Adapted by Santiago Pelufo (@spelufo on Discord) for use with our volumes.
- 3D Slicer. Check out this tutorial by James Darby (@thatGuy on Discord).
Ink Detection
There are two major avenues people have been pursuing for detecting ink in the scrolls.
- Fragment-based. Training ML models on fragments, then running them on scroll segments. This is the method we originally envisioned and created prizes around, like the Ink Detection prize on Kaggle. It resulted in Youssef’s First Letters Prize results.
- Crackle-based. Searching the scrolls for the “crackle pattern” discovered by Casey Handmer. It resulted in Luke’s First Letters Prize results.
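As a toy illustration of the fragment-based recipe: X-ray intensity patches are the input, and infrared-derived binary ink labels are the target. Real entries typically use 3D CNNs on surface volumes; here a logistic regression on synthetic, flattened patches stands in:

```python
import numpy as np

# Synthetic stand-in for the fragment setup: 200 flattened 8x8 patches
# with binary "ink" labels that are a linear function of the patch.
rng = np.random.default_rng(0)
n, patch = 200, 8 * 8
X = rng.normal(size=(n, patch))
w_true = rng.normal(size=patch)
y = (X @ w_true > 0).astype(float)   # plays the role of infrared labels

# Plain gradient descent on the logistic loss.
w = np.zeros(patch)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * X.T @ (p - y) / n

pred = 1.0 / (1.0 + np.exp(-(X @ w))) > 0.5
accuracy = (pred == (y > 0.5)).mean()
```

The real difficulty, which this sketch hides, is that fragment models must generalize from fragments to segments inside the scrolls.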
What are people working on?
Fragment-based ink detection
- Youssef’s First Letters Prize model. By Youssef Nader (@YoussefNader on Discord).
- Scroll pretraining. Youssef’s original idea for pretraining on the scrolls and finetuning on the fragments, which led him to winning the First Letters Prize.
- Ryan Chesler’s analysis. From the #1 Kaggle team, Ryan Chesler analyzed retraining their model on 8µm data and applying it to the Monster Segment.
- OverthINKingSegmenter’s analysis. Another analysis from the #7 Kaggle team, on the importance of resolution.
- Stephen Parsons’ PhD dissertation. Lots and lots of gems in here. And of course we’d be remiss not to mention his original ink-id software.
Before the First Letters Prize result, we ran an Ink Detection prize on Kaggle. These are the top 10 results:
- 1st place: ryches. Writeup / Github / Inference notebook / Presentation
- 2nd place: RTX23090. Writeup / Github / Inference notebook / Presentation
- 3rd place: wuyu. Writeup / Github / Inference notebook / Presentation
- 4th place: POSCO DX - Heeyoung Ahn. Writeup / Github / Inference notebook / Presentation
- 5th place: Aksell. Writeup / Github / Inference notebook
- 6th place: chumajin. Writeup / Github / Inference notebook / Presentation
- 7th place: OverthINKingSegmenter. Writeup / Github / inference in repo
- 8th place: Luck is all you need. Writeup / Github / inference in repo / Presentation
- 9th place: still 1 fold, 2 net. Writeup / Github / Inference notebook / Presentation
- 10th place: Feng Qilong. Writeup / Github / Inference notebook / Presentation
Crackle-based ink detection
Casey Handmer discovered a “crackle pattern” in Scroll 1, which appears to be ink.
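As a generic illustration (this is not Casey’s actual method): hunting for a surface texture like crackle can be framed as computing a local-contrast map over a flattened segment and flagging unusually textured regions:

```python
import numpy as np

def local_contrast(img, k=4):
    """Standard deviation over non-overlapping k x k blocks, a simple
    texture measure (illustrative only, not the actual crackle search)."""
    h, w = img.shape
    blocks = img[: h - h % k, : w - w % k].reshape(h // k, k, w // k, k)
    return blocks.std(axis=(1, 3))

# Synthetic flattened segment: mostly smooth, one "crackly" patch.
rng = np.random.default_rng(0)
flat = np.full((64, 64), 100.0)
flat[16:32, 16:32] += rng.normal(0, 20, size=(16, 16))
contrast = local_contrast(flat)  # high values mark the textured patch
```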