
Data

Here, you can download the data used in the examples in the paper describing SCOT:

Simulated Datasets:

| | Domain 1 | Domain 2 | Notes |
|---|---|---|---|
| Simulation 1: Bifurcating Tree | 300 x 1000, group labels | 300 x 2000, group labels | Originally from here |
| Simulation 2: Swiss Roll | 300 x 1000, group labels | 300 x 2000, group labels | Originally from here |
| Simulation 3: Circular Frustum | 300 x 1000, group labels | 300 x 2000, group labels | Originally from here |
| Simulation 4: Synthetic RNA-seq | 5000 x 50, group labels | 5000 x 500, group labels | Generated using Splatter |

Real-world Sequencing Datasets:

Note that the files in the “Domain 1” and “Domain 2” columns of the real sequencing datasets contain data pre-processed according to their original publications (linked in Notes), so they are already dimensionality-reduced. To access the original raw datasets, follow the “Raw data” links.

| | Domain 1 | Domain 2 | Notes |
|---|---|---|---|
| SNAREseq Cell Line Mixture | 1047 x 19 (chromatin accessibility) | 1047 x 10 (gene expression) | Original publication. Raw data |
| scGEM Dataset | 177 x 34 (gene expression) | 177 x 27 (DNA methylation) | Original publication. Raw data |
| sciOmics Dataset | 177 x 34 (gene expression) | 177 x 27 (chromatin accessibility) | Original publication. Raw data |

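
After downloading a pair of domain files, it can be worth confirming that their dimensions match the ones listed in the tables above. The sketch below is one way to do that, not part of SCOT itself. It assumes each file is a plain-text matrix with one sample per row and whitespace-separated numeric values, which may not hold for every file. The function names (`read_matrix`, `matrix_shape`, `check_pair`) are hypothetical helpers introduced here for illustration, and the code uses only the Python standard library.

```python
from pathlib import Path


def read_matrix(path):
    """Read a whitespace-delimited text matrix (assumed format): one sample per row."""
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():  # skip blank lines
            rows.append([float(value) for value in line.split()])
    return rows


def matrix_shape(rows):
    """Return (n_samples, n_features), rejecting rows of unequal width."""
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError(f"ragged matrix: row widths {sorted(widths)}")
    return len(rows), (widths.pop() if widths else 0)


def check_pair(path_1, path_2, expected_1, expected_2):
    """Compare the shapes of two domain files with the shapes listed in the table."""
    shape_1 = matrix_shape(read_matrix(path_1))
    shape_2 = matrix_shape(read_matrix(path_2))
    if shape_1 != expected_1 or shape_2 != expected_2:
        raise ValueError(f"unexpected shapes: {shape_1}, {shape_2}")
    return shape_1, shape_2
```

For the SNAREseq row, for example, a call such as `check_pair("domain1.txt", "domain2.txt", (1047, 19), (1047, 10))` would raise a `ValueError` if either file does not have the listed dimensions. The file names here are placeholders for wherever the downloads were saved.
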
Don’t hesitate to contact us if you have any questions about these datasets.