
Data

Here, you can download the data used in the examples in the paper describing SCOT:

Simulated Datasets:

| Dataset | Domain 1 | Domain 2 | Notes |
| --- | --- | --- | --- |
| Simulation 1: Bifurcating Tree | 300 x 1000; group labels | 300 x 2000; group labels | Originally from here |
| Simulation 2: Swiss Roll | 300 x 1000; group labels | 300 x 2000; group labels | Originally from here |
| Simulation 3: Circular Frustum | 300 x 1000; group labels | 300 x 2000; group labels | Originally from here |
| Simulation 4: Synthetic RNA-seq | 5000 x 50; group labels | 5000 x 500; group labels | Generated using Splatter |
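
After downloading a dataset, it can help to confirm that each matrix has the dimensions listed above (rows are samples, columns are features). The following is a minimal sketch, not part of SCOT itself. It assumes the files are plain-text numeric matrices that `numpy.loadtxt` can read; the file names `domain1.txt` and `domain2.txt` and the helper `load_and_check` are hypothetical, and the demo writes random stand-in data rather than the real files.

```python
import os
import tempfile

import numpy as np


def load_and_check(path, expected_shape):
    """Load a numeric text matrix and verify it has the expected shape.

    Hypothetical helper: assumes a whitespace-delimited text file,
    which may not match the actual download format.
    """
    matrix = np.loadtxt(path)
    if matrix.shape != expected_shape:
        raise ValueError(
            f"{path}: expected shape {expected_shape}, got {matrix.shape}"
        )
    return matrix


def demo():
    """Write random stand-in matrices, then load and check them."""
    rng = np.random.default_rng(0)
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names; the shapes follow Simulation 1 in the table.
        path1 = os.path.join(tmp, "domain1.txt")
        path2 = os.path.join(tmp, "domain2.txt")
        np.savetxt(path1, rng.normal(size=(300, 1000)), fmt="%.4f")
        np.savetxt(path2, rng.normal(size=(300, 2000)), fmt="%.4f")
        X = load_and_check(path1, (300, 1000))
        y = load_and_check(path2, (300, 2000))
    return X.shape, y.shape


if __name__ == "__main__":
    print(demo())
```

To use this with a real download, replace the paths with the downloaded files and pass the dimensions from the matching table row. If the files use a different delimiter or a binary format, the loading call would need to change accordingly.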

Real-world Sequencing Datasets:

Note that the files in the “Domain 1” and “Domain 2” columns of the real-world sequencing datasets contain data pre-processed according to their original publications (linked in Notes), so they are already dimensionality-reduced. To access the original raw datasets, follow the “Raw data” links.

| Dataset | Domain 1 | Domain 2 | Notes |
| --- | --- | --- | --- |
| SNAREseq Cell Line Mixture | 1047 x 19 (chromatin accessibility); cell types | 1047 x 10 (gene expression); cell types | Original publication. Raw data |
| scGEM Dataset | 177 x 34 (gene expression); cell types | 177 x 27 (DNA methylation); cell types | Original publication. Raw data |

Don’t hesitate to contact us if you have any questions about these datasets.