Born of the desire to systematize analyses from The Cancer Genome Atlas pilot and scale their execution to the dozens of remaining diseases to be studied, GDAC Firehose now sits atop ~55 terabytes of analysis-ready TCGA data and reliably executes thousands of pipelines per month.
The Broad Institute TCGA GDAC Firehose Provides
These can be explored and retrieved interactively through our data dashboard and analysis dashboard, or downloaded en masse with firehose_get.
Towards the aim of reproducibility, our online suite of reports provides thousands of pages of documentation for the analyses performed; in addition, extensive release notes are available for each versioned dataset and analysis package release. Finally, more information is available in many of the talks and posters we've presented and in our online FAQ.
For a discussion of Firehose in the broader context of Big Cancer Data, see Nature Methods 10, 293–297 (2013), doi:10.1038/nmeth.2410