The Broad GDAC standardized data runs aim to produce versioned packages representing a frozen snapshot of all TCGA analysis data at a given time:
- Cast in a form amenable to immediate algorithmic analysis (no additional data preparation required)
- Which provides a consistent point of reference for analysis and citation by marker papers and users of TCGA data
- Which moves towards a formal definition of what constitutes a given tumor dataset
- While minimizing redundant effort across centers and groups to download & prepare data for further analysis
- And enhancing provenance and reproducibility
These standardized data packages may be accessed in several ways:
- directly from the Broad, via the Open download links below (public data, no password required)
- from IGV (no password required)
- from the DCC, via the Protected links below (requires DCC protected data credentials) or as described here.
Historical information on this effort is available in this presentation from the April 2011 TCGA meeting, which was refined in subsequent presentations on May 12th and May 19th, at the 2011 NCI TSM meeting, and in ongoing discussions with TCGA collaborators. Release notes for these runs are available here.