
firehose_get v0.3.6

Retrieving or using TCGA results need not be difficult, especially for open-access data.  To simplify retrieval, we've introduced the firehose_get script.  To use it, download the zip file from here, then run these 2 commands from a Unix-compatible command line

  unix%  unzip firehose_get<VERSION>.zip
  unix%  ./firehose_get

and follow the instructions (documentation excerpt below).  If wget is not installed on your system, please look here for links to pre-built versions.

Please note that downloading data from the Broad TCGA GDAC site constitutes agreement to this data usage policy.



firehose_get : retrieve open-access results of Broad Institute TCGA GDAC runs
Version: 0.3.5 (Author: Michael S. Noble)

Usage: firehose_get [flags]  RunType  Date  [tumor_type, ... ]

Two arguments are required; the first must be one of


while the second must EITHER be a date (in YYYY_MM_DD form) of an
existing GDAC run of the given type OR 'latest'.  An optional third,
fourth etc argument may be specified to prune the retrieval, given
as a subset of these case-insensitive TCGA tumor type abbreviations:


Note that, as a convenience, 'analysis' and 'data' are accepted as
synonyms for the 'analyses' and 'stddata' run types.


  -b | -batch            do not prompt: assume YES answer to all queries
  -e | -echo             show commands that would be run, but do nothing
  -h | -help | --help    this message
  -l | -log              write output to log file, instead of stdout
  -r | -runs             display list of all available Firehose runs
  -t | -tasks <list>     further prune the set of archives retrieved, by
                         INCLUDING only the tasks (pipelines) whose
                         names match the given space-delimited list of
                         patterns; matching is performed with glob-style
                         wildcards; if a tilde ~ is prepended to a task
                         name then matching tasks will be EXCLUDED; when
                         no pattern list is given firehose_get will display
                         all tasks in the selected run

                         NOTE: not all tasks will execute for all tumor
                         sets; what tasks are run depends upon the
                         data available for that tumor type

  -v                     display the version of firehose_get
  -x                     debugging: turn on bash set -x (warning: very verbose)

For more information see the Broad GDAC website or send an email to
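
The -tasks rules in the help text above (glob-style matching, space-delimited patterns, a ~ prefix to exclude) can be illustrated with a small bash sketch. This is not the code inside firehose_get: the function name filter_tasks and the sample task names are hypothetical, and the way include and exclude patterns combine is an assumption (a task is kept when it matches at least one plain pattern, or no plain pattern was given, and matches no ~ pattern).

```shell
#!/usr/bin/env bash
# Sketch of -tasks style filtering; NOT the actual firehose_get implementation.
# Assumption: keep a task when it matches at least one plain pattern (or when
# no plain pattern was given) and it matches no ~-prefixed pattern.

filter_tasks() {
    local patterns="$1"; shift        # space-delimited list, e.g. "Copy* ~*Gistic2"
    local task pat keep excluded have_include
    local -a pats
    read -r -a pats <<< "$patterns"   # split on whitespace, no filename expansion

    for task in "$@"; do
        keep=0; excluded=0; have_include=0
        for pat in "${pats[@]}"; do
            if [[ $pat == "~"* ]]; then
                # Unquoted right-hand side is treated as a glob pattern
                [[ $task == ${pat#"~"} ]] && excluded=1
            else
                have_include=1
                [[ $task == $pat ]] && keep=1
            fi
        done
        (( have_include == 0 )) && keep=1   # only exclusions given: start from all tasks
        (( keep == 1 && excluded == 0 )) && printf '%s\n' "$task"
    done
    return 0
}

# Illustrative task names only; prints CopyNumber_Clustering_CNMF
filter_tasks "CopyNumber* ~*Gistic2" \
    CopyNumber_Gistic2 CopyNumber_Clustering_CNMF Mutation_Assessor
```

Quote the pattern list when passing it on a real command line, so that the shell hands the wildcards to the script instead of expanding them against local file names.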

 Change Log
v0.3.6:     2012_09_20
    -runs considers ONLY those GDAC runs with ./data subdir
v0.3.5:     2012_09_12
    discontinue use of static run lists, in favor of dynamically querying GDAC
        site to display list of runs, what kinds of runs, etc
    support EXCLUDE in -tasks with tilde/~ prefix
v0.3.4:     2012_09_07
    tweak date regex to correctly detect October months
v0.3.3:     2012_07_12
    fix printf msg emitted when nothing downloaded
    employ --cache=off, so that most up-to-date run lists are always retrieved
v0.3.2:     2012_06_08
    added -b/-batch for headless use
    'latest' now translated to date prior to download
    be less compulsive when cleaning up
v0.3.1:     2012_05_02
    accept --version, too
    use tumor types to subset list of tasks returned, too
    warn user when subsetted runs return nothing for the given tumor(s)
v0.3.0:     2012_04_22
    tweak awkward wording of -tasks help
    allow --help, too
    -runs flag to display list of available runs
    -tasks flag to subset by glob-pattern matching against task names
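
The Date argument must be either YYYY_MM_DD or 'latest', and the v0.3.4 entry above notes a regex fix for October dates. A minimal sketch of such a check follows, for use in a wrapper script before calling firehose_get. The function name is_valid_run_date is hypothetical, the regex is not the one used inside firehose_get, and treating 'latest' as lowercase only is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: checks the FORMAT of a run date, not whether a GDAC run
# exists for that date. Not the regex used inside firehose_get.

is_valid_run_date() {
    local d="$1"
    [[ $d == latest ]] && return 0
    # YYYY_MM_DD with month 01-12 (so 10, 11, 12 are accepted) and day 01-31
    local re='^[0-9]{4}_(0[1-9]|1[0-2])_(0[1-9]|[12][0-9]|3[01])$'
    [[ $d =~ $re ]]
}

for candidate in 2012_10_18 latest 2012-09-20; do
    if is_valid_run_date "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: rejected"
    fi
done
```

A wrapper could then guard the download with something like `is_valid_run_date "$date" && ./firehose_get -batch analyses "$date"`, using only the flags and run types listed in the help text.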
 Copyright and Disclaimer
# This software and its documentation are copyright 2012 by the
# Broad Institute/Massachusetts Institute of Technology. All rights reserved.
# This software is supplied without any warranty or guaranteed support whatsoever.
# Neither the Broad Institute nor MIT can be responsible for its use, misuse, or
# functionality.