- TCGA Legacy symposium (9/26-9/29, posters starting this week)
- CPTAC F2F (Oct 15-18):
- for F2F our only licensing concern is access to the CPTAC community
- so each user who wants access sends email to us, and is granted it, b/c they are members of the CPTAC consortium
- Paper on proteomics pipeline (Dec 2018)
These all imply that pipelines should be made public-ready in the near future
- Definition: releasing pipeline "in FC" is treated effectively as releasing source code
- In part because docs usually live in the repo wiki, and so it must be opened to the public (unless docs are duplicated elsewhere)
- Do we need to have an "I Agree" dialogue box for codes which require licenses? For example, on codes which come from other labs.
- At the very least, for EXTERNAL codes used in our pipeline, and OUR codes which need commercial tools (like MATLAB REs)
- For proteomic pipeline, Mani says it's ALL OK, everything can be run, for whatever reason, by anyone
- But we need to go over EVERY task in GDAC genomic pipeline to ensure all components (especially external codes) are releasable
- Including APOBEC, GSEA (and also tools like Philosopher from proteomic pipeline)
- We may want to release "genomic lite" pipeline, with certain tools omitted
- Possible interim solution: until "I Agree" is possible at runtime in FC, to prevent commercial entities from profiting:
- Karl made strong point about re-distribution being what most method developers want to control/prevent, but how?
- Mike pointed out that while preventing re-distrib is important, mere "runnability" also needs to be monitored/protected:
- otherwise commercial entities can take our labor/tools (or even re-brand as their own) & then profit by analyzing/etc their data
- Chet pointed out that the GenePattern team wrestled with this years ago, eventually taking the approach: "We can't enforce good behavior," but we can record in a database whether LICENSE terms have been acknowledged, as well as a record of each instance when any user ran any tool. Then let the lawyers fight it out if need be.
- Mike suggested that enforcing acknowledgement of LICENSE terms can be done with a boolean attribute or config parameter, which defaults to False, but user sets to True to signify that LICENSE terms have been read and/or accepted. This facilitates "I Agree" dialog box suggested above for FireCloud UI, but also covers the programmatic use case.
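A minimal sketch of how the two suggestions above might fit together: a license flag that defaults to False, plus a database record of each acknowledgement and each run. Every name here (`run_tool`, `license_accepted`, `open_audit_db`, the table layout) is hypothetical and chosen for illustration only; none of it is an existing FireCloud or GenePattern API, and SQLite stands in for whatever database would actually be used.

```python
import sqlite3
from datetime import datetime, timezone


class LicenseNotAccepted(Exception):
    """Raised when a tool is launched without acknowledging its LICENSE."""


def open_audit_db(path=":memory:"):
    # Hypothetical audit store: one table for acknowledgements, one for runs.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS acknowledgements "
        "(user TEXT, tool TEXT, acknowledged_at TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS runs "
        "(user TEXT, tool TEXT, started_at TEXT)"
    )
    return db


def run_tool(db, user, tool, license_accepted=False):
    # The flag defaults to False; the user must explicitly set it to True
    # to signify that the LICENSE terms have been read and/or accepted.
    if license_accepted is not True:
        raise LicenseNotAccepted(
            f"Set license_accepted=True to acknowledge the LICENSE for {tool}"
        )
    now = datetime.now(timezone.utc).isoformat()
    # Record the acknowledgement and this specific run instance.
    db.execute("INSERT INTO acknowledgements VALUES (?, ?, ?)", (user, tool, now))
    db.execute("INSERT INTO runs VALUES (?, ?, ?)", (user, tool, now))
    db.commit()
    # Placeholder for actually launching the tool.
    return f"{tool} launched for {user}"
```

The same flag could back an "I Agree" checkbox in a UI or be set directly in a method configuration, so both interactive and programmatic use leave the same audit trail. This only records behavior; it does not prevent re-distribution.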
Is it time for us to Hydrant-ize our GDAN configs, which would benefit both genomic & proteomic pipelines (GDAN & CPTAC)?
Mani & Karsten considered this for proteo-genomic portion of pipelines, and decided NO, not until after Oct 2018 F2F (and possibly Dec paper)
CGA then adopted similar position.
- This is substantial effort: really need extra SWE effort
- Also: we do not have staffing to respond to wide-ranging GISTIC, MutSig etc user questions
- Lastly: policy on sharing methods/workspaces with collaborators, e.g. Kwiatkowski lab
- if the pipeline is instantiated in our GDAN (or CPTAC) workflow
- and we provide access only through FireCloud
- then the method (and/or method configuration) is essentially for public consumption
- and may be shared through FC with essentially anyone?
- but, in particular, Broadie collaborators (as a first step)