
Sept 17, 2018

  1. Upcoming milestones:

    1. TCGA Legacy symposium (9/26-9/29, posters starting this week)
    2. CPTAC F2F (Oct 15-18)
    3. Paper on proteomics pipeline (Dec 2018)
  2. These all imply that pipelines should be made public-ready in near future

    1. Definition: releasing a pipeline "in FC" is treated as effectively releasing its source code
    2. Do we need an "I Agree" dialog box for code that requires a license?  For example, code that comes from other labs.
    3. This is in part because documentation usually lives in the repo wiki, so the repo must be opened for public access
  3. Therefore, is it time for us to Hydrant-ize our GDAN configs?

  4. This would benefit both genomic & proteomic pipelines (GDAN & CPTAC)
  5. Prioritization:
    1. This is a substantial effort: we really need extra SWE support
    2. Also: we do not have the staffing to respond to wide-ranging user questions about GISTIC, MutSig, etc.
  6. Lastly:  policy on sharing methods/workspaces with collaborators, e.g. Kwiatkowski lab
    1. if the pipeline is instantiated in our GDAN (or CPTAC workflow) 
    2. and we provide access only through FireCloud
    3. then the method (and/or method configuration) is essentially for public consumption 
    4. and may be shared through FC with essentially anyone?
    5. but, in particular, Broadie collaborators (as a first step)

August 6, 2018

  1. Budget alert: 50% gone?

  2. Methylation data progress
  3. SOP slides
  4. Anybody hear of "persistently slow Broad Institute portal access" from abroad, e.g. China?
  5. Action: reconnect with PDC, to:
    1. check current progress towards v1 alpha release (Fall 2018)
    2. ask when ingest will occur for CPTAC3 data

June 11, 2018

  1. Latest data?

  2. Progress report: content due soon

  3. Upcoming site visit: July 17. Content can mostly come from expanding the length/depth of the F2F presentation (which was abbreviated to only 10 mins)

May 7, 2018

  1. Recap of last week's F2F: 

    1. Good FireCloud workshop attendance & feedback

    2. Proteomic:

    3. Genomic: 

      1. GDC has made progress automating their pipelines

      2. CPTAC genomic data will be HG38 going forward

    4. Phased delivery: CCRC & UCEC first, to allow publications to be in the pipeline by the next funding cycle (Gantt chart)

  2. Two special journals in Fall 2018:
    1. Special issue of Mol. Cell Proteomics: tools, algorithms & computational methods
      1. Submission deadline will be in November, 2018
    2. Journal of Proteome Research: Software Tools and Resources
      1. Submission deadline September 14, 2018
  3. FOA: Sustained Support for Informatics Resources for Cancer Research and Management (U24)
    1. Due June 14
    2. Submit LOI by May 14
    3. Letters of support: gathering now
  4. Time Permitting:  genomic pathway analyses
    1. GDAN Lung pathifier analyses
    2. iCluster: Hailei

April 2, 2018 

  1. Access to the GDAC bucket for reference files
    1. egress pay: need to turn on requestor_pays bit 
    2. authorization domain
    3. proxy groups and how to keep track of them in a bucket: currently have 2 proxy groups
  2. CPTAC3 data in FC may help entice new CPTAC users, but it is also akin to replicating DCC
    1. so, let's wait until potential CPTAC users make explicit requests
  3. TODO:
    1. Mani will explore using auth domain for new CPTAC FC users
    2. Mike will ping NCI about FireCloud SW session in May F2F
    3. Mike/Sam will price physical hardware & compare to Google VMs, as potential spend for $25K FC disbursement

March 19, 2018

  1. Review draft agenda for May F2F
  2. Decide additional attendees
  3. (Fire)Cloud costs for CPTAC-wide usage:  $25K seems to have effectively been reduced
  4. Batches 1 and 2 of genomic data are located at /xchip/gdac_data/cptac3/genomic_data_mirror

    1. WashU has apparently re-submitted batch 1 RNASeq data, using MapSplice to map transcripts to gene level

    2. So we should be able, in principle, to proceed with our mRNA pipelines

  5. Mike has a short, unifying wrapper for all 3 of the DCC upload/download utilities, and can install it on the Unix servers upon request
  6. Integrative Analyses:
    1. Karsten:  proteomic pathways ... 
    2. Mani: map genomic CN data to LINCS, correlate w/ RNAseq signatures?
    3. Possibly: multi-omics clustering
  7. TODO:
    1. FireCloud workshop scheduling
    2. Mike will push the CPTAC wrapper script to the Unix servers

March 5, 2018

  1. Summation of HG19 WashU/GDC workaround & potential recommendations to CPTAC leadership & collaborators.

    1. Consider: combing the GDC website to see if Dockers are available for their pipelines, and whether these could be instantiated in FC
    2. Consider: running local MOAP-style pipeline on WXS data, to generate CN, mutation, RNASeq
      1. Decision:  wait for now, it's not fully baked yet
  2. TODO:  
    1. open edit permissions (on this page) to all viewers
    2. Identify 2-3 integrative proteo-genomic analyses: but must be on CPTAC3 data
    3. Combined into iCoMut output
  3. Planning for F2F on May 1-3:
    1. Quilts in FC?
    2. Note, though, that CustomProDB (from Baylor/Bing Zhang group) does much the same as Quilts and is already in FC (from Karsten)
    3. FireCloud workshop
    4. Karsten & Mani: current instantiation of proteomic pipeline 
      1. On prospective BRCA data

Update to Jan 22, 2018 entry:

  1. Genomic data for CPTAC3 downloaded to:  /xchip/gdac_data/cptac3/2018_02_02_genomic_data

  2. Only 2 cohorts (CRCC and UCEC, i.e. kidney and endometrial) have genomic data available so far
  3. Proteomic data for the 3rd cohort (LUAD, lung) has not been submitted by Broad yet, so WashU has apparently not processed the genomic data either

Jan 22, 2018

  1. Brief review of items missed from last meeting
  2. Proteogenomic Data Commons Steering Committee: 
    1. Held 2nd advisory meeting call last Wed
  3. WashU/GDC workaround: summary of discussion & decisions from 1/19 call

  4. New science: degradome?

Jan 8, 2018

  1. Welcome Yifat Geffen, newest member of CGA
  2. Brief review of latest suite of genomic run reports (total of 830)
  3. Whither pathology image browser in CPTAC?  The GTEX pathology browser was authored here (and we have strong knowledge of cancer path viewer), so we have a good deal of expertise & code that could in principle be leveraged.  I've drafted a suggestion for PAAD dwg here.
  4. NMF clustering module question (auto-selection of K) from Mani?
  5. FireCloud hosting of CPTAC data (as partial workaround to lack of CPTAC genomic data at GDC)
  6. Medblast paper

Dec 11, 2017

  1. Items from 11/27 meeting that was cancelled
  2. GDC and CPTAC: summary notes from week of 2017_12_06
    1. Original plan (and data products) given here
    2. Impact to CGDAC (the CGA part of proteomics GDAC) sketched below
    3. Initial data generation will be shifting from GDC to WashU
      1. Mutation calls (both WES and WGS), CN generation, RNAseq calls
      2. WashU products deposited to Georgetown DCC
      3. Broad download & remap names as needed/appropriate

    4. What's next?
      1. FireCloud (as a trusted partner) now being considered as a distribution point
      2. Per feeler conversation with Chris Kinsinger on 2017_12_01

    5. So, because these data will be HG19, our CGA/GDAC in CPTAC may be better utilized by shifting gears: from running existing FireCloud HG38 genomic pipelines on HG19 data (which leads to broken results) to loading these HG19 data products from WashU into FireCloud, so that it can serve as a distribution point

    6. Side Q: why is Georgetown DCC not being considered for this? Scale? Absence of trusted partner status?

  3. Status on $25K to fund use of FireCloud across entire CPTAC?
    1. any progress: NO, there was an attempt to issue as AWS credits ... currently stuck w/r/t GoogleCredits ... stay tuned
    2. billing project?

Nov 27, 2017

  1. Timeline for LUAD, UCEC and KIRC projects: given here

Oct 30, 2017: tentative agenda

  1. Discuss CPTAC-wide use of FireCloud: how to allot funding, make billing projects, add users etc
  2. Recall supervisor mode in FISSFC:
  3. Update on DSDE collaboration:
    1. Show recent CGA/DSDE collaboration proposal
    2. FISS backbone of Jupyter notebooks in FireCloud
    3. Code generator progress:
      1. standalone tool
      2. works on GTEX
      3. Swagger2 / FireCloud proof of concept has been done
      4. Full Swagger2 support is next
  4. Discussion for Wed 11/1 AWG telecon:
    1. Thoughts omitted from F2F talk due to time constraints: slides 21-39
  5. Chet: recent CPTAC2 workspaces ... where to go next?

July 2017:  FYI on proteomics deliverables from FireCloud CGA team

Per Chet Birger/D.R. Mani meeting:
  • FireCloud data workspaces
    • one (possibly two - see below) for each of the three CPTAC AWGs (breast, ovarian, colon)
    • The workspaces will contain, at a minimum, the end results (protein level quantification) produced by each AWG.
    • We may also include the raw MS files, and/or the standardized mzML files.  But all of the pipelines used for analyzing these files rely on Windows-based software, and so cannot be run on FireCloud.
    • We will include the TCGA genomic, clinical and biospecimen data as well - this will help researchers who want to conduct correlative analyses.  It will mean, however, that we'll want to create both open and controlled access versions of these workspaces, as the BAMs and VCF files are controlled access. 
    • We may also include the outputs of the CDAP pipeline, which are published on the CPTAC data portal.  
    • We will aim to get these workspaces in place by the end of August.
  • Workflows
    • Since all of the workflows that run on either the raw MS files or the mzML files (CDAP) include Windows-based tasks, they cannot be run on FireCloud.
    • Mani and Mike's teams are developing workflows for correlative analysis; we agreed to touch base with them at the end of August to see how far along any of these pipelines are and whether they could be included in our deliverables.  If not, so be it. I'm hoping that NCI will see the value in the data workspaces for the future development of workflows.

May 31, 2017 On-Site (Broad Institute, Cambridge MA)

  • Mike's slides:  here

April 4, 2017 Face-to-Face (Bethesda, MD)

  • Mike's slides:  here