Data last updated at UCSC: 2013-06-14

Description

The UCSC Genes track is a set of gene predictions based on data from RefSeq, GenBank, CCDS, Rfam, and the tRNA Genes track. The track includes both protein-coding genes and non-coding RNA genes. Both types of genes can produce non-coding transcripts, but non-coding RNA genes do not produce protein-coding transcripts. This is a moderately conservative set of predictions. Transcripts of protein-coding genes require the support of one RefSeq RNA, or one GenBank RNA sequence plus at least one additional line of evidence. Transcripts of non-coding RNA genes require the support of one Rfam or tRNA prediction. Compared to RefSeq, this gene set generally has about 10% more protein-coding genes, approximately four times as many putative non-coding genes, and about twice as many splice variants.

The UCSC Genes transcripts are annotated in numerous tables, each of which is also available as a downloadable file. These include tables that link UCSC Genes transcripts to external datasets (such as knownToLocusLink, which maps UCSC Genes transcripts to Entrez identifiers, previously known as Locus Link identifiers), and tables that detail some property of UCSC Genes transcript sequences (such as knownToPfam, which identifies any Pfam domains found in the UCSC Genes protein-coding transcripts). One can see a full list of the associated tables in the Table Browser by selecting UCSC Genes at the track menu; this list is then available at the table menu. Note that some of these tables refer to UCSC Genes by its former name of Known Genes, sometimes abbreviated as known or kg. While the complete set of annotation tables is too long to describe, some of the more important tables are described below.

UCSC Genes (knownGene for hg19) can be explored interactively using the REST API, the Table Browser or the Data Integrator. The genePred files for hg19 are available in our downloads directory or in our genes downloads directory in GTF format. All the tables can also be queried directly from our public MySQL servers. Information on accessing this data through MySQL can be found on our help page as well as on our blog.
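As a rough illustration of the REST API route mentioned above, the following Python sketch builds a request for knownGene items in one hg19 region. The base URL, the getData/track endpoint and the parameter names (genome, track, chrom, start, end) reflect our understanding of the public UCSC REST API and should be treated as assumptions to verify against the API help page. The function names known_gene_url and fetch_known_genes are hypothetical helpers introduced only for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the UCSC REST API; verify against the API help page.
API_BASE = "https://api.genome.ucsc.edu"


def known_gene_url(chrom, start, end, genome="hg19", track="knownGene"):
    """Build a getData/track URL for one genomic region (hypothetical helper)."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    # urlencode keeps the dict's insertion order and joins pairs with '&'.
    return f"{API_BASE}/getData/track?{urlencode(params)}"


def fetch_known_genes(chrom, start, end, genome="hg19"):
    """Request the region and return the parsed JSON body (needs network access)."""
    with urlopen(known_gene_url(chrom, start, end, genome), timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch_known_genes(...) to actually query the server.
    print(known_gene_url("chr1", 11000, 15000))
```

Building the URL is kept separate from the network call so that the request can be inspected or logged before anything is sent. The layout of the returned JSON is not assumed here; inspect the response keys before relying on any particular field.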

There are several caveats to the results reported in this study. First, there is relatively low power to detect non-coding mutations in the cohort, particularly in cancer types with small numbers of patients. Second, transcriptomic data were available for only a subset of samples, further reducing our ability to validate our predictions using gene expression data. Third, our pathway and network analysis relied on the driver p-values from the PCAWG Drivers and Functional Interpretation Working Group analysis [16]. While this analysis accounts for regional variations in the background mutation rate across the genome, it is possible that these corrections are incomplete. Furthermore, if the uncorrected confounding variables are correlated with gene membership in pathways and subnetworks, then the false positive rates in our analysis may be higher than estimated. All of these factors, plus other unknown confounding variables, make it difficult to assess the false discovery rate of our predictions, particularly for PID-N genes. Further experimental validation of these predictions is necessary to distinguish true positives from false positives in our PID gene lists.


