extractTranscripts {Biostrings} | R Documentation |
extractTranscripts extracts a set of transcripts from a reference
sequence. Each transcript is specified by the starts and ends of its
exons and by the strand it comes from.
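The splicing idea can be illustrated without Biostrings. The following is a minimal base-R sketch, not the package's implementation: the toy sequence and the helpers revcomp and splice_transcript are hypothetical, and the sketch assumes 1-based, inclusive exon coordinates with exons listed in ascending order on the reference.

```r
# Toy reference sequence (hypothetical, not real genome data).
ref <- "AAACCCGGGTTTACGT"

# Base-R stand-in for reverseComplement() on a plain character string.
revcomp <- function(s) {
  complemented <- chartr("ACGT", "TGCA", s)
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

# Hypothetical helper: cut the exons out of 'ref', join them, and
# reverse-complement the result for a minus-strand transcript.
# Assumes 1-based inclusive coordinates, exons in ascending order.
splice_transcript <- function(ref, starts, ends, strand = "+") {
  exons <- substring(ref, starts, ends)
  tx <- paste(exons, collapse = "")
  if (strand == "-") revcomp(tx) else tx
}

plus_tx  <- splice_transcript(ref, c(1, 7), c(3, 9), "+")  # "AAAGGG"
minus_tx <- splice_transcript(ref, c(1, 7), c(3, 9), "-")  # "CCCTTT"
```

With real data, extractTranscripts plays this role on a DNAString and returns a DNAStringSet rather than plain character strings.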
transcriptWidths returns only the lengths of the transcripts
(called the "widths" in this context) specified by the starts
and ends of their exons.
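As a rough illustration of what a transcript width is, the base-R sketch below sums the exon lengths of each transcript. The helper name sum_exon_widths is hypothetical, and the sketch assumes 1-based, inclusive exon coordinates; the exon coordinates are the ones used for ZC101.3 and F37B1.1 in the examples of this page.

```r
# Hypothetical helper: one width per transcript, computed as the sum of
# its exon lengths. Assumes inclusive coordinates (length = end - start + 1).
sum_exon_widths <- function(exon_starts, exon_ends) {
  stopifnot(length(exon_starts) == length(exon_ends))
  mapply(function(s, e) sum(e - s + 1L), exon_starts, exon_ends)
}

# Exon coordinates taken from the examples on this page.
starts <- list(
  14678410L + c(1L, 488L, 654L, 996L, 1365L, 1712L, 2163L, 2453L),
  13611188L - c(139L, 815L)
)
ends <- list(
  14678410L + c(137L, 578L, 889L, 1277L, 1662L, 1870L, 2410L, 2561L),
  13611188L - c(1L, 325L)
)

widths <- sum_exon_widths(starts, ends)  # 1560 and 630
```

These two values agree with the transcript lengths implied by the sanity check in the examples, which uses transcript locations 1:1560 and 1:630.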
transcriptLocs2refLocs converts transcript-based locations into
reference-based locations.
extractTranscripts(x, exonStarts=list(), exonEnds=list(),
                   strand=character(0),
                   reorder.exons.on.minus.strand=FALSE)
transcriptWidths(exonStarts=list(), exonEnds=list())
transcriptLocs2refLocs(tlocs, exonStarts=list(), exonEnds=list(),
                       strand=character(0),
                       reorder.exons.on.minus.strand=FALSE)
x |
A DNAString or MaskedDNAString object. |
exonStarts, exonEnds |
The starts and ends of the exons, respectively.
Each argument can be a list of integer vectors,
an IntegerList object,
or a character vector where each element is a
comma-separated list of integers.
In addition, the lists represented by |
strand |
A character vector of the same length as exonStarts and
exonEnds specifying the strand ("+" or "-")
from which the transcript is coming.
|
reorder.exons.on.minus.strand |
TRUE or FALSE. Should the order of exons
for transcripts coming from the minus strand be reversed?
|
tlocs |
A list of integer vectors of the same length as exonStarts
and exonEnds. Each element in tlocs must contain
transcript-based locations.
|
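To make the relation between transcript-based and reference-based locations concrete, here is a base-R sketch for a single transcript. It is an illustration under stated assumptions, not the Biostrings implementation: the helper tloc_to_refloc and its reorder_on_minus argument are hypothetical, coordinates are assumed 1-based and inclusive, and reorder_on_minus is assumed to mirror reorder.exons.on.minus.strand by reversing the exon order before mapping.

```r
# Hypothetical helper: map transcript positions to reference positions.
tloc_to_refloc <- function(tlocs, starts, ends, strand = "+",
                           reorder_on_minus = FALSE) {
  if (strand == "-" && reorder_on_minus) {
    # Assumption: exons were supplied in ascending reference order,
    # so reverse them to get transcript order.
    starts <- rev(starts)
    ends <- rev(ends)
  }
  # Reference position of every transcript base, in transcript order.
  # On the minus strand each exon is walked from its end to its start.
  ref_pos <- unlist(Map(function(s, e) {
    if (strand == "-") seq(e, s) else seq(s, e)
  }, starts, ends))
  ref_pos[tlocs]
}

# Plus strand: exons 11..15 and 31..33.
plus_locs <- tloc_to_refloc(c(1, 5, 6, 8), c(11, 31), c(15, 33), "+")
# 11 15 31 33

# Minus strand, exons already in transcript order.
minus_locs <- tloc_to_refloc(c(1, 3, 4, 8), c(31, 11), c(33, 15), "-")
# 33 31 15 11

# Minus strand, exons in ascending order, reordered by the helper.
minus_locs2 <- tloc_to_refloc(c(1, 3, 4, 8), c(11, 31), c(15, 33), "-",
                              reorder_on_minus = TRUE)
# 33 31 15 11
```

The descending reference positions on the minus strand are consistent with the sanity check in the examples, where complement(chrII) indexed by the returned locations reproduces the minus-strand transcript.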
See extractTranscriptsFromGenome
in the GenomicFeatures package for extracting transcripts from
a genome.
A DNAStringSet object for extractTranscripts.
An integer vector for transcriptWidths.
A list of integer vectors of the same shape as tlocs
for transcriptLocs2refLocs.
extractTranscriptsFromGenome, reverseComplement,
DNAString-class, DNAStringSet-class
## ---------------------------------------------------------------------
## A. EXTRACTING WORM TRANSCRIPTS ZC101.3 AND F37B1.1
## ---------------------------------------------------------------------

## Transcript ZC101.3 (is on + strand):
##   Exons starts/ends relative to transcript:
rstarts1 <- c(1, 488, 654, 996, 1365, 1712, 2163, 2453)
rends1 <- c(137, 578, 889, 1277, 1662, 1870, 2410, 2561)
##   Exons starts/ends relative to chromosome:
starts1 <- 14678410 + rstarts1
ends1 <- 14678410 + rends1

## Transcript F37B1.1 (is on - strand):
##   Exons starts/ends relative to transcript:
rstarts2 <- c(1, 325)
rends2 <- c(139, 815)
##   Exons starts/ends relative to chromosome:
starts2 <- 13611188 - rends2
ends2 <- 13611188 - rstarts2

exon_starts <- list(as.integer(starts1), as.integer(starts2))
exon_ends <- list(as.integer(ends1), as.integer(ends2))

library(BSgenome.Celegans.UCSC.ce2)
## Both transcripts are on chrII:
chrII <- Celegans$chrII
transcripts <- extractTranscripts(chrII,
                                  exonStarts=exon_starts,
                                  exonEnds=exon_ends,
                                  strand=c("+","-"))

## Same as 'width(transcripts)':
transcriptWidths(exonStarts=exon_starts, exonEnds=exon_ends)

transcriptLocs2refLocs(list(c(1:6, 135:140, 1555:1560),
                            c(1:6, 137:142, 625:630)),
                       exonStarts=exon_starts,
                       exonEnds=exon_ends,
                       strand=c("+","-"))

## A sanity check:
ref_locs <- transcriptLocs2refLocs(list(1:1560, 1:630),
                                   exonStarts=exon_starts,
                                   exonEnds=exon_ends,
                                   strand=c("+","-"))
stopifnot(chrII[ref_locs[[1]]] == transcripts[[1]])
stopifnot(complement(chrII)[ref_locs[[2]]] == transcripts[[2]])