Add protein lengths from a FASTA file to a data frame (id_col: protein ID column)
Source: R/tidyMS_SaintExpress.R
saintExpress.Rd
Executes SAINTexpress on prepared input data. Delegates to the saintexpressbin package for the native binary / Docker engine, and to the saintexpress package for the pure-R engine. If `engine = "binary"` is requested but saintexpressbin is not installed, a warning is issued and the call falls back to the R engine.
Usage
add_protein_lengths(intdata, fasta, id_col = "protein_Id")
protein_2localSaint(
xx,
quantcolumn = "mq.protein.intensity",
proteinID = "protein_Id",
geneNames = proteinID,
proteinLength = "protein.length",
IP_name = "raw.file",
baitCol = "bait",
CorTCol = "CorT"
)
runSaint(
si,
filedir = getwd(),
spc = TRUE,
CLEANUP = TRUE,
use_docker = NULL,
engine = c("binary", "r"),
optimizer = c("base", "nloptr")
)
Arguments
- intdata
data.frame
- fasta
list of sequences created with read.fasta
- id_col
column with protein ids/accessions.
- xx
data.frame in long format
- quantcolumn
intensity column
- proteinID
protein accession
- geneNames
column with gene names
- proteinLength
column with protein lengths
- IP_name
column with the IP (sample) name, e.g. "raw.file"
- baitCol
column with bait definition (condition)
- CorTCol
column indicating whether a sample is a control ("C") or a test ("T"), in SAINTexpress terminology
- si
output of protein_2localSaint function
- filedir
directory in which to store the SAINTexpress input files
- spc
if TRUE spectral counts, if FALSE intensities (see SAINTexpress documentation)
- CLEANUP
if TRUE, remove all files generated by SAINTexpress
- use_docker
logical or NULL. NULL (default) uses a native executable when available and falls back to Docker on macOS. TRUE forces Docker. FALSE forces native execution.
- engine
execution engine. "binary" runs bundled SAINTexpress via saintexpressbin; "r" runs the R implementation via saintexpress.
- optimizer
optimizer for engine = "r". "base" uses optim; "nloptr" uses NLopt COBYLA when nloptr is installed.
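The fallback rule described for `engine` can be sketched in base R. This is only an illustration under stated assumptions: `resolve_engine`, its `has_binary` argument, and the warning text are hypothetical and not part of the package; the actual check inside `runSaint` may differ.

```r
# Hypothetical helper (not part of the package): choose the engine and
# fall back to "r" when the saintexpressbin package is unavailable.
resolve_engine <- function(engine = c("binary", "r"),
                           has_binary = requireNamespace("saintexpressbin",
                                                         quietly = TRUE)) {
  # match.arg() picks the first choice ("binary") when engine is not supplied
  engine <- match.arg(engine)
  if (engine == "binary" && !has_binary) {
    warning("saintexpressbin is not installed; falling back to engine = \"r\"")
    engine <- "r"
  }
  engine
}

# Simulate a machine without saintexpressbin: a warning is raised
# and the R engine is selected.
chosen <- suppressWarnings(resolve_engine("binary", has_binary = FALSE))
chosen
```

Passing `has_binary` explicitly, as above, makes the rule easy to exercise without installing or removing any package.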
Value
add_protein_lengths: `intdata` with an added `protein.length` column.
protein_2localSaint: named list with `inter`, `prey`, and `bait` SAINT input tables.
runSaint: list with SAINTexpress `listFile`, parsed `list`, and run `out`.
Examples
add_protein_lengths(
data.frame(protein_Id = "P1"),
list(P1 = "MPEPTIDE")
)
#> protein_Id protein.length
#> 1 P1 8
bb <- prolfqua::prolfqua_data('data_IonstarProtein_subsetNorm')
bb$config <- bb$config$clone(deep = TRUE)
xx <- prolfqua::LFQData$new(bb$data, bb$config)
exampleDat <- xx$data_long() |>
dplyr::mutate(CorT = dplyr::case_when(dilution. == "a" ~ "C", TRUE ~ "T"))
# sample protein lengths
tmp <- data.frame(protein_Id = unique(exampleDat$protein_Id))
tmp$proteinLength <- as.integer(runif(nrow(tmp), min = 150, max = 2500))
exampleDat <- dplyr::inner_join(tmp, exampleDat)
#> Joining with `by = join_by(protein_Id)`
res <- protein_2localSaint(exampleDat,
  quantcolumn = "medpolish",
  proteinID = "protein_Id",
  proteinLength = "proteinLength",
  IP_name = "raw.file",
  baitCol = "dilution.",
  CorTCol = "CorT"
)
stopifnot(names(res) == c("inter", "prey", "bait"))
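For intuition, the effect of `add_protein_lengths` in the first example can be reproduced with base R. The sketch below is not the package's implementation; it assumes `fasta` is a named list whose names match the ID column, and `lengths_from_fasta` is a hypothetical helper.

```r
# Hypothetical helper: sequence length per FASTA entry.
# sum(nchar(s)) covers both a single string ("MPEPTIDE") and a
# vector of single characters (c("M", "K", "T")).
lengths_from_fasta <- function(fasta) {
  vapply(fasta, function(s) sum(nchar(s)), integer(1))
}

intdata <- data.frame(protein_Id = c("P1", "P2"))
fasta <- list(P1 = "MPEPTIDE", P2 = c("M", "K", "T"))

len <- lengths_from_fasta(fasta)
# Look up each protein's length by its ID; IDs missing from `fasta` give NA.
intdata$protein.length <- unname(len[as.character(intdata$protein_Id)])
intdata
```

Looking lengths up by name rather than by position keeps the result correct when the rows of `intdata` are ordered differently from the FASTA entries.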