ProteinDataPrep
Details
Handles data preparation for differential expression analysis: contaminant/decoy filtering, peptide-to-protein aggregation, and normalization.
Public fields
prolfq_app_config: ProlfquAppConfig
lfq_data_peptide: LFQData, peptide level
lfq_data: LFQData, protein level (after aggregation)
lfq_data_transformed: normalized LFQData
lfq_data_peptide_transformed: transformed peptide-level LFQData (for nested facades)
aggregator: aggregator object
rowAnnot: ProteinAnnotation
summary: data.frame with contaminant/decoy summary
Methods
Method new()
Initialize ProteinDataPrep
Usage
ProteinDataPrep$new(lfq_data_peptide, rowAnnot, prolfq_app_config)
Method build_deanalyse()
Build a DEAnalyse object with the correct data for the chosen facade
Examples
pep <- prolfqua::sim_lfq_data_peptide_config(Nprot = 100)
#> creating sampleName from fileName column
#> completing cases
#> completing cases done
#> setup done
pep <- prolfqua::LFQData$new(pep$data, pep$config)
pA <- data.frame(protein_Id = unique(pep$data$protein_Id))
pA <- pA |> dplyr::mutate(fasta.annot = paste0(pA$protein_Id, "_description"))
pA <- prolfquapp::ProteinAnnotation$new(pep, row_annot = pA, description = "fasta.annot")
#> Warning: no exp_nr_children column specified, computing using nr_obs_experiment function
GRP2 <- prolfquapp::make_DEA_config_R6()
GRP2$processing_options$transform <- "robscale"
data_prep <- prolfquapp::ProteinDataPrep$new(pep, pA, GRP2)
data_prep$cont_decoy_summary()
#>   totalNrOfProteins percentOfContaminants percentOfFalsePositives
#> 1               100                     0                       0
#>   NrOfProteinsNoDecoys
#> 1                  100
data_prep$remove_cont_decoy()
#> Joining with `by = join_by(protein_Id)`
#> INFO [2026-03-23 19:48:08] removing contaminants and reverse sequences with patterns: ^zz|^CON|Cont_^REV_|^rev_
data_prep$aggregate()
#> INFO [2026-03-23 19:48:08] AGGREGATING PEPTIDE DATA: medpolish.
#> Column added : log_abundance
#> starting aggregation
#> Column added : exp_medpolish
#> INFO [2026-03-23 19:48:09] END OF PROTEIN AGGREGATION
data_prep$transform_data()
#> INFO [2026-03-23 19:48:09] Transforming using robscale.
#> Column added : log2_exp_medpolish
#> data is : TRUE
#> Joining with `by = join_by(protein_Id, sampleName, isotopeLabel)`
#> INFO [2026-03-23 19:48:09] Transforming data : robscale.
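The three steps run above (contaminant/decoy removal, median-polish aggregation, robust scaling) can be illustrated with a minimal base R sketch. It is not the prolfquapp implementation: the toy data, the filter pattern, and the helper names aggregate_medpolish and robust_scale are assumptions made for illustration, and the scaling shown (subtract the median, divide by the MAD, per sample) is one common form of robust scaling that may differ from what "robscale" does internally.

```r
# Toy peptide-level table: 5 proteins plus one contaminant-like entry,
# 4 peptides per protein, 3 samples (all values are simulated).
set.seed(1)
pep <- expand.grid(
  peptide_Id = paste0("pep", 1:4),
  protein_Id = c(paste0("P", 1:5), "zz_CON1"),
  sampleName = paste0("S", 1:3),
  stringsAsFactors = FALSE
)
pep$log_abundance <- rnorm(nrow(pep), mean = 20, sd = 1)

# Step 1: drop contaminant/decoy entries using an illustrative regex
# (assumed pattern, not the package default).
pattern <- "^zz|^CON|^REV_|^rev_"
pep <- pep[!grepl(pattern, pep$protein_Id), ]

# Step 2: aggregate peptides to one value per protein and sample with
# Tukey's median polish; overall + column effect is the protein estimate.
aggregate_medpolish <- function(d) {
  m <- tapply(d$log_abundance, list(d$peptide_Id, d$sampleName), mean)
  fit <- stats::medpolish(m, na.rm = TRUE, trace.iter = FALSE)
  fit$overall + fit$col
}
prot <- t(sapply(split(pep, pep$protein_Id), aggregate_medpolish))

# Step 3: robust scaling per sample: center by the median, scale by the MAD.
robust_scale <- function(x) {
  (x - stats::median(x, na.rm = TRUE)) / stats::mad(x, na.rm = TRUE)
}
prot_scaled <- apply(prot, 2, robust_scale)
```

Here prot is a protein-by-sample matrix of aggregated log abundances and prot_scaled is its robustly scaled counterpart; in the class above the corresponding results are held in the lfq_data and lfq_data_transformed fields.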