immuneSIM: tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking
immuneSIM enables in silico generation of single- and paired-chain human and mouse B- and T-cell receptor repertoires with user-defined, tunable properties. It provides experimental-like (or deliberately aberrant) data for benchmarking repertoire analysis methods.
The simulation algorithm comprises an in silico VDJ recombination process that annotates each generated sequence in detail as it is produced and, if enabled by the user, applies somatic hypermutation (SHM) and sequence motif implantation. The user-definable parameters are: clone count distribution, germline gene usage, insertion and deletion occurrence, SHM likelihood, and motif implantation.
Table of Contents
- Quickstart guide
- Simulation
- Parameters
- Parameter 1: Choice of model (species and receptor)
- Parameter 2: Repertoire size (Number of simulated sequences)
- Parameter 3: Maximal and minimal amino acid CDR3 length
- Parameter 4: Clone count distribution
- Parameter 5: V,D,J germline gene frequencies
- Parameter 6: Insertion and deletions
- Parameter 7: Somatic hypermutation likelihood
- Parameter 8: Frequency update threshold
- Parameter 9: Random repertoires
- Parameter 10: Motif implantation
- Parameter 11: Sequence similarity
- Parameter 12: Codon bias
- Table: Reference datasets
- Somatic hypermutation
- Motif implantation
- Report generation
- Acknowledgments