Gene mutation script
This script downloads gene mutations for a selected tissue from CosmicDB. After downloading the CSV files with samples and gene mutations, it applies the top 10 distinct mutations to the gene's FASTA sequence.
CosmicDB allows downloading gene mutations that are already filtered for a specific tissue, unlike gene expression data, which has to be filtered for the selected tissue in VINI.
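As a rough illustration of picking the top 10 distinct mutations from a downloaded CSV, the sketch below counts how often each mutation appears and keeps the most frequent ones. It assumes that "top" means most frequently observed and that the mutation is stored in a column named `AA Mutation`; both the column name and the function name `top_distinct_mutations` are hypothetical and may not match what the script actually does.

```python
import csv
import io
from collections import Counter


def top_distinct_mutations(csv_text, column="AA Mutation", limit=10):
    """Return up to `limit` distinct mutations, most frequent first.

    `column` is an assumed header name, not taken from the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Count each non-empty mutation value; duplicates collapse into one key.
    counts = Counter(row[column] for row in reader if row.get(column))
    return [mutation for mutation, _ in counts.most_common(limit)]


sample = "AA Mutation\np.V600E\np.G12D\np.V600E\n"
print(top_distinct_mutations(sample))
```

Ties are returned in the order they first appear in the file, which follows from how `Counter.most_common` orders equal counts.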
The getGeneMutations method makes up to 10 attempts to download mutations from CosmicDB. CosmicDB sometimes responds with a 401 (Unauthorized) status code at random; in that case the script sleeps for 2 seconds and retries the download request.
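The retry behaviour described above could look roughly like the following sketch. It is not the actual body of getGeneMutations: `download_with_retry` and the `fetch` callable (assumed to return a `(status_code, body)` pair) are hypothetical names used only to show the loop, and the network request itself is left out.

```python
import time


def download_with_retry(fetch, max_attempts=10, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch() until it returns a non-401 status or attempts run out.

    `fetch` is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status != 401:
            # Any response other than 401 is handed back to the caller.
            return status, body
        if attempt < max_attempts:
            # Wait before retrying; no point sleeping after the last attempt.
            sleep(delay_seconds)
    raise RuntimeError(f"still unauthorized after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.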
Working directory for saving mutations is ./genes/mutations/
Working directory for saving FASTA sequences is ./genes/sequences/
Types of mutation:
- Substitution - missense
- Substitution - nonsense
- Substitution - coding silent
- Deletion - in frame
- Deletion - frame shift
- Insertion - frame shift
- Complex - deletion inframe
- Complex - frame shift
- Unknown
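To show how one of these types might be applied to a sequence, here is a minimal sketch for the simplest case, a missense substitution. It assumes a protein sequence and a mutation written in the form `p.V600E` (reference residue, 1-based position, new residue); the function name `apply_missense` is hypothetical, and the other mutation types would need their own handling.

```python
import re

# Matches single-residue substitutions such as "p.V600E".
_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def apply_missense(sequence, mutation):
    """Return `sequence` with one residue replaced as described by `mutation`."""
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"not a simple substitution: {mutation}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    # Positions in the notation are 1-based; Python indexing is 0-based.
    if pos < 1 or pos > len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + alt + sequence[pos:]


print(apply_missense("MKVLA", "p.V3E"))
```

Checking the reference residue before replacing it guards against applying a mutation to the wrong sequence or isoform.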
python get_gene_mutation.py -g <gene Uniprot ID or file path> -t <tissue name>
python generateMutatedFASTAseq.py