Homer Findingmotifs TFBS #15
@fangwuwang Thanks for the information. I already installed HOMER on the remote server, since installing Xcode and the related tools is taking too much time on my Mac. Hope it will work. |
@rawnakhoque are you passing it a HOMER peak file or a BED file? I don't understand what "Column5: not used" means (attached pic) |
@rawnakhoque Our postdoc mentioned that Xcode installation may take 1-2 hours since it's 1-2 GB large. But you can try it at the same time as you are running the remote server, 1-2 hours is not that long and it will be very useful to you in the future. |
General note for the future on XCode: http://railsapps.github.io/xcode-command-line-tools.html |
@acavalla Are you around in the BCCRC building for a while? Rawnak is coming here to discuss some results she got from Homer in probably an hour. |
Yep - I'm on the 8th floor. where would you like to meet? @fangwuwang we also need to discuss getting the poster printed :) |
@acavalla We can meet in the main floor lunch area or the meeting room on the 13th floor. @rawnakhoque Can you also send an email to Annie (acavalla@bcgsc.ca) to let her know when you arrive? |
I arrived 😊. I'm on the main floor.
|
@acavalla We are on the 13th floor meeting room. |
@rawnakhoque @acavalla Two comparisons have been uploaded to this folder so far and I am working on the other. The promoter file is in the same format as the enhancers, since I think we are using the same assay for promoters, right? Let me know if there is any problem with the files. If it is a small error, you can fix it with a text editor or Excel, but save it as a Windows-formatted text file. |
@psomdeb25 i think maybe you forgot to filter the file GMP-CLP_promoters_filtered.csv in the methylation results? the other comparisons have ~200 entries but this file has 28000 ;) |
@fangwuwang @rawnakhoque I can't run homer on the promoters now as i can't install wget in my remote server. i can ask admin to do it in the morning so i can run them then, or i could start working on the html file, see if there is a TF database to pull TFs down from? |
@acavalla I have updated the file. You can have a look at it. |
@fangwuwang Do you think we should be discussing the final stage of our analysis on Friday? |
@acavalla @rawnakhoque @psomdeb25 I've done all the text files. As you see, there are two files (low methylation in either cell type) for each comparison because I separated them by the positive and negative differential methylation values, which indicates either higher methylation in HSC compared to MPP (for example) or vice versa. So please run these two files individually for promoter and enhancer regions of each comparison, which means four files for each comparison. |
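The split by sign described in the comment above could be sketched roughly as follows. This is only an illustration, not the script that was actually used: the function name `split_by_sign`, the tab-separated layout, the single header line, and the column number are all assumptions.

```shell
# Hypothetical helper: split one comparison table into two files by the sign
# of the differential methylation column.
# Assumptions (not from the thread): tab-separated input, exactly one header
# line, and the differential value in the column number passed as $2.
split_by_sign() {
    local infile="$1" col="$2" prefix="$3"
    awk -F'\t' -v c="$col" \
        -v pos="${prefix}_positive.txt" -v neg="${prefix}_negative.txt" '
        NR == 1 { print > pos; print > neg; next }  # copy the header to both outputs
        $c > 0  { print > pos }                     # positive differential value
        $c < 0  { print > neg }                     # negative differential value
    ' "$infile"                                     # rows equal to 0 go to neither file
}
```

For example, `split_by_sign GMP-CLP_enhancers.txt 5 GMP-CLP_enhancers` (file name and column number are made up here) would write `GMP-CLP_enhancers_positive.txt` and `GMP-CLP_enhancers_negative.txt`.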
On Friday I have class on campus until 12.30 and then a meeting at 2pm at the GSC, so would either have to be a quick meeting around 12.30 on campus, or after 3 at the GSC. not ideal i know :( |
@rawnakhoque can you upload some of the html meth files to the repo (maybe into a new dir within DNA-meth) so i can have a look? |
@rawnakhoque I ran homer on one of the promoter txt files - what did you put for the size parameter and other options? i used 200 for the size and -preparse but i got various warnings like "Something is wrong... are you sure you chose the right length for motif finding? i.e. also check your sequence file" and "Illegal division by zero at /projects/acavalla_prj/stat540/homer/bin/findKnownMotifs.pl line 152" and "Use of uninitialized value in numeric gt (>) at /projects/acavalla_prj/stat540/homer/bin/compareMotifs.pl line 1381." help!! mine is tab separated but it's saved as txt, does that make a diff? |
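On the question of whether the file format makes a difference: one thing that can be checked quickly, without claiming it is the cause of the warnings above, is how the input file is actually laid out. The sketch below only reports carriage returns (left behind by Windows or old Mac line endings) and the number of tab-separated fields on the first line. The function name `describe_regions_file` is made up for this example.

```shell
# Hypothetical helper: print basic layout facts about a peak/region text file.
# It does not modify the file and does not call HOMER.
describe_regions_file() {
    local f="$1" cr fields
    # Count carriage-return characters anywhere in the file.
    cr=$(tr -cd '\r' < "$f" | wc -c | tr -d ' ')
    # Count tab-separated fields on the first line.
    fields=$(head -n 1 "$f" | awk -F'\t' '{ print NF }')
    echo "carriage_returns=${cr} fields_in_first_line=${fields}"
}
```

If carriage returns show up, a copy without them can be made with `tr -d '\r' < input.txt > input_unix.txt` (file names are placeholders). That only suits Windows-style files, since old Mac files use the carriage return as the only line break.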
@acavalla I am still working on the enhancer files. I will upload the results once the job is finished. For the size parameter you can use -size given instead of a specific number. It worked for me. |
@rawnakhoque then i cant get it to work, sorry. i'm not getting an html output file because it says no sequences are found. I'll keep trying and i'll let you know if i get anywhere but i'm not optimistic :( |
@fangwuwang Did you read the paragraph titled 'Finding Instance of Specific Motifs' in http://homer.ucsd.edu/homer/ngs/peakMotifs.html? What do you think? Do we need this analysis? N.B. My remote server account will expire tomorrow, so I have to complete all the jobs by then. |
@rawnakhoque @acavalla No, we don't need the location information for the scope of this project.
#Change directory:
#create a list of files using regex and wildcards
#For loop
echo "BASH: saved results in $OUT file..."
done |
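Only the comments and the final `echo` of the loop survive in the comment above, so the following is a guess at its overall shape rather than the original script. It assumes that the input files end in `.txt`, that the genome is hg19, and that each result directory is named after its input file. The options `-size given -mask -preparse` are the ones mentioned elsewhere in this thread. Setting `DRY_RUN=1` prints the commands instead of running HOMER.

```shell
# Hypothetical batch runner (function name and naming scheme are assumptions).
# The body runs in a subshell, so the caller's working directory is untouched.
run_homer_batch() (
    indir="$1"
    genome="${2:-hg19}"                  # assumed default genome
    cd "$indir" || exit 1                # change directory
    for f in *.txt; do                   # list of files via a wildcard
        [ -e "$f" ] || continue          # nothing matched the wildcard
        OUT="${f%.txt}_homer"            # assumed output directory name
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo findMotifsGenome.pl "$f" "$genome" "$OUT" -size given -mask -preparse
        else
            findMotifsGenome.pl "$f" "$genome" "$OUT" -size given -mask -preparse
        fi
        echo "BASH: saved results in $OUT file..."
    done
)
```

Running `DRY_RUN=1 run_homer_batch /path/to/inputs` first is a cheap way to confirm that the wildcard picks up the intended files before committing to a run that takes hours.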
@fangwuwang |
@acavalla Good that it's running. Did you complete any run? How long did it take? |
@rawnakhoque Thanks for the high efficiency! Can you add the commands you used for installation, setup and analysis to the repo so that our members can refer to them? |
First, I downloaded the configureHomer.pl script from http://homer.ucsd.edu/homer/introduction/install.html |
@psomdeb25 Somdeb, are you available today to discuss the interpretation of the methylation results, and which results to upload to GitHub and put on the poster? I am on campus all day. @rawnakhoque @acavalla you can join if you are done with the TFBS analyses. Thanks! |
Yes. I am free today.
|
@fangwuwang @acavalla @psomdeb25 |
@acavalla Have you been able to complete the analysis for the promoter regions? If you are still having any problems, I can do the analysis for the promoter regions as well. Please let me know ASAP. |
Hi! I completed one run but i didn't use mask. I'll start over because it's probably better to have the same conditions. It all works now so it'll take 9h, I can do some over the weekend too |
@rawnakhoque @acavalla Sounds good. It might be necessary to keep the parameters the same across enhancer/promoter analyses. After you finish and upload the TF data, I will try to do the clustering analysis on normal and leukemia RNA-seq data. It might be better to split the job so that we get the data earlier. |
@acavalla Can you tell me which files you will be working on so that I can work on the others? |
I'm running them all in the for loop, so they'll run overnight and finish when they finish. I can upload them to the github tomorrow. I've got one set already, so I'll upload that now. |
OK. I am running CMP-MLP-CMP, CMP-MLP-MLP and GMP-CLP-CLP. Maybe we can compare our results for these three.
|
I ran with -size given, -mask and -preparse (I'm not sure what that last one means, but it complained when I didn't use it). I've uploaded the known motifs for CMP-MLP-CMP here, so download it and have a look. |
@acavalla |
@fangwuwang Could you please post an update on your analysis, and let me know if you would like me to do some of it? If so, you can email me the details. |
Thanks @rawnakhoque, I want to inspect the RNA expression of the transcription factors in the normal cell data, is it possible to get the gene symbols (shown as hgnc_symbol in your converted list) for the raw data (all transcripts) ? |
@fangwuwang Please find the files here. I split the file into (raw_genes_1, 2, and 3) since the program got stuck due to the big file. I uploaded the code as well. |
Thanks @rawnakhoque, looks great. |
@rawnakhoque Can you please look at the clustering analysis as well (refer to this seminar)? Sorry, I am working on the expression of the TFs and the introduction of the poster, and may not be able to dedicate time to it. Have you done any promoter analysis of the TFBS? I am not sure where Annie is with her analysis. Thank you. |
@rawnakhoque You mentioned you have done both known and de novo motif finding, but in the folder, only known motif results are there. Can you upload de novo results as well? I found this Homer page provides great details about the analysis mechanisms and output explanation. We can see that the (13. de novo output) is different from (14. known motif output) in terms of the layout of html page. |
@fangwuwang Sorry, my bad. Now uploaded the de novo results file as well. |
@fangwuwang @acavalla I am running HOMER for the rest of the promoter groups. |
I'm going in to work later so I can check where the analysis is at then. It should be finished and then I'll upload it all. The one that's up already is CMP MLP CMP, sorry for the confusion |
@rawnakhoque Also, for the gene ID conversion of the RNA-seq file, can you tell me how you separated the raw data into three parts (rows [x to y] converted to gene list 1, rows [y to z] converted to gene list 2, and so forth)? The three gene list files together have fewer rows than the original data, which makes matching them back to the original file very difficult. Thanks. |
@acavalla Can you let us know what has been done so Rawnak doesn't need to run it again? Also, I don't know what the advantage of the de novo finding is; its results are quite different from the known-motif results. Just thought maybe we can pool the two results together for the rest of the analysis, like expression level and clustering. |
@fangwuwang This is not due to missing rows. The input was correct, but the program did not find gene IDs for some of the transcripts, so the number of rows went down. You can see the files for the transcript IDs here. |
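To see exactly which transcripts dropped out of the conversion, the original ID list can be compared against the converted output. The sketch below is an illustration under assumed file layouts, not the code that was uploaded: the original file is taken to have the transcript ID in its first tab-separated column, and the converted file likewise. The function name `missing_ids` is made up.

```shell
# Hypothetical helper: print IDs that appear in the original list but not in
# the converted list. Both files are assumed to be tab-separated with the
# transcript ID in column 1.
missing_ids() {
    local all_ids="$1" converted="$2"
    awk -F'\t' '
        FILENAME == ARGV[1] { seen[$1] = 1; next }  # first file: remember converted IDs
        !($1 in seen)       { print $1 }            # second file: report unseen IDs
    ' "$converted" "$all_ids"
}
```

If the converted output is spread over several files, they can be concatenated into one file with `cat` before calling `missing_ids`. The number of IDs printed should then account for the difference in row counts.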
I ran the for loop in the shell for all the promoters, so they should all be done. I don't think we should present all the found motifs either - we're not trying to find sequences, just link them to TFs I thought? |
Hi all! Just uploaded all the knownResults files into the dna meth folder. Rawnak, I'm not sure why we would have got different results. I used -size given -mask -preparse |
I think we need the de novo results file not the known results.
|
We discussed this when we met and decided to go with the known. What is the benefit of the others in our analysis? |
@acavalla Can you please upload all the text files for known motifs? These files should be in your output folder when you completed the jobs. |
@rawnakhoque done :) |
@acavalla Thanks! :) |
@rawnakhoque I asked the postdoc in our lab and he showed me that everything is done in bash. Follow the installation and basic configuration step by step here. As shown on the webpage, genome configuration is done using this line (see the Download Homer Packages section)-- perl /path-to-homer/configureHomer.pl -install hg19_
And to do the analysis there is only one line to run (link)-- findMotifsGenome.pl <peak/BED file> <genome> <output directory> -size # [options]