-
Notifications
You must be signed in to change notification settings - Fork 0
syagev/weiz-grid
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
############################################################################### ## WeizGrid ## By Stav Yagev, 2013 (Please contact me if you find bugs !) ## ## ## Simple framework for using the SGE cluster on Weizmann for parallel work. ## ## FEATURES: ## - Helps split your own code into fractions that can be run in parallel on the cluster. ## - Allows you to aggregate results in an efficient manner. ## - Write your code ONCE! Execute the same code on PC for debugging and on the cluster ## for production. ## - Recover from errors on the cluster - splitting means if 1 iteration out of a ## 1000 failed you still have 999 iterations in your hand! ## ## Use case example: ## You have a job that processes 1000 images in some way, running the same algorithm ## for each image, outputing something for each image, and finally aggregating the ## results from all images. Up until now, you ran this in a single job that took 1000 ## minutes (because lets assume it takes 1 minute to process each image). Using WeizGrid ## you can take advantage of the fact that the job can be parallelized - instead of 1 ## job WeizGrid helps you easily split the job to 80 parallel jobs so that you get all ## 1000 images in 1000/80=12.5 minutes (!!) ## Auto Installation: ================== 1. Copy all the files to ~/WeizGrid 2. In a UNIX terminal, type: chmod +x ~/WeizGrid/wgsetup 3. Then type: ~/WeizGrid/wgsetup Usage: ====== 1. Create files in the spirit of 'sample.m' and 'calcPrimes.m' (make sure the WG m-files are in MATLAB's path) 2. Upload your files to UNIX 3. Under UNIX, from the direcotry of YOUR project, run: ~/WeizGrid/qsubWG [name] [your-m-file] [queue] Example: ~/WeizGrid/qsubWG ParallelCracker sample all.q 4. Enjoy! Manual Installation (if auto doesn't work...): ============================================= 1. Copy all files to ~/WeizGrid. 2. Make sure all bash scripts have execute permission. 2. Create the following directory structure (1 folder for each queue you intend to use): ~/.matlab/cluster_jobs/all.q ~/.matlab/cluster_jobs/test.q ... NOTE: The ~/.matlab/ usually already exists but is hidden, so make sure you are viewing hidden files and folders . Tips & Troubleshooting: ======================= - If your algorithm uses a random number, make sure you are not setting the RngShuffle option to false when invoking WGexec - because this will yield the same random series for every "parallel" piece of work. - Sometimes, during the aggregation part of your script, there will be a bug. Then, you might think all your work is lost! This is not the case, look into part #2 of the documentation of WGgetResults for more information. - There isn't verbose error checking, so if something doesn't work, check the output files on your UNIX home for hints... If still unsuccessful, feel free to contact me for help.
About
A simple framework for using the SGE cluster at Weizmann's CS department for parallel work
Resources
Stars
Watchers
Forks
Packages 0
No packages published