TRFolder

GENERAL INFORMATION
-------------------
TRFolder is a utility that searches for the core structure of telomerase RNAs (TRs).
It predicts RNA structure using exhaustive search combined with an incremental approach.

Time needed: about 5 minutes for a 4 KB sequence.

TRFolder is implemented in C and Python.
  1. C compiler: Dev-Cpp (http://wxdsgn.sourceforge.net/)
  2. Python interpreter: Python 2.5.1 (http://www.python.org/ftp/python/2.5.1/python-2.5.1.msi)
  3. Python Imaging Library (PIL) for Python, installed in one of two ways:
     (1) Install PIL from http://www.pythonware.com/products/pil/
         If that doesn't work, please try the second way.
     (2) Get PIL from http://effbot.org/downloads/Imaging-1.1.6.tar.gz
         and copy the PIL folder to the Python lib directory (e.g. C:\Python25\Lib).

It has been compiled and tested on Windows platforms.

Authors: Dong Zhang, Yong Wu
Department of Computer Science
University of Georgia
Athens, GA 30602

Copyright (C) 2008 University of Georgia. All Rights Reserved.

WHAT IS CONTAINED IN THIS PACKAGE
---------------------------------
This package is for the Windows platform (Vista and XP).
It contains the TRFolder program and its GUI.

INSTALLATION
------------
The precompiled .exe versions of the programs included in the package may work
for you as they are. If so, you can skip the compiling step.
You will need to install Python and PIL to use the graphical interface.

Compile TRFolder
  Type 'make' to build the following four executable programs:
  1. calFreq.exe
  2. stempair.exe
  3. findBEle.exe
  4. findStemOne.exe

Re-compile TRFolder
  Type 'make clean' to remove all executable programs, and then type 'make'.

If you have Dev-Cpp installed and 'make' doesn't work, please try to compile the files yourself.
Use commands such as:
  path:\devcpp\bin\gcc.exe -o calFreq.exe calFreq.c
  path:\devcpp\bin\gcc.exe -o stempair.exe stempair.c
  path:\devcpp\bin\gcc.exe -o findBEle.exe findBEle.c
  path:\devcpp\bin\gcc.exe -o findStemOne.exe findStemOne.c

Type 'python interface.py' to start the Tkinter GUI.

USAGE
-----
TR structure profiling Panel:
  All structure profiles have already been trained. If you want to use the default
  settings, skip this panel. If you want to use your own training data, check the
  input format in the "calMatrix\fastaDir" directory.

TR structure search Panel:
  Press the Browse button (next to the Sequence Name box) to select the sequence
  file to be used. Specify the template position and length. The result is
  displayed on the interface and written to the file "wholeTstruct.txt".

  e.g. Open any of the 4 KB files in the "sequence\4KB" directory.
       Press the Search TR Structure button.
       Wait about 5 minutes and the results should appear.

  Note: A 4 KB sequence takes about 5 minutes. Longer sequences take more time.

Menu:
  1. [help]
  2. [show TR structure] show the TR structure
  3. [PK structure] show the pseudoknot structure
  4. [TH structure] show the triple helix
  5. [BE structure] show the boundary element
  6. [CC structure] show the core closing stem

Parameters for TR structure profiling task
1. structures profile
   1) [pk profile] training FASTA format file of the pseudoknot structure
   2) [triple helix profile] training FASTA format file of the triple helix structure
   3) [boundary element profile] training FASTA format file of the boundary element structure
   4) [closing stem profile] training FASTA format file of the closing stem structure
   For each structure profile:
   1) [default value] set the other parameters to their default values
   2) [change value] set user-defined values
      -- [reversal and sequential] (indicates the direction of the paired regions; the triple helix is sequential, the others are reversal)
      -- [priorFile] (background base-pair probability matrix used to adjust the matrix obtained from the training data)
      -- [pseudocount] (pseudocount value used to adjust the LogOdds matrix, default = 0.0001)
      -- [SD Multiple] (multiple of the standard deviation used to adjust the stem-loop range, default = 3)
      -- [origin Coefficient] (coefficient of the original pair-frequency matrix, default = 1)
      -- [prior Coefficient] (coefficient of the prior matrix, default = 0)
      -- [gap penalty] (sets one identical gap penalty for the LogOdds matrix)
2. sequence profile
   1) [sequence profile] training sequence file for the background base composition

TR structure searching task
1. [Sequence Name] test sequence file
2. [template position] the position where the template begins
3. [template length] the length of the template
4. [number of hit] the number of candidates that you want to keep
5. [default value] set the other parameters to their default values
6. [change value] set user-defined values
   1) [PK Range File] (stem-loop range file)
   2) [PK LogOdds Matrix] (LogOdds matrix used for the PK)
   3) [Triple Helix LogOdds Matrix] (LogOdds matrix used for the triple helix)
   4) [Closing Stem LogOdds Matrix] (LogOdds matrix used for the closing stem)
   5) [Window Size] (PK search window size, default = 150)
   6) [Min Score] (threshold score of the PK, default = 2)
   7) [Number Of PK Candidate] (user-defined number of PK candidates, default = 100)
   8) [Pre-processing] (find stems of the left arm of the U-rich region before grouping into a PK, default = 'applied')
   9) [Similarity Filter] (filter similar PKs, default = 'applied')
   10) [Find Triple Helix] (search for the triple helix, default = 'applied')
   11) [Content Filter] (defines the content of the four stem arms of the PK, e.g.
       ---A (A-rich in the fourth arm), default = '----')
   12) [Max Distance Template to Boundary Element] (distance from the template to the boundary element)
   13) [Boundary Element Length] (length of the boundary element stem)
   14) [Loop Two Length] (length of loop two, which lies between the boundary element arms)
   15) [Loop Two SD] (standard deviation of loop two's length)
   16) [Boundary Element SD] (standard deviation of the boundary element's length)
   17) [Boundary Element to Closing Stem] (distance from the boundary element to the closing stem left arm)
   18) [Boundary Element to Closing Stem SD] (standard deviation of the distance from the boundary element to the closing stem left arm)
   19) [Closing Stem Length] (length of the closing stem)
   20) [Closing Stem SD] (standard deviation of the closing stem's length)
   21) [PK to Closing Stem] (distance from the PK to the closing stem right arm)
   22) [PK to Closing Stem SD] (standard deviation of the distance from the PK to the closing stem right arm)
   23) [PK Weight] (weight of the pseudoknot score)
   24) [Triple Helix Weight] (weight of the triple helix score)
   25) [Boundary Element Weight] (weight of the boundary element score)
   26) [Core-Closing Weight] (weight of the core closing stem score)

Explanation of output
---------------------
The TR structures are ranked by the scores of the pk, boundary element, and closing stem.

Total:97.23 =pk Score:3*20.11 +Triple Score:1*9.03 +boundaryEle Score:2*9.92 +Core Closing Score:1*8.03
..-336.....................-311..-295.....-286...-13.........-2.....0..........+13..+255....+263..+278+281..+283.....+292..+304....+312..+335.....+344..+355.........................+384
.....|........................|.....|........|.....|..........|.....|............|.....|.......|.....|..|.....|........|.....|.......|.....|........|.....|............................|.....
.....CAUUUCCCGUUUGAAUUUCUAUCAUG.....UGGUAGAUGC.....GCAAGUCUACCA.....ACCACACCCACACA.....GAUUUAUUC.....UUUU.....AGUAGAUUUU.....GAAUAAAUU.....AAAAUCUAUU.....UACUGAUAGAAUUUGCAAAUGUGUCAAGUG.....
.....(((((.((((((((((.(((((((((.....((((((((((.....)).).))))))).....**************.....(((((((((..............[[[[[[[[[[.....))))))))).....]]]]]]]]]].....)).))))))))))..)))))).)..)))))
.....................................................................................................((((..................................))))...................................
...........................................(((.............................)).)................

1) The first line lists the total score, pk score, triple helix score, boundary element score, and closing stem score.
2) The second line is a ruler that measures position relative to the template. The beginning of the template is 0.
3) The third line is the annotation of the ruler.
4) The fourth line is the sequence.
5) The fifth line is the annotation of the pairwise alignment.
6) The sixth line is the annotation of the triple helix structure.

Examples
-------------------
The example contains:
1) pk profile
2) triple helix profile
3) boundary element profile
4) core closing profile
5) whole chromosome sequence
6) 4 KB sequence around the template

Display
-------------------
1) LogOdds matrix
2) stem/loop range
   -- sum
   -- average
   -- standard deviation
   -- min
   -- max
3) predicted structure

Contact Information
-------------------
For suggestions, questions, requests, and bug reports, please contact Liming Cai at cai@cs.uga.edu.
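
Worked example: checking the total score
----------------------------------------
The score line shown under "Explanation of output" reads as a weighted sum: each
component score is multiplied by its weight (PK Weight, Triple Helix Weight,
Boundary Element Weight, Core-Closing Weight) and the products are added up.
The short Python sketch below re-computes the total from that line as a sanity
check. It is not part of TRFolder: the function names parse_score_line and
weighted_total are hypothetical, and the parsing assumes the score line always
has the "Total:... Score:weight*score" layout of the example above.

```python
import re


def parse_score_line(line):
    """Return (reported_total, [(weight, score), ...]) from a score line.

    Assumes the layout of the example in this README:
    'Total:T =pk Score:w*s +Triple Score:w*s ...'
    """
    total = float(re.search(r"Total:\s*([-\d.]+)", line).group(1))
    terms = [(float(w), float(s))
             for w, s in re.findall(r"Score:\s*([-\d.]+)\*([-\d.]+)", line)]
    return total, terms


def weighted_total(terms):
    """Sum of weight * score over all component scores."""
    return sum(w * s for w, s in terms)


# Score line copied from the example output in this README.
line = ("Total:97.23 =pk Score:3*20.11 +Triple Score:1*9.03 "
        "+boundaryEle Score:2*9.92 +Core Closing Score:1*8.03")

total, terms = parse_score_line(line)
recomputed = weighted_total(terms)

# 3*20.11 + 1*9.03 + 2*9.92 + 1*8.03 = 60.33 + 9.03 + 19.84 + 8.03
print("reported:   %.2f" % total)
print("recomputed: %.2f" % recomputed)
```

For the example line, the four weighted terms add up to the reported total of
97.23, so raising a weight in the search panel increases that component's
share of the total in direct proportion.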