This tutorial explains how to use the stand-alone version of BEAN 2.0 on a local machine. If you wish to use BEAN 2.0 to predict new type-III effectors at the genome level, we strongly recommend that you download the stand-alone version and deploy it on your local machine according to the installation guide.
BEAN 2.0 is an upgraded version of BEAN. It is a machine-learning-based method designed to predict type-III effectors from bacterial proteins. BEAN 2.0 is available at http://systbio.cau.edu.cn/bean/.
All BEAN 2.0 code is written in Perl, so in principle you can install it on almost any operating system. To avoid unnecessary extra steps, however, we recommend 64-bit Linux, because we have tested BEAN 2.0 on a standard 64-bit Linux system and confirmed that it works well there. In addition, some packages such as HHsuite are easier to install on 64-bit Linux. More information about Linux can be found at http://en.wikipedia.org/wiki/Linux.
Perl is installed by default on most Linux distributions, so you can skip this section if typing the command "perl -h" in a Linux terminal prints Perl's usage information.
We recommend Perl v5.10.1 (*), because that is the version we used to debug our Perl program. If no version of Perl is installed, you can get it from Perl's official website http://www.perl.org/ and install it by following its installation guide.
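To compare the installed Perl version with the recommended v5.10.1, a minimal shell sketch is given here. The helper name version_at_least is hypothetical and not part of BEAN 2.0, treating v5.10.1 as a minimum is our assumption rather than a stated requirement, and the sketch assumes GNU sort with the -V option is available.

```shell
# Hypothetical helper, not part of BEAN 2.0.
# Succeeds if version $1 is at least version $2 (assumes GNU `sort -V`).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example use (requires perl on the PATH):
#   installed=$(perl -e 'printf "%vd", $^V')
#   version_at_least "$installed" 5.10.1 && echo "Perl $installed looks recent enough"
```

The function only compares two version strings, so it can be tried out even on a machine where Perl is not installed yet.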
The Pfam database is a large collection of protein domain families. Each family is represented by multiple sequence alignments and hidden Markov models (HMMs). To get the domains of the query proteins, you should download the database from ftp://ftp.sanger.ac.uk/pub/databases/Pfam and deploy it. In addition, please download PfamScan from ftp://ftp.ebi.ac.uk/pub/databases/Pfam/Tools/ and HMMER from http://hmmer.janelia.org/.
HHblits is a new sequence search algorithm based on hidden Markov models. We use it to improve the sensitivity of BEAN 2.0. It is distributed as part of HHsuite. Please download HHsuite and the related database from the HHblits main page http://toolkit.tuebingen.mpg.de/hhblits/ and deploy them correctly.
The BLAST+ suite is a rewrite of BLAST in C++. You can get it and the corresponding NR database from the NCBI website http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download.
LibSVM is a machine learning package that implements the support vector machine algorithm and related tools. You can download it from its website http://www.csie.ntu.edu.tw/~cjlin/libsvm/. To make it output its decision score, you need to add some code to the "svm.cpp" file before compiling the LibSVM source code into executables. Locate the "svm_predict()" function and change its body to this:
double svm_predict(const svm_model *model, const svm_node *x)
{
    int nr_class = model->nr_class;
    double *dec_values;
    if(model->param.svm_type == ONE_CLASS ||
       model->param.svm_type == EPSILON_SVR ||
       model->param.svm_type == NU_SVR)
        dec_values = Malloc(double, 1);
    else
        dec_values = Malloc(double, nr_class*(nr_class-1)/2);
    double pred_result = svm_predict_values(model, x, dec_values);
    //-----------------------Add below codes--------------------------------
    printf("%g\n", dec_values[0]*model->label[0]);
    //----------------------------------------------------------------------
    free(dec_values);
    return pred_result;
}
You can also use the pre-compiled version of LibSVM bundled with the BEAN 2.0 package (in the "libsvm-2.9" subdirectory). Make sure these LibSVM binaries are executable before using them; you can use the "chmod" command to grant them execute permission.
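Granting execute permission with "chmod" could look like the sketch below. The function name make_libsvm_executable is hypothetical, and the sketch assumes the binaries are named svm-predict, svm-scale and svm-train and sit directly in the directory given as the argument.

```shell
# Hypothetical helper, not part of BEAN 2.0.
# Grants the current user execute permission on the LibSVM binaries in directory $1.
make_libsvm_executable() {
    dir="$1"
    for tool in svm-predict svm-scale svm-train; do
        if [ -f "$dir/$tool" ]; then
            chmod u+x "$dir/$tool"
        else
            # Report a missing binary instead of failing silently.
            echo "missing: $dir/$tool" >&2
        fi
    done
}

# Example use from inside the unpacked package directory:
#   make_libsvm_executable libsvm-2.9
```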
Download the latest version of the BEAN 2.0 package (latest release: 2.0).
Decompress the BEAN 2.0 package with the following commands:
unzip BEAN_2.0.zip
cd BEAN_2.0
You will find four subdirectories ("libsvm-2.9/", "db/", "domain/" and "model/"), one Perl script (classify.pl) and one test file (seqs_for_test.fasta).
BEAN_2.0/
|
|-- libsvm-2.9/            # LibSVM binary files
|-- db/                    # BLAST database used in BEAN 2.0
|-- model/                 # SVM model
|-- domain/                # domain database
|-- classify.pl            # main program of BEAN 2.0
|-- seqs_for_test.fasta    # test file
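Whether the package unpacked completely can be checked with a short shell sketch such as the following. The function name check_bean_layout is hypothetical; it only tests for the four subdirectories and classify.pl listed above and says nothing about their contents.

```shell
# Hypothetical helper, not part of BEAN 2.0.
# Succeeds only if directory $1 contains the expected subdirectories and classify.pl.
check_bean_layout() {
    root="$1"
    status=0
    for d in libsvm-2.9 db model domain; do
        [ -d "$root/$d" ] || { echo "missing directory: $root/$d" >&2; status=1; }
    done
    [ -f "$root/classify.pl" ] || { echo "missing file: $root/classify.pl" >&2; status=1; }
    return "$status"
}

# Example use from inside the unpacked package directory:
#   check_bean_layout . && echo "package layout looks complete"
```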
Put the compiled LibSVM binaries ("svm-predict", "svm-scale" and "svm-train") in "libsvm-2.9/". Then open "classify.pl" in a text editor such as vi or vim and modify the BEAN 2.0 settings according to the instructions in it.
#-------------------------------------------------------------------------------
#-------------------------------------------------------------------------------
# BLAST's database
# Example: $blast_nrdb='/home/pub/blastdb/nr';
my $blast_nrdb="/path/to/blast/nr/database/";
# HHblits' database
# Example: $hhsuite_db='/home/pub/database/hhsuite_database/nr20_12Aug11';
my $hhsuite_db="/path/to/hhsuite/database/";
# PfamScan's database
# Example: $pfam_db='/home/pub/database/pfam_database';
my $pfam_db="/path/to/pfam/database/";
# HHblits' tool script reformat.pl path
# Example: $reformat = '/home/you/local/hhsuite/lib/hh/scripts/'
my $reformat='/path/to/reformat.pl';
# Pfam's tool script pfam_scan.pl path
# Example: $pfamscan='/var/www/html/bean/PfamScan'
my $pfamscan='/path/to/pfam_scan.pl';
# LibSVM's tool svm-predict
# Example: $svm_pred='/home/you/bean/libsvm-2.9/svm-predict'
my $svm_pred='/path/to/svm-predict';
#-------------------------------------------------------------------------------
#-------------------------------------------------------------------------------
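Before running BEAN 2.0, it may help to confirm that the tool paths entered in classify.pl really exist. The following shell sketch uses a hypothetical helper, check_tool_paths, which is not part of BEAN 2.0. It checks only paths that should point to a single file ($reformat, $pfamscan, $svm_pred); the database settings are left out on the assumption that they may be directory or file-name prefixes rather than single files, as the example '/home/pub/blastdb/nr' suggests.

```shell
# Hypothetical helper, not part of BEAN 2.0.
# Reports every argument that is not an existing regular file;
# succeeds only if all of them exist.
check_tool_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -f "$p" ]; then
            echo "not found: $p" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example use with the placeholder paths (replace them with your own settings):
#   check_tool_paths /path/to/reformat.pl /path/to/pfam_scan.pl /path/to/svm-predict
```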
Type the following command in a Linux terminal to test whether BEAN 2.0 works:
perl classify.pl seqs_for_test.fasta
$ perl classify.pl SEQS output_file
SEQS is a protein sequence file in FASTA format; output_file is the file to which the results are written.
$ perl classify.pl seqs_for_test.fasta prediction_result.txt
You will get a result file like this if BEAN 2.0 executes successfully:
protein         score(e-value)    methods     effector
YF81_THET2      1e-25             BLAST       no
G8Z8Z9_BRAOL    -0.616122         BEAN_2.0    no
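For genome-scale runs, the result file can be filtered from the command line. The sketch below uses a hypothetical helper, list_by_label, which is not part of BEAN 2.0. It assumes whitespace-separated columns and a single header line, as in the sample result. Because the sample shows only "no" in the effector column, the label to select is passed as an argument rather than hard-coded.

```shell
# Hypothetical helper, not part of BEAN 2.0.
# Prints the protein IDs (column 1) whose "effector" value (column 4) equals $2.
# Assumes whitespace-separated columns and one header line in file $1.
list_by_label() {
    awk -v label="$2" 'NR > 1 && $4 == label { print $1 }' "$1"
}

# Example use on the result file from the command above:
#   list_by_label prediction_result.txt no
```

Selecting on the "methods" column instead (for example, to separate BLAST hits from BEAN_2.0 predictions) would only require comparing the third field rather than the fourth.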