3. Getting Started

Before launching PGAP-X, you need a workspace.

3.1. Create Your Workspace

  A workspace is a folder. Create a folder anywhere on your computer's disk and give it any name you like, for example "WorkDirectoryExample". This folder is your workspace.
  Note that the workspace must contain two sub-folders, which you create yourself, with these specific names:

  • GenomeSequence
  • GenomeAnnotation

  Genome sequence files in FASTA format go in the GenomeSequence folder. Genome annotation files in GFF format go in the GenomeAnnotation folder.

  Be sure to create these two folders with the exact names given above.
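
  The folder layout described above can also be created with a short script. This is only a minimal sketch: the function name `create_workspace` is hypothetical, and the only facts taken from the text are the two sub-folder names.

```python
from pathlib import Path

# Sub-folder names required by PGAP-X (taken from the text above).
REQUIRED_SUBFOLDERS = ("GenomeSequence", "GenomeAnnotation")


def create_workspace(workspace: Path) -> list[Path]:
    """Create the workspace folder and its two required sub-folders.

    Existing folders are left untouched, so the function is safe to re-run.
    """
    created = []
    for name in REQUIRED_SUBFOLDERS:
        folder = workspace / name
        # parents=True also creates the workspace folder itself if needed.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

  For the example in this section, the call would be `create_workspace(Path("WorkDirectoryExample"))`.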

  Figure 3.1.1 (image missing): The two folders you must create yourself.

  Figure 3.1.1 shows the two folders you should create in your workspace. Once you start running the tool, it creates some new folders in your workspace, as shown in Figure 3.1.2.

  Please do NOT delete or remove any folder or file whose function you do not understand. The only exception is the TemporaryData folder, which contains only temporary data created during computation.

  Figure 3.1.2 (image missing): Other folders that appear in your workspace. Do not delete them.

3.2. Choose the Species You Want to Analyze

  The next step is to choose the species you want to analyze. Here we are interested in 6 strains of Streptococcus pneumoniae, with the following NCBI accession numbers:

  • NC_012467
  • NC_012468
  • NC_012469
  • NC_014251
  • NC_014494
  • NC_014498

  We will use the data of these 6 genomes in the following analysis.

3.3. Prepare Genome Sequence Data

  Genome sequence data is the basic data for comparative genomic analysis.

  We accept the FASTA format (see "FASTA format" on Wikipedia), which you can download directly from NCBI. Put all the genome sequence data into the "GenomeSequence" folder.
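
  Before starting the tool, it can help to confirm that every file in "GenomeSequence" at least looks like FASTA. The sketch below is not part of PGAP-X; it only checks that the first non-empty line starts with ">" and counts the header lines. The function names are hypothetical.

```python
from pathlib import Path


def count_fasta_records(path: Path) -> int:
    """Return the number of FASTA records (">" header lines) in a file.

    Raises ValueError if the file is empty or does not begin with a header.
    """
    records = 0
    seen_content = False
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if not seen_content:
                seen_content = True
                if not line.startswith(">"):
                    raise ValueError(f"{path.name}: first line is not a FASTA header")
            if line.startswith(">"):
                records += 1
    if not seen_content:
        raise ValueError(f"{path.name}: file is empty")
    return records


def check_sequence_folder(workspace: Path) -> dict[str, int]:
    """Map each file name in GenomeSequence to its FASTA record count."""
    folder = workspace / "GenomeSequence"
    return {
        p.name: count_fasta_records(p)
        for p in sorted(folder.iterdir())
        if p.is_file()
    }
```

  This is a format sanity check only; it does not verify that the sequence itself is complete or correct.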

3.4. Prepare Genome Annotation Data

  We accept the NCBI PTT file format for genome annotation. A PTT file is a tab-delimited text table with information on location, annotation, COG families, locus tags, GI numbers, length, and gene symbols. The column headers follow the order shown in the example, as defined by NCBI as of 2009.
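
  To illustrate what "tab-delimited table" means in practice, here is a minimal reading sketch. It assumes that a PTT file may begin with free-text description lines and that the column-header row is the first line whose first field is "Location"; that layout, and the helper names `read_ptt` and `split_location`, are assumptions for illustration, not something PGAP-X exposes.

```python
import csv
from pathlib import Path


def read_ptt(path: Path) -> list[dict[str, str]]:
    """Parse a PTT file into a list of rows keyed by column name.

    Assumption: lines before the column-header row are description text,
    and the header row is the first line whose first tab-separated field
    is "Location".
    """
    with path.open("r", encoding="utf-8", errors="replace", newline="") as handle:
        lines = handle.read().splitlines()
    for index, line in enumerate(lines):
        if line.split("\t")[0] == "Location":
            break
    else:
        raise ValueError(f"{path.name}: no header row starting with 'Location'")
    # QUOTE_NONE keeps quote characters inside product names untouched.
    reader = csv.DictReader(lines[index:], delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def split_location(location: str) -> tuple[int, int]:
    """Split a location such as "197..1558" into integer start and end."""
    start, end = location.split("..")
    return int(start), int(end)
```

  Each returned row is a dictionary, so a field is read by its header name, for example `row["Location"]`.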

  Figure 3.4.1 (image missing): Sample PTT file of NC_012467, downloaded from NCBI.

  Put all the genome annotation data into the "GenomeAnnotation" folder.
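
  Once both folders are filled, a quick comparison can reveal a genome that has a sequence file but no annotation file, or the reverse. The sketch below assumes that the two files for one genome share the same base name (for example a hypothetical "NC_012467.fna" and "NC_012467.ptt"); this manual does not state how PGAP-X matches the files, so treat that pairing rule as an assumption.

```python
from pathlib import Path


def genome_names(folder: Path) -> set[str]:
    """Return the file names in a folder with their extension removed."""
    return {p.stem for p in folder.iterdir() if p.is_file()}


def compare_workspace(workspace: Path) -> tuple[set[str], set[str], set[str]]:
    """Return (paired, sequence_only, annotation_only) genome names.

    Assumption: a sequence file and its annotation file share the same
    base name; the manual does not state how PGAP-X matches them.
    """
    seq = genome_names(workspace / "GenomeSequence")
    ann = genome_names(workspace / "GenomeAnnotation")
    return seq & ann, seq - ann, ann - seq
```

  If the second or third set is not empty, a file is probably missing or misnamed.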

3.5. Choose Your Workspace

  Now it's time to start PGAP-X. The main interface is called "PGAP-X Lite Launcher". The following figures were made under Windows 7.

  Figure 3.5.1 (image missing): ComparativeGenomicTool Lite Launcher.

  Click the "Get Start" button and choose your workspace folder. If you followed the previous directions, you will see an interface like Figure 3.5.2.

  Figure 3.5.2 (image missing): Get Start (functional modules are not available).

  The directory of the workspace you chose is shown; click the "Change Workspace" button to choose a different one. The six genomes you placed in the workspace are already listed on the left side, and you can exclude any genome you do not want to analyze. Click the "Start" button to begin the analysis. The functional modules then become available, as in Figure 3.5.3.

  Figure 3.5.3 (image missing): Functional modules available.

  Whenever you want to change your workspace, click the "Get Start" button.

  Please note that PGAP-X only accepts file paths containing English letters, numbers, ".", "/", "\", "_", "-", and spaces.
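
  The character rule above can be checked before choosing a workspace. The following is a sketch, not PGAP-X's own validation. It accepts exactly the listed characters, plus one assumption that the text does not state: a leading Windows drive prefix such as "C:" is tolerated even though ":" is not in the list. The function name `is_supported_path` is hypothetical.

```python
import re

# Characters the text lists as accepted: English letters, digits, ".", "/",
# "\", "_", "-" and the space character.
_ALLOWED = re.compile(r"[A-Za-z0-9./\\_\- ]*")
# Assumption (not stated in the text): a Windows drive prefix such as "C:"
# is tolerated even though ":" is not in the list.
_DRIVE_PREFIX = re.compile(r"[A-Za-z]:")


def is_supported_path(path: str, allow_drive_prefix: bool = True) -> bool:
    """Return True if the path uses only the characters listed above."""
    if allow_drive_prefix and _DRIVE_PREFIX.match(path):
        path = path[2:]  # drop the "C:" part before checking the rest
    return bool(path) and _ALLOWED.fullmatch(path) is not None
```

  Pass `allow_drive_prefix=False` to apply the listed characters strictly, in which case any path containing ":" is rejected.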