LCP example - Handout from Cookbook Linkage Analysis Course, 10/18/2000

This handout describes how to use lcp to make scripts to run the main FASTLINK programs. The lcp-produced scripts look very different for different operating systems, but the usage of lcp is essentially the same. Like preplink, lcp is menu based. However, the lcp menus are full-screen menus and modifications are done differently.

lcp uses control characters to move between menus and to finish the script. To type a control character, hold down the control key and press an alphabetic key at the same time.

The following control characters are especially useful:

CTRL-A: abort
CTRL-H: help
CTRL-N: move to the next screen
CTRL-P: move to the previous screen
CTRL-Z: finish gracefully and write out the script.

I will illustrate 3 uses of lcp with respect to CLAP that would be typical in a linkage study.

First we will make an MLINK script that does 2-locus analysis of CLAP vs. locus 2 and CLAP vs. locus 3. This is representative of the genome-search phase.

Second, we will make an ILINK script that helps find the true recombination fraction between loci 2 and 3.

Finally we will make a LINKMAP script that could be used to determine which locus order among:

  1 2 3
  2 1 3
  2 3 1
is correct.

To start up lcp, type:

The first menu looks like:

                L I N K A G E   C O N T R O L   P R O G R A M   

                                  Input Files

             COMMAND file name [pedin] : pedin
             LOG file name [final.out] : final.out
         STREAM file name [stream.out] : stream.out
        PEDIGREE file name [pedin.dat] : pedin.dat
      PARAMETER file name [datain.dat] : datain.dat
       Secondary PEDIGREE file name [] : 
      Secondary PARAMETER file name [] : 

                  CTRL/A - Abort  CTRL/H - Help  CTRL/Z - Exit

I want to change the first three entries.
The first entry is the script name.
The second entry is the name of the primary output file.
The third entry is the name of the secondary output file.

After my changes, the menu looks like:

                L I N K A G E   C O N T R O L   P R O G R A M   

                                  Input Files

             COMMAND file name [pedin] : mlink-pedin
             LOG file name [final.out] : mlink-final.out
         STREAM file name [stream.out] : mlink-stream.out
        PEDIGREE file name [pedin.dat] : pedin.dat
      PARAMETER file name [datain.dat] : datain.dat
       Secondary PEDIGREE file name [] : 
      Secondary PARAMETER file name [] : 

                  CTRL/A - Abort  CTRL/H - Help  CTRL/Z - Exit

Now I am ready to move to the next menu, so I type CTRL-N.

                        Pedigree Options

                     General pedigrees : <-
            Three-generation pedigrees :
          Experimental cross pedigrees :

Here I do not want to change anything, so I type CTRL-N.

The next menu looks as follows.

Here, you select which program you want to use. An lcp-produced script can do multiple runs of the same program, but not runs of different programs. Indeed, multiple runs of MLINK (one per locus pair) and LINKMAP (one per order) are standard.

We want to select MLINK. So I move the arrow to point to that option.

                       General Pedigree Analysis Options

                              LODSCORE :
                                 ILINK :
                               LINKMAP :
                                 MLINK : <-

The next menu looks as follows:

This menu is MLINK-specific and recalls the different usages of MLINK outlined in the slide. In our case we want a "Multiple pairwise Lod table", so I move the arrow to point to that option. The screen below shows the arrow in its default position, before I move it.

                   MLINK - Test Options

                   Specific evaluation : <-
                       Lod score table :
           Multiple pairwise Lod table :
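
To make concrete what a lod table contains, here is a minimal Python sketch. It is not how MLINK works internally: MLINK computes likelihoods on whole pedigrees, while this sketch covers only the textbook special case of phase-known, fully informative meioses. The function name and the counts (1 recombinant in 10 meioses) are hypothetical, chosen for illustration; the theta grid is the default list that lcp offers later in this handout.

```python
import math

def two_point_lod(theta, recombinants, meioses):
    """Lod score log10[L(theta) / L(0.5)] for phase-known meioses (0 <= theta < 1)."""
    nonrecombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # a single recombinant rules out theta = 0
    log_likelihood = nonrecombinants * math.log10(1.0 - theta)
    if recombinants > 0:
        log_likelihood += recombinants * math.log10(theta)
    # Under no linkage (theta = 0.5) every meiosis contributes a factor 0.5.
    return log_likelihood - meioses * math.log10(0.5)

# Hypothetical data: 1 recombinant among 10 informative meioses.
thetas = [0.0, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4]
lod_table = {t: two_point_lod(t, recombinants=1, meioses=10) for t in thetas}

for t, z in lod_table.items():
    print(f"theta = {t:<4}  lod = {z:.2f}")

best_theta = max(lod_table, key=lod_table.get)
print("highest lod on this grid at theta =", best_theta)
```

A lod table only samples the likelihood at the listed thetas; finding the best theta between grid points is the job of ILINK, covered below.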

The next screen is:

Note: This screen shows that, by default, the male theta and female theta are the same. If you want to change this, use CTRL-U to erase the arrow. I will leave the arrow there and move to the next screen.

                 MLINK - Sex Difference Options

                     No sex difference : <-

The next screen asks which loci and which thetas to use. We want to compare locus 1 against locus 2 and locus 1 against locus 3.

              MLINK - Multiple Pairwise Lod Table Specification
                            Command Screen 

                    First locus set [] : 
                   Second locus set [] : 
          Recombination fractions [.0] : .0

   Other recomb. [.01 .05 .1 .2 .3 .4] : .01 .05 .1 .2 .3 .4

Therefore, I fill in 1 as the "First locus set" and 2 3 as the "Second locus set". One can also change the list of candidate thetas that is used.
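
The following Python sketch spells out which MLINK runs these entries describe. It is only a way to reason about the screen, not code that lcp runs. Two points are my assumptions: that each locus in the first set is paired with each locus in the second set (with a single first locus, as here, this gives exactly the two pairs we want), and that the theta list is the ".0" entry followed by the "Other recomb." entries. The variable names are made up for the example.

```python
from itertools import product

# Entries as typed into the lcp screen above.
first_locus_set = [1]
second_locus_set = [2, 3]
thetas = [0.0] + [0.01, 0.05, 0.1, 0.2, 0.3, 0.4]

# Assumption: every locus of the first set is paired with every locus
# of the second set, giving one MLINK run per pair.
pairs = list(product(first_locus_set, second_locus_set))

for a, b in pairs:
    print(f"MLINK run: locus {a} vs. locus {b}, evaluated at thetas {thetas}")

print("number of runs:", len(pairs))
```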

The next screen that comes up is:

                        Pedigree Options

                     General pedigrees : <-
            Three-generation pedigrees :
          Experimental cross pedigrees :

At this point I am done making the script, so I enter CTRL-Z.

Next I will prepare the ILINK script. The beginning steps are similar. In this case I call the script ilink-pedin and the output files ilink-final.out and ilink-stream.out.

The first screen that looks different is:

                     ILINK - Order Options

                        Specific order : <-
                            All orders :
           Inversions of adjacent loci :

This is asking whether we want to find the best theta vector for one specific order of loci, all permutations, or only some permutations. In our case, we want the theta between loci 2 and 3, so that is one specific order. Therefore, I leave the default arrow where it is and move on with CTRL-N.

The sex difference options are slightly different for ILINK, but unless you are a linkage wizard, stick to "no sex difference".

The next screen that is very different from what we have seen before looks like:

                ILINK - Locus Order Specification
                           Command Screen 

                        Locus order [] : 
          Recombination fractions [.1] : .1

Here we enter the locus order: 2 3

The recombination fraction line deserves substantial explanation. ILINK uses methods from a branch of mathematics called "Numerical Analysis" to estimate theta. These methods are "iterative" in the sense that they start with an initial guess and move to a better value, until the value is locally optimal and cannot be improved.

Folklore wisdom about linkage analysis is that .1 is a good starting value. However, there are many examples to show that using only one starting value is dangerous because the local optimum you get to may be far from the global optimum.

Furthermore, it is convenient for the implementation of ILINK to ignore the genetic restriction that theta cannot exceed 0.5. As a result, ILINK will sometimes output a best theta (or a component of a theta vector) that is > 0.5. In these cases you must start over with a new initial guess.
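
The danger of a single starting value can be seen in a small Python sketch. Everything in it is an assumption made for illustration: toy_score is an invented surface with three peaks, not a real linkage likelihood, and hill_climb is a simple step-halving search, not the optimizer that ILINK actually uses. The sketch starts from several initial guesses, discards any answer above 0.5, and keeps the best of the rest.

```python
import math

def toy_score(theta):
    """Invented stand-in for a log-likelihood; NOT a real linkage likelihood."""
    peaks = [(0.12, 0.6), (0.35, 1.0), (0.70, 0.8)]  # (location, height)
    return sum(h * math.exp(-((theta - c) / 0.05) ** 2) for c, h in peaks)

def hill_climb(score, start, step=0.01, tol=1e-6):
    """Move uphill from the starting guess; halve the step when stuck."""
    theta = start
    while step > tol:
        if score(theta + step) > score(theta):
            theta += step
        elif score(theta - step) > score(theta):
            theta -= step
        else:
            step /= 2
    return theta

def multi_start(score, starts):
    """Run one search per starting guess and keep the best admissible optimum."""
    results = [hill_climb(score, s) for s in starts]
    admissible = [t for t in results if t <= 0.5]  # theta > 0.5 is not genetic
    best = max(admissible, key=score)
    return results, best

results, best = multi_start(toy_score, [0.1, 0.3, 0.6])
for start, theta in zip([0.1, 0.3, 0.6], results):
    print(f"start {start}: local optimum near theta = {theta:.3f}")
print(f"best admissible theta = {best:.3f}")
```

On this toy surface the folklore start of .1 stops at the lower peak near 0.12, the start at 0.3 finds the higher peak near 0.35, and the start at 0.6 ends above 0.5 and is thrown away. With lcp, the corresponding practice is to make ILINK runs with different entries on the recombination fraction line and compare the results.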

After entering the locus order, I am done with the ILINK script and so I move on to making the LINKMAP script. In this case I call the script linkmap-pedin and the output files linkmap-final.out and linkmap-stream.out.

The first lcp screen that is different for LINKMAP looks like:

                LINKMAP - Test Interval Options

                     Specific interval : <-
                         All intervals :

    Location score confidence interval :

To understand what this is asking, we need to understand the "theory of LINKMAP". The theory is that there is a fixed map of markers whose position is known and 1 locus (typically the disease) that moves relative to that map.

If you know in which gap the moving locus lies, you ask for "Specific interval". If you want to compare different orders, you ask for "All Intervals". The third option is not for novice users.

I want to know where the disease locus (1) lies relative to 2 and 3, so I ask for "All Intervals" by moving the default arrow with the down-arrow key on my keyboard.
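
The "moving locus" idea can be sketched in a few lines of Python. The function name candidate_orders is made up for this example and is not part of lcp or LINKMAP. The sketch slides the test locus into every interval of the fixed map, including the two ends, and lists the resulting orders.

```python
def candidate_orders(test_locus, fixed_order):
    """Insert the test locus into each interval of the fixed map, ends included."""
    return [
        fixed_order[:i] + [test_locus] + fixed_order[i:]
        for i in range(len(fixed_order) + 1)
    ]

# Disease locus 1 moving across the fixed map 2 3.
orders = candidate_orders(1, [2, 3])
for order in orders:
    print(" ".join(str(locus) for locus in order))
```

For the fixed map 2 3 this prints the three orders listed at the start of the handout: 1 2 3, 2 1 3, and 2 3 1.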

The next screen specific to LINKMAP looks like:

                          LINKMAP - Map Specification
                                Command Screen 

                          Test loci [] : 
                Order of fixed loci [] : 
          Recombination fractions [.1] : .1

 Number of evaluations in interval [5] : 5

"Test loci" means: which loci move across the map? For novice users this is always the disease locus only, which is locus 1 by widely used convention. In our case, the order of fixed loci is 2 3.

For the recombination fraction, we should enter the estimate that ILINK gives as its result, but for this example I will enter 0.25.

Novice users usually leave the last line alone, although some people prefer to wait for 10 evaluations to get more detailed information.

We are done with lcp. We have produced 3 scripts:

  mlink-pedin
  ilink-pedin
  linkmap-pedin

The next step is to use these scripts. For usage it is important to remember that:
mlink-pedin will do 2 runs of MLINK, one for each pair of loci.
ilink-pedin will do 1 run of ILINK.
linkmap-pedin will do 3 runs of LINKMAP, one for each order.