Welcome to the
BioInformatics & Molecular Analysis Section (BIMAS)

WWW Promoter Scan

PROMOTER SCAN and its results are offered as is, with no implied warranty.

Analysis is done using the PROSCAN Version 1.7 suite of programs developed by Dr. Dan Prestridge. Information on SIGSCAN is maintained at the Advanced Biosciences Computing Center, University of Minnesota.

PROMOTER SCAN is designed to find putative eukaryotic Pol II promoter sequences in primary sequence data. The program is experimental and should be used as an experimental tool. It is best used to find regions in primary DNA sequence that are good candidates for further testing of promoter function. At this time, on test sets of promoter and non-promoter sequences, the program recognizes approximately 70% of primate promoter sequences, with a false positive rate of about one in every 14,000 bases.

If you locate a promoter sequence using this program, you must cite:

Prestridge, D.S. (1995) JMB 249:923-32.

Interpreting the Results:

The results show the location of predicted promoter sequences. Predicted regions are regions of DNA that contain a significant number and type of transcriptional elements (TEs) that are usually associated with Pol II promoter sequences. These promoter-associated TEs were previously determined by analysis (Prestridge, D.S. (1995) JMB 249:923-32). Reported putative promoters are those regions of your sequence that score above a predetermined cutoff score, set to recognize 70% of primate promoter sequences in the Eukaryotic Promoter Database (Bucher & Trifonov (1986) NAR 14:10009-26). At this cutoff score, false positive predictions occur at a rate of approximately one in every 14,000 single-strand bases. These estimates are based on experimental test sets of promoter and non-promoter sequences; your results may differ.
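As a rough illustration of what the quoted false positive rate means for a given submission, the arithmetic can be sketched as below. This is not part of PROSCAN; the function name and the assumption that both strands of the sequence are scanned are ours, and the rate itself is only the test-set estimate quoted above.

```python
def expected_false_positives(sequence_length, both_strands=True,
                             bases_per_false_positive=14_000):
    """Estimate how many false positive predictions to expect by chance.

    Uses the quoted rate of about one false positive per 14,000
    single-strand bases. Scanning both strands is an assumption,
    not something the rate itself guarantees.
    """
    strands = 2 if both_strands else 1
    return sequence_length * strands / bases_per_false_positive


# A 35,000-base sequence scanned on both strands covers 70,000
# single-strand bases, so about 5 false positives would be expected.
print(expected_false_positives(35_000))
```

A predicted promoter in a long sequence should therefore be weighed against the number of chance predictions expected for that length.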

Result Details:

If PROMOTER SCAN finds a putative promoter sequence, it reports the sequence range in which the putative promoter is found. It then reports whether a TATA box was found and, if so, estimates the Transcription Start Site (TSS) position from the TATA position. Both the TATA box location and the estimated TSS are reported. In test sets, 72% of promoters recognized by PROMOTER SCAN have a recognized TATA box, and in those cases the reported TSS is within ±10 bases of the actual TSS.

Significant signals (most of them transcriptional elements) are also reported. The transcription factor name (or, in its absence, the Ghosh site name) is reported, along with the TFD # (Ghosh TFD database reference number), strand, position, and significance weighting.

It is important to realize that the signal weight DOES NOT reflect the quality of the signal. Instead, it is a relative weighting of that particular signal's ability to discriminate promoter from non-promoter sequences, based upon the relative frequency with which the signal is found in promoter versus non-promoter sequences. For example, in the sample sequence, an NF-kB site is reported at position 79 with a weight of 1.094000. This reflects the fact that NF-kB is found approximately 1.094 times more often in promoter sequences than in non-promoter sequences; in other words, it is not a very useful discriminator. On the other hand, an Sp1 site at position 108 has a weight greater than 6. That particular definition of Sp1 is found about 6 times more frequently in promoter than in non-promoter sequences. A weight of 50 means that the signal is found ONLY in promoter sequences (in the test sets used so far). The relationship between a signal's weight and the quality of the signal is not known.

You will also find multiple sites of the same binding factor at the same location with different weights. These reflect the different consensus or specific signals used in the signal database. For example, one definition of a TFIID site might be "TATA", while another TFIID site definition might be "ATATAAT". Both TFIID site definitions would be reported for the sequence "GATATAATC"; however, only the first definition would be reported for the sequence "GTATAC".
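The two ideas above, several site definitions matching at one location and a weight taken as a frequency ratio, can be sketched as follows. This is a simplified illustration, not PROSCAN's code: the names find_sites and relative_weight are hypothetical, matching is by literal string on the forward strand only, and treating 50 as a ceiling on the ratio is our assumption based on the statement that 50 marks signals seen only in promoters.

```python
# The two hypothetical TFIID definitions used in the text above.
SITE_DEFINITIONS = [
    ("TFIID", "TATA"),
    ("TFIID", "ATATAAT"),
]


def find_sites(sequence, definitions=SITE_DEFINITIONS):
    """Return (factor, pattern, 1-based position) for every literal match."""
    sequence = sequence.upper()
    hits = []
    for name, pattern in definitions:
        start = sequence.find(pattern)
        while start != -1:
            hits.append((name, pattern, start + 1))
            # Step forward by one base so overlapping matches are found.
            start = sequence.find(pattern, start + 1)
    return hits


def relative_weight(promoter_rate, nonpromoter_rate, ceiling=50.0):
    """Ratio of occurrence rates in promoter vs non-promoter sequences.

    A signal never seen in non-promoter sequences gets the ceiling value.
    Capping other ratios at the ceiling is an assumption of this sketch.
    """
    if nonpromoter_rate == 0:
        return ceiling
    return min(promoter_rate / nonpromoter_rate, ceiling)


print(find_sites("GATATAATC"))  # both TFIID definitions match
print(find_sites("GTATAC"))     # only the "TATA" definition matches
print(relative_weight(6.0, 1.0))
print(relative_weight(0.002, 0.0))
```

Real site definitions may be consensus patterns with ambiguous bases rather than literal strings, which this sketch does not handle.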

Dr. Prestridge encourages feedback on both positive and negative results; send e-mail to danp@biosci.umn.edu. Any feedback will help to improve the program in the future.

