DNA sequences are an important type of biomedical data and contain many biologically meaningful functional sites, such as transcription start sites, coding regions, translation initiation sites, splice sites, and polyadenylation signals. Our DNA functional site miner (DNAFSMiner) is a web-based software toolbox for predicting these functional sites in DNA sequences. It currently offers two tools. TIS Miner predicts translation initiation sites (TIS) in vertebrate mRNA, cDNA, or DNA sequences. Poly(A) Signal Miner predicts polyadenylation (poly(A)) signals in human DNA sequences.

DNAFSMiner is built on statistical and data mining techniques. The method consists of three steps: (1) generating candidate features from training sequences; (2) selecting important features from the candidate features; and (3) integrating the selected features to build a prediction system. For more details about these computational algorithms, please refer to Technology Background and Data or our publications.
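
To make the three steps concrete, here is a minimal Python sketch of such a pipeline. It is an illustration only, not the DNAFSMiner implementation: the choice of k-gram presence features, information gain for selection, and a naive Bayes scorer for integration are all assumptions, as are the function names and the four toy sequence windows (invented for the example, not taken from our data repository).

```python
from math import log, log2


def candidate_features(window, k_max=3):
    """Step 1 (assumed form): the set of k-grams, k = 1..k_max, in a window."""
    window = window.upper()
    return {window[i:i + k]
            for k in range(1, k_max + 1)
            for i in range(len(window) - k + 1)}


def label_entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    h = 0.0
    for count in (sum(labels), len(labels) - sum(labels)):
        if count:
            p = count / len(labels)
            h -= p * log2(p)
    return h


def information_gain(feature, samples):
    """Entropy reduction from splitting samples on presence of the feature."""
    present = [y for feats, y in samples if feature in feats]
    absent = [y for feats, y in samples if feature not in feats]
    n = len(samples)
    return (label_entropy([y for _, y in samples])
            - len(present) / n * label_entropy(present)
            - len(absent) / n * label_entropy(absent))


def select_features(samples, top_n=5):
    """Step 2 (assumed criterion): keep the top_n features by information gain."""
    vocabulary = set().union(*(feats for feats, _ in samples))
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(vocabulary, key=lambda f: (-information_gain(f, samples), f))
    return ranked[:top_n]


def train(samples, selected):
    """Step 3 (assumed model): naive Bayes log-odds with Laplace smoothing."""
    n_pos = sum(y for _, y in samples)
    n_neg = len(samples) - n_pos
    model = {"prior": log((n_pos + 1) / (n_neg + 1)), "weights": {}}
    for f in selected:
        pos_with = sum(1 for feats, y in samples if y and f in feats)
        neg_with = sum(1 for feats, y in samples if not y and f in feats)
        p_pos = (pos_with + 1) / (n_pos + 2)
        p_neg = (neg_with + 1) / (n_neg + 2)
        # One weight if the feature is present, another if it is absent.
        model["weights"][f] = (log(p_pos / p_neg),
                               log((1 - p_pos) / (1 - p_neg)))
    return model


def score(model, window):
    """Log-odds that the window contains a true site; above 0 favours 'site'."""
    feats = candidate_features(window)
    total = model["prior"]
    for f, (w_present, w_absent) in model["weights"].items():
        total += w_present if f in feats else w_absent
    return total


# Toy training windows around a candidate ATG (hypothetical, for illustration).
raw = [("GCCACCATGGCG", 1), ("GCCGCCATGGCT", 1),
       ("TTTATTATGTAA", 0), ("ATTTTAATGTTT", 0)]
samples = [(candidate_features(w), y) for w, y in raw]
selected = select_features(samples)
model = train(samples, selected)
```

Calling `score(model, window)` on a new window returns a log-odds value; a real system would train on many more sequences and would likely use richer features than the plain k-gram presence assumed here.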

The two software tools, TIS Miner and Poly(A) Signal Miner, are listed in the left column of this page. The input and output data formats of the tools are described here. Some training sequences can also be downloaded from our data repository.