UniVec is a database that can be used to quickly identify segments within nucleic acid sequences which may be of vector origin (vector contamination). Screening using UniVec is efficient because a large number of redundant subsequences have been eliminated to create a database that contains only one copy of every unique sequence segment from a large number of vectors.

In addition to vector sequences, UniVec also contains sequences for those adapters, linkers, and primers commonly used in the process of cloning cDNA or genomic DNA. This enables contamination with these oligonucleotide sequences to be found during the vector screen.

UniVec can be obtained from the NCBI FTP directory.

Eliminating the Redundancy from Vector Sequences

Many vectors have the same backbone or share common functional cassettes. Consequently, databases with the full sequence for each vector contain multiple copies of such elements. A single copy of each unique element is sufficient to allow that sequence to be recognized as vector contamination. The size of a database designed for screening can therefore be greatly reduced by eliminating the redundant copies of any sequence (see statistics for the current UniVec build).

The UniVec database is built by sequentially processing each input sequence. The input sequence is first compared to all the sequences already in the database. The location of any segment identical to database sequences is recorded. This information is used to extract only those segments of the input sequence that contain novel sequence. These novel elements are then added to the database. This cycle is repeated for each sequence that is to be represented in the finished database.

Benefits of a Non-Redundant Database for Screening

Elimination of redundant sequence segments reduces UniVec to less than 20% of the size of an equivalent database containing the full sequences for the same set of vectors. This has two major benefits for screening: The computation time required to screen a query sequence is reduced significantly. Analysis of the results is facilitated because redundant hits to multiple copies of the same sequence are largely eliminated.

Most vectors are circular, but their sequences are represented in linear form by opening the sequence at one particular location (the circular junction).
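
The build cycle described above (compare each input sequence to the database, record the segments that are already present, keep only the novel segments) can be illustrated with a small sketch. This is not the actual UniVec build procedure: it assumes exact k-mer matching as a stand-in for whatever sequence comparison NCBI uses, and the function names, the `k` parameter, and the toy sequences are all hypothetical.

```python
def covered_mask(seq, known_kmers, k):
    """Mark every base of seq that lies inside a k-mer already seen."""
    mask = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in known_kmers:
            for j in range(i, i + k):
                mask[j] = True
    return mask


def novel_segments(seq, known_kmers, k):
    """Return (start, end, subsequence) for each maximal run of uncovered bases."""
    mask = covered_mask(seq, known_kmers, k)
    segments = []
    start = None
    for i, covered in enumerate(mask):
        if not covered and start is None:
            start = i                      # a novel run begins here
        elif covered and start is not None:
            segments.append((start, i, seq[start:i]))
            start = None                   # the novel run ended at i
    if start is not None:
        segments.append((start, len(seq), seq[start:]))
    return segments


def build_nonredundant(sequences, k=4):
    """Process inputs one at a time, storing only segments not yet represented.

    Assumption: "identical segment" is approximated by shared exact k-mers.
    """
    database = []
    known_kmers = set()
    for seq in sequences:
        # Compare against what is already known, then keep the novel parts.
        for _start, _end, segment in novel_segments(seq, known_kmers, k):
            database.append(segment)
        # Simplification: index every k-mer of the full input so that later
        # inputs sharing this sequence are recognized as redundant.
        for i in range(len(seq) - k + 1):
            known_kmers.add(seq[i:i + k])
    return database


# Toy example: two hypothetical "vectors" sharing one backbone.
backbone = "ACGTACGGTCAGTTCA"
vector_1 = backbone + "GGGGCCCC"
vector_2 = backbone + "TTTTAAAA"

db = build_nonredundant([vector_1, vector_2], k=4)
print(db)  # ['ACGTACGGTCAGTTCAGGGGCCCC', 'TTTTAAAA']
```

In this toy case the shared backbone is stored once, so the database holds 32 bases instead of the 48 in the two full sequences. A realistic build would need a proper alignment step and a minimum segment length, both of which are left out here.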