SeqWiz: a modularized toolkit for next-generation protein sequence database management and analysis


  • Installation

    (1). Required environment:

    Python 3.x, with pip and requests

    (2). Install Requests for internet support

    >>> pip install requests

    *Recommended version: 2.x

    (3). Install Biopython for predicting physicochemical properties

    >>> pip install biopython

    *Recommended version: 1.7x (e.g., 1.79)

    See: https://biopython.org/wiki/Download

    (4). Install wxPython for GUI support

    *Recommended version: 4.x

    For Windows:
    >>> pip install -U wxPython

    For Ubuntu 20.04:

    >>> pip install -U -f https://extras.wxpython.org/wxPython4/extras/linux/gtk3/ubuntu-20.04 wxPython

    # for libsdl support, if needed

    >>> sudo apt-get install libsdl2-2.0
    >>> sudo apt-get install libsdl2-dev

    *For other Linux distributions, please find the matching build at: https://extras.wxpython.org/wxPython4/extras/linux/

    See: https://www.wxpython.org/pages/downloads/
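
    After installation, the setup can be verified from Python. The following is a minimal sketch, not part of SeqWiz: it assumes Python 3.8 or later (for importlib.metadata), and the function name check_environment is made up for illustration. It reports the installed version of each package and whether it matches the recommended version line listed above.

```python
from importlib import metadata

# PyPI distribution name -> version prefix recommended in the steps above.
RECOMMENDED = {"requests": "2.", "biopython": "1.7", "wxPython": "4."}


def check_environment(recommended=RECOMMENDED):
    """Return {package: (installed_version_or_None, matches_recommendation)}."""
    report = {}
    for name, prefix in recommended.items():
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = None  # package is not installed
        matches = version is not None and version.startswith(prefix)
        report[name] = (version, matches)
    return report


if __name__ == "__main__":
    for name, (version, matches) in check_environment().items():
        status = "ok" if matches else "check version"
        print(f"{name}: {version or 'not installed'} ({status})")
```

    A package reported as "not installed" only matters for the features that need it: Biopython for property prediction and wxPython for the GUI.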

  • Development and testing environments

    (1). SeqWiz was developed under Windows OS:

    Windows 10, 64-bit; Python 3.9.5; requests 2.27.1; Biopython 1.79; wxPython 4.1.1

    GUI snapshot for Windows:

    (2). SeqWiz was also tested under Ubuntu and Debian:

    GUI snapshot for Ubuntu (20.04):

    GUI snapshot for Debian (11):

  • Functionalities

    Standalone tools are available in the "tools" directory and are classified into five categories:

    | Category | Tool name | Functions | Note |
    |----------|-----------|-----------|------|
    | Sequence Retrieval | UpSpecies | Search or view taxonomy ID | Based on UniProt |
    | Sequence Retrieval | UpRetrieval | Download species-specific sequences and annotations, create structured database | Based on UniProt; supports full annotations, with options to create SQPD and perform feature prediction |
    | Sequence Retrieval | NCBISpecies | Search or view taxonomy ID | Based on NCBI |
    | Sequence Retrieval | NCBIRetrieval | Download species-specific sequences | Based on NCBI, with options to create SQPD and perform feature prediction |
    | Sequence Retrieval | EnsemblSpecies | Search or view taxonomy ID | Based on Ensembl |
    | Sequence Retrieval | EnsemblRetrieval | Download species-specific sequences | Based on Ensembl, with options to create SQPD and perform feature prediction |
    | Sequence Retrieval | DbManage | Create structured database from other sequence sources | |
    | Sequence Generation | MatureSeq | Generate mature forms of protein sequences | Based on UniProt annotations |
    | Sequence Generation | SepFinder | Predict sORFs and SEPs from transcript sequences | Supports both linear and circular RNAs |
    | Sequence Generation | SeqDecoy | Generate decoy FASTA sequences | |
    | Sequence Conversion | CheckSeq | Check the format of FASTA, PEFF, SQPD, SET or PEPLIS | *PEPLIS: a list of peptide sequences |
    | Sequence Conversion | UpConvert | Convert FASTA from UniProt to PEFF or SQPD | Based on UniProt; supports full annotations |
    | Sequence Conversion | SeqConvert | Simple format converter for non-UniProt sequences, from FASTA to | |
    | Sequence Filter | SeqFilter | Sequence filter to generate a SET list from SQPD | |
    | Sequence Filter | TabFilter | Table filter to generate a SET list | *Using the result table from SeqAnnotate is recommended |
    | Sequence Analysis | SeqAnnotate | Sequence statistics for single or grouped residues; calculation or prediction of physicochemical properties, including isoelectric point, physiological charge, reduced molar extinction, cystines molar extinction, aromaticity, instability and grand average of hydropathy | *Requires the Biopython package |
    | Sequence Analysis | MotifCount | Motif statistics for sequence files | |
    | Sequence Analysis | SeqWindow | Sequence window extraction from a position table (for a list of sites or peptides) | |

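
    To illustrate what a "sequence window" around a site means (see SeqWindow above), here is a minimal sketch. It is not SeqWindow's code or interface: the function name extract_window, the default flank of 7 residues and the "_" padding character are all assumptions chosen for illustration.

```python
def extract_window(sequence, position, flank=7, pad="_"):
    """Return the residues around a 1-based site position.

    The window always has length 2 * flank + 1; positions that fall
    outside the sequence are filled with the pad character.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    center = position - 1  # convert to a 0-based index
    window = []
    for i in range(center - flank, center + flank + 1):
        window.append(sequence[i] if 0 <= i < len(sequence) else pad)
    return "".join(window)


if __name__ == "__main__":
    # Window of two residues on each side of position 5 (the "Y").
    print(extract_window("MKTAYIAKQR", 5, flank=2))
```

    Padding keeps every window the same length, so windows near the termini can still be aligned with the others, for example when counting motifs.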

  • Usage

    (1). CMD usage

    Each SeqWiz tool provides a standard command-line interface (CLI) with self-describing arguments. Use the common "-h" flag to show the help:

    python {script_name}.py -h
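
    The same help call can be scripted to get an overview of all tools at once. This is a minimal sketch under assumptions: it treats every ".py" file directly inside the "tools" directory as a tool script that accepts "-h", and the helper names help_command and show_help_for_all are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def help_command(script_path):
    """Build the argument list equivalent to `python {script_name}.py -h`."""
    return [sys.executable, str(script_path), "-h"]


def show_help_for_all(tools_dir="tools"):
    """Run `-h` for every .py script in tools_dir; return {script name: help text}."""
    outputs = {}
    for script in sorted(Path(tools_dir).glob("*.py")):
        result = subprocess.run(
            help_command(script), capture_output=True, text=True
        )
        outputs[script.name] = result.stdout
    return outputs


if __name__ == "__main__":
    for name, text in show_help_for_all().items():
        print(f"===== {name} =====")
        print(text)
```

    Using sys.executable ensures that the scripts run with the same interpreter (and the same installed packages) as the calling script.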

    Reference usages:

    (2). GUI usage

    SeqWiz also provides a GUI-to-CLI interface to run the tools.

    Step 1: Start with the index window, which shows the list of tools.

    Step 2: Select a tool in the APP list and launch it.

    Step 3: Set the parameters as needed.

    Step 4: Click the "Assemble" button to check the input and generate the commands.

    Step 5: Click the "Run" button to start the task.

    (3). Module usage

    The basic modules in the "mods" directory and the tool scripts in the "tools" directory can be used as Python modules.

    Run "AutoAPIDoc.py" in the "docs" directory to launch the API documentation for every script.
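
    One way to make these directories importable from your own script is to add them to sys.path. This is a minimal sketch, not a documented SeqWiz API: the helper import_from_repo is made up for illustration, and it assumes that each module is a plain ".py" file located directly inside "mods" or "tools". The functions that a module offers are described in the API documentation mentioned above.

```python
import importlib
import sys
from pathlib import Path


def import_from_repo(repo_root, module_name, subdir="mods"):
    """Import `module_name` from `<repo_root>/<subdir>` and return the module."""
    module_dir = str(Path(repo_root).resolve() / subdir)
    if module_dir not in sys.path:
        sys.path.insert(0, module_dir)  # make the directory importable
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Hypothetical usage, assuming the SeqWiz repository is the current
# directory and "fastabase" is one of the modules in "mods":
#   fastabase = import_from_repo(".", "fastabase")
#   help(fastabase)
```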

  • Practical examples

    (1). To create mature sequences for mouse proteins

    (2). To predict SEPs derived from circRNA transcripts (via GUI)
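
    Circular transcripts need special handling because an ORF can run across the back-splice junction. The sketch below illustrates that general idea only; it is not SepFinder's algorithm, and the function name find_orfs and the parameter min_codons are assumptions made for illustration. It scans a doubled copy of the sequence so that reading can continue past the end of a circular transcript.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(transcript, circular=False, min_codons=2):
    """Return (start_index, orf_nucleotides) for ATG...stop ORFs.

    For a circular transcript the sequence is scanned twice in a row so
    that an ORF can cross the back-splice junction; start codons are only
    taken from the first copy to avoid reporting duplicates.
    min_codons counts the codons before the stop codon, including ATG.
    """
    seq = transcript.upper().replace("U", "T")
    scan = seq * 2 if circular else seq
    orfs = []
    for start in range(len(seq)):
        if scan[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(scan) - 2, 3):
            if scan[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, scan[start:pos + 3]))
                break
    return orfs


if __name__ == "__main__":
    rna = "TAAGGCATGGCC"  # the stop codon precedes the start codon
    print(find_orfs(rna, circular=False))  # no ORF in the linear reading
    print(find_orfs(rna, circular=True))   # ORF crossing the junction
```

    In this simplified version, an ORF without an in-frame stop codon within two passes of the circle is not reported.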

    (3). To generate subsets and retrieve sequences

    (4). To use "fastabase" as a module

    Open "mod_test.py" in the "test" directory, then edit it or run it in the Python shell.

  • License

    This project is licensed under the GNU General Public License, version 3.0.

    See: https://www.gnu.org/licenses/gpl-3.0.en.html