SeqWiz

A modularized toolkit for next-generation protein sequence database management and analysis


About

Protein sequences are the basis of proteomic identification. The SeqWiz project offers an open and free solution to:

  • Design and promote next-generation sequence data formats with high performance
  • Develop basic modules for handling both conventional and novel data formats
  • Develop a collection of easy-to-use tools for proteomic-centric sequence management and analysis

Publications:

  • Zhang P, Wang M, Zhou T, Chen D. SeqWiz: a modularized toolkit for next-generation protein sequence database management and analysis. BMC Bioinformatics. 2023 May 17;24(1):201. doi: 10.1186/s12859-023-05334-9. PMID: 37194023; PMCID: PMC10189941.

NEWS

Fix package for UniProt (v1.0) released!
UniProt has switched to a new UI and API, and the old API URLs have been retired.
This package updates the API URLs used by SeqWiz version 1.0.
To apply it, download and uncompress the package, then replace the old scripts with the new ones.

Features

  • Cross-platform

    SeqWiz is written in Python with minimal requirements, so it runs on various operating systems, including Windows, macOS and Linux.

  • Species specific sequences

    SeqWiz makes it easy to retrieve species-specific sequences from the UniProt, NCBI and Ensembl databases.

  • Proteoform support

    SeqWiz follows the PEFF (PSI Extended FASTA Format) standard and provides automatic proteoform annotations for UniProt sequences.

  • sORF and SEP prediction

    sORFs (small ORFs) and sORF-encoded peptides (SEPs) have become a research hotspot, especially in the fields of lncRNA and circRNA. SeqWiz offers a tool to predict potential sORFs and SEPs in lncRNAs and circRNAs.

  • Friendly for both end-users and software developers

    SeqWiz provides GUI (with easy-to-use input widgets and automatic validations) for end-users, as well as CMD (with self-describing arguments) based scripts for software developers.

  • Highly efficient data formats

    Two new formats, based on sqlite3 and JSON, are proposed to store sequence data (IDs, sequences, and features) and IDs for efficient sequence management and analysis.
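
To illustrate why sqlite3 and JSON suit this kind of storage, here is a minimal sketch of keeping IDs, sequences, and features in an indexed table. The table name, column layout, helper functions, and sample record are all hypothetical and chosen for illustration; they are not the actual SeqWiz formats.

```python
import json
import sqlite3

# Hypothetical schema for illustration only; NOT the actual SeqWiz format.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entries (
    id       TEXT PRIMARY KEY,  -- sequence identifier
    sequence TEXT NOT NULL,     -- amino-acid sequence
    features TEXT NOT NULL      -- JSON-encoded list of annotations
)
"""


def build_db(records, path=":memory:"):
    """Create the table and store one row per record (assumed dict layout)."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO entries VALUES (?, ?, ?)",
        [(r["id"], r["sequence"], json.dumps(r["features"])) for r in records],
    )
    conn.commit()
    return conn


def fetch(conn, seq_id):
    """Look up one entry by ID using the primary-key index."""
    row = conn.execute(
        "SELECT id, sequence, features FROM entries WHERE id = ?", (seq_id,)
    ).fetchone()
    if row is None:
        return None
    return {"id": row[0], "sequence": row[1], "features": json.loads(row[2])}


if __name__ == "__main__":
    # Made-up record, used only to exercise the sketch.
    demo = [{"id": "DEMO1", "sequence": "MKTAYIAK",
             "features": [{"type": "variant", "position": 3}]}]
    conn = build_db(demo)
    print(fetch(conn, "DEMO1"))
```

Unlike a flat FASTA file, a lookup by ID here goes through the primary-key index rather than scanning every record, and the JSON column keeps per-sequence annotations structured.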

    • Windows

      GUI under Windows 10

    • Debian

      GUI under Debian 11

    • Ubuntu

      GUI under Ubuntu 20.04

    Current stable version: 1.0

    Downloads

    • SeqWiz package (version 1.0), released on Sunday, June 26, 2022
    • Fix package for UniProt (version 1.0), released on Wednesday, June 29, 2022

    Tutorial

    • (1). Required environment: Python 3.x
      (2). Install Requests for internet access
      # Recommended version: 2.x
      pip install requests
      (3). Install Biopython for predicting physicochemical properties
      # Recommended version: 1.7.x
      pip install biopython
      (4). Install wxPython for GUI support
      # Recommended version: 4.x
      pip install -U wxPython
      # See https://www.wxpython.org/pages/downloads/ for more information
    • SeqWiz provides a GUI-to-CLI interface for running the tools. Start from the index window (GUI.py in the root directory) to list the tools or launch a specific one.
    • Each SeqWiz tool has a standard CLI with self-describing arguments. Use the common "-h" flag to show the help text:

      python {script_name}.py -h

    • The basic modules in the "mods" directory and the tool scripts in the "tools" directory can be used as Python modules. Run "AutoAPIDoc.py" in the "docs" directory to launch the API documentation for every script.
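
Before launching the GUI or the tools, it can help to confirm that the requirements from the installation steps above are in place. The following is a minimal sketch, not part of SeqWiz: the function names are hypothetical, and the only assumption is that the three packages are imported as `requests`, `Bio`, and `wx`, which are their usual import names.

```python
import importlib.util
import sys

# pip package name -> importable module name
REQUIRED = {"requests": "requests", "biopython": "Bio", "wxPython": "wx"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


def check_environment():
    """Collect human-readable problems; an empty list means all is well."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3.x is required")
    for name in missing_packages():
        problems.append("missing package: {0} (pip install {0})".format(name))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks complete"]:
        print(line)
```

The check only tests whether each module can be located; it does not compare installed versions against the recommended ones.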
