1 unstable release
0.1.0 | Dec 13, 2023 |
---|
#161 in Biology
Used in sigalign-impl
6KB
175 lines
What is SigAlign
?
SigAlign
is a library for biological sequence alignment. Its output is defined explicitly and simply based on:
- Minimum Length (MinL): Alignments must exceed this threshold.
- Maximum Penalty per Length (MaxP): Alignments must not exceed this penalty, calculated using a gap-affine scoring scheme which includes penalties for mismatch, gap-open, and gap-extend.
Purpose
SigAlign
is designed for:- Rapidly Identifying Highly Similar Alignments: Quickly pinpoints alignments above defined cutoffs: MinL and MaxP.
- Simplified Parameters: Hides technical detail, focusing on parameters with biological semantics.
- Easy Customization: Offers a small number of easy-to-understand parameters, adaptable for various applications.
SigAlign
is not intended for:- Global Optimum Finding: Does not exclusively aim for the highest scoring alignments, particularly in cases of low similarity.
- Specialized Applications: Not pre-optimized for specific tasks such as read mapping or clustering.
Quick Start
For Rust
developer
- As a Rust library, SigAlign can take advantage of the most abundant features in Rust. SigAlign is a package registered in
crate.io
and can be added usingcargo
in the project directory:cargo add sigalign
- Example for Quick Scratch
use sigalign::{Aligner, ReferenceBuilder};
// (1) Build `Reference`
let fasta =
br#">target_1
ACACAGATCGCAAACTCACAATTGTATTTCTTTGCCACCTGGGCATATACTTTTTGCGCCCCCTCATTTA
>target_2
TCTGGGGCCATTGTATTTCTTTGCCAGCTGGGGCATATACTTTTTCCGCCCCCTCATTTACGCTCATCAC"#;
let reference = ReferenceBuilder::new()
.ignore_case(true)
.ignore_base(b'N')
.add_fasta(&fasta[..]).unwrap()
.build().unwrap();
// (2) Make `Aligner`
let mut aligner = Aligner::new(
4, // Mismatch penalty
6, // Gap-open penalty
2, // Gap-extend penalty
50, // Minimum aligned length
0.2, // Maximum penalty per length
).unwrap();
// (3) Align query to reference
let query = b"CAAACTCACAATTGTATTTCTTTGCCAGCTGGGCATATACTTTTTCCGCCCCCTCATTTAACTTCTTGGA";
let result = aligner.align_query(&reference, query);
println!("{:#?}", result);
- The detailed features and usage are in API documentation of crate (https://docs.rs/sigalign/).
For Python
developer
- If you're a Python developer, you can make use of the Python bindings for SigAlign. You can use the
pip
the python package manager to install the package:pip install sigalign
- Here is a quick start example:
from sigalign import Reference, Aligner
# (1) Construct `Reference`
reference = Reference.from_fasta_file("./YOUR_REFERENCE.fa")
# (2) Initialize `Aligner`
aligner = Aligner(4, 6, 2, 50, 0.2)
# (3) Execute Alignment
query = "CAAACTCACAATTGTATTTCTTTGCCAGCTGGGCATATACTTTTTCCGCCCCCTCATTTAACTTCTTGGA"
results = aligner.align_query(reference, query)
# (4) Display Results
for target_result in results:
print(f"# Target index: {target_result.index}")
for idx, alignment in enumerate(target_result.alignments):
print(f" - Result: {idx+1}")
print(f" - Penalty: {alignment.penalty}")
print(f" - Length: {alignment.length}")
print(f" - Query position: {alignment.query_position}")
print(f" - Target position: {alignment.target_position}")
- For a detailed guide on how to use SigAlign in Python including a more comprehensive tutorial, please refer to the
sigalign-py
subdirectory README.
For Web
developer
- SigAlign offers a WebAssembly (WASM) build, opening up the potential for web-based applications. While it is not currently available through package managers such as
npm
, plans for web support are in the pipeline. - An exemplary WASM implementation can be found within the
example
directory. Below is a TypeScript example showcasing SigAlign's application via this WASM wrapper:
import init, { Reference, Aligner, type AlignmentResult } from '../wasm/sigalign_demo_wasm';
async function run() {
await init();
// (1) Construct `Reference`
const fasta: string = `>target_1
ACACAGATCGCAAACTCACAATTGTATTTCTTTGCCACCTGGGCATATACTTTTTGCGCCCCCTCATTTA
>target_2
TCTGGGGCCATTGTATTTCTTTGCCAGCTGGGGCATATACTTTTTCCGCCCCCTCATTTACGCTCATCAC`;
const reference: Reference = await Reference.build(fasta);
// (2) Initialize `Aligner`
const aligner: Aligner = new Aligner(
4, // Mismatch penalty
6, // Gap-open penalty
2, // Gap-extend penalty
50, // Minimum aligned length
0.2, // Maximum penalty per length
);
// (3) Execute Alignment
const query: string = "CAAACTCACAATTGTATTTCTTTGCCAGCTGGGCATATACTTTTTCCGCCCCCTCATTTAACTTCTTGGA";
const result: AlignmentResult = await aligner.alignment(query, reference);
// (4) Parse and Display Results
const parsedJsonObj = JSON.parse(result.to_json());
console.log(parsedJsonObj);
}
run();
- To gain further insight into web-based implementation of SigAlign, visit the SigAlign tour page. This page utilizes the WASM wrapper exemplified above.
License
SigAlign is released under the MIT License. Please refer to the LICENSE file in the repository for the full text.
Citation
SigAlign
is detailed in our academic paper, which is currently under the publication process. Once published, the complete citation information will be provided here.
Dependencies
~1.3–2MB
~41K SLoC