13 releases (stable)

2.0.3 Nov 13, 2023
2.0.2 Feb 11, 2023
1.1.4 Feb 4, 2019
1.1.3 Oct 21, 2018
0.1.2 Oct 12, 2018

#107 in Parser implementations

Download history 121202/week @ 2024-01-26 131015/week @ 2024-02-02 128196/week @ 2024-02-09 125771/week @ 2024-02-16 127726/week @ 2024-02-23 137932/week @ 2024-03-01 133351/week @ 2024-03-08 143728/week @ 2024-03-15 144660/week @ 2024-03-22 136654/week @ 2024-03-29 134661/week @ 2024-04-05 136786/week @ 2024-04-12 130891/week @ 2024-04-19 107855/week @ 2024-04-26 108918/week @ 2024-05-03 97452/week @ 2024-05-10

467,163 downloads per month
Used in 223 crates (9 directly)

Apache-2.0

23KB
369 lines

unicode-bom

Build status Crate status Downloads License

Unicode byte-order mark detection for Rust projects.

What does it do?

unicode-bom will read the first few bytes from an array or a file on disk, then determine whether a byte-order mark is present.

What doesn't it do?

It won't check the rest of the data to determine whether it's actually valid according to the indicated encoding.

How do I install it?

Add it to your dependencies in Cargo.toml:

[dependencies]
unicode-bom = "2"

How do I use it?

For more detailed information see the API docs, but the general gist is as follows:

use unicode_bom::Bom;

// The BOM can be parsed from a file on disk via the `FromStr` trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}

// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
let bom = Bom::from(&bytes[0..]);
assert_eq!(bom, Bom::Utf32Be);
assert_eq(bom.len(), 4);

How do I set up the build environment?

If you don't already have Rust installed, get that first using rustup:

curl https://sh.rustup.rs -sSf | sh

Then you can build the project:

cargo b

And run the tests:

cargo t

Is there API documentation?

Yes.

Is there a change log?

Yes.

What license is it published under?

Apache-2.0.

No runtime deps