Writing an Assembler in Rust, and How I’m Redoing the Lexer


This content originally appeared on DEV Community and was authored by Ashton Scott Snapp

I've kept working on my assembler. I finished the lexer, got it to compile, ran it, tested it, and found that it just did not work. Luckily, I found a crate called logos that helps you build a fast lexer, and I'm using it to redo the lexer.

AshtonSnapp / chasm

The Official Cellia Cross-Assembler for Modern Computers


Building chasm

Clone this repository to your local machine, cd into the chasm directory, and run cargo build. Simple!




Right now I'm writing callback functions. Specifically, I'm writing the one that handles character immediates ("immediate" in the sense that the processor doesn't have to fetch an address and then fetch the value from that address; it just fetches the value directly). It's done with a giant match statement. Here's basically what the function looks like right now:

fn char(lex: &mut Lexer<Token>) -> Result<u8, ()> {
    let slice: &str = lex.slice();

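    // Take the one-byte character just before the closing quote. A &str
    // cannot be indexed with a single usize, so slice a range instead;
    // get() returns None (not a panic) if the range splits a non-ASCII char.
    let len = slice.len();
    if len < 2 { return Err(()); }
    let poss_char: &str = slice.get(len - 2..len - 1).ok_or(())?;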

    // Welcome to hell.

    match poss_char {
        "\x00" => Ok(0),
        "\x01" => Ok(1),
        "\x02" => Ok(2),
        "\x03" => Ok(3),
        "\x04" => Ok(4),
        "\x05" => Ok(5),
        "\x06" => Ok(6),
        "\x07" => Ok(7),
        "\x08" => Ok(8),
        "\x09" => Ok(9),
        "\x0A" => Ok(10),
        "\x0B" => Ok(11),
        "\x0C" => Ok(12),
        "\x0D" => Ok(13),
        "\x0E" => Ok(14),
        "\x0F" => Ok(15),
        "\x10" => Ok(16),
        "\x11" => Ok(17),
        "\x12" => Ok(18),
        "\x13" => Ok(19),
        "\x14" => Ok(20),
        "\x15" => Ok(21),
        "\x16" => Ok(22),
        "\x17" => Ok(23),
        "\x18" => Ok(24),
        "\x19" => Ok(25),
        "\x1A" => Ok(26),
        "\x1B" => Ok(27),
        "\x1C" => Ok(28),
        "\x1D" => Ok(29),
        "\x1E" => Ok(30),
        "\x1F" => Ok(31),
        "\x20" => Ok(32),
        "\x21" => Ok(33),
        "\x22" => Ok(34),
        "\x23" => Ok(35),
        "\x24" => Ok(36),
        "\x25" => Ok(37),
        "\x26" => Ok(38),
        "\x27" => Ok(39),
        "\x28" => Ok(40),
        "\x29" => Ok(41),
        "\x2A" => Ok(42),
        "\x2B" => Ok(43),
        "\x2C" => Ok(44),
        "\x2D" => Ok(45),
        "\x2E" => Ok(46),
        "\x2F" => Ok(47),
        "\x30" => Ok(48),
        "\x31" => Ok(49),
        "\x32" => Ok(50),
        "\x33" => Ok(51),
        "\x34" => Ok(52),
        "\x35" => Ok(53),
        "\x36" => Ok(54),
        "\x37" => Ok(55),
        "\x38" => Ok(56),
        "\x39" => Ok(57),
        "\x3A" => Ok(58),
        "\x3B" => Ok(59),
        "\x3C" => Ok(60),
        "\x3D" => Ok(61),
        "\x3E" => Ok(62),
        "\x3F" => Ok(63),
        "\x40" => Ok(64),
        "\x41" => Ok(65),
        "\x42" => Ok(66),
        "\x43" => Ok(67),
        "\x44" => Ok(68),
        "\x45" => Ok(69),
        "\x46" => Ok(70),
        "\x47" => Ok(71),
        "\x48" => Ok(72),
        "\x49" => Ok(73),
        "\x4A" => Ok(74),
        "\x4B" => Ok(75),
        "\x4C" => Ok(76),
        "\x4D" => Ok(77),
        "\x4E" => Ok(78),
        "\x4F" => Ok(79),
        "\x50" => Ok(80),
        "\x51" => Ok(81),
        "\x52" => Ok(82),
        "\x53" => Ok(83),
        "\x54" => Ok(84),
        "\x55" => Ok(85),
        "\x56" => Ok(86),
        "\x57" => Ok(87),
        "\x58" => Ok(88),
        "\x59" => Ok(89),
        "\x5A" => Ok(90),
        "\x5B" => Ok(91),
        "\x5C" => Ok(92),
        "\x5D" => Ok(93),
        "\x5E" => Ok(94),
        "\x5F" => Ok(95),
        "\x60" => Ok(96),
        "\x61" => Ok(97),
        "\x62" => Ok(98),
        "\x63" => Ok(99),
        "\x64" => Ok(100),
        "\x65" => Ok(101),
        "\x66" => Ok(102),
        "\x67" => Ok(103),
        "\x68" => Ok(104),
        "\x69" => Ok(105),
        "\x6A" => Ok(106),
        "\x6B" => Ok(107),
        "\x6C" => Ok(108),
        "\x6D" => Ok(109),
        "\x6E" => Ok(110),
        "\x6F" => Ok(111),
        "\x70" => Ok(112),
        "\x71" => Ok(113),
        "\x72" => Ok(114),
        "\x73" => Ok(115),
        "\x74" => Ok(116),
        "\x75" => Ok(117),
        "\x76" => Ok(118),
        "\x77" => Ok(119),
        "\x78" => Ok(120),
        "\x79" => Ok(121),
        "\x7A" => Ok(122),
        "\x7B" => Ok(123),
        "\x7C" => Ok(124),
        "\x7D" => Ok(125),
        "\x7E" => Ok(126),
        "\x7F" => Ok(127),
        _ => Err(())
    }
}

Yes, I had to write all of that, because I can't really guarantee that whoever's using the assembler has Unicode support in their program. That whole function was painful to write. At least now, the only callback functions I have left to write are the ones for character escape sequences, strings, addresses, immediates, identifiers (labels and symbols), and actual instruction mnemonics. (Also, I need to stop trying to Ctrl+S while using a browser.)
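For comparison, here is a minimal sketch of how the same ASCII-only rule could be written without listing all 128 cases. It is not the code from chasm. The helper name char_immediate is hypothetical, and it takes the matched text as a plain &str (standing in for what lex.slice() returns) so that it runs with only the standard library. It assumes, like the function above, that the character sits right before the closing single quote.

```rust
/// Hypothetical stand-in for the callback above: `slice` plays the role of
/// the text returned by `lex.slice()`, e.g. "'A'".
fn char_immediate(slice: &str) -> Result<u8, ()> {
    // Assumption: the token ends with a closing single quote.
    let before_quote = slice.strip_suffix('\'').ok_or(())?;

    // The character immediately before the closing quote.
    let c = before_quote.chars().next_back().ok_or(())?;

    // Same acceptance rule as the match: only 0x00 through 0x7F is valid.
    if c.is_ascii() {
        Ok(c as u8)
    } else {
        Err(())
    }
}
```

With this sketch, "'A'" gives Ok(65), and a non-ASCII character such as "'é'" gives Err(()), which matches the fallthrough arm of the match. Whether this fits depends on how the real token pattern is defined, so treat it as an illustration rather than a drop-in replacement.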

