Rust 练习册 :Forth与解释器实现
Forth是一种基于栈的编程语言,以其简洁性和高效性而闻名。它采用逆波兰表示法(后缀表示法),所有操作都通过操作栈来完成。在 Exercism 的 “forth” 练习中,我们需要实现一个简化版的Forth解释器,支持基本的算术运算、栈操作和用户自定义词汇。这不仅能帮助我们掌握解释器设计,还能深入学习Rust中的错误处理、数据结构和解析技术。
什么是Forth?
Forth是一种基于栈的编程语言,具有以下特点:
- 基于栈:所有操作都通过操作栈完成
- 逆波兰表示法:操作符后置于操作数(如:1 2 +)
- 交互式:支持即时执行命令
- 可扩展:允许用户定义新词汇(函数)
例如:
1 2 + // 将1和2压入栈,然后执行加法操作
3 4 - // 将3和4压入栈,然后执行减法操作,结果为-1
让我们先看看练习提供的结构和函数签名:
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
pub struct Forth;
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
impl Forth {
pub fn new() -> Forth {
unimplemented!()
}
pub fn stack(&self) -> &[Value] {
unimplemented!()
}
pub fn eval(&mut self, input: &str) -> Result {
unimplemented!("result of evaluating '{}'", input)
}
}
我们需要实现一个完整的Forth解释器,支持基本操作和用户自定义词汇。
设计分析
1. 核心组件
- 栈:存储数据值
- 词汇表:存储内置和用户定义的词汇
- 解析器:解析输入字符串
- 执行器:执行解析后的命令
2. 错误处理
需要处理以下错误情况:
- DivisionByZero:除零错误
- StackUnderflow:栈下溢错误
- UnknownWord:未知词汇错误
- InvalidWord:无效词汇错误
完整实现
1. 基础结构实现
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
pub struct Forth {
stack: Vec<Value>,
words: HashMap<String, Vec<String>>,
}
impl Forth {
pub fn new() -> Forth {
Forth {
stack: Vec::new(),
words: HashMap::new(),
}
}
pub fn stack(&self) -> &[Value] {
&self.stack
}
pub fn eval(&mut self, input: &str) -> Result {
let tokens: Vec<&str> = input.split_whitespace().collect();
self.eval_tokens(&tokens)
}
fn eval_tokens(&mut self, tokens: &[&str]) -> Result {
let mut i = 0;
while i < tokens.len() {
let token = tokens[i];
// 检查是否是定义词汇的开始
if token.eq_ignore_ascii_case(":") {
i = self.define_word(&tokens[i+1..])?;
continue;
}
// 检查是否是数字
if let Ok(num) = token.parse::<Value>() {
self.stack.push(num);
i += 1;
continue;
}
// 检查是否是内置命令
if self.eval_builtin(token)? {
i += 1;
continue;
}
// 检查是否是用户定义的词汇
if let Some(word_def) = self.words.get(&token.to_ascii_lowercase()) {
let word_tokens: Vec<&str> = word_def.iter().map(|s| s.as_str()).collect();
self.eval_tokens(&word_tokens)?;
i += 1;
continue;
}
// 未知词汇
return Err(Error::UnknownWord);
}
Ok(())
}
fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {
if tokens.is_empty() {
return Err(Error::InvalidWord);
}
// 获取词汇名称
let word_name = tokens[0];
// 检查名称是否是数字
if word_name.parse::<Value>().is_ok() {
return Err(Error::InvalidWord);
}
// 查找分号
let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;
// 获取定义内容
let definition: Vec<String> = tokens[1..semicolon_pos].iter().map(|s| s.to_string()).collect();
// 存储定义
self.words.insert(word_name.to_ascii_lowercase(), definition);
// 返回下一个位置
Ok(semicolon_pos + 1)
}
fn eval_builtin(&mut self, token: &str) -> std::result::Result<bool, Error> {
match token.to_ascii_lowercase().as_str() {
"+" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a + b);
Ok(true)
}
"-" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a - b);
Ok(true)
}
"*" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a * b);
Ok(true)
}
"/" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
if b == 0 {
return Err(Error::DivisionByZero);
}
self.stack.push(a / b);
Ok(true)
}
"dup" => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 1];
self.stack.push(val);
Ok(true)
}
"drop" => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
self.stack.pop();
Ok(true)
}
"swap" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let len = self.stack.len();
self.stack.swap(len - 1, len - 2);
Ok(true)
}
"over" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 2];
self.stack.push(val);
Ok(true)
}
_ => Ok(false), // 不是内置命令
}
}
}
测试用例分析
通过查看测试用例,我们可以更好地理解需求:
#[test]
fn no_input_no_stack() {
assert_eq!(Vec::<Value>::new(), Forth::new().stack());
}
新创建的Forth实例应该有空栈。
#[test]
fn numbers_just_get_pushed_onto_the_stack() {
let mut f = Forth::new();
assert!(f.eval("1 2 3 4 5").is_ok());
assert_eq!(vec![1, 2, 3, 4, 5], f.stack());
}
数字应该被压入栈中。
#[test]
fn can_add_two_numbers() {
let mut f = Forth::new();
assert!(f.eval("1 2 +").is_ok());
assert_eq!(vec![3], f.stack());
}
加法运算应该正确执行。
#[test]
fn addition_error() {
let mut f = Forth::new();
assert_eq!(Err(Error::StackUnderflow), f.eval("1 +"));
assert_eq!(Err(Error::StackUnderflow), f.eval("+"));
}
栈下溢时应该返回错误。
#[test]
fn errors_if_dividing_by_zero() {
let mut f = Forth::new();
assert_eq!(Err(Error::DivisionByZero), f.eval("4 0 /"));
}
除零时应该返回错误。
#[test]
fn can_consist_of_built_in_words() {
let mut f = Forth::new();
assert!(f.eval(": dup-twice dup dup ;").is_ok());
assert!(f.eval("1 dup-twice").is_ok());
assert_eq!(vec![1, 1, 1], f.stack());
}
应该支持用户自定义词汇。
#[test]
fn can_use_different_words_with_the_same_name() {
let mut f = Forth::new();
assert!(f.eval(": foo 5 ;").is_ok());
assert!(f.eval(": bar foo ;").is_ok());
assert!(f.eval(": foo 6 ;").is_ok());
assert!(f.eval("bar foo").is_ok());
assert_eq!(vec![5, 6], f.stack());
}
应该支持词汇的重新定义。
#[test]
fn defining_a_number() {
let mut f = Forth::new();
assert_eq!(Err(Error::InvalidWord), f.eval(": 1 2 ;"));
}
不应该允许用数字作为词汇名称。
性能优化版本
考虑性能的优化实现:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
pub struct Forth {
stack: Vec<Value>,
words: HashMap<String, Vec<String>>,
}
impl Forth {
pub fn new() -> Forth {
Forth {
stack: Vec::with_capacity(32), // 预分配容量
words: HashMap::new(),
}
}
pub fn stack(&self) -> &[Value] {
&self.stack
}
pub fn eval(&mut self, input: &str) -> Result {
// 预分配tokens向量
let mut tokens = Vec::with_capacity(input.len() / 2);
for token in input.split_whitespace() {
tokens.push(token);
}
self.eval_tokens(&tokens)
}
fn eval_tokens(&mut self, tokens: &[&str]) -> Result {
let mut i = 0;
while i < tokens.len() {
let token = tokens[i];
// 检查是否是定义词汇的开始
if token.eq_ignore_ascii_case(":") {
i = self.define_word(&tokens[i+1..])?;
continue;
}
// 检查是否是数字
if let Ok(num) = token.parse::<Value>() {
self.stack.push(num);
i += 1;
continue;
}
// 检查是否是内置命令
match self.eval_builtin(token) {
Ok(true) => {
i += 1;
continue;
}
Ok(false) => {
// 继续检查用户定义的词汇
}
Err(e) => return Err(e),
}
// 检查是否是用户定义的词汇
if let Some(word_def) = self.words.get(&token.to_ascii_lowercase()) {
// 为了避免递归调用导致栈溢出,我们直接展开定义
let mut expanded_tokens = Vec::with_capacity(word_def.len());
for word in word_def {
expanded_tokens.push(word.as_str());
}
self.eval_tokens(&expanded_tokens)?;
i += 1;
continue;
}
// 未知词汇
return Err(Error::UnknownWord);
}
Ok(())
}
fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {
if tokens.is_empty() {
return Err(Error::InvalidWord);
}
// 获取词汇名称
let word_name = tokens[0];
// 检查名称是否是数字
if word_name.parse::<Value>().is_ok() {
return Err(Error::InvalidWord);
}
// 查找分号
let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;
// 获取定义内容
let definition: Vec<String> = tokens[1..semicolon_pos].iter().map(|s| s.to_string()).collect();
// 存储定义
self.words.insert(word_name.to_ascii_lowercase(), definition);
// 返回下一个位置
Ok(semicolon_pos + 1)
}
fn eval_builtin(&mut self, token: &str) -> std::result::Result<bool, Error> {
match token.to_ascii_lowercase().as_str() {
"+" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a + b);
Ok(true)
}
"-" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a - b);
Ok(true)
}
"*" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a * b);
Ok(true)
}
"/" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
if b == 0 {
return Err(Error::DivisionByZero);
}
self.stack.push(a / b);
Ok(true)
}
"dup" => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 1];
self.stack.push(val);
Ok(true)
}
"drop" => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
self.stack.pop();
Ok(true)
}
"swap" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let len = self.stack.len();
self.stack.swap(len - 1, len - 2);
Ok(true)
}
"over" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 2];
self.stack.push(val);
Ok(true)
}
_ => Ok(false), // 不是内置命令
}
}
}
内存优化版本(防止分配攻击)
考虑内存使用的优化实现:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
pub struct Forth {
stack: Vec<Value>,
words: HashMap<String, WordDefinition>,
}
// 使用枚举来存储词汇定义,避免在定义时展开
#[derive(Debug, Clone)]
enum WordDefinition {
BuiltIn(BuiltInWord),
Custom(Vec<Token>),
}
#[derive(Debug, Clone)]
enum Token {
Number(Value),
Word(String),
}
#[derive(Debug, Clone)]
enum BuiltInWord {
Add,
Subtract,
Multiply,
Divide,
Dup,
Drop,
Swap,
Over,
}
impl Forth {
pub fn new() -> Forth {
let mut forth = Forth {
stack: Vec::with_capacity(32),
words: HashMap::new(),
};
// 初始化内置词汇
forth.init_builtins();
forth
}
pub fn stack(&self) -> &[Value] {
&self.stack
}
pub fn eval(&mut self, input: &str) -> Result {
let tokens = self.tokenize(input)?;
self.eval_tokens(&tokens)
}
fn init_builtins(&mut self) {
self.words.insert("+".to_string(), WordDefinition::BuiltIn(BuiltInWord::Add));
self.words.insert("-".to_string(), WordDefinition::BuiltIn(BuiltInWord::Subtract));
self.words.insert("*".to_string(), WordDefinition::BuiltIn(BuiltInWord::Multiply));
self.words.insert("/".to_string(), WordDefinition::BuiltIn(BuiltInWord::Divide));
self.words.insert("dup".to_string(), WordDefinition::BuiltIn(BuiltInWord::Dup));
self.words.insert("drop".to_string(), WordDefinition::BuiltIn(BuiltInWord::Drop));
self.words.insert("swap".to_string(), WordDefinition::BuiltIn(BuiltInWord::Swap));
self.words.insert("over".to_string(), WordDefinition::BuiltIn(BuiltInWord::Over));
}
fn tokenize(&self, input: &str) -> Result<Vec<Token>> {
let mut tokens = Vec::new();
for word in input.split_whitespace() {
if let Ok(num) = word.parse::<Value>() {
tokens.push(Token::Number(num));
} else {
tokens.push(Token::Word(word.to_ascii_lowercase()));
}
}
Ok(tokens)
}
fn eval_tokens(&mut self, tokens: &[Token]) -> Result {
for token in tokens {
match token {
Token::Number(value) => {
self.stack.push(*value);
}
Token::Word(word) => {
if word == ":" {
// 这里应该处理定义,但在tokenize阶段已经处理了
// 实际实现中需要更复杂的解析逻辑
return Err(Error::InvalidWord);
}
self.eval_word(word)?;
}
}
}
Ok(())
}
fn eval_word(&mut self, word: &str) -> Result {
// 处理定义命令
if word == ":" {
return Err(Error::InvalidWord); // 定义应该在更高层处理
}
// 查找词汇定义
match self.words.get(word) {
Some(WordDefinition::BuiltIn(builtin)) => {
self.eval_builtin(builtin.clone())
}
Some(WordDefinition::Custom(tokens)) => {
// 递归执行自定义词汇
for token in tokens {
match token {
Token::Number(value) => {
self.stack.push(*value);
}
Token::Word(word) => {
self.eval_word(word)?;
}
}
}
Ok(())
}
None => Err(Error::UnknownWord),
}
}
fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {
if tokens.is_empty() {
return Err(Error::InvalidWord);
}
// 获取词汇名称
let word_name = tokens[0];
// 检查名称是否是数字
if word_name.parse::<Value>().is_ok() {
return Err(Error::InvalidWord);
}
// 查找分号
let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;
// 获取定义内容并转换为Token
let mut definition = Vec::new();
for token in &tokens[1..semicolon_pos] {
if let Ok(num) = token.parse::<Value>() {
definition.push(Token::Number(num));
} else {
definition.push(Token::Word(token.to_ascii_lowercase()));
}
}
// 存储定义
self.words.insert(word_name.to_ascii_lowercase(), WordDefinition::Custom(definition));
// 返回下一个位置
Ok(semicolon_pos + 1)
}
fn eval_builtin(&mut self, builtin: BuiltInWord) -> Result {
match builtin {
BuiltInWord::Add => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a + b);
Ok(())
}
BuiltInWord::Subtract => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a - b);
Ok(())
}
BuiltInWord::Multiply => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a * b);
Ok(())
}
BuiltInWord::Divide => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
if b == 0 {
return Err(Error::DivisionByZero);
}
self.stack.push(a / b);
Ok(())
}
BuiltInWord::Dup => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 1];
self.stack.push(val);
Ok(())
}
BuiltInWord::Drop => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
self.stack.pop();
Ok(())
}
BuiltInWord::Swap => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let len = self.stack.len();
self.stack.swap(len - 1, len - 2);
Ok(())
}
BuiltInWord::Over => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 2];
self.stack.push(val);
Ok(())
}
}
}
}
错误处理和边界情况
考虑更多边界情况的实现:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
pub struct Forth {
stack: Vec<Value>,
words: HashMap<String, Vec<ParsedToken>>,
state: ParserState,
}
#[derive(Debug, Clone)]
enum ParsedToken {
Number(Value),
Word(String),
}
#[derive(Debug)]
enum ParserState {
Normal,
Defining { name: String, body: Vec<ParsedToken> },
}
impl Forth {
pub fn new() -> Forth {
Forth {
stack: Vec::with_capacity(32),
words: HashMap::new(),
state: ParserState::Normal,
}
}
pub fn stack(&self) -> &[Value] {
&self.stack
}
pub fn eval(&mut self, input: &str) -> Result {
// 重置状态
self.state = ParserState::Normal;
let tokens = self.tokenize(input)?;
self.eval_tokens(&tokens)
}
fn tokenize(&self, input: &str) -> Result<Vec<ParsedToken>> {
let mut tokens = Vec::new();
for word in input.split_whitespace() {
if let Ok(num) = word.parse::<Value>() {
tokens.push(ParsedToken::Number(num));
} else {
tokens.push(ParsedToken::Word(word.to_string()));
}
}
Ok(tokens)
}
fn eval_tokens(&mut self, tokens: &[ParsedToken]) -> Result {
for token in tokens {
match &self.state {
ParserState::Normal => {
match token {
ParsedToken::Number(value) => {
self.stack.push(*value);
}
ParsedToken::Word(word) => {
if word.eq_ignore_ascii_case(":") {
self.state = ParserState::Defining { name: String::new(), body: Vec::new() };
continue;
}
self.eval_word(word)?;
}
}
}
ParserState::Defining { name, body } => {
if name.is_empty() {
// 这是定义的名称
if let Ok(_) = word.parse::<Value>() {
return Err(Error::InvalidWord);
}
self.state = ParserState::Defining {
name: word.clone(),
body: Vec::new()
};
} else if word == ";" {
// 定义结束
let definition = body.clone();
self.words.insert(name.to_ascii_lowercase(), definition);
self.state = ParserState::Normal;
} else {
// 添加到定义体中
let mut new_body = body.clone();
match token {
ParsedToken::Number(value) => {
new_body.push(ParsedToken::Number(*value));
}
ParsedToken::Word(word) => {
new_body.push(ParsedToken::Word(word.clone()));
}
}
self.state = ParserState::Defining {
name: name.clone(),
body: new_body
};
}
}
}
}
// 检查是否仍在定义状态
if matches!(self.state, ParserState::Defining { .. }) {
return Err(Error::InvalidWord);
}
Ok(())
}
fn eval_word(&mut self, word: &str) -> Result {
// 查找词汇定义
if let Some(definition) = self.words.get(&word.to_ascii_lowercase()) {
// 执行自定义词汇
for token in definition {
match token {
ParsedToken::Number(value) => {
self.stack.push(*value);
}
ParsedToken::Word(word) => {
self.eval_builtin(word)?;
}
}
}
return Ok(());
}
// 执行内置命令
self.eval_builtin(word)
}
fn eval_builtin(&mut self, word: &str) -> Result {
match word.to_ascii_lowercase().as_str() {
"+" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a + b);
Ok(())
}
"-" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a - b);
Ok(())
}
"*" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
self.stack.push(a * b);
Ok(())
}
"/" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.stack.pop().unwrap();
let a = self.stack.pop().unwrap();
if b == 0 {
return Err(Error::DivisionByZero);
}
self.stack.push(a / b);
Ok(())
}
"dup" => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 1];
self.stack.push(val);
Ok(())
}
"drop" => {
if self.stack.is_empty() {
return Err(Error::StackUnderflow);
}
self.stack.pop();
Ok(())
}
"swap" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let len = self.stack.len();
self.stack.swap(len - 1, len - 2);
Ok(())
}
"over" => {
if self.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let val = self.stack[self.stack.len() - 2];
self.stack.push(val);
Ok(())
}
_ => Err(Error::UnknownWord),
}
}
}
扩展功能
基于基础实现,我们可以添加更多功能:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
pub struct Forth {
stack: Vec<Value>,
words: HashMap<String, WordDefinition>,
debug_mode: bool,
}
#[derive(Debug, Clone)]
enum WordDefinition {
BuiltIn(fn(&mut Forth) -> Result),
Custom(Vec<String>),
}
impl Forth {
pub fn new() -> Forth {
let mut forth = Forth {
stack: Vec::with_capacity(32),
words: HashMap::new(),
debug_mode: false,
};
// 初始化内置词汇
forth.init_builtins();
forth
}
pub fn stack(&self) -> &[Value] {
&self.stack
}
pub fn eval(&mut self, input: &str) -> Result {
let tokens: Vec<&str> = input.split_whitespace().collect();
self.eval_tokens(&tokens)
}
pub fn set_debug_mode(&mut self, enabled: bool) {
self.debug_mode = enabled;
}
pub fn clear(&mut self) {
self.stack.clear();
// 保留用户定义的词汇
}
fn init_builtins(&mut self) {
self.words.insert("+".to_string(), WordDefinition::BuiltIn(add));
self.words.insert("-".to_string(), WordDefinition::BuiltIn(subtract));
self.words.insert("*".to_string(), WordDefinition::BuiltIn(multiply));
self.words.insert("/".to_string(), WordDefinition::BuiltIn(divide));
self.words.insert("dup".to_string(), WordDefinition::BuiltIn(dup));
self.words.insert("drop".to_string(), WordDefinition::BuiltIn(drop));
self.words.insert("swap".to_string(), WordDefinition::BuiltIn(swap));
self.words.insert("over".to_string(), WordDefinition::BuiltIn(over));
self.words.insert(".".to_string(), WordDefinition::BuiltIn(print_top));
self.words.insert(".s".to_string(), WordDefinition::BuiltIn(print_stack));
}
fn eval_tokens(&mut self, tokens: &[&str]) -> Result {
let mut i = 0;
while i < tokens.len() {
let token = tokens[i];
if self.debug_mode {
println!("Executing: {}", token);
println!("Stack: {:?}", self.stack);
}
// 检查是否是定义词汇的开始
if token.eq_ignore_ascii_case(":") {
i = self.define_word(&tokens[i+1..])?;
continue;
}
// 检查是否是数字
if let Ok(num) = token.parse::<Value>() {
self.stack.push(num);
i += 1;
continue;
}
// 查找并执行词汇
match self.words.get(&token.to_ascii_lowercase()) {
Some(WordDefinition::BuiltIn(func)) => {
func(self)?;
i += 1;
continue;
}
Some(WordDefinition::Custom(definition)) => {
let def_tokens: Vec<&str> = definition.iter().map(|s| s.as_str()).collect();
self.eval_tokens(&def_tokens)?;
i += 1;
continue;
}
None => return Err(Error::UnknownWord),
}
}
Ok(())
}
fn define_word(&mut self, tokens: &[&str]) -> Result<usize> {
if tokens.is_empty() {
return Err(Error::InvalidWord);
}
// 获取词汇名称
let word_name = tokens[0];
// 检查名称是否是数字
if word_name.parse::<Value>().is_ok() {
return Err(Error::InvalidWord);
}
// 查找分号
let semicolon_pos = tokens.iter().position(|&t| t == ";").ok_or(Error::InvalidWord)?;
// 获取定义内容
let definition: Vec<String> = tokens[1..semicolon_pos].iter().map(|s| s.to_string()).collect();
// 存储定义
self.words.insert(word_name.to_ascii_lowercase(), WordDefinition::Custom(definition));
// 返回下一个位置
Ok(semicolon_pos + 1)
}
}
// 内置函数实现
fn add(forth: &mut Forth) -> Result {
if forth.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = forth.stack.pop().unwrap();
let a = forth.stack.pop().unwrap();
forth.stack.push(a + b);
Ok(())
}
fn subtract(forth: &mut Forth) -> Result {
if forth.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = forth.stack.pop().unwrap();
let a = forth.stack.pop().unwrap();
forth.stack.push(a - b);
Ok(())
}
fn multiply(forth: &mut Forth) -> Result {
if forth.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = forth.stack.pop().unwrap();
let a = forth.stack.pop().unwrap();
forth.stack.push(a * b);
Ok(())
}
fn divide(forth: &mut Forth) -> Result {
if forth.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = forth.stack.pop().unwrap();
let a = forth.stack.pop().unwrap();
if b == 0 {
return Err(Error::DivisionByZero);
}
forth.stack.push(a / b);
Ok(())
}
fn dup(forth: &mut Forth) -> Result {
if forth.stack.is_empty() {
return Err(Error::StackUnderflow);
}
let val = forth.stack[forth.stack.len() - 1];
forth.stack.push(val);
Ok(())
}
fn drop(forth: &mut Forth) -> Result {
if forth.stack.is_empty() {
return Err(Error::StackUnderflow);
}
forth.stack.pop();
Ok(())
}
fn swap(forth: &mut Forth) -> Result {
if forth.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let len = forth.stack.len();
forth.stack.swap(len - 1, len - 2);
Ok(())
}
fn over(forth: &mut Forth) -> Result {
if forth.stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let val = forth.stack[forth.stack.len() - 2];
forth.stack.push(val);
Ok(())
}
fn print_top(forth: &mut Forth) -> Result {
if forth.stack.is_empty() {
return Err(Error::StackUnderflow);
}
let val = forth.stack[forth.stack.len() - 1];
print!("{} ", val);
std::io::stdout().flush().unwrap();
Ok(())
}
fn print_stack(forth: &mut Forth) -> Result {
print!("Stack: ");
for val in &forth.stack {
print!("{} ", val);
}
println!();
std::io::stdout().flush().unwrap();
Ok(())
}
use std::io::Write;
实际应用场景
Forth在实际开发中有以下应用:
- 嵌入式系统:由于其小巧和高效,常用于嵌入式系统
- 实时系统:Forth的交互性和高效性使其适用于实时系统
- 教育工具:用于教授计算机科学和编程概念
- 系统编程:早期的操作系统和引导程序使用Forth
- 硬件控制:用于控制硬件设备和机器人
- 脚本语言:作为应用程序的扩展语言
- 游戏开发:某些游戏使用Forth作为脚本语言
算法复杂度分析
-
时间复杂度:
- 数字压栈:O(1)
- 基本运算:O(1)
- 用户定义词汇:O(n),其中n是词汇定义的长度
- 整体执行:O(m),其中m是所有tokens的数量
-
空间复杂度:
- 栈空间:O(n),其中n是栈中元素数量
- 词汇表:O(w×d),其中w是词汇数量,d是平均定义长度
与其他实现方式的比较
// 使用宏实现的简化版本
macro_rules! forth {
($($token:tt)*) => {
{
let mut stack = Vec::new();
forth_impl!(stack, $($token)*);
stack
}
};
}
macro_rules! forth_impl {
// 数字
($stack:expr, $num:expr $($rest:tt)*) => {
$stack.push($num);
forth_impl!($stack, $($rest)*);
};
// 加法
($stack:expr, + $($rest:tt)*) => {
let b = $stack.pop().unwrap();
let a = $stack.pop().unwrap();
$stack.push(a + b);
forth_impl!($stack, $($rest)*);
};
// 结束
($stack:expr,) => {};
}
// 使用外部库实现
use nom::{IResult, branch::alt, combinator::map, character::complete::{digit1, space1}, multi::many0};
#[derive(Debug, Clone)]
enum ForthToken {
Number(i32),
Word(String),
}
fn parse_number(input: &str) -> IResult<&str, ForthToken> {
map(digit1, |s: &str| {
ForthToken::Number(s.parse().unwrap())
})(input)
}
fn parse_word(input: &str) -> IResult<&str, ForthToken> {
map(nom::character::complete::alpha1, |s: &str| {
ForthToken::Word(s.to_string())
})(input)
}
fn parse_token(input: &str) -> IResult<&str, ForthToken> {
alt((parse_number, parse_word))(input)
}
// 函数式风格实现
use std::collections::HashMap;
pub struct ForthFunctional {
stack: Vec<i32>,
definitions: HashMap<String, Box<dyn Fn(&mut Vec<i32>) -> Result<(), Error>>>,
}
impl ForthFunctional {
pub fn new() -> Self {
let mut forth = ForthFunctional {
stack: Vec::new(),
definitions: HashMap::new(),
};
forth.definitions.insert("+".to_string(), Box::new(|stack: &mut Vec<i32>| {
if stack.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = stack.pop().unwrap();
let a = stack.pop().unwrap();
stack.push(a + b);
Ok(())
}));
forth
}
pub fn eval(&mut self, input: &str) -> Result<()> {
for token in input.split_whitespace() {
if let Ok(num) = token.parse::<i32>() {
self.stack.push(num);
} else if let Some(func) = self.definitions.get(token) {
func(&mut self.stack)?;
} else {
return Err(Error::UnknownWord);
}
}
Ok(())
}
}
总结
通过 forth 练习,我们学到了:
- 解释器设计:掌握了构建简单解释器的基本原理
- 栈操作:理解了基于栈的语言如何工作
- 词法分析:学会了如何解析输入文本
- 错误处理:熟练处理各种边界情况和错误
- 数据结构:了解了如何使用HashMap存储词汇定义
- 内存管理:学会了如何避免内存分配攻击
- 递归处理:理解了如何处理递归定义
这些技能在实际开发中非常有用,特别是在构建DSL(领域特定语言)、脚本引擎和配置处理器时。Forth虽然是一个古老的语言,但其设计思想在现代编程中仍然具有重要价值。通过这个练习,我们不仅实现了Forth解释器,还深入理解了解释器的工作原理。
通过这个练习,我们也看到了Rust在构建安全且高效的语言解释器方面的强大能力。这种结合了安全性和性能的语言特性正是Rust的魅力所在。
AtomGit 是由开放原子开源基金会联合 CSDN 等生态伙伴共同推出的新一代开源与人工智能协作平台。平台坚持“开放、中立、公益”的理念,把代码托管、模型共享、数据集托管、智能体开发体验和算力服务整合在一起,为开发者提供从开发、训练到部署的一站式体验。
更多推荐



所有评论(0)