Creating a programming language is not that hard. In fact, behind the veil of initial difficulty, it turns out that most programmers have written far more complex code at some point in their careers. In this article, I will guide you through the entire process of recreating the BASIC language, one of the most influential languages of all time. Altair BASIC with a few additional improvements, to be precise — the one written by Bill Gates and Paul Allen in 1975. Later known as Microsoft BASIC, it came preinstalled on most personal computers and helped millions of people take their first steps in programming. It played a key role in making personal computing accessible to the broader public, outside of universities or laboratories (and made both of them billionaires).
This is part 1 of this article. Part 2 can be found here
Warning: this is not another 5-minute article explaining only a few basic concepts and code snippets, leaving you with unfulfilled promises and disappointment. I’ll walk you through the entire process, step by step, to the point where we will cover almost all functionalities of the BASIC language — such as variables, arrays, if/else conditions, functions, comments, subroutines, and built-in methods — with some explanation of how and why these things work.
20 LET N = 0
30 INPUT "Enter a number: "; N
40 LET FACTORIAL = 1
50 LET I = 1
60 IF I > N THEN GOTO 100
70 LET FACTORIAL = FACTORIAL * I
80 LET I = I + 1
90 GOTO 60
100 PRINT N, "! = ", FACTORIAL
BASIC differs from modern programming languages in several ways. First, there’s the use of line numbers: the programmer must manually number each line (lines without numbers are executed immediately and are not stored in the program buffer). Numbering lines in increments of 10 is a common convention, but not required. In practice, you might increment by 5 for things like IF conditions or loops, and by 100 or even 1000 for subroutines.
Second, all variable names, keywords, and commands must be written in uppercase. You can’t use underscores or lowercase letters.
Third, the overall simplicity of the language: in this early version of BASIC, there’s no ELSE clause. An IF statement, when true, simply jumps a few lines ahead (using GOTO), effectively skipping the code that would otherwise run if the condition was false.
In our interpreter, we will implement the ELSE keyword to extend the language’s capabilities and improve compatibility with later versions of BASIC.
Most language implementations share a similar core architecture built from a few essential parts: a lexer, a parser, an interpreter, and data structures that hold the current state of the program. First, the interpreter sends your code, written as plain text, to the lexer, which scans it character by character and line by line. The lexer splits the text into tokens, the smallest meaningful chunks in the language. A token can be a number, a string, a keyword (e.g., LET, DEF, GOTO), an identifier, or a symbol such as a bracket or a plus sign. The lexer should also throw an error if it encounters an unrecognized or invalid structure in the code. In other words, the lexer is the first line of defense, acting like a bouncer checking IDs at the door: labeling guests and raising an alarm if someone suspicious tries to sneak in.
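To make the idea of tokens concrete, here is a minimal sketch of a tokenizer built on Ruby’s standard strscan library. It is not the lexer we build later in this article: the tokenize method, the SKETCH_KEYWORDS list, and the [type, value] pair format are assumptions made only for this illustration.

```ruby
require 'strscan'

# Hypothetical keyword list, just large enough for the example
SKETCH_KEYWORDS = %w[LET PRINT GOTO IF THEN].freeze

# Splits one line of BASIC-like text into [type, value] pairs
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next # whitespace separates tokens but is not a token itself
    elsif (text = scanner.scan(/\d+/))
      tokens << [:number, text.to_i]
    elsif (text = scanner.scan(/[A-Z][A-Z0-9]*/))
      tokens << [SKETCH_KEYWORDS.include?(text) ? :keyword : :identifier, text]
    elsif (text = scanner.scan(/"[^"]*"/))
      tokens << [:string, text[1..-2]] # drop the surrounding quotes
    elsif (text = scanner.scan(/[-+*\/=()<>]/))
      tokens << [:symbol, text]
    else
      # The "bouncer" part: anything unrecognized is rejected
      raise ArgumentError, "Unexpected character: #{scanner.peek(1)}"
    end
  end
  tokens
end

p tokenize('LET A = 5')
# => [[:keyword, "LET"], [:identifier, "A"], [:symbol, "="], [:number, 5]]
```

A lowercase letter, as in LET a = 5, matches none of the patterns, so this sketch raises an ArgumentError for it.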
Then the tokens are passed to the parser. While the lexer breaks raw code down into individual tokens, the parser takes those tokens and tries to make sense of them as a whole. It looks at the sequence and structure, checking whether the tokens form valid expressions or statements. Think of it as a grammar check. Just as a sentence in English needs a subject and a verb to make sense, a line of code like LET A = 5 needs to follow a specific pattern: a keyword, an identifier, an equals sign, and a value. Such a pattern is called a grammar rule. If the tokens don’t match any known rule, the parser will raise a syntax error and stop everything right there. One of the most important jobs of a parser is parsing mathematical expressions. At first glance, something like 3 + 4 * 2 + 6 seems simple, but your interpreter needs clear instructions on how to evaluate it correctly. The parser must respect operator precedence (e.g., multiplication before addition), handle parentheses, and properly group tokens into a nested structure that reflects the actual computation order. In our case we will use recursive descent parsing, but more on that later.
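As a rough illustration of that nested structure, the sketch below turns the flat tokens of 3 + 4 * 2 + 6 into nested arrays and then evaluates them. It is a simplified two-level grammar, not the parser built later in this article; parse_sum, parse_product, and evaluate are hypothetical names, and the tokens are assumed to be already split into integers and operator strings.

```ruby
# Grammar assumed for this sketch:
#   sum     := product (('+' | '-') product)*
#   product := number  (('*' | '/') number)*

def parse_sum(tokens)
  node = parse_product(tokens)
  while ['+', '-'].include?(tokens.first)
    operator = tokens.shift
    node = [operator, node, parse_product(tokens)]
  end
  node
end

def parse_product(tokens)
  node = tokens.shift
  while ['*', '/'].include?(tokens.first)
    operator = tokens.shift
    node = [operator, node, tokens.shift]
  end
  node
end

# Walks the nested arrays and computes the result
def evaluate(node)
  return node if node.is_a?(Numeric)

  operator, left, right = node
  evaluate(left).public_send(operator, evaluate(right))
end

tree = parse_sum([3, '+', 4, '*', 2, '+', 6])
p tree             # => ["+", ["+", 3, ["*", 4, 2]], 6]
puts evaluate(tree) # => 17
```

Because parse_sum asks parse_product for each operand, the multiplication ends up deeper in the tree and is therefore computed first.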
The interpreter is the final stage of the entire process. It’s in charge of actually running your code, based on the grammar rules and structure provided by the parser. For a language as simple as BASIC, we can often combine the parser and the interpreter into one, executing instructions as they are parsed. Just keep in mind that in more complex languages, these are treated as separate entities.
We can start by creating a very simple REPL-like program:
# Interactive shell - the entry point to our interpreter
def main
puts "Ruby Altair Basic 0.0.1, 1975-#{Time.now.year}"
puts "Type 'HELP' for available commands"
loop do
print '> '
line = gets.chomp
case line
when 'QUIT'
break
when 'CLEAR'
system('clear') if RUBY_PLATFORM =~ /linux|bsd|darwin/
system('cls') if RUBY_PLATFORM =~ /mswin|mingw|cygwin/
when 'HELP'
puts "\nAvailable commands:"
puts ' RUN - Run the program (not implemented yet)'
puts ' NEW - Clear the program (not implemented yet)'
puts ' LIST - List the program (not implemented yet)'
puts ' CLEAR - Clear the screen'
puts ' QUIT - Exit the interpreter'
else
puts "> #{line}"
end
end
end
main
Most interpreters can run code in two modes: REPL (read-eval-print loop) or script mode. Script mode reads code from a given file and executes it, while REPL mode executes code as you type it, reading it directly from user input on the console, just like irb in Ruby.
Our fresh interpreter should be able to list its commands, read input, and echo it back:
ania@Mac ruby_basic % ruby basic.rb
Ruby Altair Basic 0.0.1, 1975-2025
Type 'HELP' for available commands
> HELP
Available commands:
RUN - Run the program (not implemented yet)
NEW - Clear the program (not implemented yet)
LIST - List the program (not implemented yet)
CLEAR - Clear the screen
QUIT - Exit the interpreter
> Hello, World!
> Hello, World!
Now we need to store code in the interpreter’s memory. Let’s add a new global hash, $buffer, that uses line numbers as keys and the corresponding code as values. This approach lets us quickly reach any line of code (e.g., with the GOTO command):
# Global variables on top
$buffer = {} # Stores the program lines
# Enhanced main function with program management
def main
puts "Ruby Altair Basic 0.0.1, 1975-#{Time.now.year}"
puts "Type 'HELP' for available commands"
loop do
begin
print '> '
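line = gets
break if line.nil? # End of input (Ctrl+D): leave the loop instead of calling chomp on nil
line = line.chomp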
case line
when 'QUIT'
break
when 'NEW'
$buffer = {}
when 'LIST'
if $buffer.empty?
puts 'Program is empty'
else
$buffer.keys.sort.each do |num|
puts "#{num} #{$buffer[num]}"
end
end
when 'CLEAR'
system('clear') if RUBY_PLATFORM =~ /linux|bsd|darwin/
system('cls') if RUBY_PLATFORM =~ /mswin|mingw|cygwin/
when 'HELP'
puts "\nAvailable commands:"
puts ' RUN - Run the program (not implemented yet)'
puts ' NEW - Clear the program'
puts ' LIST - List the program'
puts ' CLEAR - Clear the screen'
puts ' QUIT - Exit the interpreter'
else
unless line.strip.empty? # Skip empty lines
parts = line.split(' ', 2)
if parts[0].match?(/^\d+$/) # Line starts with a number
line_num = parts[0].to_i
if parts.length > 1
$buffer[line_num] = parts[1] # Store the line
elsif $buffer.key?(line_num)
$buffer.delete(line_num) # Remove the line if only number given
end
else
puts "> #{line}"
end
end
end
rescue Interrupt
puts "\nInterrupted. Type 'QUIT' to exit."
rescue StandardError => e
puts "Unexpected error: #{e}"
end
end
end
main
Now our buffer stores numbered lines, while lines without numbers are simply echoed back for now (we will execute them shortly). Rescuing the Interrupt exception keeps Ctrl+C from terminating the interpreter. LIST is a built-in BASIC command that displays all currently stored lines of code, while NEW clears the buffer:
> 10 LET X = 5
> 20 PRINT X
> LIST
10 LET X = 5
20 PRINT X
> NEW
> LIST
Program is empty
Please remember to always clear your buffer with the NEW command before entering a new program. Forgetting to do so might lead to endless confusion.
It’s time to start building our lexer:
$token = '' # stores current token
$buffer = {}
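The buffer rules (store, overwrite, delete, list in order) are easier to see when pulled out of the REPL loop. The following is only a sketch under that assumption: store_line and listing are hypothetical helpers, not part of the interpreter above, and they take the buffer as an argument instead of using the global.

```ruby
# Returns true when the input was a numbered line, false otherwise
def store_line(buffer, input)
  number, code = input.strip.split(' ', 2)
  return false unless number&.match?(/\A\d+\z/)

  if code
    buffer[number.to_i] = code # add a new line or overwrite an existing one
  else
    buffer.delete(number.to_i) # a bare line number removes that line
  end
  true
end

# Lines come back sorted by number, regardless of the order they were typed in
def listing(buffer)
  buffer.keys.sort.map { |number| "#{number} #{buffer[number]}" }
end

demo = {}
store_line(demo, '20 PRINT X')
store_line(demo, '10 LET X = 5')
puts listing(demo) # prints line 10 first, then line 20
store_line(demo, '10')
puts listing(demo) # only line 20 is left
```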
# Scan a line of BASIC code to extract the next token
def scan(line)
# Skip whitespace
line.shift while !line.empty? && line[0] == ' '
return $token = '' if line.empty?
if line[0] =~ /\d/ # Numbers
$token = number(line)
elsif line[0] =~ /[A-Z]/ # Keywords and identifiers
$token = get_identifier(line)
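elsif line[0] =~ /[+\-*\/\(\)=<>,;:\^&|~]/ # Operators
$token = line.shift
# Merge two-character comparison operators (<>, <=, >=) into a single token
$token += line.shift if %w[<> <= >=].include?("#{$token}#{line[0]}")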
elsif line[0] == '"' # Strings
$token = string(line)
else
puts "Unexpected character: #{line[0]}"
$token = ''
line.shift
end
end
# parse number
def number(line)
tok = ''
has_decimal = false
# parse digits and at most one decimal point
while !line.empty? && (line[0] =~ /\d/ || (line[0] == '.' && !has_decimal))
has_decimal = true if line[0] == '.'
tok += line.shift
end
if has_decimal
tok.to_f
else
tok.to_i
end
end
# parse a string literal
def string(line)
msg = ''
line.shift # skip opening quote
while !line.empty? && line[0] != '"'
msg += line.shift
end
if line.empty?
puts 'Missing closing quote!'
raise 'Missing closing quote'
else
line.shift # skip closing quote
'"' + msg + '"' # return with quotes for identification
end
end
# parse an identifier
def get_identifier(line)
name = ''
# first position must be capital letter
if !line.empty? && line[0] =~ /[A-Z]/
name += line.shift
# subsequent positions can be capital letters or digits
while !line.empty? && (line[0] =~ /[A-Z0-9]/)
name += line.shift
end
# check for type suffix ($, %, #, !)
if !line.empty? && line[0] =~ /[\$\%\#\!]/
name += line.shift
end
end
name
end
The scan method does two key things: it identifies the next token in the input stream (updating the global $token variable) and it advances the position pointer in the input line by consuming the characters that make up that token.
Our lexer handles the four essential token types of BASIC: numbers (with decimal support), identifiers (variables and keywords), operators, and string literals. In this and the following methods, the current token is stored in, and passed around through, the global variable $token.
Classic BASIC was case-insensitive but typically rendered in uppercase. Our implementation enforces this convention by only recognizing uppercase letters in identifiers. We also support type suffixes for variables (like A$ for strings), which was a common feature in most BASIC dialects. The string handling respects quotes, allowing for spaces and special characters within strings. This tokenizer is deliberately simple, as BASIC’s syntax is relatively straightforward compared to modern languages.
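The way scanning consumes its input can be seen in isolation. The sketch below uses a simplified stand-in for the number helper (take_digits is a hypothetical name, and it ignores decimal points) to show that reading a token also removes its characters from the array.

```ruby
# Consumes leading digits from an array of characters and returns them as an Integer
def take_digits(chars)
  digits = ''
  digits += chars.shift while !chars.empty? && chars[0] =~ /\d/
  digits.to_i
end

chars = '42 + 7'.chars
token = take_digits(chars)
puts token              # => 42
puts chars.join.inspect # => " + 7"
```

After the call, the array holds only what is left to scan, which is why the same array can be passed to the next call without tracking an index.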
Now let’s implement a simple execution engine that can run PRINT statements:
# additional global variable
$current_pos = 0 # current position for PRINT
def execute(num, line)
begin
line = line.chars if line.is_a?(String)
scan(line)
case $token
when 'PRINT'
print_statement(line)
else
puts "Unknown statement: #{$token}"
end
rescue StandardError => e
puts "Line #{num}: Execution failed! #{e}"
end
end
# add this helper method
def normalize_line(line)
line.is_a?(Array) ? line.join.strip : line.to_s.strip
end
def print_statement(line)
line = normalize_line(line).chars
scan(line)
# Track position for future features
$current_pos = 0 if !defined?($current_pos)
new_line = true # Whether to add newline at the end
while true
if $token == ''
break
elsif $token.is_a?(String) && $token[0] == '"'
# String literal
text = $token[1..-2] # Remove quotes
print text
$current_pos += text.length
scan(line)
else
scan(line) # Skip non-string tokens for now
end
# Check for separator
if $token == ','
print ' ' # Add space for comma
$current_pos += 1
scan(line)
elsif $token == ';'
# Semicolon - no space, suppress newline if at end
scan(line)
new_line = ($token != '') # No newline if semicolon at end
else
break
end
end
# Add newline unless suppressed by trailing semicolon
if new_line
puts
$current_pos = 0
end
end
The PRINT statement parser works by scanning through tokens in the line and handling them based on their type. For string literals (enclosed in quotes), it strips the quotes and prints the raw text. It supports BASIC’s print separators: commas insert a space between outputs while semicolons join outputs without spaces. For example, PRINT "HELLO"; "WORLD" outputs the two strings directly adjacent, while PRINT "HELLO", "WORLD" inserts a space between them. A trailing semicolon (like in PRINT "No newline";) suppresses the automatic newline, allowing multiple PRINT statements to build a single line of output. The function tracks the current cursor position in $current_pos, which will be useful for more advanced formatting features we'll implement later.
10 PRINT "HELLO "; "WORLD"
HELLO WORLD
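The separator rules can also be sketched as a pure function that returns the text instead of printing it. This is an illustration only: render_print is a hypothetical helper, and it assumes its input has already been split into an array that alternates between text and separator strings.

```ruby
# parts example: ['HELLO', ';', 'WORLD']
def render_print(parts)
  output = +'' # unary plus gives a mutable empty string
  parts.each do |part|
    case part
    when ',' then output << ' ' # comma: a single space, as in this interpreter
    when ';' then next          # semicolon: nothing between the items
    else output << part
    end
  end
  output << "\n" unless parts.last == ';' # a trailing semicolon suppresses the newline
  output
end

p render_print(['HELLO', ';', 'WORLD']) # => "HELLOWORLD\n"
p render_print(['HELLO', ',', 'WORLD']) # => "HELLO WORLD\n"
p render_print(['No newline', ';'])     # => "No newline"
```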
The execution engine uses the tokenizer to identify statement types and dispatch them to the appropriate handler functions. This architectural pattern is common in interpreters, and it allows for modular development where we can implement one statement type at a time. The RUN command now actually executes the program by processing lines in numerical order, which is how BASIC programs traditionally flow.
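The dispatch idea does not depend on a case statement. As an alternative sketch (not what this article uses), the same routing can be written as a lookup table of lambdas; SKETCH_HANDLERS and dispatch are hypothetical names, and the handlers only return a description instead of doing real work.

```ruby
# Maps a statement keyword to the code that handles it
SKETCH_HANDLERS = {
  'PRINT' => ->(rest) { "PRINT handler got: #{rest}" },
  'LET'   => ->(rest) { "LET handler got: #{rest}" }
}.freeze

def dispatch(statement)
  keyword, rest = statement.strip.split(' ', 2)
  handler = SKETCH_HANDLERS[keyword]
  return "Unknown statement: #{keyword}" unless handler

  handler.call(rest.to_s)
end

puts dispatch('PRINT "HI"') # => PRINT handler got: "HI"
puts dispatch('BEEP')       # => Unknown statement: BEEP
```

Either way, adding a new statement type means adding one handler and one entry, without touching the existing ones.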
The variable $current_pos tracks the horizontal cursor position during PRINT operations, enabling proper alignment for tabulation and formatted output. This is essential for features like the TAB() function, which positions text at specific column positions, and for maintaining proper spacing when using commas and semicolons as print separators.
Now we can add code to execute the program and update the main function to call run:
def run
$buffer.keys.sort.each do |num|
line = $buffer[num]
execute(num, line)
end
end
# update conditions inside main function:
when 'RUN'
run
# and loop within main function to execute code directly:
unless line.strip.empty?
parts = line.split(' ', 2)
if parts[0].match?(/^\d+$/)
line_num = parts[0].to_i
if parts.length > 1
$buffer[line_num] = parts[1]
elsif $buffer.key?(line_num)
$buffer.delete(line_num)
end
else
execute(0, line) # add execute function here
end
end
An important thing to notice: in BASIC, lines are always executed in ascending line-number order, regardless of the order in which they were typed into the console. That’s why the run method sorts them first.
With this implementation we can finally interpret our first command, both immediately and, once it is stored in the buffer, with the RUN command:
> 10 PRINT "Hello, World!"
> RUN
Hello, World!
> PRINT "Hello, World!"
Hello, World!
In BASIC, all letters in variable names must be uppercase. Names may also contain digits, but never as the first character. Underscores were not allowed; however, in some cases special characters such as ! or $ were accepted. In Altair BASIC, variable names were effectively limited to two characters. This is such an inconvenience for modern programs that we will omit this restriction. The LET keyword wasn’t mandatory, and most programmers quickly stopped using it. However, to keep the code readable, we will require it for now.
$buffer = {}
$token = ''
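These naming rules fit into a single regular expression. The sketch below is only an illustration of the rules as our lexer applies them (uppercase letter first, then uppercase letters or digits, then an optional type suffix); valid_variable_name? is a hypothetical helper that the interpreter itself does not use.

```ruby
# One uppercase letter, then letters or digits, then an optional $, %, # or ! suffix
VARIABLE_NAME = /\A[A-Z][A-Z0-9]*[\$%\#!]?\z/.freeze

def valid_variable_name?(name)
  VARIABLE_NAME.match?(name)
end

puts valid_variable_name?('A1')     # => true
puts valid_variable_name?('NAME$')  # => true
puts valid_variable_name?('1A')     # => false (starts with a digit)
puts valid_variable_name?('my_var') # => false (lowercase and underscore)
```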
$current_pos = 0
$variables = {} # add it to store variable values
# Update execute function to handle LET statement
case $token
when 'PRINT'
print_statement(line)
when 'LET'
let_statement(line)
else
puts "Unknown statement: #{$token}"
end
# add new function to implement LET statement
def let_statement(line)
line_text = normalize_line(line)
line_text.strip!
if !line_text.include?('=')
puts 'Missing "=" in variable definition!'
raise "Missing equals sign"
end
# Split into variable name and expression
parts = line_text.split('=', 2)
var_name = parts[0].strip
expr_text = parts[1].strip
if expr_text.empty?
puts 'Missing variable value!'
raise "Missing value"
else
# For now, we'll just handle simple numeric values
value = expr_text.to_i
$variables[var_name] = value
end
end
# helper function for expression evaluation
def eval_expr(expr)
scan(expr)
expression(expr)
end
Our variables will be stored in a global hash as simple name -> value pairs, and the let_statement method fills it accordingly. For now, this method parses only integer values; a string (or any other non-numeric text) is stored as 0. A more complete mechanism will be implemented in the next chapter.
Update print_statement to read variables:
def print_statement(line)
line = normalize_line(line).chars
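This limitation comes straight from Ruby’s String#to_i, which reads a leading integer and ignores everything after it. A quick sketch (naive_let_value is a hypothetical wrapper around the same call let_statement makes):

```ruby
# Mirrors the temporary conversion used in let_statement
def naive_let_value(expr_text)
  expr_text.to_i
end

puts naive_let_value('5')       # => 5
puts naive_let_value('"HELLO"') # => 0
puts naive_let_value('2 + 3')   # => 2
puts naive_let_value('3.7')     # => 3
```

So at this stage LET A = 2 + 3 stores 2, and LET A = 3.7 stores 3.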
scan(line)
$current_pos = 0 if !defined?($current_pos)
new_line = true
while true
if $token == ''
break
elsif $token.is_a?(String) && $token[0] == '"'
# string handling
text = $token[1..-2] # remove quotation marks
print text
$current_pos += text.length
scan(line)
elsif $token.is_a?(String) && $variables.key?($token)
# variable handling
variable_name = $token
variable_value = $variables[variable_name]
print variable_value
$current_pos += variable_value.to_s.length
scan(line)
elsif $token.is_a?(Integer) || $token.is_a?(Float)
# numeric literal handling
value = $token
print value
$current_pos += value.to_s.length
scan(line)
else
# unsupported token, skip
scan(line)
end
# separator handling
if $token == ','
print ' '
$current_pos += 1
scan(line)
elsif $token == ';'
scan(line)
new_line = ($token != '')
else
break
end
end
if new_line
puts
$current_pos = 0
end
end
# reset stored variables from previous run in main method:
when 'RUN'
$variables = {}
run
For variable tokens, we first check if they exist in our global $variables hash. If found, we retrieve and display the stored value, which could be a number, string, or any other data type. For numeric literals (integers or floating-point numbers), we simply print them directly.
And to test it:
> 10 LET A = 5
> 20 PRINT A
> RUN
5
Great! Now our interpreter can both hold and print variables.
Our interpreter still lacks the ability to perform any math operations, which were (and still are) at the core of every solid programming language. We’re going to implement a recursive descent parser to handle arithmetic expressions with proper operator precedence. This means that expressions like 2 + 3 * 4 are evaluated correctly (to 14, not 20), respecting the mathematical order of operations. BASIC supported a range of arithmetic operations including addition, subtraction, multiplication, division, and often exponentiation. Our parser handles all of these, along with nested parentheses.
# Expression parser - handles expressions in order of precedence
def expression(line)
bitwise_or(line)
end
def bitwise_or(line)
a = bitwise_and(line)
return nil if a.nil?
while $token == '|' || $token == 'OR'
op = $token
scan(line)
b = bitwise_and(line)
return nil if b.nil?
# bitwise OR for integers, logical OR for others
if op == '|' || op == 'OR'
if a.is_a?(Integer) && b.is_a?(Integer)
a = a | b # bitwise OR
else
a = (a != 0 || b != 0) ? 1 : 0 # logical OR
end
end
end
a
end
def bitwise_and(line)
a = comparison(line)
return nil if a.nil?
while $token == '&' || $token == 'AND'
op = $token
scan(line)
b = comparison(line)
return nil if b.nil?
# bitwise AND for integers, logical AND for others
if op == '&' || op == 'AND'
if a.is_a?(Integer) && b.is_a?(Integer)
a = a & b # bitwise AND
else
a = (a != 0 && b != 0) ? 1 : 0 # logical AND
end
end
end
a
end
def comparison(line)
a = add_sub(line)
return nil if a.nil?
if COMPARISON_OPERATORS.key?($token)
op = $token
operator_func = COMPARISON_OPERATORS[op]
scan(line)
b = add_sub(line)
return nil if b.nil?
a = operator_func.call(a, b) ? 1 : 0
end
a
end
def add_sub(line)
a = term(line)
return nil if a.nil?
while $token == '+' || $token == '-'
op = $token
scan(line)
b = term(line)
return nil if b.nil?
if op == '+'
a += b
else # op == '-'
a -= b
end
end
a
end
def term(line)
a = power(line)
return nil if a.nil?
while $token == '*' || $token == '/'
op = $token
scan(line)
b = power(line)
return nil if b.nil?
if op == '*'
a *= b
else # op == '/'
if b == 0
puts 'Division by zero error!'
return nil
end
a = a.to_f / b.to_f
end
end
a
end
def power(line)
a = factor(line)
return nil if a.nil?
if $token == '^'
scan(line)
b = power(line) # for chained operations (2^3^2)
return nil if b.nil?
a **= b
end
a
end
Next, add the COMPARISON_OPERATORS constant at the top of the file, right below the global variable declarations. It’s just a hash of lambdas that define how our interpreter should behave when it meets one of these operators:
COMPARISON_OPERATORS = {
'<>' => ->(a, b) { a != b },
'<=' => ->(a, b) { a <= b },
'>=' => ->(a, b) { a >= b },
'<' => ->(a, b) { a < b },
'>' => ->(a, b) { a > b },
'=' => ->(a, b) { a == b },
}
The <> sign is equivalent to the modern !=. Also, BASIC makes no distinction between = and ==; whether it means comparison or assignment depends on context:
10 LET X = 5
20 X = 10 <- assignment
30 IF X = 5 THEN PRINT "OK" <- comparison
Our factor function will handle quite a lot of highest-precedence elements (numbers, variables, parenthesized expressions, unary operations, function calls), so we should split it into smaller chunks:
def factor(line)
return parse_number(line) if $token.is_a?(Integer) || $token.is_a?(Float)
return parse_string(line) if $token.is_a?(String) && $token.start_with?('"')
return parse_not(line) if $token == 'NOT' || $token == '~'
return parse_parenthesized_expr(line) if $token == '('
return parse_negative(line) if $token == '-'
return parse_variable(line) if $token.is_a?(String) && !$token.empty?
puts "Undefined token in factor: #{$token}"
nil
end
def parse_number(line)
value = $token
scan(line)
value
end
def parse_string(line)
value = $token[1..-2] # remove quotation marks
scan(line)
value
end
def parse_not(line)
scan(line)
a = factor(line)
return nil if a.nil?
if a.is_a?(Integer)
~a
else
(a != 0 ? 0 : 1)
end
end
def parse_parenthesized_expr(line)
scan(line)
a = expression(line)
return nil if a.nil?
if $token != ')'
puts 'Missing closing parenthesis!'
return nil
end
scan(line)
a
end
def parse_negative(line)
scan(line)
a = factor(line)
a.nil? ? nil : -a
end
def parse_variable(line)
identifier = $token
scan(line)
return $variables[identifier] if $variables.key?(identifier)
puts "Variable \"#{identifier}\" is not defined!"
nil
end
That’s a lot of new logic. Let’s break it down:
The refined parser implements a seven-level hierarchy of operations following extended precedence rules:
def expression(line)
bitwise_or(line) # Entry point - lowest precedence
end
The complete precedence chain (from lowest to highest) is:
- bitwise_or() - Handles OR operations (|, OR) - lowest precedence
- bitwise_and() - Handles AND operations (&, AND)
- comparison() - Manages comparison operations (>, <, >=, <=, =, <>)
- add_sub() - Handles addition and subtraction (+, -)
- term() - Manages multiplication and division (*, /)
- power() - Handles exponentiation (^) with right-associativity
- factor() - Parses atomic elements (numbers, variables, parenthesized expressions, etc.)
This hierarchy ensures that expressions like A OR B AND C > D + E * F^2 are evaluated with proper precedence: exponentiation first, then multiplication, addition, comparison, logical AND, and finally logical OR.
Bitwise vs. Logical Operations
Our parser implements smart dual-mode operations for AND and OR:
def bitwise_and(line)
a = comparison(line)
return nil if a.nil?
while $token == '&' || $token == 'AND'
op = $token
scan(line)
b = comparison(line)
return nil if b.nil?
if a.is_a?(Integer) && b.is_a?(Integer)
a = a & b # Bitwise AND for integers
else
a = (a != 0 && b != 0) ? 1 : 0 # Logical AND for other types
end
end
a
end
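The mode is chosen by the operand types, not by how the operator is spelled: with two Integer operands both & and AND are bitwise (5 & 3 results in 1), while any other operand type, such as a Float, falls back to a logical test based on truthiness that yields 1 or 0. The same dual behavior applies to OR operations.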
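Pulling the rule out into a standalone method makes its edge cases easy to try. The sketch below copies the branch from bitwise_and into a hypothetical basic_and helper; nothing else about the parser is assumed.

```ruby
# Bitwise AND for two Integers, logical AND (1 or 0) for anything else
def basic_and(a, b)
  if a.is_a?(Integer) && b.is_a?(Integer)
    a & b
  else
    (a != 0 && b != 0) ? 1 : 0
  end
end

puts basic_and(5, 3)   # => 1 (0b101 & 0b011)
puts basic_and(2, 1)   # => 0 (0b10 & 0b01 share no bits)
puts basic_and(2.5, 1) # => 1 (logical: both are non-zero)
puts basic_and(0.0, 1) # => 0 (logical: the first operand is zero)
```

Note the second case: both operands are non-zero, yet the result is 0 because integers are always combined bit by bit. Comparisons return exactly 1 or 0, so conditions like (A > B) AND (C > D) are not affected.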
Floating-Point Division
def term(line)
a = power(line)
return nil if a.nil?
while $token == '*' || $token == '/'
op = $token
scan(line)
b = power(line)
return nil if b.nil?
if op == '*'
a *= b
else # op == '/'
if b == 0
puts 'Division by zero error!'
return nil
end
a = a.to_f / b.to_f # Always return floating-point result
end
end
a
end
By converting both operands to floating-point numbers with to_f, division operations like 5 / 3 return 1.6666666666666667 instead of the integer result 1. This is in the spirit of classic BASIC, where division was not truncated to an integer. The trade-off is that exact divisions come back as floats too, so 4 / 2 yields 2.0.
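The difference from Ruby’s default integer division can be checked with a small sketch. basic_divide is a hypothetical helper that repeats the division branch of term, including its nil result for a zero divisor.

```ruby
# Always divides as floats; returns nil instead of raising on a zero divisor
def basic_divide(a, b)
  return nil if b == 0

  a.to_f / b.to_f
end

puts 5 / 3              # => 1 (plain Ruby integer division)
puts basic_divide(5, 3) # => 1.6666666666666667
puts basic_divide(4, 2) # => 2.0
p basic_divide(1, 0)    # => nil
```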
Comparison Operations Integration
The comparison() function leverages our existing COMPARISON_OPERATORS constant to handle all comparison logic consistently:
def comparison(line)
a = add_sub(line)
return nil if a.nil?
if COMPARISON_OPERATORS.key?($token)
op = $token
operator_func = COMPARISON_OPERATORS[op]
scan(line)
b = add_sub(line)
return nil if b.nil?
a = operator_func.call(a, b) ? 1 : 0
end
a
end
This approach returns standardized boolean results (1 for true, 0 for false).
Right-Associative Exponentiation
Exponentiation maintains mathematical convention through recursive self-calls:
def power(line)
a = factor(line)
return nil if a.nil?
if $token == '^'
scan(line)
b = power(line) # Recursive call ensures right associativity
return nil if b.nil?
a **= b
end
a
end
This ensures expressions like 2^3^2 evaluate as 2^(3^2) = 512, not (2^3)^2 = 64.
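To see why the recursion matters, compare it with a left-to-right fold over the same operands. Both methods below are hypothetical helpers written for this comparison; they take the operands of a chain like 2^3^2 as an array.

```ruby
# Right-associative: the rest of the chain is evaluated first, like power() does
def power_right(operands)
  base, *rest = operands
  rest.empty? ? base : base**power_right(rest)
end

# Left-associative: what a simple loop from left to right would compute
def power_left(operands)
  operands.reduce { |result, exponent| result**exponent }
end

puts power_right([2, 3, 2]) # => 512, i.e. 2^(3^2)
puts power_left([2, 3, 2])  # => 64,  i.e. (2^3)^2
```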
Modular Factor Processing
The factor() function dispatches to specialized handlers based on token type:
def factor(line)
return parse_number(line) if $token.is_a?(Integer) || $token.is_a?(Float)
return parse_string(line) if $token.is_a?(String) && $token.start_with?('"')
return parse_not(line) if $token == 'NOT' || $token == '~'
return parse_parenthesized_expr(line) if $token == '('
return parse_negative(line) if $token == '-'
return parse_variable(line) if $token.is_a?(String) && !$token.empty?
# ...
end
Comprehensive Expression Evaluation Example
To illustrate the complete system, let’s trace through a simpler but comprehensive expression: A + B * C > 5 where A=2, B=3, C=4:
- expression() calls bitwise_or()
- bitwise_or() calls bitwise_and() (no OR operators found)
- bitwise_and() calls comparison() (no AND operators found)
- comparison() calls add_sub() to get the left operand of >
- add_sub() calls term() to get A (2)
- add_sub() sees the + operator and saves it
- add_sub() calls term() again for the right operand
- This term() calls power() to get B (3)
- This term() sees the * operator and saves it
- term() calls power() again for the right operand
- This power() calls factor() to get C (4)
- power() finds no ^ operator, so returns 4
- term() computes 3 * 4 = 12 and returns this value
- add_sub() computes 2 + 12 = 14 and returns this value
- Back in comparison(), it sees the > operator
- comparison() calls add_sub() for the right operand to get 5
- comparison() evaluates 14 > 5 using the comparison function, which returns true
- comparison() converts the boolean result to 1 (true in BASIC)
- bitwise_and() and bitwise_or() find no operators, so return 1
- The final result is 1 (true)
It demonstrates how the parser correctly follows the order of operations: first multiplication (B * C), then addition (A + result), and finally comparison (result > 5), while properly converting the boolean result to BASIC's numeric representation where 1 means true and 0 means false.
By breaking down expression parsing into specialized functions for each precedence level, our interpreter can evaluate arbitrarily complex formulas while respecting all the standard rules of operator precedence and associativity — a core requirement for any programming language implementation.
Let’s update the last part of our let_statement method to use the new mechanism, as previously it could parse only simple numeric values:
if expr_text.empty?
puts 'Missing variable value!'
raise "Missing value"
else
# Convert expression to char array for parsing
expr_list = expr_text.chars
scan(expr_list)
value = expression(expr_list)
$variables[var_name] = value
end
And change the print_statement function to read and parse expressions properly:
def print_statement(line)
line = normalize_line(line).chars
scan(line)
while true
if $token == ''
break
elsif $token.is_a?(String) && $token[0] == '"'
text = $token[1..-2] # remove quotes
print text
$current_pos += text.length
scan(line)
else
# for any non-string token, treat it as the start of an expression
result = expression(line)
unless result.nil?
print result
$current_pos += result.to_s.length
end
end
# check for separator
if $token == ',' || $token == ';'
scan(line) # Skip separator
else
break
end
end
# add newline at the end
puts
$current_pos = 0
end
From now on we should be able to perform arithmetic operations properly:
> 10 LET A = 5
20 LET B = 10
30 LET C = A * B + 2
40 PRINT "A + B = "; A + B
50 PRINT "C = "; C
60 PRINT "3 * (B - A) = "; 3 * (B - A)
> RUN
A + B = 15
C = 52
3 * (B - A) = 15
> PRINT (5 * 9) - 12 / 3.5 + 9
50.57142857142857
A few things are still missing. We cannot yet reassign a variable without the LET keyword, for example to update it based on its own value:
10 LET X = 10
20 X = X + 12 - X * 3
30 PRINT X
> RUN
Unknown statement: X
10
To fix it, let’s change the execute function:
def execute(num, line)
begin
line_str = normalize_line(line)
line = line_str.chars if line.is_a?(String)
# Save original line for possible assignment detection
original_line = line.dup
scan(line)
case $token
when 'PRINT'
print_statement(line)
when 'LET'
let_statement(line)
else
# Check if this starts with an existing variable name
if $token.is_a?(String) && $variables.key?($token) &&
line_str.include?('=')
# This is an operation on an existing variable
let_statement(original_line)
else
puts "Unknown statement: #{$token}"
end
end
rescue StandardError => e
puts "Line #{num}: Execution failed! #{e}"
end
end
This enhancement allows operations on existing variables without requiring the LET keyword. We check if the line starts with a token that matches an existing variable name and contains an equals sign. If so, we treat it as a variable assignment.
It makes common operations like incrementing counters much more convenient:
10 LET X = 10 <- initial definition requires LET
20 X = X + 10 <- can update the variable directly
30 PRINT X <- should display 20
To make sure everything works just as intended we can run some additional tests:
10 LET A = 5
20 LET B = 3
30 PRINT "Addition: "; A + B
40 PRINT "Subtraction: "; A - B
50 PRINT "Multiplication: "; A * B
60 PRINT "Division: "; A / B
70 PRINT "Mixed: "; A + B * 2
80 PRINT "Mixed with parens: "; (A + B) * 2
> RUN
Addition: 8
Subtraction: 2
Multiplication: 15
Division: 1.6666666666666667
Mixed: 11
Mixed with parens: 16
Exponentiation and precedence:
10 LET X = 2
20 LET Y = 3
30 PRINT "X^Y = "; X^Y
40 PRINT "2^3^2 = "; 2^3^2
50 PRINT "Calculation: "; 2 + 3 * 4^2
60 PRINT "Complex: "; (X + Y) * (X - Y) / (X^Y)
70 PRINT "With negatives: "; -X^2 + Y
80 PRINT "Order of operations: "; 10 - 2 * 3 + 4 / 2
> RUN
X^Y = 8
2^3^2 = 512
Calculation: 50
Complex: -0.625
With negatives: 7
Order of operations: 6.0
Bitwise and logical operations:
10 LET A = 5
20 LET B = 3
30 PRINT "Bitwise AND (5 & 3): "; A & B
40 PRINT "Bitwise OR (5 | 3): "; A | B
50 LET X = 0
60 LET Y = 1
70 PRINT "Logical AND (0 AND 1): "; X AND Y
80 PRINT "Logical OR (0 OR 1): "; X OR Y
90 PRINT "Complex logic: "; (A > B) AND (A + B > 7)
100 PRINT "Bitwise with arithmetic: "; (A & B) + (A | B)
> RUN
Bitwise AND (5 & 3): 1
Bitwise OR (5 | 3): 7
Logical AND (0 AND 1): 0
Logical OR (0 OR 1): 1
Complex logic: 1
Bitwise with arithmetic: 8
Unary operations:
10 LET X = 5
20 PRINT "Unary minus: "; -X
30 PRINT "Double negative: "; -(-X)
40 PRINT "Logical NOT of X: "; NOT X
50 PRINT "Logical NOT of 0: "; NOT 0
> RUN
Unary minus: -5
Double negative: 5
Logical NOT of X: -6
Logical NOT of 0: -1
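The last two results may look surprising for something labeled logical. They follow from parse_not, which flips every bit of an Integer and only falls back to a 0/1 answer for other types. The sketch below repeats that rule in a hypothetical basic_not helper.

```ruby
# Bitwise complement for Integers, logical negation (1 or 0) for anything else
def basic_not(a)
  if a.is_a?(Integer)
    ~a # for integers, ~a is the same as -a - 1
  else
    a != 0 ? 0 : 1
  end
end

puts basic_not(5)   # => -6
puts basic_not(0)   # => -1
puts basic_not(2.5) # => 0
puts basic_not(0.0) # => 1
```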
That was a long and challenging section. However, in fewer than 200 lines of code we’ve built a fully working parsing engine, capable of evaluating complex arithmetic, logical, and bitwise operations!
The GOTO statement is perhaps the most infamous BASIC command, allowing unconditional jumps to any line number. This enables looping and basic decision making, though it can lead to “spaghetti code” if used too often. The IF statement adds conditional execution, evaluating a boolean expression and only executing the subsequent command if the condition is true. Altair BASIC from 1975 didn’t have an ELSE statement yet. A programmer who wanted to mimic it could use GOTO instead, skipping certain lines of code when the condition wasn’t fulfilled:
10 LET N = 5
20 IF N < 10 THEN GOTO 50
30 PRINT "Number is 10 or greater"
40 GOTO 60
50 PRINT "Number is less than 10"
60 END
The good news is that we will include an ELSE statement that works the same way as in later BASIC editions, such as MS BASIC, making the pain of coding like it’s 1975 completely optional :)
10 LET N = 5
20 IF N < 10 THEN PRINT "Number is less than 10" ELSE PRINT "Number is 10 or greater"
30 END
Our implementation includes a state machine to track line execution order, with a $goto flag to indicate when normal sequential execution should be interrupted. The comparison operators (=, <, >, etc.) are implemented as a lookup table of functions for flexibility. While GOTO is now considered poor practice in modern programming, it was central to BASIC’s design.
# Add more global variables
$line_number = 0 # Current line number
$goto = false # GOTO flag
# Implement GOTO statement
def goto_statement(line)
  line = normalize_line(line).chars
  scan(line)
  target_line = expression(line)
  $line_number = target_line.to_i
  $goto = true
end
# Update the run function to handle GOTO
def run
  line_iterator = $buffer.keys.sort.each
  begin
    loop do
      if $goto == false
        $line_number = line_iterator.next
      else
        $goto = false
        current_iterator = $buffer.keys.sort.each
        current_line = current_iterator.next
        while $line_number != current_line
          current_line = current_iterator.next
        end
        line_iterator = current_iterator
      end
      line = $buffer[$line_number]
      execute($line_number, line)
    end
  rescue StopIteration
    # End of program
  rescue StandardError => e
    puts "Program terminated with error: #{e}"
  end
end
The run function maintains sequential execution until a GOTO is encountered, at which point it creates a fresh iterator, advances it to the target line, and continues execution from there. The global $goto flag acts as a signal between the statement handler and the execution loop.
goto_statement simply evaluates the expression after GOTO (which is typically a line number but could be a computed expression), sets the global line number to this value, and raises the goto flag. The run loop then handles the actual line jumping.
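One case the run loop above does not report is a GOTO to a line that doesn’t exist: the inner while loop runs off the end of the iterator and the program simply stops. Below is a minimal sketch of a stricter lookup. The helper name find_line_index and the sample buffer are assumptions made for illustration, not part of the interpreter’s code.

```ruby
# Hypothetical helper, not part of the interpreter above: returns the position
# of a GOTO target within the sorted line numbers, or raises if it is missing.
def find_line_index(sorted_lines, target)
  index = sorted_lines.index(target)
  raise ArgumentError, "Undefined line number: #{target}" if index.nil?
  index
end

# Stand-in for $buffer: line number => source text.
buffer = { 30 => 'PRINT "C"', 10 => 'PRINT "A"', 20 => 'GOTO 30' }
sorted_lines = buffer.keys.sort

# Line 30 is the third entry of [10, 20, 30], so its index is 2.
puts find_line_index(sorted_lines, 30)

begin
  find_line_index(sorted_lines, 25)
rescue ArgumentError => e
  puts e.message # reports the missing target instead of ending silently
end
```

With an index like this, the run loop could keep a plain integer position instead of rebuilding an iterator on every jump, but that is a design choice rather than a requirement.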
StopIteration in the run function plays a key role in controlling program termination. This exception is raised automatically by the line_iterator.next call when the end of the program is reached (i.e., there are no more lines to execute), allowing for a clean exit from the main loop. Additionally, we will raise this exception on purpose in the upcoming END statement to halt program execution immediately, regardless of the current position.
Now let’s implement the IF statement:
def if_statement(num, line)
  text = normalize_line(line)
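To see this mechanism in isolation, here is a small standalone sketch, separate from the interpreter. One detail worth knowing: Ruby’s Kernel#loop rescues StopIteration on its own and ends the loop quietly, so the exception never escapes a loop do block. The arrays and variable names below are made up for the demonstration.

```ruby
# Enumerator#next raises StopIteration once the sequence is exhausted.
# Kernel#loop rescues that exception itself and simply ends the loop.
line_numbers = [10, 20, 30].each # an Enumerator over sorted line numbers
visited = []

loop do
  visited << line_numbers.next
end

puts visited.inspect # every line number was visited before the loop ended

# Raising StopIteration by hand ends the loop early, which is the idea
# behind using it for an END statement.
numbers = [10, 20, 30].each
executed = []

loop do
  n = numbers.next
  raise StopIteration if n == 20 # stand-in for an END on line 20
  executed << n
end

puts executed.inspect # only the lines before the simulated END ran
```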
  # find THEN
  then_index = text.index('THEN')
  if then_index.nil?
    puts 'Missing "THEN" after condition!'
    raise "Missing THEN keyword"
  end
  # split into condition and action
  condition = text[0...then_index].strip
  action = text[(then_index + 4)..-1].strip
  # evaluate condition
  if evaluate_condition(condition)
    # execute the action
    execute(num, action)
  end
end
def evaluate_condition(condition)
  COMPARISON_OPERATORS.each do |op, func|
    next unless condition.include?(op)
    left, right = condition.split(op, 2)
    return func.call(
      eval_expr(left.strip.chars),
      eval_expr(right.strip.chars)
    )
  end
  # If no operator, check if value is non-zero
  eval_expr(condition.chars) != 0
end
# helper function for expression evaluation
def eval_expr(expr)
  scan(expr)
  expression(expr)
end
# and finally update execute function to handle IF and GOTO
case $token
when 'PRINT'
  print_statement(line)
when 'LET'
  let_statement(line)
when 'GOTO'
  goto_statement(line)
when 'IF'
  if_statement(num, line)
else # rest of your code
We’ve split the IF statement logic into a couple of smaller functions to keep the code cleaner. The COMPARISON_OPERATORS constant was already declared in the previous section.
The eval_expr function serves as a convenient tool for evaluating the value of expressions passed as text. It takes a character array (callers convert a string with .chars before passing it in), initiates the tokenization process by calling scan(), and then passes control to the expression parser hierarchy starting with expression().
This allows us to easily compute the value of any expression provided as text — for example, a condition in an IF statement or an argument for GOTO — without needing to duplicate the parser code. It acts as a central point for expression evaluation, simplifying the implementation of many BASIC statements that require the evaluation of mathematical or logical expressions.
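One detail to watch in evaluate_condition: operators are tried in the table’s iteration order using include?, so if "<" is checked before "<=", a condition like X <= 5 gets split at "<" and the right side becomes "= 5". The sketch below illustrates the ordering issue. The table contents and the names SKETCH_OPERATORS and sketch_condition are assumptions (the real COMPARISON_OPERATORS is defined in the previous section), and operands are parsed with Float() as a stand-in for eval_expr.

```ruby
# Assumed operator table for illustration only. Two-character operators come
# first so that "<=" is matched before "<", and "<>" before "<" or ">".
# Ruby hashes keep insertion order, so each() walks them top to bottom.
SKETCH_OPERATORS = {
  '<=' => ->(a, b) { a <= b },
  '>=' => ->(a, b) { a >= b },
  '<>' => ->(a, b) { a != b },
  '<'  => ->(a, b) { a < b },
  '>'  => ->(a, b) { a > b },
  '='  => ->(a, b) { a == b }
}.freeze

# Simplified stand-in for evaluate_condition: operands are plain numbers,
# parsed with Float() instead of the expression parser.
def sketch_condition(condition)
  SKETCH_OPERATORS.each do |op, func|
    next unless condition.include?(op)
    left, right = condition.split(op, 2)
    return func.call(Float(left.strip), Float(right.strip))
  end
  # No operator found: treat any non-zero value as true.
  Float(condition.strip) != 0
end

puts sketch_condition('3 <= 5') # true
puts sketch_condition('5 <> 5') # false
puts sketch_condition('7 >= 7') # true
puts sketch_condition('0')      # false
```

If your table already lists the longer operators first, nothing needs to change; the sketch only shows why the order matters.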
Now we can test if everything works fine:
> 10 LET X = 1
20 PRINT "X = "; X
30 LET X = X + 1
40 IF X < 5 THEN GOTO 20
50 PRINT "Loop completed!"
60 IF X > 3 THEN PRINT "X is greater than 3"
70 LET Y = 10
80 IF Y = 10 THEN GOTO 100
90 PRINT "This will be skipped"
100 PRINT "End of program"
> RUN
X = 1
X = 2
X = 3
X = 4
Loop completed!
X is greater than 3
End of program
With IF and GOTO we can even implement typical loop behaviour and iterate over variables, sweet! But don’t worry, proper loops are coming soon. Now let’s add the ELSE statement:
def if_statement(num, line)
  # Keep the line as a String: index, slicing and strip below rely on it
  text = normalize_line(line)
  # Find THEN and ELSE keywords
  then_index = text.index('THEN')
  else_index = text.index('ELSE', then_index + 4) if then_index
  if then_index.nil?
    puts 'Missing "THEN" after condition!'
    raise "Missing THEN keyword"
  end
  condition = text[0...then_index].strip
  if else_index
    # We have both THEN and ELSE
    then_action = text[(then_index + 4)...else_index].strip
    else_action = text[(else_index + 4)..-1].strip
    if evaluate_condition(condition)
      execute(num, then_action)
    else
      execute(num, else_action)
    end
  else
    # Only THEN, no ELSE
    action = text[(then_index + 4)..-1].strip
    if evaluate_condition(condition)
      execute(num, action)
    end
  end
end
To add support for the ELSE statement, we need to modify the if_statement function so that it checks whether an ELSE keyword follows THEN and, if so, executes the code after ELSE when the condition is false. To test this we can try:
10 LET X = 20
20 IF X > 10 THEN PRINT "LARGE" ELSE PRINT "SMALL"
> RUN
LARGE
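The if_statement above locates THEN and ELSE with String#index, which also matches keywords that appear inside a quoted string, so a line such as IF X > 10 THEN PRINT "A ELSE B" would be split in the wrong place. Here is a minimal sketch of a quote-aware search. The helper keyword_index is hypothetical and not part of the interpreter’s code, and it assumes string literals are delimited by double quotes with no escape sequences.

```ruby
# Hypothetical helper: finds the first occurrence of a keyword at or after
# `start` that lies outside double-quoted string literals. Returns nil if
# there is none.
def keyword_index(text, keyword, start = 0)
  in_string = false
  (0...text.length).each do |i|
    # Track quotes from the very beginning so `start` can't land us
    # in the middle of a string without knowing it.
    in_string = !in_string if text[i] == '"'
    next if in_string || i < start
    return i if text[i, keyword.length] == keyword
  end
  nil
end

text = 'X > 10 THEN PRINT "A ELSE B" ELSE PRINT "SMALL"'

then_index = keyword_index(text, 'THEN')
else_index = keyword_index(text, 'ELSE', then_index + 4)

condition   = text[0...then_index].strip
then_action = text[(then_index + 4)...else_index].strip
else_action = text[(else_index + 4)..-1].strip

puts condition   # X > 10
puts then_action # PRINT "A ELSE B"
puts else_action # PRINT "SMALL"
```

Swapping text.index for a helper like this inside if_statement would keep the rest of the function unchanged.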
So far we’ve given our interpreter the ability to print output, declare and use variables, parse and evaluate complex mathematical and logical expressions, and use conditional IF/ELSE statements together with the GOTO keyword, all while respecting the rules of BASIC line numbering. That is a solid base, and even at this point we are capable of running a wide range of useful programs.
In the next part we will greatly expand the features of this interpreter by implementing proper loops, arrays, comments, functions, subroutines and the ability to read user input.
Click here to read part 2 of this article
Full version of code for this article can be found here

