From: Luke Lau Date: Mon, 3 Jun 2019 14:45:10 +0000 (+0100) Subject: Add AST and parsing X-Git-Url: http://git.lukelau.me/?a=commitdiff_plain;h=5c4c0171f43b2d66cec3a882cdf2ecd904c83a1a;hp=30a26b7d2b0e17ea523ee34cb5d37242a38882df;p=kaleidoscope-hs.git Add AST and parsing The first step in our compiler is to convert the source code into some sort of data structure that we can work with. This data structure usually ends up being a tree structure: nodes of expressions built up of other expressions. For example, (1 + (3 - 2)) would be a tree where 1, 3 and 2 are leaf nodes and +/- are parent nodes. It represents the syntax of the program, but is abstract as it doesn't contain the details about it like whitespace or parenthesis: an abstract syntax tree, if you will. In the original LLVM Kaleidoscope tutorial, this abstract syntax tree (AST) is usually built by first lexing the program into separate tokens like identifiers and keywords, and then parsing it to build up the tree structure. We're not going to do that in this tutorial, and instead opt for the Haskell-ier way with parser combinators. Parser combinators allow us to lex and parse at the same time by simply specifying what we expect to parse. We'll be using the ReadP monad, which is also used for the Read class: In fact we'll just be able to parse our program by calling 'read'! The P in ReadP stands for precedence, and you'll see later on we'll be able to add some tricks to prefer certain patterns over others when parsing. We'll also be writing all our parsing with do notation, which I think you'll agree feels very natural to use. --- diff --git a/AST.hs b/AST.hs new file mode 100644 index 0000000..b57d7cb --- /dev/null +++ b/AST.hs @@ -0,0 +1,65 @@ +module AST where + +import Data.Char +import Text.Read +import Text.ParserCombinators.ReadP hiding ((+++), choice) + +data Expr = Num Float + | Var String + | BinOp BinOp Expr Expr + | Call String [Expr] + deriving Show + +data BinOp = Add | Sub | Mul | Cmp Ordering + deriving Show + +instance Read Expr where + readPrec = parens $ choice [ parseNum + , parseVar + , parseCall + , parseBinOp "<" 10 (Cmp LT) + , parseBinOp "+" 20 Add + , parseBinOp "-" 20 Sub + , parseBinOp "*" 40 Mul + ] + where parseNum = Num <$> readPrec + parseVar = Var <$> lift (munch1 isAlpha) + parseBinOp s prc op = prec prc $ do + a <- step readPrec + lift $ do + skipSpaces + string s + skipSpaces + b <- readPrec + return (BinOp op a b) + parseCall = do + func <- lift (munch1 isAlpha) + params <- lift $ between (char '(') (char ')') $ + sepBy (readS_to_P reads) + (skipSpaces >> char ',' >> skipSpaces) + return (Call func params) + +data Prototype = Prototype String [String] + deriving Show + +instance Read Prototype where + readPrec = lift $ do + name <- munch1 isAlpha + params <- between (char '(') (char ')') $ + sepBy (munch1 isAlpha) skipSpaces + return (Prototype name params) + +data AST = Function Prototype Expr + | Extern Prototype + | TopLevelExpr Expr + deriving Show + +instance Read AST where + readPrec = parseFunction +++ parseExtern +++ parseTopLevel + where parseFunction = do + lift $ string "def" >> skipSpaces + Function <$> readPrec <*> readPrec + parseExtern = do + lift $ string "extern" >> skipSpaces + Extern <$> readPrec + parseTopLevel = TopLevelExpr <$> readPrec diff --git a/Main.hs b/Main.hs index 76a9bdb..ec0de8c 100644 --- a/Main.hs +++ b/Main.hs @@ -1 +1,10 @@ -main = pure () +import AST +import System.IO +import Text.Read +main = do + hPutStr stderr "ready> " + ast <- (readMaybe <$> getLine) :: IO (Maybe AST) + case ast of + Just x -> hPrint stderr x + Nothing -> hPutStrLn stderr "Couldn't parse" + main