Compilers (Lexical Analyzer) – Part 2

Peace be upon you, and the mercy and blessings of God.

After a simple introduction to compilers, let's start talking about the first block, which is the scanner (lexical analyzer).

A typical scanner must:

  1. recognize the keywords of the language (these are the reserved words that have a special meaning in the language, such as the word class in Java);
  2. recognize special characters, such as ( and ), or groups of special characters, such as := and ==;
  3. recognize identifiers, integers, reals, decimals, strings, etc.;
  4. ignore whitespace (tabs and blanks) and comments;
  5. recognize and process special directives (such as the #include "file" directive in C) and macros.

So what's the challenge here?

A naive scanner groups input characters into lexical words (a lexical word can be either a sequence of alphanumeric characters without whitespace or special characters, or just one special character), and then tries to associate a token (e.g. number, keyword, identifier, etc.) with each lexical word by performing a number of string comparisons.

This becomes very expensive when there are many keywords and/or many special lexical patterns in the language.
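To make the cost concrete, here is a minimal sketch of such a naive scanner. The keyword list, the set of special characters, the token names and the function names are all assumptions made up for this illustration; they do not come from any real language.

```python
# Hypothetical keywords and special characters for a toy language
# (assumptions for illustration only).
KEYWORDS = ["class", "if", "else", "while", "return"]
SPECIALS = ["(", ")", "+", "-", "*", "/", ";", "=", ":"]


def split_lexical_words(text):
    """Group characters into lexical words: a run of alphanumeric
    characters, or exactly one special character."""
    words, current = [], ""
    for ch in text:
        if ch.isalnum():
            current += ch
        else:
            if current:
                words.append(current)
                current = ""
            if not ch.isspace():
                words.append(ch)  # a single special character is its own word
    if current:
        words.append(current)
    return words


def classify(word):
    """Associate a token with a lexical word by repeated string comparisons."""
    for kw in KEYWORDS:  # one string comparison per keyword
        if word == kw:
            return ("KEYWORD", word)
    for sp in SPECIALS:  # and one per special character
        if word == sp:
            return ("SPECIAL", word)
    if word.isdigit():
        return ("NUMBER", word)
    if word[0].isalpha():
        return ("IDENTIFIER", word)
    return ("UNKNOWN", word)


def naive_scan(text):
    return [classify(w) for w in split_lexical_words(text)]
```

In this sketch every word is compared against the keywords and special characters one by one, so the work per word grows with the size of those lists. It also cannot recognize groups such as := as a single token, because each special character is split off on its own.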

So to build an efficient scanner we will use regular expressions (RE) and finite automata (FA).
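As a rough preview of the idea, the sketch below describes each token kind with a regular expression and lets Python's standard re module do the matching. The token names, the patterns and the keyword set are assumptions for a toy language. Note also that Python's re module is a backtracking matcher rather than a finite automaton, so this only illustrates how tokens can be specified with REs, not how a scanner generator would execute them.

```python
import re

# Hypothetical token patterns for a toy language (assumptions for illustration).
# Order matters: REAL must be tried before INTEGER, and := before a lone =.
TOKEN_SPEC = [
    ("REAL",       r"\d+\.\d+"),
    ("INTEGER",    r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("SPECIAL",    r":=|==|[()=+\-*/;]"),
    ("SKIP",       r"\s+"),   # whitespace is matched, then ignored
    ("MISMATCH",   r"."),     # any other single character is an error
]
KEYWORDS = {"class", "if", "else", "while", "return"}

# Combine all patterns into one expression with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    tokens = []
    for m in MASTER.finditer(text):
        kind, value = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {value!r} at position {m.start()}")
        if kind == "IDENTIFIER" and value in KEYWORDS:
            kind = "KEYWORD"  # one set lookup instead of a chain of comparisons
        tokens.append((kind, value))
    return tokens
```

For example, scan("x := 3.14") treats := as a single SPECIAL token and 3.14 as a single REAL token, because the patterns describe whole lexical units rather than single characters.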

See you in the next part, about RE and FA.

  1. Ahmad Atta
    November 8, 2009 at 6:31 pm | #1

    Nice work, FeRaS. keep writing, man. i don’t encourage you, i really need the following parts :D :D

    • ferasferas
      November 8, 2009 at 8:39 pm | #2

      thx alot a.atta for ur comment :D (first comment on the Blog LOL !),
      wish me luck to read more currently stopped for some time for sorrow :( !.
