From: Nick Roberts
Newsgroups: comp.lang.ada
Subject: Re: Question about Ada.Unchecked_Conversion
Date: Sat, 30 Oct 2004 17:30:07 +0100
Message-ID: <2uhtslF2bkb2mU1@uni-berlin.de>
References: <2uf53sF2b165eU1@uni-berlin.de> <35f054ea.0410300223.773b722b@posting.google.com>

skidmarks wrote:

> I don't have the full picture of what you're trying to do so my answer
> may be off the mark.

I think you're dead right in principle, and slightly off only in a few
minor details.

> When I build a custom lexor I include a (usually) 256 cell array. The
> cell is mapped one-to-one with (any) 256 representation of the
> character set in use, and 256 has the property of guaranteeing nor
> constraint errors. The contents of each cell is a enumeration
> representing the purpose of the mapped character. My lexer is built
> around this.

Exactly how I do it. Which is not to say that it is necessarily the best
possible way, but I have used it with success in the past.

> For example:
>
> type enum { ign, sep, op, sym, num, hex, ...
> );

   type Character_Category is
      ( Ignore, Separator, Operator, Symbolic, [and so on] );

> type enum_Map is array ( Integer range 1 .. 256 ) of enum;
>
> Map : constant enum_Map := ( ign, ... sep, ... );

   Cat_Of: constant array (Character) of Character_Category :=
      ( ',' | ';' | ':' | '(' | ')' => Separator,
        '+' | '-' | '*' | '/'       => Operator,
        [and so on],
        others                      => Ignore );

> begin --
>
> if ( enum_Map( Character'Pos(char) ) = <> ) then <> end if;

   if Cat_Of(Char) = Separator then ...

> or
>
> case enum_Map( Character'Pos(char) ) is
>    when ign => <>
>    when sep => <>
>    when op  => <>
>    when sym => <>
>    when num => <>

   case Cat_Of(Char) is
      when Ignore => null;
      when Separator =>
         if Char = ')' then
            End_of_List := True;
         elsif Char /= ',' then
            raise Syntax_Error;
         end if;
         if Id = "" then raise Syntax_Error; end if;
         Append( Id_List, Id );
         Clear( Id );
         Get_Next_Character;
      [and so on]

> and so on.
>
> I use a Moore Machine for my finite state machine, the arcs
> representing the transition states (ign, sep, ...) and actions
> associated with taking the transition.

--
Nick Roberts
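
For anyone who wants to try the table-driven classification discussed above, here is a minimal, self-contained sketch. It is an illustration under stated assumptions, not the code from either poster's lexer: the procedure name Classify_Demo, the Numeric category, the sample input string, and the choice to classify letters and '_' as Symbolic are all hypothetical fill-ins for the "[and so on]" parts, which the post leaves unspecified.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Classify_Demo is

   --  Ignore, Separator, Operator and Symbolic come from the post;
   --  Numeric is an assumed extra category for this sketch.
   type Character_Category is
     (Ignore, Separator, Operator, Symbolic, Numeric);

   --  The table is indexed by Character itself rather than by an
   --  Integer range, so every Character value is a valid index and
   --  no Character'Pos conversion is needed at the point of use.
   Cat_Of : constant array (Character) of Character_Category :=
     (',' | ';' | ':' | '(' | ')'   => Separator,
      '+' | '-' | '*' | '/'         => Operator,
      'A' .. 'Z' | 'a' .. 'z' | '_' => Symbolic,   --  assumption
      '0' .. '9'                    => Numeric,    --  assumption
      others                        => Ignore);

   --  Hypothetical sample input.
   Input : constant String := "foo, bar42 + 7";

begin
   --  Print the category looked up for each character of the input.
   for I in Input'Range loop
      Put_Line ("'" & Input (I) & "' => "
                & Character_Category'Image (Cat_Of (Input (I))));
   end loop;
end Classify_Demo;
```

Because the index subtype is Character, the lookup Cat_Of (C) is in range for any C of type Character, which is the same guarantee the 256-cell Integer-indexed array was chosen for, without the explicit position arithmetic.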