MC: String to Real with a Brain

The Challenge

From: "Jonathan M. Katz" <jmkatz-72@worldnet.att.net>
Subject: MC: string -> real (with a brain)
Date: 27 May 1998 00:00:00 GMT


STRING -> REAL (with a brain)

Suppose a program uses the INPUT command to prompt the user to enter 
a real number. Of course, there is no guarantee that the user will 
actually enter a real number, and a STR-> or EVAL command (with no 
error trapping) immediately following a lousy string might cause some 
ugly things to happen, especially within a program. So...

THE CHALLENGE

Write a --> pure User-RPL <-- program (NOT programs) that will take a 
single string (the user's input, assumed to be < 100 chars) and return 
one of the following:

	a. the _first_ real number found within the string
	b. zero

	Level 1		-->	Level 1
	-------------------------------
	"string"		x (as above)

THE CATCH

As stated above, pure User-RPL is a must. But there's an additional 
catch: since I wish to use the best result within a program I am 
writing for my 48SX, the solution must not depend on any G/GX-only 
commands, nor must it require the use of "secondary" programs (like 
DOLIST or DOSUBS patches, etc.) that an S/SX would need to "mimic" 
a G/GX command. And no using commands found only in external libraries.

THE TEST CASES

Here are some "pathological" (i.e. deliberately evil) cases, plus 
the relatively normal cases to use when testing & computing score:

	Level 1		-->	   Level 1
	-------------------------------
1.	 "3.1415" 		 3.1415		<-- Test only
2.	"-3.1415"		-3.1415		<-- Test only
3.	" 3.1415 "		 3.1415		<-- Test only

4.	" -3.14 15 "		-3.14		<-- Test only
5.	" -3.1 FOUR 15 "	-3.1		<-- Test only
6.	" - 3.1 FOUR 15 "	 3.1		<-- Test only

7.	" A926B5 "	       926		<-- Test & score
8.	" A92 6B5 "		92		<-- Test & score
9.	" A-9-26B5 "		-9		<-- Test & score

10.	" .X.358 "		 0.358		<-- Test & score
11.	" .X.3 58 "		 0.3		<-- Test & score
12.	" .X-..3.58 "		 0.3		<-- Test & score
13.	" .X.-.3.58 "		-0.3		<-- Test & score
14.	" .X.-3.58 "		-3.58		<-- Test & score

15.	"ABCD-&^%$.?:"		 0		<-- Test only

16.     "3e14"                  3.E14           <-- Test only
17.     "e3vp7"                 1.E3            <-- Test only
18.     "-e3vp7"               -1.E3            <-- Test only

19.     " .E.358 "              0.358           <-- Test & score
20.     " .E.3 58 "             0.3             <-- Test & score
21.     " .X.-3.5E8 "          -3.5E8           <-- Test & score

The cases 7-14 are the basic "pathological" input types, but the 
program should be able to handle strings with ugly contents like 
delimiters (such as () {} [] '' # etc.), function names and math 
symbols (TAN, ^, %), whitespace (including CR & CR+LF) and any 
other ASCII "chaff" possible. Basically, this program must be a 
dummy-proof string --> real cracker.

THE JUDGING

As usual, best bytes*time*clk_Hz wins.  Present a table of run times 
for the cases 7-14 listed above. If you comment your program, you'll 
get a pat on the head :) but comments are not necessary. Please submit 
all answers to this ng.

Gentlemen & ladies, start your pencils!

katz

From: "Jonathan M. Katz" <jmkatz-72@worldnet.att.net>
Subject: Re: MC: string -> real (with a brain)
Date: 30 May 1998 00:00:00 GMT

Bruce Horrocks wrote:

> None of the examples includes a negative exponent (1E-2 = 0.01) 
> nor an explicitly positive exponent (1e+2), nor leading zeroes, 
> both mantissa and exponent. Since OBJ-> on the SX allows 
> "000.1E+0010" = 1000000000 I presume that the brainy version 
> should as well.

You are correct sir! My examples are by no means intended to serve 
as a complete set of performance guidelines; it's pretty tough to 
try & think up all of the crazy or ugly looking inputs that should 
yield a nonzero result... out-thinking idiots is no job for dummies! 
:)

Basically, if the calc can handle the numeric input "string" when 
typed in regular stack mode, then the MC program should be able to 
crack the same number from a messy input string.

katz

The Entries
Scroll down a few lines if you really want to see the entries

 

 

 

 

 

 

From: "Gerald Squelart" <squelart@aemiaif.lip6.fr>
Subject: Re: string -> real (with a brain)
Date: 29 May 1998 00:00:00 GMT

G'day!

OK, here is my solution (without E yet!).

I use a state machine (so it would be very easy to do it in ASM). I can post
some more information if you'd like...

<<
  1 CHR 0 CHR + + 0 DUP 1
  WHILE DUP
  REPEAT 4 PICK ROT 1 + SWAP OVER DUP SUB
    IF DUP NUM
    THEN ".-0576493182" SWAP POS 1 + DUP 4 < * 8 * ROT +
    { -6 6 5 5 5 6 8 8 1 1 1 1 0 0 0 0 -4 3 -4 -4 0 7 0 0 -2 -2 -2 -2 0 0 0
0 }
    SWAP GET
      IF DUP 0 <
      THEN NEG ROT DROP OVER SWAP
      END
    ELSE 5 DROPN "0" 1 2 0
    END
  END
  DROP 1 - SUB STR->
>>
BYTES #4E66h 295.5
ON-D A -> SPD 3.92

Results in time*bytes*spd

 Level 1 --> Level 1
 -------------------------------
1. "3.1415" 3.1415 <-- Test only - OK
2. "-3.1415" -3.1415 <-- Test only - OK
3. " 3.1415 " 3.1415 <-- Test only - OK

4. " -3.14 15 " -3.14 <-- Test only - OK
5. " -3.1 FOUR 15 " -3.1 <-- Test only - OK
6. " - 3.1 FOUR 15 " 3.1 <-- Test only - OK

7. " A926B5 " 926 <-- Test & score - 548.5
8. " A92 6B5 " 92 <-- Test & score - 472.8
9. " A-9-26B5 " -9 <-- Test & score - 475.2

10. " .X.358 " 0.358 <-- Test & score - 707.4
11. " .X.3 58 " 0.3 <-- Test & score - 550.9
12. " .X-..3.58 " 0.3 <-- Test & score - 712.4
13. " .X.-.3.58 " -0.3 <-- Test & score - 719.5
14. " .X.-3.58 " -3.58 <-- Test & score - 864.1


15. "ABCD-&^%$.?:" 0 <-- Test only - OK

I'll try to post the solution with 'E' with week-end...

CU,
Gerald.

From: "Gerald Squelart" <squelart@aemiaif.lip6.fr>
Subject: Re: string -> real (with a brain), solution+explanation (w/ E)
Date: 05 Jun 1998 00:00:00 GMT

G'Day!

Finally, here is my solution, handling E (not e nor +, sorry).

<<
  " 0 " + 0 DUP 1
  WHILE DUP IP
  REPEAT 4 PICK ROT 1 + SWAP OVER DUP SUB
    ".-E0123456789" SWAP POS 1 + DUP 5 < * 11 * ROT +
    { -6  6  5  5  5  6  8  8 11 11 11
       1  1  1  1  0  0  0  0 .1 .2  0
      -4  3 -4 -4  0  7  0  0 .1 .2  0
      -2 -2 -2 -2  0  0  0  0 10 .2  0
       1  1  1  1  9  9  9  9 .1 .2  0 }
    SWAP GET
    IF DUP 0 <
    THEN NEG ROT DROP OVER SWAP
    END
  END
  DROP 10 * - 1 - SUB STR->
>>
BYTES #4862h 417.5
ON-D A -> SPD 3.92

Results in time(s)*bytes*spd
1. "3.1415" 3.1415 <-- Test only - OK
2. "-3.1415" -3.1415 <-- Test only - OK
3. " 3.1415 " 3.1415 <-- Test only - OK

4. " -3.14 15 " -3.14 <-- Test only - OK
5. " -3.1 FOUR 15 " -3.1 <-- Test only - OK
6. " - 3.1 FOUR 15 " 3.1 <-- Test only - OK

7. " A926B5 " 926 <-- Test & score - 504
8. " A92 6B5 " 92 <-- Test & score - 435
9. " A-9-26B5 " -9 <-- Test & score - 443

10. " .X.358 " 0.358 <-- Test & score - 651
11. " .X.3 58 " 0.3 <-- Test & score - 508
12. " .X-..3.58 " 0.3 <-- Test & score - 657
13. " .X.-.3.58 " -0.3 <-- Test & score - 665
14. " .X.-3.58 " -3.58 <-- Test & score - 806

15. "ABCD-&^%$.?:" 0 <-- Test only - OK

16. "3e14" 3.E14 <-- Test only - OK
17. "e3vp7" 1.E3 <-- Test only - OK
18. "-e3vp7" -1.E3 <-- Test only - OK

19. " .E.358 " 0.358 <-- Test & score - 646
20. " .E.3 58 " 0.3 <-- Test & score - 510
21. " .X.-3.5E8 " -3.5E8 <-- Test & score - 873

That's it for today!

So, now I'm waiting for some competition :-O

Have a good week-end,
Gerald.

From: <boettcher@cmu.edu> (Peter W Boettcher)
Subject: Re: Need subroutine
Date: Fri,  6 Mar 1998 00:48:57 -0500

Whoops!  I somehow missed the builtin MIN.  Now it's:

\<< 
  OVER SWAP - ABS DUP
  \<< MIN \>>
  STREAM POS GET
\>>
# 9A3h
48

It now takes ~5s on a 200 element random list.

Pete Boettcher
boettcher@cmu.edu


The Winner (and an assembly language version)

Download the final User RPL solution

From: "Gerald Squelart" <squelart@aemiaif.lip6.fr>
Subject: Re: string -> real (with a brain), asm
Date: 08 Jun 1998 00:00:00 GMT

Yahoo! (tm) :-)

What do I win? apart from your consideration :-O

By the way, I've said it would be easy to adapt to asm (and Francisco Guzman
asked for it)...

In fact, the algorithm to find where is the number in the string is easy to
port, but then there is some work to convert the string to a real number.

I've made the conversion routine for EQW (in the Meta Kernel), I give it
here (after this message), I leave the integration as an exercise :-)
No support will be available from me, HP, MDG, da Vinci, Sparcom, BB
Marketing, Maubert, or anyone else, unless you kindly ask in this newsgroup
:-O

Gerald.

Jonathan M. Katz wrote in message <6lfglq$cu5@bgtnsc02.worldnet.att.net>...

>Well, it appears that Gerald is the winner by default! I guess
>this task was a little daunting, I sure expected it to be. Much
>thanks to you Gerald, congrats!
>
>katz



*********************************
This routine reads a well-formed real number (it will crash in there's
something wrong).
D0 points to the string, the first byte gives the number of characters
(Pascal-like).
The answer is in B, field W.
It's in MK-Masd syntax:
    *foo is a label
    -> means GOYES
    SKIPNC { xy } means GONC foo xy *foo
    SKIPYES { xx } means GOYES foo xx *foo
Some French words :-)
    moins means minus
    fin means end

Of course, it's (C)MDG. You may use this code if you say where it come from,
or if you hide it very well inside your programs...
<ADV>
Buy the Meta Kernel!
http://www-miaif.lip6.fr/gerald/english/mk.html
</ADV>

*STRtoR
B=0 W
SETDEC B-1 X SETHEX
C=0 S
C+4 S
ST=0 7
C=DAT0 B
D=C B
D0+2
D-1 B
A=DAT0 B
D0+2
LC 2D
?A#C B -> .NOTMOINS
SETDEC B=-B-1 S SETHEX
*.LOOP0
D-1 B SKIPNC
{ B=0 W RTN }
A=DAT0 B
D0+2
*.NOTMOINS
LC 30
?A#C B -> .NOT0
?ST=0 7 SKIPYES
{ SETDEC B-1 X SETHEX }
GOTO .LOOP0
*.NOT0
LC 2E
?A#C B -> .FIN0
ST=1 7
GOTO .LOOP0
*.LOOPN
D-1 B SKIPNC
{
 ?C=0 S RTNYES
 *.BSLM1
 BSL M
 C+1 S
 GONC .BSLM1
 RTN
}
A=DAT0 B
D0+2
*.FIN0
LC 45
?A=C B -> .EXP
LC 2E
?A#C B SKIPYES
{ ST=1 7 GOTO .LOOPN }
LC 30
C=A-C B
BSL M
C+1 S
P=C 0 C=P 3
P=3 B=C P P=0
?ST=1 7 SKIPYES
{ SETDEC B+1 X SETHEX }
GOTO .LOOPN
*.EXP
?C=0 S SKIPYES
{
 *.BSLM2
 BSL M
 C+1 S
 GONC .BSLM2
}
D-1 B RTNC
DSRC
ST=0 7
D=0 X
A=DAT0 B
D0+2
LC 2D
?A#C B -> .NOTMOINSEXP
ST=1 7
*.LOOPEXP
D-1 S GOC .FINEXP
A=DAT0 B
D0+2
*.NOTMOINSEXP
LC 30
C=A-C B
DSL X
D=C P
GOTO .LOOPEXP
*.FINEXP
?D=0 X RTNYES
C=D X
SETDEC
C+C X SKIPNC
{
 SETHEX
 ?ST=1 7 SKIPYES
 {
  *.MAXR
  LC 999999999999499
  C=B S
  B=C W
  RTN
 }
 B=0 W
 RTN
}
?ST=1 7 SKIPYES
{
 C=B X
 C+C X SKIPC
 {
  C=D X
  B+C X
  C=B X
  C+C X
  SETHEX
  RTNNC
  GOTO .MAXR
 }
 C=D X
 B+C X
 SETHEX
 RTN
}
D=-D X
C=B X
C+C X SKIPNC
{
 C=D X
 B+C X
 C=B X
 C+C X
 SETHEX
 RTNC
 B=0 W
 RTN
}
C=D X
B+C X
SETHEX
RTN

What? No comments?
Why do you want comments???
Real programmers never comment their source :-)


Final Thoughts

From: "Jonathan M. Katz" <jmkatz-72@worldnet.att.net>
Subject: Re: string -> real (with a brain), asm
Date: 09 Jun 1998 00:00:00 GMT

Gerald Squelart wrote:

> Why do you use the no-E version instead of the with-E?

Well, I invoke the string->real code immediately after getting an 
input string from the user. To make a long story short, here's the 
whole process in pseudo-code:

<<  ... "How many new items in this list?" { "" alpha } INPUT

    ( string->real routine here )

    0 MAX m MIN IP ... >>

So, as you can see, I basically limit the addition of new list items 
to an integer value between 0 and m (currently on the order of 15-20). 
Since I have no desire to let a silly user add 6.0223E23 new items 
(gadzooks!), I don't need any "E" handling for this particular 
application. 

In fact, I didn't really need any sign or decimal handling, either, 
but when I decided to post the original MC idea, I thought "Why make 
a routine that chokes on reals? If it can 'crack' a string to get a 
real number, that's *much* more useful than just cracking an unsigned 
integer." So that's where the sign and decimal data came from, and 
once I posted the MC, you had the good sense to ask about exponents. 
So there you have it...

In any case, I have modified your great program slightly, so that it 
now can handle +, -, ., and E correctly. Here it is:

\<< " 0 " + 0 DUP 1
  WHILE DUP IP
  REPEAT 4 PICK ROT 1 + SWAP OVER DUP SUB
    "+-.E0123456789" SWAP POS 1 + DUP 6 < * 11 * ROT +
    { -6  6  5  5  5  6  8  8 11 11 11
       1  1  1  1  0  0  0  0 .1 .2  0
      -2 -2 -2 -2  0  0  0  0 10 .2  0
      -2 -2 -2 -2  0  0  0  0 10 .2  0
      -4  3 -4 -4  0  7  0  0 .1 .2  0
       1  1  1  1  9  9  9  9 .1 .2  0 }
    SWAP GET
    IF DUP 0 <
    THEN NEG ROT DROP OVER SWAP
    END
  END
  DROP 1 - SUB STR\->
\>>

BYTES: #9BB7h 449

After I sat & SST-ed through your prog. for half an hour, it's 
operation was clear to me, so modification was a snap! For anybody 
that wants a "case-insensitive" version (i.e. "e" and "E" handled 
the same way), just change the string to 

"+-.Ee0123456789"

and duplicate the last row of the large list, just as rows 3 & 4 
are duplicates. But since I have access to a routine that will 
automatically format a string all caps, I don't need to do this.

Once again, thank you Gerald, and great job with the MC! It's fun 
to learn new programming techniques on this ng, especially slick 
ones like a finite-state machine.

katz