MC: String to Real with a Brain
The Challenge
From: "Jonathan M. Katz" <jmkatz-72@worldnet.att.net> Subject: MC: string -> real (with a brain) Date: 27 May 1998 00:00:00 GMT STRING -> REAL (with a brain) Suppose a program uses the INPUT command to prompt the user to enter a real number. Of course, there is no guarantee that the user will actually enter a real number, and a STR-> or EVAL command (with no error trapping) immediately following a lousy string might cause some ugly things to happen, especially within a program. So... THE CHALLENGE Write a --> pure User-RPL <-- program (NOT programs) that will take a single string (the user's input, assumed to be < 100 chars) and return one of the following: a. the _first_ real number found within the string b. zero Level 1 --> Level 1 ------------------------------- "string" x (as above) THE CATCH As stated above, pure User-RPL is a must. But there's an additional catch: since I wish to use the best result within a program I am writing for my 48SX, the solution must not depend on any G/GX-only commands, nor must it require the use of "secondary" programs (like DOLIST or DOSUBS patches, etc.) that an S/SX would need to "mimic" a G/GX command. And no using commands found only in external libraries. THE TEST CASES Here are some "pathological" (i.e. deliberately evil) cases, plus the relatively normal cases to use when testing & computing score: Level 1 --> Level 1 ------------------------------- 1. "3.1415" 3.1415 <-- Test only 2. "-3.1415" -3.1415 <-- Test only 3. " 3.1415 " 3.1415 <-- Test only 4. " -3.14 15 " -3.14 <-- Test only 5. " -3.1 FOUR 15 " -3.1 <-- Test only 6. " - 3.1 FOUR 15 " 3.1 <-- Test only 7. " A926B5 " 926 <-- Test & score 8. " A92 6B5 " 92 <-- Test & score 9. " A-9-26B5 " -9 <-- Test & score 10. " .X.358 " 0.358 <-- Test & score 11. " .X.3 58 " 0.3 <-- Test & score 12. " .X-..3.58 " 0.3 <-- Test & score 13. " .X.-.3.58 " -0.3 <-- Test & score 14. " .X.-3.58 " -3.58 <-- Test & score 15. "ABCD-&^%$.?:" 0 <-- Test only 16. "3e14" 3.E14 <-- Test only 17. "e3vp7" 1.E3 <-- Test only 18. "-e3vp7" -1.E3 <-- Test only 19. " .E.358 " 0.358 <-- Test & score 20. " .E.3 58 " 0.3 <-- Test & score 21. " .X.-3.5E8 " -3.5E8 <-- Test & score The cases 7-14 are the basic "pathological" input types, but the program should be able to handle strings with ugly contents like delimiters (such as () {} [] '' # etc.), function names and math symbols (TAN, ^, %), whitespace (including CR & CR+LF) and any other ASCII "chaff" possible. Basically, this program must be a dummy-proof string --> real cracker. THE JUDGING As usual, best bytes*time*clk_Hz wins. Present a table of run times for the cases 7-14 listed above. If you comment your program, you'll get a pat on the head :) but comments are not necessary. Please submit all answers to this ng. Gentlemen & ladies, start your pencils! katz
From: "Jonathan M. Katz" <jmkatz-72@worldnet.att.net> Subject: Re: MC: string -> real (with a brain) Date: 30 May 1998 00:00:00 GMT Bruce Horrocks wrote: > None of the examples includes a negative exponent (1E-2 = 0.01) > nor an explicitly positive exponent (1e+2), nor leading zeroes, > both mantissa and exponent. Since OBJ-> on the SX allows > "000.1E+0010" = 1000000000 I presume that the brainy version > should as well. You are correct sir! My examples are by no means intended to serve as a complete set of performance guidelines; it's pretty tough to try & think up all of the crazy or ugly looking inputs that should yield a nonzero result... out-thinking idiots is no job for dummies! :) Basically, if the calc can handle the numeric input "string" when typed in regular stack mode, then the MC program should be able to crack the same number from a messy input string. katz
The Entries
Scroll down a few lines if you really want to see the entries
From: "Gerald Squelart" <squelart@aemiaif.lip6.fr> Subject: Re: string -> real (with a brain) Date: 29 May 1998 00:00:00 GMT G'day! OK, here is my solution (without E yet!). I use a state machine (so it would be very easy to do it in ASM). I can post some more information if you'd like... << 1 CHR 0 CHR + + 0 DUP 1 WHILE DUP REPEAT 4 PICK ROT 1 + SWAP OVER DUP SUB IF DUP NUM THEN ".-0576493182" SWAP POS 1 + DUP 4 < * 8 * ROT + { -6 6 5 5 5 6 8 8 1 1 1 1 0 0 0 0 -4 3 -4 -4 0 7 0 0 -2 -2 -2 -2 0 0 0 0 } SWAP GET IF DUP 0 < THEN NEG ROT DROP OVER SWAP END ELSE 5 DROPN "0" 1 2 0 END END DROP 1 - SUB STR-> >> BYTES #4E66h 295.5 ON-D A -> SPD 3.92 Results in time*bytes*spd Level 1 --> Level 1 ------------------------------- 1. "3.1415" 3.1415 <-- Test only - OK 2. "-3.1415" -3.1415 <-- Test only - OK 3. " 3.1415 " 3.1415 <-- Test only - OK 4. " -3.14 15 " -3.14 <-- Test only - OK 5. " -3.1 FOUR 15 " -3.1 <-- Test only - OK 6. " - 3.1 FOUR 15 " 3.1 <-- Test only - OK 7. " A926B5 " 926 <-- Test & score - 548.5 8. " A92 6B5 " 92 <-- Test & score - 472.8 9. " A-9-26B5 " -9 <-- Test & score - 475.2 10. " .X.358 " 0.358 <-- Test & score - 707.4 11. " .X.3 58 " 0.3 <-- Test & score - 550.9 12. " .X-..3.58 " 0.3 <-- Test & score - 712.4 13. " .X.-.3.58 " -0.3 <-- Test & score - 719.5 14. " .X.-3.58 " -3.58 <-- Test & score - 864.1 15. "ABCD-&^%$.?:" 0 <-- Test only - OK I'll try to post the solution with 'E' with week-end... CU, Gerald.
From: "Gerald Squelart" <squelart@aemiaif.lip6.fr> Subject: Re: string -> real (with a brain), solution+explanation (w/ E) Date: 05 Jun 1998 00:00:00 GMT G'Day! Finally, here is my solution, handling E (not e nor +, sorry). << " 0 " + 0 DUP 1 WHILE DUP IP REPEAT 4 PICK ROT 1 + SWAP OVER DUP SUB ".-E0123456789" SWAP POS 1 + DUP 5 < * 11 * ROT + { -6 6 5 5 5 6 8 8 11 11 11 1 1 1 1 0 0 0 0 .1 .2 0 -4 3 -4 -4 0 7 0 0 .1 .2 0 -2 -2 -2 -2 0 0 0 0 10 .2 0 1 1 1 1 9 9 9 9 .1 .2 0 } SWAP GET IF DUP 0 < THEN NEG ROT DROP OVER SWAP END END DROP 10 * - 1 - SUB STR-> >> BYTES #4862h 417.5 ON-D A -> SPD 3.92 Results in time(s)*bytes*spd 1. "3.1415" 3.1415 <-- Test only - OK 2. "-3.1415" -3.1415 <-- Test only - OK 3. " 3.1415 " 3.1415 <-- Test only - OK 4. " -3.14 15 " -3.14 <-- Test only - OK 5. " -3.1 FOUR 15 " -3.1 <-- Test only - OK 6. " - 3.1 FOUR 15 " 3.1 <-- Test only - OK 7. " A926B5 " 926 <-- Test & score - 504 8. " A92 6B5 " 92 <-- Test & score - 435 9. " A-9-26B5 " -9 <-- Test & score - 443 10. " .X.358 " 0.358 <-- Test & score - 651 11. " .X.3 58 " 0.3 <-- Test & score - 508 12. " .X-..3.58 " 0.3 <-- Test & score - 657 13. " .X.-.3.58 " -0.3 <-- Test & score - 665 14. " .X.-3.58 " -3.58 <-- Test & score - 806 15. "ABCD-&^%$.?:" 0 <-- Test only - OK 16. "3e14" 3.E14 <-- Test only - OK 17. "e3vp7" 1.E3 <-- Test only - OK 18. "-e3vp7" -1.E3 <-- Test only - OK 19. " .E.358 " 0.358 <-- Test & score - 646 20. " .E.3 58 " 0.3 <-- Test & score - 510 21. " .X.-3.5E8 " -3.5E8 <-- Test & score - 873 That's it for today! So, now I'm waiting for some competition :-O Have a good week-end, Gerald.
From: <boettcher@cmu.edu> (Peter W Boettcher) Subject: Re: Need subroutine Date: Fri, 6 Mar 1998 00:48:57 -0500 Whoops! I somehow missed the builtin MIN. Now it's: \<< OVER SWAP - ABS DUP \<< MIN \>> STREAM POS GET \>> # 9A3h 48 It now takes ~5s on a 200 element random list. Pete Boettcher boettcher@cmu.edu
The Winner (and an assembly language version)
Download the final User RPL solution
From: "Gerald Squelart" <squelart@aemiaif.lip6.fr> Subject: Re: string -> real (with a brain), asm Date: 08 Jun 1998 00:00:00 GMT Yahoo! (tm) :-) What do I win? apart from your consideration :-O By the way, I've said it would be easy to adapt to asm (and Francisco Guzman asked for it)... In fact, the algorithm to find where is the number in the string is easy to port, but then there is some work to convert the string to a real number. I've made the conversion routine for EQW (in the Meta Kernel), I give it here (after this message), I leave the integration as an exercise :-) No support will be available from me, HP, MDG, da Vinci, Sparcom, BB Marketing, Maubert, or anyone else, unless you kindly ask in this newsgroup :-O Gerald. Jonathan M. Katz wrote in message <6lfglq$cu5@bgtnsc02.worldnet.att.net>... >Well, it appears that Gerald is the winner by default! I guess >this task was a little daunting, I sure expected it to be. Much >thanks to you Gerald, congrats! > >katz ********************************* This routine reads a well-formed real number (it will crash in there's something wrong). D0 points to the string, the first byte gives the number of characters (Pascal-like). The answer is in B, field W. It's in MK-Masd syntax: *foo is a label -> means GOYES SKIPNC { xy } means GONC foo xy *foo SKIPYES { xx } means GOYES foo xx *foo Some French words :-) moins means minus fin means end Of course, it's (C)MDG. You may use this code if you say where it come from, or if you hide it very well inside your programs... <ADV> Buy the Meta Kernel! http://www-miaif.lip6.fr/gerald/english/mk.html </ADV> *STRtoR B=0 W SETDEC B-1 X SETHEX C=0 S C+4 S ST=0 7 C=DAT0 B D=C B D0+2 D-1 B A=DAT0 B D0+2 LC 2D ?A#C B -> .NOTMOINS SETDEC B=-B-1 S SETHEX *.LOOP0 D-1 B SKIPNC { B=0 W RTN } A=DAT0 B D0+2 *.NOTMOINS LC 30 ?A#C B -> .NOT0 ?ST=0 7 SKIPYES { SETDEC B-1 X SETHEX } GOTO .LOOP0 *.NOT0 LC 2E ?A#C B -> .FIN0 ST=1 7 GOTO .LOOP0 *.LOOPN D-1 B SKIPNC { ?C=0 S RTNYES *.BSLM1 BSL M C+1 S GONC .BSLM1 RTN } A=DAT0 B D0+2 *.FIN0 LC 45 ?A=C B -> .EXP LC 2E ?A#C B SKIPYES { ST=1 7 GOTO .LOOPN } LC 30 C=A-C B BSL M C+1 S P=C 0 C=P 3 P=3 B=C P P=0 ?ST=1 7 SKIPYES { SETDEC B+1 X SETHEX } GOTO .LOOPN *.EXP ?C=0 S SKIPYES { *.BSLM2 BSL M C+1 S GONC .BSLM2 } D-1 B RTNC DSRC ST=0 7 D=0 X A=DAT0 B D0+2 LC 2D ?A#C B -> .NOTMOINSEXP ST=1 7 *.LOOPEXP D-1 S GOC .FINEXP A=DAT0 B D0+2 *.NOTMOINSEXP LC 30 C=A-C B DSL X D=C P GOTO .LOOPEXP *.FINEXP ?D=0 X RTNYES C=D X SETDEC C+C X SKIPNC { SETHEX ?ST=1 7 SKIPYES { *.MAXR LC 999999999999499 C=B S B=C W RTN } B=0 W RTN } ?ST=1 7 SKIPYES { C=B X C+C X SKIPC { C=D X B+C X C=B X C+C X SETHEX RTNNC GOTO .MAXR } C=D X B+C X SETHEX RTN } D=-D X C=B X C+C X SKIPNC { C=D X B+C X C=B X C+C X SETHEX RTNC B=0 W RTN } C=D X B+C X SETHEX RTN What? No comments? Why do you want comments??? Real programmers never comment their source :-)
Final Thoughts
From: "Jonathan M. Katz" <jmkatz-72@worldnet.att.net> Subject: Re: string -> real (with a brain), asm Date: 09 Jun 1998 00:00:00 GMT Gerald Squelart wrote: > Why do you use the no-E version instead of the with-E? Well, I invoke the string->real code immediately after getting an input string from the user. To make a long story short, here's the whole process in pseudo-code: << ... "How many new items in this list?" { "" alpha } INPUT ( string->real routine here ) 0 MAX m MIN IP ... >> So, as you can see, I basically limit the addition of new list items to an integer value between 0 and m (currently on the order of 15-20). Since I have no desire to let a silly user add 6.0223E23 new items (gadzooks!), I don't need any "E" handling for this particular application. In fact, I didn't really need any sign or decimal handling, either, but when I decided to post the original MC idea, I thought "Why make a routine that chokes on reals? If it can 'crack' a string to get a real number, that's *much* more useful than just cracking an unsigned integer." So that's where the sign and decimal data came from, and once I posted the MC, you had the good sense to ask about exponents. So there you have it... In any case, I have modified your great program slightly, so that it now can handle +, -, ., and E correctly. Here it is: \<< " 0 " + 0 DUP 1 WHILE DUP IP REPEAT 4 PICK ROT 1 + SWAP OVER DUP SUB "+-.E0123456789" SWAP POS 1 + DUP 6 < * 11 * ROT + { -6 6 5 5 5 6 8 8 11 11 11 1 1 1 1 0 0 0 0 .1 .2 0 -2 -2 -2 -2 0 0 0 0 10 .2 0 -2 -2 -2 -2 0 0 0 0 10 .2 0 -4 3 -4 -4 0 7 0 0 .1 .2 0 1 1 1 1 9 9 9 9 .1 .2 0 } SWAP GET IF DUP 0 < THEN NEG ROT DROP OVER SWAP END END DROP 1 - SUB STR\-> \>> BYTES: #9BB7h 449 After I sat & SST-ed through your prog. for half an hour, it's operation was clear to me, so modification was a snap! For anybody that wants a "case-insensitive" version (i.e. "e" and "E" handled the same way), just change the string to "+-.Ee0123456789" and duplicate the last row of the large list, just as rows 3 & 4 are duplicates. But since I have access to a routine that will automatically format a string all caps, I don't need to do this. Once again, thank you Gerald, and great job with the MC! It's fun to learn new programming techniques on this ng, especially slick ones like a finite-state machine. katz