Ticket #1387 (new defect)

Opened 15 years ago

Last modified 14 years ago

tango.text.Regex is broken

Reported by: HeiHon Assigned to: larsivi
Priority: major Milestone: 1.0
Component: Tango Version: 0.99.7 Dominik
Keywords: Regex Cc:

Description

tango.text.Regex is broken; even basic regexes fail.

I've hacked up some basic tests, and tango.text.Regex fails where e.g. dwin.text.pcre.RegExp (and of course Perl) succeeds.

I did not test phobos (yet) - I want to use Tango ;-)

Please see attached ZIP:

  • regex.conf - regex text file used by all tests
  • regextango.d - testing tango regex against regex.conf
  • regexdwin.d - testing dwin pcre regex against regex.conf
  • regex.d - testing tango and dwin against regex.conf
  • regex.pl - testing perl against regex.conf

I used tango/trunk Rev. 4145 vs. dwin Rev. 232 (thanks yidabu ;-) vs. perl >= 5.6.1. OS: Windows XP

To compile (with dsss 0.78): dsss build -release -O $(FileNameExt)

Attachments

regex.zip (4.6 kB) - added by HeiHon on 12/03/08 00:55:24.
ZIP with test files

Change History

12/03/08 00:55:24 changed by HeiHon

  • attachment regex.zip added.

ZIP with test files

12/03/08 01:16:14 changed by HeiHon

Sorry, I forgot: (1) Thanks a lot, all you tango guys :) (2) dmd 1.037/dsss 0.78/tango Rev. 4145/Windows XP Home SP 3

01/07/09 22:36:49 changed by larsivi

  • owner changed from kris to jascha.

03/29/09 13:16:23 changed by larsivi

  • milestone changed from 0.99.8 to 0.99.9.

04/12/09 20:14:29 changed by mp4

Here is the result of the test:

*** Numerical Matches ***

Tango: Matching regex '\d' against text 'abc 1234 xy 56 def'

OK: Matches '[1][2][3][4][5][6]'

pcre: Matching regex '\d' against text 'abc 1234 xy 56 def'

OK: Matches '[1][2][3][4][5][6]'

Tango: Matching regex '\d\d?' against text 'abc 1234 xy 56 def'

OK: Matches '[12][34][56]'

pcre: Matching regex '\d\d?' against text 'abc 1234 xy 56 def'

OK: Matches '[12][34][56]'

Tango: Matching regex '\d{1,2}' against text 'abc 1234 xy 56 def'

OK: Matches '[12][34][56]'

pcre: Matching regex '\d{1,2}' against text 'abc 1234 xy 56 def'

OK: Matches '[12][34][56]'

Tango: Matching regex '\d{1,3}' against text 'abc 1234 xy 56 def'

--> FAIL: Should match '[123][4][56]' but matches '[1234 xy 56]'

pcre: Matching regex '\d{1,3}' against text 'abc 1234 xy 56 def'

OK: Matches '[123][4][56]'

Tango: Matching regex '\d{2,3}' against text 'abc 1234 xy 56 def'

OK: Matches '[123][56]'

pcre: Matching regex '\d{2,3}' against text 'abc 1234 xy 56 def'

OK: Matches '[123][56]'

Tango: Matching regex '\d{2,4}' against text 'abc 1234 xy 56 def'

OK: Matches '[1234][56]'

pcre: Matching regex '\d{2,4}' against text 'abc 1234 xy 56 def'

OK: Matches '[1234][56]'

Tango: Matching regex '\d{1,4}' against text 'abc 1234 xy 56 def'

--> FAIL: Should match '[1234][56]' but matches '[1234 xy 56]'

pcre: Matching regex '\d{1,4}' against text 'abc 1234 xy 56 def'

OK: Matches '[1234][56]'

Tango: Matching regex '\d{3,4}' against text 'abc 1234 xy 56 def'

OK: Matches '[1234]'

pcre: Matching regex '\d{3,4}' against text 'abc 1234 xy 56 def'

OK: Matches '[1234]'

Tango: Matching regex '\d+' against text 'abc 1234 xy 56 def'

--> FAIL: Should match '[1234][56]' but matches '[1234 xy 56]'

pcre: Matching regex '\d+' against text 'abc 1234 xy 56 def'

OK: Matches '[1234][56]'

*** Matching Spaces ***

Tango: Matching regex '\s' against text 'abc def'

OK: Matches '[ ]'

pcre: Matching regex '\s' against text 'abc def'

OK: Matches '[ ]'

Tango: Matching regex '\s+' against text 'abc def'

OK: Matches '[ ]'

pcre: Matching regex '\s+' against text 'abc def'

OK: Matches '[ ]'

*** Matching Character Sets ***

Tango: Matching regex '[a-z]' against text 'abc 123 def'

OK: Matches '[a][b][c][d][e][f]'

pcre: Matching regex '[a-z]' against text 'abc 123 def'

OK: Matches '[a][b][c][d][e][f]'

Tango: Matching regex '[^a-z]' against text 'abc 123 def'

OK: Matches '[ ][1][2][3][ ]'

pcre: Matching regex '[^a-z]' against text 'abc 123 def'

OK: Matches '[ ][1][2][3][ ]'

Tango: Matching regex '[a-z]+' against text 'abc 123 def'

--> FAIL: Should match '[abc][def]' but matches '[abc 123 def]'

pcre: Matching regex '[a-z]+' against text 'abc 123 def'

OK: Matches '[abc][def]'

Tango: Matching regex '[^a-z]+' against text 'abc 123 def'

OK: Matches '[ 123 ]'

pcre: Matching regex '[^a-z]+' against text 'abc 123 def'

OK: Matches '[ 123 ]'

Tango: Matching regex '[a-z]{1,3}' against text 'abc 123 def'

--> FAIL: Should match '[abc][def]' but matches '[abc 123 def]'

pcre: Matching regex '[a-z]{1,3}' against text 'abc 123 def'

OK: Matches '[abc][def]'

Tango: Matching regex '[^a-z]{1,3}' against text 'abc 123 def'

--> FAIL: Should match '[ 12][3 ]' but matches '[ 123 ]'

pcre: Matching regex '[^a-z]{1,3}' against text 'abc 123 def'

OK: Matches '[ 12][3 ]'

Tango: Matching regex '[a-z]{1,4}' against text 'abc 123 def'

--> FAIL: Should match '[abc][def]' but matches '[abc 123 def]'

pcre: Matching regex '[a-z]{1,4}' against text 'abc 123 def'

OK: Matches '[abc][def]'

Tango: Matching regex '[^a-z]{1,4}' against text 'abc 123 def'

--> FAIL: Should match '[ 123][ ]' but matches '[ 123 ]'

pcre: Matching regex '[^a-z]{1,4}' against text 'abc 123 def'

OK: Matches '[ 123][ ]'

Tango: Matching regex '[a-z]{2,4}' against text 'abc 123 def'

--> FAIL: Should match '[abc][def]' but matches '[abc 123 def]'

pcre: Matching regex '[a-z]{2,4}' against text 'abc 123 def'

OK: Matches '[abc][def]'

Tango: Matching regex '[^a-z]{2,4}' against text 'abc 123 def'

--> FAIL: Should match '[ 123]' but matches '[ 123 ]'

pcre: Matching regex '[^a-z]{2,4}' against text 'abc 123 def'

OK: Matches '[ 123]'
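
For anyone who wants to double-check the "Should match" values without the attached ZIP, here is a minimal sketch using Python's `re` module. Python is only a stand-in here: it is not one of the engines tested in this ticket, and the helper name `bracketed` and the `CASES` list are made up for illustration. The patterns, texts, and expected strings are copied from the failing Tango cases in the log.

```python
import re


def bracketed(pattern, text):
    """Join all non-overlapping matches, each wrapped in [], like the log output."""
    return "".join("[" + m + "]" for m in re.findall(pattern, text))


# (pattern, text, expected) copied from the "Should match" values in the log.
CASES = [
    (r"\d{1,3}", "abc 1234 xy 56 def", "[123][4][56]"),
    (r"\d{1,4}", "abc 1234 xy 56 def", "[1234][56]"),
    (r"\d+", "abc 1234 xy 56 def", "[1234][56]"),
    (r"[a-z]+", "abc 123 def", "[abc][def]"),
    (r"[a-z]{1,3}", "abc 123 def", "[abc][def]"),
    (r"[a-z]{1,4}", "abc 123 def", "[abc][def]"),
    (r"[a-z]{2,4}", "abc 123 def", "[abc][def]"),
]

for pattern, text, expected in CASES:
    actual = bracketed(pattern, text)
    status = "OK" if actual == expected else "FAIL"
    print(f"{status}: {pattern!r} on {text!r} gives {actual!r}")
```

In every case a match must consist only of characters from the class, so a result such as '[1234 xy 56]' cannot be a valid match for a digit-only pattern.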

05/21/09 10:39:25 changed by larsivi

Note that the [a-z]{x,y} case appears to be covered by #959. It appears that the matching priorities are wrong with reluctant patterns. This may be a good starting hint for those interested in looking at the parser?
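
As background on the greedy versus reluctant distinction mentioned above: a greedy quantifier takes as many characters as it can, while a reluctant (lazy) one takes as few as it can. The sketch below shows the difference with Python's `re` module, which is assumed here purely as a reference point and says nothing about how Tango's parser is implemented.

```python
import re

text = "abc 1234 xy 56 def"

# Greedy: each match extends over the longest possible run of digits.
greedy = re.findall(r"\d+", text)

# Reluctant: each match stops as soon as the pattern is satisfied (one digit).
reluctant = re.findall(r"\d+?", text)

print("greedy:   ", greedy)
print("reluctant:", reluctant)
```

Neither mode should ever let `\d+` run across the non-digit characters between '1234' and '56'; the two modes differ only in how many digits each match takes.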

02/07/10 21:42:20 changed by larsivi

  • milestone changed from 0.99.9 to 1.0.

06/06/10 19:58:55 changed by kris

  • owner changed from jascha to larsivi.