The Open Brand -- Problem Reporting and Interpretations System
Problem Report 2471 Details
This page provides all information on Problem Report 2471.
Problem Report Number: 2471
Submitter's Classification: Test Suite problem
State: Resolved
Resolution: Rejected (REJ)
Problem Resolution ID: REJ.X.0671
Raised: 2005-07-26 07:07
Updated: 2005-07-28 18:36
Published: 2005-07-28 18:36
Product Standard: Internationalised System Calls and Libraries Extended (UNIX 95)
Certification Program: The Open Brand certification program
Test Suite: VSX4 version 4.6.4LT
Test Identification: XPG4.os/genuts/regcomp tst 43
Linked Problem Reports: REJ.X.0669
Problem Summary: XPG4.os/genuts/regcomp back-references issue in test 43

Problem Text

Our test gives the following result on our platform:
Using string='abcdcdc' against expr='\(ab\(xyz\)*\)\(c\(d\)*\)\{0,3\}'
Should rm_so = 0, actual = 0 a in pos 0
Should rm_eo = 7, actual = 7 c in pos 6 so \0 = abcdcdc
Should rm_so = 0, actual = 0 a in pos 0
Should rm_eo = 2, actual = 2 b in pos 1 so \1 = ab.....
Should rm_so = -1, actual = -1 no match
Should rm_eo = -1, actual = -1 no match so \2 = <nothing>
Should rm_so = 6, actual = 6 c in pos 6
Should rm_eo = 7, actual = 7 LAST CHAR so \3=......c
because we return the last match (rule 1), although there are actually 3
matches:
Match index 0 ..cdcdc
            1 ....cdc
            2 ......c
all match.
4 0 Should rm_so = -1, actual = 5 d in pos 5
4 0 Should rm_eo = -1, actual = 6 LAST char so \4 = .....d.
Although rule 2 says "if followed by a {}", it also says "and matched 0
times".
Since \4 participated in \3 in the first 2 matches (out of 3), the count
is not 0, so rule 1 applies. \4 was in \3 for matches 0 and 1 and
therefore matched more than 0 times.
I think this is an interpretation issue, and we believe rule 1 applies. I
am not sure this is really a TSD, but I am filing it as such. I am listing
REJ.X.0669 as a reference to what I believe is the same issue.
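
The three iterations of subexpression 3 that the submitter lists can be traced by hand. The sketch below is an illustration only and is not part of the report or the test suite: `trace_iterations` is a hypothetical helper that walks the repeated group `\(c\(d\)*\)\{0,3\}` over the input without using any regex engine. For each iteration it records the span of \3 and the last span of \4, using the rm_so/rm_eo convention (end offset exclusive).

```python
def trace_iterations(s, start, max_iter=3):
    # Hand-written walk of the repeated group (c followed by zero or more d),
    # repeated at most max_iter times, beginning at offset `start`.
    # Returns one (sub3_span, sub4_span) pair per iteration; sub4_span is
    # None when the inner group matched zero times in that iteration.
    result = []
    pos = start
    for _ in range(max_iter):
        if pos >= len(s) or s[pos] != "c":
            break
        begin = pos
        pos += 1
        last_d = None
        while pos < len(s) and s[pos] == "d":
            last_d = (pos, pos + 1)  # last 'd' seen in this iteration
            pos += 1
        result.append(((begin, pos), last_d))
    return result


# \1 = "ab" ends at offset 2, so the repeated group starts there.
iterations = trace_iterations("abcdcdc", 2)
for index, (sub3, sub4) in enumerate(iterations):
    print(index, sub3, sub4)
# 0 (2, 4) (3, 4)
# 1 (4, 6) (5, 6)
# 2 (6, 7) None
```

The trace shows that \4 matched in iterations 0 and 1 but not in iteration 2, which is the only iteration reported in pmatch[3].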
For reference, rules 1 and 2 are:
1. If subexpression i in a regular expression is not contained within
   another subexpression, and it participated in the match several times,
   then the byte offsets in pmatch[i] will delimit the last such match.
2. If subexpression i is not contained within another subexpression, and
   it did not participate in an otherwise successful match, the byte
   offsets in pmatch[i] will be -1. A subexpression does not participate
   in the match when:
   * or \{ \} appears immediately after the subexpression in a basic
   regular expression, or *, ?, or { } appears immediately after the
   subexpression in an extended regular expression, and the subexpression
   did not match (matched 0 times)
   or:
   | is used in an extended regular expression to select this
   subexpression or another, and the other subexpression matched.

Test Output

After regexec() on basic regular expression
"\(ab\(xyz\)*\)\(c\(d\)*\)\{0,3\}" for input string "abcdcdc"
with nmatch set to 5,
pmatch offsets were incorrect for sub-expression 4
Expected rm_so = -1, actual = 5
Expected rm_eo = -1, actual = 6

Review Information
Review Type: TSMA Review
Start Date: 2005-07-26 07:07
Last Updated: 2005-07-27 18:54
Completed: 2005-07-27 18:54
Status: Complete
Review Recommendation: Rejected (REJ)

Review Response

The submitter is trying to apply rules 1 and 2 to subexpression 4
incorrectly. These two rules do not apply directly to subexpression 4
because they state "If subexpression i in a regular expression is not
contained within another subexpression", but subexpression 4 is
contained within another subexpression. They only apply indirectly, in
the way specified in rule 3, i.e. "as described in 1. and 2. above, but
within the substring reported in pmatch[j] rather than the whole string".
The match reported in the test output (rm_so=5, rm_eo=6) is the last
match within the whole string; it is not a match within the substring
reported in pmatch[3] as required by rule 3. Subexpression 4 matches 0
times within the substring reported in pmatch[3], and therefore the
returned offsets should be -1 as expected by the test.
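
The difference between the two readings can be sketched as a small model. This is a simplified illustration under stated assumptions, not text from the standard or code from the test suite: `last_within` is a hypothetical helper, the per-iteration spans are written out by hand for the input "abcdcdc", and "within the substring reported in pmatch[j]" is modelled as plain span containment.

```python
NO_MATCH = (-1, -1)


def last_within(matches, parent_span=None):
    # Return the last match lying inside parent_span, or (-1, -1) if none.
    # parent_span=None means "the whole string" (rules 1 and 2);
    # a concrete span models rule 3 for a nested subexpression.
    inside = [
        m for m in matches
        if m is not None
        and (parent_span is None
             or (parent_span[0] <= m[0] and m[1] <= parent_span[1]))
    ]
    return inside[-1] if inside else NO_MATCH


# Hand-written spans per iteration of \3 for "abcdcdc" (assumed, see lead-in).
sub3_matches = [(2, 4), (4, 6), (6, 7)]
sub4_matches = [(3, 4), (5, 6), None]  # \4 matched 0 times in the last one

pmatch3 = last_within(sub3_matches)                  # rule 1 applied to \3
submitter_reading = last_within(sub4_matches)        # whole-string reading
rule3_reading = last_within(sub4_matches, pmatch3)   # restricted to pmatch[3]

print(pmatch3, submitter_reading, rule3_reading)
# (6, 7) (5, 6) (-1, -1)
```

Under this model the whole-string reading reproduces the submitter's actual offsets (5, 6), while restricting the search to the span in pmatch[3] yields the -1 offsets the test expects.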
The submitter mentions REJ.X.0669 which was a UNIX03 interpretation
request based on a conflict between the regcomp() description and some
new text in XBD added in SUSv3. Since this new text did not appear in
SUSv1 it is not relevant to UNIX95. (In any case, if it had been
relevant then the reason for rejection of REJ.X.0669 would also have
applied.)
This issue has been discussed in depth by the Austin Group in the past
and the consensus is that the behaviour required for regcomp() by
POSIX.2-1992 and POSIX.1-2001/SUSv3 is the behaviour expected by the
test, despite it being known that there are several existing systems
which do not implement the required behaviour.

Review Type: SA Review
Start Date: 2005-07-27 17:54
Last Updated: 2005-07-28 00:26
Completed: 2005-07-28 00:26
Status: Complete
Review Resolution: Rejected (REJ)

Review Conclusion

It is recommended that this be rejected. For rationale see the TSMA
review comments.