The Open Brand -- Problem Reporting and Interpretations System
Problem Report 2462 Details
This page provides all information on Problem Report 2462.
Problem Report Number: 2462
Submitter's Classification: Test Suite problem
State: Resolved
Resolution: Rejected (REJ)
Problem Resolution ID: REJ.X.0669
Raised: 2005-05-17 01:29
Updated: 2005-06-07 20:23
Published: 2005-06-07 20:23
Product Standard: Internationalised System Calls and Libraries Extended V3 (UNIX 03)
Certification Program: The Open Brand certification program
Test Suite: VSX4 version 4.6.4
Test Identification: XPG4.os/genuts/regcomp/T.regcomp 43
Specification: Base Definitions Issue 6
Problem Summary: regcomp #43 does not allow behavior specified in XBD6

Problem Text
According to Austin Group Interpretation Request #036:
There seems to be a discrepancy between the description of BREs in
XBD6 and the description of regcomp() in XSH6 as regards the treatment
of nested subpatterns with a following * repeater.
The example that prompted this is whether:
echo aba | sed 's/\(a\(b\)*\)*/<\1|\2>/'
should output <a|b> or <a|>.
According to the description of BREs:
"If the subexpression referenced by the back-reference matches
more than one string because of an asterisk ( '*' ) or an interval
expression (see item (5)), the back-reference shall match the last
(rightmost) of these strings."
which seems to imply that the correct output is <a|b> since the last
string that the second subpattern matches is the b that it got from
the first iteration of the first subpattern. (In the second iteration
of the first subpattern the second subpattern doesn't match anything.)
Several sed implementations do output <a|b>.
However, the description of regcomp() says:
"3. If subexpression i is contained within another subexpression
j, and i is not contained within any other subexpression that
is contained within j, and a match of subexpression j is
reported in pmatch[j], then the match or non-match of
subexpression i reported in pmatch[i] shall be as described in
1. and 2. above, but within the substring reported in pmatch[j]
rather than the whole string. The offsets in pmatch[i] are
still relative to the start of string."
Since the second subpattern in the example does not match anything
within the last string matched by the first subpattern, this implies
that regcomp() would report the second subpattern as a non-match.
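
For illustration only (this is not part of the interpretation request), the regexec() side of the question can be observed with a minimal C sketch. The helper names nested_star_offsets() and print_offsets() are hypothetical. The sketch compiles the BRE from the sed example, matches it against "aba", and exposes the offsets so that pmatch[2] can be inspected. What pmatch[2] contains depends on the implementation under test, so no output is claimed here.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: compile the BRE from the interpretation request,
 * match it against "aba", and store the offsets in out[0..2].
 * Returns 0 on success, nonzero if regcomp() or regexec() fails. */
int nested_star_offsets(regmatch_t out[3])
{
    regex_t re;
    int rc = regcomp(&re, "\\(a\\(b\\)*\\)*", 0); /* cflags 0 selects BRE syntax */
    if (rc != 0)
        return rc;
    rc = regexec(&re, "aba", 3, out, 0);
    regfree(&re);
    return rc;
}

/* Hypothetical helper: print one offset pair. A pair of -1/-1 means the
 * sub-expression is reported as not participating in the match. */
void print_offsets(const char *label, const regmatch_t *m)
{
    printf("%s: rm_so=%ld rm_eo=%ld\n", label, (long)m->rm_so, (long)m->rm_eo);
}
```

Calling nested_star_offsets() from a small main() and passing out[1] and out[2] to print_offsets() shows whether the implementation reports the second subpattern as a non-match (-1/-1) or retains the offsets of the b from the first iteration.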
The functionality being tested in regcomp #43 is the same functionality
described in this interpretation request. The response for the
interpretation was:
The standard is unclear on this issue, and no conformance
distinction can be made between alternative implementations based
on this. This is being referred to the sponsor.
The test is testing according to the specification of the regcomp()
in XSH6 but not allowing the specification in XBD6. Since the
interpretation says "no conformance distinction can be made", the tests
cannot require conformance to one or the other but should allow for
either behavior.

Test Output
****************************************************************************************************************
/tset/XPG4.os/genuts/regcomp/T.regcomp 43 Failed
Test Description:
If regcomp() and regexec() are supported:
When subexpression i in a regular expression compiled by a call to
regcomp() and compared against a string using regexec() is
contained within subexpression j and is not contained within any
other subexpression that is contained within subexpression j and a
match of subexpression j is reported in pmatch[j] and
subexpression i does not participate in that match, then the
offsets in pmatch[i] are set to -1.
Testing Requirement(s):
Test the following cases:
+ In a basic regular expression, subexpressions followed by *
and \{ \} which match zero times.
+ In an extended regular expression, subexpressions followed
by ?, * and { } which match zero times.
+ In an extended regular expression, a subexpression on one
side of a |, the other side of which matches.
Otherwise:
A call to either regcomp() or regexec() returns REG_ENOSYS and
sets errno to ENOSYS.
Test Strategy:
FOR basic regular expressions
    CREATE nested sub-expressions, where non-matching sub-expression
        is followed by * and \{0,\}
    COMPILE regular expressions using regcomp()
    CALL regexec() with string to match whole pattern
    VERIFY regexec() returned zero.
    FOR each sub-expression
        VERIFY that rm_so and rm_eo are set correctly
FOR extended regular expressions
    CREATE nested sub-expressions, where non-matching sub-expression
        is followed by ?, *, {0,} or is either side of "|"
    COMPILE regular expressions using regcomp()
    CALL regexec() with string to match whole pattern
    VERIFY regexec() returned zero.
    FOR each sub-expression
        VERIFY that rm_so and rm_eo are set correctly
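
The strategy above could be sketched in C roughly as follows. This is not the VSX4 source. The struct expected type, the MAX_SUBS limit, and the function verify_offsets() are assumptions made for the example. The function compiles a pattern, runs regexec(), and compares each reported offset pair with the expected one.

```c
#include <regex.h>
#include <stddef.h>

#define MAX_SUBS 8 /* arbitrary limit chosen for this sketch */

/* Expected offsets for one sub-expression; -1/-1 means "no match". */
struct expected { regoff_t so, eo; };

/* Hypothetical checker: COMPILE, CALL regexec(), then VERIFY each pair.
 * Returns 0 if every offset agrees, -1 if regcomp() or regexec() fails,
 * otherwise 1 plus the index of the first mismatching sub-expression. */
int verify_offsets(const char *pattern, int cflags, const char *input,
                   const struct expected *want, size_t nmatch)
{
    regex_t re;
    regmatch_t pm[MAX_SUBS];
    size_t i;

    if (nmatch > MAX_SUBS || regcomp(&re, pattern, cflags) != 0)
        return -1;
    int rc = regexec(&re, input, nmatch, pm, 0);
    regfree(&re);
    if (rc != 0)
        return -1;
    for (i = 0; i < nmatch; i++)
        if (pm[i].rm_so != want[i].so || pm[i].rm_eo != want[i].eo)
            return (int)i + 1;
    return 0;
}
```

For example, the ERE "(a)|(b)" matched against "b" would be checked with expected pairs {0,1}, {-1,-1}, {0,1}, since the left side of the | does not participate in the match.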
Test Information:
After regexec() on basic regular expression
"\(ab\(xyz\)*\)\(c\(d\)*\)\{0,3\}" for input string "abcdcdc"
with nmatch set to 5,
pmatch offsets were incorrect for sub-expression 4
Expected rm_so = -1, actual = 5
Expected rm_eo = -1, actual = 6
After regexec() on extended regular expression
"(ab(xy)?(z)*)(c(d){0,})+" for input string "abcdcdc"
with nmatch set to 6,
pmatch offsets were incorrect for sub-expression 5
Expected rm_so = -1, actual = 5
Expected rm_eo = -1, actual = 6
After regexec() on extended regular expression
"ab((cd)|c)*" for input string "abcdcdc"
with nmatch set to 3,
pmatch offsets were incorrect for sub-expression 2
Expected rm_so = -1, actual = 4
Expected rm_eo = -1, actual = 6
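
The three reported cases can be re-run outside the test suite with a sketch along these lines. The patterns, the input string, the nmatch values, and the sub-expression numbers are taken from the log above. The struct pr_case type, the pr_cases table, and run_pr_case() are hypothetical names introduced for the example.

```c
#include <regex.h>
#include <stddef.h>

/* One case from the log: pattern, compile flags, nmatch, and the index of
 * the sub-expression the test expected to be reported as -1/-1. */
struct pr_case { const char *pattern; int cflags; size_t nmatch; size_t sub; };

static const struct pr_case pr_cases[3] = {
    { "\\(ab\\(xyz\\)*\\)\\(c\\(d\\)*\\)\\{0,3\\}", 0,            5, 4 },
    { "(ab(xy)?(z)*)(c(d){0,})+",                  REG_EXTENDED, 6, 5 },
    { "ab((cd)|c)*",                               REG_EXTENDED, 3, 2 },
};

/* Hypothetical runner: match case n against "abcdcdc" and store the
 * offsets in pm, which must have room for 6 entries.
 * Returns 1 if the sub-expression of interest is reported as -1/-1,
 * 0 if it carries offsets from an earlier iteration, -1 on failure. */
int run_pr_case(size_t n, regmatch_t pm[6])
{
    regex_t re;
    const struct pr_case *c;

    if (n >= 3)
        return -1;
    c = &pr_cases[n];
    if (regcomp(&re, c->pattern, c->cflags) != 0)
        return -1;
    int rc = regexec(&re, "abcdcdc", c->nmatch, pm, 0);
    regfree(&re);
    if (rc != 0)
        return -1;
    return pm[c->sub].rm_so == -1 && pm[c->sub].rm_eo == -1;
}
```

A return value of 0 for a case corresponds to the failure shown in the log. A return value of 1 corresponds to the behaviour the test requires.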
****************************************************************************************************************

Review Information

Review Type: TSMA
Review Start Date: 2005-05-17 01:29
Last Updated: 2005-05-17 19:07
Completed: 2005-05-17 19:07
Status: Complete
Review Recommendation: Rejected (REJ)
Review Response:
Austin Group interpretation AI-036 specifically concerns the requirements
for back-references in regular expressions, for example the \1 and \2 in
the BRE \(a\(b\)*\)*\1\2, (and therefore also the use of back-references
in the replacement string of a substitution). The interpretation
identifies an inconsistency between the back-reference requirements stated
in XBD6 for nested subpatterns and the requirements stated on the
regcomp() page in XSH6 for values returned in pmatch[] by regexec().
The result of interpretation AI-036 is that no conformance distinction
can be made concerning the behaviour of back-references in BREs (and
substitution replacement strings). It does not affect the conformance
requirements for values returned in pmatch[], which are clear and are
not called into question by the interpretation.
Effectively the interpretation allows implementations where the
behaviour of back-references in a BRE with nested subpatterns is
inconsistent with the information returned in pmatch[] in certain
circumstances. For example calls to regcomp() and regexec() for a
particular BRE could behave such that a \1 in the BRE matched a
non-empty string but a no-match indication was returned in pmatch[1].
But it is the \1 match that is allowed to vary here, not the
information returned in pmatch[1].
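
The distinction drawn here can be probed with a sketch such as the following, which is an illustration only and not part of the review. Both function names are hypothetical, as is the choice of the probe pattern ^\(a\(b\)*\)*\2$ and the string "abab". The assumption behind the first probe is that "abab" can match only if \2 still refers to the b captured in the first iteration of the outer subpattern, which is the behaviour AI-036 leaves open. The second probe looks only at pmatch[2], which is the part the test checks.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical probe for back-reference behaviour.
 * Under the assumption stated above, "abab" matches only if \2 keeps the
 * "b" from the first iteration of the outer group.
 * Returns 1 on match, 0 on no match, -1 on any other failure. */
int backref_carries_over(void)
{
    regex_t re;
    if (regcomp(&re, "^\\(a\\(b\\)*\\)*\\2$", 0) != 0)
        return -1;
    int rc = regexec(&re, "abab", 0, NULL, 0); /* nmatch 0: pmatch is ignored */
    regfree(&re);
    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}

/* Hypothetical probe for the pmatch[] side of the same construct.
 * Returns 1 if pmatch[2] is -1/-1 after matching "aba", 0 if it holds
 * offsets from an earlier iteration, -1 on failure. */
int pmatch_reports_nonmatch(void)
{
    regex_t re;
    regmatch_t pm[3];
    if (regcomp(&re, "\\(a\\(b\\)*\\)*", 0) != 0)
        return -1;
    int rc = regexec(&re, "aba", 3, pm, 0);
    regfree(&re);
    if (rc != 0)
        return -1;
    return pm[2].rm_so == -1 && pm[2].rm_eo == -1;
}
```

Under the reading given in this response, either result from backref_carries_over() is acceptable, whereas the test accepts only a result of 1 from pmatch_reports_nonmatch().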
Note also that interpretation AI-036 only applies to BREs. There is
no inconsistency between the XBD6 requirements for EREs and the XSH6
requirements for regexec() because the standard does not require EREs
to support back-references.
The test failure in this PR shows that the implementation does not
meet the requirements concerning values returned in pmatch[] by
regexec() for nested subpatterns in both BREs and EREs.

Review Type: SA
Review Start Date: 2005-05-17 18:07
Last Updated: 2005-05-18 06:05
Completed: 2005-05-18 06:05
Status: Complete
Review Resolution: Rejected (REJ)
Review Conclusion:
This PR is rejected as a TSD. The SA agrees with the TSMA who believes
that the cited Austin Group Interpretation Request #036 is not
relevant to the test purpose, and that the test is valid as written.