The Open Brand -- Problem Reporting and Interpretations System

Problem Report 1278 Details

This page provides all information on Problem Report 1278.

Problem Report Number 1278

Submitter's Classification Test Suite problem

State Resolved

Resolution Test Suite Deficiency (TSD)

Problem Resolution ID TSD.X.0560

Raised 1970-01-01 08:00

Updated 2003-03-13 08:00

Published 1998-09-22 08:00

Product Standard Sockets (UNIX 95)

Certification Program The Open Brand certification program

Test Suite VSU version 4.1.1

Test Identification CAPIsockets/fconnect1.c 4

Problem Summary TSD4U.00258 This test may fail because of a race condition.

Problem Text
PROBLEM:

This test checks whether connect returns ETIMEDOUT when a blocked
connect is terminated because the timeout expires. In our case,
it sometimes ends in an UNRESOLVED state because the connect in the
AF_UNIX code (around line 1510) fails: the UNIX domain socket is
removed before the parent process can connect to it. The cause is
not a misbehaving connect but an unintended dependency between the
AF_INET child and the AF_UNIX child, which share the same semaphores.
At the end of the AF_INET test case, the parent process posts
semnum #2 (a semop with sem_op set to 1), the semaphore the AF_INET
child is sleeping on, to wake that child. Immediately afterwards,
the parent continues and creates the AF_UNIX child. Once the AF_UNIX
child gets the CPU, it also waits for the parent on semnum #2. This
is where the race between the two child processes exists: the test
result depends on which child sees the parent's change to semnum #2
first. If the AF_INET child sees it first, the test passes. If the
AF_UNIX child sees it first and closes its socket before the parent
process executes the connect call, the test may fail.
Because the outcome depends on process scheduling, the test does not
fail consistently; in fact, it usually passes. We have seen the
problem most often on uniprocessor systems, probably because process
scheduling is more serialized there.

*****************************************************************************
EXAMPLES:

The following shows the behavior of the parent, child #1 (forked in
the AF_INET case) and child #2 (forked in the AF_UNIX case), in the
chronological order we observed for the PASS and UNRESOLVED cases.

Note:
post---means "avs_post_event" is being called, and sem_op is 1.
wait---means "avs_wait_event" is being called, and sem_op is -1.

For the PASS case, this is what's happening,

Parent:                             Child #1 (AF_INET):               Child #2 (AF_UNIX)
-------                             -------------------               ------------------
1. fork Child #1                    0. start
                                    1. post Event #4
2. wait for Event #4
3. post Event #2
4. fork Child #2                                                      0. start
                                    2. wait for Event #2              1. create socket (bind)
                                    3. exit                           2. post Event #4
5. wait for Event #4
6. access socket (connect)
7. post Event #2                                                      3. wait for Event #2
8. exit                                                               4. remove socket (close)
                                                                      5. exit

For the UNRESOLVED case,

Parent:                             Child #1 (AF_INET):               Child #2 (AF_UNIX)
-------                             -------------------               ------------------
1. fork Child #1                    0. start
                                    1. post Event #4
2. wait for Event #4
3. post Event #2
4. fork Child #2                                                      0. start
                                                                      1. create socket (bind)
                                                                      2. post Event #4
                                                                      3. wait for Event #2
                                                                      4. remove socket (close)
                                                                      5. exit
5. wait for Event #4
6. access socket (connect; fails)
7. ....
8. error exit                       2. wait for Event #2 (times out)
                                    3. error exit

*****************************************************************************
ANALYSIS:

As the UNRESOLVED case shows, after the parent posts semnum 2
(sem_op of 1), child 2 rather than child 1 can get control of the
CPU and run until it reaches step 3. At that point its wait on
Event #2 returns at once, because the event posted for child 1 is
still pending, so child 2 wrongly assumes that the parent has
already posted the event for it. Consequently, it continues,
closing and removing the socket while the parent still assumes that
the child is active and that it can connect to the UNIX domain
socket the child created.
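
The consumption of a single post by the wrong waiter can be shown
in one process, without relying on scheduling. The following is a
minimal sketch only, not code from the test suite: post_event and
try_wait_event are hypothetical stand-ins for avs_post_event and
avs_wait_event (whose real signatures are not shown in this report),
NSEMS and EVENT_DONE are illustrative names, and the waits use
IPC_NOWAIT so the demonstration cannot block.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define NSEMS      5   /* illustrative set size; covers event numbers 2 and 4 */
#define EVENT_DONE 2   /* the event both children wait on in the report */

/* Fourth argument of semctl; the caller has to declare this type. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical stand-in for avs_post_event: add 1 to one semaphore. */
static int post_event(int semid, unsigned short semnum)
{
    struct sembuf op = { .sem_num = semnum, .sem_op = 1, .sem_flg = 0 };
    assert(semnum < NSEMS);
    return semop(semid, &op, 1);
}

/* Hypothetical non-blocking wait: 0 if the event was consumed,
 * -1 if no post was pending (or on error). */
static int try_wait_event(int semid, unsigned short semnum)
{
    struct sembuf op = { .sem_num = semnum, .sem_op = -1,
                         .sem_flg = IPC_NOWAIT };
    assert(semnum < NSEMS);
    return semop(semid, &op, 1);
}

/* Returns 1 if the second child consumed the post meant for the
 * first child, 0 if not, -1 if the semaphore set could not be set up. */
int shared_event_is_stolen(void)
{
    unsigned short zeros[NSEMS] = {0};
    union semun_arg arg;
    int semid, child2_woke, child1_woke;

    semid = semget(IPC_PRIVATE, NSEMS, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    /* Start from all zeros, then post once, as the parent does for child #1. */
    arg.array = zeros;
    if (semctl(semid, 0, SETALL, arg) == -1 ||
        post_event(semid, EVENT_DONE) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* Child #2 reaches its wait first and takes the pending post ... */
    child2_woke = (try_wait_event(semid, EVENT_DONE) == 0);
    /* ... so nothing is left for child #1, which would keep sleeping. */
    child1_woke = (try_wait_event(semid, EVENT_DONE) == 0);

    semctl(semid, 0, IPC_RMID);
    return child2_woke && !child1_woke;
}
```

On a system where System V semaphores are available, the function
is expected to return 1: the first waiter to arrive consumes the
post, whichever child the parent intended it for.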

*****************************************************************************
SUGGESTED WORK AROUND:

Use a different semaphore, either a separate semid or a separate
semnum (in struct sembuf), for each parent/child pair, that is, one
for the AF_INET path and one for the AF_UNIX path. This change
preserves the intent of the test case without creating the
unintended dependency.
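
One possible shape for the separate-semnum variant is sketched
below. It is an illustration under assumptions, not the fix adopted
in the test suite: the PAIR_* and EV_* names, event_for and
sem_change are all hypothetical, and the mapping simply gives every
parent/child pair its own "ready" and "done" semaphore within one
set.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

enum { PAIR_INET, PAIR_UNIX, NPAIRS };   /* one entry per parent/child pair */
enum { EV_READY, EV_DONE, NROLES };      /* events each pair needs */

/* Fourth argument of semctl; the caller has to declare this type. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Map (pair, role) to a semnum that no other pair uses. */
static unsigned short event_for(int pair, int role)
{
    assert(pair >= 0 && pair < NPAIRS && role >= 0 && role < NROLES);
    return (unsigned short)(pair * NROLES + role);
}

/* Apply delta (+1 to post, -1 to wait) without blocking. */
static int sem_change(int semid, unsigned short semnum, short delta)
{
    struct sembuf op = { .sem_num = semnum, .sem_op = delta,
                         .sem_flg = IPC_NOWAIT };
    return semop(semid, &op, 1);
}

/* Returns 1 if a post for the AF_INET pair wakes only the AF_INET
 * child, 0 if not, -1 if the semaphore set could not be set up. */
int done_events_are_isolated(void)
{
    unsigned short zeros[NPAIRS * NROLES] = {0};
    union semun_arg arg;
    int semid, unix_woke, inet_woke;

    semid = semget(IPC_PRIVATE, NPAIRS * NROLES, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    arg.array = zeros;
    if (semctl(semid, 0, SETALL, arg) == -1 ||
        sem_change(semid, event_for(PAIR_INET, EV_DONE), 1) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* The AF_UNIX child waits on its own semaphore and finds no post. */
    unix_woke = (sem_change(semid, event_for(PAIR_UNIX, EV_DONE), -1) == 0);
    /* The AF_INET child still finds the post intended for it. */
    inet_woke = (sem_change(semid, event_for(PAIR_INET, EV_DONE), -1) == 0);

    semctl(semid, 0, IPC_RMID);
    return inet_woke && !unix_woke;
}
```

With this layout the AF_UNIX child cannot consume the post that ends
the AF_INET case, so it keeps its socket open until the parent posts
the AF_UNIX pair's own event after the connect.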

Test Output
520|1 4 19712 1 1|SPEC1170TESTSUITE CASE 4
520|1 4 19712 1 2|If the implementation supports a communications domain
520|1 4 19712 1 3|and a connection-oriented socket type:
520|1 4 19712 1 4|A call to int connect(int socket, const struct
520|1 4 19712 1 5|sockaddr *address, size_t address_len) when blocking
520|1 4 19712 1 6|is terminated due to expiration of the timeout shall
520|1 4 19712 1 7|abort the connection attempt, set errno to ETIMEDOUT,
520|1 4 19712 1 8|and return -1.
520|1 4 19734 1 1|PREP: Get VSU_CONNECT_TIMEOUT configuration variable
520|1 4 19734 1 2|TEST: AF_INET SOCK_STREAM
520|1 4 19734 1 3|PREP: Create test sockaddr_in: address = 15.0.69.35, port = 4357
520|1 4 19735 1 1|PREP: Child: create socket
520|1 4 19735 1 2|PREP: Child: bind to socket
520|1 4 19735 1 3|PREP: Child: listen on socket with backlog of 2
520|1 4 19735 1 4|PREP: Child: tell parent ready
520|1 4 19735 1 5|PREP: Child: wait for parent to complete
520|1 4 19734 2 1|PREP: Create three sockets
520|1 4 19734 2 2|PREP: Wait for child to be ready
520|1 4 19734 2 3|PREP: Connect twice to fill queue
520|1 4 19734 2 4|PREP: Set long alarm
520|1 4 19734 2 5|TEST: connect blocks until timeout elapses
520|1 4 19734 2 6|TEST: Return value
520|1 4 19734 2 7|TEST: errno value
520|1 4 19734 2 8|TEST: Blocked for correct interval
520|1 4 19734 2 9|CLEANUP: Close sockets, end child
520|1 4 19734 2 10|INFO: Notify waiting child
520|1 4 19734 2 11|TEST: AF_UNIX SOCK_STREAM
520|1 4 19734 2 12|PREP: Create test sockaddr_un: path =

Review Information

Review Type TSMA Review

Start Date null

Completed null

Status Complete

Review Recommendation No Resolution Given

Review Response
We agree this is a test suite deficiency in the test
suite version(s) listed.

Review Type SA Review

Start Date null

Completed null

Status Complete

Review Resolution Test Suite Deficiency (TSD)

Review Conclusion
This is an agreed Test Suite Deficiency.
