I'm trying to use locks and condition variables from the locks module. The documentation I found here seems to be quite poor, so I assumed that condition variables work like std::condition_variable in C++, which I have already used in the past. In particular, I assumed that the wait function needs to be called on a Cond while holding a lock, and that this stops the thread and releases the lock until another thread calls signal on the Cond. This seems not to be the case, since this very basic example results in a deadlock.

What's wrong with it, and how does Cond work?

import locks
import deques
import cpuinfo


var
  thr = newSeq[Thread[int]](countProcessors())
  lock: Lock
  cond : Cond

proc threadFunc(i: int) {.thread.} =
    withLock(lock):
        cond.wait(lock)
        echo i

initLock(lock)
initCond(cond)

withLock(lock):
    for i in 0..<thr.len:
        createThread(thr[i], threadFunc, i)

for i in 0..<thr.len:
    cond.signal()

for i in 0..<thr.len:
    joinThreads(thr)

2017-10-04 18:42:17

The implementation of condition variables, on Linux at least, is pthreads. But the implementation is tangential in this case.

I think this example would deadlock in C++ as well.

You signal() all the threads in the main thread only once, and then wait for them.

It's possible that the thread scheduler will context switch to one of the other threads between each iteration of the signal() loop, in which case your code would work, but that is unlikely.

More likely, one thread will win the lock from that signal() loop, and the rest will go back to sleep. After that initial loop, there is no mechanism to wake the threads up again.

You need to either put a signal() call in threadFunc (cooperative locking), or have the main thread continue calling signal() instead of waiting.


Note that the semantics of signal() are to wake a single thread waiting on the condition variable (notify_one in std::condition_variable or pthread_cond_signal in pthreads).

It does not have broadcast semantics, i.e. it does not wake all waiting threads (notify_all in std::condition_variable or pthread_cond_broadcast in pthreads).

Though it might be nice to add that API ... hint to @araq and the core devs, lol.
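To make the "signal() call in threadFunc" option concrete, here is a minimal sketch, assuming a build with thread support (--threads:on on older compilers). The names numWorkers, go and woken are made up for this illustration and are not part of the locks module. The main thread sets a flag under the lock and signals once; each worker passes the wake-up on to the next one.

```nim
import locks

const numWorkers = 4           # hypothetical worker count for the sketch

var
  workers: array[numWorkers, Thread[int]]
  lock: Lock
  cond: Cond
  go = false                   # illustrative flag, only touched while holding `lock`
  woken = 0                    # illustrative counter, only touched while holding `lock`

proc worker(i: int) {.thread.} =
  withLock(lock):
    # Re-check the flag in a loop: a worker that arrives after the flag
    # was set never waits, and a spurious wake-up just waits again.
    while not go:
      cond.wait(lock)
    inc woken
  # Cooperative step: hand the wake-up to one more waiting worker, if any.
  cond.signal()

initLock(lock)
initCond(cond)

for i in 0 ..< numWorkers:
  createThread(workers[i], worker, i)

withLock(lock):
  go = true
cond.signal()                  # a single signal starts the chain

joinThreads(workers)
echo "workers woken: ", woken

deinitCond(cond)
deinitLock(lock)
```

Because the flag is checked under the lock, it does not matter whether a worker reaches wait() before or after the main thread signals.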

2017-10-05 01:53:15

I think the problem was that when the main thread entered the signal loop, no thread had actually started executing yet, so when signal was called there were no threads waiting on the Cond. I added a sleep(1000) between the createThread loop and the signal loop, and the code seems to work.

import locks
import cpuinfo
import strutils
import os


var
  thr = newSeq[Thread[int]](countProcessors())
  lock: Lock
  cond : Cond

var counter = 0
proc threadFunc(i: int) {.thread.} =
    counter += 1
    withLock(lock):
        cond.wait(lock)
        counter -= 1
        echo "$# $#" % [$i, $counter]

initLock(lock)
initCond(cond)

withLock(lock):
    for i in 0..<thr.len:
        createThread(thr[i], threadFunc, i)

sleep(1000)

for i in 0..<thr.len:
    cond.signal()

for i in 0..<thr.len:
    joinThreads(thr)

"More likely, one thread will win the lock from that signal() loop, and the rest will go back to sleep. After that initial loop, there is no mechanism to wake the threads up again."

Do you mean the following is possible? I call signal from the main thread with, say, two threads waiting on a Cond, then I call signal again from the main thread on the same Cond. If the scheduler does not context switch to one of the two waiting threads between the two calls, just one thread is woken up and the second one remains waiting?

2017-10-05 07:04:37

This is an example of why multithreading is hard.

The Stack Overflow link from @cdunn2001 is good. It explains the concept better than I can. I will try anyway:

You can't rely on all the threads being in a certain state at the same time. When you create those other threads in the creation loop, they start running on their own timeline. Those threads are not guaranteed to have all reached the condition wait in their code by the time the main thread's creation loop finishes.

If a thread isn't in the condition variable's wait queue when signal is called, that signal will never wake it up. The signal calls are not "saved" in any way.

You "guaranteed" that all the threads are waiting on the condition variable wait queue by putting the sleep() call in. You basically gave the thread scheduler extra time to run all those other threads to the correct point.
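A sleep() only makes the right ordering likely. As a rough sketch of how to get the same guarantee without it (assuming thread support is enabled; the names allWaiting, waiting and finished are invented for this example), each worker can register itself under the lock just before it waits, and the main thread can block until every worker has registered:

```nim
import locks

const numWorkers = 4           # hypothetical worker count for the sketch

var
  workers: array[numWorkers, Thread[int]]
  lock: Lock
  cond: Cond                   # workers block on this one
  allWaiting: Cond             # the main thread blocks on this one
  waiting = 0                  # illustrative counter, guarded by `lock`
  finished = 0                 # illustrative counter, guarded by `lock`

proc worker(i: int) {.thread.} =
  withLock(lock):
    inc waiting
    if waiting == numWorkers:
      allWaiting.signal()      # the last worker to arrive tells the main thread
    # wait() releases `lock` and enters the wait queue in one atomic step,
    # so once the main thread holds the lock again this worker is queued.
    cond.wait(lock)
    inc finished

initLock(lock)
initCond(cond)
initCond(allWaiting)

for i in 0 ..< numWorkers:
  createThread(workers[i], worker, i)

withLock(lock):
  while waiting < numWorkers:
    allWaiting.wait(lock)
  # Every worker has now called cond.wait(), so one signal per worker
  # is enough to release them all.
  for i in 0 ..< numWorkers:
    cond.signal()

joinThreads(workers)
echo "workers finished: ", finished

deinitCond(allWaiting)
deinitCond(cond)
deinitLock(lock)
```

The handshake replaces "give the scheduler extra time" with "wait until the state I need has actually been reached".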


"Do you mean the following is possible? I call signal from the main thread with, say, two threads waiting on a Cond, then I call signal again from the main thread on the same Cond. If the scheduler does not context switch to one of the two waiting threads between the two calls, just one thread is woken up and the second one remains waiting?"

No, it's more subtle than that. A condition variable is just a queue of threads. A cond.wait() pushes the thread onto the queue. A cond.signal() pops a thread off the queue.

In the example you described, with two threads waiting on a Cond, there are two threads on the queue and two calls to cond.signal(), so both threads are popped off the queue. The key is that both threads are actually in the Cond queue, not "almost" in the Cond queue.

That's just the queue part; next you have the lock. Both threads are now awake, but both must "race" to win the lock. One thread will win the lock, and the other will lose and be placed on the (different) queue waiting for the lock (standard lock semantics). That works out just fine: no race condition and no deadlock.
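Since the Cond queue itself remembers nothing, the usual way to make the outcome independent of who is "almost" in the queue is to keep the state in a variable guarded by the lock. A minimal sketch under that assumption (permits and consumed are hypothetical names, and thread support must be enabled):

```nim
import locks

const numWorkers = 2           # the two-waiter case discussed above

var
  workers: array[numWorkers, Thread[int]]
  lock: Lock
  cond: Cond
  permits = 0                  # illustrative: one permit is stored per signal
  consumed = 0                 # illustrative: permits taken by workers

proc worker(i: int) {.thread.} =
  withLock(lock):
    # Wait only while there is nothing to take. A worker that arrives
    # late finds the stored permit and does not wait at all.
    while permits == 0:
      cond.wait(lock)
    dec permits
    inc consumed

initLock(lock)
initCond(cond)

for i in 0 ..< numWorkers:
  createThread(workers[i], worker, i)

for i in 0 ..< numWorkers:
  withLock(lock):
    inc permits                # record the event first ...
  cond.signal()                # ... then wake one waiter, if there is one

joinThreads(workers)
echo "permits consumed: ", consumed

deinitCond(cond)
deinitLock(lock)
```

Here a signal that finds an empty queue is harmless, because the permit it announced is still stored in the counter.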

2017-10-05 21:13:22

From my experience, the condition will be handled only once no matter how many times it's signaled.

2017-10-06 02:45:40

Hi community, just my 2 cents on that topic:

I wrote a simple test (modified from the threads module doc) to explore the signalling behaviour. If signal is called first, the signal is not lost and a later wait() does not block (on my machine). It would be nice if others shared their experience on other platforms.

Here is the code:

# example taken from the module threads doc and modified for one child thread
# to explore the semantics of lock and signal

# code tested only on windows10 with a single core machine
# compiled with --d:release --threads:on
# Nim Compiler Version 0.17.3 (2017-09-16) [Windows: amd64]
# Copyright (c) 2006-2017 by Andreas Rumpf
# git hash: 12af4a3f8ac290f5c6f7d208f1a1951a141449ff
# active boot switches: -d:release

# if the signal is called before the wait is executed
# the signal is not lost, and the code is not locking

import locks,os

# globals
var
  thr: Thread[int] # don't know if an untyped thread is possible
  cLock: Lock
  lockCond: Cond

proc threadFunc (param : int) {.thread.} =
    signal(lockCond) # first signal call
    signal(lockCond) # if you signal twice the second signal is lost (no queue)
    echo "childthread: signal executed"

initLock(cLock)
initCond(lockCond)

createThread(thr, threadFunc,0)

echo "mainthread:begin"
sleep(5000) # ensure that the signal of the childthread is called first
echo "start waiting"
wait(lockCond,cLock)
# wait(lockCond,cLock)
# uncomment to check if there is a queue behind; on windows the (expected)
# behaviour is a deadlock for the second wait (second signal call is lost)
echo "end waiting"

joinThreads(thr) # should not block if the childthread is already finished
deinitLock(cLock)
deinitCond(lockCond)
echo "end"

2017-10-06 21:23:24

@Mikra,

There is an error in your code. You have to acquire the lock before waiting on a condition variable, or the behavior is undefined.


@see:

http://en.cppreference.com/w/cpp/thread/condition_variable/wait

"Calling this function if lock.mutex() is not locked by the current thread is undefined behavior. "

http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_cond_wait.html

"The pthread_cond_wait() and pthread_cond_timedwait() functions are used to block on a condition variable. They are called with mutex locked by the calling thread or undefined behaviour will result."

https://msdn.microsoft.com/en-us/library/windows/desktop/ms682052(v=vs.85).aspx

The Windows documentation doesn't explicitly state the behavior, but all the examples have a call to 'EnterCriticalSection(&CritSection);' before they wait on the condition.
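For comparison, here is a sketch of how the test could look with the lock held around the wait. The signalled flag is my addition and is not in the original test; it is what lets a signal that arrives early still be noticed. Thread support is assumed to be enabled.

```nim
import locks, os

var
  thr: Thread[int]
  cLock: Lock
  lockCond: Cond
  signalled = false            # illustrative flag, guarded by cLock

proc threadFunc(param: int) {.thread.} =
  withLock(cLock):
    signalled = true           # record the event under the lock
  signal(lockCond)

initLock(cLock)
initCond(lockCond)

createThread(thr, threadFunc, 0)

sleep(100)                     # give the child thread time to signal first

withLock(cLock):
  # The lock is held, as wait() requires. If the signal already
  # happened, the flag is set and wait() is skipped entirely.
  while not signalled:
    wait(lockCond, cLock)
echo "end waiting"

joinThread(thr)
deinitCond(lockCond)
deinitLock(cLock)
echo "end"
```

With the flag in place the result no longer depends on whether the signal or the wait happens first.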

2017-10-08 05:04:14

@Mikra, On Mac OSX 10.12 (Sierra), I tested your code and a modified version like this:

withLock(cLock):
  wait(lockCond,cLock)
  wait(lockCond,cLock)

The program consistently deadlocks. The signals are consistently lost for me. Which is what I expect to happen. ¯\_(ツ)_/¯

2017-10-08 05:20:27

Hi @rayman2220, thank you very much for your valuable response. You are right; I missed the "withLock(cLock):" before the wait(lockCond,cLock). Thanks for correcting me. Thanks also for the links, but I wonder how unspecified is defined ...

Just for clarification: does your example also deadlock with just one wait(lockCond,cLock), or did I miss something? At least one signal shouldn't be lost, because you can't ensure that the waiting thread is "faster" (waiting before signaling) than the signaling one. That is why I added the extra "sleep(5000)" to test this specific condition.

2017-10-08 19:33:56