Things I learned while programming asynchronous, concurrent code in Swift

7-8-2016

Having written about 80,000 lines of Swift code for Warp, I have learned a thing or two about the best way to deal with concurrency in Swift.

You must not make assumptions about how callback blocks are called.

Usually you want a callback function to do some work in a certain queue. Although you can design your functions to accept a DispatchQueue which the function can use to dispatch the callback on, this adds complexity. Instead I chose to allow callbacks to be called on any queue/thread, and make no assumptions whatsoever about it. This means that many callbacks start by dispatching a new block to a known queue (either the main queue or some specific background queue).
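As a rough sketch of this pattern: the callback below assumes nothing about the queue it is invoked on and immediately hops to a known queue. The `fetchGreeting` function, the `workQueue` queue and the `received` array are hypothetical names made up for this example.

```swift
import Foundation

/// Hypothetical operation for illustration: it may invoke `callback` on any queue.
func fetchGreeting(_ callback: @escaping (String) -> ()) {
    DispatchQueue.global().async {
        callback("Hello")
    }
}

/// A known serial queue that owns `received` (names chosen for this example).
let workQueue = DispatchQueue(label: "example.work")
var received: [String] = []

fetchGreeting { greeting in
    // No assumption about the calling queue: hop to a known queue first.
    workQueue.async {
        received.append(greeting)
    }
}
```

Because the state is only ever touched on `workQueue`, it also does not matter whether `fetchGreeting` calls back synchronously or asynchronously.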

Another issue with callbacks is that some functions run them synchronously, while others dispatch them asynchronously. Some functions may call back synchronously if they detect an error right away, but otherwise call back asynchronously. While you can make this explicit, I chose not to make any assumptions about this, and design my callbacks in such a way that it doesn’t matter whether they are called synchronously or asynchronously.

It is very easy, and often very undesirable, to call callback functions more than once.

Calling callbacks more than once often leads to issues and a lot of confusion. Use the following to annotate callbacks that may only ever be called once (in debug builds, this will trip an assertion the second time the callback is called):

/** Wrap a block so that it can be called only once. In debug builds, calling the returned block twice trips an
assertion. After the first call but before returning the result from the wrapped block, the wrapped block is
dereferenced. */
public func once<P, R>(_ block: @escaping (P) -> (R)) -> ((P) -> (R)) {
    #if DEBUG
    var blockReference: ((P) -> (R))? = block
    let mutex = Mutex()
    return {(p: P) -> (R) in
        // Take the wrapped block out of the shared reference while holding the lock
        let block = mutex.locked { () -> ((P) -> (R)) in
            assert(blockReference != nil, "callback called twice!")
            let r = blockReference!
            blockReference = nil
            return r
        }
        return block(p)
    }
    #else
    return block
    #endif
}
// Use as follows:
func longRunningOperation(_ callback: (String) -> ()) {
    callback("Hello")
    callback("World") // this will fail!
}
longRunningOperation(once { result in
    print(result)
})

Forgetting to call a callback is another source of bugs, but unfortunately much harder to check.

Some things really only should be called on the main thread.

Almost all user-interface related components of Cocoa/AppKit cannot be called from threads other than the main thread. Unfortunately, when programming with blocks, you never really know which thread or queue a callback is coming from. Hence, you should always explicitly dispatch code that must run on the main thread to the main queue. I also found it very helpful to add assertions that check whether code that should run on the main thread actually does. I use the following shorthand for this, which conveniently prints the offending location if the code is inadvertently called off the main thread. I also often use the following shorthand function for dispatching blocks to the main queue (note that you might want to add error rethrowing):

public func assertMainThread(_ file: StaticString = #file, line: UInt = #line) {
    assert(Thread.isMainThread, "Code at \(file):\(line) must run on main thread!")
}
// The block outlives this call, so it must be marked @escaping
public func asyncMain(_ block: @escaping () -> ()) {
    DispatchQueue.main.async(execute: block)
}

In debug mode, you could even add checks that ensure that blocks dispatched to the main thread don’t go wild and block the main thread for too long.
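One way to sketch such a check is shown below. The `measure` and `asyncMainChecked` helpers are hypothetical names, and the 0.1-second default limit is an arbitrary choice; none of them come from the original code.

```swift
import Foundation

/// Runs `block` and returns how long it took, in seconds.
func measure(_ block: () -> ()) -> TimeInterval {
    let start = DispatchTime.now().uptimeNanoseconds
    block()
    return Double(DispatchTime.now().uptimeNanoseconds - start) / 1_000_000_000
}

/// Hypothetical variant of asyncMain: in debug builds, asserts that the block
/// finishes within `limit` seconds (the default is an arbitrary choice).
public func asyncMainChecked(limit: TimeInterval = 0.1,
                             file: StaticString = #file, line: UInt = #line,
                             _ block: @escaping () -> ()) {
    DispatchQueue.main.async {
        #if DEBUG
        let elapsed = measure(block)
        assert(elapsed <= limit, "Block dispatched at \(file):\(line) held the main thread for \(elapsed) s")
        #else
        block()
        #endif
    }
}
```

The assertion only fires after the slow block has finished, so it tells you where the block was dispatched from, not what it was doing at the time.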

Swift.print will mangle log messages from different threads.

You should probably use one of the new os_log_* functions in macOS 10.12, but the following will do as long as you don’t log a lot (it has the overhead of dispatching a block and adding more work to the main queue):

public func trace(_ message: String, file: StaticString = #file, line: UInt = #line) {
    #if DEBUG
    DispatchQueue.main.async {
        print(message)
    }
    #endif
}

If you’re not careful, UI objects may be deinitialized on non-main threads.

If you capture references to UI objects in closures that execute in the background, they may get deinitialized on a thread other than the main thread. This causes all sorts of issues (most notably you will see Cocoa error messages involving ‘bear traps’…). The solution is simple: always make sure that UI objects are only weakly captured in closures that may run on threads other than the main thread. In order to manipulate UI objects from such a closure, dispatch a new closure on the main queue, which first attempts to make the weak reference into a strong one, and then performs your UI actions.
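The pattern described above can be sketched as follows. `StatusView` is a hypothetical stand-in for a real UI object and `startWork` is a made-up function for this example.

```swift
import Foundation

/// Stand-in for a UI object such as a view controller (hypothetical, for illustration).
final class StatusView {
    var text = ""
}

func startWork(updating view: StatusView) {
    // Capture the UI object weakly: the background closure never holds a
    // strong reference to it.
    DispatchQueue.global().async { [weak view] in
        let result = "Done" // placeholder for real background work

        DispatchQueue.main.async { [weak view] in
            // Promote the weak reference to a strong one on the main queue;
            // if the object is already gone, there is nothing to update.
            guard let view = view else { return }
            view.text = result
        }
    }
}
```

The strong reference only exists for the duration of the closure running on the main queue, so if it happens to be the last one, the object is released there.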

If you do call back to UI (e.g. for progress reports), make sure to throttle.

If you have a background operation that provides status reports to the UI, you should take care to ensure that it doesn’t make too many reports, or it will slow down the user interface significantly for no added benefit. I use the following code to throttle such reporting callbacks (if the callback frequency exceeds a certain interval, the function will only allow one callback to go through every interval):
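A minimal sketch of such a throttle, written for illustration and not necessarily matching the author's implementation. The `Throttle` class and its `perform` method are hypothetical names, and `NSLock` is used only to keep the example self-contained.

```swift
import Foundation

/// Lets at most one call through per `interval` seconds; calls arriving sooner are dropped.
public final class Throttle {
    private let interval: TimeInterval
    private var lastFired: Date? = nil
    private let lock = NSLock()

    public init(interval: TimeInterval) {
        self.interval = interval
    }

    /// Runs `block` if at least `interval` has passed since the last call that went through.
    /// Returns whether the block was run. `now` is injectable to make testing easier.
    @discardableResult
    public func perform(now: Date = Date(), _ block: () -> ()) -> Bool {
        lock.lock()
        let allowed: Bool
        if let last = lastFired, now.timeIntervalSince(last) < interval {
            allowed = false
        } else {
            allowed = true
            lastFired = now
        }
        lock.unlock()

        // Run the block outside the lock so a slow callback cannot stall other reporters.
        if allowed { block() }
        return allowed
    }
}
```

Since this variant simply drops reports that arrive too soon, a final "100%" report should bypass the throttle so that it is never lost.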

For simple things, queues are overkill.

Queues are a nice way to serialize access to certain things, but in many cases they are overkill: a normal mutex will do just fine if all you want to do is, e.g., increment a counter. Swift's syntax allows mutexes to be used quite elegantly; I use the following:

import Foundation
/** A pthread-based recursive mutex lock. */
public class Mutex {
    private var mutex: pthread_mutex_t = pthread_mutex_t()
    public init() {
        var attr: pthread_mutexattr_t = pthread_mutexattr_t()
        pthread_mutexattr_init(&attr)
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE)
        let err = pthread_mutex_init(&self.mutex, &attr)
        pthread_mutexattr_destroy(&attr)
        switch err {
        case 0:
            // Success
            break
        case EAGAIN:
            fatalError("Could not create mutex: EAGAIN (The system temporarily lacks the resources to create another mutex.)")
        case EINVAL:
            fatalError("Could not create mutex: invalid attributes")
        case ENOMEM:
            fatalError("Could not create mutex: no memory")
        default:
            fatalError("Could not create mutex, unspecified error \(err)")
        }
    }
    private final func lock() {
        let ret = pthread_mutex_lock(&self.mutex)
        switch ret {
        case 0:
            // Success
            break
        case EDEADLK:
            fatalError("Could not lock mutex: a deadlock would have occurred")
        case EINVAL:
            fatalError("Could not lock mutex: the mutex is invalid")
        default:
            fatalError("Could not lock mutex: unspecified error \(ret)")
        }
    }
    private final func unlock() {
        let ret = pthread_mutex_unlock(&self.mutex)
        switch ret {
        case 0:
            // Success
            break
        case EPERM:
            fatalError("Could not unlock mutex: thread does not hold this mutex")
        case EINVAL:
            fatalError("Could not unlock mutex: the mutex is invalid")
        default:
            fatalError("Could not unlock mutex: unspecified error \(ret)")
        }
    }
    deinit {
        assert(pthread_mutex_trylock(&self.mutex) == 0 && pthread_mutex_unlock(&self.mutex) == 0, "deinitialization of a locked mutex results in undefined behavior!")
        pthread_mutex_destroy(&self.mutex)
    }
    // Throwing variant: errors thrown by the block propagate to the caller.
    // (Closure parameters are non-escaping by default, so no @noescape is needed.)
    @discardableResult public final func locked<T>(_ block: () throws -> (T)) throws -> T {
        return try self.tryLocked(block)
    }
    // Non-throwing variant: the block cannot throw, so try! is safe here.
    @discardableResult public final func locked<T>(_ block: () -> (T)) -> T {
        return try! self.tryLocked(block)
    }
    /** Execute the given block while holding a lock to this mutex. */
    @discardableResult public final func tryLocked<T>(_ block: () throws -> (T)) throws -> T {
        self.lock()
        defer {
            self.unlock()
        }
        let ret: T = try block()
        return ret
    }
}
// Use as follows:
var counter = 0
let counterMutex = Mutex()
func incrementCounter() -> Int {
    return counterMutex.locked {
        let oldValue = counter
        counter += 1
        return oldValue
    }
}

The mutex ‘locked’ block intuitively represents a transaction that should be executed as a whole and uninterrupted. In debug mode, your mutex implementation could also check how long it took to obtain the lock: if this exceeds a certain threshold and happened on the main thread, it could trip an assertion (you should never block the main thread for too long!).
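Such a wait-time check could look roughly like the sketch below. The `lockedMeasuringWait` function is a hypothetical helper, `NSLock` stands in for the mutex to keep the example self-contained, and the 50 ms threshold is an arbitrary choice.

```swift
import Foundation

/// Hypothetical helper: acquires `lock`, runs `block`, and also returns how long
/// it took to acquire the lock (in seconds).
func lockedMeasuringWait<T>(_ lock: NSLock, _ block: () throws -> T) rethrows -> (result: T, waited: TimeInterval) {
    let start = DispatchTime.now().uptimeNanoseconds
    lock.lock()
    defer { lock.unlock() }
    let waited = Double(DispatchTime.now().uptimeNanoseconds - start) / 1_000_000_000
    #if DEBUG
    // The 50 ms threshold is an arbitrary choice for this example.
    assert(!(Thread.isMainThread && waited > 0.05), "Main thread waited \(waited) s for a lock")
    #endif
    return (try block(), waited)
}
```

The same measurement could be folded into the `lock()` method of the mutex class, so that every `locked` block is checked without changes at the call sites.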