@Araq:
Win7 Professional 64-bit (Dell laptop, Intel i7 2.8 GHz, 8 GB RAM)

Nim v0.15.3  Windows: amd64
2016-12-05 22:37:07
On my laptop, I get ~6 milliseconds for the Go code snippet in the linked post, and ~0.3 milliseconds for the Nim snippet posted by Araq.
2016-12-05 23:50:06

@Varriount

Does it vary much over multiple runs?

2016-12-05 23:59:17
@jlp765 It varied ±1 millisecond for the Go snippet, and ±0.1 millisecond for the Nim snippet.
2016-12-06 00:54:06

Here's an amendment to my previous timing. I compiled the Go snippet with standard arguments (I don't know if there's a 'release' mode) and the Nim snippet with '-d:release', then ran each executable 20 times.
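For reference, a minimal sketch of such a harness in Nim; the binary name (./gcbench), the 20-run count, and the parsing of a 'Worst push time: ... nanoseconds' output line are assumptions for illustration, and each benchmark binary would need to print a line of that shape:

# Hypothetical timing harness: runs a benchmark binary repeatedly and
# summarises the worst-pause figures it prints. The binary name and the
# output format are assumptions, not taken from the posts above.
import osproc, strutils, algorithm, math

const
  runs   = 20
  binary = "./gcbench"   # assumed name of the compiled benchmark

proc worstPauseMs(output: string): float =
  # Expects a line like "Worst push time: 85010 nanoseconds"
  for line in output.splitLines:
    if line.startsWith("Worst push time:"):
      return parseFloat(line.splitWhitespace()[3]) / 1_000_000.0  # ns -> ms
  raise newException(ValueError, "no timing line found in output")

proc report() =
  var samples: seq[float] = @[]
  for i in 1 .. runs:
    let (output, exitCode) = execCmdEx(binary)
    if exitCode != 0: quit("run " & $i & " failed")
    samples.add worstPauseMs(output)

  samples.sort(system.cmp[float])
  let mean = samples.sum / samples.len.float
  let median = samples[samples.len div 2]
  var variance = 0.0
  for s in samples: variance += (s - mean) * (s - mean)
  let stdDev = sqrt(variance / samples.len.float)

  echo "Mean:     ", mean.formatFloat(ffDecimal, 5), " ms"
  echo "Median:   ", median.formatFloat(ffDecimal, 5), " ms"
  echo "Std Dev.: ", stdDev.formatFloat(ffDecimal, 5), " ms"
  echo "Lowest:   ", samples[0], " ms"
  echo "Highest:  ", samples[^1], " ms"

when isMainModule:
  report()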

Nim
---
Mean:     0.08501 ms
Median:   0.07966 ms
Std Dev.: 0.03227 ms
Lowest:   0.053365 ms
Highest:  0.175821 ms


Go
---
Mean:     5.949 ms
Median:   5.81 ms
Std Dev.: 0.5054 ms
Lowest:   5.359414 ms
Highest:  7.220875 ms

As you can see, Go's garbage collector pauses for quite a bit longer than Nim's, although it does have the benefit of being able to handle multiple threads (I think).

2016-12-06 02:18:51

@Varriount

Is your conclusion really fair?

The Go benchmark seems to include no special tweaking. Araq's version switched off cycle detection. That may be fine for this small example, but when I have a real-world, complex program, can I be sure that I can switch off cycle detection? For all my tests with cycle detection enabled I got pause times of about 8 ms, which is similar to what Dom got and to what Go offers.
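For context, with Nim's default refc GC "switching off cycle detection" means calling GC_disableMarkAndSweep(), which stops the mark-and-sweep pass that reclaims unreachable ref cycles. A minimal sketch of the risk being asked about; the Node type and the loop count are illustrative only:

# Illustrative only: shows why disabling the cycle collector can be unsafe
# in a program that forms ref cycles. Not part of the benchmark.
type
  Node = ref object
    next: Node            # a ref cycle only the cycle collector can reclaim
    payload: seq[byte]

proc buildCycle(): Node =
  result = Node(payload: newSeq[byte](1024))
  result.next = Node(payload: newSeq[byte](1024))
  result.next.next = result   # the two nodes now point at each other

when isMainModule:
  GC_disableMarkAndSweep()    # "switch off cycle detection", as in Araq's version
  for i in 0 ..< 10_000:
    discard buildCycle()      # unreachable cycles: with mark-and-sweep off, they leak
  echo getOccupiedMem()       # memory keeps growing instead of being reclaimed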

2016-12-06 11:10:34
@Varriount what version of Go are you using?
2016-12-06 11:12:59
Nim's GC is also thread-independent: each thread collects its own heap.
2016-12-06 11:14:20

Turning on cycle detection doesn't seem to affect the pause times for me. I still get sub-millisecond pauses for Araq's Nim snippet.

This is the snippet I'm using:

# Compile and run with 'nim c -r -d:useRealtimeGC -d:release main.nim'

import strutils
#import times

include "$lib/system/timers"

const
  windowSize = 200000
  msgCount   = 1000000

type
  Msg = seq[byte]
  Buffer = seq[Msg]

var worst: Nanos

proc mkMessage(n: int): Msg =
  result = newSeq[byte](1024)
  for i in 0 .. <result.len:
    result[i] = byte(n)

proc pushMsg0(b: var Buffer, highID: int) =
  # warmup:
  let m = mkMessage(highID)
  shallowCopy(b[highID mod windowSize], m)

proc pushMsg1(b: var Buffer, highID: int) =
  # with benchmarking:
  let start = getTicks()
  
  let m = mkMessage(highID)
  shallowCopy(b[highID mod windowSize], m)
  
  let elapsed = getTicks() - start
  if elapsed > worst:
    worst = elapsed

proc main() =
  # Don't use GC_disable() and GC_step(). Instead use GC_setMaxPause().
  # GC_disableMarkAndSweep()
  GC_setMaxPause(300)
  
  var b = newSeq[Msg](windowSize)
  # we need to warmup Nim's memory allocator so that not most
  # of the time is spent in mmap()... Hopefully later versions of Nim
  # will be smarter and allocate larger pieces from the OS:
  for i in 0 .. <msgCount:
    pushMsg0(b, i)
  
  # now after warmup, we can measure things:
  for i in 0 .. <msgCount:
    pushMsg1(b, i)
  
  echo("Worst push time: ", worst, " nano seconds")

when isMainModule:
  main()

2016-12-06 13:07:22
Well, you can always switch it off and see what happens, "complex real-world example" or not. You can also disable it selectively per thread or per time-critical code section, and you can break up cycles manually via ptr; that's where Nim should be improving. Swift embraces this model too.
2016-12-06 13:07:56
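For concreteness, a minimal sketch of the two ideas in that last reply, assuming the default refc GC; the Node/parent/child names are illustrative, not from the thread:

# Illustrative sketch of two techniques: a weak back-reference via ptr (so the
# parent/child links form no ref cycle the GC must detect), and selectively
# disabling the cycle collector around a time-critical section.
type
  NodeObj = object
    children: seq[Node]
    parent: ptr NodeObj   # ptr, not ref: does not keep the parent alive and
                          # therefore cannot form a reference cycle
  Node = ref NodeObj

proc newNode(): Node =
  Node(children: @[])

proc addChild(parent: Node): Node =
  # The caller must guarantee that the parent outlives its children;
  # that is the price of breaking the cycle manually.
  result = newNode()
  result.parent = addr parent[]
  parent.children.add result

proc timeCriticalSection(root: Node) =
  # Disable only the cycle collector around the latency-sensitive part and
  # re-enable it afterwards; plain reference counting stays active throughout.
  GC_disableMarkAndSweep()
  for i in 0 ..< 10_000:
    discard addChild(root)
  GC_enableMarkAndSweep()

when isMainModule:
  let root = newNode()
  timeCriticalSection(root)
  echo "children: ", root.children.len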