jyrgenn wrote on 2016-09-05, 10:32 am

Go: which is faster, buf.WriteString("...") or str += "..."?

More or less since I started programming in Go, I have wanted to know which method I should use for collecting strings: (a) use a bytes.Buffer and WriteString(), or (b) just "add" a string together with +=. The convenience of the latter is appealing, but how does it perform? So I finally checked, with this small program:
package main

import (
	"bytes"
	"fmt"
	"os"
	"time"
)

var startT time.Time

func main() {
	// The string to append is taken from the first command-line argument.
	snippet := os.Args[1]

	for reps := 10000; reps <= 100000; reps += 10000 {
		fmt.Printf("\nreps: %d\n", reps)

		// (a) Append to a bytes.Buffer with WriteString().
		startT = time.Now()
		buf1 := bytes.NewBufferString("")
		for i := 0; i < reps; i++ {
			buf1.WriteString(snippet)
		}
		seconds1 := time.Since(startT).Seconds()
		len1 := buf1.Len()
		// Print a short tail of the result as a sanity check. Note that
		// [len1-10:len1-1] selects 9 bytes, stopping one short of the end.
		fmt.Printf("last of %d: %s; %g s\n", len1,
			buf1.Bytes()[len1-10:len1-1], seconds1)

		// (b) Build the same result by repeated string concatenation.
		startT = time.Now()
		buf2 := ""
		for i := 0; i < reps; i++ {
			buf2 += snippet
		}
		seconds2 := time.Since(startT).Seconds()
		len2 := len(buf2)
		fmt.Printf("last of %d: %s; %g s\n", len2,
			buf2[len2-10:len2-1], seconds2)
	}
}

The result was more clear-cut than I had expected:
$ ./strcatz fldsjbcsldfbcdkhasfacde

reps: 10000
last of 230000: dkhasfacd; 0.001061327 s
last of 230000: dkhasfacd; 0.501937235 s

reps: 20000
last of 460000: dkhasfacd; 0.001232219 s
last of 460000: dkhasfacd; 2.42185104 s

reps: 30000
last of 690000: dkhasfacd; 0.00211587 s
last of 690000: dkhasfacd; 6.120059 s

reps: 40000
last of 920000: dkhasfacd; 0.002452257 s
last of 920000: dkhasfacd; 13.718863728 s

reps: 50000
last of 1150000: dkhasfacd; 0.0048127 s
last of 1150000: dkhasfacd; 18.529621865 s

reps: 60000
last of 1380000: dkhasfacd; 0.004334798 s
last of 1380000: dkhasfacd; 24.74539053 s

reps: 70000
last of 1610000: dkhasfacd; 0.005095205 s
last of 1610000: dkhasfacd; 32.982584273 s

reps: 80000
last of 1840000: dkhasfacd; 0.009039379 s
last of 1840000: dkhasfacd; 44.176404262 s

reps: 90000
last of 2070000: dkhasfacd; 0.008148565 s
last of 2070000: dkhasfacd; 53.003958242 s

reps: 100000
last of 2300000: dkhasfacd; 0.008536743 s
last of 2300000: dkhasfacd; 67.390456565 s

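The += method is not only slower to begin with, its run time also grows faster than linearly: ten times the repetitions take about 134 times as long. That is not really surprising, since Go strings are immutable and each += has to build a new string, copying everything collected so far. I do find it surprising, though, that the difference is so large: a factor of about 470 at 10000 repetitions and nearly 8000 at 100000. So, I guess, that question is answered: in a place where performance matters at all, don't use += for repeated string concatenation.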
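To make the more-than-linear growth a bit more concrete, here is a rough back-of-the-envelope model, not a measurement. It assumes that every += allocates a new string and copies the whole result so far plus the snippet, and that the buffer writes each byte once (copies made when the buffer regrows are ignored). The function names bytesCopiedConcat and bytesWrittenBuffer are made up for this sketch; the snippet length of 23 bytes is the one from the run above.

```go
package main

import "fmt"

// bytesCopiedConcat models repeated s += snippet under the assumption that
// each += copies the entire new result. After the i-th append the string is
// i*snippetLen bytes long, so the total is snippetLen * (1 + 2 + ... + reps).
func bytesCopiedConcat(reps, snippetLen int64) int64 {
	return snippetLen * reps * (reps + 1) / 2
}

// bytesWrittenBuffer models appending to a buffer: every byte is written
// once. Copies caused by growing the buffer are ignored in this sketch.
func bytesWrittenBuffer(reps, snippetLen int64) int64 {
	return snippetLen * reps
}

func main() {
	const snippetLen = 23 // length of the snippet used in the run above
	for _, reps := range []int64{10000, 100000} {
		fmt.Printf("reps %6d: += copies %d bytes, buffer writes %d bytes\n",
			reps,
			bytesCopiedConcat(reps, snippetLen),
			bytesWrittenBuffer(reps, snippetLen))
	}
}
```

In this model the copying done by += grows with the square of the number of repetitions, so ten times the repetitions means roughly a hundred times the copying. That is in the same ballpark as the measured jump from about 0.5 s to about 67 s, while the buffer's work only grows linearly.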