In the second post, I showed you the behavior of the garbage collector and how to use the tooling to see the latencies that the collector inflicts on your running application. I walked you through running a real web application and showed you how to generate GC traces and application profiles. Then I showed you how to interpret the output from these tools so you can find ways to improve the performance of your applications.
The final conclusion of that post was the same as the first: if you reduce stress on the heap, you will reduce the latency costs and therefore increase the application’s performance. The best strategy for being sympathetic with the garbage collector is to reduce the number or amount of allocations per unit of work being performed. In this post, I will show how the pacing algorithm is capable of identifying the optimal pace for a given workload over time.
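To make that strategy concrete, here is a contrived sketch (my own example, not code from the trace program) of reducing allocations per unit of work by reusing one buffer across iterations instead of allocating a new one each time.

// processAlloc allocates a fresh buffer for every document,
// creating one heap allocation per unit of work.
func processAlloc(docs [][]byte) int {
	var total int
	for _, doc := range docs {
		buf := make([]byte, len(doc)) // new heap allocation each iteration
		copy(buf, doc)
		total += len(buf)
	}
	return total
}

// processReuse allocates one buffer up front and reuses it,
// amortizing the allocation cost across all the work performed.
func processReuse(docs [][]byte) int {
	var total int
	buf := make([]byte, 0, 4096) // allocated once, grown only as needed
	for _, doc := range docs {
		buf = append(buf[:0], doc...) // reuse the existing backing array
		total += len(buf)
	}
	return total
}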
I am going to use the code located at this link.
https://github.com/ardanlabs/gotraining/tree/master/topics/go/profiling/trace
This program identifies the frequency at which a particular topic can be found in a collection of RSS news feed documents. The trace program contains different versions of a find algorithm to test different concurrency patterns. I will concentrate on the freq, freqConcurrent and freqNumCPU versions of the algorithm.
Note: I am running the code on a MacBook Pro with an Intel i9 processor with 12 hardware threads, using go1.12.7. You will see different results on different architectures, operating systems and versions of Go. The core results of the post should remain the same.
I will start with the freq version. It represents a non-concurrent, sequential version of the program and will provide a baseline for the concurrent versions that follow.
Listing 1
func freq(topic string, docs []string) int {
	var found int

	for _, doc := range docs {
		// Open the XML document for this file name.
		file := fmt.Sprintf("%s.xml", doc[:8])
		f, err := os.OpenFile(file, os.O_RDONLY, 0)
		if err != nil {
			log.Printf("Opening Document [%s] : ERROR : %v", doc, err)
			return 0
		}
		defer f.Close()

		// Read the entire document into memory.
		data, err := ioutil.ReadAll(f)
		if err != nil {
			log.Printf("Reading Document [%s] : ERROR : %v", doc, err)
			return 0
		}

		// Decode the raw XML into a document value.
		var d document
		if err := xml.Unmarshal(data, &d); err != nil {
			log.Printf("Decoding Document [%s] : ERROR : %v", doc, err)
			return 0
		}

		// Search the title and description of each item for the topic.
		for _, item := range d.Channel.Items {
			if strings.Contains(item.Title, topic) {
				found++
				continue
			}

			if strings.Contains(item.Description, topic) {
				found++
			}
		}
	}

	return found
}
Listing 1 shows the freq function. This sequential version ranges over the collection of file names and performs four actions: open, read, decode and search. It does this for each file, one at a time.
When I run this version of freq on my machine, I get the following results.
Listing 2
$ time ./trace
2019/07/02 13:40:49 Searching 4000 files, found president 28000 times.
./trace 2.54s user 0.12s system 105% cpu 2.512 total
You can see from the output of time that the program takes ~2.5 seconds to process the 4000 files. It would be nice to see what percentage of that time was spent in garbage collection. You can do that by generating a trace of the program. Since this is a program that starts and completes, you can use the trace package to generate the trace.
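The pattern for a program like this is small. Here is a minimal sketch of it (generic code of my own, not the trace program’s actual listing): start the trace before the work begins, stop it when the program is done, and write the trace data to stdout so it can be redirected to a file.

package main

import (
	"log"
	"os"
	"runtime/trace"
)

func main() {
	// Start recording trace data and write it to stdout.
	// Redirect stdout when running: ./trace > trace.out
	if err := trace.Start(os.Stdout); err != nil {
		log.Fatal(err)
	}
	defer trace.Stop()

	// ... the rest of the program runs while the trace records ...
}

The resulting file can then be reviewed with the go tool trace command.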
Listing 3