TL;DR
Serialization using runtime reflection is slow! Use reflection to generate faster serialization code when performance is needed.
BenchmarkReflectionMarshal
BenchmarkReflectionMarshal-12 302158 3333 ns/op
BenchmarkHardcodedMarshal
BenchmarkHardcodedMarshal-12 863041 1222 ns/op
BenchmarkGeneratedMarshal
BenchmarkGeneratedMarshal-12 1000000 1072 ns/op
GO TAGS
Go struct tags are a common way to attach additional metadata to the fields of a struct. For example, the following snippet shows how they are typically used for JSON serialization and deserialization (SerDe):
type Trade struct {
Id int64 `json:"id"`
DateTime time.Time `json:"date_time"`
Symbol string `json:"symbol,omitempty"`
Price float64 `json:"price"`
Amount float64 `json:"amount"`
}
Tags give us some control over whether and how a field is serialized, which is pretty straightforward:
without tags                                        with tags
{                                                   {
"Id": 0,                                            "id": 0,
"DateTime": "2022-04-30T10:09:22.686057+02:00",     "date_time": "2022-04...",
"Symbol": "BTC",                                    "symbol": "BTC",
"Price": 1,                                         "price": 1,
"Amount": 2                                         "amount": 2
}                                                   }
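The effect of the tags can be reproduced with a small program. This is a minimal sketch: the Plain and Tagged types and the toJSON helper are made up for illustration and are trimmed-down stand-ins for the Trade struct above. It also shows what omitempty does when Symbol is left empty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Plain has no tags, so encoding/json uses the Go field names as keys.
type Plain struct {
	Id     int64
	Symbol string
	Price  float64
}

// Tagged renames the keys and drops an empty Symbol via omitempty.
type Tagged struct {
	Id     int64   `json:"id"`
	Symbol string  `json:"symbol,omitempty"`
	Price  float64 `json:"price"`
}

// toJSON (hypothetical helper) marshals v and returns the result as a string.
func toJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(toJSON(Plain{Id: 1, Price: 2}))  // {"Id":1,"Symbol":"","Price":2}
	fmt.Println(toJSON(Tagged{Id: 1, Price: 2})) // {"id":1,"price":2}
}
```

Without tags the empty Symbol is still emitted; with omitempty it disappears from the output.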
LEVERAGING GO TAGS
How are these tags used by the json.Unmarshal functionality that Go provides out of the box, and can we make use of our own tags? For an in-depth look at JSON and struct handling, I refer to https://go.dev/blog/json.
Another use case is leveraging tags for struct validation.
Multiple tags can be used by separating the tags with a space:
Price float64 `json:"price" validate:"gte=0"`
As it turns out, most of the functionality depends on the reflection of the struct during runtime.
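To make that concrete, here is a minimal sketch of how a library can read tags at runtime with the reflect package. The tagOf helper and the shortened Trade type are hypothetical; only reflect.Type.FieldByName and StructTag.Lookup come from the standard library.

```go
package main

import (
	"fmt"
	"reflect"
)

// Trade is a shortened stand-in for the struct used in this post.
type Trade struct {
	Symbol string
	Price  float64 `json:"price" validate:"gte=0"`
}

// tagOf (hypothetical helper) returns the value stored under key in the tag
// of the named field, and whether that key was present at all.
// v must be a struct value; FieldByName panics for other kinds.
func tagOf(v any, field, key string) (string, bool) {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return "", false
	}
	return f.Tag.Lookup(key)
}

func main() {
	for _, key := range []string{"json", "validate", "map"} {
		value, ok := tagOf(Trade{}, "Price", key)
		fmt.Printf("%s: %q (present: %t)\n", key, value, ok)
	}
}
```

Each library simply looks up its own key, which is why several tags can live side by side on one field.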
REFLECTION PERFORMANCE
Reflection is used to introspect struct values; it is a versatile tool for getting metadata from a struct, including its tags. Reflection comes with a performance penalty, as we will demonstrate with a toy application that does a simple serialization of a struct using our own tag name.
hard-coded reference
As a simple (non-recursive) serialization, we create the following hardcoded reference:
func MarshalHardcoded(object model.Trade) string {
return fmt.Sprintf("id:%d\ndate_time:%s\nsymbol:%s\nprice:%.2f\namount:%.2f\n",
object.Id,
object.DateTime.Format(time.RFC3339),
object.Symbol,
object.Price,
object.Amount)
}
which results in the following output:
id:1
date_time:2022-04-30T10:55:25+02:00
symbol:BTC
price:60000.00
amount:0.00
A pretty simple serialization, and not even the most performant one, but it serves our purpose.
reflection
Now for the reflection part. We set our tagName to "map" and leave out pointers and recursion, since they would make the example more convoluted.
const tagName = "map"
func MarshalReflection(st any) string {
// ValueOf returns a Value initialized to the concrete value stored in st.
rv := reflect.ValueOf(st)
// TypeOf returns the reflection Type that represents the dynamic type of st.
rt := reflect.TypeOf(st)
buf := new(bytes.Buffer)
// NumField returns a struct type's field count.
// It panics if the type's Kind is not Struct.
for i := 0; i < rt.NumField(); i++ {
tf := rt.Field(i)
vf := rv.Field(i)
tag := tf.Tag.Get(tagName)
// skip fields that hold the zero value of their type
if vf.IsZero() {
continue
}
switch vf.Interface().(type) {
case int64:
buf.WriteString(fmt.Sprintf("%s:%d\n", tag, vf.Int()))
case time.Time:
buf.WriteString(fmt.Sprintf("%s:%s\n", tag,
vf.Interface().(time.Time).Format(time.RFC3339)))
case string:
buf.WriteString(fmt.Sprintf("%s:%s\n", tag, vf.String()))
case float64:
buf.WriteString(fmt.Sprintf("%s:%.2f\n", tag, vf.Float()))
}
}
return buf.String()
}
PERFORMANCE
For performance measurement, we use Go's built-in benchmarking functionality.
func BenchmarkReflectionMarshal(b *testing.B) {
tr := model.Trade{
Id: 1,
DateTime: time.Now(),
Symbol: "BTC",
Price: 60000.00,
Amount: 1.0,
}
for i := 0; i < b.N; i++ {
MarshalReflection(tr)
}
}
func BenchmarkHardcodedMarshal(b *testing.B) {
tr := model.Trade{
Id: 1,
DateTime: time.Now(),
Symbol: "BTC",
Price: 60000.00,
Amount: 1.0,
}
for i := 0; i < b.N; i++ {
MarshalHardcoded(tr)
}
}
with the following outcome:
goos: darwin
goarch: amd64
pkg: tag-names
cpu: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
BenchmarkReflectionMarshal
BenchmarkReflectionMarshal-12 348876 3343 ns/op
BenchmarkHardcodedMarshal
BenchmarkHardcodedMarshal-12 844834 1242 ns/op
So reflection, in this case, is about 2.7x slower than its hardcoded counterpart.
GO TEMPLATING
Runtime reflection is slow, so let's use reflection to generate the code ahead of time instead:
//go:generate go run gen.go
package main
import (
"bytes"
"fmt"
"log"
"os"
"reflect"
"strings"
"tag-names/model"
"text/template"
"time"
)
const tagName = "map"
type MetaStruct struct {
Package string
Name string
Type string
Marshal string
}
func main() {
println("Generating some awesome go code")
templates := template.Must(template.New("templates").ParseGlob("../codegen_templates/*"))
createFile("../marshalling", "trader.go", "marshal.tpl", model.Trade{}, templates)
}
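func createFile(outputPath, outputFileName, templateFileName string, object any, templates *template.Template) {
out, err := os.Create(fmt.Sprintf("%s/%s", outputPath, outputFileName))
if err != nil {
// stop here: without the output file there is nothing to write to
log.Fatalf("creating output file: %v", err)
}
defer out.Close()
// createMetaStruct (not shown here) fills a MetaStruct via reflection
if err := templates.ExecuteTemplate(out, templateFileName, createMetaStruct(object)); err != nil {
log.Fatalf("executing template: %v", err)
}
}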
The first line, the //go:generate directive, lets us generate a model.Trade marshaling routine by typing:
go generate ./...
The generator fills our MetaStruct using reflection.
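The createMetaStruct function is not shown in this post. The following is a minimal sketch of what such a function could look like, not the actual implementation (see the repository for that). It assumes that model.Trade carries `map` tags; the local Trade type stands in for it, and the verb mapping mirrors the type switch in MarshalReflection.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"time"
)

const tagName = "map"

// Trade stands in for model.Trade; the `map` tags are an assumption.
type Trade struct {
	Id       int64     `map:"id"`
	DateTime time.Time `map:"date_time"`
	Symbol   string    `map:"symbol"`
	Price    float64   `map:"price"`
	Amount   float64   `map:"amount"`
}

type MetaStruct struct {
	Package string
	Name    string
	Type    string
	Marshal string
}

var timeType = reflect.TypeOf(time.Time{})

// createMetaStruct walks the struct fields once, at generation time, and
// builds the source text of a fmt.Sprintf call for the tagged fields.
func createMetaStruct(object any) MetaStruct {
	rt := reflect.TypeOf(object)
	var format strings.Builder
	var args []string
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		tag, ok := f.Tag.Lookup(tagName)
		if !ok {
			continue // untagged fields are not serialized
		}
		verb, arg := "", "object."+f.Name
		switch {
		case f.Type == timeType:
			verb, arg = "%s", arg+".Format(time.RFC3339)"
		case f.Type.Kind() == reflect.Int64:
			verb = "%d"
		case f.Type.Kind() == reflect.String:
			verb = "%s"
		case f.Type.Kind() == reflect.Float64:
			verb = "%.2f"
		default:
			continue // unsupported kinds are skipped in this sketch
		}
		// `\n` is a raw string: the generated source gets the escape sequence.
		format.WriteString(tag + ":" + verb + `\n`)
		args = append(args, arg)
	}
	marshal := `""`
	if len(args) > 0 {
		marshal = "fmt.Sprintf(\"" + format.String() + "\", " + strings.Join(args, ", ") + ")"
	}
	return MetaStruct{
		Package: rt.PkgPath(),
		Name:    rt.Name(),
		Type:    rt.String(),
		Marshal: marshal,
	}
}

func main() {
	meta := createMetaStruct(Trade{})
	fmt.Println(meta.Name)
	fmt.Println(meta.Marshal)
}
```

The reflection cost is paid once when the generator runs; the generated function contains no reflection at all.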
The template looks like this:
package marshalling
// this code is generated by go generate
// DO NOT EDIT!
import (
"{{.Package}}"
"fmt"
"time"
)
func Marshal{{.Name}}(object {{.Type}}) string{
return {{.Marshal}}
}
Resulting in :
package marshalling
// this code is generated by go generate
// DO NOT EDIT!
import (
"tag-names/model"
"fmt"
"time"
)
func MarshalTrade(object model.Trade) string{
return fmt.Sprintf("id:%d\ndate_time:%s\nsymbol:%s\nprice:%.2f\namount:%.2f\n", object.Id, object.DateTime.Format(time.RFC3339), object.Symbol, object.Price, object.Amount)
}
This performs on par with the hand-written hardcoded version (see the benchmark numbers in the TL;DR) :-)
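The rendering step can also be tried in isolation, without template files on disk. The sketch below is an assumption-laden variant of the generator: it parses the template from an inline string instead of ParseGlob, renders into a buffer, and uses a hypothetical render helper and a shortened Marshal expression.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

type MetaStruct struct {
	Package string
	Name    string
	Type    string
	Marshal string
}

// marshalTpl is an inline copy of the template shown above.
const marshalTpl = `package marshalling

// this code is generated by go generate
// DO NOT EDIT!
import (
	"{{.Package}}"
	"fmt"
	"time"
)

func Marshal{{.Name}}(object {{.Type}}) string {
	return {{.Marshal}}
}
`

// render (hypothetical helper) parses tplText and executes it with meta,
// returning the generated source as a string.
func render(tplText string, meta MetaStruct) (string, error) {
	tpl, err := template.New("marshal").Parse(tplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tpl.Execute(&buf, meta); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	src, err := render(marshalTpl, MetaStruct{
		Package: "tag-names/model",
		Name:    "Trade",
		Type:    "model.Trade",
		Marshal: `fmt.Sprintf("id:%d\ndate_time:%s\n", object.Id, object.DateTime.Format(time.RFC3339))`,
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(src)
}
```

Rendering into a buffer first makes it easy to inspect or post-process the generated source before writing it to a file.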
CONCLUSION
In use cases where serialization performance matters, consider generating the serialization code beforehand when possible.
Of course, the code is available on GitHub: https://github.com/rutjes-dev/tag-names