Automata based member name lookup for improve string key deserialization by neuecc · Pull Request #112 · MessagePack-CSharp/MessagePack-CSharp · GitHub

Automata based member name lookup for improve string key deserialization #112

Merged: neuecc merged 13 commits into master from name-automata-lookup on Aug 26, 2017
neuecc (Member) commented Aug 24, 2017
Current benchmark

Deserialization micro benchmark with BenchmarkDotNet. The target object has 9 members, and all values are zero.

// [MessagePackObject][ProtoContract] and key, tags...
public class SerializerTarget
{
    public int MyProperty1 { get; set; }
    public int MyProperty2 { get; set; }
    public int MyProperty3 { get; set; }
    public int MyProperty4 { get; set; }
    public int MyProperty5 { get; set; }
    public int MyProperty6 { get; set; }
    public int MyProperty7 { get; set; }
    public int MyProperty8 { get; set; }
    public int MyProperty9 { get; set; }
}

[Benchmark(Baseline = true)]
public SerializerTarget IntKey()
{
    return MessagePackSerializer.Deserialize<SerializerTarget>(intKeyObject);
}

| Method | Mean | Error | Scaled | Gen 0 | Allocated |
|---|---|---|---|---|---|
| IntKey | 100.5 ns | NA | 1.00 | 0.0132 | 56 B |
| StringKey | 519.0 ns | NA | 5.17 | 0.0124 | 56 B |
| Typeless_IntKey | 221.0 ns | NA | 2.20 | 0.0131 | 56 B |
| Typeless_StringKey | 636.4 ns | NA | 6.33 | 0.0124 | 56 B |
| MsgPackCliMap | 2,047.4 ns | NA | 20.38 | 0.1431 | 608 B |
| MsgPackCliArray | 578.1 ns | NA | 5.75 | 0.0410 | 176 B |
| ProtobufNet | 329.1 ns | NA | 3.28 | 0.0319 | 136 B |
| JsonNetString | 3,852.3 ns | NA | 38.34 | 0.6790 | 2864 B |
| JsonNetByteArray | 4,053.0 ns | NA | 40.34 | 1.4267 | 6000 B |
| JilString | 655.1 ns | NA | 6.52 | 0.0362 | 152 B |
| JilByteArray | 1,694.2 ns | NA | 16.86 | 0.8450 | 3552 B |

IntKey, StringKey, Typeless_IntKey, and Typeless_StringKey are MessagePack for C# options. JsonNetString and JilString deserialize from a string. JsonNetByteArray and JilByteArray deserialize from a UTF-8 byte[]. Deserialization normally reads from a Stream, so in practice the data is restored from byte[] rather than from a string.

MessagePack for C# with IntKey is the fastest. StringKey is slower than IntKey because each key has to be matched against the member names as a character string. All MessagePack for C# options achieve zero memory allocation in the deserialization process, apart from the deserialized object itself (the 56 B shown).

StringKey deserialization is not so fast

The difference from IntKey is the cost of looking up a dictionary.
Jil uses an automata-based member name lookup technique.
It works well and can improve string key deserialization performance.

Here is the performance of the initial automata design.
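
To illustrate the general idea, here is a minimal sketch of an automaton that matches member names directly on the UTF-8 key bytes, so no string is decoded and no hash is computed over the whole key. The type and method names (`MemberNameAutomaton`, `Add`, `TryGetValue`) are hypothetical and chosen for illustration only; this is not the implementation in this pull request or in Jil.

```csharp
using System;
using System.Text;

// Hypothetical sketch: a byte-level automaton (trie) mapping UTF-8 member names to member indexes.
public sealed class MemberNameAutomaton
{
    private sealed class Node
    {
        // One transition per possible byte value; null means "no such transition".
        public readonly Node[] Next = new Node[256];
        // -1 means "no member name ends at this state".
        public int MemberIndex = -1;
    }

    private readonly Node root = new Node();

    // Registers a member name by adding one state per UTF-8 byte.
    public void Add(string memberName, int memberIndex)
    {
        var node = root;
        foreach (byte b in Encoding.UTF8.GetBytes(memberName))
        {
            var child = node.Next[b];
            if (child == null)
            {
                child = new Node();
                node.Next[b] = child;
            }
            node = child;
        }
        node.MemberIndex = memberIndex;
    }

    // Walks the raw key bytes one transition at a time, without allocating.
    public bool TryGetValue(byte[] buffer, int offset, int count, out int memberIndex)
    {
        var node = root;
        for (int i = 0; i < count; i++)
        {
            node = node.Next[buffer[offset + i]];
            if (node == null)
            {
                memberIndex = -1;
                return false;
            }
        }
        memberIndex = node.MemberIndex;
        return memberIndex >= 0;
    }
}
```

A deserializer built this way would register each member name once up front and then call `TryGetValue` with the key bytes read from the MessagePack map, using the returned index to select the member to assign.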

| Method | Mean | Error | Scaled | Gen 0 | Allocated |
|---|---|---|---|---|---|
| IntKey | 107.6 ns | NA | 1.00 | 0.0132 | 56 B |
| Automata | 378.5 ns | NA | 3.52 | 0.0129 | 56 B |
| Hashtable | 580.3 ns | NA | 5.39 | 0.0124 | 56 B |

Next, I will inline the lookup into the generated IL code to make it faster.

neuecc (Member, Author) commented Aug 25, 2017

I've finished inlining phase 1.

image

I can optimize further, so it will get faster still.

The currently generated code is shown here (disassembled with ILSpy).
(This code is slightly broken and still needs fixing.)

image

This searches for the member names "FooBarBaz", "FooBarPoo", and "AprilJuneJuly".
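
Since the screenshot is not reproduced here, the following is a hand-written sketch of what an inlined lookup for these three names could look like. It assumes the key bytes are compared in 8-byte chunks packed into `ulong` values. The chunk size, the member indexes 0, 1, and 2, and every name in the code (`InlinedNameLookup`, `ReadChunk`, `Find`) are assumptions for illustration, not the code generated by this pull request.

```csharp
using System;
using System.Text;

// Hypothetical sketch of an inlined, branch-based lookup for three fixed member names.
public static class InlinedNameLookup
{
    // Packs up to 8 bytes into a ulong (first byte in the lowest bits), zero-padding short tails.
    private static ulong ReadChunk(byte[] bytes, int offset, int count)
    {
        ulong chunk = 0;
        int n = Math.Min(count, 8);
        for (int i = 0; i < n; i++)
        {
            chunk |= (ulong)bytes[offset + i] << (8 * i);
        }
        return chunk;
    }

    private static ulong Chunk(string ascii)
    {
        byte[] b = Encoding.ASCII.GetBytes(ascii);
        return ReadChunk(b, 0, b.Length);
    }

    // Precomputed chunks; generated code would embed these as literal constants.
    private static readonly ulong HeadFooBarBa = Chunk("FooBarBa");
    private static readonly ulong HeadFooBarPo = Chunk("FooBarPo");
    private static readonly ulong HeadAprilJun = Chunk("AprilJun");
    private static readonly ulong TailZ = Chunk("z");
    private static readonly ulong TailO = Chunk("o");
    private static readonly ulong TailEJuly = Chunk("eJuly");

    // Returns the assumed member index (0, 1, 2), or -1 if the key matches no member name.
    public static int Find(byte[] key)
    {
        if (key.Length == 9)
        {
            ulong head = ReadChunk(key, 0, 8);
            ulong tail = ReadChunk(key, 8, 1);
            if (head == HeadFooBarBa && tail == TailZ) return 0; // "FooBarBaz"
            if (head == HeadFooBarPo && tail == TailO) return 1; // "FooBarPoo"
            return -1;
        }
        if (key.Length == 13)
        {
            if (ReadChunk(key, 0, 8) == HeadAprilJun && ReadChunk(key, 8, 5) == TailEJuly)
            {
                return 2; // "AprilJuneJuly"
            }
        }
        return -1;
    }
}
```

With this sketch, `InlinedNameLookup.Find(Encoding.UTF8.GetBytes("FooBarPoo"))` returns 1, and a key that matches none of the three names returns -1. Checking the length first keeps the zero-padded tail comparison unambiguous.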

neuecc (Member, Author) commented Aug 26, 2017

Latest result.
I've got great performance for string keys.

image

neuecc changed the title from "[WIP]Automata based member name lookup for improve string key deserialization" to "Automata based member name lookup for improve string key deserialization" on Aug 26, 2017
neuecc merged commit bcedbce into master on Aug 26, 2017
neuecc deleted the name-automata-lookup branch on August 26, 2017 20:53
AArnott added a commit to AArnott/MessagePack-CSharp that referenced this pull request Mar 11, 2023