
Write_Performance

Jim Fulton edited this page Jan 23, 2017 · 2 revisions

Write Performance

In addition to storing native (pickle) data, Newt converts the native data to JSON and saves that as well. I used zodbshootout to evaluate the impact of saving this extra data. The test setup was:

  • A Mac with a 2.3 GHz Intel i7 quad-core processor, 16GB of RAM, and an SSD.
  • Python 3.5
  • History-free RelStorage configuration. This is much faster than history-preserving, should be the default, and is the default configuration for Newt DB.

Basic Newt (JSON index)

For the initial test, I compared plain RelStorage with Newt using the default JSON index. Here are the results for various zodbshootout settings. The values are the number of objects read or written per second, so larger numbers are better.

concurrency: 2, object size: 96

"What", "RelStorage", "Newt"
"Add 1000 Objects", 13027, 5182
"Update 1000 Objects", 12140, 5074
"Read 1000 Cold Objects", 10567, 10283

concurrency: 4, object size: 96

"What", "RelStorage", "Newt"
"Add 10 Objects", 2411, 1699
"Update 10 Objects", 2465, 1819
"Read 10 Cold Objects", 9034, 8881

concurrency: 4, object size: 999

"What", "RelStorage", "Newt"
"Add 10 Objects", 1831, 1298
"Update 10 Objects", 1862, 1372
"Read 10 Cold Objects", 7548, 7404

concurrency: 8, object size: 999

"What", "RelStorage", "Newt"
"Add 10 Objects", 1218, 1110
"Update 10 Objects", 1146, 1041
"Read 10 Cold Objects", 10052, 9795

concurrency: 16, object size: 999

"What", "RelStorage", "Newt"
"Add 10 Objects", 1555, 1061
"Update 10 Objects", 1939, 1071
"Read 10 Cold Objects", 9202, 7977
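To make the write slowdown easier to compare across settings, the "Add" rows above can be reduced to a Newt-to-RelStorage ratio. The following is a small sketch; the numbers are copied from the tables above, and the names ADD_RESULTS and relative_throughput are just illustrative helpers, not part of zodbshootout or Newt.

```python
# "Add" throughput copied from the tables above.
# Key: (concurrency, object size, objects per transaction)
# Value: (RelStorage, Newt)
ADD_RESULTS = {
    (2, 96, 1000): (13027, 5182),
    (4, 96, 10): (2411, 1699),
    (4, 999, 10): (1831, 1298),
    (8, 999, 10): (1218, 1110),
    (16, 999, 10): (1555, 1061),
}


def relative_throughput(relstorage, newt):
    """Return Newt throughput as a fraction of RelStorage throughput."""
    return newt / relstorage


ratios = {
    settings: round(relative_throughput(*values), 2)
    for settings, values in ADD_RESULTS.items()
}

for (concurrency, size, count), ratio in ratios.items():
    print("concurrency={:2d} size={:3d} objects={:4d}: Newt/RelStorage = {:.2f}"
          .format(concurrency, size, count, ratio))
```

For the object size 999 runs, the ratio goes from about 0.71 at concurrency 4 to about 0.91 at concurrency 8, then drops to about 0.68 at concurrency 16. Given the run-to-run variation, these ratios should be read as rough indications only.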

Some things to note:

  • At concurrency 16, the Mac's CPU was often fully consumed.

  • Newt writes are significantly slower than writes with regular RelStorage on Postgres. Reads are roughly comparable.

  • Times can vary quite a bit, so only really large differences should be considered significant. In the results above, Newt reads seem to be slightly slower, but the results in the next section show the opposite, and the regular RelStorage results there differ quite a bit from those above.

  • The default zodbshootout settings use very large transactions consisting of extremely small objects. In my experience, transactions typically involve smaller numbers of larger objects.

    Note that ZODB applications that make aggressive use of ZODB-based catalogs will often involve larger numbers of objects due to the many BTrees that must be updated. Newt applications won't use ZODB-based catalogs and will tend to have much smaller transactions.

  • Newt adds two additional components of load:

    • On the client: computation of a JSON representation
    • On the server: saving data to an additional table and updating a JSON index.

    For throughput, client computations aren't as problematic, because they can happen in parallel. Because zodbshootout measures throughput, we see smaller differences between Newt and regular RelStorage as concurrency increases, until CPU resources are exhausted.
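To get a rough feel for the client-side component, one can time pickling alone against pickling plus producing a JSON representation. This is only a sketch: json.dumps stands in for Newt's conversion step (it is not Newt's actual converter), and the sample state, function names, and repeat count are assumptions chosen for illustration.

```python
import json
import pickle
import time


def serialize_native(state):
    """Serialize state as a pickle only (the native representation)."""
    return pickle.dumps(state)


def serialize_with_json(state):
    """Serialize state as a pickle and also as JSON.

    json.dumps is a stand-in for the extra client-side work; Newt's real
    conversion is different.
    """
    return pickle.dumps(state), json.dumps(state)


def time_calls(func, state, repeat=10000):
    """Return the elapsed seconds for calling func(state) repeat times."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(state)
    return time.perf_counter() - start


# Hypothetical object state, sized like the larger test objects.
state = {"data": "x" * 999, "count": 1}

native_seconds = time_calls(serialize_native, state)
both_seconds = time_calls(serialize_with_json, state)
print("pickle only: {:.4f}s, pickle + JSON: {:.4f}s".format(
    native_seconds, both_seconds))
```

Because this work happens in each client process, it adds to the latency of every commit but can be spread across clients, which is consistent with the throughput gap narrowing as concurrency increases.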

Newt with a text index

Most applications are likely to use additional indexes, especially text indexes. To look at the impact of an additional index, I added a text index on the data attribute of the test objects:

concurrency: 4, object size: 96

"What", "RelStorage", "Newt"
"Add 10 Objects", 1751, 1312
"Update 10 Objects", 1780, 1382
"Read 10 Cold Objects", 8825, 8704

concurrency: 4, object size: 999

"What", "RelStorage", "Newt"
"Add 10 Objects", 1571, 728
"Update 10 Objects", 1613, 640
"Read 10 Cold Objects", 7040, 7530

concurrency: 8, object size: 999

"What", "RelStorage", "Newt"
"Add 10 Objects", 1170, 579
"Update 10 Objects", 1138, 543
"Read 10 Cold Objects", 9702, 9759

concurrency: 16, object size: 999

"What", "RelStorage", "Newt"
"Add 10 Objects", 1184, 450
"Update 10 Objects", 1076, 489
"Read 10 Cold Objects", 9245, 10236

Some things to note:

  • Adding a text index significantly reduced Newt write performance, most visibly for the larger objects. Read performance was not noticeably affected.
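For readers unfamiliar with Postgres text indexes, the following sketch builds the kind of statement involved: a GIN expression index over to_tsvector applied to one JSON attribute. It is an illustration under assumptions, not necessarily the exact index used in these tests: the table name newt, the JSONB column name state, the 'english' text-search configuration, and the helper text_index_sql are all assumed here.

```python
def text_index_sql(attribute, table="newt", column="state", config="english"):
    """Build a CREATE INDEX statement for a full-text index on one JSON attribute.

    The default table, column, and config names are assumptions for
    illustration. Only plain identifiers are accepted, so nothing is
    interpolated into the SQL unchecked.
    """
    for name in (attribute, table, column, config):
        if not name.isidentifier():
            raise ValueError("not a plain identifier: {!r}".format(name))
    return (
        "CREATE INDEX {table}_{attribute}_text_idx ON {table} "
        "USING gin (to_tsvector('{config}', {column} ->> '{attribute}'))"
    ).format(table=table, attribute=attribute, column=column, config=config)


# The tests above indexed the "data" attribute of the test objects.
statement = text_index_sql("data")
print(statement)
```

Postgres has to keep such an index up to date on every insert and update, which is the extra server-side work reflected in the write numbers above.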

Conclusion

  • Basic Newt reduces write performance. Much of the added cost is on the client, which is mitigated by parallelism with more clients.

    IMO, the slowdown isn't enough to be of significant concern for the basic case.

  • Adding a text index slows things even further. Presumably, adding more indexes would slow processing further still. For applications that need high write throughput, and especially low write latency, this could be a problem.

    Something to keep in mind, however, is that if the alternative is ZODB-based catalogs, these additional indexes, implemented as catalogs, are likely to:

    • Require much more client-side computation,
    • Cause many more objects to be written,
    • Put more load on client caches, causing much more client load, and
    • Cause many more conflicts to be resolved and transactions to be retried.

    It would be interesting to compare updating catalog-based indexes with Postgres-based indexes.

Newt will provide the ability to generate JSON and update indexes asynchronously. This should reduce write latency, but perhaps not help throughput, since the same work will need to be done.
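The latency versus throughput trade-off of asynchronous updates can be sketched with a queue and a background worker. This is a toy illustration, not Newt's design: the class AsyncJsonWriter, its in-memory dictionaries (standing in for the object table and the JSON table), and the use of json.dumps for the conversion are all assumptions.

```python
import json
import pickle
import queue
import threading


class AsyncJsonWriter:
    """Toy model: store the pickle at commit time, produce JSON later."""

    def __init__(self):
        self.pickles = {}    # stands in for the primary object table
        self.json_docs = {}  # stands in for the JSON table and its indexes
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def commit(self, oid, state):
        """Save the native data and return; JSON work is deferred."""
        self.pickles[oid] = pickle.dumps(state)
        self._pending.put((oid, state))

    def _run(self):
        # Background worker: does the same JSON work, just off the commit path.
        while True:
            oid, state = self._pending.get()
            self.json_docs[oid] = json.dumps(state)
            self._pending.task_done()

    def wait(self):
        """Block until all deferred JSON updates have been applied."""
        self._pending.join()


writer = AsyncJsonWriter()
for oid in range(3):
    writer.commit(oid, {"data": "object {}".format(oid)})
writer.wait()
print(sorted(writer.json_docs))
```

In this model, commit returns as soon as the pickle is stored, so write latency drops, but the worker still performs every conversion, so total work (and therefore peak throughput) is unchanged. Searches may also briefly see stale JSON until the worker catches up.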
