Extend cachebench with value validation #131
Conversation
Do you mind sharing the use case scenario for this diff? If the intention is to validate that the value written is the same as the value read, wouldn't consistency checking be enough?
The main intention is to see how much overhead reading a value introduces when using different memory mediums (e.g., DRAM vs. CXL-mem or PMEM). We would like to measure this overhead for different value sizes. For consistency checking, there is some additional overhead and, I think, also some additional requirements on values.
@haowu14 We discussed this in the call with Intel to merge PRs that are independent and useful. In this case, as @igchor pointed out, using the consistency-checking mode to read the value adds other overheads, and the intent here is to trigger reading the item in PMem-like scenarios where bandwidth is a concern for a byte-addressable medium.
cachelib/cachebench/cache/Cache.h
Outdated
@@ -352,6 +361,9 @@ class Cache {
   // tracker for consistency monitoring.
   std::unique_ptr<ValueTracker> valueTracker_;
 
+  // expected value of all items in Cache.
+  std::optional<std::string> expectedValue_;
This sounds like we are validating the value, but we are only validating a few bytes in each item. So either it is named incorrectly or we are intending to do something, but the code does not do all of it.
What's the intent with adding this value validation? My understanding was that this was to read the item's payload in its entirety so that we can benchmark any DRAM or PMem bandwidth numbers. If that's the case, why are we reading only a subset of the item?
Your understanding is correct. I believe we are verifying all bytes of the value right now. expectedValue_ is being set to hardcodedString_, which is 4MB. The std::min<size_t>(expected.size(), getSize(it)) part is just to make sure that we are not reading outside of the item. This is analogous to: https://github.com/facebook/CacheLib/blob/main/cachelib/cachebench/cache/Cache-inl.h#L624
Is this reasoning correct?
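The bounded comparison described above can be sketched as follows. This is a minimal illustration, not the actual Cache code: the function name and parameters are hypothetical, but the clamping with std::min mirrors the expression quoted from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch of the bounded value comparison discussed above.
// Compares the item's payload against the expected value, but never reads
// past the item's own size.
bool valueMatches(const std::string& expected,
                  const char* itemData,
                  std::size_t itemSize) {
  // Clamp the comparison length, mirroring the PR's
  // std::min<size_t>(expected.size(), getSize(it)), so we never read
  // outside the item's allocation.
  const std::size_t len = std::min(expected.size(), itemSize);
  return std::memcmp(expected.data(), itemData, len) == 0;
}
```

Because only min(expected.size(), itemSize) bytes are compared, a 4MB expected value safely covers items of any smaller size.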
expectedValue_ is being set to hardcodedString_ which is 4MB.
@igchor The value is set by the caller (stressor) in a certain way, and we are leaking that into the cache. The fact that expectedValue[0 .. item.getSize() - 1] is the expected value is assumed in the cache.
Taking a step back, we are not really building this feature to check the item's content. That functionality is already handled by the consistency checking feature. We want this feature to simply read the item payload. Can this be done by the CacheStressor instead? We also don't need to compare it against any expected value. As in, the result of that comparison is simply irrelevant, so there is no need to abort if the value is incorrect.
You are right that we are not really building this to check the item's content. We came up with the idea of using memcmp to make sure that the code which reads the value is not optimized out by the compiler.
I don't think we can easily move the checking to the CacheStressor (all the latency checking code is done inside Cache). However, we can rewrite validateValue to not do any comparison. We can just iterate over the value and add a volatile somewhere to make sure the code is not optimized out (e.g. calculate the number of zero bytes in the value and assign this number to a local volatile var). Would this be acceptable?
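The volatile idea above can be sketched like this. It is illustrative only and not the PR's final code; the function name is hypothetical. The point is that storing the loop's result into a volatile local forces the compiler to actually perform the byte reads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch of the volatile approach described above: walk every byte of the
// value, count the zero bytes, and publish the count through a volatile
// local so the compiler cannot elide the reads.
std::size_t touchValue(const std::uint8_t* data, std::size_t size) {
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] == 0) {
      ++zeros;
    }
  }
  // The volatile store keeps the loop from being optimized away.
  volatile std::size_t sink = zeros;
  return sink;
}
```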
@igchor yeah, that seems more reasonable and simpler, and we don't need to think about terminating the run based on correctness. There is probably a way to do it without the volatile (check out folly::doNotOptimizeAway).
Ok, I've changed the implementation. Now I use std::accumulate on the value and folly::doNotOptimizeAway on the sum. Let me know if you think this is OK.
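The updated approach is roughly the following sketch. The actual PR keeps the sum alive with folly::doNotOptimizeAway; since folly is not assumed here, a volatile store is used as a portable stand-in, and the function name is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>

// Sum every byte of the value with std::accumulate, then keep the sum
// alive so the reads are not optimized out. The real code uses
// folly::doNotOptimizeAway(sum); the volatile store below is a portable
// stand-in for illustration.
std::uint64_t touchAllBytes(const std::uint8_t* data, std::size_t size) {
  const std::uint64_t sum =
      std::accumulate(data, data + size, std::uint64_t{0});
  volatile std::uint64_t sink = sum;  // stand-in for folly::doNotOptimizeAway
  return sink;
}
```

Note the 64-bit accumulator: seeding std::accumulate with a plain int would overflow for large values, so the initial value's type matters.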
cachelib/cachebench/cache/Cache.h
Outdated
@@ -227,9 +230,15 @@ class Cache {
   // @param keys list of keys that the stressor uses for the workload.
   void enableConsistencyCheck(const std::vector<std::string>& keys);
 
+  // enables validating all values on find. Each value is compared to
+  // expectedValue.
+  void enableValueValidating(const std::string& expectedValue);
It sounds like this is intended to be called before the cache is used for the stress test. Why not make it part of the cache initialization instead?
In case of consistency checking, the enable happens later since the keys are generated by inserting into the cache and then we enable the consistency checking for the keys added to the cache. It is a bit convoluted, but we don't need to do the same.
Inside the Cache ctor we do not know what the expectedValue is yet. This is why I call enableValueValidating from CacheStressor, which is responsible for generating the values (hardcodedString_ for now).
I removed this function and added a bool param to the Cache ctor - is that OK?
@igchor the base rev seems incorrect. PTAL and update. Once that is done, I'll try to import it.
@@ -50,8 +50,10 @@ uint64_t Cache<Allocator>::fetchNandWrites() const {
 template <typename Allocator>
 Cache<Allocator>::Cache(const CacheConfig& config,
                         ChainedItemMovingSync movingSync,
-                        std::string cacheDir)
+                        std::string cacheDir,
+                        bool touchValue)
can touchValue be part of the CacheConfig ?
nvm. I understand why it might not be a good idea to do that. We are really bending over backwards to add this code so that it fits under the place where the find() API latency is measured, and it is looking a bit convoluted. But that's fine for now.
  XDCHECK(touchValueEnabled());

  auto ptr = reinterpret_cast<const uint8_t*>(getMemory(it));
  auto sum = std::accumulate(ptr, ptr + getSize(it), 0ULL);
leave a comment here that accumulate is intended to access all bytes of the item and nothing more.
Done.
@@ -408,13 +419,20 @@ typename Cache<Allocator>::ItemHandle Cache<Allocator>::find(Key key,
   // find from cache and wait for the result to be ready.
   auto it = cache_->findImpl(key, mode);
   it.wait();
This code looks different from:

tracker = util::LatencyTracker(cacheFindLatency_);

Are you on a stale base commit for this PR?
Done - rebased.
@sathyaphoenix has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Do you have any more concerns about this PR?
@igchor: can you rebase to latest trunk? We recently made some API changes to CacheBench to use Read/Write handles. I will re-import again.
The main purpose of this patch is to better simulate workloads in cachebench. Setting touchValue to true allows us to see the performance impact of using different mediums for memory cache.
@igchor has updated the pull request. You must reimport the pull request before landing.
@therealgymmy I've rebased, thanks.
@therealgymmy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
The main purpose of this patch is to better simulate workloads in cachebench. Setting validateValue to true allows us to see the performance impact of using different mediums for memory cache.