If-None-Match header

Craig Willis

09 Sep, 2016 09:55 AM

I'm looking at improving performance of my app, and in your changelog, I noticed this:

http://support.cheddargetter.com/kb/getting-started-19/change-log

API response caching
We now support the If-Modified-Since and If-None-Match headers. If you've got a cached version, CG can send back a 304 Not Modified without a body. Big speed and bandwidth improvements!
Currently only implemented on plans/get. Coming soon to customers/get.
Works with little config in the python wrapper, sharpy with local cache. Possibly others. Please let us know if you've contributed the local caching functionality for other wrappers. We'll distribute the kudos.

Should this be working?

I'm sending a request to /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP.

The first response returns the following:

HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
Cache-Control: private
Content-Type: application/xml
Content-Encoding: gzip
Date: Fri, 09 Sep 2016 09:24:51 GMT
Node: CHED01VMW02
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Etag: d1feeab306b3f65ade382c09a7c3ee6a
Connection: Keep-Alive
Last-Modified: Fri, 09 Sep 2016 09:24:38 GMT
Content-Length: 1440

My next request looks like:

GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP
Connection: Keep-Alive,Keep-Alive
If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
Keep-Alive: 600
Host: cheddargetter.com
Cookie: CHEDDARGETTER=tcbqth8q5hf9hla7g1qn7la4j1
Accept-Encoding: gzip, deflate

And rather than a 304 as expected, I'm getting a 200.

Am I doing something wrong there?

Also, is this still only on plans/get, or have you enabled it anywhere else?

  1. Posted by Marc Guyer (Support Staff) on 09 Sep, 2016 01:52 PM

    Hi Craig -- I see this working in Chrome, looking at the dev tools. At a glance, the difference appears to me to be that Chrome is also issuing the If-Modified-Since header:

    GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
    Host: cheddargetter.com
    Connection: keep-alive
    Cache-Control: max-age=0
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch, br
    Accept-Language: en-US,en;q=0.8
    Cookie: [redacted]
    If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
    If-Modified-Since: Fri, 09 Sep 2016 13:36:28 GMT
    
    HTTP/1.1 304 Not Modified
    Server: Apache
    Vary: Accept-Encoding
    Cache-Control: private
    Date: Fri, 09 Sep 2016 13:36:44 GMT
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Etag: d1feeab306b3f65ade382c09a7c3ee6a
    Connection: Keep-Alive
    

    I checked our caching implementation and confirmed that the If-Modified-Since header is also required.
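For anyone hitting the same thing, here is a minimal sketch of building both conditional headers from a previously cached response. The helper name and the dict of cached headers are assumptions for illustration, not part of any Cheddar wrapper. Echoing the stored Last-Modified value back as If-Modified-Since is the usual HTTP convention.

```python
def conditional_headers(cached_headers):
    """Return conditional request headers for a previously cached response.

    `cached_headers` is a dict of the headers from the last 200 response.
    Both validators are sent, since the endpoint discussed here expects
    If-Modified-Since alongside If-None-Match.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in cached_headers.items()}
    headers = {}
    if "etag" in lowered:
        headers["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        headers["If-Modified-Since"] = lowered["last-modified"]
    return headers


# Header values taken from the first response quoted in this thread.
first_response = {
    "Etag": "d1feeab306b3f65ade382c09a7c3ee6a",
    "Last-Modified": "Fri, 09 Sep 2016 09:24:38 GMT",
}
print(conditional_headers(first_response))
```

The returned dict can be merged into the headers of the next GET with whatever HTTP client is in use.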

  2. Posted by Craig Willis on 09 Sep, 2016 02:16 PM

    Thanks,

    With If-Modified-Since, I'm now seeing 304 responses some of the time. It seems to be cacheable for at most a minute or so, even though the etag doesn't change. After a short period we start getting 200s again, until we update the date. I was hoping to be able to cache this for much longer periods; I can't imagine this changing often at all.

    I do notice the Last-Modified in the headers seems to be very recent, much more recent than the last modification to the plans.

    Also, surprisingly, the 304s are taking longer than a straight GET request!

    The first full get request I do usually takes 500ms, and then subsequent ones (in a short window) take about 230 ms.

    When I get a 304 response, I'm seeing a fairly standard ~450 ms, regardless of how close together the requests are.

    Is this expected?

  3. Posted by Marc Guyer (Support Staff) on 09 Sep, 2016 02:39 PM

    > With If-Modified-Since, I'm now seeing 304 responses some of the time. It seems to be cacheable for at most a minute or so, even though the etag doesn't change. After a short period we start getting 200s again, until we update the date. I was hoping to be able to cache this for much longer periods; I can't imagine this changing often at all.

    Our config for that endpoint is to cache for 24 hours. Without additional detail about your experience (specific headers, etc), that's the best information I can provide at this time.

    > I do notice the Last-Modified in the headers seems to be very recent, much more recent than the last modification to the plans.

    It could very well be that something in a plan is being modified. Perhaps some functionality in the platform that was added since the caching component. We'd have to dig in to find out more.

    > The first full get request I do usually takes 500ms, and then subsequent ones (in a short window) take about 230 ms.

    A typical request is expected to take ~500ms. Successive requests may be faster because they are loaded from our server-side cache and served as a 200 with the body, due to the request header values.

    > When I get a 304 response, I'm seeing a fairly standard ~450 ms, regardless of how close together the requests are.

    When the response is 304, I'm seeing a consistent time of ~140-180ms.

  4. Posted by Craig Willis on 09 Sep, 2016 02:58 PM

    Marc,

    Request 1 - Starts caching - saves data. (277ms) (15:49:41.391)
    Request 2 - 304 Request - (500ms) (15:50:14.438)
    Request 3 - 200 request - (380ms) (15:50:21.256)

    Request 1 Request:

    GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
    Connection: Keep-Alive
    Keep-Alive: 600
    Host: cheddargetter.com
    Cookie: CHEDDARGETTER=ifngjvc9tkel78usef8ln8kcc2
    Accept-Encoding: gzip, deflate

    Request 1 Response:

    HTTP/1.1 200 OK
    Server: Apache
    Vary: Accept-Encoding
    Cache-Control: private
    Content-Type: application/xml
    Content-Encoding: gzip
    Date: Fri, 09 Sep 2016 14:49:41 GMT
    Node: CHED01VMW02
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Pragma: no-cache
    Etag: d1feeab306b3f65ade382c09a7c3ee6a
    Connection: Keep-Alive
    Last-Modified: Fri, 09 Sep 2016 14:49:22 GMT
    Content-Length: 1439

    Request 2 Request:

    GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
    Connection: Keep-Alive
    If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
    If-Modified-Since: Fri, 09 Sep 2016 14:49:41 GMT
    Keep-Alive: 600
    Host: cheddargetter.com
    Cookie: CHEDDARGETTER=ifngjvc9tkel78usef8ln8kcc2
    Accept-Encoding: gzip, deflate

    Request 2 Response:

    HTTP/1.1 304 Not Modified
    Server: Apache
    Vary: Accept-Encoding
    Cache-Control: private
    Date: Fri, 09 Sep 2016 14:50:14 GMT
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Etag: d1feeab306b3f65ade382c09a7c3ee6a
    Connection: Keep-Alive

    Request 3 Request:

    GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
    Connection: Keep-Alive
    If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
    If-Modified-Since: Fri, 09 Sep 2016 14:49:41 GMT
    Keep-Alive: 600
    Host: cheddargetter.com
    Cookie: CHEDDARGETTER=ifngjvc9tkel78usef8ln8kcc2
    Accept-Encoding: gzip, deflate

    Request 3 Response:

    HTTP/1.1 200 OK
    Server: Apache
    Vary: Accept-Encoding
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Content-Type: application/xml
    Content-Encoding: gzip
    Date: Fri, 09 Sep 2016 14:50:21 GMT
    Node: CHED01VMW01
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Pragma: no-cache
    Connection: Keep-Alive
    Content-Length: 1440

    According to our logs, requests 2 and 3 were identical; however, one returned a 304 and the other returned a 200. There were no plan changes during this time.
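Since either status can come back for the same conditional request, the client has to handle both. Below is a minimal sketch of that handling. The `CacheEntry` type and `apply_response` helper are hypothetical names, not taken from any wrapper. Note that the 200 in Request 3 carries no Etag or Last-Modified header, so a client that replaces its stored validators from that response would have nothing to send on the next request.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    """Locally stored copy of the last full response."""
    body: bytes = b""
    headers: dict = field(default_factory=dict)


def apply_response(entry, status, headers, body):
    """Update `entry` from a response and return the body to use."""
    if status == 304:
        # Not modified: the server sent no body, so reuse the stored copy.
        return entry.body
    if status == 200:
        # Full response: replace the stored copy and its validators.
        entry.body = body
        entry.headers = dict(headers)
        return body
    raise ValueError(f"unexpected status {status}")
```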

  5. Posted by Marc Guyer (Support Staff) on 09 Sep, 2016 08:37 PM

    Hi Craig -- After taking a closer look, I can't tell you why you got a 200 on your 3rd request above. I confirmed that we have a seemingly really solid functional test that passes. It ensures that the 304 is received. We've had a long-standing ticket that called for a slight improvement to some internal caching of plan properties. I found that this was causing the cache to be invalidated even if the internal caching of plan property values didn't change. So, I went ahead and resolved that ticket in the hopes that this is what was causing your 200. That patch is tested and deployed. Go ahead and rerun your test and see if you can still get the unexpected 200.

  6. Posted by Craig Willis on 12 Sep, 2016 07:31 AM

    Marc,

    Thanks for that, I'm seeing much more consistent 304's now. I'll keep trying over the next few hours.

    Could I ask if plans/get is the only page this is enabled on?

    Thanks

    Craig

  7. Posted by Marc Guyer (Support Staff) on 12 Sep, 2016 02:04 PM

    Thanks Craig -- plans/get is the only endpoint enabled at this time. It's considered experimental for now. Other endpoints are typically quite quick. The other one that we want to do caching on is customers/get but the invalidation of cache after a modification of anything related to a single customer makes that difficult to implement. Also, cache storage space requirement would increase dramatically since we'd be holding every active customer record in cache all the time. It's a wishlist item at this point and will probably be implemented in a future retooling of the API.

    Would you benefit by caching of customers/get? Any other endpoints in particular?

  8. Posted by Craig Willis on 12 Sep, 2016 03:31 PM

    customers/get would make a bit of a difference to us. Our API is a level lower than what is offered by CheddarGetter. For example, some of our API calls depend on values of different tracked items. For each call of those APIs, we have to look up tracked items for the current user.

    They are all done as stateless APIs, and don't really share data, so each one can end with a hit on CheddarGetter.

    We can (and do) cache this data internally for a short lifespan, currently about 5 minutes, however it's a bit of a risk.

    When we actually update a tracked item, as our API is something like /AddX, we can't return the customer data your API returns, so updating a tracked item ends up being 3 different calls: one to get the customer's data, one to update the tracked item (based on a second tracked item, which is why we can't just do a single call to your add tracked items), and a third to get the updated client data.

    Really, I'd like to be able to query a single tracked item. I'd love to be able to go /customer/get/code/{trackedItem}, and just get a single int back that was the current value of that tracked item, or a 404 if it's not there. I don't really mind about the rest of the information at that point.

    On your cache storage comment, I have no idea how you actually store data, but we've done something similar before. Maybe something like when a customer changes, we store both the last modified date and a hash string in a fast lookup table (like Redis). Then when a request comes in, if it has the etag/last modified date headers, you can hit this fast lookup first, and return 304 if it's found, or fall back to the normal lookup if it's not.

    That massively reduces the amount of data you cache, and lets us handle caching the full data on our end. We get back a quick 304 to know nothing's changed, so we can hit our cache, or we get the updated data and update our cache, and we never worry about out-of-date data.
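A rough sketch of the lookup described above, with a plain dict standing in for Redis. The function names, the SHA-256 hash, and the exact string comparison of dates are all assumptions for illustration; nothing here reflects how Cheddar actually stores data.

```python
import hashlib

# Stand-in for a fast key-value store such as Redis:
# customer code -> (etag, last_modified)
validators = {}


def record_change(customer_code, payload, last_modified):
    """Store only the validators when a customer record changes."""
    etag = hashlib.sha256(payload).hexdigest()
    validators[customer_code] = (etag, last_modified)
    return etag


def check_request(customer_code, if_none_match, if_modified_since):
    """Return 304 if the client's validators match the stored ones.

    Returns None to signal that the caller should fall back to the
    normal (full) lookup and send a 200 with the body.
    """
    stored = validators.get(customer_code)
    if stored is None:
        return None
    etag, last_modified = stored
    if if_none_match == etag and if_modified_since == last_modified:
        return 304
    return None
```

Only two short strings per customer are kept in the fast store; the full record stays wherever it normally lives.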

  9. Posted by Marc Guyer (Support Staff) on 12 Sep, 2016 04:03 PM

    Ok. Thanks for the detail.

    > We can (and do) cache this data internally for a short lifespan, currently about 5 minutes, however it's a bit of a risk.

    I'm assuming that the risk you're referring to is that the customer record could actually change within that 5 min. Is that correct? If so, you might consider lengthening that TTL and listen for hooks to trigger invalidating your local cache. The risk is that CG could be doing something that changes the customer that your app wouldn't know about. For the most part, that's the recurring billing stuff. So, you could listen for the transaction hook and invalidate the cache. That would substantially decrease that risk.

    > When we actually update a tracked item, as our API is something like /AddX, we can't return the customer data your API returns, so updating a tracked item ends up being 3 different calls: one to get the customer's data, one to update the tracked item (based on a second tracked item, which is why we can't just do a single call to your add tracked items), and a third to get the updated client data.

    The 1st call shouldn't be necessary since you have the value in your cache (unless you're just doing it for safety). The 3rd call shouldn't be necessary because the 2nd call returns the latest customer data.

    > Really, I'd like to be able to query a single tracked item. I'd love to be able to go /customer/get/code/{trackedItem}, and just get a single int back that was the current value of that tracked item, or a 404 if it's not there. I don't really mind about the rest of the information at that point.

    I can see how that could be valuable from an efficiency perspective when there is heavy reliance on tracked items. This, and others like it, are expected to be available after a future retooling of the API.

    > On your cache storage comment, I have no idea how you actually store data, but we've done something similar before. Maybe something like when a customer changes, we store both the last modified date and a hash string in a fast lookup table (like Redis). Then when a request comes in, if it has the etag/last modified date headers, you can hit this fast lookup first, and return 304 if it's found, or fall back to the normal lookup if it's not.

    We're actually storing the full response in cache. If a 304 is to be returned, the body isn't returned, so your suggestion works for that example. However, our cache layer is also used when a 200 is to be returned and the content hasn't changed. So, it's more efficient even for those who aren't using the cache request headers.

    Again, along with a future retooling of the API, we expect to include caching support in officially supported wrappers for more languages. At that point, we wouldn't need to store the full responses in our cache because most folks would be caching on their side out of the box. Unfortunately right now it's rare.

  10. Posted by Craig Willis on 12 Sep, 2016 04:43 PM

    > I'm assuming that the risk you're referring to is that the customer record could actually change within that 5 min. Is that correct? If so, you might consider lengthening that TTL and listen for hooks to trigger invalidating your local cache. The risk is that CG could be doing something that changes the customer that your app wouldn't know about. For the most part, that's the recurring billing stuff. So, you could listen for the transaction hook and invalidate the cache. That would substantially decrease that risk.

    Correct, yes, the risk is the customer data changing. Most of the important things that could change for us are actually tracked items. I couldn't see a hook that would notify us of that. If there is one, that would make things much easier.

    > The 1st call shouldn't be necessary since you have the value in your cache (unless you're just doing it for safety). The 3rd call shouldn't be necessary because the 2nd call returns the latest customer data.

    Correct on the first call - we don't always hit it.

    However, as I mentioned earlier, our API can't return the customer information. So if we are doing something like adding something in our system that is linked to a tracked item, our REST API call can only return stuff specific to that tracked item, not the rest of the customer information. The rest of the data does need to be re-fetched; there is no way around that. (Or not without a rewrite. We're considering how we can send that data to the right systems, as we already send similar data via SignalR. As I mentioned before, if there was a hook that we could listen for that would notify us of this, it'd be worth spending the time to get this updating via SignalR.)

    However, this has gone a little off-topic :)

    When the API retooling happens, I'd love to be notified, and if there is a hook that might do what we are interested in, I'd love to hear more on it, otherwise I'm happy for this to be closed for the moment, my 304 issue is sorted.

  11. Posted by Marc Guyer (Support Staff) on 12 Sep, 2016 06:57 PM

    We generally don't do hooks for events that are initiated by remote clients, since an action by a remote client receives a response with the updated customer record (including the tracked item data), which is just as good as, if not better than, an asynchronous hook. The actions that you'd probably need hooks for are those initiated by CG's ongoing automation. For the most part, we're talking about payment transactions (the recurring billing of your customers). Another that you'll seldom see is an auto-cancel due to non-payment. So, you can listen for the transaction hook and the subscription canceled hook.
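As a sketch of the approach suggested in this thread (a longer local TTL plus invalidation when a hook arrives), something like the following could work. The `CustomerCache` class, the event dict shape, and the event type strings are all hypothetical; check the hook documentation for the real payload format before relying on any of these names.

```python
import time


class CustomerCache:
    """Local cache with a TTL and explicit invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # customer code -> (expiry time, data)

    def put(self, code, data):
        self._entries[code] = (self.clock() + self.ttl, data)

    def get(self, code):
        entry = self._entries.get(code)
        if entry is None:
            return None
        expires_at, data = entry
        if self.clock() >= expires_at:
            # Expired: drop it so the caller re-fetches from the API.
            del self._entries[code]
            return None
        return data

    def invalidate(self, code):
        self._entries.pop(code, None)


def handle_hook(cache, event):
    """Drop the cached customer when a relevant hook arrives.

    The event shape ({"type": ..., "customer_code": ...}) and the type
    names below are assumptions, not the actual hook payload.
    """
    if event.get("type") in {"transaction", "subscriptionCanceled"}:
        cache.invalidate(event["customer_code"])
```

With invalidation driven by hooks, the TTL only acts as a safety net for changes the hooks do not cover, so it can be much longer than 5 minutes.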

  12. Marc Guyer closed this discussion on 12 Sep, 2016 06:57 PM.
