If-None-Match header

Craig Willis

Sep 09, 2016 @ 09:55 AM

I'm looking at improving performance of my app, and in your changelog, I noticed this:

http://support.cheddargetter.com/kb/getting-started-19/change-log

API response caching
We now support the If-Modified-Since and If-None-Match headers. If you've got a cached version, CG can send back a 304 Not Modified without a body. Big speed and bandwidth improvements!
Currently only implemented on plans/get. Coming soon to customers/get.
Works with little config in the python wrapper, sharpy with local cache. Possibly others. Please let us know if you've contributed the local caching functionality for other wrappers. We'll distribute the kudos.

Should this be working?

I'm sending a request to /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP.

The first response returns the following:

HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
Cache-Control: private
Content-Type: application/xml
Content-Encoding: gzip
Date: Fri, 09 Sep 2016 09:24:51 GMT
Node: CHED01VMW02
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Etag: d1feeab306b3f65ade382c09a7c3ee6a
Connection: Keep-Alive
Last-Modified: Fri, 09 Sep 2016 09:24:38 GMT
Content-Length: 1440

My next request looks like:

GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP
Connection: Keep-Alive,Keep-Alive
If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
Keep-Alive: 600
Host: cheddargetter.com
Cookie: CHEDDARGETTER=tcbqth8q5hf9hla7g1qn7la4j1
Accept-Encoding: gzip, deflate

And rather than a 304 as expected, I'm getting a 200.

Am I doing something wrong there?

Also, is this still only on plans/get, or have you enabled it anywhere else?

Support Staff 1 Posted by Marc Guyer on Sep 09, 2016 @ 01:52 PM

Hi Craig -- I see this working in Chrome, looking at the dev tools. At a glance, the difference appears to me to be that Chrome is also issuing the If-Modified-Since header:

GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
Host: cheddargetter.com
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch, br
Accept-Language: en-US,en;q=0.8
Cookie: [redacted]
If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
If-Modified-Since: Fri, 09 Sep 2016 13:36:28 GMT

HTTP/1.1 304 Not Modified
Server: Apache
Vary: Accept-Encoding
Cache-Control: private
Date: Fri, 09 Sep 2016 13:36:44 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Etag: d1feeab306b3f65ade382c09a7c3ee6a
Connection: Keep-Alive

I checked our caching implementation and confirmed that the If-Modified-Since header is also required.
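
A minimal sketch of a conditional request that carries both validators, assuming the client kept the Etag and Last-Modified values from an earlier 200 response. The helper name `conditional_headers` and the layout of the `cached` dict are illustrative and not part of any Cheddar wrapper. The `https` scheme is an assumption, and authentication is omitted.

```python
import urllib.request


def conditional_headers(cached):
    """Build If-None-Match / If-Modified-Since from saved response headers.

    `cached` is a hypothetical dict holding header values kept from the
    last 200 response for this URL.
    """
    headers = {}
    if cached.get("Etag"):
        headers["If-None-Match"] = cached["Etag"]
    if cached.get("Last-Modified"):
        headers["If-Modified-Since"] = cached["Last-Modified"]
    return headers


# Values taken from the first 200 response shown earlier in this thread.
cached = {
    "Etag": "d1feeab306b3f65ade382c09a7c3ee6a",
    "Last-Modified": "Fri, 09 Sep 2016 09:24:38 GMT",
}

# Assumed URL scheme; credentials would also need to be supplied.
request = urllib.request.Request(
    "https://cheddargetter.com/xml/plans/get/productCode/"
    "7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP",
    headers=conditional_headers(cached),
)
# Nothing is sent until urllib.request.urlopen(request) is called.
```

With `urllib`, a 304 surfaces as an `HTTPError` whose `code` is 304, so the caller would catch that and serve the locally cached body.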

2 Posted by Craig Willis on Sep 09, 2016 @ 02:16 PM

Thanks,

With If-Modified-Since, I'm now seeing 304 responses some of the time. It seems to be cacheable for at most a minute or so, even though the ETag doesn't change. After a short period we start getting 200s again, until we update the date. I was hoping to be able to cache this for much longer periods; I can't imagine this changing often at all.

I do notice the Last-Modified in the headers seems to be very recent, much more recent than the last modification to the plans.

Also, surprisingly, the 304s are taking longer than a straight GET request!

The first full get request I do usually takes 500ms, and then subsequent ones (in a short window) take about 230 ms.

When I get a 304 response, I'm seeing a fairly consistent ~450 ms, regardless of how close together the requests are.

Is this expected?
Support Staff 3 Posted by Marc Guyer on Sep 09, 2016 @ 02:39 PM

With If-Modified-Since, I'm now seeing 304 responses some of the time. It seems to be cacheable for at most a minute or so, even though the ETag doesn't change. After a short period we start getting 200s again, until we update the date. I was hoping to be able to cache this for much longer periods; I can't imagine this changing often at all.

Our config for that endpoint is to cache for 24 hours. Without additional detail about your experience (specific headers, etc), that's the best information I can provide at this time.

I do notice the Last-Modified in the headers seems to be very recent, much more recent than the last modification to the plans.

It could very well be that something in a plan is being modified. Perhaps some functionality in the platform that was added since the caching component. We'd have to dig in to find out more.

The first full get request I do usually takes 500ms, and then subsequent ones (in a short window) take about 230 ms.

A typical request is expected to take ~500ms. It could be that successive requests being faster is a result of being loaded from our server-side cache and served as a 200 with the body due to request header values.

When I get a 304 response, I'm seeing a fairly consistent ~450 ms, regardless of how close together the requests are.

When the response is 304, I'm seeing a consistent time of ~140-180ms.
4 Posted by Craig Willis on Sep 09, 2016 @ 02:58 PM

Marc,

Request 1 - Starts caching - saves data. (277ms) (15:49:41.391)
Request 2 - 304 Request - (500ms) (15:50:14.438)
Request 3 - 200 request - (380ms) (15:50:21.256)

Request 1 Request:

GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
Connection: Keep-Alive
Keep-Alive: 600
Host: cheddargetter.com
Cookie: CHEDDARGETTER=ifngjvc9tkel78usef8ln8kcc2
Accept-Encoding: gzip, deflate

Request 1 Response:

HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
Cache-Control: private
Content-Type: application/xml
Content-Encoding: gzip
Date: Fri, 09 Sep 2016 14:49:41 GMT
Node: CHED01VMW02
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Etag: d1feeab306b3f65ade382c09a7c3ee6a
Connection: Keep-Alive
Last-Modified: Fri, 09 Sep 2016 14:49:22 GMT
Content-Length: 1439

Request 2 Request:

GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
Connection: Keep-Alive
If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
If-Modified-Since: Fri, 09 Sep 2016 14:49:41 GMT
Keep-Alive: 600
Host: cheddargetter.com
Cookie: CHEDDARGETTER=ifngjvc9tkel78usef8ln8kcc2
Accept-Encoding: gzip, deflate

Request 2 Response:

HTTP/1.1 304 Not Modified
Server: Apache
Vary: Accept-Encoding
Cache-Control: private
Date: Fri, 09 Sep 2016 14:50:14 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Etag: d1feeab306b3f65ade382c09a7c3ee6a
Connection: Keep-Alive

Request 3 Request:

GET /xml/plans/get/productCode/7d0abe0f-0402-428f-a267-278c3e4d3ade_GBP HTTP/1.1
Connection: Keep-Alive
If-None-Match: d1feeab306b3f65ade382c09a7c3ee6a
If-Modified-Since: Fri, 09 Sep 2016 14:49:41 GMT
Keep-Alive: 600
Host: cheddargetter.com
Cookie: CHEDDARGETTER=ifngjvc9tkel78usef8ln8kcc2
Accept-Encoding: gzip, deflate

Request 3 Response:

HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Type: application/xml
Content-Encoding: gzip
Date: Fri, 09 Sep 2016 14:50:21 GMT
Node: CHED01VMW01
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Connection: Keep-Alive
Content-Length: 1440

According to our logs, requests 2 and 3 were identical; however, one returned a 304 and the other returned a 200. There were no plan changes during this time.
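
One way to check whether two responses are really equivalent is to diff their headers. The sketch below does that for the two 200 responses in this post (Request 1 and Request 3), ignoring Date. `parse_headers` and `diff_headers` are illustrative helper names, not part of any wrapper.

```python
def parse_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lowercase name.

    Lines without ': ' (such as the status line) are skipped.
    """
    headers = {}
    for line in raw.strip().splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(a, b, ignore=("date",)):
    """Return {name: (value_in_a, value_in_b)} for every header that differs."""
    names = (set(a) | set(b)) - set(ignore)
    return {n: (a.get(n), b.get(n)) for n in sorted(names) if a.get(n) != b.get(n)}


RESPONSE_1 = """\
HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
Cache-Control: private
Content-Type: application/xml
Content-Encoding: gzip
Date: Fri, 09 Sep 2016 14:49:41 GMT
Node: CHED01VMW02
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Etag: d1feeab306b3f65ade382c09a7c3ee6a
Connection: Keep-Alive
Last-Modified: Fri, 09 Sep 2016 14:49:22 GMT
Content-Length: 1439
"""

RESPONSE_3 = """\
HTTP/1.1 200 OK
Server: Apache
Vary: Accept-Encoding
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Type: application/xml
Content-Encoding: gzip
Date: Fri, 09 Sep 2016 14:50:21 GMT
Node: CHED01VMW01
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Connection: Keep-Alive
Content-Length: 1440
"""

differences = diff_headers(parse_headers(RESPONSE_1), parse_headers(RESPONSE_3))
for name, (first, third) in differences.items():
    print(f"{name}: {first!r} -> {third!r}")
```

Run against the responses above, this reports differences in Cache-Control, Content-Length, Etag, Last-Modified and Node: the third response came from a different node and carried no Etag or Last-Modified.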
Support Staff 5 Posted by Marc Guyer on Sep 09, 2016 @ 08:37 PM

Hi Craig -- After taking a closer look, I can't tell you why you got a 200 on your 3rd request above. I confirmed that we have a seemingly really solid functional test that passes. It ensures that the 304 is received. We've had a long-standing ticket that called for a slight improvement to some internal caching of plan properties. I found that this was causing the cache to be invalidated even if the internal caching of plan property values didn't change. So, I went ahead and resolved that ticket in the hopes that this is what was causing your 200. That patch is tested and deployed. Go ahead and rerun your test and see if you can still get the unexpected 200.
6 Posted by Craig Willis on Sep 12, 2016 @ 07:31 AM

Marc,

Thanks for that, I'm seeing much more consistent 304's now. I'll keep trying over the next few hours.

Could I ask if plans/get is the only page this is enabled on?

Thanks

Craig
Support Staff 7 Posted by Marc Guyer on Sep 12, 2016 @ 02:04 PM

Thanks Craig -- plans/get is the only endpoint enabled at this time. It's considered experimental for now. Other endpoints are typically quite quick. The other one that we want to do caching on is customers/get but the invalidation of cache after a modification of anything related to a single customer makes that difficult to implement. Also, cache storage space requirement would increase dramatically since we'd be holding every active customer record in cache all the time. It's a wishlist item at this point and will probably be implemented in a future retooling of the API.

Would you benefit by caching of customers/get? Any other endpoints in particular?
8 Posted by Craig Willis on Sep 12, 2016 @ 03:31 PM

Customers/get would make a bit of a difference to us; our API is a level lower than what is offered by cheddargetter. For example, some of our API calls depend on the values of different tracked items. For each call to those APIs, we have to look up the tracked items for the current user.

They are all stateless APIs and don't really share data, so each one can end with a hit on cheddargetter.

We can (and do) cache this data internally for a short lifespan, currently about 5 minutes, however it's a bit of a risk.

When we actually update a tracked item, as our API is something like /AddX, we can't return the customer data your API returns, so updating a tracked item ends up being 3 different calls: one to get the customer's data, one to update the tracked item (based on a second tracked item, which is why we can't just do a single call to your add tracked items), and then a third call to get the updated client data.

Really, I'd like to be able to query a single tracked item. I'd love to be able to go /customer/get/code/{trackedItem}, and just get a single int back that was the current value of that tracked item, or a 404 if it's not there. I don't really mind about the rest of the information at that point.

On your cache storage comment, I have no idea how you actually store data, but we've done something similar before. Maybe something like this: when a customer changes, we store both the last modified date and a hash string in a fast lookup table (like redis). Then when a request comes in, if it has the etag/last modified date headers, you can hit this fast lookup first, and return a 304 if it's found, or fall back to the normal lookup if it's not.

That massively reduces the amount of data you cache, and lets us handle caching the full data on our end. We get back a quick 304 to know nothing's changed, so we can hit our cache, or we get the updated data and update our cache, and we never worry about out-of-date data.
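
The lookup-table idea above could be sketched roughly as follows. This is an assumption-laden illustration, not how Cheddar stores anything: a plain dict stands in for redis, and `record_change`, `check_conditional` and the choice of SHA-256 for the hash are all hypothetical.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime

# Stand-in for a fast store such as redis:
# customer code -> (etag, last-modified as epoch seconds)
validators = {}


def record_change(code, body, modified_at):
    """Store only the validators for a customer whenever the record changes."""
    etag = hashlib.sha256(body).hexdigest()
    validators[code] = (etag, int(modified_at))
    return etag


def check_conditional(code, if_none_match=None, if_modified_since=None):
    """Return 304 if the client's validators still match.

    Returns None when the caller should fall back to the normal (full) lookup.
    """
    entry = validators.get(code)
    if entry is None or if_none_match is None:
        return None
    etag, modified_at = entry
    if if_none_match != etag:
        return None
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since).timestamp()
        if modified_at > since:
            return None
    return 304


# Usage with made-up values: the record last changed at t=1000.
etag = record_change("CUSTOMER_1", b"<customer/>", 1000)
fresh = check_conditional("CUSTOMER_1", etag, formatdate(2000, usegmt=True))
stale = check_conditional("CUSTOMER_1", "some-old-etag")
print(fresh, stale)
```

Only two small values per customer are kept on the server side here; the full body stays in the client's own cache.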
Support Staff 9 Posted by Marc Guyer on Sep 12, 2016 @ 04:03 PM

Ok. Thanks for the detail.

We can (and do) cache this data internally for a short lifespan, currently about 5 minutes, however it's a bit of a risk.

I'm assuming that the risk you're referring to is that the customer record could actually change within that 5 min. Is that correct? If so, you might consider lengthening that TTL and listen for hooks to trigger invalidating your local cache. The risk is that CG could be doing something that changes the customer that your app wouldn't know about. For the most part, that's the recurring billing stuff. So, you could listen for the transaction hook and invalidate the cache. That would substantially decrease that risk.
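
A rough sketch of that approach: a local cache with a longer TTL, plus explicit invalidation driven by a hook. The `CustomerCache` class, the `on_hook` handler and the shape of the `event` dict are all hypothetical; the real hook payload is not shown in this thread and would need to be mapped onto whatever fields it actually carries.

```python
import time


class CustomerCache:
    """In-memory cache with a per-entry TTL and explicit invalidation."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # customer code -> (expires_at, data)

    def get(self, code):
        entry = self._entries.get(code)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL; drop it so the caller refetches.
            del self._entries[code]
            return None
        return data

    def put(self, code, data):
        self._entries[code] = (self._clock() + self._ttl, data)

    def invalidate(self, code):
        self._entries.pop(code, None)


def on_hook(cache, event):
    """Hypothetical hook handler; the `event` keys are assumed, not Cheddar's."""
    if event.get("type") == "transaction":
        cache.invalidate(event["customer_code"])


cache = CustomerCache(ttl_seconds=3600)
cache.put("CUSTOMER_1", {"tracked_items": {"widgets": 3}})
on_hook(cache, {"type": "transaction", "customer_code": "CUSTOMER_1"})
print(cache.get("CUSTOMER_1"))
```

The next read after a hook misses the cache and refetches, so the TTL only has to cover changes that no hook reports.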

When we actually update a tracked item, as our API is something like /AddX, we can't return the customer data your API returns, so updating a tracked item ends up being 3 different calls: one to get the customer's data, one to update the tracked item (based on a second tracked item, which is why we can't just do a single call to your add tracked items), and then a third call to get the updated client data.

The 1st call shouldn't be necessary since you have the value in your cache (unless you're just doing it for safety). The 3rd call shouldn't be necessary because the 2nd call returns the latest customer data.

Really, I'd like to be able to query a single tracked item. I'd love to be able to go /customer/get/code/{trackedItem}, and just get a single int back that was the current value of that tracked item, or a 404 if it's not there. I don't really mind about the rest of the information at that point.

I can see how that could be valuable from an efficiency perspective when there is heavy reliance on tracked items. This, and others like it, are expected to be available after a future retooling of the API.

On your cache storage comment, I have no idea how you actually store data, but we've done something similar before. Maybe something like this: when a customer changes, we store both the last modified date and a hash string in a fast lookup table (like redis). Then when a request comes in, if it has the etag/last modified date headers, you can hit this fast lookup first, and return a 304 if it's found, or fall back to the normal lookup if it's not.

We're actually storing the full response in cache. If a 304 is to be returned, the body isn't returned so your suggestion works for that example. However, our cache layer also is used when a 200 is to be returned when the content hasn't changed. So, it's more efficient even for those who aren't using the cache request headers.

Again, along with a future retooling of the API, we expect to include caching support in officially supported wrappers for more languages. At that point, we wouldn't need to store the full responses in our cache because most folks would be caching on their side out of the box. Unfortunately right now it's rare.
10 Posted by Craig Willis on Sep 12, 2016 @ 04:43 PM

I'm assuming that the risk you're referring to is that the customer record could actually change within that 5 min. Is that correct? If so, you might consider lengthening that TTL and listen for hooks to trigger invalidating your local cache. The risk is that CG could be doing something that changes the customer that your app wouldn't know about. For the most part, that's the recurring billing stuff. So, you could listen for the transaction hook and invalidate the cache. That would substantially decrease that risk.

Correct, yes, the risk is the customer's data changing. Most of the important things that could change for us are actually tracked items. I couldn't see a hook that would notify us of that. If there is one, that would make things much easier.

The 1st call shouldn't be necessary since you have the value in your cache (unless you're just doing it for safety). The 3rd call shouldn't be necessary because the 2nd call returns the latest customer data.

Correct on the first call - we don't always hit it.

However, as I mentioned earlier, our API can't return the customer information. So if we are doing something like adding something in our system that is linked to a tracked item, our REST API call can only return stuff specific to that tracked item, not the rest of the customer information. The rest of the data does need to be re-fetched; there is no way around that. (Or not without a rewrite. We're considering how we can send that data to the right systems; we already send similar data via SignalR. As I mentioned before, if there was a hook that we could listen for that would notify us of this, it'd be worth spending the time to get this updating via SignalR.)

However, this has gone a little off-topic :)

When the API retooling happens, I'd love to be notified, and if there is a hook that might do what we are interested in, I'd love to hear more about it. Otherwise, I'm happy for this to be closed for the moment; my 304 issue is sorted.
Support Staff 11 Posted by Marc Guyer on Sep 12, 2016 @ 06:57 PM

We generally don't do hooks for events that are initiated by remote clients since an action by a remote client receives a response with the updated customer record (including the tracked item data) which is just as good if not better than an asynchronous hook. The actions that you'd probably need hooks for are those initiated by CG's ongoing automation. For the most part, we're talking about payment transactions (the recurring billing of your customers). Another that you'll seldom see is an auto-cancel due to non payment. So, you can listen for the transaction hook and the subscription canceled hook.
Marc Guyer closed this discussion on Sep 12, 2016 @ 06:57 PM.
