Problem with New Relic and Content-Length Workaround Solution
We had a problem today with a third-party aggregator, News Now, who have been using wget to scrape the Daily Express site to gather content. All of a sudden, the files they downloaded were being cut off around 100 characters before the end.
I tried it myself with curl and reproduced the same problem, but there were a few odd things about this:
- The source code for the same page was not missing any characters.
- My working copy on a Unix machine for the same story was intact when using curl and wget.
- The HTML on the staging server on EC2 was also intact using wget and curl.
So, after eliminating the impossible (which I won’t bore you with), we were left with a problem that looked very improbable: New Relic were inserting JS code into the head and before the closing html tag to monitor users, but were not updating the HTTP Content-Length header.
Browsers are smart enough to ignore the Content-Length if it’s missing or incorrect, but wget and curl are set up by default to adhere strictly to the content length, hence the discrepancy.
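To make the failure mode concrete, here is a minimal sketch in Python. It is not New Relic's code or wget's, just a simulation under stated assumptions: `build_response` and `read_strict` are hypothetical helpers, the injected script is a placeholder, and the "strict client" simply stops reading once it has consumed Content-Length bytes.

```python
def build_response(original_html: bytes, injected: bytes) -> bytes:
    """Simulate a response whose body grew after Content-Length was computed."""
    # The length is computed from the page as the application rendered it.
    content_length = len(original_html)
    # Hypothetical injection step: extra markup goes in before </html>,
    # but the header above is never updated.
    body = original_html.replace(b"</html>", injected + b"</html>")
    head = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: " + str(content_length).encode() + b"\r\n\r\n"
    )
    return head + body


def read_strict(raw: bytes) -> bytes:
    """Read the body like a strict client: stop after Content-Length bytes."""
    head, _, body = raw.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    # With no Content-Length, fall back to reading everything that was sent.
    return body if length is None else body[:length]


original = b"<html><body>story</body></html>"
injected = b"<script>/* placeholder monitoring code */</script>"
truncated = read_strict(build_response(original, injected))
print(truncated)  # the tail of the page, including </html>, is missing
```

The number of bytes lost equals the number of bytes injected, which matches what we saw: the file ends early by roughly the size of the added script.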
Short Term Solutions
1. Add the ‘--ignore-length’ option to wget.
2. Take New Relic off the live servers.
Medium Term Solution
We spoke to New Relic, who told us we could take the automated JS injection off and instead insert it ourselves onto every page. Doesn’t sound like much fun.
Long Term Solution
The long term solution for this would be for New Relic to update the Content-Length after it has messed around with the HTML, or even remove it entirely, but it doesn’t look like this is going to happen.
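As a rough illustration of what "update the Content-Length" would mean, here is a small Python sketch. It is an assumption about how such a fix could look, not how the New Relic agent works: `inject_and_fix_length` is a hypothetical function, and the headers are modelled as a plain dictionary.

```python
def inject_and_fix_length(headers, body, snippet):
    """Insert snippet before the last </html> and recompute Content-Length.

    headers is a dict of header name to value, body and snippet are bytes.
    Returns a new (headers, body) pair; the inputs are left untouched.
    """
    marker = b"</html>"
    idx = body.rfind(marker)
    if idx == -1:
        # Nothing to inject into, so leave the response exactly as it was.
        return dict(headers), body

    new_body = body[:idx] + snippet + body[idx:]

    # Header names are case-insensitive, so drop any existing spelling
    # of Content-Length before writing the corrected value.
    new_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() != "content-length"
    }
    new_headers["Content-Length"] = str(len(new_body))
    return new_headers, new_body


page = b"<html><body>story</body></html>"
headers = {"Content-Type": "text/html", "content-length": str(len(page))}
new_headers, new_body = inject_and_fix_length(
    headers, page, b"<script>/* placeholder */</script>"
)
print(new_headers["Content-Length"], len(new_body))  # the two numbers agree
```

The other option mentioned above, removing the header entirely, would amount to dropping the last assignment and letting the server signal the end of the body some other way.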
Hello there!
We’re really sorry you had problems with our real user monitoring. I would love to follow up and get more specifics about your situation. We have run into issues with RUM on certain news aggregators where our injection code is mis-diagnosing the placement of the header. I would like to see our team get this fixed soon, and it would be great to know if your site is affected by this issue, or if there is some other problem. Please feel free to reach out to me directly via email, or via the support case that you opened with us.
Thanks,
Belinda Runkle
Manager, Agents
New Relic
Hi
Will from New Relic here.
This problem is fixed: starting with version 4.3 (December 2013), the New Relic PHP agent will not inject JavaScript if a Content-Length header has been set.
For more information, please see:
https://docs.newrelic.com/docs/releases/php
We appreciate you bringing this to our attention: We are eager to create the best PHP agent possible!
— Will