Using Google Analytics API and PHP to Ghostbust EntreCard

The EntreCard community has been all abuzz about ‘cheaters’ or ‘ghost droppers’ on EntreCard. These are people who somehow record with EntreCard that they’ve visited your site when they actually haven’t. I wrote a little bit about this in my entry “Cheating at EntreCard and Finding Real Top Droppers”. In that blog post, I suggested using Google Analytics to see who Google recognizes as visiting your site from your EntreCard drop box.

Since then, I’ve been using these analytics to visit the people who come to my website via the EntreCard drop box and view the most pages. With that, I’ve seen more EntreCard-based traffic over the past couple of days than I have in nearly a year. Some of that may be simply because of the buzz about EntreCard ghosts and my current rise in popularity on EntreCard. However, it does seem like a useful strategy.

People have asked if my process could be automated, and I’m starting to work on automating it. Details are provided below. In addition, people have suggested it might be good to come up with a Quality Dropper list. Using what I’ve written so far, I can come up with the quality droppers, in terms of the number of page views, that I see. However, I could make this more useful if I had similar data from other sites. If you’re interested in submitting your own data, there are a couple different approaches.

For a simple one-time analysis, go to the Google Analytics page for your site. Click on Traffic Sources, Referring Sites, and EntreCard in the list. Set Show Rows to 500 to get the most data, and then click on Export TSV. Send the Tab Separated Values file to me and I’ll add it to my analysis. For ongoing analysis, if you give me read access to the Google Analytics for your site, I can include it in the automated analytics I’m building.

Send your TSV file to aldon.hynes at orient-lodge.com, or give the same address read access in your Google Analytics User Manager and drop me a note about it.

For those who want to build their own automated analytics, here is how I’ve been approaching it:

The Google Analytics Data Export API - Protocol (v2) documentation includes an example script that can be used to retrieve data.

The first thing I needed to do was retrieve the table ID for the site that I want to see the analytics for. I currently track many sites in Google Analytics, both for my own account and for various people I support. Once I found the table ID, I could take the sample code in the ‘Retrieving Report Data’ section and modify it for my own purposes.
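To illustrate the table ID lookup, here is a minimal sketch that pulls table IDs out of an account feed response with SimpleXML. The sample XML, the site titles, the table IDs, and the function name listTableIds are all hypothetical, and the feed layout (a dxp:tableId element inside each Atom entry) is my assumption about the v2 account feed rather than something copied from a real response. Fetching the feed and authenticating are left out.

```php
<?php
// Hypothetical account feed: titles and table IDs are made up for illustration.
// Assumption: each Atom entry carries its table ID in a dxp:tableId element.
$accountFeedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dxp="http://schemas.google.com/analytics/2009">
  <entry>
    <title>www.example.com</title>
    <dxp:tableId>ga:1111111</dxp:tableId>
  </entry>
  <entry>
    <title>blog.example.org</title>
    <dxp:tableId>ga:2222222</dxp:tableId>
  </entry>
</feed>
XML;

// Returns an array mapping each profile title to its table ID.
function listTableIds(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        throw new RuntimeException('Could not parse the account feed XML.');
    }

    $tableIds = [];
    foreach ($feed->entry as $entry) {
        // Elements with the dxp: prefix live in their own namespace,
        // so they have to be requested explicitly.
        $dxp = $entry->children('http://schemas.google.com/analytics/2009');
        $tableIds[(string) $entry->title] = (string) $dxp->tableId;
    }
    return $tableIds;
}

$tableIds = listTableIds($accountFeedXml);
foreach ($tableIds as $title => $tableId) {
    echo $title, ' => ', $tableId, PHP_EOL;
}
```

With many sites on one account, printing the title next to each table ID makes it easy to pick out the right one by eye.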

Specifically, I changed the dimensions to ga:referralPath and the metrics and sort criteria to ga:pageviews. I changed the filter to ga:source%3D%3Dentrecard.com (the URL-encoded form of ga:source==entrecard.com) so I would only get the results for EntreCard, and I changed the max-results to 500 to get as many sites as possible.
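Here is a rough sketch of how those query parameters could be assembled in PHP. The table ID, the dates, and the function name buildEntreCardQueryUrl are hypothetical placeholders. The feed endpoint and the leading minus sign for a descending sort are my assumptions about the v2 protocol, so check them against the documentation before relying on them. Authentication is not shown.

```php
<?php
// Builds the data feed URL for EntreCard referrals.
// Assumption: the v2 data feed lives at the endpoint below.
function buildEntreCardQueryUrl(string $tableId, string $startDate, string $endDate): string
{
    $params = [
        'ids'         => $tableId,                    // table ID, e.g. ga:1111111 (placeholder)
        'dimensions'  => 'ga:referralPath',
        'metrics'     => 'ga:pageviews',
        'sort'        => '-ga:pageviews',             // assumed: minus sign sorts descending
        'filters'     => 'ga:source==entrecard.com',  // only EntreCard referrals
        'start-date'  => $startDate,                  // set explicitly to avoid the default range
        'end-date'    => $endDate,
        'max-results' => 500,
    ];

    // http_build_query() URL-encodes the values, so == becomes %3D%3D.
    return 'https://www.google.com/analytics/feeds/data?' . http_build_query($params);
}

echo buildEntreCardQueryUrl('ga:1111111', '2010-01-01', '2010-01-31'), PHP_EOL;
```

Passing the start and end dates explicitly avoids accidentally reporting on whatever default time range the API would otherwise apply.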

The resulting data was an XML file that I parsed using PHP’s SimpleXML extension. With that, I then generated the sample data:

Update: This is the data I initially presented in my blog post, but as Glenn points out, 40% of the users have moved on. I hadn't changed the default time range that Google provides.
16120 - 21

24624 - 18

8892 - 18

9654 - 18

23485 - 17

For a more up-to-date list, this is for January 2010:
41616 - 35

9654 - 24

50160 - 21

38558 - 19

5933 - 18

End of Update

This is a list of the EntreCard users who visited my site the most during the period I requested data for. Each entry shows the EntreCard user ID and the number of page views that user produced, and is a link to the EntreCard User Detail page.
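The SimpleXML parsing step that produces a list like this can be sketched roughly as follows. This is not my actual script. The sample XML, the referral paths, the counts, and the function names are hypothetical, and I am assuming that each entry carries one dxp:dimension and one dxp:metric element with name and value attributes, and that the user ID is the first run of digits in the referral path.

```php
<?php
// Hypothetical data feed: paths and counts are made up for illustration.
$dataFeedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dxp="http://schemas.google.com/analytics/2009">
  <entry>
    <dxp:dimension name="ga:referralPath" value="/hypothetical/1001"/>
    <dxp:metric name="ga:pageviews" value="21"/>
  </entry>
  <entry>
    <dxp:dimension name="ga:referralPath" value="/hypothetical/1002"/>
    <dxp:metric name="ga:pageviews" value="18"/>
  </entry>
  <entry>
    <dxp:dimension name="ga:referralPath" value="/hypothetical/1003"/>
    <dxp:metric name="ga:pageviews" value="35"/>
  </entry>
</feed>
XML;

// Returns referral path => page views, highest page views first.
function pageviewsByReferralPath(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        throw new RuntimeException('Could not parse the data feed XML.');
    }

    $rows = [];
    foreach ($feed->entry as $entry) {
        $dxp = $entry->children('http://schemas.google.com/analytics/2009');
        // The attributes themselves are not namespaced, so attributes()
        // is called without arguments to read them.
        $path  = (string) $dxp->dimension->attributes()->value;
        $views = (int) $dxp->metric->attributes()->value;
        $rows[$path] = $views;
    }
    arsort($rows); // sort by page views, descending, keeping the paths as keys
    return $rows;
}

// Formats each row as "userid - pageviews".
function formatDropperLines(array $rows): array
{
    $lines = [];
    foreach ($rows as $path => $views) {
        // Assumption: the user ID is the first run of digits in the path.
        $label = preg_match('/\d+/', $path, $m) ? $m[0] : $path;
        $lines[] = $label . ' - ' . $views;
    }
    return $lines;
}

$rows  = pageviewsByReferralPath($dataFeedXml);
$lines = formatDropperLines($rows);
echo implode(PHP_EOL, $lines), PHP_EOL;
```

Turning each line into a link to the user's detail page would be one more step, but the EntreCard URL format is not something this sketch tries to guess.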

The code still needs a little work, but I’m willing to share it with others who are interested. If you have additional ideas or comments, please let me know.