Although Google lets you easily download some Google Account data, it does not provide a direct way to download your entire Google Web History. I did a bit of research, and although you can access it via an RSS feed 1000 records at a time, obtaining it in its entirety is still difficult and time-consuming. I took it upon myself to develop a simple solution that combines all of the RSS feed pages into a single, downloadable CSV file.
Update: I’ve updated the script and added some enhancements. Find out more in my updates to the Google History download script.
Privacy Info
First of all, let me put your mind at rest by informing you that my solution will not do anything malicious with your Google Account or your Google Web History:
- This solution works entirely within your browser.
- None of your Google Account data or history is sent to or received from any server other than Google’s servers.
- This tool downloads your Google History directly from Google to your computer and your browser.
- The source code for my bookmarklet is freely available to download, review, and modify, under the GPL v3.
I want neither the liability nor the responsibility of handling anyone else’s private Google Account data, so priority number one for me in developing this tool was that it all had to work on the client side, without passing the data through any servers. I know that if I were to use a third-party solution, I would only use one constructed the way I built mine, and I’m sure most other people feel the same way.
If you would prefer not to use this tool and would rather download your Google Web History manually, scroll down to the Technical Details section for information on how to do this.
Usage
You’ll need Flash in order to use this tool. It relies on a library that uses a small Flash movie to convert the downloaded data into a file within the browser (read the technical details for more info). To use it, simply drag and drop this bookmarklet onto your web browser’s bookmark bar:
If you’re using Internet Exploder (I feel sorry for you), you may receive a warning about the bookmarklet loading insecure content. This is because it will download a JavaScript file from geeklad.com, and load it into the current webpage. Just click Yes to proceed with adding the bookmark.
Next, visit your Google Web History. Unless you’ve visited it during your current browser session or have previously set up your browser to remember your computer for accessing your history, you’ll be asked for your Google Account password.
After you log in, click on the Download Google Web History bookmark you just created. If you’re using Internet Explorer or Google Chrome, the browser will warn you that the page has insecure content. Click Load Anyway, which will refresh the page, and then click the bookmark again.
I do not believe Firefox has this security check, and I’m not sure about Safari. In either case, if you do see a security warning, allow the page to reload and then click the bookmark once again. Next, you should see the page darken, and a box appear informing you that your history is being downloaded.
Prepare yourself for a long wait if you have an extensive history. To give you an idea, my history has over 44k searches; my tool downloaded over 135k records, and the CSV file was about 28MB! If you get sick of waiting, you can click the cancel button (you can see what it looks like in my update post; the screenshot below is out-of-date) and download what has been done so far.
Eventually, all of your history will be downloaded and you will be presented with some search statistics and a button to download to a CSV file.
Update: More info on usage is available in the usage section of the latest update on this script to download Google History.
Shared Computers
If you share your computer with others, there is something of which you should be made aware. If you configure the browser to keep you logged into your Google Account, you will see the history generated by anyone using the computer (unless they log you out). When you see a bunch of searches for knitting needles, quilting, and dentures, don’t be alarmed and suspect an octogenarian of having hacked your Google account. It’s only Grandma surfing on your computer while logged into your Google Account.
If you do not even want the temptation of seeing what websites the others around you are visiting and what they are searching (good for you), make sure to always log out of your Google account when you’re done using the computer. Alternatively, you could just make sure to never check the “remember me” checkbox when you log into your Gmail, Google+, Google Docs, Google Reader, or any other Google services. That’s generally good practice anyway, when using a shared computer.
If you do decide to use my tool for evil, such as spying on your significant other or siblings, shame on you! That is not the purpose for which I’ve intended this tool to be used. That being said, I’m sure this tool will see its fair share of abuse.
If you don’t want anyone else using my tool to spy on you, don’t ever give your Google Account password to anyone. In order to access your Google Web History, you will be prompted for your password even if you’re logged into your Google account. This is a good thing, because if anyone wants to see your Google History or use my tool, they would need your Google Account password.
Technical Details
Update: There are some additional technical details on the Google Web History RSS feed in my latest update.
When I was searching for this solution myself, I discovered that there is an RSS feed of your history that can be viewed at https://www.google.com/history/lookup?q=&output=rss&num=1000&start=1. The num parameter indicates how many records you want to view at once, and the start parameter indicates which record you want to start at.
You can download 1000 records at a time. If you wanted to manually download it, you could save the output of these links:
- https://www.google.com/history/lookup?q=&output=rss&num=1000&start=1
- https://www.google.com/history/lookup?q=&output=rss&num=1000&start=1001
- https://www.google.com/history/lookup?q=&output=rss&num=1000&start=2001
- https://www.google.com/history/lookup?q=&output=rss&num=1000&start=3001
- https://www.google.com/history/lookup?q=&output=rss&num=1000&start=4001
- https://www.google.com/history/lookup?q=&output=rss&num=1000&start=5001
- … and so on … until you finally reach a page with no <item> tags in it.
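As a rough illustration of the manual approach above, here is a minimal Python sketch (not the bookmarklet's own JavaScript) that builds the page URLs and checks whether a saved page still contains any <item> tags. The function names are made up for this example, and actually fetching the pages still requires being logged into your Google Account, which this sketch does not attempt.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://www.google.com/history/lookup"

def page_url(start, num=1000):
    """Build the RSS feed URL for the page beginning at record `start`."""
    query = urlencode({"q": "", "output": "rss", "num": num, "start": start})
    return f"{BASE_URL}?{query}"

def page_urls(pages, num=1000):
    """URLs for the first `pages` pages: start=1, 1001, 2001, ..."""
    return [page_url(1 + i * num, num) for i in range(pages)]

def has_items(rss_text):
    """True if the saved RSS page contains at least one <item> element."""
    return ET.fromstring(rss_text).find(".//item") is not None

for url in page_urls(3):
    print(url)
```

You would keep requesting pages until has_items returns False for the page you just saved.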
Then you need to import all of the XML files into a program to combine them into one. You can do this with the latest version of Microsoft Excel. I was initially going to go down this route. However, I decided it would be worthwhile to build a tool to do this automatically, since others may want the same convenience of downloading their entire Google Web History in a single CSV file.
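If you would rather script the combining step than use Excel, something along these lines could work. This is only a sketch in Python, not what the bookmarklet does internally, and it assumes each <item> carries the standard RSS title, link, and pubDate elements; the real feed may include other fields.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed columns: standard RSS 2.0 item elements, not confirmed for this feed.
FIELDS = ["title", "link", "pubDate"]

def rss_to_rows(rss_text):
    """Extract one row per <item> from a saved RSS page."""
    root = ET.fromstring(rss_text)
    return [
        [item.findtext(field, default="") for field in FIELDS]
        for item in root.iter("item")
    ]

def merge_to_csv(rss_texts):
    """Combine several saved RSS pages into a single CSV string."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    for text in rss_texts:
        writer.writerows(rss_to_rows(text))
    return out.getvalue()
```

Reading each saved file into a string and passing the list to merge_to_csv gives you the text to write out as a .csv file.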
Another technical detail I’d like to share is that this tool uses a very nice JavaScript library called Downloadify. Downloadify provides the magic that allows it to generate a CSV file without having to pass your data through a third-party server. It does this through a small Flash program that takes a string as an input, creates the download button, and then creates a dialog box to download the string into a file when you click the download button.
You’re right. It does not bring in everything, only the most recent ~4k records or so. So you would load the feed for every other day, on the assumption that you have fewer than 1000 searches every two days, which is a reasonable assumption to make.
I wish we could devise a way to do it without downloading so much superfluous data. This would be especially good for a browser-based solution, because a browser user doesn’t have the convenience (or the patience) of just kicking off a script and letting it run while they go off and do other things.
If you do find a better way, please let me know and I’ll update the code accordingly. Thanks very much for sharing your experience with this. Perhaps others will chime in and we’ll find a working solution.
The solution just hit me like a ton of bricks. It will require parsing the dates and a good bit of extra code, but here’s what we do:
First, load the first 1000 results as I’m doing now. Then parse out the month, day, and year for the very last result. Load the next 1000 records starting there, parse out the month, day, and year for that result and then continue with the iteration. I’ll work on it and then post an update. Thanks again for your help!
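The date-based iteration described above could be sketched like this in Python (the bookmarklet itself is JavaScript, so this is not its actual code). The month, day, and yr parameter names are the ones reported elsewhere in this thread; the assumption that pubDate uses the usual RSS date format, and the function names, are mine.

```python
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://www.google.com/history/lookup"

def last_item_date(rss_text):
    """Date of the last <item> on a page, or None if the page has no items."""
    items = list(ET.fromstring(rss_text).iter("item"))
    if not items:
        return None
    # Assumes pubDate looks like "Thu, 30 Jun 2011 23:15:00 GMT".
    return parsedate_to_datetime(items[-1].findtext("pubDate")).date()

def url_for_date(day, num=1000):
    """Feed URL for the page that starts at the given date."""
    query = urlencode({
        "q": "", "output": "rss", "num": num,
        "month": day.month, "day": day.day, "yr": day.year,
    })
    return f"{BASE_URL}?{query}"
```

Each round, you would fetch a page, call last_item_date on it, and request url_for_date for that date, stopping when a page comes back with no items.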
That’s great! Really, I just didn’t feel like learning JS to get this done. Indeed, each record has a date, so this would work.
I’ve made this “Firefox-Chrome-History-Search” thing (first real attempt at PHP, see GitHub), and I’ll make it compatible with Google history if I can be sure I’ve actually gotten all my history off Google. My plans are to get my history from Google, delete it from Google, stop using Google altogether, make Firefox/Chrome remember history forever, and use the PHP to keep a permanent searchable record of browser activity.
Ok, I’ve made the updates and it appears to be working quite well. My entire history came out to 28.5MB as a CSV! I also improved the cancellation process by adding a cancel button that would allow for a partial download and resumption. It also handles time-outs and re-login requests gracefully.
Thanks again for the tip on the date parameters. I hadn’t seen those posted anywhere. I’ll work up a blog post describing the updates to the script. If you have a blog you’d like me to link to (and/or another handle you’d like me to reference), let me know so I can give you props in the new blog post.
The usual handle is Naka, no (safe and proper) site/blog so no need to reference.
Thanks a lot for the script, if you want I’ll throw my Google Web History clone your way when I’m done with it.
The below are probably of no use, but it’s what I discovered 4 weeks ago:
- &hl=en
- &month=12&day=1&yr=2010
- &output=rss
- &num=1000
- &max=??????? (unix date stamp ?modified?)
- &start=0
- &st=web (web search)
- &st=img (images)
- &st=news (news)
- &st=frg (products)
- &st=ad (sponsored links)
- &st=vid (videos)
- &st=maps (maps)
- &st=blogs (blogs)
- &st=books (books)
I’ve blogged about the update here:
/updated-script-to-download-google-history
One thing of note that you’ll be interested to learn is that I may have uncovered a bug in the RSS feed. If you’re not careful, you can end up in an infinite loop with consecutive days with a lot of history.
For example, you load the feed starting at July 1 and the last record is June 30. You load the next page starting at June 30, but instead it loads starting at July 1, and you end up in an infinite loop. This is something to watch out for if you develop your own script to pull the history the way I’m doing it.
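If you are writing your own script, one way to guard against this loop is to check that each page actually moves you to an earlier date and to force a step back when it does not. The sketch below is only an assumption about how such a guard could look, not the fix used in my script; last_date_on_page is a hypothetical callable you would supply, the simulated feed stands in for real requests, and the year 2011 is arbitrary. Note that forcing a step back can skip records on a day with more history than fits on one page.

```python
from datetime import date, timedelta

def crawl_days(last_date_on_page, start_day, max_pages=10000):
    """Return the list of days requested while walking the history backwards.

    last_date_on_page(day) is a hypothetical callable: it requests the page
    starting at `day` and returns the date of that page's last record,
    or None when the page has no records.
    """
    requested = []
    day = start_day
    while len(requested) < max_pages:
        requested.append(day)
        last = last_date_on_page(day)
        if last is None:
            break  # empty page: nothing older to fetch
        if last < day:
            day = last  # normal case: continue from the last record's date
        else:
            # No progress (the loop described above): force a step back.
            day = day - timedelta(days=1)
    return requested

# Simulated feed that gets stuck on June 30, as in the example above.
def stuck_feed(day):
    pages = {
        date(2011, 7, 1): date(2011, 6, 30),
        date(2011, 6, 30): date(2011, 6, 30),
        date(2011, 6, 29): date(2011, 6, 28),
    }
    return pages.get(day)

days = crawl_days(stuck_feed, date(2011, 7, 1))
print([d.isoformat() for d in days])
```

The max_pages cap is a second safety net in case the feed misbehaves in some other way.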
I promise I will steal nothing. 🙂
At first, I made a simple Excel VBA loop to save the RSS, but only got 4k or so items saved, not sure why.
Anyway, this bookmarklet works as promised. Got some 20k Google search history records to play with.
Yeah, initially I had the same problem as well. The start parameter doesn’t seem to obtain all the data and only brings back about 4k records. You need to use the date parameters (yr, month, day) to extract everything. Check out my update for more info on these parameters.
When I run the bookmarklet, I get the following message: “Instructions
Please visit https://www.google.com/history/ and log into your Google Account. Then click the bookmarklet again.”
This occurs in Chrome, Firefox and IE, where Flash is installed in all of them. I have tried being logged in to G history before running the bookmarklet, and logging in after the message comes up, to no avail. Any idea how to solve this?
This also occurs using Safari on OS X 10.6.
When it downloads, the publishing date and time are in GMT. Is there a way to get it to change to my time zone?
This doesn’t seem to work anymore. Any ideas why, and how I could get it working again?
It should be easy to fix. Replace the “www” in line 926 of the source code with “history”.
The reason why the script is not working anymore is because Google is redirecting https://google.com/history/ and https://www.google.com/history/ to https://history.google.com/history/, which does not match against the test in the script. Easy to update, I think?
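To illustrate the kind of check involved, here is a small Python sketch of a test that accepts both the old and the redirected addresses. The script's actual test is JavaScript and is not reproduced here, so the function name and the exact set of hosts are assumptions.

```python
from urllib.parse import urlparse

# Hosts mentioned in this thread; the script's real test may differ.
HISTORY_HOSTS = {"google.com", "www.google.com", "history.google.com"}

def is_history_page(url):
    """True if the URL looks like a Google Web History page on a known host."""
    parts = urlparse(url)
    return (
        parts.scheme == "https"
        and parts.hostname in HISTORY_HOSTS
        and parts.path.startswith("/history")
    )
```

Matching on a set of hosts rather than a single hard-coded prefix means a redirect between them would not break the check.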