{"id":24,"date":"2011-11-03T03:45:09","date_gmt":"2011-11-03T03:45:09","guid":{"rendered":"http:\/\/digitaldatatactics.com\/?p=24"},"modified":"2015-08-22T16:06:42","modified_gmt":"2015-08-22T16:06:42","slug":"gapages","status":"publish","type":"post","link":"https:\/\/www.digitaldatatactics.com\/index.php\/2011\/11\/03\/gapages\/","title":{"rendered":"Using Google Analytics Settings to Properly Identify Pages"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">This year, I\u2019ve been involved in many Google Analytics implementations and audits, and there has been a recurring theme around misunderstood GA Configuration Settings, mostly regarding how a page is identified. For instance, one recent client of mine had a 350-page site. But because of missed configuration settings, those 350 pages were showing up as literally 28000 URIs! Can you imagine pulling a report on any given page of that site? So to clear the air and hopefully save some GA users out there from future headaches, here are 3 quick ways to use GA Configuration to properly identify your pages:<\/span><\/p>\n<h3><b>1. 
DEFAULT PAGE DOES NOT MEAN \u201cMY DEFAULTIEST PAGE\u201d<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">The default page setting is used whenever a page URI ends in a trailing slash without specifying a file name. For instance, if you used this setting to specify that \u201cindex.html\u201d is your default file name, <\/span><span style=\"font-weight: 400;\">\u201cexample.com\/\u201d <\/span><span style=\"font-weight: 400;\">and <\/span><span style=\"font-weight: 400;\">\u201cexample.com\/index.html\u201d <\/span><span style=\"font-weight: 400;\">would merge into just <\/span><span style=\"font-weight: 400;\">\u201cexample.com\/index.html\u201d <\/span><span style=\"font-weight: 400;\">in your content report, making analysis of that single page much easier.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Unfortunately, the name of the setting is misleading and tempts people into entering what they consider the \u201cdefault page\u201d of their site: their home page. But if you enter <\/span><span style=\"font-weight: 400;\">\u201chttp:\/\/www.example.com\/index.html\u201d <\/span><span style=\"font-weight: 400;\">as your default page, the real result would be that any page that ends in a trailing slash will have the full home page URL appended to it:<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">www.example.com\/folder\/<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">would become this in the reports:<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">www.example.com\/folder\/http:\/\/www.example.com\/index.html<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">This is obviously not desirable, so please do not put your full home page URL as your \u201cDefault Page\u201d. If you have a site that sometimes uses a file name such as index.html or index.php, then you may want to specify THAT file name as your default page, so all pages with a trailing slash would consistently have it appended to them. Otherwise, leave the setting blank.<\/span><\/p>\n<h3><b>2. 
SO WHAT DO I DO ABOUT THOSE TRAILING SLASHES?<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">The default page setting cannot be used for what most people WANT to use it for: standardizing whether or not a page ends with a trailing slash. If you give in to the temptation to simply put a \u201c\/\u201d in this setting, then \u201c<\/span><span style=\"font-weight: 400;\">folder\/\u201d<\/span><span style=\"font-weight: 400;\"> and \u201c<\/span><span style=\"font-weight: 400;\">folder\u201d<\/span><span style=\"font-weight: 400;\"> wouldn\u2019t merge together as desired. Rather, <\/span><span style=\"font-weight: 400;\">\u201cfolder\/\u201d <\/span><span style=\"font-weight: 400;\">would become <\/span><span style=\"font-weight: 400;\">\u201cfolder\/\/\u201d, <\/span><span style=\"font-weight: 400;\">and <\/span><span style=\"font-weight: 400;\">\u201cfolder\u201d <\/span><span style=\"font-weight: 400;\">would stay the same (remember, the setting only looks at which pages have a trailing slash, then appends the setting value to them).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you would like to have all trailing slashes removed as the standard, so that <\/span><span style=\"font-weight: 400;\">example.com\/folder\/ <\/span><span style=\"font-weight: 400;\">and <\/span><span style=\"font-weight: 400;\">example.com\/folder <\/span><span style=\"font-weight: 400;\">would appear as the same line item in the Content report (and who wouldn\u2019t want that?), you will need to set up a custom filter that removes all trailing slashes:<\/span><\/p>\n<p><a href=\"http:\/\/digitaldatatactics.com\/wp\/wp-content\/uploads\/2015\/08\/GApages.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-25\" src=\"http:\/\/digitaldatatactics.com\/wp\/wp-content\/uploads\/2015\/08\/GApages.jpg\" alt=\"GApages\" width=\"629\" height=\"383\" srcset=\"https:\/\/www.digitaldatatactics.com\/wp\/wp-content\/uploads\/2015\/08\/GApages.jpg 629w, 
https:\/\/www.digitaldatatactics.com\/wp\/wp-content\/uploads\/2015\/08\/GApages-300x183.jpg 300w\" sizes=\"(max-width: 629px) 100vw, 629px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Field A -&gt; Extract A should be set to &#8220;^\/(.*?)\/+$&#8221;<\/span><span style=\"font-weight: 400;\"><br \/>\n<\/span><span style=\"font-weight: 400;\">Output To -&gt; Constructor should be set to &#8220;\/$A1&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Please note, much to my chagrin, such a filter would prevent your profile from being eligible for the not-filter-friendly Real Time Analytics (for now), but I promise this isn\u2019t as big a deal as you might think it is, though I\u2019ll save my reasoning for the unimportance of \u201creal time\u201d analytics for another blog post.<\/span><\/p>\n<h3><b>3. EXCLUDE PARAMETERS!<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Most GA implementations I\u2019ve seen have at least a few query string parameters excluded, but I don\u2019t think I\u2019ve seen anyone get it \u201cjust right\u201d yet (admittedly, my level of nitpickiness may be a tad unrealistic). The problem with not excluding all non-content-identifying parameters is that parameters will cause one page to show up as separate items in the content report. For instance, if I want to report on how many page views <\/span><span style=\"font-weight: 400;\">promotions\/springlanding.html<\/span><span style=\"font-weight: 400;\"> got, I might need to pull the following 3 pages:<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">promotions\/springlanding.html<\/span><\/pre>\n<pre><span style=\"font-weight: 400;\">promotions\/springlanding.html?secured=true<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">AND<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">promotions\/springlanding.html?type=4<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">into my reports, to report on only one piece of content. 
This isn\u2019t the end of the world; using filters in my reports I can usually get the info I need, though it does make trending harder. But it\u2019s such an easy fix!<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To see which query parameters might have escaped your settings, go to your Top Content report and do a search for \u201c?\u201d. If there are a variety of those pernicious params in there, you may want to use an advanced filter to filter them out one at a time, to be sure you\u2019ve got them all. Now you have a handy list of parameters you can take to your configuration settings for exclusion. If you want to track one of the parameters, but not necessarily in your content report, don\u2019t forget you can always use a Profile Filter to extract a query parameter and put it into another field, like a user defined variable, or just to clean up parameters in general.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Be careful to not exclude parameters that actually have importance in identifying content. For instance, a products page may have a <\/span><span style=\"font-weight: 400;\">?sku=12345 <\/span><span style=\"font-weight: 400;\">that specifies which product is being viewed. This is a rather critical piece of information for some types of analysis, and should not be excluded.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Please be aware that users can add whatever parameters they want to your URLs, so you will never have full control here. Tools like Google Translate like to wreak havoc on URIs, but generally account for a very small percentage of page views.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Cleaning up your Content Report is an easy quick win: it doesn\u2019t take a lot of effort and can make analysis much easier. 
For questions about identifying content in Google Analytics or SiteCatalyst, contact me on Twitter: <\/span><a href=\"http:\/\/web.archive.org\/web\/20121104155459\/http:\/\/twitter.com\/#!\/jenn_kunz\"><span style=\"font-weight: 400;\">@Jenn_Kunz<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This year, I\u2019ve been involved in many Google Analytics implementations and audits, and there has been a recurring theme around misunderstood GA Configuration Settings, mostly regarding how a page is identified. For instance, one recent client of mine had a 350-page site. But because of missed configuration settings, those 350 pages were showing up as &#8230; <a title=\"Using Google Analytics Settings to Properly Identify Pages\" class=\"read-more\" href=\"https:\/\/www.digitaldatatactics.com\/index.php\/2011\/11\/03\/gapages\/\" aria-label=\"Read more about Using Google Analytics Settings to Properly Identify Pages\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,17],"tags":[11,10],"_links":{"self":[{"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/posts\/24"}],"collection":[{"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/comments?post=24"}],"version-history":[{"count":1,"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/posts\/24\/revisions"}],"predecessor-version":[{"id":26,"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/posts\/24\/revisions\/
26"}],"wp:attachment":[{"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/media?parent=24"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/categories?post=24"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.digitaldatatactics.com\/index.php\/wp-json\/wp\/v2\/tags?post=24"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}