Wednesday, 2 July 2014

Moz Blog


Panda Pummels Press Release Websites: The Road to Recovery

Posted: 02 Jul 2014 04:00 PM PDT

Posted by russvirante

Many of us in the search industry were caught off guard by the release of Panda 4.0. It had become common knowledge that Panda was now essentially "baked into" the algorithm and refreshed several times a month, so a pronounced refresh was a surprise. While the impact seemed reduced given that it coincided with other releases, including a payday loans update and a potential manual penalty on eBay, there were notable victims of the Panda 4.0 update, among them major press release sites. Both Search Engine Land and Seer Interactive independently verified a profound traffic loss on major press release sites following the Panda 4.0 update. While we can't be certain that Google did not roll out a handful of simultaneous manual actions, or that these sites were not impacted by the payday loans algo update, Panda remains the inference to the best explanation for their traffic losses.

So, what happened? Can we tease out why press release sites were seemingly singled out? Are they really that bad? And why are they particularly susceptible to the Panda algorithm? To answer these questions, we must first address the main question: what is the Panda algorithm?

Briefly: What is the Panda Algorithm?

The Panda algorithm was a ground-breaking shift in Google's methodology for addressing certain search quality issues. Using patented machine learning techniques, Google used real, human reviewers to determine the quality of a sample set of websites. We call this sample the "training set". Examples of the questions they were asked are below:

  1. Would you trust the information presented in this article?
  2. Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
  3. Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
  4. Would you be comfortable giving your credit card information to this site?
  5. Does this article have spelling, stylistic, or factual errors?
  6. Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
  7. Does the article provide original content or information, original reporting, original research, or original analysis?
  8. Does the page provide substantial value when compared to other pages in search results?
  9. How much quality control is done on content?
  10. Does the article describe both sides of a story?
  11. Is the site a recognized authority on its topic?
  12. Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don't get as much attention or care?
  13. Was the article edited well, or does it appear sloppy or hastily produced?
  14. For a health related query, would you trust information from this site?
  15. Would you recognize this site as an authoritative source when mentioned by name?
  16. Does this article provide a complete or comprehensive description of the topic?
  17. Does this article contain insightful analysis or interesting information that is beyond obvious?
  18. Is this the sort of page you'd want to bookmark, share with a friend, or recommend?
  19. Does this article have an excessive amount of ads that distract from or interfere with the main content?
  20. Would you expect to see this article in a printed magazine, encyclopedia or book?
  21. Are the articles short, unsubstantial, or otherwise lacking in helpful specifics?
  22. Are the pages produced with great care and attention to detail vs. less attention to detail?
  23. Would users complain when they see pages from this site?

Once Google had these answers from real users, they built a list of variables that might potentially predict these answers, and applied their machine learning techniques to build a model of predicting low performance on these questions. For example, having an HTTPS version of your site might predict a high performance on the "trust with a credit card" question. This model could then be applied across their index as a whole, filtering out sites that would likely perform poorly on the questionnaire. This filter became known as the Panda algorithm.

How do press release sites perform on these questions?

First, Moz has a great tutorial on running your own Panda questionnaire on your own website, which is useful not just for Panda but really any kind of user survey. The graphs and data in my analysis come from PandaRisk.com, though. Full disclosure, Virante, Inc., the company for which I work, owns PandaRisk. The graphs were built by averaging the results from several pages on each press release site, so they represent a sample of pages from each PR distributor.
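The averaging step described above can be sketched in a few lines. This is only an illustration of the arithmetic, not PandaRisk's actual code: the function name, the site names, and the per-page values are all hypothetical placeholders.

```python
from statistics import mean

def site_scores(page_scores):
    """Average per-page 'yes' fractions into one score per site.

    page_scores maps a site name to a list of per-page fractions
    (share of reviewers who answered "yes" for that page).
    Sites with no sampled pages are skipped.
    """
    return {site: mean(scores) for site, scores in page_scores.items() if scores}

# Hypothetical placeholder data, NOT real survey results.
sample = {
    "example-pr-site.com": [0.40, 0.50, 0.60],
    "example-edu-site.edu": [0.80, 0.90],
}

for site, score in site_scores(sample).items():
    print(f"{site}: {score:.0%}")
```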

So, let's dig in. In the interest of brevity, I have chosen to highlight just four of the major concerns that came from the surveys, question-by-question.

Q1. Does this site contain insightful analysis?

Google wants to send users to web pages that are uniquely useful, not just unique and not just useful. Unfortunately, press release sites uniformly fail on this front. On average, only 50% of reviewers found that BusinessWire.com content contained insightful analysis. Compare this to Wikipedia, EDU and Government websites which, on average, score 84%, 79% and 94% respectively, and you can see why Google might choose not to favor their content.

But does this have to be the case? Of course not. Press release websites like BusinessWire.com have first mover status on important industry information. They should be the first to release insightful analysis. Now, press release sites do have to be careful about editorializing the content of their users, but there are clearly improvements that could be made. For example, we know that use of structured data and visual aids (e.g., graphs and charts) improves performance on this question. BusinessWire could extract stock exchange symbols from press releases and include graphs and data related to the business right in the post. This would separate their content from other press release sites that simply reproduce the content verbatim. There are dozens of other potential improvements that can be added either programmatically or by an editor. So, what exactly would these kinds of changes look like?

In this case, we simply inserted a graph from stock exchange data and included on the right-hand side some data from Freebase on the Securities and Exchange Commission, which could easily be extracted as an entity from the documentation using, for example, Alchemy API. These modest improvements to the page increased the "insightful analysis" review score by 15%. 
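As a rough idea of how the symbol extraction might be automated, here is a minimal sketch that pulls "(NYSE: XXXX)" style mentions out of release text. The regular expression, the short exchange list, and the sample release are assumptions made for illustration; a production version would need a fuller exchange list and validation against a real symbol directory.

```python
import re

# Matches mentions such as "(NYSE: XMPL)" or "NASDAQ:SMPL".
# The exchange list is an illustrative assumption, not exhaustive.
TICKER_RE = re.compile(r"\b(NYSE|NASDAQ|AMEX|OTC)\s*:\s*([A-Z]{1,5})\b")

def extract_tickers(text):
    """Return unique (exchange, symbol) pairs in order of first appearance."""
    seen = []
    for exchange, symbol in TICKER_RE.findall(text):
        if (exchange, symbol) not in seen:
            seen.append((exchange, symbol))
    return seen

# Hypothetical release text with made-up companies and symbols.
release = (
    "Example Corp (NYSE: XMPL) today announced a partnership "
    "with Sample Inc. (NASDAQ: SMPL)."
)
print(extract_tickers(release))
```

Each extracted pair could then be used to look up a chart or company data to embed alongside the release.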

Q2. Would you trust this site with your credit card?

This is one of the most difficult ideals to measure up to. E-Commerce sites, in general, perform better automatically, but there are clear distinctions between sites people trust and don't trust. Press release websites do have an e-commerce component, so one would expect them to fare well compared to non-commercial sites. Unfortunately, this is just not the case. PR.com failed this question in what can only be described as epic fashion. 91% of users said they would not trust the site with their credit card details. This isn't just a Panda issue for PR.com, this is a survival-of-the-business issue.

Luckily, there are some really clear, straightforward solutions to address this problem.

  • Extend HTTPS/SSL Sitewide
    Not every site needs to have HTTPS enabled, but if you have a 600,000+ page site with e-commerce functionality, let's just go ahead and assume you do. Users will immediately trust your site more if they see that pretty little lock icon in their browser. 
  • Site Security Solutions
    Take advantage of solutions like Comodo Hacker Proof or McAfee SiteAdvisor to verify that your site is safe and secure. Include the badges and link to them so that both users and the bots know that you have a safe site.
  • Business Reputation Badges
    Use at least one trade group or business reputation group (like the Better Business Bureau) or, at minimum, employ some form of schema review markup that makes it clear to your users that at least some person or group of persons out there trusts your site. If you use a trade group membership or the BBB, make sure you link to them so that, once again, it is clear to the bots as well as your users.
  • Up-to-date Design
    This is a clear issue time and time again. In the technology world, old means insecure. The site PR.com looks old-fashioned by all measures of the word, especially in comparison to the other press release websites. It is no wonder that it performs so horribly.

It is worth pointing out here that Google doesn't need to find markup on your site to come to the conclusion that your site is untrustworthy. Because the Panda algorithm likely takes into account engagement metrics and behaviors (like pogo sticking), Google can use the behavior of users to predict the performance on these questions. So, even if there isn't a clear path between a change you make on your site and Googlebot's ability to identify that change, that doesn't mean the change cannot and will not have an impact on site performance in the search results. The days of thinking about your users and the bots as separate audiences are gone. The bots now measure both your site and your audience. Your impact on users can and will have an impact on search performance.

Q3. Do you consider this site an authority?

This question is particularly difficult for sites that both don't control the content they create and have a wide variety of content. This places press release websites squarely in the bullseye of the Panda algorithm. How does a website that accepts thousands of press releases on nearly any topic dare claim to be an authority? Well, it generally doesn't, and the numbers bear that out. 75% of respondents wouldn't consider PRNewswire an authority. 

Notice, though, that Wikipedia performs poorly on this metric as well (at least compared to EDUs and GOVs). So what exactly is going on here? How can a press release site hope to escape from this authority vacuum? 

  • Topically Segment Content
    This was one of the very first reactions to Panda. Many of the sites that were hit with Panda 1.0 sub-domained their content into particular topic areas. This seemed to provide some relief but was never a complete or permanent solution. Whether you segment your content into sub-directories or sub-domains, what you are really doing here is helping make clear to your users that the specific content your users are reading is part of a bigger piece of the pie. It isn't some random page on your site, it fits in nicely with your website's stated aims. 
  • Create an Authority
    Just because you don't write the content for your site doesn't mean you can't be authoritative. In fact, most major press release websites have some degree of editorial oversight sitting between the author and the website. That editorial layer needs to be bolstered and exposed to the end user, making it obvious that the website does more than simply regurgitate the writing of anyone with a few bucks. 

So, what exactly would this look like? Let's return to the BusinessWire press release we were looking at earlier. We started with a bland page composed of almost nothing but the press release. We then added a graph and some structured data automagically. Now, we want to add in some editor creds and topic segmentation.

Notice in the new design that we have created the "Securities & Investment Division", added an editor with a fancy title "Business Desk Editor" and a credentialed by-line. You could even use authorship publisher markup. The page no longer looks like a sparse press release but an editorially managed piece of news content in a news division dedicated to this subject matter. Authority done.

Q4. Would you consider bookmarking/sharing this site?

When I look at this question, I am baffled. Seriously, how do you make a site in which you don't control the content worth bookmarking or sharing? Furthermore, how do you do this with overtly commercial, boring content like press releases? As you could imagine, press release sites fare quite poorly on this. Over 85% of respondents said they weren't interested at all in bookmarking or sharing content from PRWeb.com. And why should they?

So, how exactly does a press release website encourage users to share? The most common recommendations are already in place on PRWeb. They are quite overt with the usage of social sharing and bookmarking buttons (placed right at the top of the content). Their content is constantly fresh because new press releases come out every day. If these techniques aren't working, then what will?

The problem with bookmarking and sharing on press release websites is two-fold. First, the content is overtly commercial so users don't want to share it unless the press release is about something truly interesting. Secondly, the content is ephemeral so users don't want to return to it. We have to solve both of these problems.

Unfortunately, I think the answer to this question is some tough medicine for press release websites. The solution is multi-faceted. It starts with putting a meta expires tag on press releases. Sorry, but there is no reason for PRWeb to maintain a 2009 press release about a business competition in the search results. In its place, though, should be company and/or categorical pages which thoughtfully index and organize archived content. While LumaDerm may lose their press release from 2009, they would instead have a page on the site dedicated to their press releases so that the content is still accessible, albeit one click away, and the search engines know to ignore it. With this solution, the pages that end up ranking in the long run for valuable words and phrases are the aggregate pages that truly do offer authoritative information on what is up-and-coming with the business. The page is sticky because it is updated as often as the business releases new information. You still get some of the shares out of new releases, but you don't risk the problems of PR sprawl and crawl prioritization. Aside from the initial bump of fresh content, there is no good SEO reason to keep old press releases in the index.
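One way to read "meta expires tag" is Google's `unavailable_after` robots meta directive, which asks the search engine to stop showing a page after a given date. Treating the two as equivalent is an assumption on my part, as are the function name and the one-year retention window in this sketch.

```python
from datetime import date, timedelta

def expiry_meta_tag(published, keep_days=365):
    """Build a robots meta tag asking search engines to drop the page
    from results once `keep_days` have passed since publication.

    The 365-day default is an arbitrary illustrative choice.
    """
    cutoff = published + timedelta(days=keep_days)
    return f'<meta name="robots" content="unavailable_after: {cutoff.isoformat()}">'

# A release published on 1 March 2009 would carry this tag in its <head>:
print(expiry_meta_tag(date(2009, 3, 1)))
```

The company or category archive pages would not carry this tag, so they remain the long-lived, indexable entry points.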

So, I don't own a press release site...

Most of us don't run sites with thousands of pages of low quality content. But that doesn't mean we shouldn't be cognizant of Panda. Of all of Google's search updates, Panda is the one I respect the most. I respect it because it is an honest attempt to measure quality. It doesn't ask how you got to your current position in the search results (a classic genetic fallacy problem), it simply asks whether the page and site itself deserve that ranking based on human quality measures (as imperfect as it may be at doing so). Most importantly, even if Google didn't exist at all, you should aspire to have a website that scores well on all of these metrics. Having a site that performs well on the Panda questions means more than insulation from a particular algorithm update, it means having a site that performs well for your users. That is a site you want to have.

Take a look again at the questionnaire. Does your site honestly meet these standards? Ask someone unbiased. If your site does, then congratulations - you have an amazing site. But if not, it is time to get to work building the site that you were meant to build.


Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!

Stop Worrying About the New Google Maps; These URL Parameters Are Gold

Posted: 01 Jul 2014 05:15 PM PDT

Posted by David-Mihm

I suspect I'm not alone in saying: I've never been a fan of the New Google Maps.

In the interstitial weeks between that tweet and today, Google has made some noticeable improvements. But the user experience still lags in many ways relative to the classic version (chief among them: speed).

Google's invested so heavily in this product, though, that there's no turning back at this point. We as marketers need to come to terms with a product that will drive an increasing number of search results in the future.

Somewhat inspired by this excellent Pete Wailes post from many years ago, I set out last week to explore Google Maps with a fresh set of eyes and an open mind to see what I could discover about how it renders local business results. Below is what I discovered.

Basic URL structure

New Google Maps uses a novel URL structure (novel for me, anyway) that is not based around the traditional ? and & parameters of Classic Google Maps, but instead uses /'s and something called hashbangs to tell the browser what to render.

The easiest way to describe the structure is to illustrate it:

There are also some additional useful hashbang parameters relating to local queries that I'll describe in further detail below.

Some actual feature improvements

Despite the performance issues, New Google Maps has introduced at least two useful URL modifiers I've grown to love.

/am=t

This generates a stack-ranked list of businesses in a given area that Google deems relevant for the keyword you're searching. It's basically the equivalent of the list on the lefthand panel in Classic Google Maps but much easier to get to via direct URL. Important: am=t must always be placed after /search and before the hashbang modifiers, or else the results will break.

by:experts

This feature shows you businesses that have been reviewed by Google+ experts (the equivalent of what we've long-called "power reviewers" or "authority reviewers" on my annual Local Search Ranking Factors survey). To my knowledge it's the first time Google has publicly revealed who these power users are, opening up the possibility of an interesting future study correlating PlaceRank with the presence, valence, and volume of these reviews. In order to see these power reviewers, it seems like you have to be signed into a Google+ account, but perhaps others have found a way around this requirement.

Combining these two parameters yields incredibly useful results like these, which could form the basis for an influencer-targeting campaign:

Above: a screenshot of the results for: https://www.google.com/maps/search/grocery+stores+by:experts/@45.5424364,-122.654422,11z/am=t/
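If you want to generate these URLs in bulk, the structure can be assembled programmatically. The sketch below only mirrors the URL patterns shown in this post (query, optional @lat,lng,zoom viewport, /am=t, optional data= string); the function and parameter names are my own, and Google may change or drop any of these parameters at any time.

```python
from urllib.parse import quote_plus

def maps_search_url(query, lat=None, lng=None, zoom=None, list_view=True, data=""):
    """Assemble a New Google Maps search URL in the order used in this post:
    /search/<query>/[@lat,lng,zoomz/][am=t/][data=...]
    """
    # Keep ":" unescaped so modifiers like "by:experts" survive intact.
    parts = ["https://www.google.com/maps/search", quote_plus(query, safe=":")]
    if lat is not None and lng is not None and zoom is not None:
        parts.append(f"@{lat},{lng},{zoom}z")
    if list_view:
        # am=t must come after /search and before the data= modifiers.
        parts.append("am=t")
    url = "/".join(parts) + "/"
    if data:
        url += f"data={data}"
    return url

print(maps_search_url("grocery stores by:experts", 45.5424364, -122.654422, 11))
```

Passing the coordinates as strings instead of floats avoids any surprises with how long decimals are printed.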

Local pack results and the vacuum left by tbm=plcs

Earlier this week, Steve Morgan noticed that Google crippled the ability to render place-based results from a Google search (ex: google.com/search?q=realtors&tbm=plcs). Many local rank-trackers were based on the results of these queries.

Finding a replacement for this parameter in New Google Maps turns out to be a little more difficult than it would first appear. You'll note in the summary of URL structure above that each URL comes with a custom-baked centroid. But local pack results on a traditional Google SERP each have their own predefined viewport -- i.e. the width, height, and zoom level that most closely captures the location of each listing in the pack, making it difficult to determine the appropriate zoom level.

Above: the primary SERP viewport for 'realtors' with location set to Seattle, WA.

Note that if you click that link of "Map for realtors" today, and then add the /am=t parameter to the resulting URL, you tend to get a different order of results than what appears in the pack.

I'm not entirely sure as to why the order changes--one theory is that Google is now back to blending pack results (using both organic and maps algorithms). Another theory is that the aspect ratio on the viewport on the /am=t window is invariably square, which yields a different set of relevant results than the "widescreen" viewport on the primary SERP.

One thing I have found helps with replicability is to leave the @lat,lng,zoom parameters out of the URL, and let Google automatically generate them for you.

Here are a couple of variations that I encourage you to try:

https://www.google.com/maps/search/realtors/am=t/data=
followed by:
!3m1!4b1!1srealtors!2sSeattle,+WA!3s0x5490102c93e83355:0x102565466944d59a
or
!3m1!4b1!4m5!2m4!3m3!1srealtors!2sSeattle,+WA!3s0x5490102c93e83355:0x102565466944d59a

Take a closer look at those trailing parameters and you'll see a structure that looks like this:

The long string starting with 0x and ending with 9a is the Feature ID of the centroid of the area in which you're searching (in this case, Seattle). Incidentally, this feature ID is also rendered by Google Mapmaker using a URL similar to http://www.google.com/mapmaker?gw=39&fid={your_fid}.

This is the easy part. You can find this string by typing the URL:

https://www.google.com/maps/place/seattle,+WA

waiting for the browser to refresh, and then copying it from the end of the resulting URL.
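Copying the Feature ID by hand gets tedious across many cities, so here is a small sketch that pulls it out of a pasted URL and builds the matching Mapmaker link. The regular expression is an assumption based solely on the shape of the Seattle example above (two 0x-prefixed hex values joined by a colon), and the function names are hypothetical.

```python
import re

# Two hex values joined by a colon, e.g. 0x5490...:0x1025...
FID_RE = re.compile(r"0x[0-9a-f]+:0x[0-9a-f]+", re.IGNORECASE)

def extract_feature_id(url):
    """Return the first Feature ID found in a Maps URL, or None."""
    match = FID_RE.search(url)
    return match.group(0) if match else None

def mapmaker_url(fid):
    """Build the Mapmaker URL pattern mentioned in the post."""
    return f"http://www.google.com/mapmaker?gw=39&fid={fid}"

data = "!3m1!4b1!1srealtors!2sSeattle,+WA!3s0x5490102c93e83355:0x102565466944d59a"
fid = extract_feature_id(data)
print(fid)
print(mapmaker_url(fid))
```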

The hard part is figuring out which hashbang combo will generate which order of results, and I still haven't been able to do it. I'm hoping that by publishing this half-complete research, some enterprising Moz reader might be able to complete the puzzle! And there's also the strong possibility that this theory is completely off base.

In my research thus far, the shorter hashbang combination (!3m1!4b1) seems to yield the closest results to what tbm=plcs used to render, but they aren't 100% identical.

The longer hashbang combination (!3m1!4b1!4m5!2m4!3m3) actually seems to predictably return the same set of results as a Local search on Google Plus -- and note the appearance of the pushpin icon next to the keyword when you add this longer combination:

Who's #1?

Many of us in the SEO community, even before the advent of (not provided), encouraged marketers and business owners to stop obsessing about individual rankings and start looking at visibility in a broader sense. Desperately scrambling for a #1 ranking on a particular keyword has long been a foolish waste of resources.

Google's desktop innovations in local search add additional ammunition to this argument. Heat map studies have shown that the first carousel result is far from dominant, and that a compelling Google+ profile photo can perform incredibly well even as far down as the "sixth or seventh" (left to right) spot. Ranking #1 in the carousel doesn't provide quite the same visual benefit as ranking #1 in an organic SERP or 7-pack.

The elimination of the lefthand list pane on New Google Maps makes an even stronger case. It's literally impossible to rank these businesses visually no matter how hard you stare at the map:

Mobile, mobile, mobile

Paradoxically, though, just as Google is moving away from ranked results on the desktop, my view is that higher rankings matter more than ever in mobile search. And as mobile and wearables continue to gain market share relative to desktop, that trend is likely to increase.

The increasing ubiquity of Knowledge Panels in search results the past couple of years has been far from subtle. Google is now not only attempting to organize the world's information, but condense each piece of it into a display that will fit on a Google Glass (or Google Watch, or certainly a Google Android phone).

Nowhere is the need to be #1 more dramatic than in the Google Maps app, in which users perform an untold number of searches each month. List view is completely hidden (I didn't even know it existed until this week) and an average user is just as likely to think the first result is the only one for them as they are to figure out they need to swipe right to view more businesses.

Above: a Google Maps app result for 'golf courses', in which the first result has a big-time advantage.

The other issue that mobile results really bring to the fore is that the user is becoming the centroid.

This is true even when searching from the desktop. I performed some searches one morning from a neighborhood coffee shop with wifi, and a few minutes later from my house six blocks away. To my surprise, I got completely different results. From my house, Google was apparently only able to detect that I was somewhere in "Portland." But from the coffee shop, it was able to detect my location at a much more granular level (presumably due to the coffee shop's wifi?), and showed me results specific to my ZIP code, with the centroid placed at the center of that ZIP. And the zoom setting for both adjusted automatically--the more granular ZIP code targeting defaulted to a zoom level of 15z or 16z, versus 11z to 13z from my home, where Google wasn't as sure of my location.

Note, too, that I was unable to be exact about the zoom level in the previous paragraph. That's because the centroid is category-dependent. It likely always has been category dependent but that fact is much more noticeable in New Google Maps.

Maps app visibility

Taking both of these into account, in terms of replicating Google Maps App visibility, here is a case where specifying @lat,lng,zoom (with the zoom set to 17z) can be incredibly useful.

As an example, I performed the search below from my iPhone at the hotel I was staying at in Little Italy after a recent SEM SD event, and was able to replicate the results with this URL string on desktop:

http://google.com/maps/search/lawyers/@32.723278,-117.168528,17z/am=t/data=!3m1!4b1
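When comparing results across locations, it helps to read the centroid and zoom back out of a URL. This is a minimal sketch assuming the @lat,lng,zoom segment always looks like the one above (decimal coordinates and an integer zoom followed by "z"); the function name is hypothetical.

```python
import re

# @<lat>,<lng>,<zoom>z, e.g. @32.723278,-117.168528,17z
VIEWPORT_RE = re.compile(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?),(\d+)z")

def parse_viewport(url):
    """Return (lat, lng, zoom) from a Maps URL, or None if no viewport is present."""
    match = VIEWPORT_RE.search(url)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2)), int(match.group(3))

url = "http://google.com/maps/search/lawyers/@32.723278,-117.168528,17z/am=t/data=!3m1!4b1"
print(parse_viewport(url))
```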

Conclusions and recommendations

While I still feel the user experience of New Google Maps is subpar, as a marketer I found myself developing a very Strangelovian mindset over the past week or so -- I have actually learned to stop worrying and love the new Google Maps. There are some incredibly useful new URL parameters that allow for a far more complete picture of local search visibility than the classic Google Maps provided.

With this column, I wanted to at least present a first stab to the Moz community to hopefully build on and experiment with. But this is clearly an area that is ripe for more research, particularly with an eye towards finding a complete replacement for the old tbm=plcs parameter.

As mobile usage continues to skyrocket, identifying the opportunities in your (or your client's) competitive set using the new Google Maps will only become more important.

