Until recently I never paid much attention to the SEO side of the DNN Dotnetnukes website.
Of course the site was submitted to the major search engines, and the metatags "Description" and "Keywords" were set through [admin/site settings] and more or less repeated at page level. For no particular reason I decided to look a little further into the Google Webmaster Tools. Things grew from there, and the digging resulted in a short study on SEO.
Although it's far from complete and I certainly won't pretend to have mastered all SEO tricks, here's what I came up with for the Dotnetnukes website.
Subjects:
- metatags
- robots.txt file
- sitemap
- search engine results

DNN files involved:
- default.aspx
- default.aspx.vb

DNN settings involved:
- host/ host settings
- admin/ site settings
Cookbook instructions to improve Search Engine ranking and Results
1. default.aspx --> remove or comment out: [meta content="INDEX, FOLLOW" name="ROBOTS" /]
Explanation:
- the purpose of this tag is to tell webcrawlers and bots whether to index a page and whether to follow its links to the next (linked) page(s);
- there is no reason to put that metatag in there, because the default action of crawlers/webbots already is to index and follow;
- only if you do not want the webbots to index/follow a page do you need the tag, with the value [noindex, nofollow]; otherwise don't bother;
- more important: the tag got in the way of showing the robots.txt file on the Google Webmaster tools robots.txt analyser (see 6. below) and made it impossible to test links against the entries in the robots.txt file. Removing the Tag solved that problem.
Note:
- default.aspx and the accompanying default.aspx.vb file determine which metatags are added to every single page you create in DNN
- some of the metatags are hardcoded in the default.aspx file, some are generated through the default.aspx.vb "code behind" file
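To see which metatags actually end up in a rendered page, a small script can help. The following is only a minimal sketch in Python using the standard library; the class name MetaTagCollector, the helper collect_meta and the sample page source are my own illustrations, not part of DNN.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects name/content pairs of the <meta> tags found in a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            self.meta[name.lower()] = attributes.get("content") or ""


def collect_meta(html):
    """Return a dict of lowercased meta names to their content values."""
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.meta


# Hypothetical page source, loosely modelled on what default.aspx renders.
sample = """
<html><head>
<meta content="INDEX, FOLLOW" name="ROBOTS" />
<meta name="DESCRIPTION" content="Example description" />
</head><body></body></html>
"""

tags = collect_meta(sample)
print(tags.get("robots"))  # None would mean the ROBOTS tag is gone
```

Running this against the saved source of a page before and after editing default.aspx shows whether the ROBOTS tag was really removed.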
2. add robots.txt including a reference to sitemap.xml --> not to default Sitemap.aspx
Explanation:
- so why all this trouble of removing the robots metatag and then adding a robots.txt file?
- when webcrawlers/bots visit your site they look for a file containing instructions, i.e. the robots.txt file
- through this file you can tell the bots which pages and/or directories of your website they may or may not spider and index
- results of a good robots.txt file:
  * unwanted/disallowed pages do not show up in search engines/indexes
  * less webcrawler traffic (bandwidth used) to your site when non-relevant directories/pages are disallowed
  * fewer redundant or even duplicate urls in the search engine results and therefore better/higher ranking
Notes:
- every possible url in your site will be indexed, e.g.:
  * /About/FryslanWebServices/tabid/102/ctl/Privacy/language/en-US/Default.aspx
  * /About/FryslanWebServices/tabid/102/ctl/Privacy/language/nl-NL/Default.aspx
  * /About/FryslanWebServices/tabid/102/language/en-US/Default.aspx
  * /About/FryslanWebServices/tabid/102/language/nl-NL/Default.aspx
  * /About/FryslanWebServices/tabid/102/ctl/Terms/language/en-US/Default.aspx
  etc.
- be careful about listing the entire directory structure of your site: the robots.txt file is easily found by hackers as well
- for more info on how to create/write a robots.txt file see:
  * http://www.robotstxt.org/
  * http://www.thesitewizard.com/archive/robotstxt.shtml
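Before uploading a robots.txt file it can be handy to test a few urls against it locally. Here is a minimal sketch using Python's standard urllib.robotparser module; the rules in ROBOTS_TXT and the helper name is_allowed are illustrative assumptions, not the actual file of the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /About/FryslanWebServices/tabid/102/ctl/Privacy/
"""


def is_allowed(robots_txt, path, agent="*"):
    """Return True if the given agent may fetch path under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


for path in (
    "/About/FryslanWebServices/tabid/102/language/en-US/Default.aspx",
    "/About/FryslanWebServices/tabid/102/ctl/Privacy/language/en-US/Default.aspx",
    "/admin/",
):
    verdict = "allowed" if is_allowed(ROBOTS_TXT, path) else "disallowed"
    print(path, "->", verdict)
```

A Disallow rule is a path prefix, so the single Privacy rule in this sketch blocks both the en-US and the nl-NL variant of that url.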
3. include a reference in the robots.txt file to a sitemap
Explanation:
- Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
- Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
- the best way to point webcrawlers to the sitemap is to make a reference (link) to it in the robots.txt file
- Example: Sitemap: http://www.dotnetnukes.com/sitemap.xml
- For more info: http://www.sitemaps.org/index.php
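To double check that a robots.txt file really exposes the sitemap reference, the same standard library parser can read the Sitemap line back. This is a small sketch that assumes Python 3.8 or later (where RobotFileParser.site_maps() is available); the robots.txt content and the helper name sitemap_urls are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content using the Sitemap line from the example.
ROBOTS_WITH_SITEMAP = """\
User-agent: *
Disallow: /admin/

Sitemap: http://www.dotnetnukes.com/sitemap.xml
"""


def sitemap_urls(robots_txt):
    """Return the list of Sitemap urls declared in a robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemap_urls(ROBOTS_WITH_SITEMAP))
```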
DNN Sitemap:
- DNN (v.4.8.4) automatically creates a sitemap based on the pages (Tabs) you added to your site.
- The sitemap file can be found in the root of the site and is called SiteMap.aspx
- The (vb) codebehind file responsible for generating the sitemap is called SiteMap.aspx.vb
- The default DNN sitemap does not meet all requirements for a proper search engine registration, i.e. many links/urls that you may still want listed in the search engine indexes are missing.
- As a solution it's best to create your own sitemap (manually or automatically)
- For automatic sitemap creation: http://www.xml-sitemaps.com/index.php
- Next make any manual adjustments/additions you need to match your site and wishes.
- Once the new sitemap.xml is created, upload it to your website's directory and make a reference to it in the robots.txt file (see above)
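If you would rather script your own sitemap than use an online generator, the format is simple enough to produce by hand. Below is a minimal sketch in Python (3.8 or later) that follows the sitemaps.org protocol; the function name build_sitemap, the list of pages and the metadata values are assumptions for illustration, not the real page list of the site.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    entries is an iterable of dicts with a 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in entry:
                ET.SubElement(url, field).text = str(entry[field])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages; replace with the urls you want indexed.
pages = [
    {
        "loc": "http://www.dotnetnukes.com/About/FryslanWebServices/tabid/102/language/en-US/Default.aspx",
        "changefreq": "monthly",
        "priority": 0.8,
    },
    {"loc": "http://www.dotnetnukes.com/Members/Blog/tabid/94/language/en-US/Default.aspx"},
]

xml_bytes = build_sitemap(pages)
with open("sitemap.xml", "wb") as handle:
    handle.write(xml_bytes)
```

The resulting sitemap.xml can then be adjusted by hand and uploaded like any other generated sitemap.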
4. Disallow "duplicate pageversions" of terms, privacy, login, register
- no matter what page you are on, you can click the link for Terms, Privacy, Login or Register
- depending on the page you are on, these links produce different urls while they all show the same information
- search engines pick up all these different urls and notice the similar content
- this will have a negative effect on your ranking (duplicate content)
- to avoid these redundant urls add them to the robots.txt file as "disallowed"
- examples:
  * Disallow: /Members/Contact/tabid/90/ctl/Privacy/language/en-US/Default.aspx
  * Disallow: /Members/Blog/tabid/94/ctl/Privacy/language/en-US/Default.aspx
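Typing these rules by hand for every page gets tedious, so they can also be generated. The sketch below is one possible approach in Python and rests on two assumptions of mine: that the Login and Register links use the same /ctl/Name/ url pattern as Terms and Privacy, and that a prefix rule ending at the ctl segment is acceptable instead of one full url per language. The function name duplicate_disallow_rules and the page list are illustrative.

```python
def duplicate_disallow_rules(page_paths,
                             controls=("Terms", "Privacy", "Login", "Register")):
    """Build Disallow lines for the ctl/<control> variants of each page path.

    Each rule is a path prefix, so it covers every language variant
    (language/en-US, language/nl-NL, ...) of that control url.
    """
    rules = []
    for path in page_paths:
        base = path.rstrip("/")
        for control in controls:
            rules.append(f"Disallow: {base}/ctl/{control}/")
    return rules


# Page paths up to and including the tabid, taken from the examples above.
pages = ["/Members/Contact/tabid/90", "/Members/Blog/tabid/94"]

print("User-agent: *")
print("\n".join(duplicate_disallow_rules(pages)))
```

The printed lines can be pasted into robots.txt and then checked with the robots.txt analyser in Google Webmaster Tools.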
5. Show copyright DNN --> remove skin object --> Generator Metatag Shown
- the metatag [meta id="MetaGenerator" name="GENERATOR" content="DotNetNuke" /] does not show up in the source of your pages if you don't have the DNN copyright setting turned on through [host/ host settings/ appearance/Show copyright credits]
- however if you turn on the Copyright Credits a row is added to the bottom of every page of your site and three copyright lines are added to the [head] section of the webpages
- to solve this you can remove or comment out the [dnn:DOTNETNUKE runat="server" id="dnnDOTNETNUKE" /] object in the website skins
- next change the Default.aspx.vb file and change the following:
6. DNN and the Google Webmaster Tools
- DNN versions 4.8.* offer the option to make use of several Google Webmaster Tools through the DNN portal
- Go to [admin/ site settings]:
  a. submit to search engines
  b. submit sitemap
  c. verification of site ownership
- Option b. is nothing more than a redirect to the Google site for registration to make use of the Google Webmaster Tools services
- A faster and more customizable way is to register at Google, create an account, go to http://www.google.com/webmasters/ and log in to Webmaster Tools
- you can now make use of the following services:
  * Add sitemap (the customised one, see item 3), get status info on your sitemap
  * Analyze robots.txt --> most useful to test the effect of your robots file
  * Generate robots.txt --> just go to http://www.xml-sitemaps.com/index.php
  * Manage site verification --> mandatory if you want to make use of the tools/services
And more, like:
  * Set crawl rate
  * Set geographic target
  * Set preferred domain
  * Enhanced image search
  * Remove URLs
More Links:
- http://www.ifinity.com.au/Blog/Technical_Blog/EntryID/40/