We all know that search engines can discover pages on a site through its internal (and external) linking structure. Nonetheless, the most comprehensive and accessible way of providing our site structure to search engines, by a long shot, is a valid XML sitemap that is submitted to a search engine’s Webmaster Tools (for example, Google Webmaster Tools or Bing Webmaster Tools).
Naturally, the development team behind Magento understood that manually creating an XML sitemap from constantly changing page, product, and category URLs would be an impractical task. Therefore, they built their own XML Sitemap Generator.
In order to create our XML sitemap, we must first configure its contents. Navigate to System > Configuration > Google Sitemap and configure Frequency and Priority for our main page types. We can also configure how frequently the sitemap is generated.
Depending on our Magento store setup and content, we may decide that our categories are the most important pages. For Magento, category pages are gold, as they are the pages best optimized for broad search terms, and we want search engines to find and index them first. Our next most important pages would be our individual product pages; we want those to appear in search engines for customers searching specifically for our product names. This is typically more important for companies that develop and market their own products or resell brands such as Air Jordan or Wilson Footballs. It might not be as important to companies that white label clothing with unusual names like “Whispy Outside Tori Tank”, because people will use broad search terms like “pink tank top” instead. The page type with the lowest priority would normally be our CMS pages.
As mentioned previously, the home page in Magento is classified as a CMS page; consequently, based on our specifications, it will receive a lower priority. In addition, the home page URL will be set as http://www.mydomain.com/home, which is not how we want our home page to appear.
The priority is simply a value between 0 and 1 that is passed to Google as a hint about which pages to prioritize when indexing; Google then (supposedly) takes it into account programmatically.
Based upon our chosen SEO campaign, we would set the priority higher for those pages we are optimizing. Therefore, if we are optimizing our categories and products more than our CMS pages (recommended), we would set the priorities as follows:
- Within Categories Options, set Frequency to Daily and Priority to 1.
- Within Products Options, set Frequency to Daily and Priority to 0.8 (or any value below 1 and above the value we are about to set for CMS pages).
- Within CMS Pages Options, set Frequency to Weekly and Priority to 0.25.
- Within Generation Settings, set Enabled to Yes, Start Time to 01 00 00 (01:00 a.m., or another time when your traffic is at its lowest), and Frequency to Daily, and enter your e-mail address into the Error Email Recipient field.
- Click on Save Config.
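To make the effect of these settings concrete, here is a minimal Python sketch of how frequency and priority values end up as entries in a sitemap file. It is not Magento's own generator, only an illustration of the sitemap format: the function name `build_sitemap`, the `PAGE_TYPE_SETTINGS` table, and the example URLs are assumptions made up for this example, and real sitemaps often carry additional fields that are left out here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Frequency and priority per page type, mirroring the values chosen above.
# The dictionary keys are hypothetical labels, not Magento identifiers.
PAGE_TYPE_SETTINGS = {
    "category": ("daily", "1.0"),
    "product": ("daily", "0.8"),
    "cms": ("weekly", "0.25"),
}


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, page_type) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, page_type in pages:
        changefreq, priority = PAGE_TYPE_SETTINGS[page_type]
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = priority
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Example URLs are placeholders for illustration only.
    print(build_sitemap([
        ("http://www.mydomain.com/shoes.html", "category"),
        ("http://www.mydomain.com/shoes/runner.html", "product"),
        ("http://www.mydomain.com/about-us", "cms"),
    ]))
```

Each page type simply maps to one pair of values, which is why changing a single setting in the admin affects every URL of that type.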
In the end, the Google Sitemap configuration screen should show the values listed above.
For these generation settings to produce our sitemap automatically, the Magento cron must be enabled. A quick tutorial on how to do this can be found here: Magento Cron
Log in to your Magento admin interface and go to System > Configuration > System. Then, using the ‘Cron (Scheduled Tasks)’ tab, enter the following information:
Generate schedules every: 1
Schedule ahead for: 1
Missed if not run within: 30
History cleanup every: 120
Success history lifetime: 120
Failure history lifetime: 120
You can then set cron jobs to run at virtually any interval. I have mine set to 15 minutes for most of my clients due to their use of custom modules related to scheduled tasks.
Our next step is to make sure that we have an XML sitemap that will be updated based on these settings. To do this, we need to first create one as follows:
- Navigate to Catalog > Google Sitemap and click on Add Sitemap.
- For Filename, enter sitemap.xml.
- For Path, we can specify a path, but we would usually place an XML sitemap on the root of our website (enter /).
- If we have multiple store views, we can enter a specific sitemap for each Store View (in which case we would change our filename to suit the convention, for example, sitemap_en.xml for English).
- Click on Save & Generate.
This should generate an XML sitemap in our chosen path with our chosen filename. We can test this by visiting our path/filename in the URL, for example, http://www.mydomain.com/sitemap.xml.
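Beyond opening the file in a browser, the generated sitemap can also be inspected programmatically. The following is a small Python sketch, assuming the file follows the standard sitemaps.org schema; the function name `summarize_sitemap` and the sample content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def summarize_sitemap(xml_text):
    """Return a list of (loc, priority) pairs found in a sitemap document."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS)
        priority = url.findtext("sm:priority", default="", namespaces=NS)
        entries.append((loc.strip(), priority.strip()))
    return entries


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the contents of sitemap.xml.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.mydomain.com/shoes.html</loc>"
        "<priority>1.0</priority></url>"
        "</urlset>"
    )
    for loc, priority in summarize_sitemap(sample):
        print(loc, priority)
```

An empty result for a file that should contain URLs is a quick sign that generation did not run as expected.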
If an error message appears informing you that the specified directory is not writable, make sure that the folder specified under Path has sufficient permissions to allow the system to write a file, usually 775 or, failing that, 777. Be careful not to set 777 on folders that hold sensitive information, especially if people outside your own direct team have access to the server. If a folder is marked 777, any user on the server can update and overwrite its contents.
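To see what a folder's permissions currently are before changing them, a short Python check along these lines can help. It is a sketch for a Unix-like server; the function name `describe_permissions` is an assumption for this example, and the path you pass in would be whatever folder you entered under Path.

```python
import os
import stat


def describe_permissions(path):
    """Report the permission bits of a path and who is allowed to write to it."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return {
        "octal": format(mode, "03o"),                  # e.g. "775"
        "group_writable": bool(mode & stat.S_IWGRP),   # middle digit includes write
        "world_writable": bool(mode & stat.S_IWOTH),   # last digit includes write
    }


if __name__ == "__main__":
    # "." is a placeholder; point this at the sitemap folder instead.
    print(describe_permissions("."))
```

A folder reporting `world_writable` as true is in the 777-style state that the warning above is about.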
Once we have confirmed that our XML sitemap is set up and working correctly, we now need to make sure that it has been submitted to our chosen search engine—Google.
There are two ways to do this, but for safety’s sake, we would usually perform both:
- Open the robots.txt file and add in Sitemap: http://www.mydomain.com/sitemap.xml.
- Log in to Google Webmaster Tools (www.google.com/webmasters/tools/), click on our website (or add our site if we need to create one), and then, within Crawl, click on Sitemaps and Add/Test Sitemap.
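The robots.txt entry from the first option can be double-checked with a few lines of Python. This is only a sketch: the function name `sitemap_urls` is made up for this example, and it simply looks for `Sitemap:` lines in the text of a robots.txt file rather than validating the file as a whole.

```python
def sitemap_urls(robots_txt):
    """Return every URL declared on a 'Sitemap:' line in robots.txt content."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are matched case-insensitively.
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    # Hypothetical robots.txt content for illustration.
    example = (
        "User-agent: *\n"
        "Disallow: /checkout/\n"
        "Sitemap: http://www.mydomain.com/sitemap.xml\n"
    )
    print(sitemap_urls(example))
```

If we generated one sitemap per store view, each of them would need its own `Sitemap:` line, and all of them should show up in the returned list.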