Sitemap file online. Sitemap file: HTML, XML, TXT, how to create and add to Yandex and Google webmaster

28.10.2023

A sitemap is needed by both search robots and visitors. Some will say that it is unnecessary because all sections are already displayed in the menu. However, such a page is needed once the site contains fifty pages or more. For search engines and users alike, it serves as a guide to where each piece of information is located.

XML and HTML files

Since a sitemap serves not only search robots but also people visiting the site, two maps are usually compiled: one in XML and one in HTML.

To create a Sitemap for search robots, use an XML file. Thanks to it, robots add new pages to their search database. Without a map, a large number of pages on a multi-page site may stay unindexed, sometimes for a very long time.

An HTML file is used to create a sitemap for users. Its convenience directly determines whether users will find the information they are interested in. Such a map is therefore created for Internet projects whose sections and subsections do not all fit in the main menu.

How to create a Sitemap XML

There are three ways to solve this problem:

    Buying a sitemap generator.

    Creating a Sitemap using online services.

    Writing the file manually.

To save a significant amount of time, you can purchase a generator. If twenty to thirty dollars for a license is a small expense for the webmaster, buying one is worthwhile, especially for a large Internet resource, since the map then does not have to be created manually.

For a site containing several hundred pages, online services are recommended: to create a Sitemap, you only need to enter the address of the Internet resource and download the result.

The best option is to create the map manually. To do this, you need to know the tags urlset, url, loc, lastmod, changefreq and priority. The first three are mandatory; the last three are optional.

Creating a Sitemap in Joomla

Like most well-known content management systems, Joomla and WordPress have special add-ons that create a sitemap manually or automatically. For large Internet projects whose materials are constantly updated, such an add-on is very convenient.

In Joomla it is called Xmap, in WordPress it is called Google XML Sitemaps.

Automatic sitemap creation

Free online services help you create a Sitemap automatically if your site has no more than five hundred pages. Generating a sitemap this way is easy:

    Visit one of these Internet resources, find the “Generate Sitemap” item and click the “Create” button.

    Find the “Site URL” field and enter the address of the site for which the map is being created.

    The system may ask for a verification code. Enter it and click “Start”.

    Upload the finished map to the website.

Manual way to create a map

This method is the most laborious and time-consuming, but it is also the most reliable, and it is used when the other options are not suitable. For example, if automatic tools add many pages that do not need to be in the site map, building the map manually keeps such pages out. Another reason for choosing this method is poor site navigation.

To create a map manually you must:

    Collect the pages to include in the map.

    In an Excel file, insert all addresses in the third column.

    In the 1st and 2nd columns, insert the opening <url> and <loc> tags.

    In the 4th and 5th columns, insert the closing </loc> and </url> tags.

    Use the concatenation function (CONCATENATE in Excel) to join the five columns into one.

    Create a sitemap.xml file.

    Add the opening <urlset> and closing </urlset> tags to this file.

    Insert the joined column between them.
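The spreadsheet steps above amount to wrapping each address in url and loc tags and enclosing the result in urlset. The same idea can be sketched in a few lines of Python. The function name build_sitemap and the example addresses are illustrative assumptions, not part of any tool mentioned in this article.

```python
from xml.sax.saxutils import escape

# Namespace of the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text for an iterable of absolute URLs (hypothetical helper)."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<urlset xmlns="{SITEMAP_NS}">',
    ]
    for url in urls:
        # escape() turns "&", "<" and ">" into XML entities.
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses are placeholders.
    pages = ["https://example.com/", "https://example.com/catalog?id=1&page=2"]
    print(build_sitemap(pages))
```

Saving the returned text as sitemap.xml in UTF-8 gives a file that can then be checked in the webmaster panel.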

The resulting file must be checked. This can be done, for example, in Yandex, in the webmaster panel.

How to create a Sitemap for Yandex and Google

After the sitemap is created, it is added to the site. The file with the site map should be named Sitemap.xml and placed in the root directory. So that search engines find it quickly, Google and Yandex have special tools: “Webmaster Tools” (in Google) and “Yandex Webmaster” (in Yandex).

Adding a Sitemap to Google

Adding a Sitemap to Yandex

Likewise, you must first log in to Yandex Webmaster. Then go to Indexing/Sitemap files, specify the file path there and click the “Add” button.

    Search robots accept only files that contain no more than fifty thousand URLs.

    If the map exceeds ten megabytes, it is better to split it into several files so that the server is not overloaded.

    If there are several files, they must all be listed in an index file that uses the sitemapindex, sitemap, loc and lastmod tags.

    All pages must be written consistently, either all with or all without the “www” prefix.

    The required file encoding is UTF-8.

    The XML namespace must also be declared in the file.
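A minimal sketch of two of the rules above: splitting a long list of addresses into files of at most fifty thousand URLs, and spotting a mix of “www” and non-“www” hosts. The helper names split_into_files and hosts_used are hypothetical.

```python
from urllib.parse import urlsplit

MAX_URLS_PER_FILE = 50_000  # limit quoted in the list above


def split_into_files(urls, max_urls=MAX_URLS_PER_FILE):
    """Split URLs into chunks that each fit into one sitemap file."""
    urls = list(urls)
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def hosts_used(urls):
    """Return the set of host names found in the URLs.

    More than one host (for example with and without "www")
    suggests the addresses are not written consistently.
    """
    return {urlsplit(url).netloc.lower() for url in urls}
```

Each chunk returned by split_into_files would be written to its own file, and those files would then be listed in the index file.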

How to create a sitemap for users

Since such a map is created for users, it should be as simple and clear as possible. At the same time, it must accurately convey the full structure of the site.

HTML maps usually have a familiar hierarchical structure of sections and subsections, highlighted with CSS styles and graphical elements.

For a large Internet project, splitting is recommended here too, as with an XML map. In this case the map is divided into separate tabs, which keeps it from becoming bulky.

Because this map is created for users rather than search engine robots, JavaScript can be used in it to extend the page's functionality.

Keeping the sitemap file in order

The Sitemap file should always be kept clean and tidy, especially if the site has a large number of pages. Since search engine robots scan sitemaps very quickly, they may not have enough time to view the entire file of a large Internet resource.

Therefore, if you make a habit of adding new pages at the top of the site map rather than at the bottom, the search robot is more likely to reach the addresses of new pages, and it also becomes much easier to keep track of all pages.

The sitemap.xml file is a tool that allows webmasters to inform search engines about the site pages that are available for indexing. Also, in the XML map you can specify additional page parameters: date of last update, frequency of updates and priority relative to other pages. Information in sitemap.xml can influence the behavior of the search crawler and, in general, the process of indexing new documents. The sitemap contains directives for including pages in the queue for crawling and complements robots.txt, which contains directives for excluding pages.

In this guide you will find answers to all questions regarding the use of sitemap.xml.

Do I need sitemap.xml?

Search engines use the sitemap to find new documents on the site (these can be HTML documents or media content) that are not reachable through navigation but need to be crawled. Having a link to a document in sitemap.xml does not guarantee that it will be crawled or indexed, but the file most often helps large sites get indexed better. In addition, data from the XML map is used to determine canonical pages when they are not explicitly specified with rel=canonical.

Sitemap.xml is important for sites where:

  • Some sections are not accessible through the navigation menu.
  • There are many isolated pages or poorly connected pages.
  • Technologies that are poorly supported by search engines are used (for example, Ajax, Flash or Silverlight).
  • There are a lot of pages and there is a chance that the search crawler will miss new content.

If none of this applies to you, you most likely do not need sitemap.xml. The file is not necessary for sites where every page important for indexing is reachable within 2 clicks, where JavaScript or Flash is not used to display content, where canonical and regional tags are used where needed, and where fresh content appears no more often than the robot visits the site.

For small projects, if there is only a problem with a large level of document nesting, it can be easily solved using an HTML sitemap, without resorting to using an XML map. But if you decide that you still need sitemap.xml, then read this guide in its entirety.

Technical information

  • Sitemap.xml is a text file in XML format. However, search engines also support text format (see next section).
  • Each sitemap can contain a maximum of 50,000 addresses and weigh no more than 50 MB (10 MB for Yandex).
  • You can use gzip compression to reduce the size of the sitemap.xml file and increase its transfer speed. In this case, use the gz extension (sitemap.xml.gz). The size limits still apply to the uncompressed sitemap.
  • The location of the Sitemap determines the set of URLs that can be included in it. A map containing the addresses of the pages of the entire site should be located in the root. If the sitemap is located in a folder, all URLs in it should be located in that folder or deeper.
  • Addresses in sitemap.xml must be absolute.
  • The maximum URL length is 2048 characters (1024 characters for Yandex).
  • Special characters in the URL (such as the ampersand "&" or quotes) must be escaped as entities (for example, &amp;).
  • The pages specified in the map must return the HTTP status code 200.
  • The addresses listed in the map should not be blocked in the robots.txt file or by a robots meta tag.
  • The sitemap itself should not be blocked in robots.txt, otherwise the search engine will not crawl it. The file itself may appear in the index; this is normal.
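The size and compression rules from the list above can be sketched as follows. The limits are the ones quoted in the text (the 10 MB figure for Yandex can be passed as max_bytes); the function names check_limits and compress_sitemap are hypothetical.

```python
import gzip

MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # the text quotes 10 MB for Yandex


def check_limits(xml_bytes, url_count, max_bytes=MAX_BYTES):
    """Return a list of problems; an empty list means the limits are met."""
    problems = []
    if url_count > MAX_URLS:
        problems.append(f"too many URLs: {url_count} > {MAX_URLS}")
    if len(xml_bytes) > max_bytes:
        problems.append(f"file too large: {len(xml_bytes)} > {max_bytes} bytes")
    return problems


def compress_sitemap(xml_bytes):
    """Gzip the sitemap; the limits are still checked on the uncompressed bytes."""
    return gzip.compress(xml_bytes)
```

The compressed bytes would be saved as sitemap.xml.gz, after check_limits has been run on the uncompressed content.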

XML map formats

Search engines support a simple text sitemap format, which simply lists page URLs without additional parameters. In this case, the file must be UTF-8 encoded and have the extension .txt.

Search engines also support the standard XML protocol. Google additionally supports sitemaps for images, videos, and news.

An example sitemap containing only one address.

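    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://сайт/</loc>
        <lastmod>2018-06-14</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.9</priority>
      </url>
    </urlset>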

XML tags
urlset (required) - the root tag that wraps all url entries.
url (required) - the parent tag for each URL.
loc (required) - the document URL; must be absolute.
lastmod - date of the last modification of the document in Datetime format.
changefreq - frequency of page changes (always, hourly, daily, weekly, monthly, yearly, never). The value of this tag is a recommendation to search engines, not a command.
priority - URL priority relative to other addresses (from 0 to 1) for the crawl order. If not specified, the default is 0.5.
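As an illustration of how these tags fit together, here is a small sketch that builds a url entry with the optional tags using Python's standard xml.etree.ElementTree module. The helper names url_entry and build_urlset are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element; only loc is required."""
    if changefreq is not None and changefreq not in CHANGEFREQ_VALUES:
        raise ValueError(f"unsupported changefreq: {changefreq!r}")
    if priority is not None and not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0 and 1")
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        ET.SubElement(url, "lastmod").text = lastmod
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


def build_urlset(entries):
    """Wrap <url> elements in <urlset> and return UTF-8 encoded XML bytes."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    urlset.extend(entries)
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

For example, build_urlset([url_entry("https://example.com/", "2018-06-14", "daily", 0.9)]) produces a one-address map with all four tags filled in.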

XML map for images

Some optimizers insert links to images into sitemap.xml in the same way as links to HTML documents. This can be done, but for Google it is better to use an extension of the standard protocol and send additional information about the images along with the URLs. Creating XML image maps is useful if images need to be crawled and indexed but are not directly accessible to the bot (for example, when they are loaded with JavaScript).

An example of a sitemap containing one page and its associated images

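    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <loc>http://example.com/primer.html</loc>
        <image:image>
          <image:loc>http://example.com/kartinka.jpg</image:loc>
        </image:image>
        <image:image>
          <image:loc>http://example.com/photo.jpg</image:loc>
          <image:caption>View of Balaklava</image:caption>
          <image:geo_location>Sevastopol, Crimea</image:geo_location>
          <image:license>http://creativecommons.org/licenses/by-nd/3.0/legalcode</image:license>
        </image:image>
      </url>
    </urlset>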

XML tags
image:image (required) - information about one image. A maximum of 1000 images can be used.
image:loc (required) - path to the image file. If a CDN is used, it is acceptable to link to another domain if it is verified in the webmaster panel.
image:caption - caption for the image (may contain long text).
image:title - image title (usually short text).
image:geo_location - the place where the image was taken.
image:license - image license URL. Used for advanced image search.

XML map for video

Similar to the image map, Google also has a video sitemap extension where you can specify detailed information about the video content that affects how it appears in video searches. A video sitemap is necessary when the site uses videos that are hosted locally, and when indexing these videos is difficult due to the technologies used. If you are embedding a video from YouTube on your website, then a video-sitemap is not needed here.

News Sitemap

If you have news content on your site and participate in Google News, it is useful to use a Sitemap for news, so Google will quickly find your latest materials and index all news articles. In this case, the Sitemap should contain only addresses of pages published in the last 2 days and contain no more than 1000 URLs.
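The two news limits mentioned above (pages published in the last 2 days, no more than 1000 URLs) can be applied with a simple filter. This is a sketch under the assumption that each article is available as a (url, published) pair with a timezone-aware datetime; the function name select_news_urls is hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=2)   # only pages published in the last 2 days
MAX_NEWS_URLS = 1000          # no more than 1000 URLs per news sitemap


def select_news_urls(articles, now=None):
    """Pick the URLs that qualify for a news sitemap, newest first.

    articles: iterable of (url, published) pairs with timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    recent = [
        (published, url)
        for url, published in articles
        if timedelta(0) <= now - published <= MAX_AGE
    ]
    recent.sort(reverse=True)  # newest first
    return [url for _, url in recent[:MAX_NEWS_URLS]]
```

Sorting newest first means that if more than 1000 articles qualify, the oldest ones are the ones left out.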

Using multiple sitemaps

If necessary, you can use several sitemaps, combining them into one index sitemap. Multiple sitemap.xml are used in cases where:

  • The site uses several engines (CMS).
  • The site has more than 50,000 pages.
  • It is necessary to set up convenient error tracking in sections.

In the latter case, each large section of the site has its own sitemap.xml and all of them are added to the panel for webmasters, where it is convenient to see which section has the most errors (see the section on finding errors in the sitemap).

If you have 2 or more sitemaps, they need to be combined into an index sitemap. It looks the same as a regular sitemap (except that it uses the sitemapindex and sitemap tags instead of urlset and url), has similar restrictions, and can only link to regular XML maps (not to other index maps).

Example Sitemap Index:

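    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>http://www.example.com/sitemap-blog.xml.gz</loc>
        <lastmod>2004-10-01T18:23:17+00:00</lastmod>
      </sitemap>
      <sitemap>
        <loc>http://www.example.com/sitemap-webinars.xml.gz</loc>
        <lastmod>2005-01-01</lastmod>
      </sitemap>
    </sitemapindex>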

sitemapindex (required) - specifies the current protocol standard.
sitemap (required) - contains information about a separate sitemap.
loc (required) - sitemap location (in xml, txt or rss format for Google).
lastmod - time of the sitemap change. Allows search engines to quickly discover new URLs on large sites.
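An index file with these tags can be generated in the same way as a regular map. The sketch below uses Python's standard xml.etree.ElementTree; the function name build_sitemap_index is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemaps):
    """Return UTF-8 encoded XML for an index sitemap.

    sitemaps: iterable of (loc, lastmod) pairs; lastmod may be None.
    """
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc, lastmod in sitemaps:
        item = ET.SubElement(index, "sitemap")
        ET.SubElement(item, "loc").text = loc
        if lastmod:
            ET.SubElement(item, "lastmod").text = lastmod
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

Passing the two addresses from the example above, each with its lastmod value, yields an index with two sitemap entries.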

How to create sitemap.xml

Methods for creating XML Sitemap:

  • Internal CMS tools. Many CMSs already support sitemap creation. To find out, read the documentation for your CMS, look at the menu items in the admin panel, or contact the engine's technical support. Try opening https://yoursite.com/sitemap.xml for your site; the file may already exist and be generated dynamically.
  • External plugins. If the CMS has no built-in sitemap generation but supports plugins, search for a plugin that handles sitemap.xml for your engine and install it. In some cases, you will need to ask programmers to write such a plugin for you.
  • Separate script on the site. Knowing the XML map protocol and technical limitations, you can create sitemap.xml yourself by adding a generation script to CRON. If you are not a programmer, use the other items in this list.
  • Sitemap generators. There are many sitemap.xml generators that scan your site and give you a ready-made map to download. The disadvantage here is that every time the site is updated, you need to manually generate a sitemap.
  • Parsers. Desktop programs designed for technical analysis of a website usually provide the opportunity to download sitemap.xml, generated based on crawled pages. It works similarly to sitemap generators, only it runs locally on your machine.

Popular online sitemap generators

XML-Sitemaps.com

Allows you to get sitemap.xml in a few clicks. Supports XML, HTML, TXT and GZ formats. Convenient to use for small sites (up to 500 pages).

A similar generator, but has a little more settings and allows you to create a map of up to 2000 pages for free.

Has many settings, allows you to import URLs from a CSV file. Scans up to 500 URLs for free.

There is no limit on the number of pages to scan. But for large sites, the generation process may freeze for several tens of minutes.

Local programs for generating XML Sitemap

G-Mapper Sitemap Generator

Free desktop version of the sitemap generator for Windows.

Screaming Frog SEO Spider

Flexible sitemap generation tool with many settings. Convenient if you already use Screaming Frog for other SEO tasks. After scanning the site, use the menu item Sitemaps -> Create XML Sitemap.

Netpeak Spider

A less flexible, but still convenient solution for quickly generating sitemap.xml. After scanning the site, you need to use the menu item Tools -> Generate Sitemap.

(Last update: 12/25/2019)

Hello colleagues! In this post I will tell you how to create and configure a Sitemap for WordPress for search engines such as Yandex, Google, Bing and Mail.ru. Don't confuse XML with HTML. The first is meant for search engines, and the second is intended primarily for users. You probably already know what an XML sitemap is.

Let me remind you: this is a list of the pages of your website or blog that your visitors do not see but that is clearly visible to search engines. The XML Sitemap file allows you to inform Google and Yandex about the pages of your site so that they are more likely to be included in the search engine index.

XML Sitemaps can help search engines determine the location of site pages, blog pages, when they were last updated, frequency of updates, and importance relative to other pages on the web resource so that the search engine can index the site more intelligently.

What is a Sitemap?

A sitemap is a way of organizing a website, showing URLs and data in each section. The XML document contains instructions for search engine robots.

Sitemaps are XML files with information for search engines (such as Google, Yandex, Bing and Mail.ru) about the website pages that should be indexed. Simply put, these are the site URLs that you send to search engines.

Yandex supports XML and TXT formats. The XML format allows additional information to be conveyed.

How to create Sitemaps for a WordPress site?

The Google XML Sitemaps plugin will help us create a map of a blog or site on WordPress. It generates an XML file that improves the indexing of a web resource by search engines, keeps it updated, and so on. All you need to do is install the plugin, configure it and forget about it. Installation of Google XML Sitemaps is standard.

Google XML Sitemaps WordPress Plugin

One of the best WP plugins. It will provide a complete XML sitemap for search engines. It has already been installed more than 24,243,146 times.

Use this plugin, it will greatly improve your SEO. It will create a special XML sitemap and help search engines such as Google, Bing, Yandex and Mail Ru better index your web resource. With a sitemap like this, it's much easier for crawlers to see the full structure of your site and extract it more efficiently. The plugin supports all kinds of pages generated by WordPress, as well as custom URLs.

Plus, it notifies all major search engines every time you post new content. The module is completely free and translated into Russian (though not completely, but the most important things have been translated).

Install the plugin in the usual way using the Plugins - Add New function. In the search field, enter its name Google XML Sitemaps:

Setting up XML Sitemaps

After successfully installing and activating the plugin, you need to configure it. In the "Settings" section, click on XML-Sitemap:

The XML Sitemap Generator for WordPress page will open, where you configure the plugin. At the very top of this page you will see a link to your map:

You can click on it and see what it looks like:

The important settings are in our native and powerful Russian language, so it won't be difficult for you to understand everything. What settings can be made? The plugin developer indicates that the default values are suitable for most sites, but each user must still decide for themselves. Indicate which categories to exclude from the map, the contents of the site map, priorities, frequency of changes, and so on. It should look something like this:

After all the settings, be sure to click “Update settings”. Ready. The next step is to add the Sitemap file in the search engine webmaster to speed up the indexing of the WordPress site. And also add a link to the map in robots.txt.

So, which pages should you include in your map? For SEO reasons, it is recommended to only include pages that you would like to see in searches.

Now, when you publish an article, the plugin will inform search engines (not all, but only Google, Bing, Yahoo and Ask.com) that your blog has been updated. The plugin will automatically update your sitemap whenever you post, so there's nothing else you need to do.

For other search engines, you need to do this yourself.

Please note, friends, that the XML Sitemaps function is also available in SEO plugins.

All the best and see you again. Bye bye!

Using the Sitemap file, you can inform Yandex about the current site structure by specifying a special directive in robots.txt or adding it to Yandex.Webmaster.
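In robots.txt the directive is a single line of the form "Sitemap: https://example.com/sitemap.xml". As a rough illustration, the sketch below reads such a file with Python's standard urllib.robotparser module (the site_maps method requires Python 3.8 or later) and also checks that the Sitemap address itself is not blocked. The robots.txt content and the helper name sitemaps_from_robots are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""


def sitemaps_from_robots(robots_text):
    """Return the Sitemap URLs declared in a robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.site_maps() or []


def is_allowed(robots_text, url, user_agent="*"):
    """Check that robots.txt does not block the given URL for the user agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the sample content above, the declared Sitemap is found and is not covered by the Disallow rule, while anything under /admin/ is.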

The webmaster allows you to:

Upload Sitemap

    Select a site from the list.

    In the field, enter the URL where the file is available. For example, https://example.com/sitemap.xml.

    Click the Add button.

After adding the file, it is queued for processing. The robot will download it within two weeks. Each added file, including those attached to the Sitemap index file, is processed by the robot separately.

After downloading, next to each file you will see one of the statuses:

Statuses, their descriptions, and what to do:

  • "OK": the file is formed correctly and loaded into the robot database. The date of the last download is displayed next to the file. Indexed pages will appear in search results within two weeks.
  • "Redirect": the specified URL redirects to another address. Remove the redirect and notify the robot about the update.
  • "Error": the file is not formed correctly. Click the Error link for details. After making changes to the file, notify the robot about the update.
  • "Not indexed": when the Sitemap is accessed, the server returns an HTTP code other than 200. Check if the file is accessible to the robot using the tool by specifying the full path to the file. If the file is not available, contact the administrator of the site or server on which it is located.
  • Access to the file is denied in robots.txt using the Disallow directive. Allow access to the Sitemap and notify the robot about the update.

Sitemap update

If you have changed the Sitemap file added to Yandex.Webmaster, you do not need to delete it and upload it again - the robot regularly checks the file for updates and errors.

To speed up crawling a file, click the icon. If you are using a Sitemap index file, you can start processing each file listed in it. The robot will download the data within three days. You can use the function up to 10 times for one host.

Once you have used up all attempts, the next one will be available 30 days after the first. The exact date is displayed in the Webmaster interface.



Removing Sitemap

In the Yandex.Webmaster interface, you can delete the files that were added on the Sitemap Files page. If a Sitemap directive was added to the robots.txt file, delete it as well. After the changes are made, information about the Sitemap will disappear from the robot and Yandex.Webmaster database within a few weeks.

Questions and answers

The sitemap is displayed in the service as an excluded page with the status “Invalid document format”

Displaying a Sitemap (as well as other XML files) as excluded pages is for informational purposes only and does not in any way affect site indexing or Sitemap processing.

The Sitemap file may be displayed in a group of excluded pages because the robot tried to index it as a regular page, while XML files are not indexed in the Yandex search engine and do not participate in search results.

"Unknown tag" error occurred while processing Sitemap

A Sitemap file can only contain certain XML elements. If the Webmaster detects other elements in the file (for example, an indication of a mobile or multilingual version, addresses of pictures), the “Unknown tag” error will appear in the Webmaster. Unsupported elements are ignored by the robot when processing the Sitemap, while data from supported elements is taken into account. Therefore, it is not necessary to change the Sitemap file.

If the contents of the file are changed, it will take up to two weeks to update the information in Webmaster.

The Sitemap file is in the "Not Indexed" status

The Sitemap file may not be indexed for several reasons:

    The robot recently crawled the Sitemap and has not yet processed it. Wait two weeks. If you are using a Sitemap index file with multiple files, they may take longer to process than a single Sitemap file.

    The site was previously unavailable to the indexing robot. You must wait until the next robot visit to the site.

    Access to Sitemap is denied in the file





Choose the map type - xml or html, priority, frequency of changes and included site pages - as a result you will receive a valid site map file.

XML sitemap generator

You can create a Sitemap online completely free of charge using the Saitreport service. A site map is a necessary condition for promoting a site: it passes information about the structure of the resource to search engines.

The lack of a sitemap makes a site harder to promote, because search engines may not notice important documents for a long time. Therefore, to get pages indexed, the site map must be generated and placed correctly.

The Saitreport online service generates an XML map for website promotion. In the file you can set the priority of page indexing, the frequency of updates and the types of documents included.

How to create a sitemap?

To generate a sitemap, fill in the necessary fields on the service website: home page address, date and frequency of updates, priority and number of pages. Run the tool and wait for the Sitemap generation to finish.

The map generator will analyze the site and generate a Sitemap.xml text file, which you need to add to the site root yourself. Using a sitemap provides the following benefits:

  • site pages will be added to the search;
  • search engines will discover the site much faster;
  • the site will match search queries better.

When using the Saitreport service, you can create an XML map and take advantage of additional functionality by setting parameters.

© 2024 hecc.ru - Computer technology news