I just added a few web pages to my crawl and a window popped up with the following instruction:

"Tick the following box beside the url if you want to promote its ringset during the next crawl.
Note: The seed url's ringset can be promoted if it was already reached by previous crawls in current network."

I couldn't find anything about this in the user guide, but I presume it means that this URL was reached as a linked non-seed site in an earlier crawl, and that this option will promote it to a seed site?

Thanks,

M

  • anon

    Yes, that is correct. This relates to the "ringset" attribute, which every page and pagegroup has.

    If a page is entered as a seed URL then it will have ringset=1 ("seedset"). Pages discovered during the crawl that are linked with a seed page (i.e. the seed page hyperlinks to the new page, or vice versa) will have ringset=2 ("first ring"). But note: only newly discovered pages get ringset=2. If a seed page links to another seed page, the ringset of that other seed page is not updated.

    If you later decide to add more seed URLs (i.e. do another crawl using the same database), then the add seeds window will show a popup if a seed URL you have added was already in the database. This popup lets you choose to "promote" the URL so that it becomes a fully-fledged seed, i.e. it will have ringset=1 ("seedset") even if it previously had ringset=2.

    If you don't choose to promote the ringset of the new URL, it will still be crawled, but it will have ringset=2 and new pages discovered by crawling that page will have ringset=3 (as long as they were not already in the database).
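
    As a rough illustration of the rules above, here is a minimal Python sketch. It is not the crawler's actual code: the class `CrawlDatabase`, its methods `add_seed` and `record_discovered`, and the example URLs are all hypothetical names made up for this example. It only models how the ringset value changes.

```python
from dataclasses import dataclass, field

SEEDSET = 1  # ringset value of a seed page ("seedset")


@dataclass
class CrawlDatabase:
    """Hypothetical model of the ringset bookkeeping (not the real crawler)."""

    # Maps each known URL to its ringset value.
    ringset: dict = field(default_factory=dict)

    def add_seed(self, url, promote=False):
        """Add a seed URL and return its ringset afterwards.

        A URL that is new to the database becomes a seed (ringset=1).
        A URL already in the database keeps its ringset unless promoted.
        """
        if url not in self.ringset or promote:
            self.ringset[url] = SEEDSET
        return self.ringset[url]

    def record_discovered(self, url, source_url):
        """Register a page found while crawling source_url.

        Only pages not already in the database get a ringset, which is
        one ring further out than the page they were discovered from.
        """
        if url not in self.ringset:
            self.ringset[url] = self.ringset[source_url] + 1
        return self.ringset[url]


db = CrawlDatabase()

# First crawl: one seed, one newly discovered page.
db.add_seed("a.example")
db.record_discovered("b.example", "a.example")   # first ring, ringset=2

# Second crawl: b.example is re-entered as a seed without promotion.
db.add_seed("b.example")                          # stays at ringset=2
db.record_discovered("c.example", "b.example")   # new page, ringset=3

# Ticking the box in the popup corresponds to promote=True here.
db.add_seed("b.example", promote=True)            # now ringset=1
```

    In this sketch, ticking the box in the popup corresponds to passing `promote=True`; leaving it unticked leaves the existing ringset alone.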

    Rob

    Jan 16, 2013