How to find all pages on a website

  • June 16, 2022
  • 7 min

To find the weaknesses of your site, you should perform a complete site audit. During this process, you will find out how many pages your site has. Every website owner needs to know the total number of pages to check whether all of them have made it into the search engine index. So how do you see all the pages of a website?
You need to know how to check the number of pages on your own site and on a competitor’s site. How can you do this for free with a website page counter?
How many pages does a website have? In this article, we’ll take a look at four easy ways to figure this out.

Why do you need to find all the pages on a website?

By knowing how many pages a website has, you can check whether all of them are indexed by the search engines, because some pages may turn out to be non-indexable.

Once you find all the hidden pages on a website, you will see whether it has many duplicate pages, which negatively affect the site’s ranking. It is also important to know which web pages have errors so that you can detect and correct them.

Important: Errors on your site’s pages significantly hinder your search engine rankings. Perform regular audits and find all the URLs on a domain to know the status of your site and discover weaknesses.

Another important consideration is link weight.

You should distribute link weight evenly across the pages of your site, because your search engine rankings depend on it. To do this, every page on your site should be linked to from other pages and should itself link to other pages. This is how internal link weight is passed around your site.

Also, if you want to create an XML sitemap, remove unnecessary pages, or adjust internal linking, you need an accurate count of the pages on your website.

With a complete list of pages, you can clear the site of junk, fix technical errors on pages, and improve the ranking.

Why are pages “isolated”?

The reasons may be different, for example:

  • Landing pages were created for a specific campaign
  • Test pages were created for split testing
  • Pages were removed from the internal linking system, but not deleted
  • Pages were lost during the transfer of the site
  • Product category pages were removed, but the product pages remained

Such pages are cut off from the rest of the site, which means that the search robot cannot reach them through links. The crawler also will not see a page that is closed to it through the .htaccess file.

Finally, some pages are not indexed due to technical problems.

With the help of different tools, we will find all the pages of a website, including dead-end pages. But let’s take it one step at a time. First, let’s export the list of all indexed and correctly working pages.

Why one tool is not enough for data collection

Let’s try to collect data from four different tools:

  • From the Site Audit Tool in Rush Analytics, we will export all the pages that are open to search engine crawlers;
  • The next step is to find all the pages with views in Google Analytics;
  • SEO software, such as Screaming Frog SEO Spider, will help you collect all the pages as well;
  • From Google Search Console, we can get the rest of the pages: those that are closed to search robots and have no views.

By comparing all the data, we will get a complete list of the pages on your site.

The first step gives us the indexed URLs, but they are not all we need. Many sites have hidden pages that no internal link leads to. They are called orphan pages.

Search for all pages with views in Google Analytics

Search engine crawlers find pages by following internal links. So if no link on the site leads to a page, the crawler will not find it.

Orphan pages can be found using data from Google Analytics, because the system stores information on visits to all pages. The bad news is that GA knows nothing about visits that happened before you connected analytics to your site.

Such pages will not have many views, because visitors cannot reach them by clicking through your site. To find them, go to your Google Analytics account and proceed as follows:

  • Go to Engagement → Pages and Screens
  • Sort the pages

Click on the Views column to sort the list from the lowest to the highest value. As a result, the most rarely viewed pages are at the top, and orphan pages are among them.

  • Review the whole list

Move down the list until you see pages with significantly more views. These are pages that already have internal links set up.

  • Export the collected data to a .csv file

  1. Find the share button: click the share button at the top right of the page.
  2. Download the file: choose the Download File button.
  3. Download the CSV: click Download CSV.
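
If you prefer to do the sorting outside the Google Analytics interface, the same "fewest views first" check can be run on the exported file. The snippet below is only a sketch: the column names `Page path` and `Views`, and the idea that the export may start with comment lines beginning with `#`, are assumptions, so adjust them to match your own file.

```python
import csv

def least_viewed_pages(csv_text, path_col="Page path", views_col="Views", limit=10):
    """Return (path, views) pairs sorted from fewest to most views."""
    # Drop blank lines and lines starting with '#' (assumed export preamble).
    lines = [ln for ln in csv_text.splitlines() if ln.strip() and not ln.startswith("#")]
    rows = csv.DictReader(lines)
    pages = [(row[path_col], int(row[views_col])) for row in rows]
    # Ascending sort: rarely viewed pages, including orphan candidates, come first.
    return sorted(pages, key=lambda page: page[1])[:limit]

# Hypothetical sample data standing in for a real export.
sample = """Page path,Views
/,1200
/blog/,340
/old-landing/,2
/test-b/,1
"""

print(least_viewed_pages(sample, limit=2))
# [('/test-b/', 1), ('/old-landing/', 2)]
```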

Get all pages of a website with Rush Analytics tool

Our next step is to compare the data from Rush Analytics and Google Analytics to understand which pages are not accessible to search engine crawlers.

We copy the data from the .csv file exported from Google Analytics and paste it into the table next to the data from Rush Analytics.

The Google Analytics export contains only the path part of each URL, and we want all the data to be in the same format. Therefore, in column B we insert the address of the main page of the site, as shown in the screenshot.

  • Compare the data from Rush Analytics and Google Analytics

Next, use the concatenate function to merge the values from columns B and C into column D, and drag the formula down to the end of the list.

  • Use the concatenate function
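
Outside a spreadsheet, the same merge of the site address and the page path can be sketched in a few lines. The function name `to_full_url` and the `example.com` address are placeholders for illustration, not part of any tool mentioned here.

```python
def to_full_url(base, path):
    """Join the site's home page address and a page path into one full URL."""
    # Strip the slashes at the seam so the result never contains '//' or a missing '/'.
    return base.rstrip("/") + "/" + path.lstrip("/")

# Hypothetical paths, as they might appear in an analytics export.
paths = ["/", "/blog/", "/old-landing/"]
full_urls = [to_full_url("https://example.com/", p) for p in paths]
print(full_urls)
# ['https://example.com/', 'https://example.com/blog/', 'https://example.com/old-landing/']
```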

And now for the fun part: we will compare the “Rush Analytics” column with the “GA URLs” column to find orphan pages on the website.

There will be a lot of pages, so analyzing them manually would take far too long. Fortunately, there is a match function that lets you determine which values from the “GA URLs” column are also in the “Rush Analytics” column. We enter the function in column E and drag it down to the end of the list.

  • Match the data with the match function

Column E shows which pages from GA are missing from the “Rush Analytics” column: for those rows, the table shows an error (#N/A). To group all the errors together, sort the data in column E alphabetically.

  • Sort the data in Google spreadsheets
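
The comparison itself is a set difference: any URL that analytics has seen but the crawler has not is an orphan candidate. Here is a minimal sketch, assuming both lists already hold full URLs; the function name and the sample URLs are made up for illustration.

```python
def find_orphan_candidates(crawled_urls, analytics_urls):
    """Return analytics URLs that are absent from the crawler's list."""
    # Ignore a trailing slash so '/page' and '/page/' count as the same address.
    crawled = {url.rstrip("/") for url in crawled_urls}
    return sorted(url for url in set(analytics_urls) if url.rstrip("/") not in crawled)

crawled = ["https://example.com/", "https://example.com/blog/"]
from_analytics = [
    "https://example.com/blog",
    "https://example.com/old-landing/",
    "https://example.com/test-b/",
]

print(find_orphan_candidates(crawled, from_analytics))
# ['https://example.com/old-landing/', 'https://example.com/test-b/']
```

The result is only a list of candidates. Each URL still needs the manual review described next.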

Now you have a complete list of pages that no internal links point to. Before moving on, examine each page. Your goal is to understand what the page is, what its role is, and why no links lead to it.

Next, there are three ways to proceed:

  1. Add an internal link to the page. To do this, you must determine its place in the structure of your site.
  2. If the page is unnecessary, remove it and set up a 301 redirect from its URL.
  3. Leave everything as is, but add a noindex robots meta tag to the page if, for example, it was created for an advertising campaign.

After working through the isolated pages, you can export and compare the lists from Rush Analytics and Google Analytics once again. This way you can make sure you haven’t missed anything.

Searching for remaining pages in the Google Search Console

We have learned how to find dead-end pages that are not linked from the rest of the site. Let’s move on to the remaining pages that Google knows about by analyzing the data from Google Search Console.

  • First, open your account and go to Coverage. Make sure that the “All known pages” display mode is selected, and click on the “Valid” tab
  • The “Valid” tab in Google Search Console

This will list the pages marked “Indexed, not submitted in sitemap” as well as those marked “Submitted and indexed”.

  • Google Search Console information

Click on the list to expand it. Examine the data carefully: there may be pages in the list that you did not see in the Rush Analytics and Google Analytics exports. If so, make sure that they are properly performing their role within your site.

Now let’s go to the Excluded tab so that only non-indexed pages are displayed.

  • Excluded tab in Google Search Console

Most often, the pages in this tab have been deliberately blocked by site owners: these are pages with redirects, pages closed with a noindex tag, pages blocked in the robots.txt file, and so on. In this tab you can also identify technical errors that need to be fixed.

  • Errors in Google Search Console

If you find pages that you did not encounter in the previous steps, add them to the general list. This way, you will finally have a list of all the pages on your site without exception.
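
If you keep each export as a separate list of URLs, building the general list comes down to merging them and dropping duplicates. The following is a rough sketch; the function name and the sample URLs are illustrative only.

```python
def merge_page_lists(*sources):
    """Combine several URL lists into one, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged = []
    for source in sources:
        for url in source:
            url = url.strip()
            # Skip empty rows and anything already collected from an earlier source.
            if url and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

audit_urls = ["https://example.com/", "https://example.com/blog/"]
analytics_urls = ["https://example.com/blog/", "https://example.com/old-landing/"]
search_console_urls = ["https://example.com/old-landing/", "https://example.com/private/ "]

all_pages = merge_page_lists(audit_urls, analytics_urls, search_console_urls)
print(len(all_pages))
# 4
```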

Other ways to get hidden pages of a website

Besides a site page counter, there are other ways to find out how many pages your site or your competitor’s site has. Let’s look at the most popular ones.

  • View the XML sitemap file

An XML sitemap file is very useful when you need to view all the pages of a website. If your site does not have one yet, the simplest way is to use a sitemap generator. It creates the file automatically, so you don’t need any technical knowledge or experience with XML sitemaps.

An XML sitemap helps search engines discover your pages. If a site audit finds that you do not have a sitemap, this will be marked as a critical error.
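
Once a sitemap exists, listing its pages can be automated. The sketch below reads the `<loc>` entries from sitemap XML using Python’s standard library. It assumes a plain `urlset` sitemap in the standard sitemaps.org namespace; a sitemap index file that points to other sitemaps would need an extra step, and the sample XML is invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def urls_from_sitemap(xml_text):
    """Return every <loc> value found under <url> entries in a sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]

# Hypothetical sitemap content; in practice, read it from your own sitemap file.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
</urlset>"""

pages = urls_from_sitemap(sample_sitemap)
print(len(pages), pages)
# 2 ['https://example.com/', 'https://example.com/blog/']
```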

  • Use your CMS

If your site runs on a content management system (CMS), such as WordPress or Wix, you can generate a list of all your web pages from the CMS. There are many plugins that can help you collect all the links on your site with one click. It’s simple and free, so try it to count the pages of your website!

  • Use a log

A log of all the pages served to visitors is another way to find out how many pages your website has. Simply log into your cPanel and look for the raw log files. From them you can list the pages that were actually requested and see which ones are visited most often. By comparing the log with your full list of pages, you can also spot pages that have never been visited.
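
As a rough illustration, the snippet below counts how often each page was served successfully. It assumes the raw log uses the common or combined access log format, where the request appears as `"GET /path HTTP/1.1"` followed by the status code. The sample lines are invented, and your server’s format may differ.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*" (\d{3})')

def count_served_pages(log_lines):
    """Count successful (status 200) requests per path, ignoring query strings."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts

sample_log = [
    '203.0.113.5 - - [16/Jun/2022:10:00:00 +0000] "GET /blog/ HTTP/1.1" 200 5120',
    '203.0.113.6 - - [16/Jun/2022:10:00:05 +0000] "GET /blog/?utm=x HTTP/1.1" 200 5120',
    '203.0.113.7 - - [16/Jun/2022:10:00:09 +0000] "GET /missing/ HTTP/1.1" 404 310',
]

print(count_served_pages(sample_log).most_common())
# [('/blog/', 2)]
```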

  • Use site audit tools

Another simple and popular way to find out how many pages a website has is to use site audit tools. There are many of them, so you can choose one that your team already subscribes to, for example the Rush Analytics Site Audit Tool or Screaming Frog.

A free subscription to the tool is enough to know the number of all inbound and outbound links on your site. You will not need to purchase a subscription just for this task.

Conclusions

If you have access to the necessary tools, it is not difficult to collect all the pages of the site. Yes, you can’t do everything in two clicks, but in the process of collecting data, you will find hidden pages that you may not have guessed existed.

Pages that are seen by neither search engine robots nor users do not bring the site any value. The same goes for pages that are not indexed because of technical errors. If a website has a lot of such pages, it can have a negative impact on SEO results.


Head of Rush Analytics Dmitry