Advanced Xpath Functions for the Screaming Frog SEO Spider

One of the most powerful features of the Screaming Frog SEO Spider is the ability to scrape whatever information you want from a website in bulk using custom extractions. I have previously written about using XPath for custom extractions, which is a great place to get started if you’re interested in learning the basics. You can also see my talk on this subject at BrightonSEO in April 2024. If you want to create advanced custom extractions, you will need to learn how to use functions.

Screaming Frog has just released version 21, which has upgraded its support from XPath 1.0 to 2.0, 3.0 and 3.1. This means we can now use advanced functions to turn those custom extractions up to 11. See below for some examples of XPath functions as well as examples of how they can be used to scrape information from sites. Hopefully, these will inspire you to create your own custom extractions to achieve some amazing analysis.

This article is a guest contribution from David Gossage, SEO Manager at Click Consult.


string-join()

What it does: This function combines multiple pieces of content that match your XPath query, separated by a specified delimiter.

Syntax:
string-join([Target XPath], "[delimiter]")

Example: Extract all text from <p> tags, separated by spaces:

string-join(//p, " ")

This is especially handy for messy HTML templates (looking at you, Bootstrap). Instead of dealing with fragmented data, you get a single, clean string.


distinct-values()

What it does: Removes duplicates from your extraction results, returning only unique values.

Syntax:

distinct-values([Target XPath])

Example 1: Get a list of all unique CSS classes used on a page:

distinct-values(//@class)

This will return a de-duplicated list of every class that is used on a page. When used to crawl your entire site, you can then compare your list with your CSS file to see if there are any unused styles that can be removed to make your site load faster.

Example 2: You can combine it with the count() function to identify pages with bloated HTML:

count(distinct-values(//@class))

This will count the number of unique classes used on each page. The idea is that the more unique classes there are, the more bloated the HTML is, and the greater the opportunity to create cleaner code that loads quickly and is easier for search engines to render.


starts-with()

What it does: This little gem was actually supported by the SEO Spider before the latest release, but I’ve only just discovered it. It filters the results to include those that begin with a certain character or string.

Syntax:

starts-with([Target XPath], '[lookup string]')

Example: A really powerful use for this is to extract all relative URLs, meaning that you can get a list of all internal links from every page as follows:

//a[starts-with(@href, '/')]/@href

This will extract every URL that begins with / from every link on the page, so you can audit your internal linking strategy and site navigation.


ends-with()

What it does: This function works in the same way, but searches for the string at the end of the attribute.

Syntax:

ends-with([Target XPath], '[lookup string]')

Example: This can be used to find all links to certain file types, such as PDF documents:

//a[ends-with(@href, '.pdf')]/@href
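
As a hypothetical extension (not from the original post), you could wrap this in the count() function mentioned earlier to simply report how many PDF links each page contains:

count(//a[ends-with(@href, '.pdf')])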


matches()

What it does: This enables you to combine XPath with Regex to turbo-charge your custom extractions. The beauty of this rule is that you can use several conditions at once to extract based on a range of criteria.

Syntax:

matches([Target XPath], '[regex rule]')

Example 1: Extract links to image files:

//a[matches(@href, '\.(jpg|png|gif)$')]/@href

This will return the URLs of any links that end in .jpg, .png or .gif. Since links to images do not generally add value, this can help you find those dead ends in your navigation.

Example 2: You can use this to identify any links containing UTM parameters, which can play havoc with your analytics tracking:

//a[matches(@href, 'utm_')]/@href


exists()

What it does: Checks whether a certain piece of HTML actually exists on a page.

Syntax:

exists([Target XPath])

Example: This can be used to identify where important HTML snippets are missing, such as meta descriptions:

exists(//meta[@name='description'])

The output will be a boolean value: “true” if the element exists, “false” if not.
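
A hypothetical variation on the same idea (not from the original post) is to flag pages containing images without alt attributes:

exists(//img[not(@alt)])

This returns “true” for any page with at least one image missing its alt attribute.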


format-dateTime()

What it does: This function lets you reformat publish dates into a consistent format that is easy to sort by date. It relies on your articles having an attribute that contains the date and time in a format that XPath can read.

Syntax:
format-dateTime([Target XPath], '[Y0001]-[M01]-[D01]')

Example: The following example will organise dates into YYYY-MM-DD format:

format-dateTime(//time/@datetime, '[Y0001]-[M01]-[D01]')

Alternatively, you can get the dates from the Open Graph data as follows:

format-dateTime(//meta[@property="article:published_time"]/@content, '[Y0001]-[M01]-[D01]')

The result is a clean, sortable date in YYYY-MM-DD format for each article.


if()

What it does: This powerful function will return the value of an XPath only if certain conditions are met.

Syntax:

if([conditional XPath]) then [Target XPath] else ''

Example: Check the canonical URLs of pages with noindex tags using the following XPath rule:

if(contains(//meta[@name='robots']/@content, 'noindex')) then //link[@rel='canonical']/@href else 'Not noindexed'

This concept can be used for a wide range of different applications, from inconsistent canonical tags and missing alt descriptions to nofollow links to social profiles, and more.

You can ramp this up by using comparison operators such as < or > to filter down your results. Let’s look at the following example, which counts how many <p> tags are in your article to identify thin content and then returns that content for quick analysis:

if(count(//p) < 4) then string-join(//p, ' ') else ''


tokenize()

What it does: This function splits the extracted value by a delimiter, similar to how the SPLIT() function works in Google Sheets. It can take a few goes to get it right, so bear with us.

Syntax:

[Target XPath] ! tokenize(., '[delimiter]')

Example: Extract the domain name from every absolute link on a page using the following XPath rule:

//a[contains(@href,"http")]/@href ! tokenize(., '/')[3]

So the first part is the target XPath you’re looking to extract, in this case all links which contain “http” in the URL. The second part splits each URL every time a slash appears. I’ve put [3] at the end to return only the third token (i.e. the domain name).

For this example, I’ve also added the distinct-values() function to remove duplicates from the resulting list of domains.
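
The combined expression (not shown in the original post, but simply a case of wrapping the rule above in the function covered earlier) would look something like this:

distinct-values(//a[contains(@href,"http")]/@href ! tokenize(., '/')[3])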


One more example…

The Screaming Frog team has previously written about scraping social meta tags. However, that method is quite manual to set up and results in a separate column for each tag, which means extra work to combine them into a usable format. Instead, you can combine some of the rules above with the concat() function to extract all your Open Graph tags into a single cell:

string-join(//meta[starts-with(@property, 'og:')]/concat(@property, " - ", @content), codepoints-to-string(10))

OK, this is a difficult one to follow, but it looks for all meta tags whose “property” begins with “og:”, and concatenates each property name with the value of “content” in the same tag. Once it has found all of these tags, it joins them with the string-join() function, delimited by line breaks, since codepoints-to-string(10) outputs a newline character (exhale).

When running your extraction, it will look like it’s all joined on a single row. However, if you copy or export the contents elsewhere, it should come out nice and tidy, with each tag on its own line.


Conclusion

Many thanks to Chris and Liam (literal geniuses) from the Screaming Frog team for their help on this; I wouldn’t have been able to write it without you. We’d love to hear how you’re using these advanced functions as part of your SEO strategy. If you have a killer XPath, let us know!

How to Elevate Your PR Strategies With Competitor Analysis

We’ve all heard that imitation is the sincerest form of flattery, right?

Well, any PR professional will be able to tell you of a campaign they’ve been inspired by and wondered how they could do something similar for their own clients.

As PR professionals, we’re constantly tweaking strategies and plans, to make sure we’re not only improving our own work but keeping up with the rapidly changing media landscape.

One of the most effective ways to improve your strategy is through competitor analysis. It can reveal what’s working well in the bigger picture, highlight gaps in your own strategy and show you where you can level up.


Spot Strengths… and Weaknesses

One key takeaway from conducting a competitor analysis is understanding that what doesn’t work is just as important as knowing what does.

A few key questions to keep in mind include:

  • Are they getting consistent coverage in top-tier publications?
  • Are they engaging in PR strategies I hadn’t considered?
  • Are they missing out on key target publications?
  • Are there certain niches they haven’t covered?

These gaps might reveal where you can step in and differentiate your client by focusing on areas they’ve overlooked or failed to prioritise.

Google is your best friend when it comes to this initial analysis. Searching Google News for recent news coverage can quickly help identify if your competitors regularly engage in link-building work.

Ahrefs can also quickly point out if there’s been a change in strategy. For instance, by comparing the referring domains of your own client with those of a competitor, you can see when there’s been a sudden uptick in referring domains, indicating a potential change in approach.

This could be the result of a major PR push, a successful campaign, or a shift in focus—such as increased reactive commentary or a new, successful creative campaign.

While the reasons for the uptick aren’t always immediately clear, tracking these changes provides valuable clues. By cross-referencing this data with Google News or the competitor’s website, you can often identify the specific coverage or campaign responsible for the increase and adapt your strategy accordingly.

Ahrefs can also help identify where an influx of links has come from. Select the ‘Backlinks’ tab and filter the search options to show ‘New’ links and select a custom date range to filter the links. This will help you quickly identify where the influx of new links came from at a specific time.


Identify What You’re Missing Out On

Competitor analysis often highlights activities your competitors are undertaking that you haven’t yet explored. For instance, maybe you notice a competitor excelling at data-led PR or securing coverage through surveys. Perhaps it’s time to consider how you can take a similar approach but put your own spin on it.

Competitor analysis also gives you ideas for new content formats or angles you may not have considered. Maybe you see competitors in the travel industry gaining more coverage through interactive hero content or by publishing insight-led reports. If those strategies seem to be working, it’s worth asking whether you should be doing the same—or even better, pushing the envelope with something more creative.

Ahrefs provides a wealth of insight when looking at your competitors’ work. ‘Best by Links‘ and ‘Linking Authors‘ are our go-to reports. Best by Links identifies the pages that are most linked to. While the usual suspect, the homepage, tends to come out on top, it’s really useful to see whether blog and information pages are linked to regularly and how they’re being linked to via earned editorial.

This can spark ideas for your future campaign strategies, showing you which topics or formats resonate most with other websites and their audiences.

Ahrefs’ Linking Authors feature, which is relatively new, is also an excellent resource for discovering who is linking to your competitors. It helps identify journalists who regularly link to your competitors, allowing you to focus your outreach efforts where it will be most effective. By studying this data, you can prioritise building relationships with the authors who regularly link to your competitors and understand the type of content they’re looking for, allowing you to boost your referring domains.

Comparing topical authority on Majestic is also helpful in identifying new niches to target. This example shows the topical authority of our client (first) and their competitors in the Recreation/Autos industry. Cross-referencing this against various topics provides a wealth of insight for future PR work. This data reveals which competitors are leading the way and pinpoints different industries that can be targeted with future PR work.


Turn Insights into Actions

Competitor analysis is only valuable if you take the insights it provides and translate them into actionable steps. It’s not just about recognising what others are doing well or where they’re missing out—it’s about using that knowledge to elevate your PR strategies.

Once you’ve gathered data, the next step is to ask: How can you apply these findings to your own campaigns?

For example, if a competitor’s data-led PR has garnered significant coverage, you should ask yourself how you can create a similarly compelling narrative, but with a twist that plays to your strengths. Maybe you have access to unique insights or industry expertise that can differentiate your take. Or, perhaps you can focus on specific topics or publications that have not been explored yet, giving you an edge in standing out from the competition.

It’s also essential to adapt quickly when you spot a competitor having success with a particular strategy—whether it’s reactive commentary, thought leadership or a major creative campaign. If it’s working for them, it might work for you too, but with your unique voice, vision and angle.

The key here is not to follow suit but to push further with creativity, innovation and relevance.

Remember, competitor analysis shouldn’t be a one-off exercise. To stay ahead in the fast-moving PR landscape, you need to continuously revisit and refine your approach based on new insights. Every time you analyse competitors’ strategies or content, it’s an opportunity to rework your plans, making sure you remain agile, forward-thinking and always one step ahead.

It is also essential to talk to your clients about how this work can offer tangible results for their campaign. Be sure to highlight specific examples of how competitor analysis has shaped your strategy. In your reporting, share how this analysis helped identify untapped opportunities or expand media lists. Key metrics, like increased coverage and diversified referring domains, can further demonstrate the effectiveness of your strategy.

A Hat-Trick Of UK Search Awards For Screaming Frog


In the Bloomsbury big top, the Screaming Frog team net 3 awards.

Having been shortlisted in 8 categories across both SEO and PPC, the Screaming Frog team headed to the 2024 UK Search Awards with aspirations of adding some more silverware to an ever-growing trophy cabinet.

Hosted by the very funny comedian Russell Kane, the night’s entertainment and setting were befitting of the very best in the industry.

Following a tasty three-course meal, a record 76 awards were up for grabs. After our successes at previous UK Search Awards, and more recently, the UK Paid Search Awards, the team was excited to see what the night had in store.

First up, Screaming Frog scooped the award for Best Local Campaign (PPC): Large for its work with Core Sash Windows.

Core Sash Windows is a small London-based company, and our campaigns went toe to toe with the likes of Anglian Home Improvements and Everest, showing that even in the face of huge competition from large, nationally recognised brands, a small local business can still effectively compete and deliver amazing returns if the channel is run with a laser focus and dynamic management.

Here’s what the Judges had to say:

“A campaign that was executed with a clear emphasis on strategic keyword selection and ongoing optimisation, highlighting a systematic approach to paid search. The team successfully navigated various challenges throughout the process, ultimately achieving positive outcomes and enhancing overall performance. By filtering out audiences less likely to be interested in the products, they significantly improved both conversion rates and spending efficiency.”

Following this success, it wasn’t long before the team collected the second award of the night, winning the category Best Use of Content Marketing: Large for InsureMyTrip.

Combining both SEO and PR, this project utilised the best of our in-house skills in ideation, data analysis, design and development, and PR to produce a series of informative, inspirational and headline-grabbing content marketing campaigns that promoted IMT as a go-to resource in a competitive field.

Having earnt close to 1,000 pieces of international media coverage, many in target top-tier publications, we were delighted to see our dedication to media research and tailored outreach recognised. You can read more about our work with InsureMyTrip here.

The judges added:

“The use of data-driven hero content and trend-led insights was not only timely but also showcased a keen understanding of the market landscape. The integration of diverse data sets allowed for a more comprehensive approach, driving impressive results across a range of content. Particularly noteworthy was the ability to balance short-term and long-term themes, diversifying campaign narratives in a way that kept the strategy both fresh and consistent. The detailed research and creative execution shone through, resulting in a strong, data led, and targeted strategy that set a high bar for success.”

Our final award came in the form of Best Use of Social Media in a Search Campaign: Large for our work with Abbott, a multinational medical devices and health care company.

Exceeding targets, our expert judgement was leant on as we approached the campaign in a way that hadn’t been considered previously by the client, resulting in a two-pronged approach of both awareness and return.

Using this activity as a case study, Abbott are now seeding future campaigns with new highly relevant audiences that weren’t previously accessible and we are already seeing the fruits of these labours in their subsequent activities.

The judges’ thoughts acknowledged this:

“The creative strategy of targeting users at varying brand awareness levels across different phases demonstrates a sophisticated understanding of the customer journey. Running the campaign in two stages was a strategic move that effectively supported the achievement of targets, and the inclusion of an awareness phase was a smart addition to enhance engagement. An impressive, yet often overlooked, success of this campaign was the ability to create personalised ads while overcoming the notorious challenges of connecting with local teams.”

Now with a hat-trick of awards in hand, there was nothing left to do but celebrate on the dance floor!

Thank you to Don’t Panic Events for hosting another brilliant UK Search Awards ceremony; you’ve made a particular group of Frogs very happy indeed!

Head over to our case studies to read more about our award-winning work across SEO, PPC, Digital PR and more.

How To Use Generative AI To Automate Ecommerce Category Page Text


So let’s talk about ecommerce text on category pages.

If you’ve worked in ecommerce SEO for any length of time, you’ve likely seen this strategy. Below the product listings on a given category page, you’ll see text that’s clearly written for search.

This article is a guest contribution from Chris Long, VP of Marketing at Go Fish Digital.

Here’s an example on the site Lulus.com.

While there are no hard and fast rules around this text, you’ll generally find common characteristics such as:

  • 1,000+ words of text
  • Optimized for the target query of the category
  • Utilizes 3-4 common sections
  • Uses H2/H3 headings to pose questions that are answered with paragraphs below

The reason that many retailers deploy this strategy is that frankly…it works quite well. From the sites I work on and analyze, I consistently find that sites with this text at the bottom perform better in the search engines and that adding this text improves a category page’s chances of search performance.

The SEO community also tends to agree. From a poll I took, over 80% of respondents said that this SEO text helps improve visibility.

There’s other data out there as well. SearchPilot ran an SEO experiment where they found that removing this “SEO text” resulted in a 3.8% drop in organic traffic.

So clearly there’s something to this strategy.

However, the issue is that this requires resourcing from copywriting/editorial. This might not even be the highest-priority SEO copywriting initiative (blogging, product descriptions), let alone the biggest overall copywriting initiative (email, performance ads). So even assuming a retailer has the resourcing to execute, it’s still a challenge to implement this copy.


Some Upfront Caveats

I debated even writing this post. SEO still has an unclear relationship with content written by generative AI and its long-term impact.

However, after testing the process, I found that the output is generally pretty good and on-par with what successful sites are doing already. Where I landed is that this process serves as a helpful starting point for internal teams, but absolutely needs a copywriting team member to make significant edits and decisions.

The goal here isn’t to copy/paste the output but to give your team an initial draft that saves 50% of their workload.


How To Automate Ecommerce SEO Text With Screaming Frog + the OpenAI API

Alright, so now let’s start walking through the process of how to do this.

Earlier this year, Screaming Frog announced a monumental new feature: the ability to crawl with ChatGPT. You can connect a crawl to OpenAI’s API and get ChatGPT output directly in the crawl data. This is an extremely powerful feature, and I’m surprised it hasn’t been written about more.

So we’re going to do almost everything through the Screaming Frog interface, which is nice since most SEOs are already familiar with the tool.

For this article, we’re going to use Steve Madden as an example. This process will outline the steps we’d take to get SEO text written for their category pages.

Determine Your Ecommerce Sections

The first thing you’ll want to do is actually determine what sections you’ll want for the output. In general, this text has 3-4 sections, oftentimes based on common user questions.

You’ll want to perform some research on a common format for questions people have around your products. You can review competitors, use PAAs, or even use AI to determine common formats.

For Steve Madden, I came up with a few sample questions based on competitors and utilizing the H1. These are sections we’d want to appear on every page:

Get Your OpenAI API Key

Next, you’ll need to go grab an OpenAI API key. You’ll need this to connect Screaming Frog to OpenAI. The directions to grab that are over here.

Enable Storing HTML

While Screaming Frog has had the ability to connect to OpenAI for a while now through the use of JavaScript snippets, it’s now possible to connect directly to the API in version 21.0. With a single connection, you can run multiple ChatGPT prompts.

In order to run this process, you’ll first need to enable storing HTML. Navigate to Configuration > Spider > Extraction. In the “HTML” box ensure that both “Store HTML” and “Store Rendered HTML” are turned on.

Connect Your OpenAI API Key Into Screaming Frog

Now, let’s connect our crawl to OpenAI. Navigate to Configuration > API Access > AI > OpenAI. You should see a window open up that prompts you to connect your OpenAI API key. Add this here.

Select “Connect” and you should be all connected to ChatGPT for any prompts you want to run.

Configure Your ChatGPT Prompt

Another great part about this new version update is that it allows you to easily configure the ChatGPT prompt with simple fields and dropdowns. You can still go the Custom JavaScript route, but this way is just a bit simpler.

To configure your prompt, navigate to the “Prompt Configuration” tab and select “Add”. We’re going to configure a custom prompt to send to OpenAI.

You’ll then see a new row get added to the table with editable fields and dropdowns. Here’s the configuration setting I used for the prompt.

  • Name: Ecommerce SEO Text
  • Model: ChatGPT / gpt-4o
  • Content Type: HTML
  • Text: Page Text
  • Prompt: “You are an expert copywriter. Read the H1 and then create FAQs based on these questions. Answer 2-3 sentences for each question. What can you wear with this? How do you accessorize this? Is this still in style?”

Save all these configurations. You might need to come back and refine but this should at least get you started.

Note: You can also click the “play” button to test your prompt before running.

Filter Your Crawl To Category Pages

Before starting your crawl, you’ll want to be sure that you’re only crawling category pages. That way, you won’t have OpenAI create this content for pages that don’t need it (products, blog).

So let’s create a filter that only crawls category pages.

Since Steve Madden is on Shopify, they use the /collections/ modifier in the URL for all categories. I’ll go to Configuration > Include to open the “Include” filter. I’ll then type “/collections/.*” to crawl only these pages.

Of course, you’ll need to adjust this to whatever your category page URLs are.
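
For example (these patterns are illustrative assumptions, not from the original post, and your own site may differ), a WooCommerce store typically uses /product-category/ in its category URLs, so the Include filter might instead be:

/product-category/.*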

Start Your Crawl

At long last, we can start to crawl the site and automate the creation of this content using OpenAI. Plug your domain into Screaming Frog and start your crawl.

In the “Overview” sidebar navigate to “AI”. You should find your ChatGPT prompt that you configured. Click this and you can see the automatically generated descriptions from OpenAI.

Of course, now you can export them and begin to review them.

In fact, in 10 minutes, OpenAI completed 250 different pages. That’s a rate of 25 pages per minute. So in about 7 hours’ time, you could create this content for 10,000 category pages.


Analysing The Output

Now let’s take a look at the content that OpenAI produced. In general, they turn out to have strong outputs.

For instance, here are some FAQs it generated for their “Women’s Designer Boots” page:

It correctly uses the target query in each heading, answers the prompts accurately and gives a response that’s in-depth enough for these sections.

Here’s another example for their Men’s Boots page. Once again, a pretty solid response.

In fact, I’d say that 80% of the output I see when running this serves as a strong starting point. This is going to give your editorial team a jumping-off point and save them a ton of work when implementing this text.

Of course, there are times where these outputs are a complete miss. For instance, for the generic Denim page, the output starts to create content for “Denim Shoes” – a completely ridiculous concept:

This is exactly why you need a copywriter to review this. Some of these will be inaccurate, too broad, or simply miss the mark. Of course, this doesn’t even account for the lack of internal links.


Conclusion

So hopefully this is a helpful process for understanding how you can create this SEO text at scale. The goal here is for you to be able to get an initiative started that you might not typically have internal resourcing for.

While 5,000 category pages would take a single copywriter weeks to create content for, you can use generative AI to give you a starting point in just 5 hours’ time, with only 30 minutes to an hour of actual configuration.

Optimising Your Location Pages for Both Users and Search

In the digital landscape, effective communication is key to attracting and retaining site visitors and climbing search engine rankings.

Location pages and ‘About Us’ pages often play crucial roles in shaping user experience and enhancing SEO.

These pages are not only the key to providing essential information but also reflect your brand’s identity and values.

In this post, we’ll explore best practices for optimising a location page from both an SEO and UX copy perspective, with a part two blog focused on ‘About Us’ pages.

By implementing these strategies, anyone can improve search visibility, foster trust, and create a more engaging experience for their audience.


What Is a Location Page?

A location page is a dedicated area of a website that provides specific information about a geographic area related to a business or service.

Location pages are often used for business headquarters, store locations and events.

The information contained on a location page can vary from an address, contact information, and hours of operation, to maps and directions to help customers find the location easily.

Location pages often highlight the services within that area, along with customer testimonials and relevant images to enhance engagement.

These days, a lot of search results are heavily localised to each user based on things like their IP address, their device location and more. This means that a user in London searching for ‘appliance repair’ will see completely different search results to a user in Manchester using the same search term.

As such, location pages are crucial for local SEO and help businesses appear in search results when potential customers look for services in their area.

If your service requires an on-site visit, a well-crafted location page could be a game changer.


How To Improve Your Location Page Content

If you’re looking to improve the content on your location page, then follow these effective tips below:

  • Focus on Keywords: Targeting the correct keywords on your location page improves search visibility, drives relevant traffic, and attracts local customers. This contributes to increasing your site’s online presence. Often, the keywords will be a combination of your generic target keywords and the location in question, such as ‘appliance repair London’.
  • Make Sure Your Name, Address and Phone Number Are Correct: These three pieces of information are essential to any location page, and making sure that the information is correct and delivered to users in an easy-to-read manner is the key to avoiding confusion.
  • Always Add Opening and Closing Times: If relevant, including opening and closing times can improve user experience by providing information for planned visits. In terms of SEO, it can impact local search rankings and relevance for potential customers.
  • Make Use of Location-Specific Content: Location-specific content can enhance UX by making information relevant and relatable, which in turn fosters trust. Additionally, it helps prevent your location pages from being seen as potential doorway pages by search engines. By including relevant, location-specific information, you ensure that your pages are distinct and well-differentiated, even at scale.
  • Use Google Maps to Your Benefit: Embedding Google Maps on your location page can make it much easier for users to find you, improving the overall experience.
  • Location-Specific Images Are a Must: Similar to location-specific content, incorporating relevant imagery also helps differentiate your location pages, and increases legitimacy in the eyes of users if it’s your own unique material. See our Image SEO Guide for more tips.
  • Use Internal Linking: Internally linking to other relevant locations can help increase overall authority within the local area.
  • Include Focused Reviews: Reviews are a strong trust signal, especially if your product offering or service lines are seen as significant purchases. Including them on location pages enhances authority and trustworthiness, influencing potential customers’ decisions, and potentially improving your rankings for longtail keywords. It’s recommended that these reviews be from happy clients or customers from within a specific location. For example, if a client were from London, their review should definitely appear on London location pages.
  • Add Valuable FAQs: It’s important to add FAQs of value, for example ‘Can I get to [location] from public transport?’ These FAQs help drive relevancy and have the potential to appear in featured snippets.

How To Optimise Location Pages for SEO?

To keep in line with best practices, site owners can optimise their pages for SEO in the following ways.

Feature Customer Reviews

As mentioned previously, the quality and legitimacy of your reviews and how you present them could be the difference between a user converting, or bouncing back to the search results. Here are some tips on building the perfect customer reviews:

  • Build Trust: 86% of U.S. consumers believe transparency in business matters now more than ever. Showcasing authentic experiences, highlighting positive feedback, and responding to concerns is important. Transparency and engagement demonstrate reliability, fostering a sense of community and encouraging potential customers to engage.
  • Boost Credibility: Boost credibility with customer reviews by displaying verified testimonials, showcasing diverse experiences, and addressing negative feedback constructively. Authenticity and responsiveness reinforce trust, demonstrating commitment to customer satisfaction and quality service.
  • Improve Local SEO Efforts: Improve local SEO by encouraging customer reviews on platforms like Google and Yelp. Fresh, positive reviews can boost rankings, enhance visibility, and signal relevance to search engines, attracting more local traffic.
  • Influence Purchase Decisions: Influence purchase decisions by showcasing authentic customer reviews that highlight benefits and experiences. Positive testimonials address concerns, and provide social proof, encouraging potential buyers to choose your product or service.

Implement LocalBusiness Structured Data

Schema markup is a structured data vocabulary that helps search engines understand website content better.

Implementing the markup in your HTML, preferably using JSON-LD, can make you eligible for rich results, in turn helping you stand out amongst the competition.

You don’t need to add LocalBusiness schema on every page of your site, but it should be implemented on relevant pages like your homepage and location pages.
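
As a minimal illustration (the business details below are placeholders for the hypothetical example used elsewhere in this post, not a real implementation), a LocalBusiness snippet on a location page might look like this:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "RepairGuys - London",
  "url": "https://www.example.com/appliance-repair/london",
  "telephone": "+44 20 0000 0000",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "London",
    "postalCode": "EC1A 1AA",
    "addressCountry": "GB"
  },
  "openingHours": "Mo-Fr 09:00-17:30"
}
</script>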

Be sure to make use of Google’s Rich Results Test or the SEO Spider to double check implementation, and monitor performance in the search console.

Incorporate Relevant FAQs

Incorporating relevant FAQs on a location page can improve SEO by addressing common customer queries, enhancing user experience, and increasing engagement.

This content can lead to featured snippets in search results, boosting visibility and driving more targeted traffic to your site.

To include FAQs, identify common questions related to your location and services, and create a dedicated FAQ section on your page, using clear, concise answers. Use tools such as AlsoAsked to identify FAQs with search interest at scale.

Create a Localised URL Structure

Optimising URLs is a very small ranking factor, but by incorporating geographic identifiers and relevant keywords, you can also create a cleaner, more user-friendly structure.

Imagine a URL like www.example.com/ballet/london — it’s immediately clear to users what they can expect from this page.

This approach could not only affect your rankings, albeit in a small way, but also enhances the user experience by making it crystal clear how relevant your page is to a specific area.

To localise your URLs like a pro, sprinkle in those location keywords, use hyphens for easy reading, keep it concise, and maintain consistency across all your location-specific pages.

For more information, see our URL Structure Guide.

Create a Localised Page Title

A localised page title should include specific geographic keywords, enhancing relevance for local searches, for example, “Appliance Repair in London – RepairGuys.co.uk”.

Localised page titles can improve SEO by enhancing relevance for specific geographic searches, increasing click-through rates, and helping search engines understand the content context.

This ultimately drives more targeted traffic to a website.

To create a localised page title, include the location name and relevant keywords. Keep it concise, descriptive, and engaging. For more information on how to do this, check out our in-depth guide on page titles.

Location Pages for Areas You Don’t Have a Presence

On the subject of creating location pages for areas where you don’t have a physical presence, such as an office or headquarters, this is absolutely fine if your business does actually service those areas.

If you are spinning up location pages for areas that you do not currently serve, this is where it could be seen as spammy or unethical, as well as a poor user experience.

As well as this, it’s important not to go too granular when it comes to creating pages within locations. This is something to be checked on a case by case basis, and would involve analysing things such as search volume, the size of the location and the overall competitiveness of the area and industry.


How To Optimise Location Pages for UX and SEO Copy

Because a location page serves a practical purpose, outlining to customers where your services are provided, it’s important to consider how the copy reads through the lens of user experience.

Some key criteria for creating a well-written locations page include:

  • Clear & Concise Headings: Clear headlines help users quickly grasp the page’s purpose, making navigation intuitive and improving comprehension, while improving rankings. For more information see our SEO headings guide.
  • Descriptive Subheadings: Subheadings break up text and guide users through the content, enhancing readability and allowing for easier scanning of information.
  • Local Language & Terms: Using familiar terms and phrases relevant to the local audience creates a connection, making the content more relatable and engaging.
  • Action-Oriented Language: Encouraging users to take action with strong verbs (for example, ’Visit,’ ‘Contact’) in the copy motivates engagement and increases conversion rates.
  • Engaging & Relevant Content: Providing valuable information about local services, events, and community ties keeps users interested and encourages them to explore further.
  • FAQs With Clear Answers: Anticipating user questions and providing direct answers enhances user satisfaction, reducing confusion and increasing trust in the business.
  • Consistent Tone of Voice: A consistent tone fosters familiarity and reinforces brand identity, making users feel more comfortable and connected with the business.
  • Use Bulleted Lists: Bullet points make information digestible and easier to scan, helping users find key details quickly without overwhelming them with text.
  • Highlight Unique Selling Points (USPs): Showcase what sets your location apart (special services, promotions) to help persuade users to choose your business over competitors.
  • Create a Dedicated ‘Team’ Section: A team section helps shine a spotlight on team members who are responsible for work in that area. This adds both authority and an air of expertise to a locations page.
  • Call-to-Action (CTA) Clarity: Clear, compelling CTAs guide users on the next steps, improving conversion rates by making engaging with your business easy.

Implement CRO Tactics Based on Optimisation Needs

CRO tactics are a set of strategies used by SEO professionals to increase conversion rates, improve user experience, maximise ROI, and effectively turn visitors into customers.

These tactics include A/B testing, user feedback, optimised CTAs, streamlined navigation, personalised content, mobile optimisation, and trust signals to enhance conversions.

Additional ways CRO tactics can be used include:

  • Regularly Update Content: Regularly updating content is crucial for SEOs as it increases accuracy, enhances user experience, and reflects current offerings. Fresh content improves search rankings, engages visitors, and builds trust, ultimately driving more traffic and conversions as business needs evolve.
  • Monitor Page Engagement Metrics in GA4: Monitoring page engagement metrics in GA4 is useful for SEO as it reveals user behaviour, identifies high-performing content, and highlights areas for improvement. Key metrics include bounce rate, average session duration, and pages per session, as they indicate content relevance and user interest. Tracking user behaviour helps identify user engagement patterns, revealing which content attracts attention.

Track Local Keyword Rankings

Tracking local keyword rankings for a location page is crucial for SEO as it helps assess visibility in local searches.

It identifies which keywords drive traffic, allowing optimisation efforts to focus on high-performing terms.

Use tools like Google Search Console, SEMrush, or Ahrefs to track local keyword rankings for your location page. It’s important to use location-based rank tracking as it helps site owners monitor any searches from specific locations.

Monitor specific local keywords, analyse performance over time, and adjust your SEO strategy based on insights gathered to improve visibility and drive more targeted traffic.


To Summarise

In summary, a well-crafted location page is essential for any business looking to enhance its online presence and attract local customers.

By providing clear, relevant information about your geographic area, you not only improve your visibility in search results but also create a seamless user experience.

Implementing best practices — such as optimising for local keywords, providing accurate NAP details, incorporating customer reviews, and using schema markup — can significantly boost your local SEO efforts.

Also, focusing on user experience through engaging content and clear calls to action will help convert visitors into loyal customers.

Don’t overlook the power of location pages; they are a vital part of your marketing strategy that can make a lasting impact on your business’s success.

We hope you’ve taken away valuable information that can be used to enhance the SEO and UX potential of your locations page.

For all your copywriting needs, see our dedicated copywriting services page, where we offer you high-quality, tailor-made copy with boundless SEO potential.

Keep your eyes peeled for part two in this blog series, “How to Write an ‘About Us’ Page”.

Screaming Frog SEO Spider Update – Version 21.0

We’re delighted to announce Screaming Frog SEO Spider version 21.0, codenamed internally as ‘towbar’.

This update contains new features and improvements based upon user feedback and, as ever, a little internal steer.

So, let’s take a look at what’s new.


1) Direct AI API Integration

In our version 20.0 release we introduced the ability to connect to LLMs and query against crawl data via custom JavaScript snippets.

In this update, you’re now able to directly connect to OpenAI, Gemini and Ollama APIs and set up custom prompts with crawl data.

You can configure up to 100 custom AI prompts via ‘Config > API Access > AI’.

Direct AI Integration with OpenAI

You’re able to select the category of model, the AI model used, content type and data to be used for the prompt such as body text, HTML, or a custom extraction, as well as write your custom prompt.

The SEO Spider will auto-control the throttling of each model and data will appear in the new AI tab (and Internal tab, against your usual crawl data).

AI Tab results

In a similar way as custom JS snippets, this can allow you to create alt text at scale, understand the language of a page, detect inappropriate content, extract embeddings and more.

The ‘Add from Library’ function includes half a dozen prompts for inspiration, but you can add and customise your own.

OpenAI Add From Library

The benefits of using the direct integration over custom JS snippets are –

  • You can input your API key once for each AI platform, which will be used for all prompts.
  • You don’t need to edit any JavaScript code! You can just select requirements from dropdowns and enter your prompt into the relevant field.
  • JavaScript rendering mode isn’t required, data can be returned through any crawl mode.
  • The APIs are automatically throttled as per their requirements.

This new AI integration should make it even more efficient to create custom prompts when crawling. We hope users will utilise these new AI capabilities responsibly for genuine ‘value-add’ use cases.


2) Accessibility

You can now perform an accessibility audit in the SEO Spider using the open-source AXE accessibility rule set for automated accessibility validation from Deque.

This is what powers the accessibility best practices seen in Lighthouse and PageSpeed Insights. It should allow users to improve their websites to make them more inclusive, user friendly and accessible for people with disabilities.

Accessibility can be enabled via ‘Config > Spider > Extraction’ (under ‘Page Details’) and requires JavaScript rendering to be enabled to populate the new Accessibility tab.

Accessibility Config

The Accessibility tab details the number of accessibility violations at different levels of compliance based on the Web Content Accessibility Guidelines (WCAG) set by the W3C.

Accessibility Tab

An accessibility score for each page can also be collected by connecting to Lighthouse via PageSpeed Insights (‘Config > API Access > PSI’).

WCAG compliance levels build upon each other, starting from WCAG 2.0 A to 2.0 AA, then 2.0 AAA, before moving on to 2.1 AA and 2.2 AA. To reach the highest level of compliance (2.2 AA), all violations from the previous levels must also be resolved.

The Accessibility tab includes filters by WCAG with over 90 rules within them to meet that level of compliance at a minimum.

Accessibility Tab filters in Overview tab

The right-hand Issues tab groups them by accessibility violation and priority, which is based upon the WCAG ‘impact’ level from Deque’s AXE rules and includes an issue description and further reading link.

Accessibility issues in the right hand Issues tab

The lower Accessibility Details tab includes granular information on each violation, the guidelines, impact and location on each page.

Accessibility Details tab

You can right-click on any of the violations on the right-hand side, to ‘Show Issue in Browser’ or ‘Show Issue In Rendered HTML’.

All the data including the location on the page can be exported via ‘Bulk Export > Accessibility > All Violations’, or the various WCAG levels.

Accessibility Bulk Exports

There’s also an aggregated report under the ‘Reports’ menu.

Please read our How To Perform A Web Accessibility Audit tutorial.


3) Email Notifications

You can now connect to your email account and send an email on crawl completion to colleagues, clients or yourself to pretend you have lots of friends.

This can be set up via ‘File > Settings > Notifications’ and adding a supported email account.

Email Notifications

You can select to ‘Email on Crawl Complete’ for every crawl to specific email address(es).

Crawl complete emails

So many friends.

Alternatively, you can send emails for specific scheduled crawls upon completion via the new ‘Notifications’ tab in the scheduled crawl task as well.

Email Notifications from scheduled crawls

The email sent confirms crawl completion and provides some top-level data from the crawl.

Email Notification Delivered

We may expand this functionality in the future to include additional data points and data exports.

Please read about notifications in our user guide.


4) Custom Search Bulk Upload

There’s a new ‘Bulk Add’ option in custom search, which allows you to quickly upload lots of custom search filters, instead of inputting them individually.

Bulk Upload Custom Search

If you’re using this feature to find unlinked keywords for internal linking, for example, you can quickly add up to 100 keywords to find on pages using ‘Page Text No Anchors’.

Custom search bulk upload filters

Please see our ‘How to Use Custom Search‘ tutorial for more.


Other Updates

Version 21.0 also includes a number of smaller updates and bug fixes.

  • Additional crawl statistics are now available via the arrows in the bottom right-hand corner of the app. Alongside URLs completed and remaining, you can view elapsed and estimated time remaining, as well as crawl start date and time. This data is available via ‘Reports > Crawl Overview’ as well.
  • Custom Extraction has been updated to support not just XPath 1.0, but 2.0, 3.0 and 3.1.
  • Scheduling now has ‘Export’ and ‘Import’ options to help make moving scheduled crawl tasks less painful.
  • The Canonicals tab has two new issues for ‘Contains Fragment URL’ and ‘Invalid Attribute In Annotation’.
  • The Archive Website functionality now supports WARC format for web archiving. The WARC file can be exported and viewed in popular viewers.
  • You can now open database crawls directly via the CLI using the --load-crawl argument with the database ID for the crawl. The database ID can be collected in the UI by right-clicking in the ‘File > Crawls’ table and copying it, or viewed in the CLI using the --list-crawls argument (see the sketch after this list).
  • There’s a new right click ‘Show Link In Browser’ and ‘Show Link in HTML’ option in Inlinks and Outlinks tab to make it more efficient to find specific links.
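
As a rough sketch of the CLI workflow described above (the executable name below is an assumption and varies by platform; only the --list-crawls and --load-crawl arguments are taken from these release notes):

screamingfrogseospider --list-crawls
screamingfrogseospider --load-crawl <database-id>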

That’s everything for version 21.0!

Thanks to everyone for their continued support, feature requests and feedback. Please let us know if you experience any issues with this latest update via our support.

Small Update – Version 21.1 Released 14th November 2024

We have just released a small update to version 21.1 of the SEO Spider. This release is mainly bug fixes and small improvements from the latest major release –

  • Fixed issue with custom database locations not being picked up.
  • Fixed bug in OpenAI Tester.
  • Fixed a couple of crashes, including for users that had ‘auto connect’ selected for the old GA API, which hasn’t been available for some time (and is now removed!).

Small Update – Version 21.2 Released 28th November 2024

We have just released a small update to version 21.2 of the SEO Spider. This release is mainly bug fixes and small improvements –

  • Reduce AI credit use by not sending blank prompts.
  • Ensure all images in JavaScript rendering mode are available to AI APIs.
  • Fixed issue with browser not finding accessibility issues when opened.
  • Fixed issue preventing crawls being exported via the right click option in the crawls dialog.
  • Fixed a bug with segments ‘greater than’ operator not working correctly.
  • Fixed various unique crashes.

Small Update – Version 21.3 Released 11th December 2024

We have just released a small update to version 21.3 of the SEO Spider. This release is mainly bug fixes and small improvements –

  • Update AMP validation to latest version.
  • Fixed issue with invalid schedules being created when removing email from notifications.
  • Fixed issue with GA4/GSC crawls sometimes not completing.
  • Fixed issue with loading in some AI API presets.
  • Fixed various unique crashes.

Screaming Frog Crawling Clinic Returns to brightonSEO San Diego

The post Screaming Frog Crawling Clinic Returns to brightonSEO San Diego appeared first on Screaming Frog.

It’s that time of year again! After a successful visit to San Diego last year, the Screaming Frog Crawling Clinic are packing up their gear and heading across the pond for brightonSEO San Diego Edition once again. This time it’s hosted from the 18th to the 20th of November.

You’ll find us at stand 19, right at the heart of the action as you enter the venue. Be sure to pop by to chat all things SEO Spider (and Log File Analysis if you really want to nerd out), as well as pick up some cool swag.

We’ll happily run through some of our latest releases, diagnose issues, listen to your feature requests or talk to you about our award-winning agency services.


Top Talks

We’ve got our eyes on some must-see talks at brightonSEO, so if you’re wondering which sessions to prioritise, here are our top picks:

Felipe Bazon: Topical Authority in the Age of Entity SEO

Wednesday 20th, 9:15 am

If you’re looking to gain an edge on topical authority and entity SEO, don’t miss this session with Felipe Bazon. We had the pleasure of catching him at a similar event in Eindhoven earlier this year, and his insights were truly eye-opening. It’s bound to be one of the highlights of the week!

Ross Hudgens: Data-Driven Lessons from 12+ Years in Content-Led SEO

Tuesday 19th, 9:15 am

For any content-focused SEOs, Ross Hudgens‘ talk is one for your calendar. As long-time fans of Ross’ work with Siege Media, we’ve been particularly inspired by their approach to keyword opposition to benefit (KOB) analysis. In fact, we applied this method to our own work, with great results — check out the full rundown of the KOB strategy here. If you’re into content marketing and SEO, this session is a must.

Pre-Event Fun: The Boardwalk Bash

Before the main event kicks off, make sure you head to the brightonSEO Boardwalk Bash on the evening of the 18th for some networking, free beers, and beachside vibes. We’re thrilled to be sponsoring this event, and you’ll find our signature Screaming Frog beer mats scattered around – so grab some beers and mats on us! Find out more about the event here.

Whether it’s your first brightonSEO (UK or USA) or you’re a regular attendee, we’re excited to see you there. Swing by Stand 19, grab some merch, and chat with the team about how we can help you level up your SEO game.

See you in San Diego!

The post Screaming Frog Crawling Clinic Returns to brightonSEO San Diego appeared first on Screaming Frog.

Create Custom Heatmap Audits With the SEO Spider https://www.screamingfrog.co.uk/heatmap-audits-with-screaming-frog/ https://www.screamingfrog.co.uk/heatmap-audits-with-screaming-frog/#comments Mon, 21 Oct 2024 08:37:06 +0000 https://www.screamingfrog.co.uk/?p=289452 Ten years ago, I was searching for a tool that would help me determine what was wrong with a website I was working on. While scrolling through posts on a forum, I noticed that people had been praising a tool that had a catchy name, as opposed to all the...

The post Create Custom Heatmap Audits With the SEO Spider appeared first on Screaming Frog.

Ten years ago, I was searching for a tool that would help me determine what was wrong with a website I was working on.

While scrolling through posts on a forum, I noticed that people had been praising a tool that had a catchy name, as opposed to all the other “SEO-something-something” tools. So I gave it a shot.

This article is a guest contribution from Miloš Gizdovski, Operations Manager at Lexia.

It’s October 2024 now, I’m still using this tool, and the name is quite familiar in the SEO community — Screaming Frog SEO Spider Tool.

Looking back, I can’t imagine doing any kind of technical SEO audit without it. None of my colleagues at Lexia marketing agency can either.

We’re happy when that “New version available” message pops up as it feels like a movie trailer we’ve been looking forward to for months has finally been released.

A few years ago, one of our new clients had hundreds of blog posts on their website. After the usual procedure and initial audits, I wanted to create something that would help me determine which of those posts were actually worth focusing on.

I saw a map of users’ scores for each episode of Game of Thrones, so I thought it would be cool to use something similar for the blog posts. This would show each month’s data in terms of organic users, sessions, bounce rates, conversions, clicks, impressions, and average positions.

After several tries, I managed to create a report that could be applied to any website.

These reports have a name – Lexia Heatmaps, or just Heatmap reports if you like.

Lexia Heatmaps show the trend of specific parameters over a period of several months or even a year. However, instead of just one page, the reports show the trend for all pages at the same time. The result is a clean-looking report that can reveal plenty of opportunities and threats at a glance.

The following sections of this article will describe how to create a heatmap report for any group of similar pages.

The blog section will be used as an example, but the same approach can be applied to products or product collections as well.

A quick note — it will require some familiarity with Excel functions to create the heatmaps, as they are a bit more advanced than the SEO reports that you can export directly from Screaming Frog SEO Spider. Don’t worry, I will show you how to do it step by step!


First Step – Collect the Blog Posts

Obviously, the first step would be to export the list of blog posts in order to create the heatmap.

There are several ways to do this:

  • Manually collecting the URLs
  • Creating a custom extraction in the SEO Spider tool
  • Exporting the links from the sitemap

The first option is the most time-consuming. It’s not a problem for a website with 10 blog posts, but try doing it for a site with over a hundred.

The second option is using the SEO Spider tool. You can create a custom extraction by picking the specific elements, for example:

  • /blog/ path in the URL
  • Author’s section
  • Publishing date

However, I find that the third option, which is using the sitemap, is the most suitable one.

If you’re lucky, there will be a “Post” sitemap where all the blog posts are listed. This comes in handy in cases where the blog posts don’t have /blog/ in the URL.
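
As a side note, if you ever want to pull that list outside the SEO Spider (or just sanity-check it), the post sitemap can be read with a few lines of Python. This is only a sketch with a made-up sitemap URL, not part of the heatmap workflow itself:

import xml.etree.ElementTree as ET

import requests

# Hypothetical "post" sitemap URL - replace with your own
SITEMAP_URL = "https://www.example.com/post-sitemap.xml"

# Standard sitemap namespace used by the <url>/<loc> elements
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

response = requests.get(SITEMAP_URL, timeout=30)
response.raise_for_status()

root = ET.fromstring(response.content)
urls = [loc.text.strip() for loc in root.findall(".//sm:loc", NS)]
print(f"Found {len(urls)} blog post URLs")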


Second Step – Prepare the SEO Spider Tool

Now that you have a list, you can proceed with gathering the data. There are two things you need to do before running the crawl.

The first one is adjusting the crawl mode, and the second one is setting up the API.

Crawl options can be found under Mode, where you need to select List.

Look for the API under the Configuration options. For the purpose of this article, I will choose Google Analytics 4, which is one of the several options available for selection.

After the API Access window opens, sign in with a Google account that has access to Analytics.

The main tab of the API Access will now show this:

Here is where you can choose the account — pick the one you need and be sure to select the right property and data stream. In my case, the website has only one property and data stream, so I will keep it as All Data Streams.

The next step is setting up the Date Range, which is the tab next to the Account Information:

We have to run a new scan for each month, but remember — this only makes sense if the month is over. So, I’ll start with September 2024, as I have all the data for September ready.

The next tab is Metrics.

Since I already have the Sessions included by default, I will leave everything as is.

However, if I wanted to create a heatmap of the Average session duration, I would need to check this parameter in the Session list, and it would be included in the results.

Lastly, there is one more tab that we need to adjust – Filters.

Since you want to check the organic performance of the blog posts, here is what you need to select:

I didn’t include Organic Social because, for this experiment, I’m only interested in users that came from organic search.

If you want to check the users coming organically from social media, or even the Direct traffic users, you can do it as well! Just select the channel and your heatmap will show the trend of direct traffic MoM or YoY.


Third Step – The Results

We are now ready to start the crawling process and get organic sessions data for the month of September.

I already mentioned that you have several options for uploading the list of URLs using the List mode:

In my case, I just pasted the URL of the sitemap.

This is the best option for situations where a lot of blog posts have been published since the last time you updated the heatmap, and here is why:

Whenever you publish a blog post, it ends up in the sitemap so that search engine bots can discover it. By adding the sitemap URL to the SEO Spider tool, you make sure every URL is read.

If you simply paste the previous month’s list, you will miss the opportunity to track the performance of the new blog posts that have been published since. Most of these new blog posts will have just a few sessions but, after all, they should be tracked from the beginning.

After the crawl is over, head to the Analytics section and check the All reports:

Slide the results to the right and you will see the scores from GA4. The GA4 Sessions is the column we are looking for:

All the numbers in that column are the GA4 organic sessions for each page individually.

The process of gathering September data is done, so now we have to add it to the heatmap list.


Fourth Step – Data Entry

We have two scenarios here:

  • You are building the heatmap from scratch.
  • You already have the heatmap and just need to add this month’s data.

Both of these cases follow the same process, but of course, you’ll need more time to create everything from scratch.

Scenario #1 – Creating a New Heatmap Report From the Start

In our example, the reports are created in Google Spreadsheets. The blank report looks like this:

There are two main sections here.

  • On the left, there is the URL map section, where the list of all blog posts should be added.
  • On the right, you can see the months and Organic Sessions 2024 in the header. This is where the results from the SEO Spider tool will go.

Since we started with an empty document, I will just copy and paste the results from the SEO Spider tool:

This process should be repeated for all months, meaning that you will need to adjust the Date Range and include only August.

In order to do this, you first need to Clear the crawl and then go to the API section and adjust the Date Range:

The next part is very important — the new crawl will re-order the URLs, so don’t simply copy and paste the GA4 Sessions results.

Go to the main Export option and export each monthly crawl instead:

The order of the original list will stay the same in the exported file (.xlsx), so you just have to get to the GA4 Sessions column, which was the BQ column for me.

Once you have everything sorted and all months added to the reports, it’s time for the last step – adding the colors.

In Google Spreadsheets, you can easily do this by selecting the range of cells and applying the Conditional formatting. The process is almost the same as in Excel.

Go to the Conditional formatting option:

On the right side, you’ll see a new window with various options. It will be Single color by default.

Select the range by clicking on the small windows icon:

I will select everything between columns C and N. Now we have to format the rules.

The first rule will be that if the cell is equal to 0, the cell color should be light red:

Now, while you’re still in the editor, click on the + Add another rule, just below the Cancel / Done button. It will open another formatting option within the same range.

You now have to apply another rule, but think about this one. Some websites will have a lot of traffic, so adding values between 1-10 would be a waste of time (and color).

In such cases, even values between 1 and 100 could represent the low-traffic blog posts.

In this example, I will use lower values to create the heatmap report:

After all the “between” rules are defined, the last one could simply be “greater than”:

The heatmap is now finalized.

A few tips to keep in mind while creating the format rules:

  1. Decide which color should show zero traffic and which should show positive values. I pick light red for zeros and shades of green for positive numbers.
  2. Define the “between” values, but make sure the end values don’t overlap. For instance, if one rule covers 1-10, the next one should start at 11 (see the quick check after this list).
  3. Use gradients of the same color. This will help you spot the trend more easily.
  4. The last color variants will be darker, so change the text color from dark to white for those rules to keep the values readable.
  5. You don’t have many color options by default, so choose the “between” values wisely.
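
If you want to double-check that your bands behave as expected before building the rules, you can bucket a sample of session counts with pandas. The band edges below are only an example; pick ones that match your own traffic levels:

import pandas as pd

# Example band edges - adjust to your own traffic levels.
# pd.cut uses half-open bins, so the bands cannot overlap by construction.
bins = [-1, 0, 10, 50, 100, float("inf")]
labels = ["0", "1-10", "11-50", "51-100", "100+"]

sessions = pd.Series([0, 3, 12, 48, 75, 240], name="ga4_sessions")
bands = pd.cut(sessions, bins=bins, labels=labels)
print(pd.concat([sessions, bands.rename("band")], axis=1))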

Scenario #2 – Updating an Existing Heatmap Report

Let’s imagine you have to add the previous month’s values to the existing list. However, you’ve published some articles since the last Heatmap report update, so you have to use the Sitemap upload option.

The order of the URLs will not match the one you already have in the Heatmap report.

There are several options to match the URLs, depending on how many new articles you have published.

Option 1

If you haven’t published many new articles, you can just manually add them to the Heatmap report list.

Copy your updated Heatmap list to an empty Excel file and save it. After that, set up the SEO Spider tool’s API and change the mode to List.

Instead of choosing the Download XML Sitemap option, choose From a file. Upload your Excel file, run the crawl, and hit the main Export button.

Find the GA4 Sessions option and copy data to the Heatmap report.

Option 2

If you have published a lot of new blog posts, it will take some time to manually add them to the list.

You can use Excel to identify the new articles instead.

Set up the API for the new month and run the usual crawl from the sitemap (mode List, upload option Download XML Sitemap).

Then, paste the URL list into one column and the GA4 results into the other column of an Excel file, let’s say columns A and B.

In the fourth column (D), just paste the list of URLs from your Heatmap report. I like to keep one column empty (column C), purely to make the layout easier to read once the real URLs are in place.

It will look like this:

The columns will have different URL orders — this will happen when you upload the list from the sitemap and your Heatmap report.

These two lists will not have the same number of rows because of the newly published articles. The Sitemap column will have more rows in this case.

So, we need to match two URL lists that are in a different order and assign the GA4 Sessions values to the Heatmap list using a VLOOKUP function.

Go to column E, add a new header (Heatmap GA4 results) in the first row and create this formula:

=VLOOKUP(D:D,A:B,2,FALSE)

meaning:

Look up each URL from column D in column A, and return the matching value from column B into the corresponding cell (FALSE forces an exact match).

Apply the function to all cells.

Here is the result, where you can see that the GA4 Sessions value for URL 2 (from the Sitemap list) is now present in the Heatmap list as well:
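
As an aside, the same exact-match lookup can be reproduced outside the spreadsheet with pandas. The data below is made up purely to mirror the layout described above (crawl export in columns A and B, Heatmap URLs in column D):

import pandas as pd

# Columns A/B: this month's crawl export, in sitemap order
crawl = pd.DataFrame({
    "url": ["https://example.com/post-1", "https://example.com/post-2"],
    "ga4_sessions": [120, 35],
})

# Column D: URL order from the existing Heatmap report
heatmap = pd.DataFrame({
    "url": ["https://example.com/post-2", "https://example.com/post-1"],
})

# A left join keeps the Heatmap order and pulls in the matching sessions,
# just like VLOOKUP with FALSE (exact match)
result = heatmap.merge(crawl, on="url", how="left")
print(result)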

The new blog posts will be present somewhere in column A, but we want to add them to column D as well.

We can easily find them by checking for duplicate values across columns A and D, and then looking for the cells that were not marked as duplicates.

First, select these two columns and search for duplicates using the Conditional Formatting > Highlight Cells Rules > Duplicate Values option:

The duplicates in both columns will have red cells, while some of the cells in column A will remain transparent. These are your new blog posts, still not included in the Heatmap report.

The last step is to collect them all, which can be done by applying a filter:

  1. Select column A
  2. Add the Filter (upper right corner, Sort & Filter option)
  3. Filter by Color
  4. No Fill
  5. Expand the selection

Here are the URLs that were not found in column D:

You can now copy them to column D and run an additional Duplicate check, if you want to make sure that all URLs are present in the Heatmap column (D).
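
If you prefer to script this step as well, the new posts are simply the sitemap URLs that don’t appear in the Heatmap list yet. A minimal sketch with made-up URLs:

import pandas as pd

# URLs from this month's sitemap crawl (column A) and from the Heatmap report (column D)
sitemap_urls = pd.Series([
    "https://example.com/post-1",
    "https://example.com/post-2",
    "https://example.com/new-post",
])
heatmap_urls = pd.Series([
    "https://example.com/post-1",
    "https://example.com/post-2",
])

# Anything in the sitemap list that isn't in the Heatmap list is a new post
new_posts = sitemap_urls[~sitemap_urls.isin(heatmap_urls)]
print(new_posts.tolist())  # ['https://example.com/new-post']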

Your heatmap list is now ready, so just add the values from column E to your Google spreadsheet, and don’t forget to expand the spreadsheet’s URL list by adding the new blog posts from Excel’s column D.

The process is repeated each month. Don’t worry, you will become quicker over time, so it will probably take you around 20-30 minutes to add GA4 values for the new month after you get familiar with the process.


How to Use a Lexia Heatmap Report?

The heatmap reports are not a new thing in marketing. A lot of us have used various tools to determine where people click on our website, so we could optimize the pages for more conversions.

This heatmap is a bit different, because it’s based on a larger group of pages, values, and parameters.

It shows trends that help us identify winning and losing pages, for instance. If we see that certain blog posts are losing traffic over time, we can ask ourselves – why?

Maybe they need more internal links, a content rewrite, updated images and videos, refreshed meta tags, etc.

On the other hand, we see that some pages are doing great. How to use them? Well, they can become a source of internal links to other pages that need a “push”. Or, we can promote those pages on social media and newsletters, and get even more positive results.

How about products?

We can use the Lexia Heatmap report there as well. If there are products that bring traffic, but no sales, maybe we can add a discount there. How about those that have low traffic, but sales are great? Include them into quality blog posts, prepare newsletters, and promote them on social media.

Combining several heatmap reports will give you an even better perspective. If you include GSC and Ahrefs, for example, you can track clicks, impressions, positions, backlinks, and more.

Options are endless.

What do you think about Lexia Heatmaps? Let me know on LinkedIn, or contact me via our Lexia website.

The post Create Custom Heatmap Audits With the SEO Spider appeared first on Screaming Frog.

The brightonSEO Crawling Clinic October ’24 https://www.screamingfrog.co.uk/brighton-seo-october-24/ https://www.screamingfrog.co.uk/brighton-seo-october-24/#respond Fri, 27 Sep 2024 09:09:09 +0000 https://www.screamingfrog.co.uk/?p=287344 The biannual brightonSEO events have been marked in our calendar for well over 12 years now, and next week’s is no exception. As always, you can find us in our usual spot (stand 34, right hand side of the exhibition hall as you walk in): Come and meet the team...

The post The brightonSEO Crawling Clinic October ’24 appeared first on Screaming Frog.

The biannual brightonSEO events have been marked in our calendar for well over 12 years now, and next week’s is no exception. As always, you can find us in our usual spot (stand 34, right hand side of the exhibition hall as you walk in):

Come and meet the team and discuss any issues you’re experiencing, our exciting version 20 features, things you’d like to see added to the SEO Spider, and more. We’re also offering a full 2-week trial licence, and the team are more than happy to give you a primer on how best to use it.

If you’re after agency services, we’re also one of the most decorated agencies in the UK, and one of our team would happily talk you through our award-winning offering.


GreenSEO Meet-up

Heading down early? Our SEO & Data Manager, Aaron, is speaking at the GreenSEO Meet-Up on Wednesday 2nd October, which we’ve also sponsored. If you’re interested in how SEO practices can contribute to reducing the environmental impact of websites, you’ll definitely want to attend.


Merch

Lastly, we’ll be dishing out our highly-coveted merch, including new beanies in a range of colours, ready for the coming winter months…!

We look forward to seeing you all next week!

The post The brightonSEO Crawling Clinic October ’24 appeared first on Screaming Frog.

Using the Screaming Frog SEO Spider and OpenAI Embeddings to Map Related Pages at Scale https://www.screamingfrog.co.uk/map-related-pages-at-scale/ https://www.screamingfrog.co.uk/map-related-pages-at-scale/#comments Mon, 23 Sep 2024 08:53:37 +0000 https://www.screamingfrog.co.uk/?p=285915 Since Screaming Frog SEO Spider version 20.0 was released, SEOs can connect Screaming Frog and OpenAI for several use cases, including extracting embeddings from URLs. Using embeddings is a powerful way to map URLs at scale at a high speed and low cost. In this blog post, we’ll explain step...

The post Using the Screaming Frog SEO Spider and OpenAI Embeddings to Map Related Pages at Scale appeared first on Screaming Frog.

Since Screaming Frog SEO Spider version 20.0 was released, SEOs can connect Screaming Frog and OpenAI for several use cases, including extracting embeddings from URLs.

Using embeddings is a powerful way to map URLs at scale at high speed and low cost. In this blog post, we’ll explain step by step what they are and how to map them using Screaming Frog, ChatGPT (OpenAI API) and Google Colab. This post is a more complete version of my original post, with more use cases and feedback from SEOs who have tried it.

After your crawl, all you need to do is upload a sheet and you’ll receive another one back, with each source URL and its related pages. It’s that easy!

This article is a guest contribution from Gus Pelogia, Senior SEO Product Manager at Indeed.


Use Cases

Before we dive into the how, let’s explain the why. Mapping pages at scale has several use cases, such as:

  • Related pages: if you have a section on your website where you list related articles or suggested reads on the same topic
  • Internal linking: beyond matching anchor text, your links will have better context because the page topics are related
  • Page tagging or clustering: for cases where you want to create link clusters or simply understand performance per topic, not per single page
  • Keyword relevance: as described on the iPullRank blog, where they explain a method to find the ideal page to rank for a keyword based on the keyword and page content

What Are Embeddings?

Let’s get it straight from the horse’s mouth. According to Google on their Machine Learning (ML) crash course:

Embeddings make it easier to do machine learning on large inputs like sparse vectors representing words. Ideally, an embedding captures some of the semantics of the input by placing semantically similar inputs close together in the embedding space. An embedding can be learned and reused across models.

In my own SEO words: embeddings are unique numbers attributed to words on a page.

If this is still not clear, don’t get caught up on the concept. You can still find similar pages without knowing the theory.


What Is Cosine Similarity?

At this point, you have thousands of embeddings mapped. Each URL has hundreds of these numbers, separated by commas. The next step is to understand cosine similarity. As this iPullRank article puts it, “The measure of relevance is the function of distance between embeddings”.

In my own SEO words: with embeddings, you transformed pages into numbers. With cosine similarity, you’re finding how topically close these numbers/words/pages are. Using the Google Colab script (more on it later) you can choose how many similar pages you want to put next to each other.

You’re matching the whole page content, not just the title or a small section, so the proximity is a lot more accurate.
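
To make the idea concrete, here is the calculation on two tiny made-up vectors; real embeddings have hundreds of dimensions, but the maths is the same:

import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: closer to 1.0 means more topically similar
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

page_a = [0.12, 0.83, 0.05]
page_b = [0.10, 0.80, 0.07]
print(cosine_similarity(page_a, page_b))  # close to 1.0, i.e. very similar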


Using Screaming Frog + OpenAI to Extract Embeddings

Here’s where things start getting more hands-on. First of all, you need to get an OpenAI API key and add some credit to it. I’ve extracted embeddings from 50,000 URLs for less than $5 USD, so it’s not expensive at all.

Open Screaming Frog and turn JavaScript rendering on. From the menu, go to Configuration > Crawl Config > Rendering > JavaScript.

Then, head to Configuration > Custom > Custom JavaScript:

Lastly, select Add from Library > (ChatGPT) Extract embeddings […] > Click on “JS” to open the code and add your OpenAI key.

Now you can run the crawl as usual and embeddings will be collected. If you want to save a bit of time, untick everything on Configuration > Crawl and Extraction since you won’t look at internal links, page titles or other content or technical aspects of a website.
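
For reference, a single embedding request through OpenAI’s Python client looks roughly like the sketch below. The model name is an assumption purely for illustration; the library snippet inside the SEO Spider takes care of the actual request for you:

from openai import OpenAI

client = OpenAI(api_key="sk-...")  # your OpenAI API key

response = client.embeddings.create(
    model="text-embedding-3-small",  # assumed model, purely for illustration
    input="How to create custom extractions in the SEO Spider",
)

vector = response.data[0].embedding
print(len(vector), vector[:5])  # a long list of floats representing the page text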


Using LLMs to Create a Python Script

Once your crawl is done, it’s time to use ChatGPT again to create the code for your tool. Ask something along the lines of: “Give me Python code that allows me to map [5] related pages using cosine similarity. I’ll upload a spreadsheet with URLs + Embeddings to this tool. The code will be placed on Google Colab”.

You can try it yourself or use my existing Related Pages Script to upload your sheet directly, reverse engineer the prompt or make improvements. The tool will ask you to upload your csv file (the export from Custom JavaScript created by Screaming Frog). The sheet should have two headers:

  • URL
  • Embeddings

Once it processes the data, it’ll automatically download another csv with Page Source and Related Pages columns.
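
For reference, the core logic of such a script usually boils down to the sketch below: parse the Embeddings column, compute a cosine similarity matrix, and keep the top N matches per URL. This is my own rough approximation rather than a copy of the linked Colab script, and the file names and parsing helper are assumptions:

import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

TOP_N = 5  # number of related pages to keep per URL

def parse_embedding(value):
    # Embeddings arrive as comma-separated numbers; strip any surrounding brackets
    return np.array([float(x) for x in str(value).strip().strip("[]").split(",")])

# Export from the Custom JavaScript tab, with the two headers renamed as above
df = pd.read_csv("embeddings.csv")  # columns: URL, Embeddings
vectors = np.vstack(df["Embeddings"].map(parse_embedding).to_list())

# Pairwise cosine similarity between every page and every other page
sims = cosine_similarity(vectors)
np.fill_diagonal(sims, -1)  # ignore each page's similarity to itself

rows = []
for i, url in enumerate(df["URL"]):
    top_idx = np.argsort(sims[i])[::-1][:TOP_N]
    rows.append({
        "Page Source": url,
        "Related Pages": ", ".join(df["URL"].iloc[top_idx]),
    })

pd.DataFrame(rows).to_csv("related_pages.csv", index=False)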

As with anything AI related, you’ll still want to manually review everything before you make any drastic changes.


Common Issues

While this is an easy to use tool, some problems might come up. Here are the ones I’ve seen so far:

  • Rename the headers in your Screaming Frog export to “URL” and “Embeddings”
  • The CSV file has URLs without embeddings, such as crawled images or 404 pages, which don’t generate embeddings. Make sure every row has a valid URL and a visible embedding
  • The crawl speed is too high and you start getting errors from OpenAI. Decrease the crawl speed, grab a coffee and let it do its work
  • OpenAI has many models and some page crawls might fail due to the number of output tokens requested. Generate your API calls using gpt-4o mini (up to 16,384 tokens), which allows twice as many output tokens as gpt-4 (8,192). If some pages still fail, remove them from the crawl

The post Using the Screaming Frog SEO Spider and OpenAI Embeddings to Map Related Pages at Scale appeared first on Screaming Frog.
