The TL;DR is this:
Plan your URLs carefully, including where you expect them to be in the next 5+ years if at all possible. The goal is to future-proof your URLs so that redirects, 404s, and many other common issues simply aren’t problems in the years to come.
URL Guide and Rules
Whenever possible the following guidelines should be followed:
- Use dashes (-) and not underscores (_) to bridge one word to another
- Correct: /sun-round/
- Not Correct: /sun_round/
- Refrain from using stop words such as “and”, “but”, and “a” whenever possible.
- Avoid using years or dates in URLs; it is hard to refresh the page in later years without the URL looking dated.
- Optimal is 3-5 words, long is 6-8 words, extremely long URLs are more than 8 words.
- Use the Page Title as a guide and the overall topic of the page to help create the URL.
- Don’t repeat keywords or phrases if possible; repetition looks over-optimized and spammy, and makes the URL longer than it should be.
- Avoid capital letters and mixed-case URLs. Always use lowercase for URLs.
- Trailing slash vs. no trailing slash – pick one and make sure it works with your CMS.
- Avoid # in URLs, as Google ignores everything after the #.
- Organize, prioritize and create standards when you do use URL parameters/variables.
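The guidelines above can be sketched as a small slug builder. This is only an illustration in Python: the stop-word list, the five-word cap, and the trailing-slash choice are assumptions made for the example, not rules from the episode.

```python
import re

# Hypothetical stop-word list for illustration; extend it to suit your content.
STOP_WORDS = {"a", "an", "and", "but", "or", "the", "of", "to", "is"}


def make_slug(title, max_words=5):
    """Turn a page title into a short, lowercase, dash-separated URL slug."""
    # Lowercase first, then treat anything that is not a letter or digit
    # (spaces, underscores, punctuation) as a word separator.
    words = re.split(r"[^a-z0-9]+", title.lower())
    kept = []
    for word in words:
        if not word or word in STOP_WORDS:
            continue  # drop empty fragments and stop words
        if word in kept:
            continue  # do not repeat a keyword
        kept.append(word)
    # Assumption: this site uses trailing slashes everywhere.
    return "/" + "-".join(kept[:max_words]) + "/"


print(make_slug("The Sun_Round and the Sun"))
```

The page title is used as a guide only; the function keeps enough of the main words that a reader can still tell what the page is about.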
Arguments For Keeping URLs As Is
In meetings, SEOs often have to fight with various teams over WHY URLs should not change, or what they should change to.
For those times, here is some ammo to take into those meetings, but do listen to the full podcast to get even more ways to combat requests to change URLs.
- Analytics & Reporting – week-over-week (WoW) or month-over-month reporting becomes very complicated.
- Mixed-case URLs complicate reporting.
- Variables in URLs complicate reporting or create duplicate pages.
- Slash vs. no slash can again muddy reporting.
- Redirect Chains – if you have done 2-4 site redesigns over 10+ years and have changed the URLs each time, you likely have many redirect chains.
- Domain Change – one of the few times when Rob is for URLs changing.
- Platform Change – if you are changing from ASP to PHP based platform this is one of the other rare times where you might need to change URLs.
Tools for Managing URLs
While every project and issue is different, Dave asked Rob which tools he goes to most often:
- Google Search Console – it gives so much data and insight that you can’t get anywhere else.
- Screaming Frog (or crawler of choice) – he uses it to look at the export of all the internal links for a site.
- Screaming Frog Log File Analysis – to get a view of what is going on with crawlers.
- Link Tools (Semrush, Ahrefs, Majestic, etc.) – he uses them to find external links that may be pointing to odd URLs on a site.
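As a rough idea of what log file analysis surfaces, here is a minimal Python sketch that counts which paths a crawler requests most. It assumes the common "combined" access log format and made-up sample lines; matching on the user-agent string alone is a simplification, since anyone can claim to be Googlebot.

```python
import re
from collections import Counter

# Pulls the request path and the user agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [01/May/2020:10:00:00 +0000] "GET /cart/add?sku=1&color=black HTTP/1.1" 200 512 "-" "%s"' % GOOGLEBOT_UA,
    '66.249.66.1 - - [01/May/2020:10:00:01 +0000] "GET /cart/add?sku=1&color=white HTTP/1.1" 200 512 "-" "%s"' % GOOGLEBOT_UA,
    '66.249.66.1 - - [01/May/2020:10:00:02 +0000] "GET /windows/white-windows/ HTTP/1.1" 200 2048 "-" "%s"' % GOOGLEBOT_UA,
    '203.0.113.9 - - [01/May/2020:10:00:03 +0000] "GET /windows/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]


def googlebot_paths(lines):
    """Count requests per path (query string removed) made by Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            path = match.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts


for path, hits in googlebot_paths(SAMPLE_LOG).most_common():
    print(hits, path)
```

Grouping by path without the query string makes a pattern such as thousands of crawled cart URLs stand out quickly.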
Matt Siltala: [00:00:00] Welcome to another exciting episode of the Business of Digital podcast, featuring your hosts, Matt Siltala and Dave Rohrer. Hey guys, excited to have you join us on another one of these Business of Digital podcasts. Uh, grateful for, um, you know, grateful for that as always, that you give us the time of day, and today we have a really good one for ya.
Well, at least I’m excited for this one ’cause, um, I know that we’ve got Dave there, but also the surprise guest that we have for you is a good friend of mine, Rob Woods. Hey Rob, how’s it going, man?
Rob Woods: [00:00:33] It’s good. Good man. Thanks for having me.
Matt Siltala: [00:00:35] So, just want to give you guys a little bit of background. So Rob and I, he’s probably one of the oldest industry, if you call it, friends that I have, um, way, way back in the day, one of my, one of the first clients.
And, and, uh, anyway, Rob and I have been working together in some form or another. He’s even come out to Avalanche and done what we call an Avalanche day at times, where he [00:01:00] does training for the teams and, and, uh, anyway, um, Rob, I’m gonna let you go ahead and, uh, kind of introduce where you’re at now and what you consider, uh, your, your title to be and all that good stuff.
But, uh. I do. Again, I do appreciate you joining us and I’m excited to chat, uh, URLs with you today, so go for it.
Rob Woods: [00:01:19] Yeah, you bet. Yeah. Thanks again. Yeah, yeah. Like you said, I think, I don’t know how long, you know, it’s 13, 14 years. I was trying to figure it out the other day. Something like that. It’s a long time anyway. So, yeah, I mean, I’m an independent SEO consultant, um, been doing that for about seven years.
Uh, I was in-house at a couple of different places, uh, before that for, uh, about 12 or 13. So I’ve been in the industry since, uh, since 2000. Um, especially the last, uh, 10, 12 years, mostly, uh, or almost exclusively focused on, on SEO. Um, kind of the technical SEO is really my wheelhouse. Audits, you know, big technical problems, or, you know, we’re talking about URLs today, that kind of [00:02:00] fits right in there.
Uh, content stuff, that kind of thing as well. Uh, you know, content strategy. But, uh, yeah, that’s kinda my, uh, my, uh, my wheelhouse. So all kinds of clients from the smallest to the, uh, to the biggest. So yeah, everything in between. Awesome.
Matt Siltala: [00:02:15] Well, again, grateful for, for you joining us today and we are going to, I’m going to just kind of.
If you will virtually hand it over to Dave and let him get started with this URL side of it. But I guess, um, you guys have, you know, what we were chatting with before you guys, uh, or before we started recording, you know, you’ve seen a lot of wonky, crazy stuff going on. Uh,
Dave Rohrer: [00:02:39] Oh, we always do. We likely see that
Matt Siltala: [00:02:41] stuff, but something I was, and this, this is kind of unrelated, but it’s related to SEL, but something that I saw someone post this morning, they were talking about how.
Cause I guess a Hertz rent a car is getting ready to do bankruptcy or something, but some
Dave Rohrer: [00:02:57] are selling all the Corvettes.
Matt Siltala: [00:02:58] Yeah, I saw that. But some of
Dave Rohrer: [00:03:00] those are overpriced. Those are, that’s got too many miles. Those are beat up. I wouldn’t,
Rob Woods: [00:03:05] those are hard miles.
Matt Siltala: [00:03:06] Those are, those are very hard. But, but anyway, my point was, it was interesting because I guess they paid like seven something million for their website.
And if you do a bot run with how, like, Google sees it, it’s basically a blank, a blank page. And you guys probably see stuff like that all the time. It’s just amazing to me that someone would build a $7 million website and not even care that Google could actually read it or crawl it or see it, but I will stop talking.
I will let you take it over. That was just an observation and a, all right. There we go.
Dave Rohrer: [00:03:39] Is in every meeting pointing that out.
Matt Siltala: [00:03:41] That’s a good point.
Rob Woods: [00:03:42] And then got overruled
Dave Rohrer: [00:03:43] and got overruled time and time again when, when it would be brought up that the application would then take an extra three months and that it would take another million dollars and they probably pointed out solutions and ways around it not to get off the topic.
Um, but it’s [00:04:00] kind of a nice segue. There’s always a problem when it comes to SEO. There’s always a solution. The problem often is that there’s a business decision or time or dev or someone overrules it. Um, which is a good lead-in because often you and I talk about things that we’re seeing, or like, oh my God, you won’t believe what happened today.
Um, especially like, like nine o’clock on a Friday, or, you know, I just happened to check my email on Sunday, you know, yesterday at two in the afternoon, and my client had emailed me, freaking out because the dev did something Saturday night and no one knew. Um, SEOs spend most of their time running around either trying to fix things that they didn’t know about,
um, someone else shot the site in the foot, or they’re just overruled. And I think URLs are one of those things. What’s the most, you don’t have to name them. I don’t want to say the most recent ’cause I [00:05:00] don’t want you to have to give away anything. Rob, what is an instance in the recent year or two where you saw a client completely shoot themselves in the foot, big time, with URLs?
Rob Woods: [00:05:15] Oh, you have so many examples. There’s so, yeah, there’s so many to choose from. Uh, I mean, some of it is, even when they don’t, um, they don’t even know what is going on or that these URLs even exist. I mean, I think you had one, uh, um, even that, I, that, uh, that we were chatting about this morning, right, right.
You know, there’s a, there’s a thousand URLs, variations of one URL, and you don’t, and you don’t even know where they’re coming from. I had one like that with a client where I was doing an audit, and Google’s crawling about 200,000 URLs a day and the site only has 2,000 SKUs. So when you take the 2,000 SKUs,
okay, that’s interesting. And you know, maybe they have a hundred or 50 or whatever kind of category pages, [00:06:00] this category of products, that category of products, I mean, they really shouldn’t be crawling 200,000 URLs per day. Um, and we could not find the URLs on the site. I couldn’t find them, uh, in, uh, in Google Search Console.
I couldn’t find them by browsing the site. Um, and so that’s when I had to default to looking at, say, the log files, right? Which is, is, uh, really kind of the best way to really see every single URL that Google or, or, or any of the other bots might be seeing on your, on your site. And it was some bizarre URLs that were being created by the add-to-cart button,
um, with like weird, like tildes and asterisks and weird stuff in the URLs. Um, they had no idea that, that Google was able to crawl these URLs. So, like, every variation of every product, every size, every color, uh, was creating these unique cart URLs. And so we were able to go in and say, look, stop crawling
those [00:07:00] URLs. Don’t spend huge resources crawling URLs that basically have no value to the search engines, and, uh, um, focus, you know, your crawling efforts on the ones that we do care about. It made the crawl so much more efficient. Um, that’s kind of one I can think of recently. Another one is, uh, um, a client changed kind of how their, their site
was coded, and some of the old pages stayed, and all of those old URLs ended in .htm. All of the new ones that are generated end with just a slash, so now whenever they try and do something with the URLs in the code, they have to take both of those into account. So they’re like, well, we’re just going to change all URLs with .htm to a, to a slash.
Well, you can’t just change that. It’s always, change them. No, no. If you’re going to change them, you’ve got to redirect them all. [00:08:00] You’ve got to take any URLs that currently redirect to the URLs that you’re changing and update them to point to the final, you know, correct URL.
Um, and so that’s, that’s kind of one of my biggest problems, is people, just developers, wanting to change URLs just for the heck of it. Um, or, or with URLs that just aren’t, don’t lend themselves to solutions. So for me, URLs can’t really help your SEO, but they can sure hurt it. Right? Um, you know, stuffing your, your URLs full of keywords and all that kind of stuff really isn’t gonna help your rankings.
It’s too easy to spam, right? So Google’s not going to give it a lot of value ’cause it’s just, it’s too easy to stuff keywords in there that don’t provide any value to users. So obviously it’s not something that’s weighted very heavily in the algo, but you can sure mess up your URLs or make your life a whole lot more difficult if you [00:09:00] don’t
do them right and really plan them right, right from the get-go, right? Think about how your entire site is going to be structured, you know, for the next five years, basically, and work that into your URL structure, your, your plan: how they’re going to be formatted, what subfolders are you going to have, all of those kinds of things before you, before you, uh, before you even start implementing anything, right?
Dave Rohrer: [00:09:27] I have a question for you. For me, it’s kind of like the, what keyword density should I have? But as I look at our little internal thing, one of the things I tend to see a lot of people do is whatever the page title is, is what the URL is. And so like right now, I’ve had a special kind of week, so it’s “What Day Is It? Talking
URLs with Rob” is the working title for this episode, which is not going to be what it is, but it’s been a fun week. Nine times out of 10 someone [00:10:00] will have an infographic or white paper about why, you know, the sun is round and dah, dah, dah, and it’s like 30 words long. If there is an optimal, what would you say you try to tell people to look at or try to condense a URL down to, just for sanity’s sake?
But also there is a limit for the URL that Google and browsers will even use.
Matt Siltala: [00:10:27] Yeah.
Rob Woods: [00:10:28] That’s, I think that’s a pretty large limit, it’s thousands of
Dave Rohrer: [00:10:33] sites have problems
Rob Woods: [00:10:34] with that though. Yeah. Yeah. I’ve seen some very, very long wins, especially when the sessions are all amateurs. Or there’ll be, you know, each variable is 30 characters, right?
And they have 10 different variables that are 30 characters each, uh, that, that just keep adding and adding on and on and on and on. Um, I mean, I don’t know there’s an optimal length. I would just say shorter is better. Uh, you know, I try and keep it, you know, [00:11:00] probably, you know, more than five words probably, um, enough that someone looking at the URL, uh, it helps reinforce what the page is about.
Right. So you, you were L’s do show up sometimes in the SERPs, depending on how Google is deciding to change the search engine results today and whether they show them or not. Um, but, so let’s, you know, I, there’s another signal that a. The page is the right page to click on because Hey, look in the URL and in the title and the description, it kind of all says, it looks like this is a good answer for whatever I searched for.
If someone lands on the page and they glance at the URL, again, it’s maybe a quick signal that, Hey, I’m in the right place. So I think keeping them short and meaningful, uh, is good for users. It’s maybe good for click through rate a little bit. Um, but that’s mostly why. So, you know, take out all the stop words that if a and, but you know, all of that kind of stuff.
Um, but [00:12:00] just leave enough of the main words that someone looking at it still gets a sense of what the page is. Yeah.
Dave Rohrer: [00:12:10] The number of times I’ve seen where they’ve keyword stuffed it and it’s, I don’t know, I’m looking at windows. So if you’re in the category windows and then it’s a subcategory, you know, white windows, and they think the URL has to be, because we need it in the, we need it in the URL,
We have to say windows again. So it’s like you’ve already said it in the category. Someone landing here already sees pictures. There’s already a page title. You know, you don’t need to have eight places where you put the word windows in the URL.
Rob Woods: [00:12:38] Exactly. And, and, and that kind of leads into, another thing I was thinking about is, is really think about having those categories like windows in your URL.
Because it really helps with, I find for like reporting, you know, grouping like URLs together, you know, whether it’s for in analytics, like, Hey, I want to know what the [00:13:00] windows category is doing for traffic. It’s a heck of a lot easier to just look for anything with slash windows slash in the URL than to try and figure out every single URL that happens to be one of your windows products.
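The reporting point can be sketched in a few lines of Python: when the category is the first folder of the URL, rolling traffic up by category is a simple grouping. The URLs and session counts here are invented for the example.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_level_folder(url):
    """Return the first path segment of a URL, or "(root)" if there is none."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if segments else "(root)"


def sessions_by_folder(rows):
    """Sum sessions per top-level folder from (url, sessions) pairs."""
    totals = Counter()
    for url, sessions in rows:
        totals[top_level_folder(url)] += sessions
    return totals


# Invented analytics export rows: (landing page URL, sessions).
ROWS = [
    ("https://example.com/windows/white-windows/", 120),
    ("https://example.com/windows/bay-windows/", 80),
    ("https://example.com/doors/", 50),
]

print(sessions_by_folder(ROWS))
```

With a flat structure such as `/white-windows/`, the same grouping would return one bucket per page, which is why a hand-maintained list or a very large regular expression becomes necessary.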
Dave Rohrer: [00:13:12] That’s a good segue into reasons and ammo that SEOs can have beyond, it’s, it’s a, you know, a developer. Every time it’s like, well, we’re going to have to change the URL. Well, I can put in a 301. Look, we have a plugin. Look, I already, I already added it to the .htaccess file. It’s done. It’s not a problem.
Rob Woods: [00:13:32] You know, it’s
Dave Rohrer: [00:13:32] a really simple solution. Um, or I’ll create some red jacks and, Oh, look, we just fixed it. I don’t think they developers think about, sorry, developers. I’m picking on you today. Um,
Rob Woods: [00:13:45] that’s what we do.
Dave Rohrer: [00:13:46] Oh yeah, yeah. They pick on us too. In this case though, they’re wrong. And there’s so many reasons why and beyond that.
So from link tracking, but [00:14:00] also what you’re just saying, reporting. How difficult is it, do you think, Rob, and Matt, you can even chime in on this one, to do year-over-year reporting when every single URL across the million-page site changes?
Rob Woods: [00:14:12] Well, it’s not, not only, you know, year over year. So year over year is an issue.
I just had that with a client, right? It’s, um, we want to change all of these URLs. Well, that’s going to make year-over-year or even month-over-month tracking really difficult because you can’t, you don’t have the same URL to compare this period to that period. Right. Um, you know, as well, for reporting, a lot of sites go to a really flat structure, you know, domain slash URL slug.
That’s it. And that’s a little bit what I have with, with a client right now. And, and the analytics guy was talking to me saying, like, this is a nightmare. They want to know what all, what the traffic is doing or what the conversions are in this category. And we don’t have categories in our URLs. So every [00:15:00] time I have to run this report, I have to, you know, generate whatever this giant regex to say, I want this, these 500 URLs that actually make up all the things that are in that category.
But there’s no simple way to just, just categorize it. Um, so yeah, changing URLs or not having, uh, you know, parts of the URL that you can filter on is a huge challenge. That’s why I say, you know, URLs can’t really help you, but they can sure hurt you if you don’t plan them out.
Right. And again, like you say, changing them and changing them as just it, it can become a nightmare as they get changed and changed and changed and changed. And you and I probably both, you know, gone through sites that have been redesigned, you know, five times and the URL is changed every single time.
Well, a lot of those, it’s, you know, URL one redirects to URL two, which redirects to URL three, which goes to four, to five, to whatever. I’ve had, you know, I’ve seen six or [00:16:00] seven hops just based on, we just changed the site and redirected them, or even we changed the site, didn’t redirect them. Uh, uh, so yeah, changing URLs is just a, it’s a huge headache and you should avoid it,
I think at all costs, unless there is a really compelling reason to change them.
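The redirect chains described here can be collapsed so that every old URL points straight at its final destination. A minimal Python sketch, assuming the redirects are available as a simple old-to-new mapping (the paths below are invented):

```python
def flatten_redirects(redirects):
    """Return a new mapping where every source points at its final target."""
    flattened = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until we reach a URL that no longer redirects.
        while target in redirects:
            if target in seen:
                raise ValueError("redirect loop involving " + target)
            seen.add(target)
            target = redirects[target]
        flattened[source] = target
    return flattened


# Invented example: three redesigns, each one redirecting the previous URL.
CHAIN = {
    "/products.htm": "/products.aspx",
    "/products.aspx": "/shop/products",
    "/shop/products": "/products/",
}

for old, new in flatten_redirects(CHAIN).items():
    print(old, "->", new)
```

Each old URL then resolves in a single hop, and a loop in the existing rules is reported instead of silently followed.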
Matt Siltala: [00:16:20] Well, what would be one of those really compelling reasons? I’m just curious.
Rob Woods: [00:16:25] Um, I mean, obviously a change of domain. I mean, if you’re changing domains, then your URL is gonna change ’cause it’s just a new domain and that’s part of the URL and that’s going to change. Um, changing platforms sometimes.
Like say you’re on a site that has all .aspx URLs, right? And you’re leaving the ASP platform and going to, you know, something else, your URLs are going to change. In that case, I might change them. Gotcha.
Matt Siltala: [00:16:52] Well, I, I have one that would probably drive, drive you guys nuts. Uh, it was someone I recently talked to, [00:17:00] and, uh, they have a 20-year-old domain.
Okay. And some. Technical guy that they have in there that’s, you know, a computer person,
Rob Woods: [00:17:08] quote unquote,
Matt Siltala: [00:17:09] but decided to, uh, you know, talk to them about, Hey, we need to just get a new domain name. And ended up, they got a new domain name and they stuffed some keywords in there, speaking of, you know, doing that.
And, uh, so then these guys started losing like a lot of their traffic as, as you would understand. And they came to me and was like, what happened here? And I’m like, okay. Why would you give up a 20 year old domain? Just make some changes, make some improvements to it. Here’s what we could do versus going to this new domain name and here get this guys.
They did it. They built it on a platform where there are no URLs. It’s one of the, what are they called? Those, uh. I can’t remember the word
Dave Rohrer: [00:17:53] for page.
Matt Siltala: [00:17:54] Yeah. But basics. And so like a lot of times like you get stuck in a [00:18:00] nightmares like that. And, and, uh, the hardest thing is, and, and, uh, again, I’d love to hear your guys’ feedback, how you, how you share this with a client.
But you know, like, so let’s say they just spent two or three grand on a website that you have to tell them, look, I’m sorry, you just wasted money. And you know, that’s essentially what I had to tell these guys. But. Anyway, one of those interesting deals. And so, I dunno if you have any other specific, uh, questions for Rob, uh, about that or any thoughts?
Uh, Dave, but
Dave Rohrer: [00:18:30] yeah, we see that kind of stuff all the time. The, um, the crazy crawling, like there was literally a weird thing that I saw today over the last, um, the site has 500, 600, like, blog posts; categories, pagination-type things, maybe 20 or 30, maybe up to 50, including about pages. I mean, there’s not even 600, 700 pages at the very most.
Um, most of the pages are redirected. There’s not a lot of 404s [00:19:00] and other stuff, but over the last month, there’s been three different jumps of weird stuff going on in Search Console. And I noticed it a couple of weeks ago and it was like, well, that was odd, but it wasn’t like a big enough jump.
And then I just looked this morning and they’ve had now from about 400 pages with weird canonical issues to 3000 almost 4,000 and of the thousand that I could export out of Google search console, which Google, that’s not really helping me. When you tell me I have 5,000 or 4,000 pages and I can only see the first thousand by the way, I’m like.
90 95% of them are all this one URL. And I don’t know why. There’s nothing that links to it. Um, you know, someone said, suggested, I don’t know if it was Rob or someone may have been Jesse. Um, someone had told me, he was like, you know, what about redirects? And I was like, Oh, okay. Well, went back, looked at one, all the links to the page.
Um, you know, so here it [00:20:00] is, one URL. Nothing. Wrong with it. And yet I’m running around because Google with weird parameters is just blowing this page up.
Rob Woods: [00:20:10] Yeah. And that’s part of where when you get those weird URLs, it’s just trying to figure out where they’re coming from. Right? Yeah. It could be internal links.
It could be external links. Like mine, it was the shopping cart technology on that one that I had. Like, who knew? You could not see these links anywhere, uh, except they were being generated kind of on the fly, uh, um, and Google was finding them. And that’s, again, that’s a lot of times where you have to go to
the log file. It’s tough to get log files from a lot of clients. I find it’s, it’s really tough to, A, get them, period, or anything else, to get them in the right format. It’s impossible. Yeah. Yeah. It’s, it’s really like, and a lot of them don’t even know how to get them, right? Especially smaller clients, you know, they don’t know, you know, a log file from, from, from whatever.
Right. So, um, but yeah, a lot of times that’s where you have to, and, and, and uh, I mean, [00:21:00] yeah, it, it, it’s even beyond just all the changing them and everything. I think a lot of people don’t understand even what the basic elements of a URL are and what a different URL is. I run into so many duplicate
Dave Rohrer: [00:21:13] content.
Rob Woods: [00:21:13] Yeah. Mixed case, yeah. Uh, you know, trailing slash versus non, non-trailing slash, other than for your main domain, that’s, that’s duplicate. If it has the trailing slash or it doesn’t have a trailing slash, those are two different URLs, right? If it has mixed case, and mixed case can be a real pain because some platforms allow both to resolve, right?
You can add a capital H or a lowercase h and they both work, and some even allow it as a content creator. So hey, I’m uploading a blog post, I’m going to make the URL slug all title case, where the first letter of every word is capitalized, or another one might do all lowercase. So unless you record what is the originally created URL and somehow make that the canonical, it can be really difficult to figure out which version,
the one with [00:22:00] uppercase characters or the one with all lowercase, even should be canonical. That’s why I like to, go ahead.
Dave Rohrer: [00:22:05] I was going to say wait till you have to try to report on that and you have examples like that, which then translates into a hundred.
Rob Woods: [00:22:11] Yeah, so I really like to, if you can, out of the gate force all lowercase.
Yes. And then it’s super easy. Like in the .htaccess file, it’s like three or four lines of code that just says, if you encounter an uppercase letter, you know, redirect that URL to the version with the lowercase. Like, it’s super, super easy. It’s, it’s, you’d think it’d be complicated that, hey, if you find an H, you have to redirect it to a lowercase h and if you have, you know, whatever.
It’s actually really easy to just force all lowercase. So unless there’s an absolute reason we, I mean, we often have to deal with it with legacy cases where it’s just their mixed case and that’s just the way it is. I always go all lowercase, if you can, if you can think of it. But w between variables and, uh, any, any single character being [00:23:00] different.
Um, you get a lot of duplicate content, or what I see a lot of times is people don’t understand that Google ignores anything after a hashtag, a hash in the URL. And so you’ll get that, that changes the content quite a bit, uh, or completely changes the page right after the pound sign. And so it’s like the opposite of duplicate content.
Which version of the content are you showing to Google for that same URL if you’re using the hash to really, um, um, yup, to really change the content? Here’s a fun one. Google sees it as the same URL, so it’s different content, same URL, rather than, you know, duplicate content, which is, you know, same content, different URL.
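The "any single character being different" problem can be illustrated with a small normalizer. This Python sketch lowercases the host and path, drops the fragment, and enforces one trailing-slash policy; the trailing-slash rule is an assumption for the example, and on a live site the same logic would normally be enforced with server-side 301 redirects rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase host and path, drop the #fragment, enforce a trailing slash."""
    parts = urlsplit(url)
    path = parts.path.lower() or "/"
    # Assumption: everything except files with an extension (for example
    # /guide.pdf) should end with a trailing slash on this site.
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    # The query string is left untouched because parameter values may be
    # case-sensitive; the fragment is dropped because crawlers ignore it.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


variants = [
    "https://Example.com/Sun-Round",
    "https://example.com/sun-round/",
    "https://example.com/sun-round#reviews",
]
print({normalize_url(u) for u in variants})
```

All three variants collapse to one URL, which is the version that should be linked internally and declared canonical.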
Dave Rohrer: [00:23:44] A fun one is, is, oh, many years ago now, I was helping an ecommerce site, which I’m sure most people that deal with ecommerce sites see this to some extent, but there was, um, I don’t even know. We’ll just use, um, like jean [00:24:00] shorts ’cause, you know, hopefully it’s going to be warm here eventually.
Um, depending on which one you selected first. If you selected shorts or you selected men’s, the URL would change based on what you selected first, second, third, fourth. So if you picked, I want men’s clothing, shorts, size, color, that was one URL. And then if I selected shorts, color, size, men’s, like, and it would go on and on and on.
There was no, there was never a first, and there was never a last, there was no order. It was just the URL was always just built based on what you selected last.
Rob Woods: [00:24:41] Yeah.
Dave Rohrer: [00:24:41] Couldn’t figure out why they weren’t ranking for anything and why they had so many pages in Google search console and how many pages were being crawled every day.
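One way to avoid the click-order problem is to build the URL from a fixed facet order, no matter what the visitor selected first. A minimal Python sketch; the facet names and their order are hypothetical.

```python
# Hypothetical facet order for illustration; choose one order and keep it.
FACET_ORDER = ["gender", "category", "size", "color"]


def canonical_facet_path(selected):
    """Build one URL path for a set of filters, regardless of click order."""
    segments = [selected[name].lower() for name in FACET_ORDER if name in selected]
    if not segments:
        return "/"
    return "/" + "/".join(segments) + "/"


# The same filters chosen in a different order produce the same path.
first = canonical_facet_path({"gender": "mens", "category": "shorts", "color": "black"})
second = canonical_facet_path({"color": "Black", "category": "shorts", "gender": "mens"})
print(first, second)
```

With a fixed order, each combination of filters has exactly one URL instead of one per possible click sequence.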
Rob Woods: [00:24:51] Like, yeah, that’s a great point, right. That, that, um, that you, uh, um, I haven’t seen it in [00:25:00] a while, but yeah. What you’ve talked about, yeah, whether you pick, you know, uh, black, uh, um, you know, short pants or short pants in black. Yeah. The URLs are different, and, oh yeah, it becomes, it becomes difficult to canonicalize.
It becomes difficult to block the crawling, ’cause how do you block Google from crawling that in the robots.txt? There’s no unique part of the URL for each of those that you can say, block this one and not that one. You can’t block anything with shorts in it. You can’t block everything with black in it.
You can’t, you know, so you almost can’t control the crawl. In that case, you can control the indexing, maybe. You control the canonicalization, maybe, but you just can’t control the crawl. Right. There just is no way if there isn’t a unique part. And that’s, I mean, that’s why I think you’ve got to really plan them out.
You’ve got to really plan them out and say, look, I know which URLs I don’t want indexed or crawled. I need to build something unique into the, the URL of that, [00:26:00] those pages that is different from the other pages so I can block one set and not the other. It’s,
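To show why a unique URL part matters, here is a small Python check using the standard library's robots.txt parser. The `/cart/` prefix and the example URLs are assumptions; note that this parser only does simple prefix matching and does not implement Google's wildcard extensions.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: cart URLs all live under a unique /cart/ prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt rules allow the agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable("https://example.com/cart/add?sku=1&color=black"))
print(is_crawlable("https://example.com/mens/shorts/black/"))
```

Because every cart URL shares the `/cart/` prefix, a single rule blocks that whole set without touching product or category pages. Without such a shared, unique part there is nothing for a rule to match on.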
Dave Rohrer: [00:26:04] it’s such a simple thing that gets screwed up so often, and I think.
With Google also taking away the canonical as a directive. It’s now a hint. Yeah. And uh, and you and I have talked about recently where we saw examples where, you know, sure, that’s a hint. But also the robots.txt is being taken as a hint, and it depends, you know, on which way the wind is blowing.
Rob Woods: [00:26:27] You know, nofollow’s a hint.
Dave Rohrer: [00:26:29] every, everything’s a hint from Google, yet they wonder why, you know, people can’t control crazy sites. What tools, and I know it’s. You’re an SEO, we’re all SEO. So it always depends on the whatever the the issue is and what technology stack they’re on and you know, everything else. But what are a couple tools you tend to go back to or solutions to try to wrangle URLs after the fact?
Is there one or two tools, like whether it’s screaming [00:27:00] frog or you know, or. Or things on the in the code, what are like two or three things that you do?
Rob Woods: [00:27:05] Just really, I mean, Google search console, right? It’s, it’s going to give you a sample, but it’s really the one of the best tools to say, here’s what Google is seeing and how they’re understanding the URLs, whether they should or shouldn’t, right?
It’s, it’s, here’s the ones that they’re discovering and not. Uh, indexing or crawling and not indexing wa why are they doing that to one set of URLs and not to the other, or, or, or, or whatever it is. If you want to duplicate without a, uh, uh, a canonical, or they’re ignoring the canonical or, you know, whatever, whatever that is.
Certainly Google search console. I got, I’m in Google search console, you know, multiple times a day, every day for different clients. Uh, screaming frog, I think for sure, for, um, uh, for internal links. And I really like to, you know, I like to use that, um, to, to find, uh, you know, crawl as Google bot and see what internal links are being crawled, even look through the whole big export.
Uh, if you can get it [00:28:00] all. Of say the outlet links are all, you know, are all of the end links, uh, for, for a page, uh, or for a site. Uh, it can get, it can get a little bit. I think I exported one the other day. There was 750,000 lines and, and, and, uh, so if it was a huge site, you’d probably, you’d probably break Excel if you tried to try to run it.
But certainly screaming frog for internal links, like, what are we linking to internally that we shouldn’t be, Oh, we’ve got these weird URLs, or, Hey, here’s these URLs that. Um, you know, that the, the crawlers finding that I didn’t even know existed, um, for log file analysis, like screaming frog as well. Uh, just cause I’m used to it and it’s pretty easy to use screaming frog log file analysis and then, you know, again, for finding weird URLs or any or all of the big, uh, um.
Uh, kind of link, uh, data tools, you know, whether it’s SEM, rush, majestic H reps, link research tools. I probably use all of the above. Uh, again, to see it. Is it, [00:29:00] is it someone externally that’s linking to these URLs that’s creating this weird issue? For me, it’s nothing on my site. It’s just, you know, some scraper has scraped my site and truncated the URL.
And that’s causing weird URLs to show up in my reports that it’s not even, you know, it’s not, not even my doing right. It’s just, it’s just beyond my control unless I somehow get those links removed or whatever. So those are probably the main tools that I use to find weirdness in URLs.
Matt Siltala: [00:29:29] Very good. Well, um, Rob, we really appreciate you taking the time to chat about all this with us today.
Um, we’re, we’re just about at that point where we need to wrap things up, but do you have any. Any final thoughts or anything that came to your mind that you want to share before we, uh, before we sign off?
Rob Woods: [00:29:46] Uh, not really, other than just kind of the, the drum I’ve been beating a little bit through the whole thing, is just, I mean, if you have the opportunity, really, really carefully plan your URLs ahead of time for, you know, both [00:30:00] being friendly to users, you being able to report on, on results, and control, uh, the crawl or the, or the indexation.
Um. If, if you have a legacy system, there’s lots of tools to try and control them, but you’re far better to try and plan it all out ahead of time to avoid having to fix mistakes or, or make changes later.
Matt Siltala: [00:30:25] Very good. Very good. Well, we did it guys. We spent 30 minutes talking about URLs,
Rob Woods: [00:30:30] so there you go. I never thought it would have been possible
Dave Rohrer: [00:30:32] without even trying and really,
Matt Siltala: [00:30:34] well, again, Rob, I do appreciate you taking the time, uh,
out of your day and schedule and everything going on to, uh, to, to record with us. And so, um, everybody that, uh, is curious to learn more about Rob, we’ll have all his information on the write-up. Uh, you’ll be able to find out from him there or find out more about him there. And, uh, as always, just want to encourage anyone that’s listening, um, that likes this, uh, [00:31:00] podcast,
go to iTunes and give us a five-star review. You don’t have to read this. Just give us five stars if you like it. So, uh, for Rob Woods and Dave Rohrer, I’m Matt Siltala, and we appreciate you guys, uh, chatting with us, and have a good day. Bye bye.
Dave Rohrer: [00:31:14] Bye.