Jul - Meta - Two suggestions both regarding useragents
Jamie
140
Level: 24


Posts: 23/143
EXP: 69491
For next: 8634

Since: 06-03-14


Since last post: 8 days
Last activity: 8 days

Posted on 09-21-18 03:15:19 AM
This was discussed a while ago on IRC (I remember pinging XK and kak about it): could "Mobile" and "Phone" become catch-alls for triggering the mobile view? I think one or the other appears in most mobile device UAs (iOS and Windows Phone respectively).
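
For illustration, the catch-all idea might look something like the sketch below. The board's actual mobile detection is not shown in this thread, so the function name, the pattern, and the case-insensitive matching are all assumptions.

```python
import re

# Assumed catch-all tokens from the suggestion: "Mobile" and "Phone".
# Case-insensitive matching is an extra assumption, not part of the proposal.
MOBILE_UA_PATTERN = re.compile(r"Mobile|Phone", re.IGNORECASE)


def wants_mobile_view(user_agent):
    """Return True if the UA string contains one of the catch-all tokens."""
    return bool(MOBILE_UA_PATTERN.search(user_agent or ""))
```

One side effect worth keeping in mind: any UA that happens to contain either token gets the mobile view, whether or not the device is actually a phone.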

Also, wouldn't an effective way of blocking spiders be to ban any useragent matching "bot" or "spider" at the .htaccess level (or whatever this server is using)?
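
As a rough sketch of that second suggestion: the server setup isn't stated here, so this does the same check at the application level as a Python WSGI wrapper instead of in .htaccess. The names `is_blocked_user_agent` and `block_bots` are made up for the example, and it can only catch clients that identify themselves with one of those words.

```python
import re

# Assumed deny pattern from the suggestion: "bot" or "spider", in any case.
BLOCKED_UA_PATTERN = re.compile(r"bot|spider", re.IGNORECASE)


def is_blocked_user_agent(user_agent):
    """Return True if the UA string contains "bot" or "spider"."""
    return bool(BLOCKED_UA_PATTERN.search(user_agent or ""))


def block_bots(app):
    """Wrap a WSGI app so matching user agents get a 403 instead of the page."""
    def middleware(environ, start_response):
        if is_blocked_user_agent(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```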

Xkeeper

Level: 251


Posts: 23358/24710
EXP: 251094235
For next: 2945641

Since: 07-03-07

Pronouns: they/them, she/her, etc.

Since last post: 1 day
Last activity: 4 hours

Posted on 09-21-18 10:19:51 AM
In an unamusing ironic twist, bots have to be able to crawl pages to know that they aren't allowed to crawl pages.

I wish I was fucking kidding.
Kak
heh
Level: 73


Posts: 1808/1817
EXP: 3397935
For next: 87933

Since: 09-03-13

From: ???

Since last post: 13 days
Last activity: 5 hours

Posted on 09-21-18 10:48:05 AM
Bots also won't necessarily report themselves as such.

(Though well-known spiders like Googlebot or Bingbot do, soooo....)
StapleButter
Member
Level: 43


Posts: 491/507
EXP: 525047
For next: 39999

Since: 02-24-13

From: your dreams

Since last post: 14 hours
Last activity: 14 hours

Posted on 09-21-18 10:56:59 AM
although completely denying them access yields the same result, i.e. they aren't able to crawl and index shit
Rena

Star Mario
Fennel
Level: 129


Posts: 5256/5258
EXP: 24589788
For next: 459866

Since: 07-22-07

Pronouns: he/him/whatever
From: RSP Segment 6

Since last post: 48 days
Last activity: 11 days

Posted on 09-21-18 01:45:11 PM
Well, good bots will obey robots.txt and have a proper user agent string, and bad bots... are bad.
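
To show what "obeying robots.txt" amounts to on the crawler's side, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `may_fetch` helper, and the example.com URLs in the usage are hypothetical, not taken from this board.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay out entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a well-behaved crawler would consider the URL allowed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A polite crawler runs a check like this before each request. Nothing forces it to, which is the whole difference between the good bots and the bad ones.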

Anyway it occurs to me that there doesn't appear to be a way to see a user's title (or ban message) on mobile.
Xkeeper

Level: 251


Posts: 23358/24710
EXP: 251094235
For next: 2945641

Since: 07-03-07

Pronouns: they/them, she/her, etc.

Since last post: 1 day
Last activity: 4 hours

Posted on 09-21-18 03:48:50 PM
Originally posted by StapleButter
although completely denying them access yields the same result, ie they aren't able to crawl and index shit

In the case of Google and friends, they still have earlier non-blocked versions of your website and will continue to use those in search results, even if they are blocked in the future.

The eventual plan is to outright deny everything, but due to this bullshit they need to crawl to see the "no index, no cache, no follow" tags.
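
For reference, a sketch of what such tags could look like when generated in Python. The thread doesn't show the board's actual markup, so mapping "no cache" to the standard `noarchive` directive is an assumption, as are the helper names. The same directives can also be sent as an `X-Robots-Tag` response header.

```python
# Standard robots directives assumed to correspond to
# "no index, no cache, no follow".
ROBOTS_DIRECTIVES = ("noindex", "noarchive", "nofollow")


def robots_meta_tag(directives=ROBOTS_DIRECTIVES):
    """Build the <meta> element to place in a page's <head>."""
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


def robots_header(directives=ROBOTS_DIRECTIVES):
    """Build the equivalent HTTP response header as a (name, value) pair."""
    return ("X-Robots-Tag", ", ".join(directives))
```

Either form only works if the crawler is allowed to fetch the page and read it, which is why blocking outright has to wait.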
Rena
Posted on 09-21-18 04:43:53 PM
Wait, why do we want to block Google?
Xkeeper
Posted on 09-21-18 06:01:26 PM
Thankfully I can just direct you to the answer
Jamie
Posted on 09-24-18 09:23:19 AM
Originally posted by Xkeeper
Originally posted by StapleButter
although completely denying them access yields the same result, ie they aren't able to crawl and index shit

In the case of Google and friends, they still have earlier non-blocked versions of your website and will continue to use that in search results, even if they are blocked in the future.

The eventual plan is to outright deny everything, but due to this bullshit they need to crawl to see the "no index, no cache, no follow" tags.

I thought they eventually just disappeared after a while, but you're more than likely right on this.

Also, a lot of abuse bots (i.e. ones that ignore robots.txt) still have "bot" in their useragent, AhrefsBot and Semrush being two examples.
Xkeeper
Posted on 09-24-18 03:59:27 PM
Eventually I plan on updating it to outright block bots, but for right now you could say this is something of a transitional period.