Last active 1 month ago

Erreur32 revised this gist 6 months ago.

No changes

George Boobyer revised this gist 11 years ago.

1 file changed, 1 insertion, 1 deletion

Apache Better Blocking with common rules.txt

@@ -1,2 +1,2 @@
1 1 Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bad Bots, creepy crawlers and irritating IPs
2 - see the full post at http://www.blue-bag.com/
2 + see the full post at http://www.blue-bag.com/blog/apache-better-blocking-common-rules

George Boobyer revised this gist 11 years ago.

1 file changed, 2 insertions, 2 deletions

Apache Better Blocking with common rules.txt

@@ -1,2 +1,2 @@
1 - Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bat Bots, creepy crawlers and irritating IPs
2 - see the full post at http://technology.blue-bag.com
1 + Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bad Bots, creepy crawlers and irritating IPs
2 + see the full post at http://www.blue-bag.com/

George Boobyer revised this gist 12 years ago.

1 file changed, 0 insertions, 0 deletions

Apache-Better-Blocking with common rules.txt renamed to Apache Better Blocking with common rules.txt

File renamed without changes

George Boobyer revised this gist 12 years ago.

1 file changed, 0 insertions, 0 deletions

__readme.txt renamed to Apache-Better-Blocking with common rules.txt

File renamed without changes

George Boobyer revised this gist 12 years ago.

1 file changed, 0 insertions, 0 deletions

readme.txt renamed to __readme.txt

File renamed without changes

George Boobyer revised this gist 12 years ago.

4 files changed, 144 insertions

blocked-adresses.conf (file created)

@@ -0,0 +1,21 @@
1 + ## A list of known problem IPs
2 +
3 + # pen test on FCKeditor
4 + SetEnvIfNoCase REMOTE_ADDR "^175\.44\.30\.180$" BlockedAddress
5 + SetEnvIfNoCase REMOTE_ADDR "^175\.44\.29\.92$" BlockedAddress
6 + SetEnvIfNoCase REMOTE_ADDR "^175\.44\.30\.180$" BlockedAddress
7 + SetEnvIfNoCase REMOTE_ADDR "^174\.139\.240\.74$" BlockedAddress
8 +
9 +
10 + # looking for backups
11 + SetEnvIfNoCase REMOTE_ADDR "^192\.99\.12\.128$" BlockedAddress
12 +
13 + # Bad Crawler
14 + SetEnvIfNoCase REMOTE_ADDR "^144\.76\.195\.72$" BlockedAddress
15 + SetEnvIfNoCase REMOTE_ADDR "^54\.189\.47\.213$" BlockedAddress
16 +
17 + # Java scraper
18 + SetEnvIfNoCase REMOTE_ADDR "^62\.116\.110\.111$" BlockedAddress
19 +
20 + # Big hitter - known spammer
21 + SetEnvIfNoCase REMOTE_ADDR "^109\.201\.137\.166$" BlockedAddress
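
When several offenders share a network, the same SetEnvIf mechanism can flag a whole range with one pattern. The fragment below is only a sketch: the two /24 ranges are a hypothetical example (the gist itself lists single addresses), and the pattern is anchored with ^ and $ so that it cannot match inside a longer address string.

```apache
# Hypothetical example: flag every address in 175.44.29.0/24 and 175.44.30.0/24.
# ^ and $ anchor the match to the whole REMOTE_ADDR value.
SetEnvIf Remote_Addr "^175\.44\.(29|30)\.[0-9]+$" BlockedAddress
```

Blocking a whole range is a blunter tool than the per-address entries, so it is worth checking the access log for legitimate visitors from that range first.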

blocked-agents.conf (file created)

@@ -0,0 +1,37 @@
1 + # list obtained from 3rd party
2 + # flag blank user-agents
3 + SetEnvIfNoCase User-Agent ^$ BlockedAgent
4 + SetEnvIfNoCase User-Agent "Jakarta" BlockedAgent
5 + SetEnvIfNoCase User-Agent "User-Agent" BlockedAgent
6 + SetEnvIfNoCase User-Agent "libwww," BlockedAgent
7 + SetEnvIfNoCase User-Agent "lwp-trivial" BlockedAgent
8 + SetEnvIfNoCase User-Agent "Snoopy" BlockedAgent
9 + SetEnvIfNoCase User-Agent "PHPCrawl" BlockedAgent
10 + SetEnvIfNoCase User-Agent "WEP Search" BlockedAgent
11 + SetEnvIfNoCase User-Agent "Missigua Locator" BlockedAgent
12 + SetEnvIfNoCase User-Agent "ISC Systems iRc" BlockedAgent
13 + SetEnvIfNoCase User-Agent "lwp-trivial" BlockedAgent
14 +
15 + SetEnvIfNoCase User-Agent "GbPlugin" BlockedAgent
16 + SetEnvIfNoCase User-Agent "Wget" BlockedAgent
17 + SetEnvIfNoCase User-Agent "EmailSiphon" BlockedAgent
18 + SetEnvIfNoCase User-Agent "EmailWolf" BlockedAgent
19 + SetEnvIfNoCase User-Agent "libwww-perl" BlockedAgent
20 +
21 + ## end of 3rd party list (note: these could also be blocked in robots.txt, see article)
22 +
23 + ## List derived from actual activity
24 + # Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
25 + SetEnvIfNoCase User-Agent "BLEXBot" BlockedAgent
26 +
27 + # Mozilla/5.0 (compatible; 007ac9 Crawler; http://crawler.007ac9.net/)
28 + SetEnvIfNoCase User-Agent "007ac9 Crawler" BlockedAgent
29 +
30 + #Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)
31 + SetEnvIfNoCase User-Agent "MJ12bot" BlockedAgent
32 +
33 + # Fetchbot (https://github.com/PuerkitoBio/fetchbot)
34 + SetEnvIfNoCase User-Agent "Fetchbot" BlockedAgent
35 +
36 + #Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
37 + SetEnvIfNoCase User-Agent "SISTRIX" BlockedAgent
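
As the list grows, related agents can be folded into a single rule using regex alternation. This is a sketch of an alternative layout, not how the gist organises its list; the grouping of these four crawlers is chosen purely for illustration.

```apache
# One alternation instead of four single-agent lines (illustrative grouping).
# SetEnvIfNoCase matches case-insensitively anywhere in the User-Agent header.
SetEnvIfNoCase User-Agent "(BLEXBot|MJ12bot|Fetchbot|SISTRIX)" BlockedAgent
```

The trade-off is that one line per agent keeps the sample User-Agent string in a comment next to its rule, which makes it easier to see later why an entry was added.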

readme.txt (file created)

@@ -0,0 +1,2 @@
1 + Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bat Bots, creepy crawlers and irritating IPs
2 + see the full post at http://technology.blue-bag.com

vhost-sample.conf (file created)

@@ -0,0 +1,84 @@
1 + <VirtualHost *:80>
2 + ## Note this is heavily reduced just to show the relevant lines
3 + ## Expires and security options have been removed
4 + ## Don't just paste this - but refer to it along with your customisations
5 +
6 + ServerName www.example.com
7 +
8 + DocumentRoot /var/www/example.com/live/htdocs
9 +
10 + <Directory /var/www/example.com/live/htdocs>
11 + Options +FollowSymLinks
12 +
13 + # Disable .htaccess files (remember to account for any rules they implement)
14 + AllowOverride None
15 +
16 + # Include our blocked lists
17 + Include /etc/apache2/blocked-addresses.conf
18 + Include /etc/apache2/blocked-agents.conf
19 +
20 + Order allow,deny
21 + Allow from all
22 + # Deny from our blocked lists
23 + deny from env=BlockedAddress
24 + deny from env=BlockedAgent
25 +
26 + <IfModule mod_rewrite.c>
27 + RewriteEngine on
28 +
29 +
30 + # Intercept Microsoft Office Protocol Discovery
31 + # OPTIONS requests for this were hitting the site regularly
32 + RewriteCond %{REQUEST_METHOD} ^OPTIONS
33 + RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Office\ Protocol\ Discovery [OR]
34 + RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Office\ Existence\ Discovery [OR]
35 + RewriteCond %{HTTP_USER_AGENT} ^Microsoft\-WebDAV\-MiniRedir.*$
36 + RewriteRule .* - [R=405,L]
37 +
38 + ##### Security hardening ####
39 + ## DENY REQUEST BASED ON REQUEST METHOD ###
40 + RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK|OPTIONS|HEAD)$ [NC]
41 + RewriteRule ^.*$ - [F]
42 +
43 +
44 + </IfModule>
45 + </Directory>
46 +
47 + ## the following log details are included to show
48 + ## how to use SetEnvIf to include/exclude certain requests for images etc
49 + ## Also turn on robots.txt logging to check robots behaviour
50 +
51 + ## Custom Logging for combined logs - note they are filtered to not log images, favicon, css, js etc (robots.txt requests stay logged)
52 + UseCanonicalName On
53 + LogFormat "%V %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" vcommon
54 +
55 + ErrorLog /var/www/log/customer-error.log
56 +
57 + # Possible values include: debug, info, notice, warn, error, crit,
58 + # alert, emerg.
59 + LogLevel warn
60 +
61 + ## we aren't logging images, css, js etc
62 +
63 + ## flag robots.txt requests - allow these to test robot behaviour
64 + SetEnvIf Request_URI "^/robots\.txt$" robots-request=0
65 + ## flag favicon requests
66 + SetEnvIf Request_URI "^/favicon\.ico$" favicon-request=1
67 +
68 + ## flag image requests
69 + SetEnvIf Request_URI "(\.gif|\.png|\.jpg)$" image-request=1
70 +
71 + ## flag Css and JS requests
72 + SetEnvIf Request_URI "\.css$" css-request=1
73 + SetEnvIf Request_URI "\.js$" js-request=1
74 +
75 + ## set do_not_log if any of the above flags are set
76 + SetEnvIf robots-request 1 do_not_log=1
77 + SetEnvIf favicon-request 1 do_not_log=1
78 + SetEnvIf image-request 1 do_not_log=1
79 + SetEnvIf css-request 1 do_not_log=1
80 + SetEnvIf js-request 1 do_not_log=1
81 +
82 + ## only log if do_not_log is not set
83 + CustomLog /var/www/log/customer-access.log vcommon env=!do_not_log
84 + </VirtualHost>
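
The Order/Allow/Deny directives in the sample are Apache 2.2 access control syntax. On Apache 2.4 they only work through mod_access_compat. A minimal sketch of the same deny-by-environment-variable logic using the 2.4 Require directives follows. It assumes the two include files are installed under the paths used in the sample vhost and that they set BlockedAddress and BlockedAgent as shown above.

```apache
<Directory /var/www/example.com/live/htdocs>
    # Include the shared block lists (paths as in the sample vhost)
    Include /etc/apache2/blocked-addresses.conf
    Include /etc/apache2/blocked-agents.conf

    # Grant access to everyone except requests flagged by the lists
    <RequireAll>
        Require all granted
        Require not env BlockedAddress
        Require not env BlockedAgent
    </RequireAll>
</Directory>
```

The Require not lines are wrapped in RequireAll because a negated Require can only veto access and cannot grant it on its own.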