Erreur32 revised this gist 6 months ago.
No changes
George Boobyer revised this gist 11 years ago.
1 file changed, 1 insertion, 1 deletion
Apache Better Blocking with common rules.txt
| @@ -1,2 +1,2 @@ | |||
| 1 | 1 | Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bad Bots, creepy crawlers and irritating IPs | |
| 2 | - | see the full post at http://www.blue-bag.com/ | |
| 2 | + | see the full post at http://www.blue-bag.com/blog/apache-better-blocking-common-rules | |
George Boobyer revised this gist 11 years ago.
1 file changed, 2 insertions, 2 deletions
Apache Better Blocking with common rules.txt
| @@ -1,2 +1,2 @@ | |||
| 1 | - | Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bat Bots, creepy crawlers and irritating IPs | |
| 2 | - | see the full post at http://technology.blue-bag.com | |
| 1 | + | Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bad Bots, creepy crawlers and irritating IPs | |
| 2 | + | see the full post at http://www.blue-bag.com/ | |
George Boobyer revised this gist 12 years ago.
1 file changed, 0 insertions, 0 deletions
Apache-Better-Blocking with common rules.txt renamed to Apache Better Blocking with common rules.txt
File renamed without changes
George Boobyer revised this gist 12 years ago.
1 file changed, 0 insertions, 0 deletions
__readme.txt renamed to Apache-Better-Blocking with common rules.txt
File renamed without changes
George Boobyer revised this gist 12 years ago.
1 file changed, 0 insertions, 0 deletions
readme.txt renamed to __readme.txt
File renamed without changes
George Boobyer revised this gist 12 years ago.
4 files changed, 144 insertions
blocked-adresses.conf (file created)
| @@ -0,0 +1,21 @@ | |||
| 1 | + | ## A list of known problem IPs | |
| 2 | + | ||
| 3 | + | # pen test on FCKeditor | |
| 4 | + | SetEnvIfNoCase REMOTE_ADDR "^175\.44\.30\.180$" BlockedAddress | |
| 5 | + | SetEnvIfNoCase REMOTE_ADDR "^175\.44\.29\.92$" BlockedAddress | |
| 6 | + | SetEnvIfNoCase REMOTE_ADDR "^175\.44\.30\.180$" BlockedAddress | |
| 7 | + | SetEnvIfNoCase REMOTE_ADDR "^174\.139\.240\.74$" BlockedAddress | |
| 8 | + | ||
| 9 | + | ||
| 10 | + | # looking for backups | |
| 11 | + | SetEnvIfNoCase REMOTE_ADDR "^192\.99\.12\.128$" BlockedAddress | |
| 12 | + | ||
| 13 | + | # Bad Crawler | |
| 14 | + | SetEnvIfNoCase REMOTE_ADDR "^144\.76\.195\.72$" BlockedAddress | |
| 15 | + | SetEnvIfNoCase REMOTE_ADDR "^54\.189\.47\.213$" BlockedAddress | |
| 16 | + | ||
| 17 | + | # Java scraper | |
| 18 | + | SetEnvIfNoCase REMOTE_ADDR "^62\.116\.110\.111$" BlockedAddress | |
| 19 | + | ||
| 20 | + | # Big hitter - known spammer | |
| 21 | + | SetEnvIfNoCase REMOTE_ADDR "^109\.201\.137\.166$" BlockedAddress | |
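A list like this grows quickly when several addresses share a prefix. As a minimal sketch, assuming the same BlockedAddress variable, the two distinct 175.44.x.x entries could be folded into one pattern. The grouping is an illustration and not part of the original file.

```apache
# Sketch only: one pattern covering two addresses from the list above.
# ^ and $ keep the pattern from matching inside a longer address
# (for example, an unanchored "54\.189" would also match "154.189").
SetEnvIf Remote_Addr "^175\.44\.(29\.92|30\.180)$" BlockedAddress
```

Whether one grouped rule or one rule per address reads better is a judgment call; separate lines make it easier to annotate each address with the reason it was blocked.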
blocked-agents.conf (file created)
| @@ -0,0 +1,37 @@ | |||
| 1 | + | # list obtained from 3rd party | |
| 2 | + | # blank user-agents | |
| 3 | + | SetEnvIfNoCase User-Agent "^$" BlockedAgent | |
| 4 | + | SetEnvIfNoCase User-Agent "Jakarta" BlockedAgent | |
| 5 | + | SetEnvIfNoCase User-Agent "User-Agent" BlockedAgent | |
| 6 | + | SetEnvIfNoCase User-Agent "libwww," BlockedAgent | |
| 7 | + | SetEnvIfNoCase User-Agent "lwp-trivial" BlockedAgent | |
| 8 | + | SetEnvIfNoCase User-Agent "Snoopy" BlockedAgent | |
| 9 | + | SetEnvIfNoCase User-Agent "PHPCrawl" BlockedAgent | |
| 10 | + | SetEnvIfNoCase User-Agent "WEP Search" BlockedAgent | |
| 11 | + | SetEnvIfNoCase User-Agent "Missigua Locator" BlockedAgent | |
| 12 | + | SetEnvIfNoCase User-Agent "ISC Systems iRc" BlockedAgent | |
| 13 | + | SetEnvIfNoCase User-Agent "lwp-trivial" BlockedAgent | |
| 14 | + | ||
| 15 | + | SetEnvIfNoCase User-Agent "GbPlugin" BlockedAgent | |
| 16 | + | SetEnvIfNoCase User-Agent "Wget" BlockedAgent | |
| 17 | + | SetEnvIfNoCase User-Agent "EmailSiphon" BlockedAgent | |
| 18 | + | SetEnvIfNoCase User-Agent "EmailWolf" BlockedAgent | |
| 19 | + | SetEnvIfNoCase User-Agent "libwww-perl" BlockedAgent | |
| 20 | + | ||
| 21 | + | ## end of 3rd party list (note: these could also be disallowed in robots.txt; see article) | |
| 22 | + | ||
| 23 | + | ## List derived from actual activity | |
| 24 | + | # Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/) | |
| 25 | + | SetEnvIfNoCase User-Agent "BLEXBot" BlockedAgent | |
| 26 | + | ||
| 27 | + | # Mozilla/5.0 (compatible; 007ac9 Crawler; http://crawler.007ac9.net/) | |
| 28 | + | SetEnvIfNoCase User-Agent "007ac9 Crawler" BlockedAgent | |
| 29 | + | ||
| 30 | + | #Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+) | |
| 31 | + | SetEnvIfNoCase User-Agent "MJ12bot" BlockedAgent | |
| 32 | + | ||
| 33 | + | # Fetchbot (https://github.com/PuerkitoBio/fetchbot) | |
| 34 | + | SetEnvIfNoCase User-Agent "Fetchbot" BlockedAgent | |
| 35 | + | ||
| 36 | + | #Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/) | |
| 37 | + | SetEnvIfNoCase User-Agent "SISTRIX" BlockedAgent | |
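A shared agent list can also catch tools you run yourself, such as Wget on a monitoring host. One possible way to carve out an exception, sketched here under the assumption that the host has a fixed address, is to clear the flag with the `!variable` form of SetEnvIf. The address 192.0.2.10 is a placeholder from the documentation range and does not appear in the original.

```apache
# Sketch only: 192.0.2.10 is a hypothetical trusted host, not from the gist.
# Place this after the Include lines so it runs once the block lists
# have already set BlockedAgent.
SetEnvIf Remote_Addr "^192\.0\.2\.10$" !BlockedAgent
```

SetEnvIf directives in the same context are evaluated in the order they appear, which is why the exemption has to come last.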
readme.txt (file created)
| @@ -0,0 +1,2 @@ | |||
| 1 | + | Following on from other Gists I have posted, this one shows a neat way of using Includes to centralise general blocking rules for Bat Bots, creepy crawlers and irritating IPs | |
| 2 | + | see the full post at http://technology.blue-bag.com | |
vhost-sample.conf (file created)
| @@ -0,0 +1,84 @@ | |||
| 1 | + | <VirtualHost *:80> | |
| 2 | + | ## Note this is heavily reduced just to show the relevant lines | |
| 3 | + | ## Expires and security options have been removed | |
| 4 | + | ## Don't just paste this - but refer to it along with your customisations | |
| 5 | + | ||
| 6 | + | ServerName www.example.com | |
| 7 | + | ||
| 8 | + | DocumentRoot /var/www/example.com/live/htdocs | |
| 9 | + | ||
| 10 | + | <Directory /var/www/example.com/live/htdocs> | |
| 11 | + | Options +FollowSymLinks | |
| 12 | + | ||
| 13 | + | # Disable .htaccess files (remember to account for any rules they implement) | |
| 14 | + | AllowOverride None | |
| 15 | + | ||
| 16 | + | # Include our blocked lists | |
| 17 | + | Include /etc/apache2/blocked-addresses.conf | |
| 18 | + | Include /etc/apache2/blocked-agents.conf | |
| 19 | + | ||
| 20 | + | Order allow,deny | |
| 21 | + | Allow from all | |
| 22 | + | # Deny from our blocked lists | |
| 23 | + | deny from env=BlockedAddress | |
| 24 | + | deny from env=BlockedAgent | |
| 25 | + | ||
| 26 | + | <IfModule mod_rewrite.c> | |
| 27 | + | RewriteEngine on | |
| 28 | + | ||
| 29 | + | ||
| 30 | + | # Intercept Microsoft Office Protocol Discovery | |
| 31 | + | # OPTIONS requests for this were hitting the site regularly | |
| 32 | + | RewriteCond %{REQUEST_METHOD} ^OPTIONS | |
| 33 | + | RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Office\ Protocol\ Discovery [OR] | |
| 34 | + | RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Office\ Existence\ Discovery [OR] | |
| 35 | + | RewriteCond %{HTTP_USER_AGENT} ^Microsoft\-WebDAV\-MiniRedir.*$ | |
| 36 | + | RewriteRule .* - [R=405,L] | |
| 37 | + | ||
| 38 | + | ##### Security hardening #### | |
| 39 | + | ## DENY REQUEST BASED ON REQUEST METHOD ### | |
| 40 | + | RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK|OPTIONS|HEAD)$ [NC] | |
| 41 | + | RewriteRule ^.*$ - [F] | |
| 42 | + | ||
| 43 | + | ||
| 44 | + | </IfModule> | |
| 45 | + | </Directory> | |
| 46 | + | ||
| 47 | + | ## the following log details are included to show | |
| 48 | + | ## how to use SetEnvIf to include/exclude certain requests for images etc | |
| 49 | + | ## Also turn on robots.txt logging to check robots behaviour | |
| 50 | + | ||
| 51 | + | ## Custom Logging for combined logs - note they are filtered to not log images, css, js etc | |
| 52 | + | UseCanonicalName On | |
| 53 | + | LogFormat "%V %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" vcommon | |
| 54 | + | ||
| 55 | + | ErrorLog /var/www/log/customer-error.log | |
| 56 | + | ||
| 57 | + | # Possible values include: debug, info, notice, warn, error, crit, | |
| 58 | + | # alert, emerg. | |
| 59 | + | LogLevel warn | |
| 60 | + | ||
| 61 | + | ## we aren't logging images, css, js etc | |
| 62 | + | ||
| 63 | + | ## flag robots.txt requests - allow these to test robot behaviour | |
| 64 | + | SetEnvIf Request_URI "^/robots\.txt$" robots-request=0 | |
| 65 | + | ## flag favicon requests | |
| 66 | + | SetEnvIf Request_URI "^/favicon\.ico$" favicon-request=1 | |
| 67 | + | ||
| 68 | + | ## flag image requests | |
| 69 | + | SetEnvIf Request_URI "(\.gif|\.png|\.jpg)$" image-request=1 | |
| 70 | + | ||
| 71 | + | ## flag Css and JS requests | |
| 72 | + | SetEnvIf Request_URI "\.css$" css-request=1 | |
| 73 | + | SetEnvIf Request_URI "\.js$" js-request=1 | |
| 74 | + | ||
| 75 | + | ## set do_not_log if any of the above flags are set | |
| 76 | + | SetEnvIf robots-request 1 do_not_log=1 | |
| 77 | + | SetEnvIf favicon-request 1 do_not_log=1 | |
| 78 | + | SetEnvIf image-request 1 do_not_log=1 | |
| 79 | + | SetEnvIf css-request 1 do_not_log=1 | |
| 80 | + | SetEnvIf js-request 1 do_not_log=1 | |
| 81 | + | ||
| 82 | + | ## only log if do_not_log is not set | |
| 83 | + | CustomLog /var/www/log/customer-access.log vcommon env=!do_not_log | |
| 84 | + | </VirtualHost> | |
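The sample uses the Apache 2.2 access directives (Order, Allow, Deny). On Apache 2.4 these are only available through mod_access_compat. A rough equivalent using the 2.4 Require syntax, assuming the same two include files and environment variables, might look like this:

```apache
# Sketch only: Apache 2.4 form of the deny-by-environment check.
<Directory /var/www/example.com/live/htdocs>
    # Same shared block lists as in the sample vhost
    Include /etc/apache2/blocked-addresses.conf
    Include /etc/apache2/blocked-agents.conf

    # Negated Require lines must sit inside <RequireAll>
    # alongside at least one positive Require.
    <RequireAll>
        Require all granted
        Require not env BlockedAddress
        Require not env BlockedAgent
    </RequireAll>
</Directory>
```

Mixing the old and new directives in one configuration is generally discouraged, so it is probably best to pick one style per server.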