Does Googlebot request every directory tree, even ones that don't exist?

Occasional WRInaute
Hello,

I'm posting about Google's crawling.
For several days my server has been under attack, and I have set up filters with fail2ban.
At the same time, I notice that an IP address repeatedly requests, in rapid succession, "standard" files under a "standard" directory tree that does not exist on my server.
Here are the log entries:

[Sat Oct 10 03:56:43 2015] [error] [client 66.249.64.84] File does not exist: /var/www/robots.txt
[Sat Oct 10 03:56:56 2015] [error] [client 66.249.64.84] File does not exist: /var/www/themes
[Sat Oct 10 03:56:58 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:56:58 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:56:59 2015] [error] [client 66.249.64.84] File does not exist: /var/www/modules
[Sat Oct 10 03:57:00 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:01 2015] [error] [client 66.249.64.84] File does not exist: /var/www/themes
[Sat Oct 10 03:57:01 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:03 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:04 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:04 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:05 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:06 2015] [error] [client 66.249.64.79] File does not exist: /var/www/themes
[Sat Oct 10 03:57:07 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:08 2015] [error] [client 66.249.64.79] File does not exist: /var/www/js
[Sat Oct 10 03:57:08 2015] [error] [client 66.249.64.84] File does not exist: /var/www/modules
[Sat Oct 10 03:57:09 2015] [error] [client 66.249.64.89] File does not exist: /var/www/themes
[Sat Oct 10 03:57:10 2015] [error] [client 66.249.64.79] File does not exist: /var/www/modules
[Sat Oct 10 03:57:11 2015] [error] [client 66.249.64.84] File does not exist: /var/www/themes
[Sat Oct 10 04:04:02 2015] [error] [client 50.28.56.15] File does not exist: /var/www/modules
[Sat Oct 10 04:04:02 2015] [error] [client 50.28.56.15] script '/var/www/index.php' not found or unable to stat
[Sat Oct 10 04:04:02 2015] [error] [client 50.28.56.15] File does not exist: /var/www/modules
[Sat Oct 10 04:04:02 2015] [error] [client 50.28.56.15] File does not exist: /var/www/modules
[Sat Oct 10 04:04:03 2015] [error] [client 50.28.56.15] File does not exist: /var/www/modules
[Sat Oct 10 04:05:15 2015] [error] [client 173.242.127.190] File does not exist: /var/www/phpMyAdmin

50.28.56.15 and 173.242.127.190 were banned automatically.
As for 66.249.64.89, which belongs to Google according to whois, I have taken no action.
I have no /var/www/xxxxxx subdirectory.

What do you think?
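
To see at a glance which client is asking for which missing path, the error log can be tallied. The sketch below assumes the Apache 2.2-style line format shown in the excerpt above; the function name `tally_missing_paths` and the `sample` list are illustrative only, and the sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches error log lines in the format quoted above, e.g.
# "[client 1.2.3.4] File does not exist: /path" or
# "[client 1.2.3.4] script '/path' not found or unable to stat".
LINE_RE = re.compile(
    r"\[client (?P<ip>[0-9a-fA-F.:]+)\] "
    r"(?:File does not exist: |script ')(?P<path>[^\s']+)"
)


def tally_missing_paths(lines):
    """Count (client IP, missing path) pairs found in error log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("ip"), match.group("path"))] += 1
    return counts


# A few lines taken from the log excerpt in this post.
sample = [
    "[Sat Oct 10 03:56:43 2015] [error] [client 66.249.64.84] File does not exist: /var/www/robots.txt",
    "[Sat Oct 10 03:56:56 2015] [error] [client 66.249.64.84] File does not exist: /var/www/themes",
    "[Sat Oct 10 03:57:01 2015] [error] [client 66.249.64.84] File does not exist: /var/www/themes",
    "[Sat Oct 10 04:04:02 2015] [error] [client 50.28.56.15] script '/var/www/index.php' not found or unable to stat",
]

# Print the most frequent (IP, path) pairs first.
for (ip, path), n in tally_missing_paths(sample).most_common():
    print(ip, path, n)
```

On a real server the same function could be fed an open log file instead of `sample`, since it only needs an iterable of lines.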
 
Addicted WRInaute
Hello,

Googlebot knows nothing about your directory tree. It requests URLs that it found somewhere, and those URLs can of course be wrong. Tracking down where they came from is not easy.

Jean-Luc
 
Olivier Duffez (admin)
Staff member
There may be a link to this URL on your site (or elsewhere on the web).
Which site is it?
 
Occasional WRInaute
It is a dedicated server hosting several websites.
The directory tree looks like this:

var/www/site1.fr/web/contenu-du-site
var/www/site2.fr/web/contenu-du-site
var/www/site3.fr/web/contenu-du-site
etc.

Under each site's content directory there are indeed "standard" directories such as modules, themes, etc.
So I don't think an external link points to these crawl "errors" (if that is what they are).
I never created a /var/www/themes directory, for example.
 
Occasional WRInaute
Another example, received a little while ago:
NetName: GOOGLE-CLOUD
Is this really directly related to Google?
Because I don't see why it would try to access /var/www/phpMyAdmin!


Hi,

The IP 146.148.31.234 has just been banned by Fail2Ban after
1 attempts against apache-bruteforce.


Here are more information about 146.148.31.234:


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# http://www.arin.net/public/whoisinaccuracy/index.xhtml
#


#
# Query terms are ambiguous. The query is assumed to be:
# "n 146.148.31.234"
#
# Use "?" to get help.
#

#
# The following results may also be obtained via:
# http://whois.arin.net/rest/nets;q=146.148.31.234?showDetails=true&show ... xt=netref2
#

NetRange: 146.148.0.0 - 146.148.127.255
CIDR: 146.148.0.0/17
NetName: GOOGLE-CLOUD
NetHandle: NET-146-148-0-0-1
Parent: NET146 (NET-146-0-0-0-0)
NetType: Direct Allocation
OriginAS: AS15169
Organization: Google Inc. (GOOGL-2)
RegDate: 2014-03-26
Updated: 2015-09-21
Comment: ** The IP addresses under this netblock are in use by Google Cloud customers **
Comment:
Comment: Direct all copyright and legal complaints to
Comment: https://support.google.com/legal/go/report
Comment:
Comment: Direct all spam and abuse complaints to
Comment: https://support.google.com/code/go/gce_abuse_report
Comment:
Comment: For fastest response, use the relevant forms above.
Comment:
Comment: Complaints can also be sent to the GC Abuse desk
Comment: (google-cloud-compliance@google.com)
Comment: but may have longer turnaround times.
Comment:
Comment: Complaints sent to any other POC will be ignored.
Ref: http://whois.arin.net/rest/net/NET-146-148-0-0-1


OrgName: Google Inc.
OrgId: GOOGL-2
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US
RegDate: 2006-09-29
Updated: 2015-09-21
Comment: *** The IP addresses under this Org-ID are in use by Google Cloud customers ***
Comment:
Comment: Direct all copyright and legal complaints to
Comment: https://support.google.com/legal/go/report
Comment:
Comment: Direct all spam and abuse complaints to
Comment: https://support.google.com/code/go/gce_abuse_report
Comment:
Comment: For fastest response, use the relevant forms above.
Comment:
Comment: Complaints can also be sent to the GC Abuse desk
Comment: (google-cloud-compliance@google.com)
Comment: but may have longer turnaround times.
Comment:
Comment: Complaints sent to any other POC will be ignored.
Ref: http://whois.arin.net/rest/org/GOOGL-2


OrgTechHandle: ZG39-ARIN
OrgTechName: Google Inc
OrgTechPhone: +1-650-253-0000
OrgTechEmail: arin-contact@google.com
OrgTechRef: http://whois.arin.net/rest/poc/ZG39-ARIN

OrgAbuseHandle: GCABU-ARIN
OrgAbuseName: GC Abuse
OrgAbusePhone: +1-650-253-0000
OrgAbuseEmail: google-cloud-compliance@google.com
OrgAbuseRef: http://whois.arin.net/rest/poc/GCABU-ARIN

OrgNOCHandle: GCABU-ARIN
OrgNOCName: GC Abuse
OrgNOCPhone: +1-650-253-0000
OrgNOCEmail: google-cloud-compliance@google.com
OrgNOCRef: http://whois.arin.net/rest/poc/GCABU-ARIN


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# http://www.arin.net/public/whoisinaccuracy/index.xhtml
#


Lines containing IP:146.148.31.234 in /var/log/apache2/error*.log

[Sat Oct 10 19:20:16 2015] [error] [client 146.148.31.234] File does not exist: /var/www/phpMyAdmin


Regards,

Fail2Ban
 
Addicted WRInaute
No, that is not Google. It is a malicious customer of Google.

It would be worth comparing with the web server logs. They would show more detail than the error log.

Jean-Luc
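
As a rough illustration of that distinction, the WHOIS output quoted above gives the range `146.148.0.0/17` for GOOGLE-CLOUD, and membership in such a range can be checked with Python's standard `ipaddress` module. This is only a sketch: the constant and function names are made up for the example, and the single range comes from the email above, not from any complete list of Google Cloud addresses.

```python
import ipaddress

# Range copied from the WHOIS output quoted in this thread
# (NetName: GOOGLE-CLOUD, CIDR: 146.148.0.0/17).
GOOGLE_CLOUD_RANGE = ipaddress.ip_network("146.148.0.0/17")


def is_in_range(ip, network=GOOGLE_CLOUD_RANGE):
    """Return True if the address belongs to the given network."""
    return ipaddress.ip_address(ip) in network


print(is_in_range("146.148.31.234"))  # the banned address -> True
print(is_in_range("66.249.64.84"))    # the address from the crawl lines -> False
```

An address inside this range is rented by a Google Cloud customer, so a hit here says nothing about Google's own crawler.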
 
Occasional WRInaute
Clever... :)
Thanks for your replies!

Regarding

[client 66.249.64.84] File does not exist: /var/www/themes

Is that really Google?
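
One way to check this, rather than relying on whois alone, is the reverse-then-forward DNS test that Google describes for verifying Googlebot: the reverse lookup of the IP should give a host name under googlebot.com or google.com, and resolving that host name should return the same IP. The sketch below is a minimal IPv4-only version; the function names and the injectable `reverse_lookup` / `forward_lookup` parameters are assumptions made for the example so that it can be tried without network access.

```python
import socket

# Domains under which Googlebot host names are expected to resolve.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    """Reverse DNS: IP address -> primary host name."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname):
    """Forward DNS: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(ip, reverse_lookup=_reverse, forward_lookup=_forward):
    """Return True only if reverse and forward DNS both confirm the IP."""
    try:
        hostname = reverse_lookup(ip)
    except OSError:  # no PTR record or DNS failure
        return False
    # The host name must belong to one of the expected Google domains.
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The host name must resolve back to the original address.
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

Calling `is_verified_googlebot("66.249.64.84")` with the default arguments performs live DNS queries, so the result depends on the DNS records at the time of the call.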
 
Addicted WRInaute
Are you on a dedicated server? Because that crawl rate is pretty high. You know you can limit Google's crawl rate, right?
 