Robots.txt Files Collected from Several Well-Known Websites and Blogs

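A well-built robots.txt file can play a significant role in your blog's SEO (search engine optimization). Most of us have read plenty of articles on how to write one (for example, my previous post: Creating a robots.txt File for Your WordPress Blog), but seeing what people actually do beats reading what they say. Below are the robots.txt files of several well-known websites and blogs, collected by dailyblogtips, so we can look at how those successful blogs handle it.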

The Minimalistic Guys


Problogger.net

User-agent: *
Disallow:


Marketing Pilgrim

User-agent: *
Disallow:

Search Engine Journal

User-agent: *
Disallow:

Matt Cutts

User-agent: *
Allow:
User-agent: *
Disallow: /files/

Pronet Advertising

User-agent: *
Disallow: /mt
Disallow: /*.cgi$

TechCrunch

User-agent: *
Disallow: /*/feed/
Disallow: /*/trackback/
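
The Pronet Advertising and TechCrunch rules above use `*` and a trailing `$`. These are pattern extensions that Google documents and that RFC 9309 describes; not every crawler honors them. The sketch below is only an illustration of how such a pattern might be matched against a URL path. The function name `rule_matches` and the sample paths are made up for this example, and the logic is a simplification, not any crawler's actual code.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the given URL path.

    Assumed semantics (simplified): '*' matches any run of characters,
    a trailing '$' anchors the pattern to the end of the path, and
    otherwise the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if not anchored:
        regex += ".*"  # prefix match: anything may follow
    return re.fullmatch(regex, path) is not None


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules quoted above.
    checks = [
        ("/*/feed/", "/2008/12/some-post/feed/"),
        ("/*/feed/", "/feed/"),
        ("/*.cgi$", "/mt/search.cgi"),
        ("/*.cgi$", "/mt/search.cgi?x=1"),
        ("/mt", "/mt-static/x.css"),
    ]
    for pattern, path in checks:
        print(pattern, path, rule_matches(pattern, path))
```

Under these assumed semantics, `/*/feed/` covers per-post feeds but not the top-level `/feed/` path, which is worth keeping in mind when borrowing such a rule.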

The Structured Ones

Online Marketing Blog

User-agent: Googlebot
Disallow: */feed/

User-agent: *
Disallow: /Blogger/
Disallow: /wp-admin/
Disallow: /stats/
Disallow: /cgi-bin/
Disallow: /2005x/

Shoemoney

User-Agent: Googlebot
Disallow: /link.php
Disallow: /gallery2
Disallow: /gallery2/
Disallow: /category/
Disallow: /page/
Disallow: /pages/
Disallow: /feed/
Disallow: /feed

Scoreboard Media

User-agent: *
Disallow: /cgi-bin/

User-agent: Googlebot
Disallow: /category/
Disallow: /page/
Disallow: */feed/
Disallow: /2007/
Disallow: /2006/
Disallow: /wp-*

SEOMoz.org

User-agent: *
Disallow: /blogdetail.php?ID=537
Disallow: /blog?page
Disallow: /blog/author/
Disallow: /blog/category/
Disallow: /tracker
Disallow: /ugc?page
Disallow: /ugc/author/
Disallow: /ugc/category/

Wolf-Howl

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /noindex/
Disallow: /privacy-policy/
Disallow: /about/
Disallow: /company-biographies/
Disallow: /press-media-room/
Disallow: /newsletter/
Disallow: /contact-us/
Disallow: /terms-of-service/
Disallow: /terms-of-service/
Disallow: /information/comment-policy/
Disallow: /faq/
Disallow: /contact-form/
Disallow: /advertising/
Disallow: /information/licensing-information/
Disallow: /2005/
Disallow: /2006/
Disallow: /2007/
Disallow: /2008/
Disallow: /2009/
Disallow: /2004/
Disallow: /*?*
Disallow: /page/
Disallow: /iframes/

John Chow

sitemap: http://www.johnchow.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /go/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /author/
Disallow: /page/
Disallow: /category/
Disallow: /wp-images/
Disallow: /images/
Disallow: /backup/
Disallow: /banners/
Disallow: /archives/
Disallow: /trackback/
Disallow: /feed/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Mediapartners-Google
Allow: /

User-agent: duggmirror
Disallow: /
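
Before adapting a file like John Chow's for your own blog, it can help to check what the rules actually permit. The following is a minimal sketch using Python's standard `urllib.robotparser` on a subset of the rules above. The agent name `ExampleBot`, the helper `allowed`, and the sample paths are assumptions made for illustration. Note that this parser does simple prefix matching and does not interpret `*` or `$` inside paths, so it suits files like this one that use plain prefixes.

```python
from urllib.robotparser import RobotFileParser

# Subset of the John Chow rules quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /category/
Disallow: /feed/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: duggmirror
Disallow: /
"""

BASE = "http://www.johnchow.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Ask the parsed rules whether `agent` may fetch `path` on the site."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "ExampleBot" and the paths are hypothetical, used only for the demo.
    for agent, path in [
        ("ExampleBot", "/some-post/"),
        ("ExampleBot", "/feed/"),
        ("duggmirror", "/some-post/"),
        ("Googlebot-Image", "/wp-content/uploads/photo.jpg"),
    ]:
        print(agent, path, allowed(agent, path))
```

A crawler that has its own `User-agent` group follows only that group, so the `*` rules do not apply to it on top.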

Smashing Magazine

Sitemap: http://www.smashingmagazine.com/sitemap.xml

User-agent: Mediapartners-Google*
Disallow:

User-agent: *
Disallow: /styles/
Disallow: /inc/
Disallow: /tag/
Disallow: /cc/
Disallow: /category/

User-agent: MSIECrawler
Disallow: /

User-agent: psbot
Disallow: /

User-agent: Fasterfox
Disallow: /

User-agent: Slurp
Crawl-delay: 200
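
Smashing Magazine's file also carries two non-rule directives, `Sitemap` and `Crawl-delay`. As a rough sketch, Python's standard `urllib.robotparser` can read both back from a subset of the file above (`site_maps()` assumes Python 3.8 or later). `Crawl-delay` is a non-standard directive, so how a given crawler treats it is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Subset of the Smashing Magazine file quoted above.
ROBOTS_TXT = """\
Sitemap: http://www.smashingmagazine.com/sitemap.xml

User-agent: *
Disallow: /tag/
Disallow: /category/

User-agent: psbot
Disallow: /

User-agent: Slurp
Crawl-delay: 200
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap URLs declared anywhere in the file (None if there are none).
sitemaps = parser.site_maps()

# Crawl delay for a named agent (None if its group sets no delay).
slurp_delay = parser.crawl_delay("Slurp")
psbot_delay = parser.crawl_delay("psbot")

if __name__ == "__main__":
    print("sitemaps:", sitemaps)
    print("Slurp crawl delay:", slurp_delay)
    print("psbot crawl delay:", psbot_delay)
```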

Gizmodo

User-Agent: Googlebot
Disallow: /index.xml$
Disallow: /excerpts.xml$
Allow: /sitemap.xml$
Disallow: /*view=rss$
Disallow: /*?view=rss$
Disallow: /*format=rss$
Disallow: /*?format=rss$
Sitemap: http://gizmodo.com/sitemap.xml

Lifehacker

User-Agent: Googlebot
Disallow: /index.xml$
Disallow: /excerpts.xml$
Allow: /sitemap.xml$
Disallow: /*view=rss$
Disallow: /*?view=rss$
Disallow: /*format=rss$
Disallow: /*?format=rss$
Sitemap: http://lifehacker.com/sitemap.xml

The Mainstream Media

Wall Street Journal

User-agent: *
Disallow: /article_email/
Disallow: /article_print/
Disallow: /PA2VJBNA4R/
Sitemap: http://online.wsj.com/sitemap.xml

ZDNet

User-agent: *
Disallow: /Ads/
Disallow: /redir/
# Disallow: /i/ is removed per 190723
Disallow: /av/
Disallow: /css/
Disallow: /error/
Disallow: /clear/
Disallow: /mac-ad
Disallow: /adlog/
# URS per bug 239819, these were expanded
Disallow: /1300-
Disallow: /1301-
Disallow: /1302-
Disallow: /1303-
Disallow: /1304-
Disallow: /1305-
Disallow: /1306-
Disallow: /1307-
Disallow: /1308-
Disallow: /1309-
Disallow: /1310-
Disallow: /1311-
Disallow: /1312-
Disallow: /1313-
Disallow: /1314-
Disallow: /1315-
Disallow: /1316-
Disallow: /1317-

NY Times

# robots.txt, www.nytimes.com 6/29/2006
#
User-agent: *
Disallow: /pages/college/
Disallow: /college/
Disallow: /library/
Disallow: /learning/
Disallow: /aponline/
Disallow: /reuters/
Disallow: /cnet/
Disallow: /partners/
Disallow: /archives/
Disallow: /indexes/
Disallow: /thestreet/
Disallow: /nytimes-partners/
Disallow: /financialtimes/
Allow: /pages/
Allow: /2003/
Allow: /2004/
Allow: /2005/
Allow: /top/
Allow: /ref/
Allow: /services/xml/

User-agent: Mediapartners-Google*
Disallow:

YouTube

# robots.txt file for YouTube

User-agent: Mediapartners-Google*
Disallow:
User-agent: *
Disallow: /profile
Disallow: /results
Disallow: /browse
Disallow: /t/terms
Disallow: /t/privacy
Disallow: /login
Disallow: /watch_ajax
Disallow: /watch_queue_ajax

Bonus

Google

User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /relcontent
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /default
Disallow: /m?
Disallow: /m/search?
Disallow: /wml?
Disallow: /wml/search?
Disallow: /xhtml?
Disallow: /xhtml/search?
Disallow: /xml?
Disallow: /imode?
Disallow: /imode/search?
Disallow: /jsky?
Disallow: /jsky/search?
Disallow: /pda?
Disallow: /pda/search?

This entry was posted by 莫 涯 on December 6, 2008 at 8:48 am and filed under the Blog Building (博客建设) category.
