Fetch and parse a site's /robots.txt and /sitemap.xml, the two files that crawlers such as Googlebot typically consult early when crawling a site.
/robots.txt
/sitemap.xml
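
As a rough illustration of fetching and parsing these two files, here is a minimal sketch that uses only the Python standard library. The function names (`fetch_text`, `parse_robots`, `parse_sitemap`, `inspect_site`) are hypothetical and chosen for this example. It assumes the sitemap follows the sitemaps.org 0.9 schema and is served uncompressed at /sitemap.xml, which is not guaranteed for every site.

```python
import urllib.request
import urllib.robotparser
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace used by the sitemaps.org 0.9 schema (assumed here).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a URL and decode the body as UTF-8, replacing bad bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_robots(text: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def parse_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def inspect_site(base_url: str) -> dict:
    """Fetch both files for base_url and summarize what they contain."""
    robots = parse_robots(fetch_text(urljoin(base_url, "/robots.txt")))
    urls = parse_sitemap(fetch_text(urljoin(base_url, "/sitemap.xml")))
    return {
        # site_maps() returns None when robots.txt lists no Sitemap lines.
        "sitemaps_in_robots": robots.site_maps() or [],
        "sitemap_urls": urls,
        "root_allowed": robots.can_fetch("*", urljoin(base_url, "/")),
    }
```

Calling `inspect_site("https://example.com")` (a placeholder domain) would perform two HTTP requests and raise `urllib.error.HTTPError` if either file is missing, so a real tool would likely want to handle a 404 for each file separately. The parsing functions take plain text so they can be exercised without network access.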