python - Scrapy XML output <![CDATA[<html lang="en"
In my Scrapy script I crawl a webpage's source and save the output in XML format. In the XML output, the content starts with
"<!DOCTYPE html> <!--[if IE 7]><html lang="en" "
but I need the XML output to start with
" <![CDATA[<html lang="en" "
How can I do this in a Scrapy script?
My code is given below.
    import scrapy

    from dell.items import DellItem


    class DellSpider(scrapy.Spider):
        name = "dell"
        allowed_domains = ["dell.com"]
        start_urls = (
            'http://jobs.dell.com/united-states-jobs/',
        )

        def parse(self, response):
            # Build a fresh item; the start request carries no 'item' in response.meta.
            item = DellItem()
            # Scrapy has already downloaded the page, so the body is read from the response directly.
            item['source'] = response.text
            return item
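
One possible direction, as a rough sketch rather than a confirmed Scrapy solution: as far as I know, Scrapy's built-in XML feed export escapes markup instead of emitting CDATA, so the CDATA wrapping would have to happen when the XML is written. The standalone function below uses only the standard library's xml.dom.minidom to show that step. The name item_to_xml and the items/item/source element names are hypothetical and chosen for illustration; calling it from an item pipeline or a custom exporter is left as an assumption.

```python
from xml.dom.minidom import Document


def item_to_xml(source_html):
    """Return an XML string with source_html stored inside CDATA (hypothetical helper)."""
    doc = Document()
    items = doc.createElement("items")      # element names are illustrative assumptions
    doc.appendChild(items)
    item = doc.createElement("item")
    items.appendChild(item)
    source = doc.createElement("source")
    item.appendChild(source)

    # A CDATA section cannot contain "]]>", so split the text at each
    # occurrence and spread the pieces over consecutive CDATA sections.
    parts = source_html.split("]]>")
    for i, part in enumerate(parts):
        chunk = part
        if i > 0:
            chunk = ">" + chunk              # second half of the split "]]>"
        if i < len(parts) - 1:
            chunk = chunk + "]]"             # first half of the split "]]>"
        source.appendChild(doc.createCDATASection(chunk))

    return doc.toxml()


if __name__ == "__main__":
    print(item_to_xml('<html lang="en"><body>Jobs</body></html>'))
```

Here the page markup is written unescaped between <![CDATA[ and ]]>, which is the output shape asked for above. In a spider, the string passed in would be item['source'].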