python 2.7 - How to concatenate XPath in scrapy
I have been struggling with this issue for quite some time. From the table below I need to extract "annual div" instead of "annual div yield".
<table class="horizontaltable col1of3 lastcol">
  <tbody>
    <tr class="first">
      <th>annual div <span class="sub">(ttm)</span></th>
      <td>5.49 <span class="currencycode">gbx</span></td>
    </tr>
    <tr>
      <th>annual div yield <span class="sub">(ttm)</span></th>
      <td>6.04%</td>
    </tr>
    <tr>
      <th>div ex-date</th>
      <td><span class="nowrap">sep 25 2013</span></td>
    </tr>
    <tr class="last">
      <th>div pay-date</th>
      <td><span class="nowrap">nov 22 2013</span></td>
    </tr>
  </tbody>
</table>
I wrote an XPath query, but it returns both the "annual div" row and the "annual div yield" row:
annual_div = sel.xpath('//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[th[contains(.,"annual div")]]').extract()
result:
[u'<tr class="first"><th>annual div <span class="sub">(ttm)</span></th><td>5.49 <span class="currencycode">gbx</span></td></tr>', u'<tr><th>annual div yield <span class="sub">(ttm)</span></th><td>5.83%</td></tr>']
When I try to match on the exact text instead, the query returns nothing:
annual_div = sel.xpath('//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[th[text()="annual div"]]').extract()
It seems the (ttm) span gets in the way, and I am not sure how to concatenate "annual div" and "(ttm)" to come up with an exact match.
Please help me.
Regards
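For an exact match you are missing the whitespace at the end: the text node of the th is "annual div " (with a trailing space), because "(ttm)" belongs to the child span. This should work: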
//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[th/text() = "annual div "]
However, if you want to remove leading and trailing spaces before comparing, you can use normalize-space()
so:
//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[normalize-space(th/text()) = 'annual div']
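The three queries can be compared side by side with a small script. This is only a sketch: it uses lxml directly instead of a Scrapy selector (an assumption made so it runs outside a spider), and the last query, which matches the full text of the th including the span, is an extra alternative that is not part of the answer above.

```python
from lxml import html  # assumption: lxml is installed (Scrapy selectors build on it)

# Table from the question, trimmed to the rows that matter.
HTML = """
<table class="horizontaltable col1of3 lastcol">
  <tbody>
    <tr class="first"><th>annual div <span class="sub">(ttm)</span></th><td>5.49 <span class="currencycode">gbx</span></td></tr>
    <tr><th>annual div yield <span class="sub">(ttm)</span></th><td>6.04%</td></tr>
    <tr><th>div ex-date</th><td><span class="nowrap">sep 25 2013</span></td></tr>
  </tbody>
</table>
"""

tree = html.fromstring(HTML)
BASE = '//table[contains(@class, "horizontaltable col1of3")]/tbody'

# Original query: contains() also matches "annual div yield".
both = tree.xpath(BASE + '/tr[th[contains(., "annual div")]]')

# Exact match on the th's own text node, trailing space included.
exact = tree.xpath(BASE + '/tr[th/text() = "annual div "]')

# normalize-space() strips the surrounding whitespace before comparing.
ROW = BASE + '/tr[normalize-space(th/text()) = "annual div"]'
rows = tree.xpath(ROW)

# Alternative (not from the answer): compare the whole string value of th,
# which already includes the text of the nested span.
full = tree.xpath(BASE + '/tr[normalize-space(th) = "annual div (ttm)"]')

# Read the cell value of the matched row as a single string.
value = tree.xpath('normalize-space(' + ROW + '/td)')

print(len(both), len(exact), len(rows), len(full))
print(value)
```

In a Scrapy callback the same expressions can be passed to sel.xpath(...) unchanged.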