python 2.7 - How to concatenate XPath in Scrapy


I have been struggling with this issue for quite some time. From the table below I need to extract the annual div row instead of the annual div yield row.

<table class="horizontaltable col1of3 lastcol">
  <tbody>
    <tr class="first">
      <th>annual div <span class="sub">(ttm)</span></th>
      <td>5.49 <span class="currencycode">gbx</span></td>
    </tr>
    <tr>
      <th>annual div yield <span class="sub">(ttm)</span></th>
      <td>6.04%</td>
    </tr>
    <tr>
      <th>div ex-date</th>
      <td><span class="nowrap">sep 25 2013</span></td>
    </tr>
    <tr class="last">
      <th>div pay-date</th>
      <td><span class="nowrap">nov 22 2013</span></td>
    </tr>
  </tbody>
</table>

I wrote an XPath query, but it brings back both annual div and annual div yield:

annual_div = sel.xpath('//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[th[contains(.,"annual div")]]').extract() 

Result:

[u'<tr class="first"><th>annual div <span class="sub">(ttm)</span></th><td>5.49 <span class="currencycode">gbx</span></td></tr>', u'<tr><th>annual div yield <span class="sub">(ttm)</span></th><td>5.83%</td></tr>']

When I write a match on the exact text, the query does not return any result:

annual_div = sel.xpath('//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[th[text()="annual div"]]').extract() 

It seems the span containing (ttm) is getting in the way, but I am not sure how to concatenate the two so that "annual div (ttm)" comes out as an exact match.

Please help me.

Regards

To compare for an exact match, you are missing the whitespace at the end of the text node. This should work:

//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[th/text() = "annual div "]

However, if you want to remove leading and trailing spaces, you can use normalize-space() like so:

//table[contains(@class, "horizontaltable col1of3")]/tbody/tr[normalize-space(th/text()) = 'annual div'] 
