【发布时间】:2012-02-03 16:00:23
【问题描述】:
我正在尝试使用以下代码从我们的电话提供商的 Web 应用程序中抓取通话记录,以将信息输入到我的 Ruby on Rails 应用程序中。
desc "Import incoming calls"
task :fetch_incomingcalls => :environment do
# Logs into manage.phoneprovider.co.uk and retrieved list of incoming calls.
require 'rubygems'
require 'mechanize'
require 'logger'
# Create a new mechanize object
agent = Mechanize.new { |a| a.log = Logger.new(STDERR) }
# Load the Phone Provider website
page = agent.get("https://manage.phoneprovider.co.uk/login")
# Select the first form
form = agent.page.forms.first
form.username = 'username
form.password = 'password
# Submit the form
page = form.submit form.buttons.first
# Click on link called Call Logs
page = agent.page.link_with(:text => "Call Logs").click
# Click on link called Incoming Calls
page = agent.page.link_with(:text => "Incoming Calls").click
# Prints out table rows
# puts doc.css('table > tr')
# Print out the body as a test
# puts page.body
end
从最后五行可以看出,我已经测试了“puts page.body”可以成功运行,并且上面的代码也可以正常运行。它成功登录,然后导航到呼叫日志,然后是来电。来电表如下所示:
| Timestamp | Source | Destination | Duration |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
| 03 Jan 13:40 | 12345678 | 12345679 | 00:01:01 |
由以下代码生成:
<thead>
<tr>
<td>Timestamp</td>
<td>Source</td>
<td>Destination</td>
<td>Duration</td>
<td>Cost</td>
<td class='centre'>Recording</td>
</tr>
</thead>
<tbody>
<tr class='o'>
<tr>
<td>03 Jan 13:40</td>
<td>12345678</td>
<td>12345679</td>
<td>00:01:14</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='e'>
<tr>
<td>30 Dec 20:31</td>
<td>12345678</td>
<td>12345679</td>
<td>00:02:52</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='o'>
<tr>
<td>24 Dec 00:03</td>
<td>12345678</td>
<td>12345679</td>
<td>00:00:09</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='e'>
<tr>
<td>23 Dec 14:56</td>
<td>12345678</td>
<td>12345679</td>
<td>00:00:07</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
<tr class='o'>
<tr>
<td>21 Dec 13:26</td>
<td>07793770851</td>
<td>12345679</td>
<td>00:00:26</td>
<td></td>
<td class='opt recording'>
</td>
</tr>
</tr>
我正在尝试研究如何只选择我想要的单元格(时间戳、源、目标和持续时间)并输出它们。然后我可以担心将它们输出到数据库而不是终端中。
我曾尝试使用 Selector Gadget,但如果我选择多个,它只会显示 'td' 或 'tr:nth-child(6) td , tr:nth-child(2) td'。
任何帮助或指点将不胜感激!
【问题讨论】:
标签: html ruby-on-rails html-table nokogiri mechanize