Parse the text after using BS4, Python and Selenium
After running my scrape script:

    from selenium import webdriver
    from bs4 import BeautifulSoup
    import csv

    browser = webdriver.Firefox()
    browser.get('http://dyn.com/about/events/')
    html = browser.page_source
    soup = BeautifulSoup(html)
    titles = [tag.text for tag in soup.find_all('p','pubdate')]

I get a result that looks like this:
[u'\n\n\t\t\tWEBINAR: How To Expand Your Global Reach To China\xa0\n\t\t\t\n\t\t\tOct 22, 2014\t\t\t\nspeak \n', u'\n\n\t\t\tLAUNCH Scale \u2013 San Francisco, CA\xa0\n\t\t\t\n\t\t\tOct 23 - 24, 2014\t\t\t\nattend \n', u'\n\n\t\t\tAcquia Engage User Conference \u2013 Boston, MA\xa0\n\t\t\t\n\t\t\tNov 3 - 5, 2014\t\t\t\nexhibitattend \n', u'\n\n\t\t\tCloud Expo \u2013 Santa Clara, CA\xa0\n\t\t\t\n\t\t\tNov 4 - 6, 2014\t\t\t\nexhibit \n', u'\n\n\t\t\tThe Global Carrier Awards 2014 \u2013 Amsterdam\xa0\n\t\t\t\n\t\t\tNov 4, 2014\t\t\t\n\n', u'\n\n\t\t\tWeb Summit \u2013 Dublin, Ireland\xa0\n\t\t\t\n\t\t\tNov 4 - 6, 2014\t\t\t\nspeak \n', u'\n\n\t\t\tVelocity Europe \u2013 Barcelona, Spain\xa0\n\t\t\t\n\t\t\tNov 17 - 19, 2014\t\t\t\nexhibit \n', u'\n\n\t\t\tNH/VT FIRST LEGO League Championship Event\xa0\n\t\t\t\n\t\t\tDec 6, 2014\t\t\t\nspeak \n']
I am new to Python, so could you suggest how I can get the Event Name, Date, and Event Type from this result?
Thanks!
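Each string in the result above appears to follow the same layout: the event name, then the date, then an optional event type, separated by newlines and padded with tabs and a non-breaking space (`\xa0`). A minimal sketch of splitting one such string is shown here, assuming Python 3. The helper name `parse_event` and the `KNOWN_TYPES` tuple are hypothetical; the type list is inferred only from the sample output and may not cover every value the page can produce.

```python
import re

# Assumption: these are the only event types, taken from the sample output.
KNOWN_TYPES = ("speak", "attend", "exhibit")


def parse_event(raw):
    """Split one scraped string into name, date, and a list of event types."""
    # Break on newlines, trim tabs/spaces/\xa0 from each piece, drop empty pieces.
    parts = [piece.strip() for piece in raw.split("\n")]
    parts = [piece for piece in parts if piece]

    name = parts[0]
    date = parts[1]
    # The type line is missing for some events, and several types can be
    # glued together (for example "exhibitattend"), so match known words.
    type_text = parts[2] if len(parts) > 2 else ""
    types = re.findall("|".join(KNOWN_TYPES), type_text)
    return {"name": name, "date": date, "types": types}


# Two strings copied from the sample output.
samples = [
    "\n\n\t\t\tAcquia Engage User Conference \u2013 Boston, MA\xa0\n\t\t\t\n\t\t\tNov 3 - 5, 2014\t\t\t\nexhibitattend \n",
    "\n\n\t\t\tThe Global Carrier Awards 2014 \u2013 Amsterdam\xa0\n\t\t\t\n\t\t\tNov 4, 2014\t\t\t\n\n",
]

for raw in samples:
    print(parse_event(raw))
```

Applied to the scraped list, this would be `[parse_event(t) for t in titles]`. Selecting the child tags of each `p.pubdate` element directly, instead of flattening them with `tag.text`, would likely be more robust, but that depends on page markup not shown here.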
Original: https://stackoverflow.com/questions/26484951
Accepted answer
You should use an outer join.
    select A.ID, A.DataA1, A.DataA2,
           B.A_ID, B.DataB1, B.DataB2,
           C.A_ID, C.DataC1, C.DataC2
    from A
    left join B on A.ID = B.A_ID
    left join C on A.ID = C.A_ID

For a good explanation of SQL joins, check out: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
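To see how the left joins in this answer behave, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table and column names come from the query above; the column types and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data: parent table A with child tables B and C.
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, DataA1 TEXT, DataA2 TEXT);
    CREATE TABLE B (A_ID INTEGER, DataB1 TEXT, DataB2 TEXT);
    CREATE TABLE C (A_ID INTEGER, DataC1 TEXT, DataC2 TEXT);
    INSERT INTO A VALUES (1, 'a1', 'a2');
    INSERT INTO A VALUES (2, 'a3', 'a4');
    INSERT INTO B VALUES (1, 'b1', 'b2');
""")

columns = """A.ID, A.DataA1, A.DataA2,
             B.A_ID, B.DataB1, B.DataB2,
             C.A_ID, C.DataC1, C.DataC2"""

# Left joins keep every row of A; missing child columns come back as NULL (None).
left_rows = conn.execute(f"""
    SELECT {columns}
    FROM A
    LEFT JOIN B ON A.ID = B.A_ID
    LEFT JOIN C ON A.ID = C.A_ID
    ORDER BY A.ID
""").fetchall()

# Inner joins drop any row of A that lacks a match in B or C.
inner_rows = conn.execute(f"""
    SELECT {columns}
    FROM A
    JOIN B ON A.ID = B.A_ID
    JOIN C ON A.ID = C.A_ID
""").fetchall()

print(left_rows)
print(inner_rows)
conn.close()
```

With this sample data, table C is empty, so the inner-join version returns no rows at all, while the left-join version still returns both rows of A with `None` in the unmatched child columns.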