Light Mode

Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

jmcarp/robobrowser

Repository files navigation

RoboBrowser: Your friendly neighborhood web scraper

Homepage: http://robobrowser.readthedocs.org/

RoboBrowser is a simple, Pythonic library for browsing the web without a standalone web browser. RoboBrowser can fetch a page, click on links and buttons, and fill out and submit forms. If you need to interact with web services that don't have APIs, RoboBrowser can help.

form['q'].value = 'porcupine tree' browser.submit_form(form) # Look up the first song songs = browser.select('.song_link') browser.follow_link(songs[0]) lyrics = browser.select('.lyrics') lyrics[0].text # \nHear the sound of music ... # Back to results page browser.back() # Look up my favorite song song_link = browser.get_link('trains') browser.follow_link(song_link) # Can also search HTML using regex patterns lyrics = browser.find(class_=re.compile(r'\blyrics\b')) lyrics.text # \nTrain set and match spied under the blind...">import re
from robobrowser import RoboBrowser

# Browse to Genius
browser = RoboBrowser(history=True)
browser.open('http://genius.com/')

# Search for Porcupine Tree
form = browser.get_form(action='/search')
form #
form['q'].value = 'porcupine tree'
browser.submit_form(form)

# Look up the first song
songs = browser.select('.song_link')
browser.follow_link(songs[0])
lyrics = browser.select('.lyrics')
lyrics[0].text # \nHear the sound of music ...

# Back to results page
browser.back()

# Look up my favorite song
song_link = browser.get_link('trains')
browser.follow_link(song_link)

# Can also search HTML using regex patterns
lyrics = browser.find(class_=re.compile(r'\blyrics\b'))
lyrics.text # \nTrain set and match spied under the blind...

RoboBrowser combines the best of two excellent Python libraries: Requests and BeautifulSoup. RoboBrowser represents browser sessions using Requests and HTML responses using BeautifulSoup, transparently exposing methods of both libraries:

# #
, # ... browser.find(class_=re.compile(r'column', re.I)) #

About

No description, website, or topics provided.

Resources

Readme

License

BSD-3-Clause license

Contributing

Contributing

Stars

Watchers

Forks

Packages

Contributors