복붙노트

[PYTHON] HTTPError: HTTP Error 403: Forbidden

I am writing a Python script for personal use, but it does not work on Wikipedia...

This works:

import urllib2, sys
from bs4 import BeautifulSoup

site = "http://youtube.com"
page = urllib2.urlopen(site)
soup = BeautifulSoup(page)
print soup

This does not work:

import urllib2, sys
from bs4 import BeautifulSoup

site= "http://en.wikipedia.org/wiki/StackOverflow"
page = urllib2.urlopen(site)
soup = BeautifulSoup(page)
print soup

This is the error:

Traceback (most recent call last):
  File "C:\Python27\wiki.py", line 5, in <module>
    page = urllib2.urlopen(site)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 406, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden

Solution

  1. Within the current code:

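    Python 2.x:

    import urllib2
    from bs4 import BeautifulSoup

    site = "http://en.wikipedia.org/wiki/StackOverflow"
    # Send a browser-like User-Agent instead of urllib2's default one
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.urlopen(req)
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(page, "html.parser")
    print soup

    Python 3.x:

    from bs4 import BeautifulSoup
    from urllib.request import Request, urlopen

    site = "http://en.wikipedia.org/wiki/StackOverflow"
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = Request(site, headers=hdr)
    page = urlopen(req)
    soup = BeautifulSoup(page, "html.parser")
    print(soup)

    Selenium (PhantomJS support was removed in Selenium 4, so this variant needs an older Selenium release or a different driver):

    from selenium import webdriver

    browser = webdriver.PhantomJS()
    # get() returns nothing; the loaded page is read through the browser object
    browser.get("http://en.wikipedia.org/wiki/StackOverflow")
    assert "Stack Overflow - Wikipedia" in browser.title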

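    The modified version works because Wikipedia checks whether the User-Agent belongs to a "popular browser". By default urllib2 identifies itself as Python-urllib/2.7 (on Python 2.7), which is presumably why the unmodified request is rejected with 403.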
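The User-Agent explanation above can be checked and wrapped in a small helper. The following is only a sketch for Python 3: the names `default_user_agent`, `build_request`, `fetch` and `DEFAULT_USER_AGENT` are hypothetical and do not come from the original answer, and the header value is the same 'Mozilla/5.0' the answer uses.

```python
from urllib.error import HTTPError
from urllib.request import Request, build_opener, urlopen

# Assumption: the same header value as in the answer above; a more
# descriptive string identifying your script may work as well.
DEFAULT_USER_AGENT = "Mozilla/5.0"


def default_user_agent():
    """Return the User-Agent that urllib sends when none is set explicitly."""
    # A fresh opener carries the library default, e.g. "Python-urllib/3.x".
    return dict(build_opener().addheaders)["User-agent"]


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Return a Request object that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=DEFAULT_USER_AGENT, timeout=10):
    """Return the response body as bytes, or None on an HTTP error status."""
    try:
        with urlopen(build_request(url, user_agent), timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        # err.code is the numeric status (e.g. 403), err.reason the text.
        print(f"HTTP Error {err.code}: {err.reason} for {url}")
        return None
```

Calling `fetch("http://en.wikipedia.org/wiki/StackOverflow")` would then either return the page bytes, which can be passed to `BeautifulSoup(html, "html.parser")`, or print the status code instead of raising the traceback shown in the question.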

  2. Source: https://stackoverflow.com/questions/13055208/httperror-http-error-403-forbidden (CC BY-SA and MIT license)