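Python script to check whether a web page exists without downloading the whole page?

I'm trying to write a script that tests whether a web page exists. It would be nice if it could check without downloading the whole page.

Here is my jumping-off point. I've seen several examples use httplib in the same way, but every site I check returns False.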

import httplib
from httplib import HTTP
from urlparse import urlparse

def checkUrl(url):
    p = urlparse(url)
    h = HTTP(p[1])
    h.putrequest('HEAD', p[2])
    h.endheaders()
    return h.getreply()[0] == httplib.OK

if __name__=="__main__":
    print checkUrl("http://www.stackoverflow.com") # True
    print checkUrl("http://stackoverflow.com/notarealpage.html") # False

Any ideas?

Edit

Someone suggested this, but their post was deleted. Does urllib2 avoid downloading the whole page?

import urllib2

def url_exists(some_url):
    # Wrapped in a function so that the return statements are valid.
    try:
        urllib2.urlopen(some_url)
        return True
    except urllib2.URLError:
        return False

Solutions

==============================

1. How about this:

import httplib
from urlparse import urlparse

def checkUrl(url):
    p = urlparse(url)
    conn = httplib.HTTPConnection(p.netloc)
    conn.request('HEAD', p.path or '/')  # an empty path is not a valid request target
    resp = conn.getresponse()
    return resp.status < 400

if __name__ == '__main__':
    print checkUrl('http://www.stackoverflow.com') # True
    print checkUrl('http://stackoverflow.com/notarealpage.html') # False

This sends an HTTP HEAD request and returns True if the response status code is below 400.
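
For readers on Python 3, where httplib became http.client and urlparse moved to urllib.parse, the same idea might look like the sketch below. The name check_url, the HTTPS branch, the 5-second timeout, and the choice to treat connection errors as "does not exist" are assumptions of this sketch, not part of the answer above.

```python
import http.client
from urllib.parse import urlparse


def check_url(url, timeout=5):
    """Return True if a HEAD request to url gets a status below 400."""
    p = urlparse(url)
    # Pick the connection class from the scheme (assumption: http or https only).
    conn_cls = (http.client.HTTPSConnection if p.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(p.netloc, timeout=timeout)
    # An empty path is not a valid request target, so fall back to "/".
    target = p.path or "/"
    if p.query:
        target += "?" + p.query
    try:
        conn.request("HEAD", target)
        return conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a malformed response.
        return False
    finally:
        conn.close()
```

Calling check_url("http://stackoverflow.com/notarealpage.html") would be expected to return False as long as the server answers that path with a 404.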

==============================
2. Using requests, this is as simple as:
```
import requests

ret = requests.head('http://www.example.com')
print(ret.status_code)
```
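This loads only the website's headers. To test whether the request succeeded, check the status_code of the result, or call the raise_for_status method, which raises an exception (requests.HTTPError) when the response status is a 4xx or 5xx error.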
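
As a rough sketch of the raise_for_status approach, the helper below wraps the call and turns any failure into False. The name page_exists, the 5-second timeout, and the decision to follow redirects are assumptions made for illustration. requests.head does not follow redirects by default, which is why allow_redirects=True is passed explicitly.

```python
import requests


def page_exists(url, timeout=5):
    """Return True if a HEAD request succeeds with a non-error status."""
    try:
        # HEAD asks for headers only; allow_redirects=True follows 3xx
        # responses, which requests.head() does not do by default.
        resp = requests.head(url, timeout=timeout, allow_redirects=True)
        # Raises requests.HTTPError for 4xx and 5xx status codes.
        resp.raise_for_status()
        return True
    except requests.RequestException:
        # Covers HTTPError as well as connection errors and timeouts.
        return False
```

Catching requests.RequestException rather than a bare Exception keeps programming errors, such as passing a non-string URL, visible instead of silently reporting them as "page missing".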

==============================

3. How about this:

import requests

def url_check(url):
    """Boolean return - check to see if the site exists.
       This function takes a url as input and then it requests the site
       head - not the full html and then it checks the response to see if
       it's less than 400. If it is less than 400 it will return True
       else it will return False.
    """
    try:
        site_ping = requests.head(url)
        # To view the returned status code, use: print(site_ping.status_code)
        return site_ping.status_code < 400
    except Exception:
        # Any failure (bad URL, connection error, timeout) counts as "does not exist".
        return False

==============================
4. You can try:
```
import urllib2

try:
    urllib2.urlopen(url='https://someURL')
except urllib2.URLError:  # also covers HTTPError (4xx/5xx responses)
    print("page not found")
```
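
Note that urlopen sends a GET request by default, so it does not by itself avoid asking for the page body. A HEAD variant in Python 3, where urllib2 was split into urllib.request and urllib.error, could be sketched as follows. The name head_ok and the 5-second timeout are assumptions for illustration.

```python
import urllib.request


def head_ok(url, timeout=5):
    """Return True if a HEAD request to url does not end in an error."""
    # method="HEAD" makes the request sent for url a HEAD instead of a GET.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except OSError:
        # URLError and its subclass HTTPError (raised for 4xx/5xx) derive
        # from OSError, as do socket timeouts and refused connections.
        return False
```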

from https://stackoverflow.com/questions/6471275/python-script-to-see-if-a-web-page-exists-without-downloading-the-whole-page by cc-by-sa and MIT license
