IT박스

Python urllib2, 기본 HTTP 인증 및 tr.im

itboxs 2020. 9. 20. 09:11
반응형

Python urllib2, 기본 HTTP 인증 및 tr.im


URL을 단축하기 위해 tr.im API를 사용하는 코드를 작성하려고합니다 .

http://docs.python.org/library/urllib2.html을 읽은 후 다음을 시도했습니다.

   TRIM_API_URL = 'http://api.tr.im/api'
   auth_handler = urllib2.HTTPBasicAuthHandler()
   auth_handler.add_password(realm='tr.im',
                             uri=TRIM_API_URL,
                             user=USERNAME,
                             passwd=PASSWORD)
   opener = urllib2.build_opener(auth_handler)
   urllib2.install_opener(opener)
   response = urllib2.urlopen('%s/trim_simple?url=%s'
                              % (TRIM_API_URL, url_to_trim))
   url = response.read().strip()

response.code는 200입니다 (202 여야한다고 생각합니다). url은 유효하지만 단축 된 URL이 내 URL 목록 ( http://tr.im/?page=1 )에 없기 때문에 기본 HTTP 인증이 작동하지 않는 것 같습니다 .

http://www.voidspace.org.uk/python/articles/authentication.shtml#doing-it-properly를 읽은 후 다음 을 시도했습니다.

   TRIM_API_URL = 'api.tr.im/api'
   password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
   password_mgr.add_password(None, TRIM_API_URL, USERNAME, PASSWORD)
   auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
   opener = urllib2.build_opener(auth_handler)
   urllib2.install_opener(opener)
   response = urllib2.urlopen('http://%s/trim_simple?url=%s'
                              % (TRIM_API_URL, url_to_trim))
   url = response.read().strip()

그러나 나는 같은 결과를 얻습니다. (response.code는 200이고 URL은 유효하지만 http://tr.im/의 내 계정에 기록되지 않았습니다 .)

다음과 같이 기본 HTTP 인증 대신 쿼리 문자열 매개 변수를 사용하는 경우 :

   TRIM_API_URL = 'http://api.tr.im/api'
   response = urllib2.urlopen('%s/trim_simple?url=%s&username=%s&password=%s'
                              % (TRIM_API_URL,
                                 url_to_trim,
                                 USERNAME,
                                 PASSWORD))
   url = response.read().strip()

... URL이 유효 할뿐만 아니라 내 tr.im 계정에 기록됩니다. (response.code는 여전히 200입니다.)

하지만 내 코드에 문제가 있어야합니다 (tr.im의 API가 아님).

$ curl -u yacitus:xxxx http://api.tr.im/api/trim_url.json?url=http://www.google.co.uk

...보고:

{"trimpath":"hfhb","reference":"nH45bftZDWOX0QpVojeDbOvPDnaRaJ","trimmed":"11\/03\/2009","destination":"http:\/\/www.google.co.uk\/","trim_path":"hfhb","domain":"google.co.uk","url":"http:\/\/tr.im\/hfhb","visits":0,"status":{"result":"OK","code":"200","message":"tr.im URL Added."},"date_time":"2009-03-11T10:15:35-04:00"}

... 그리고 URL이 http://tr.im/?page=1 의 URL 목록에 나타납니다 .

그리고 내가 실행하면 :

$ curl -u yacitus:xxxx http://api.tr.im/api/trim_url.json?url=http://www.google.co.uk

... 다시, 나는 얻는다 :

{"trimpath":"hfhb","reference":"nH45bftZDWOX0QpVojeDbOvPDnaRaJ","trimmed":"11\/03\/2009","destination":"http:\/\/www.google.co.uk\/","trim_path":"hfhb","domain":"google.co.uk","url":"http:\/\/tr.im\/hfhb","visits":0,"status":{"result":"OK","code":"201","message":"tr.im URL Already Created [yacitus]."},"date_time":"2009-03-11T10:15:35-04:00"}

메모 코드는 201이고 메시지는 "tr.im URL이 이미 생성됨 [yacitus]"입니다.

기본 HTTP 인증을 올바르게 수행하지 않아야합니다 (두 시도 모두). 내 문제를 찾을 수 있습니까? 아마도 나는 유선으로 전송되는 것을보고 봐야할까요? 전에 해본 적이 없습니다. 사용할 수있는 Python API가 있습니까 (아마도 pdb에 있음)? 또는 사용할 수있는 다른 도구 (Mac OS X 권장)가 있습니까?


This seems to work really well (taken from another thread)

import urllib2, base64

request = urllib2.Request("http://api.foursquare.com/v1/user")
base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
request.add_header("Authorization", "Basic %s" % base64string)   
result = urllib2.urlopen(request)

Really cheap solution:

urllib.urlopen('http://user:xxxx@api.tr.im/api')

(which you may decide is not suitable for a number of reasons, like security of the url)

Github API example:

>>> import urllib, json
>>> result = urllib.urlopen('https://personal-access-token:x-oauth-basic@api.github.com/repos/:owner/:repo')
>>> r = json.load(result.fp)
>>> result.close()

Take a look at this SO post answer and also look at this basic authentication tutorial from the urllib2 missing manual.

In order for urllib2 basic authentication to work, the http response must contain HTTP code 401 Unauthorized and a key "WWW-Authenticate" with the value "Basic" otherwise, Python won't send your login info, and you will need to either use Requests, or urllib.urlopen(url) with your login in the url, or add a the header like in @Flowpoke's answer.

You can view your error by putting your urlopen in a try block:

try:
    urllib2.urlopen(urllib2.Request(url))
except urllib2.HTTPError, e:
    print e.headers
    print e.headers.has_key('WWW-Authenticate')

The recommended way is to use requests module:

#!/usr/bin/env python
import requests # $ python -m pip install requests
####from pip._vendor import requests # bundled with python

url = 'https://httpbin.org/hidden-basic-auth/user/passwd'
user, password = 'user', 'passwd'

r = requests.get(url, auth=(user, password)) # send auth unconditionally
r.raise_for_status() # raise an exception if the authentication fails

Here's a single source Python 2/3 compatible urllib2-based variant:

#!/usr/bin/env python
import base64
try:
    from urllib.request import Request, urlopen
except ImportError: # Python 2
    from urllib2 import Request, urlopen

credentials = '{user}:{password}'.format(**vars()).encode()
urlopen(Request(url, headers={'Authorization': # send auth unconditionally
    b'Basic ' + base64.b64encode(credentials)})).close()

Python 3.5+ introduces HTTPPasswordMgrWithPriorAuth() that allows:

..to eliminate unnecessary 401 response handling, or to unconditionally send credentials on the first request in order to communicate with servers that return a 404 response instead of a 401 if the Authorization header is not sent..

#!/usr/bin/env python3
import urllib.request as urllib2

password_manager = urllib2.HTTPPasswordMgrWithPriorAuth()
password_manager.add_password(None, url, user, password,
                              is_authenticated=True) # to handle 404 variant
auth_manager = urllib2.HTTPBasicAuthHandler(password_manager)
opener = urllib2.build_opener(auth_manager)

opener.open(url).close()

It is easy to replace HTTPBasicAuthHandler() with ProxyBasicAuthHandler() if necessary in this case.


I would suggest that the current solution is to use my package urllib2_prior_auth which solves this pretty nicely (I work on inclusion to the standard lib.


Same solutions as Python urllib2 Basic Auth Problem apply.

see https://stackoverflow.com/a/24048852/1733117; you can subclass urllib2.HTTPBasicAuthHandler to add the Authorization header to each request that matches the known url.

class PreemptiveBasicAuthHandler(urllib2.HTTPBasicAuthHandler):
    '''Preemptive basic auth.

    Instead of waiting for a 403 to then retry with the credentials,
    send the credentials if the url is handled by the password manager.
    Note: please use realm=None when calling add_password.'''
    def http_request(self, req):
        url = req.get_full_url()
        realm = None
        # this is very similar to the code from retry_http_basic_auth()
        # but returns a request object.
        user, pw = self.passwd.find_user_password(realm, url)
        if pw:
            raw = "%s:%s" % (user, pw)
            auth = 'Basic %s' % base64.b64encode(raw).strip()
            req.add_unredirected_header(self.auth_header, auth)
        return req

    https_request = http_request

Try python-request or python-grab

참고URL : https://stackoverflow.com/questions/635113/python-urllib2-basic-http-authentication-and-tr-im

반응형