URL Encoding and Decoding
0. References
1.
2.1. The main parts of URLs
A full BNF description of the URL syntax is given in Section 5.
In general, URLs are written as follows:
<scheme>:<scheme-specific-part>
A URL contains the name of the scheme being used (<scheme>) followed
by a colon and then a string (the <scheme-specific-part>) whose
interpretation depends on the scheme.
Scheme names consist of a sequence of characters. The lower case
letters "a"--"z", digits, and the characters plus ("+"), period
("."), and hyphen ("-") are allowed. For resiliency, programs
interpreting URLs should treat upper case letters as equivalent to
lower case in scheme names (e.g., allow "HTTP" as well as "http").
Note that letters in scheme names are case-insensitive.
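The rule quoted above can be sketched in a few lines of Python 3. The helper name `split_scheme` and the constant `SCHEME_CHARS` are hypothetical and not part of any library; this is only a minimal illustration of the `<scheme>:<scheme-specific-part>` split, not a full URL parser.

```python
import string

# Characters that RFC 1738 section 2.1 allows in a scheme name.
SCHEME_CHARS = set(string.ascii_lowercase + string.digits + "+.-")


def split_scheme(url):
    """Split a URL into (scheme, scheme_specific_part).

    The scheme is lower-cased before validation, so "HTTP" and "http"
    are treated as the same scheme.
    """
    scheme, sep, rest = url.partition(":")
    if not sep:
        raise ValueError("no ':' separator found in %r" % url)
    scheme = scheme.lower()
    if not scheme or not set(scheme) <= SCHEME_CHARS:
        raise ValueError("invalid scheme name %r" % scheme)
    return scheme, rest


print(split_scheme("HTTP://example.com/index.html"))
# → ('http', '//example.com/index.html')
```

For real code, the standard library's `urllib.parse.urlsplit` is probably the better choice; it also returns the scheme in lower case.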
2. Python 2
2.1
>>> import urllib
>>> url = 'http://web page'
>>> url_en = urllib.quote(url)              # space is encoded as "%20"
>>> url_plus = urllib.quote_plus(url)       # space is encoded as "+"
>>> url_en_twice = urllib.quote(url_en)
>>> url
'http://web page'
>>> url_en
'http%3A//web%20page'
>>> url_plus
'http%3A%2F%2Fweb+page'
>>> url_en_twice
'http%253A//web%2520page'    # "%25" in the output indicates double encoding
>>> # corresponding decoding
>>> urllib.unquote(url_en)
'http://web page'
>>> urllib.unquote_plus(url_plus)
'http://web page'
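Double encoding, signalled by `%25` in the session above, can be undone by decoding repeatedly until the string stops changing. The following is a minimal sketch in Python 3 (where the functions live in `urllib.parse`; in Python 2 they are `urllib.quote` and `urllib.unquote`). The helper `unquote_fully` and its `max_rounds` parameter are hypothetical names chosen for illustration.

```python
from urllib.parse import quote, unquote


def unquote_fully(text, max_rounds=5):
    """Apply unquote until the text stops changing, at most max_rounds times.

    Returns the decoded text and the number of rounds that changed it.
    """
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
        rounds += 1
    return text, rounds


once = quote("http://web page")   # 'http%3A//web%20page'
twice = quote(once)               # 'http%253A//web%2520page'

print(unquote_fully(once))    # → ('http://web page', 1)
print(unquote_fully(twice))   # → ('http://web page', 2)
```

One caveat of this approach: if the original data legitimately contains a percent sign followed by two hex digits, repeated decoding will decode it as well, so it is only safe when the input is known to have been encoded more than once.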
2.2 URLs containing Chinese characters
>>> import urllib
>>> url_zh = u'http://movie.douban/tag/美国'
>>> url_zh_en = urllib.quote(url_zh.encode('utf-8'))    # the argument must be a byte string (str)
>>> url_zh_en
'http%3A//movie.douban/tag/%E7%BE%8E%E5%9B%BD'
>>> print urllib.unquote(url_zh_en).decode('utf-8')
http://movie.douban/tag/美国
3. Python 3
3.1
>>> import urllib.parse
>>> url = 'http://web page'
>>> url_en = urllib.parse.quote(url)    # note: the function is urllib.parse.quote
>>> url_plus = urllib.parse.quote_plus(url)
>>> url_en
'http%3A//web%20page'
>>> url_plus
'http%3A%2F%2Fweb+page'
>>> urllib.parse.unquote(url_en)
'http://web page'
>>> urllib.parse.unquote_plus(url_plus)
'http://web page'
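In the session above, `quote` leaves `/` untouched but encodes `:`. That follows from its `safe` parameter, which defaults to `'/'`. The short Python 3 sketch below shows how different `safe` values change the result, and how `+` is handled on decoding; the variable names are only for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

url = "http://web page"

default_safe = quote(url)             # safe='/' by default
keep_scheme = quote(url, safe=":/")   # also leave ':' unescaped
nothing_safe = quote(url, safe="")    # escape '/' as well
plus_form = quote_plus(url)           # space becomes '+', '/' is escaped

print(default_safe)   # → http%3A//web%20page
print(keep_scheme)    # → http://web%20page
print(nothing_safe)   # → http%3A%2F%2Fweb%20page
print(plus_form)      # → http%3A%2F%2Fweb+page

# unquote leaves '+' alone; only unquote_plus turns it back into a space.
print(unquote("web+page"))        # → web+page
print(unquote_plus("web+page"))   # → web page
```

Passing `safe=":/"` is convenient when a whole URL is being escaped, whereas `quote_plus` is generally meant for individual query-string values.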
3.2 URLs containing Chinese characters
>>> import urllib.parse
>>> url_zh = 'http://movie.douban/tag/美国'
>>> url_zh_en = urllib.parse.quote(url_zh)
>>> url_zh_en
'http%3A//movie.douban/tag/%E7%BE%8E%E5%9B%BD'
>>> urllib.parse.unquote(url_zh_en)
'http://movie.douban/tag/美国'
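The difference between the two versions is that Python 2's `quote` expects a byte string, so Unicode text has to be encoded as UTF-8 first, while Python 3's `quote` accepts `str` and uses UTF-8 by default. Code that has to run on both could hide this behind a small wrapper. The sketch below is one possible way to do that; `quote_text` and `unquote_text` are hypothetical helper names, and only the Python 3 branch has been exercised here.

```python
# -*- coding: utf-8 -*-
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    from urllib.parse import quote, unquote
else:  # Python 2
    from urllib import quote, unquote


def quote_text(text):
    """Percent-encode a text (unicode) string using UTF-8."""
    if PY3:
        return quote(text)               # str is encoded as UTF-8 by default
    return quote(text.encode("utf-8"))   # Python 2 quote expects bytes


def unquote_text(encoded):
    """Reverse quote_text and return a text (unicode) string."""
    if PY3:
        return unquote(encoded)
    return unquote(encoded).decode("utf-8")


encoded = quote_text(u"http://movie.douban/tag/美国")
print(encoded)                # → http%3A//movie.douban/tag/%E7%BE%8E%E5%9B%BD
print(unquote_text(encoded))  # → http://movie.douban/tag/美国
```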
4. Miscellaneous
>>> help(urllib.urlencode)
Help on function urlencode in module urllib:

urlencode(query, doseq=0)
    Encode a sequence of two-element tuples or dictionary into a URL query string.

    If any values in the query arg are sequences and doseq is true, each
    sequence element is converted to a separate parameter.

    If the query arg is a sequence of two-element tuples, the order of the
    parameters in the output will match the order of parameters in the
    input.
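The help text above is from Python 2; in Python 3 the same function is available as `urllib.parse.urlencode`. A small Python 3 sketch of the two behaviours the docstring describes (ordering of tuple sequences, and `doseq`) might look like this. The parameter names `tag` and `page` are made up for the example.

```python
from urllib.parse import urlencode, parse_qs

# A sequence of two-element tuples keeps its order in the output.
params = [("tag", "美国"), ("page", 2)]
query = urlencode(params)
print(query)   # → tag=%E7%BE%8E%E5%9B%BD&page=2

# With doseq=True, each element of a sequence value becomes its own parameter.
multi = {"tag": ["a b", "c"]}
query_multi = urlencode(multi, doseq=True)
print(query_multi)   # → tag=a+b&tag=c

# parse_qs reverses the encoding and groups repeated keys into lists.
print(parse_qs(query_multi))   # → {'tag': ['a b', 'c']}
```

Note that `urlencode` escapes values with `quote_plus` by default, which is why the space in `"a b"` appears as `+` rather than `%20`.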
