Python: Why you can’t use Arrays in Query Strings with Requests and urllib
And how to fix it
I’m calling an eBay endpoint that lets me dictate which fields it returns by sending a parameter, outputSelector. This parameter takes a list of names and expects the parameter in the URL query string to look like this:
...&outputSelector(0)=SellerInfo&outputSelector(1)=StoreInfo...
This is the expected format for query strings with array values (eBay uses parentheses, but brackets are the norm). Simple enough, so I tried passing the parameter to Python’s Requests library with an array value, extra_fields:
import requests

# page_number and request_url are defined elsewhere
extra_fields = ["PictureURLLarge", "PictureURLSuperSize", "GalleryInfo", "UnitPriceInfo"]
query_params = {
    "RESPONSE-DATA-FORMAT": "JSON",
    "REST-PAYLOAD": "true",
    "paginationInput.entriesPerPage": 100,
    "paginationInput.pageNumber": page_number,
    "outputSelector": extra_fields,
}
response = requests.get(request_url, params=query_params)
It failed. It turns out Requests does not index arrays in query strings. What a jarring oversight. For each value in extra_fields, Requests created a separate outputSelector parameter in the generated URL:
https://svcs.ebay.com/services/search/FindingService/v1?RESPONSE-DATA-FORMAT=JSON&REST-PAYLOAD=true&paginationInput.entriesPerPage=100&paginationInput.pageNumber=1&
outputSelector=PictureURLLarge&
outputSelector=PictureURLSuperSize&
outputSelector=GalleryInfo&
outputSelector=UnitPriceInfo
eBay sees this and ignores all the outputSelector parameters except for the last one. This makes sense: each copy of outputSelector overwrites the prior copy. But why doesn’t Requests handle arrays in query strings?
Side note: using “outputSelector[]” as the parameter name doesn’t fix this issue. Requests treats “outputSelector[]” like a regular parameter name.
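The overwriting described above is easy to reproduce with the standard library. The sketch below only illustrates a last-value-wins parser; it is not a claim about how eBay actually parses requests, and the query string is a shortened version of the one above.

```python
from urllib.parse import parse_qs, parse_qsl

query = "outputSelector=PictureURLLarge&outputSelector=GalleryInfo&outputSelector=UnitPriceInfo"

# parse_qsl keeps every (name, value) pair, in order
pairs = parse_qsl(query)

# Loading the pairs into a plain dict keeps only the last value per name
last_wins = dict(pairs)
print(last_wins)  # {'outputSelector': 'UnitPriceInfo'}

# parse_qs groups repeated names into a list instead
grouped = parse_qs(query)
print(grouped)  # {'outputSelector': ['PictureURLLarge', 'GalleryInfo', 'UnitPriceInfo']}
```

A server that stores parameters the first way would see only the final outputSelector, which matches the behaviour I observed.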
Down the Rabbit Hole
I did some Googling and found this illuminating Stack Overflow question. One of the answers suggested using urllib to generate a properly formatted URL. This gave me hope and led me down this rabbit hole. In summary, using urllib didn't work, so I had to hack together a solution.
I found that Requests uses urllib’s urlencode function to encode query parameters here and here. The culprit was urlencode! Given an array value, urlencode either stuffs every element into one parameter or repeats the parameter once per element. In neither case does it index them in the query string. This is why Requests fails to properly handle query parameters with array values.
Essentially, Requests checks whether each parameter’s value is an iterable. If it is, Requests loops over the value and puts each element, paired with the parameter name, into an array that it passes to urlencode. Because Requests calls urlencode with the doseq argument set to true, urlencode puts each value into a separate parameter. If doseq is false, all the values are put into one parameter.
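The effect of doseq can be seen by calling urlencode directly. This is a minimal sketch using only urllib.parse; the two field names are taken from the eBay example at the top of the post.

```python
from urllib.parse import unquote_plus, urlencode

params = {"outputSelector": ["SellerInfo", "StoreInfo"]}

# doseq=True: one parameter per element, all sharing the same name
repeated = urlencode(params, doseq=True)
print(repeated)  # outputSelector=SellerInfo&outputSelector=StoreInfo

# doseq=False: the whole list is converted with str() and encoded as one value
stringified = urlencode(params, doseq=False)
print(unquote_plus(stringified))  # outputSelector=['SellerInfo', 'StoreInfo']
```

Neither mode produces outputSelector(0)=... or outputSelector[0]=..., so the index has to be added to the parameter name before encoding.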
This bug or oversight reaches all the way into the bowels of Python’s standard library. Absolutely amazing. I remember how miserable I was working in PHP, but at least everything did what you expected when it came to requests. This was a surprising gotcha for a language as mature and user-friendly as Python. People write code and people forget, I guess.
Solution
In the end, I decided to use a loop to populate the parameters:
extra_fields = ["PictureURLLarge", "PictureURLSuperSize", "GalleryInfo", "UnitPriceInfo"]
query_params = {
    "RESPONSE-DATA-FORMAT": "JSON",
    "REST-PAYLOAD": "true",
    "paginationInput.entriesPerPage": 100,
    "paginationInput.pageNumber": page_number,
}
# Add one indexed parameter per field: outputSelector[0], outputSelector[1], ...
for index, field in enumerate(extra_fields):
    query_params[f"outputSelector[{index}]"] = field
response = requests.get(request_url, params=query_params)
Using the for loop encodes the parameter in the query string correctly:
RESPONSE-DATA-FORMAT=JSON&REST-PAYLOAD=true&paginationInput.entriesPerPage=100&paginationInput.pageNumber=1&outputSelector%5B0%5D=PictureURLLarge&outputSelector%5B1%5D=PictureURLSuperSize&outputSelector%5B2%5D=GalleryInfo&outputSelector%5B3%5D=UnitPriceInfo
The whole URL:
https://svcs.ebay.com/services/search/FindingService/v1?RESPONSE-DATA-FORMAT=JSON&REST-PAYLOAD=true&paginationInput.entriesPerPage=100&paginationInput.pageNumber=1&outputSelector%5B0%5D=PictureURLLarge&outputSelector%5B1%5D=PictureURLSuperSize&outputSelector%5B2%5D=GalleryInfo&outputSelector%5B3%5D=UnitPriceInfo
And it works!
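If more than one parameter needs this treatment, the loop can be wrapped in a small helper. The sketch below is one possible generalisation, not part of Requests or urllib: index_list_params and its open_char/close_char arguments are hypothetical names. Brackets are the default because that is what I used above; parentheses can be passed instead to match the format in eBay's example.

```python
from urllib.parse import urlencode


def index_list_params(params, open_char="[", close_char="]"):
    """Return a copy of params with list/tuple values expanded into indexed keys."""
    expanded = {}
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            # One key per element: name[0], name[1], ...
            for index, item in enumerate(value):
                expanded[f"{name}{open_char}{index}{close_char}"] = item
        else:
            expanded[name] = value
    return expanded


query_params = index_list_params({
    "REST-PAYLOAD": "true",
    "outputSelector": ["GalleryInfo", "UnitPriceInfo"],
})
print(urlencode(query_params))
# REST-PAYLOAD=true&outputSelector%5B0%5D=GalleryInfo&outputSelector%5B1%5D=UnitPriceInfo
```

The returned dictionary contains only scalar values, so it can be passed as params to requests.get just like the hand-built one.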
Sources