From . in regex to SSRF - part 1
While testing an application, I encountered a regex bug that led to Server-Side Request Forgery (SSRF). Finding it was huge fun and very exciting. It was also my first ever bug on a production system.
During recon I found a service called image-converter. It was definitely interesting, but not straightforward to exploit. I had no example of how to use it, and a simple GET request just returned an error message.
That was the first major problem for me. I tried some simple query parameters like:
?url=
?width=
?name=
and so on, but without luck. Then I tried Arjun (https://github.com/s0md3v/Arjun), a tool for automated parameter discovery. This also failed. I was pretty sure that there was something there, but I couldn’t make it work.
Then I started digging into the error message that I kept seeing: "Cannot read property 'groups' of null". This led me to a Stack Overflow question about a JavaScript regex error. After that I wondered: “How the hell did they implement this?” After an hour of trial and failure, I got it:
https://api.example.org/image-converter/width=100/http://google.com
In my almost 10-year IT career, I had never seen a service implemented like that 😉
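To illustrate why a request without the expected path shape could end in an error like "Cannot read property 'groups' of null", here is a minimal sketch of path-based parameter parsing. It is written in Python rather than the JavaScript the service presumably runs, and the pattern, the group names, and the `parse_path` function are all hypothetical; the real implementation is unknown to me.

```python
import re

# Hypothetical pattern mirroring the observed URL shape:
#   /image-converter/<key>=<value>/<target URL>
# The real service's regex is unknown.
PATH_RE = re.compile(
    r"^/image-converter/(?P<key>\w+)=(?P<value>\w+)/(?P<target>https?://.+)$"
)


def parse_path(path: str) -> dict:
    """Extract the option and the target URL from the request path."""
    match = PATH_RE.match(path)
    if match is None:
        # Unguarded JavaScript would read `.groups` on the null match result
        # here and throw "Cannot read property 'groups' of null".
        raise ValueError("path does not match the expected format")
    return match.groupdict()


parsed = parse_path("/image-converter/width=100/http://google.com")
print(parsed)
```

A path such as `/image-converter` with only query parameters never matches a pattern like this, which would explain why guessing `?url=` and friends got nowhere.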
My enthusiasm dropped when I realized that domain whitelisting was implemented. I picked the main domain, www.example.org, and it did in fact work:
https://api.example.org/image-converter/width=100/https://www.example.org
I got a response:
At this point I was sure about the SSRF, but I still had the whitelist to bypass.
My first approach was to take the SSRF payloads from PayloadsAllTheThings and test them. I don’t want to copy them all here; there are dozens of payloads. Sadly, none of them worked. I got very interested in Orange's "A New Era of SSRF", but that was also a dead end.
I was pretty puzzled. I had high hopes for a nice bug, but it looked like the service was secured. The good thing was that I had learned a lot, especially from Orange's paper.
The next day, with a fresh head, I took a different approach. During recon I had noted two other domains connected with the main one: www.example.net and www.example.com. It turned out that those domains were also whitelisted. Having a background in programming, I knew that developers have a tendency to write “nice code”, so maybe they used a regex to check the domain suffix? And guess what? They did! For the request:
https://api.example.org/image-converter/width=100/https://www-example.org
I got a response:
Hurray!
What regex exactly did they use? I think something like this (regex101):
And what they should have used is: www\.example\.(com|net|org)
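The effect of the unescaped dot can be reproduced with a short sketch. The loose pattern below is only my guess at the shape of the vulnerable check (the actual regex is not known), and `is_allowed` is a hypothetical helper. Matching against the parsed hostname is also an assumption about how the check was applied.

```python
import re
from urllib.parse import urlsplit

# Assumed vulnerable pattern: "." is unescaped, so it matches ANY character.
LOOSE = re.compile(r"www.example.(com|net|org)")
# Escaped pattern: "\." matches only a literal dot.
STRICT = re.compile(r"www\.example\.(com|net|org)")


def is_allowed(pattern: re.Pattern, url: str) -> bool:
    """Check the URL's hostname against the whitelist pattern."""
    host = urlsplit(url).hostname or ""
    return pattern.fullmatch(host) is not None


for url in ("https://www.example.org", "https://www-example.org"):
    print(url, is_allowed(LOOSE, url), is_allowed(STRICT, url))
```

With the loose pattern, `www-example.org` passes because the hyphen is accepted where a dot was intended; the escaped pattern rejects it. Matching the whole hostname, rather than searching the raw URL string, also keeps the pattern from matching somewhere else in the URL.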
Next I registered the www-example.com domain and started playing with escalating this. More about that in part 2.
Thanks for reading! You can follow me on Twitter.